Comparison of OH reactivity measurements in the atmospheric simulation chamber SAPHIR

Hydroxyl (OH) radical reactivity (k_OH) has been measured for 18 years with different measurement techniques. In order to compare the performances of instruments deployed in the field, two campaigns of experiments were conducted in the atmospheric simulation chamber SAPHIR at Forschungszentrum Jülich in October 2015 and April 2016. Chemical conditions were chosen either to be representative of the atmosphere or to test potential limitations of instruments. All types of instruments that are currently used for atmospheric measurements were used in one of the two campaigns. The results of these campaigns demonstrate that OH reactivity can be accurately measured for a wide range of atmospherically relevant chemical conditions (e.g. water vapour, nitrogen oxides, [...] The comparative reactivity method (CRM) has a higher limit of detection of 2 s⁻¹ at a time resolution of 10 to 15 min. The performances of the instruments were systematically tested by stepwise increasing, for example, the concentrations of carbon monoxide (CO), water vapour or nitric oxide (NO). In further experiments, mixtures of organic reactants were injected into the chamber to simulate urban and forested environments. Overall, the results show that the instruments are capable of measuring OH reactivity in the presence of CO, alkanes, alkenes and aromatic compounds. The transmission efficiency in Teflon inlet lines could have introduced systematic errors in measurements for low-volatility organic compounds in some instruments. CRM instruments exhibited a larger scatter in the data compared to the other instruments. The largest differences to reference measurements or to calculated reactivity were observed by CRM instruments in the presence of terpenes and oxygenated organic compounds (mixing ratios of OH reactants were up to 10 ppbv). In some of these experiments, only a small fraction of the reactivity was detected.
The accuracy of CRM measurements is most likely limited by the corrections that need to be applied to account for known effects of, for example, deviations from pseudo first-order conditions, nitrogen oxides or water vapour on the measurement. Methods used to derive these corrections vary among the different CRM instruments. Measurements taken with a flow-tube instrument combined with the direct detection of OH by chemical ionisation mass spectrometry (CIMS) show limitations in cases of high reactivity and high NO concentrations but were accurate for low reactivity (< 15 s⁻¹) and low NO (< 5 ppbv) conditions.


1 Introduction
Most gas species in the atmosphere are transformed by their reaction with the hydroxyl radical (OH). These processes lead to the formation of oxidised, secondary pollutants such as ozone and aerosol. Due to the large number of organic OH reactants (Goldstein and Galbally, 2007), several methods have been developed in order to measure OH reactivity (the inverse OH lifetime). OH reactivity (k_OH) is the sum of OH reactant concentrations ([X_i]) weighted by their reaction rate coefficients with OH (k_OH+X_i):

$$k_{\mathrm{OH}} = \sum_i k_{\mathrm{OH}+X_i}\,[X_i] \qquad (1)$$

Predicting trace gas loadings and lifetimes requires a comprehensive understanding of the atmosphere's chemical cycling and oxidative capacity, which is aided by the measurement of total OH reactivity. Measurements can be compared to calculations from OH reactant concentrations in order to quantify unexplained reactivity. In addition, if OH concentrations are measured concurrently, the total loss rate of OH can be calculated and compared with the sum of OH production rates in order to analyse the OH budget. The measurement of OH reactivity has been shown to be extremely useful. Unexplained reactivity of up to several tens per second was identified in biogenic-dominated environments such as in a forest in Michigan (Di Carlo et al., 2004; Hansen et al., 2014), in the Amazonian rainforest and in a boreal forest in Finland (Nölscher et al., 2012b). The magnitude of missing reactivity appears to be dependent on the biogenic source, time of the day and season (Nölscher et al., 2016). However, agreement between measured and calculated reactivity is also a valuable result, because it indicates that all trace gases that are relevant for the photochemistry were measured. This was the case in environments that were influenced by anthropogenic OH reactants as in New York and in the North China Plain (Fuchs et al., 2016), in isoprene-dominated environments during daytime in a Mediterranean forest (Zannoni et al., 2016) and in a chamber study (Nölscher et al., 2014). In addition, the gap between measured and calculated OH reactivity could be closed in some field studies if oxygenated VOCs (volatile organic compounds) derived from model calculations were additionally taken into account (e.g. Chatani et al., 2009; Lou et al., 2010; Kaiser et al., 2016; Whalley et al., 2016). First attempts were also made to measure OH reactivity fluxes (Nölscher et al., 2013).
The application of OH reactivity measurements for the analysis of the OH budget also provided new results. A gap in the understanding of OH recycling processes was found in a field study in Nashville in 1999, in China in 2006, in Borneo in 2008 (Whalley et al., 2011) and in chamber experiments investigating the oxidation of isoprene by OH. Because of the close connection between oxidation of organic compounds by OH and ozone production, OH reactivity can help to calculate local ozone production rates.
Several methods to measure OH reactivity have been developed since the first measurements were made by Penn State University (PSU) (Kovacs and Brune, 2001). The different methods fall into two categories. One method determines the OH reactivity directly from the time-dependent decay of measured OH that is artificially produced. The other method determines k_OH indirectly from the concentration change of a reference species, which competes with atmospheric reactants in their reaction with artificially produced OH.
In the instrument developed by Kovacs and Brune (2001), ambient air is drawn through a flow tube and the decay of OH is measured by direct detection of OH using laser-induced fluorescence. OH is continuously produced by water photolysis. The time-resolved OH decay is measured by varying the reaction time using a movable injector to produce OH. A compact aircraft instrument was later developed by PSU and deployed for the first time in 2006 (Mao et al., 2009). Similar instruments were built at Indiana University (Hansen et al., 2014) and at the University of Leeds (Ingham et al., 2009). The latter apparatus was recently replaced by an instrument applying a pump-probe technique (see below).
H. Fuchs et al.: OH reactivity comparison in SAPHIR 4025
In an alternative instrument a flow-tube set-up is combined with a chemical ionisation mass spectrometer (CIMS) which detects sulfuric acid (H₂SO₄) following the chemical conversion of OH to H₂SO₄ (Berresheim et al., 2000; Muller et al., 2017). In this instrument developed by the German Meteorological Service (DWD), OH is produced by water photolysis in the flow tube. The reaction of OH with ambient OH reactants is terminated by chemically removing OH by its reaction with sulfur dioxide, which is injected at two positions within the flow tube, giving one reaction time for the OH decay. The remaining OH concentrations for the two injection positions are measured to calculate the OH reactivity.
Sadanaga et al. (2004) developed an instrument that uses a pump-probe technique, called laser photolysis - laser-induced fluorescence (LP-LIF). OH is produced by ozone photolysis using short laser pulses at 266 nm at a low repetition rate of 1 to 2 Hz. The OH decay is observed by laser-induced fluorescence with a high time resolution. The pump-probe technique has the advantage that the flow conditions do not need to be exactly known in order to determine a reaction time. This technique is now used by several groups such as Tokyo Metropolitan University (Sadanaga et al., 2004), the University of Leeds, the University of Lille (Parker et al., 2011) and Forschungszentrum Jülich (FZJ) (Lou et al., 2010).
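Both direct approaches reduce to extracting a first-order loss rate from OH signals. The following is a minimal sketch in Python, assuming an ideal single-exponential decay without background signal, wall loss or instrument-specific corrections, none of which are specified here. The function names and the synthetic numbers are hypothetical and serve only as illustration: a two-point evaluation as in the flow-tube CIMS set-up, and a log-linear fit as could be applied to a time-resolved pump-probe decay.

```python
import math

def k_oh_two_point(oh_short, oh_long, delta_t):
    """OH reactivity (s^-1) from OH signals at two reaction times.

    Assumes a pure first-order decay; delta_t is the difference in
    reaction time (s) between the two measurements.
    """
    return math.log(oh_short / oh_long) / delta_t

def k_oh_from_decay(times, signals):
    """OH reactivity (s^-1) from a time-resolved decay.

    Least-squares straight line through ln(signal) versus time;
    the reactivity is the negative slope.
    """
    logs = [math.log(s) for s in signals]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx

# Synthetic, noise-free decay with a hypothetical reactivity of 20 s^-1.
times = [0.005 * i for i in range(1, 41)]
signals = [1000.0 * math.exp(-20.0 * t) for t in times]

k_fit = k_oh_from_decay(times, signals)
k_two = k_oh_two_point(signals[0], signals[-1], times[-1] - times[0])
```

With real data the two-point evaluation is more sensitive to noise and to any error in the reaction time, whereas the fit uses the full decay curve and does not require the absolute reaction time to be known.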
The indirect technique for the measurement of OH reactivity was pioneered by Sinha et al. (2008). The comparative reactivity method (CRM) is based on the detection of pyrrole that reacts with artificially produced OH in clean or ambient air. The pyrrole competes with the ambient OH reactants, so that the pyrrole concentration depends on the ambient OH reactivity. In most CRM instruments, pyrrole is detected by proton-transfer-reaction mass-spectrometry (PTR-MS) but can also be detected by gas chromatography (GC) (Nölscher et al., 2012a). The CRM is more commonly used than the direct OH measurement techniques because of the commercial availability of PTR-MS instruments. It is applied by the Max Planck Institute Mainz (MPI) (Sinha et al., 2008), IMT Lille Douai, formerly called Mines Douai (MDOUAI) (Michoud et al., 2015), Laboratoire des Sciences du Climat et de l'Environnement (LSCE) (Zannoni et al., 2015), Indian Institute of Science Education and Research Mohali (Kumar et al., 2014), the Finnish Meteorological Institute (Praplan et al., 2017), Peking University (Yang et al., 2017), the University of Leicester and University of California, Irvine (Kim et al., 2016).
Only two side-by-side comparisons have been performed so far: one between two CRM instruments in a remote environment (Zannoni et al., 2015) and one between a CRM and a pump-probe instrument in an urban environment. Both comparisons show generally good agreement between measurements, within 20 to 50 %.
In 2014 a workshop was held at the Max Planck Institute for Chemistry in Mainz in order to assess the current status and future of OH reactivity measurements (Williams and Brune, 2015). At the workshop, a comparison campaign was suggested to investigate the performance of instruments under different atmospheric chemical conditions. Large environmental chambers are ideal for this purpose, as they ensure that all instruments sample air with the same chemical composition. In addition, chemical conditions can be systematically varied. This was demonstrated in several comparison exercises in the atmospheric simulation chamber SAPHIR at Forschungszentrum Jülich (e.g. Schlosser et al., 2009; Dorn et al., 2013) as well as in the EUPHORE chamber (e.g. Pang et al., 2014). Here, we report the results of two k_OH comparison campaigns that were conducted in the SAPHIR chamber. The two comparisons were not blind: quick-look data were presented by some groups during the campaigns. After a first data submission made without knowledge of the final results from other participants or of OH reactant concentrations, data were allowed to be revised. Only final data are presented in this paper, but changes after the first data submission are described.
A large number of OH reactivity instruments applying different techniques were successfully used in these campaigns (CRM instruments of MPI, IMT Lille Douai and LSCE; a flow-tube LIF instrument from PSU; a CIMS instrument from DWD and LP-LIF instruments from Lille, Leeds and FZJ, Table 1). The CRM instrument from the Finnish Meteorological Institute was also used in the campaign, but measurements by this instrument failed due to technical problems and no valid data could be acquired.
2 Experiments in the SAPHIR chamber

The SAPHIR chamber
The outdoor atmospheric simulation chamber SAPHIR is made of a double-wall Teflon (FEP) film and has a cylindrical shape (5 m diameter, 18 m length). The Teflon chamber is mounted inside a steel frame that is equipped with a shutter system that allows for experiments in the dark or in the presence of sunlight. The space between the inner and outer Teflon film is continuously purged with nitrogen (Linde, purity > 99.9999 %) to prevent contamination from outside. In addition, the pressure inside the chamber is 45 Pa higher than ambient pressure. Small leakages and air sampling by instruments require the air to be replenished to maintain the pressure difference. This leads to a small dilution of trace gases by 3 to 5 % per hour. The dilution can be as high as 60 % per hour if the replenishment flow needs to be high.
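The effect of the dilution rates quoted above on an injected trace gas can be estimated with a short Python sketch. It assumes that the stated percentage per hour acts as a constant first-order loss rate and that no chemical loss occurs; both are simplifying assumptions made here, and the function name is hypothetical.

```python
import math

def remaining_fraction(dilution_per_hour, hours):
    """Fraction of an inert trace gas left after `hours` of dilution.

    `dilution_per_hour` is the fractional dilution rate (e.g. 0.04 for
    4 % per hour), treated here as a first-order loss rate.
    """
    return math.exp(-dilution_per_hour * hours)

# Typical operation (about 4 % per hour) versus high replenishment
# flow (60 % per hour), each over an 8 h experiment.
slow = remaining_fraction(0.04, 8.0)
fast = remaining_fraction(0.60, 8.0)
```

Under these assumptions roughly three quarters of an inert tracer remain after 8 h at 4 % per hour, whereas less than 1 % remains at 60 % per hour, which illustrates why reactants have to be re-injected when the replenishment flow is high.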
Ultra-pure air (Linde, purity > 99.9999 %) is used to purge the chamber with a high flow (up to 250 m³ h⁻¹) in order to clean the chamber. The high flow rate is also required to humidify the chamber air with steam from boiling water that is supplied by a Milli-Q water device. Ozone produced by a silent discharge ozoniser can be added to the chamber air. Two fans ensure that the air is well mixed, so that all instruments always sample the same air mass (e.g. Schlosser et al., 2009; Dorn et al., 2013). The use of high-purity air ensures that there are no measurable gaseous OH reactants present in the chamber after the purging procedure. Small amounts of mostly unidentified organic compounds and nitrogen oxide compounds like HONO (< 100 pptv) can be observed in some cases during the humidification. The total OH reactivity measured by the OH reactivity instrument that is permanently installed at the chamber shows that the reactivity is typically well below 1 s⁻¹ after humidification. In fact, instruments measured on average no significant OH reactivity in these campaigns in the clean chamber (see below).

Table 1 (only a fragment of the last row and the footnotes are preserved). Last row fragment: Berresheim et al. (2000); 2 s⁻¹ (30 to 40 s⁻¹). Footnotes: a Limit of detection/accuracy as stated by the operator. b October 2015. c Comparative reactivity measurement. d Faster flow of 1 L min⁻¹ in inlet line. e Produced continuously by water photolysis (185 nm radiation of a Pen-Ray lamp). f Derived from the difference in the C1 and C2 measurement. g April 2016. h Laser flash photolysis and laser-induced fluorescence. i Flow-tube and laser-induced fluorescence. j Limit of detection without the dilution, which amplifies this number by a factor of 5. k Peak value produced by flash ozone photolysis (266 nm of a quadrupled Nd:YAG laser). l Flow-tube and chemical-ionisation mass-spectrometry. m Sampling rate from the chamber.
If the chamber is exposed to sunlight, well-characterised sources for HONO, formaldehyde and acetaldehyde lead to an increase in OH reactivity (production rates are typically 200 pptv h⁻¹). Photolysis of HONO (Rohrer et al., 2005) is also the dominant source of OH and nitrogen oxides in the chamber. The source strength depends on the relative humidity, temperature and radiation. The overall increase in OH reactivity is of the order of 0.2 s⁻¹ per hour, and is much smaller than the reactivity from added OH reactants in these campaigns.
OH reactants were added either from gas mixtures via calibrated flow controllers or as liquids that were injected into a heated inlet line with a syringe. The vapours were transported by a flow of synthetic air into the chamber. In addition, a recently built plant chamber allows for the quantitative transfer of mixtures of biogenic organic compounds from up to six trees into the SAPHIR chamber (Hohaus et al., 2016). Environmental parameters in the plant chamber can be fully controlled.

Calculated OH reactivity
A number of instruments for the detection of OH reactants took measurements concurrently with the OH reactivity instruments (Table 2). Nitrogen oxides (NO and NO₂) were detected by a chemiluminescence instrument (Eco Physics TR 780). CO was measured using a cavity ring-down instrument (Picarro G2301) and by gas chromatography (GC, Trace Analytical RGA 3). Both measurements agreed within 5 %. Data from the cavity ring-down instrument were used here for calculations of the OH reactivity due to its higher accuracy. This instrument also measured methane and water vapour concentrations. Organic compounds were detected by PTR-TOF-MS (proton-transfer-reaction time-of-flight mass-spectrometry, Ionicon) and GC (Agilent 7890N). Measurements of species that could be detected by both instruments, such as isoprene, agreed to within 20 %, with some larger discrepancies for acetaldehyde and β-caryophyllene in the second set of experiments in 2016 (Table 2). Differences between measurements need to be regarded as additional uncertainties in the calculation of OH reactivity.
PTR-TOF-MS measures the sum of methyl vinyl ketone (MVK) and methacrolein (MACR) and the sum of monoterpenes. In order to calculate OH reactivity, PTR-TOF-MS measurements were used because of their high time resolution, and the sums were apportioned using the relative distributions of MVK and MACR and of the monoterpenes as measured by GC. Formaldehyde was additionally measured by a Hantzsch monitor (Aero Laser AL 4001). All reaction rate constants used for the calculation of OH reactivity are taken from IUPAC (International Union of Pure and Applied Chemistry) recommendations (Atkinson et al., 2004, 2006; IUPAC, 2017) if not stated differently in Table 2. Temperature and pressure are assumed to be the same in the instruments and the SAPHIR chamber. This assumption is supported by temperature and pressure measurements in the instruments. The overall 1σ uncertainty of the calculated OH reactivity is around 20 % in most experiments but can be higher (e.g. 40 % in the case of the experiment with sesquiterpenes) depending on the uncertainty in the OH reactant measurements, the agreement between simultaneous measurements by different instruments and the uncertainty in reaction rate constants.
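The calculation of OH reactivity from measured reactant concentrations can be sketched in Python as follows. This is a minimal illustration, not the evaluation code used for the campaigns: the function names are hypothetical, the rate coefficients are approximate room-temperature values inserted only for the example (they are not taken from Table 2 and carry no temperature dependence), and mixing ratios are converted to number densities with the ideal gas law.

```python
BOLTZMANN = 1.380649e-23  # J K^-1

def number_density(temperature_k, pressure_pa):
    """Number density of air in molecules cm^-3 (ideal gas law)."""
    return pressure_pa / (BOLTZMANN * temperature_k) * 1e-6

def calculated_k_oh(mixing_ratios_ppbv, rate_coefficients,
                    temperature_k=298.15, pressure_pa=101325.0):
    """Sum of k_OH+X * [X] over all reactants, in s^-1.

    mixing_ratios_ppbv: reactant name -> mixing ratio in ppbv
    rate_coefficients:  reactant name -> k_OH+X in cm^3 s^-1
    """
    n_air = number_density(temperature_k, pressure_pa)
    return sum(rate_coefficients[name] * ppbv * 1e-9 * n_air
               for name, ppbv in mixing_ratios_ppbv.items())

# Approximate, illustrative rate coefficients near 298 K (assumed values).
RATES = {"CO": 2.4e-13, "isoprene": 1.0e-10}

k_total = calculated_k_oh({"CO": 200.0, "isoprene": 2.0}, RATES)
```

In this example 2 ppbv of isoprene contributes about four times more reactivity than 200 ppbv of CO, which shows how strongly the result depends on accurate measurements of the fast-reacting organic compounds.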

Experiments performed in 2015
Two campaigns were conducted for this comparison. The first one took place in October 2015. All instruments listed in Table 1 were used in this campaign with the exception of the CIMS instrument.
The chamber was flushed with high-purity air before each experiment until trace gas concentrations were below the limit of detection of instruments (Table 2). The chamber air was humidified to approximately 75 % relative humidity (RH) at the beginning of each experiment, except for the experiment on 6 October 2015, when the experiment was started with 25 % RH and the humidity was increased to 90 % RH in three steps. Relative humidity typically dropped to 40 to 50 % during the experiment due to temperature changes and dilution. Ozone was also added at the beginning of the experiments to allow OH production in the LP-LIF instruments if ozone was not expected to affect the chemical composition of the chamber air (e.g. by ozonolysis reactions or by the conversion of NO to NO₂). Initial ozone mixing ratios were typically between 50 and 80 ppbv.
OH reactivity was typically increased in several steps to maximum values of approximately 50 s⁻¹ at the end of the experiment (maximum 150 s⁻¹). The time between two injections of trace gases was approximately 45 min. In addition, chemical conditions were changed during the course of some experiments, such as opening or closing the chamber roof or adding nitrogen oxides or water vapour. Chemical conditions in the different experiments are summarised in Table 3.
Some experiments aimed primarily to test the instruments' performances: linearity with CO (5 October 2015), the influence of humidity (6 October 2015) and the presence of NO (7 October 2015). The last test was repeated on 15 October. However, due to an operational error, ozone was added at the beginning of the experiment, so that a mixture of NO and NO₂ was present. In order to reduce the ozone concentration, 2,3-dimethyl-2-butene (TME), which reacts rapidly with ozone, was injected twice.
The other experiments focused on the instruments' performances in the presence of specific OH reactants and atmospheric mixtures of reactants. In part of these experiments, OH reactants were also oxidised by either OH or ozone.

Experiments performed in 2016
In the second campaign in 2016, only three instruments measured OH reactivity: a CRM instrument (MPI), an LP-LIF instrument (FZJS) and the CIMS instrument from DWD. The CRM and LP-LIF instruments were the same as in the 2015 campaign. The CIMS instrument sampled air with a high flow rate (2280 L min⁻¹), requiring the chamber to be operated with a high replenishment flow. As a consequence, all trace gases were diluted at a high rate of approximately 60 % per hour. Oxidation products could not accumulate. Accordingly, the experimental procedure was different in these experiments compared to those in 2015: humidification was done two to four times over the course of an experiment in order to maintain a sufficiently high water vapour concentration for the production of OH in the LP-LIF and CIMS instruments (typical range of relative humidity between 25 and 80 %). If ozone was present in the experiment, ozone was also injected several times. Initial ozone concentrations were around 100 ppbv and dropped to 20 ppbv before re-injection. Similar chemical conditions as in the first campaign were tested in order to achieve comparable results. Tests were done with single, anthropogenic OH reactants (CO, pentane) in the presence (

Data coverage
The CRM instrument from the Finnish Meteorological Institute (FMI) was used in the first campaign, but no valid measurements could be acquired due to technical problems with this instrument. Data were submitted for all other instruments for nearly all experiments and are included in the comparison. A leak in the OH injection system of the MDOUAI CRM instrument was found after the third experiment; this leak possibly led to systematic errors in the measurements in that experiment. Data from the experiment on 7 October 2015 were therefore rejected for this instrument. The sampling system of the CRM instrument from LSCE was changed on the second day of the campaign (6 October 2015). Measurements from this experiment were rejected. On 12 October 2015, the flow-tube instrument from PSU did not take measurements due to technical problems, except for the last 2 h. All other instruments took measurements at all times during the campaign.

Procedure of data comparison
The measurement comparison was not strictly blind, but some rules were applied to which all participants had agreed prior to the campaign. The general outline of the campaign was as follows:
- Before the official campaign started, a test experiment with CO was performed in the SAPHIR chamber. During this experiment, the participants were informed about the added CO concentrations in order to test the functionality of their instruments (4 October 2015, not included in the comparison, and 7 April 2016).
- During the campaign, participants were informed about the types of trace gases which were planned to be added to the chamber air before the experiments. Concentrations of reactants, however, were not disclosed to the participants.
- During the campaign, participants had the opportunity to present quick-look data of measured values at daily meetings, but data were not exchanged or distributed.
- After the campaign, all participants independently submitted their evaluated data to a neutral person at Forschungszentrum Jülich who was not involved in reactivity measurements. Only after all data had been received were the measured trace gas concentrations and the k_OH data of all participants made available.
- After data disclosure, some participants applied corrections to their data and submitted a revised data set together with an explanation for the correction.
- The comparison in this paper is based on the final data versions.
Changes of data that were made as a result of the comparison are described in the next section for each instrument.
3 Instruments for the detection of OH reactivity

Comparative reactivity method (CRM)
The comparative reactivity method (CRM) is an indirect method for the measurement of OH reactivity developed by Sinha et al. (2008). The measurement principle relies on the competition between the reactions of OH with a known pyrrole concentration and with ambient OH reactants. Pyrrole acts as a reference species that is typically not present in ambient air. A small flow of humidified, ultra-pure nitrogen (flow rate approximately 240 cm³ min⁻¹) passes over a Pen-Ray lamp, leading to formation of OH by water photolysis at 185 nm with concentrations of approximately 1 to 3 × 10¹² cm⁻³ (Table 1). Water photolysis, however, produces not only OH but also HO₂ radicals. The higher reactivity of OH compared to HO₂, also towards surfaces, may lead to HO₂ concentrations exceeding the concentration of OH in the reactor. Ambient OH reactants and/or pyrrole react with OH in a reaction volume (94 cm³) made of glass, with the inner surface covered by Teflon. The instrument is alternately switched between two measurement modes: the small flow of pyrrole (approximately 2 to 3 cm³ min⁻¹) is mixed either into a flow of purified air (approximately 300 cm³ min⁻¹; C2-mode) or into a flow of ambient air (air sampled from the chamber in these experiments; C3-mode). As OH exclusively reacts with pyrrole in the C2-mode, maximum reduction of the pyrrole concentration is achieved, whereas the pyrrole concentration is higher in the C3-mode, when ambient OH reactants are also present. In order to calculate the OH reactivity, the initial pyrrole concentration needs to be known (typically 1 to 2 × 10¹² cm⁻³). Because a small fraction of the radiation of the Pen-Ray lamp enters the reaction volume, a small fraction of the pyrrole is photolysed (typically less than 10 %). Therefore, the pyrrole concentration is measured when zero air is sampled and when the light of the Pen-Ray lamp is turned on (C1-mode). This is typically done once a day.
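The way the three pyrrole levels translate into a reactivity can be illustrated with the basic CRM expression given by Sinha et al. (2008), R = (C3 - C2) / (C1 - C3) × k_pyrrole × C1, which holds only under pseudo first-order conditions and before any of the corrections discussed in this section are applied. The Python sketch below is a hedged illustration: the function name is hypothetical, the pyrrole + OH rate coefficient of 1.2 × 10⁻¹⁰ cm³ s⁻¹ is an assumed literature-type value that is not stated in this text, and the example concentrations are invented.

```python
def crm_reactivity(c1, c2, c3, k_pyrrole=1.2e-10):
    """Uncorrected OH reactivity (s^-1) from CRM pyrrole levels.

    c1: pyrrole concentration without OH (cm^-3)
    c2: pyrrole concentration with OH in purified air (cm^-3)
    c3: pyrrole concentration with OH in sampled air (cm^-3)
    k_pyrrole: assumed pyrrole + OH rate coefficient (cm^3 s^-1)
    """
    if not (c2 <= c3 < c1):
        raise ValueError("expected C2 <= C3 < C1")
    return (c3 - c2) / (c1 - c3) * k_pyrrole * c1

# Invented example: C1 within the typical range quoted in the text.
r_air = crm_reactivity(c1=1.5e12, c2=0.9e12, c3=1.0e12)
```

Because the result scales with the ratio of two small differences, noise in the pyrrole signal and any systematic shift between the modes propagate strongly into the reactivity, which is consistent with the need for the careful corrections described for each instrument.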
The design of the reaction volume is identical for all instruments, because they were all manufactured by the Max Planck Institute for Chemistry in Mainz. Three CRM instruments are included in this comparison, by MPI, IMT Lille Douai (MDOUAI) and LSCE. The instruments differ mainly in the exact operational conditions such as flow rate, pyrrole and OH concentration and the inlet lines (Table 1). The transformation of raw data into k_OH values requires corrections that have been characterised for each CRM instrument (Table 4). These corrections, described below, can significantly differ between instruments due to the different operating conditions.
The pyrrole concentration is monitored by proton-transfer-reaction mass-spectrometry (PTR-MS) in nearly all instruments but can also be detected by GC (Nölscher et al., 2012a). This is done for the instrument from the Finnish Meteorological Institute.
A number of corrections need to be applied to the signals measured in the different modes due to a variety of factors:
- The OH production rate in the two measurement modes can be different if the water vapour concentration is not the same in both modes.
- OH can be significantly reformed by the reaction of HO₂ that is present at high concentrations in the reactor with ambient NO.
- The reaction deviates from pseudo first-order conditions.
- Ambient OH reactant concentrations are diluted due to the additional nitrogen flow. The dilution factor is calculated from measured flow rates.
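Of the corrections listed above, the dilution correction is the simplest to write down. The Python sketch below assumes that the reactivity determined in the reactor is scaled by the ratio of the total flow to the sampled flow; the function names are hypothetical, and the example flows are the approximate values quoted earlier in this section, with the ambient flow assumed equal to the purified-air flow.

```python
def dilution_factor(ambient_flow, nitrogen_flow, pyrrole_flow):
    """Ratio of total reactor flow to sampled ambient flow.

    All flows must be given in the same units (e.g. cm^3 min^-1).
    """
    total_flow = ambient_flow + nitrogen_flow + pyrrole_flow
    return total_flow / ambient_flow

def undiluted_reactivity(k_reactor, ambient_flow, nitrogen_flow, pyrrole_flow):
    """Scale the reactivity seen in the reactor back to sampled air."""
    return k_reactor * dilution_factor(ambient_flow, nitrogen_flow, pyrrole_flow)

# Approximate flows from the text: 300 (air), 240 (nitrogen), 2.5 (pyrrole).
factor = dilution_factor(300.0, 240.0, 2.5)
k_ambient = undiluted_reactivity(10.0, 300.0, 240.0, 2.5)
```

With these flows the factor is close to 1.8, so an error in any of the measured flow rates carries over almost proportionally into the final reactivity.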
Corrections are usually determined from instrumental characterisation in the laboratory and in the field, with the assumption that they are representative of ambient air measurements. Typical values of corrections are listed in Table 4.
All groups operating a CRM used empirical functions to correct for deviations from the pseudo first-order decay for the final data evaluation. The exact value, however, depends on the chemical composition of OH reactants (see below). Different representative mixtures are taken to characterise this correction for the various CRM instruments, and operating conditions are optimised to reduce the dependence of the correction on the chemical composition. The error associated with this correction can then be factored into the measurement uncertainty. Additional instrument-specific corrections are described in the following section.

Table 4. Correction applied to the raw data. Some corrections are non-linear and depend on several parameters (such as the pyrrole and OH concentrations). Values are given for typical atmospheric conditions. The instrument zero for the MDOUAI CRM was due to a contamination in the inlet system that was only present in this campaign. Corrections due to deviations from a pseudo first-order reaction assumption depend on the actual OH reactivity value and specific VOC (see text for details). Typical numbers are given for 10 and 60 s⁻¹. Interferences are present from NO, NO₂ and O₃ in some instruments. The correction often depends on the concentrations of the interfering species in a non-linear way. Therefore, only typical values can be given here.
Column headers (table body not preserved): Instrument; Instrument zero (s⁻¹); Humidity; Deviation from pseudo first-order; 10 ppbv NO; 10 ppbv NO₂; Dilution; Other.

MPI CRM instrument
The correction of measurements taken with the CRM by MPI for deviations from a pseudo first-order reaction uses results from numerical simulations. However, this can only be applied if the relative importance of the most abundant reactive compounds in the sampled air is known (Sinha et al., 2008). The model correction was not applicable in this comparison, because, in contrast to typical situations in field campaigns, the concentration ratios of the main OH reactants were not known. In this campaign, the empirical correction procedure, which was shown to be advantageous by Michoud et al. (2015), was therefore chosen as an alternative. Tests with isoprene, methanol, ethane, propane, propene and toluene were done to determine the correction factor.
In addition to the corrections applied by all groups operating a CRM instrument, measurements by the MPI CRM were corrected for the presence of ambient ozone. The necessity of this correction was recognised after the first comparison of results from the 2015 campaign. This correction was not applied in the first version of submitted data. The procedure to correct data was then determined in laboratory characterisation experiments. For the 2016 campaign, the correction was applied to the data from the beginning.
OH is reformed in the reaction of HO₂ with O₃ in the reaction volume of the CRM, where O₃ is present in sampled ambient air but is also produced in the photolysis of oxygen by the 185 nm radiation of the Pen-Ray lamp. It is usually assumed that the effect of OH reformation on the measurement is insensitive to the exact concentration of ambient ozone, which is not present in all modes of the measurement cycle. If this assumption does not hold, the OH reactivity is underestimated to an extent that depends on the ambient ozone concentration. This was observed for the MPI CRM instrument in this campaign. Therefore, measurements were corrected by an empirical function derived from laboratory measurements after the first data submission. Although the ozone concentration in the reaction volume was similar to the concentration in the other two CRM instruments, no ozone dependence was seen for the MDOUAI and LSCE instruments. The exact reason is not clear but might be related to different HO₂ concentrations in the instruments. The insensitivity of other CRM instruments to the ozone interference indicates that operating conditions exist for which the interference is negligible. Further investigations should be performed to characterise these conditions.
In addition to the ozone correction, errors in the calculation of the dilution factor and the calibration of the pyrrole sensitivity were noticed for the MPI CRM instrument after the first data submission in 2015. Although the corrections were made after OH reactant concentrations and measurements of other instruments were known, unreasonable values in the data had already suggested the need for these corrections. Also, the correction for the presence of NO 2 was characterised again for conditions in which O 3 was also present and was slightly changed in the final data. Furthermore, the correction for deviations from pseudo first-order decay was changed in the final data because a reanalysis of the concentration of test compounds used for the characterisation revealed higher impurities than certified by the manufacturer.
The total increase in OH reactivity measurements between the first and final submission was typically within the range of a factor of 1.5 to 3 but was a factor of 4 to 5 for low OH reactivities around 10 s −1 in the presence of ozone mixing ratios of 40 ppbv.

MDOUAI CRM instrument
The performance of the MDOUAI CRM instrument was worse in this campaign than previously observed due to additional sources of noise from the PTR-MS instrument and the inlet system. It was recognised that the pump (Teflon surfaces) in the sampling line upstream of the CRM instrument, which is necessary to avoid a pressure drop between ambient pressure and the CRM reactor, released contaminants which caused an additional OH reactivity of 15 s −1 on average during measurements. This instrument zero was subtracted from all measurements. The value was determined daily in each experiment between the humidification of chamber air and the injection of OH reactants. Deviations from pseudo first-order behaviour of the kinetics were characterised by tests with isoprene, propene, ethene, ethane and propane. Data were not revised after the first data submission.

LSCE CRM instrument
At the beginning of the campaign, a pressure change was observed for the two measurement modes of the CRM instrument at the exit of the reactor that could affect the measurements. The total flow rate in the sampling line was increased and only a small part was sampled into the reaction volume in order to avoid a change in pressure. Therefore, the sampling flow was not directly injected into the CRM reactor, but it was first pulled through a pump with Teflon surfaces. The flow was restricted by a valve (Teflon surfaces) before the air entered the reactor. This sampling flow system was used for the first time during this campaign and may have reduced the performance of the instrument.
Corrections were applied to the raw data as described by Michoud et al. (2015) to obtain the OH reactivity values. Specifically, the correction for the deviation from pseudo first-order conditions was determined from laboratory and field tests using gas standards with certified concentrations containing propane and isoprene. The same procedure was applied in a previous field campaign in an isoprene-dominated environment (Zannoni et al., 2016). Previous field deployments of the same instrument were conducted in environments with low NO x concentrations, for which a correction for OH recycling by NO was not needed. For this reason, a correction for high NO x concentrations was determined in laboratory tests after the campaign in SAPHIR, and the LSCE data were revised after the first submission for experiments in which NO x was injected.
In their first data evaluation, the instrument operators used the part of the experiments before OH reactants were added to subtract a background signal on days when the values measured in that period were on average positive (5, 7, 9 and 16 October 2015). However, this correction was not applied in the final data set, because it was agreed not to use knowledge of the chemical conditions for the data evaluation if it is not required.
Changes in the revised OH reactivity measurements were smaller than ±20 % except for measurements at high NO mixing ratios on 7 October 2015, when changes were up to 80 s −1 as no correction for the NO interference was applied in the first submitted data.

Direct OH loss rate measurement by laser-photolysis–laser-induced fluorescence (LP-LIF)
All other instruments measured the decay of OH in the presence of ambient OH reactants in a flow tube. In most of the instruments, OH radicals are detected by LIF (Heard and Pilling, 2003). All methods measuring the OH decay have a higher time resolution compared to the CRM instruments (Table 1), because no time is used up when switching between ambient and purified air. In general, fewer corrections are required to derive the OH reactivity from the measured OH decay.
Four LP-LIF instruments were used in the campaigns: instruments from University of Lille and University of Leeds and two instruments from FZJ, one of which is permanently installed at the SAPHIR chamber (FZJS) and the other of which is used for mobile field deployment (FZJM).
In the laser-photolysis instruments, ambient air passes (flow rate 10 to 20 L min −1 ) through a flow tube. Part or all of the air is drawn into an OH fluorescence detection cell. The exact position and design of the flow tube and the fluorescence cell differ among the instruments. OH is produced by flash photolysis of ozone with subsequent reaction of O( 1 D) with water vapour. Radiation is provided by a quadrupled Nd:YAG laser pulse at 266 nm, which is operated at a low repetition rate of 1 to 2 Hz. The OH concentration is measured with a high frequency of 3 to 8.5 kHz, so that the decay of OH can be observed with a high time resolution between two photolysis shots. Tens of consecutive decays are summed up to increase the signal to noise ratio.
The OH radicals decay in a pseudo first-order reaction with ambient OH reactants, so that the time-resolved OH signal can be fitted to a single-exponential function that directly gives the OH reactivity. Differences between the fitting procedures of the instruments are described in the Supplement. The accuracy of the time basis of the OH decay is only determined by the accuracy of the photon-counting electronics.
Measurements need to be corrected for an instrument zero that is subtracted from all measurements. This zero loss rate in the flow tube is due to the wall loss reactions and likely limited by the diffusion of OH. Values are typically of the order of a few s −1 (Table 4) and are regularly determined by sampling high-purity zero air.
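The evaluation described above (single-exponential fit followed by subtraction of the instrument zero) can be illustrated with a short Python sketch. It is not the fitting procedure of any of the participating instruments, which are described in the Supplement. As an assumption, it uses a simple log-linear least-squares fit, and the function names and the optional constant background are hypothetical.

```python
import math

def fit_single_exponential(times_s, signal, background=0.0):
    """Fit signal = A * exp(-k * t) + background by linear least squares
    on ln(signal - background). Returns (k in s^-1, amplitude A).
    Assumption: the background is known and constant."""
    pts = [(t, math.log(s - background))
           for t, s in zip(times_s, signal) if s > background]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    slope = sxy / sxx
    return -slope, math.exp(mean_l - slope * mean_t)

def oh_reactivity(k_fit, k_zero):
    """Subtract the instrument zero (wall loss) from the fitted decay rate."""
    return k_fit - k_zero
```

A log-linear fit weights noisy points at the end of the decay too strongly, so a weighted or non-linear fit would normally be preferred for real photon-counting data.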
Conversion of HO 2 to OH in the presence of ambient NO can influence the measurements. As there is no concurrent production of HO 2 in the ozone photolysis, LP-LIF instruments are less affected by this recycling process compared to instruments that use water photolysis for OH production. It is expected that this recycling process only becomes relevant for NO mixing ratios higher than 20 ppbv for typical atmospheric chemical compositions of ambient air (Lou et al., 2010). In this case, the single-exponential decay of OH turns into a bi-exponential decay that can clearly be identified in the summed decays. If a bi-exponential decay is observed, the faster decay time can be attributed to the OH reactivity. The underlying assumption is that the timescale of OH formation is slow compared to the OH loss. This is reasonable for typical atmospheric conditions but may not be applicable in all cases, specifically in artificial air mixtures. In field experiments, bi-exponential behaviour in the OH decay due to OH recycling at high NO concentrations in ambient air has been observed by the FZJS and Lille instruments. A bi-exponential function was applied to measurements in a campaign in China for the FZJS instrument (Lou et al., 2010). Measurements of the Lille instrument that showed bi-exponential behaviour during a campaign on the campus of the University of Lille were evaluated by applying a single-exponential function to only the first part of the decay curve, which contains the information on the faster decay. No significant difference between this procedure and the results from a bi-exponential fit was found. The fitting of the data using a single- or bi-exponential decay function is discussed later in the paper, as differences were observed in the returned value of the OH reactivity in this campaign at high NO x (> 20 ppbv) depending on the type of fit used.
In the normal operational procedure, no dilution or only a small dilution needs to be taken into account for most of the instruments. If there is insufficient ambient ozone to generate a measurable OH signal, a small flow containing O 3 needs to be added to the flow tube and a small dilution correction needs to be made. This was required in experiments without ozone in the chamber air.
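Why recycling produces a bi-exponential decay, and why the faster component approximates the OH reactivity, can be illustrated with a minimal two-species model. This is a sketch under assumptions that are not taken from the paper: OH is lost with rate k_oh, a fraction ho2_yield of that loss forms HO 2, and HO 2 is converted back to OH with a first-order rate k_recycle (proportional to the NO concentration). The two decay rates are the eigenvalues of this linear system.

```python
import math

def decay_rates(k_oh, k_recycle, ho2_yield):
    """Return (fast, slow) decay rates in s^-1 for the hypothetical model
        d[OH]/dt  = -k_oh * [OH] + k_recycle * [HO2]
        d[HO2]/dt =  ho2_yield * k_oh * [OH] - k_recycle * [HO2]
    All parameter names are illustrative assumptions."""
    total = k_oh + k_recycle
    disc = math.sqrt((k_oh - k_recycle) ** 2
                     + 4.0 * ho2_yield * k_oh * k_recycle)
    return 0.5 * (total + disc), 0.5 * (total - disc)
```

In this model the fast rate equals k_oh when k_recycle is zero and stays close to k_oh as long as k_recycle is much smaller than k_oh, which corresponds to the assumption stated above that OH formation is slow compared to OH loss.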
The number of data points on the decay curve that are above the noise level decreases with increasing OH reactivity, so that the accuracy and precision of the measurements start to decrease for exceptionally high OH reactivity values (for example higher than 60 s −1 for the FZJ LP-LIF instrument). In addition, initial inhomogeneities in the OH distribution in the flow tube due to inhomogeneities of the laser photolysis beam can impact the shape of the observed OH decay curve for these high reactivity values. For this reason, an additional dilution flow can be applied in order to reduce the OH reactant concentrations and improve the data quality. This was done in some instruments (FZJ, Lille) in this campaign, when the measured reactivity exceeded a threshold (e.g. > 150 s −1 for the Lille instrument) but is not required as indicated by measurements by the Leeds LP-LIF instrument.
Imperfect alignment of the photolysis laser can enhance the inhomogeneities in the initial OH distribution, so that deviations from a single-exponential OH decay can also occur at low reactivity values. This was observed in this campaign in the Lille and Leeds LP-LIF instruments but recognised only at the end of (Lille) or after (Leeds) the campaign. As a consequence, the evaluation procedures were changed for measurements in this campaign in order to account for this effect.
Data from FZJS and FZJM instruments were not revised after the first data submission and no instrument-specific description is required here. The Lille and Leeds instruments required a campaign-specific data evaluation that was applied before (Lille) or after (Leeds) the first data submission.

Lille LP-LIF instrument
Quick-look data presented from the LP-LIF instrument from Lille systematically deviated from measurements of the other instruments. The overestimation of approximately 30 % was confirmed by determining the reaction rate constant of the reaction between CO and OH in test experiments, in which a mixture of CO in synthetic air was sampled. This overestimation was due to two reasons: (1) misalignment of the photolysis laser leading to deviations from single-exponential behaviour of the OH decay curve, likely due to an inhomogeneous initial OH concentration (see above); (2) the procedure of analysing the decay by adapting the length of the decay curve used for the fit. The length is limited to 15 times the first estimate of the inverse reactivity in order to avoid noise from the background signal over longer periods of time. As a consequence, the fitted zero decay time appeared to change if the length of the curve used for the fit was shortened for zero-air measurements, as was done for the high reactivity values. This was then used to account for the deviations in the reactivity measurements by determining an artificial zero decay time as a function of the fit length. In the final data, this zero decay time, which depends on the fit length and therefore on the reactivity value, was subtracted from the measurements (Fig. S1 in the Supplement). With this method, correct reactivity values could be calculated for the laboratory test experiments with CO. The drawback is that the accuracy is lower for high reactivity values due to the decreasing number of points used for the fit.
This correction would not be needed for a good alignment of the photolysis laser. It is therefore only needed for the data evaluation of this campaign but could be used to deal with similar alignment problems in the future.
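The fit-length-dependent zero described for the Lille instrument can be sketched as follows. The limit of 15 times the inverse of the first reactivity estimate is taken from the text. The lookup table of zero values versus fit length, the linear interpolation and all function names are hypothetical, since the actual dependence is only shown in Fig. S1 of the Supplement.

```python
def fit_window_s(k_first_estimate, n_lifetimes=15.0):
    """Length of the decay curve used for the fit, in seconds."""
    return n_lifetimes / k_first_estimate

def interpolate_zero(fit_length_s, zero_table):
    """Linearly interpolate the zero decay rate (s^-1) for a fit length.
    zero_table: list of (fit_length_s, zero) pairs sorted by fit length,
    obtained from zero-air decays evaluated with shortened fit ranges.
    Values outside the table are clamped to the nearest entry."""
    if fit_length_s <= zero_table[0][0]:
        return zero_table[0][1]
    if fit_length_s >= zero_table[-1][0]:
        return zero_table[-1][1]
    for (l0, z0), (l1, z1) in zip(zero_table, zero_table[1:]):
        if l0 <= fit_length_s <= l1:
            return z0 + (z1 - z0) * (fit_length_s - l0) / (l1 - l0)

def corrected_reactivity(k_fit, k_first_estimate, zero_table):
    """Subtract the zero that belongs to the fit length actually used."""
    window = fit_window_s(k_first_estimate)
    return k_fit - interpolate_zero(window, zero_table)
```

Because the window shortens as the reactivity increases, fewer points enter the fit at high reactivity, which is the loss of accuracy noted above.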

Leeds LP-LIF instrument
Similar behaviour of the decay curves to that observed for the Lille LP-LIF instrument was recognised for the Leeds LP-LIF instrument after the campaign. In the decay, a fast component was followed by a slower component rather than single-exponential behaviour. This behaviour was also apparent during the zero decay measurements conducted with zero air for this campaign. As a consequence, the fit of the single-exponential function was started after the fast section of the decay curve for the data evaluation (fit range between 150 and 400 ms) for low reactivity values (k OH < 10 s −1 ). An accurate determination of the OH reactivity was more difficult for high reactivity values (k OH > 10 s −1 ), when values became similar to the fast component of the decay. A single-exponential function between 100 and 200 ms was fitted to the measured decay curve in this case.
Similar to the procedure that was applied to the data from the Lille LP-LIF instrument, zero-air measurements were evaluated using the same fit ranges as for evaluating low and high reactivity values. A decay rate of (2.3 ± 0.4) s −1 was obtained for the low reactivity case. This is close to the real loss rate of OH in the instrument without OH reactants (instrument zero). A higher value of (4.8 ± 0.6) s −1 was determined if the fit range was shifted to an earlier start, as is done for evaluating decays for high reactivity values. The respective value was subtracted as the instrument zero, depending on which fit range was used. The higher value acts as a correction for the overlap of the faster instrumental component and the OH decay due to chemical reactions. For decays taken on 7 and 15 October 2015, when NO was present, a fit range between 105 and 150 ms was chosen, giving an instrument zero of (5.1 ± 1.1) s −1 .
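The pairing of fit ranges and instrument zeros described for the Leeds instrument can be written as a small lookup. The fit ranges and zero values are the ones quoted in the text; the regime labels and function names are hypothetical, and the choice of regime is left to the caller.

```python
# Fit ranges (ms) and instrument zeros (s^-1) quoted in the text.
# The regime keys are illustrative labels, not names used by the operators.
LEEDS_FIT_SETTINGS = {
    "low": {"fit_range_ms": (150.0, 400.0), "zero": 2.3},         # k_OH < 10 s^-1
    "high": {"fit_range_ms": (100.0, 200.0), "zero": 4.8},        # k_OH > 10 s^-1
    "no_present": {"fit_range_ms": (105.0, 150.0), "zero": 5.1},  # 7 and 15 Oct 2015
}

def select_points(times_ms, signal, regime):
    """Keep only the part of the decay inside the fit range of the regime."""
    start, stop = LEEDS_FIT_SETTINGS[regime]["fit_range_ms"]
    return [(t, s) for t, s in zip(times_ms, signal) if start <= t <= stop]

def corrected_reactivity(k_fit, regime):
    """Subtract the instrument zero that belongs to the fit range used."""
    return k_fit - LEEDS_FIT_SETTINGS[regime]["zero"]
```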
The difference between the revised data, when this evaluation scheme was applied, and the initially submitted data is mainly due to the higher instrument zero that was subtracted for k OH > 10 s −1 , so that these values are 2.5 s −1 lower than before. Deviations of the OH decay from single-exponential behaviour for conditions without OH recycling in the instrument were not observed in other field campaigns in the past. This correction is specific for the data from this campaign.

Direct OH loss rate measurement by flow-tube technique with laser-induced fluorescence (PSU instrument)
The flow-tube LIF instrument from PSU also measures the decay of OH radicals. In contrast to LP-LIF instruments, OH is continuously produced by water photolysis at 185 nm in this instrument using a Pen-Ray lamp with concurrent HO 2 production as in the CRM instruments.
In the PSU instrument, the reaction time is varied by a movable injector, which is used to change the distance between OH injection and the point of OH detection (Kovacs and Brune, 2001;Mao et al., 2009). The reaction time is calculated from the velocity measured with a hot-wire anemometer and the known distance travelled for each position of the injector. Within each scan, more than 100 data points were used to calculate the decay. During normal operation in the field, the PSU instrument samples ambient air with a high flow rate (> 100 L min −1 ). This exceeds the flow rate which can be consumed during operation of the SAPHIR chamber; therefore the PSU instrument had to apply a high dilution flow in this campaign. Only 20 L min −1 were sampled from the chamber, to which 80 L min −1 of high-purity synthetic air provided by the SAPHIR air supply system was added. The dilution factor was determined from monitored flow rates and was verified in several tests during the campaign, in which the ratio of flows was varied. Using a dilution flow has two drawbacks. Firstly, the calculated OH reactivity is very sensitive to the exact ambient and dilution flows. Secondly, any error in the instrument zero decay due to wall loss or trace impurities in the dilution air is amplified by the ratio of the total flow to the sampled flow, in this case a factor of 5. Thus the typical limit of detection of 0.5 s −1 becomes 2.5 s −1 .
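The arithmetic behind the dilution correction can be made explicit in a few lines. This is a sketch with hypothetical function names; it assumes that the dilution air contributes no OH reactivity of its own, so that the reactivity of the sampled air scales with the total flow divided by the sampled flow.

```python
def dilution_factor(sample_flow, dilution_flow):
    """Total flow divided by the flow sampled from the chamber."""
    return (sample_flow + dilution_flow) / sample_flow

def undiluted_reactivity(k_measured, k_zero, factor):
    """Scale the zero-corrected decay rate back to undiluted chamber air.
    Any error in k_zero is multiplied by the same factor."""
    return (k_measured - k_zero) * factor

factor = dilution_factor(20.0, 80.0)   # flows in L min^-1, as in the text
lod_diluted = 0.5 * factor             # limit of detection in s^-1
```

With the flows given in the text, `factor` is 5 and the limit of detection of 0.5 s −1 scales to 2.5 s −1 .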
As for the CRM instruments, measurements by the PSU instrument can be affected by OH recycling from the reaction of ambient NO with HO 2 , which is concurrently produced with OH by water photolysis. The correction of OH recycling in the PSU instrument is based on correcting each point in the decays for the recycling calculated from measured NO and HO 2 before applying the fit to determine the OH reactivity (Shirley et al., 2006).
Changes made after the first data submission in the data by the PSU instrument were mostly smaller than ±10 %. These small changes were due to improvements in the data evaluation algorithms that were made between the first and final submissions. These included improvements in how data from the instruments used for the corrections were synchronised with the OH reactivity measurements, and a refinement of instrument parameters such as air velocity and location of the injector.
In addition, the change in the correction procedure for OH regeneration due to the reaction of HO 2 with NO led to the final data being 2.5 times higher than the first data submission at the highest NO mixing ratios on 7 October 2015. Initially a new optimisation fitting procedure was developed and used for the first data submission, but laboratory and modelling studies showed that the method in Shirley et al. (2006) was superior and less uncertain. Thus, the method in Shirley et al. (2006) was used for the revised final data submission.
These changes were specific for this campaign because the instrument was not exactly the same instrument as used in previous and future campaigns. It was assembled from parts of the original PSU instrument and parts (mainly the laser system for the OH detection) provided by the Max Planck Institute for Chemistry in Mainz and the University of California, Berkeley.

Direct OH loss rate measurement by flow-tube technique with chemical ionisation mass spectrometry (DWD instrument)
The measurement scheme of the CIMS instrument by DWD is similar to that of the flow-tube LIF instrument by PSU. However, only one reaction time is currently realised to measure the OH decay (Berresheim et al., 2000;Muller et al., 2017). Excess OH (10 8 cm −3 ) is produced by water photolysis in front of the flow tube with concurrent production of HO 2 . The reduction of its concentration by reacting with ambient OH reactants is measured at two set time periods. This is achieved by terminating this reaction by chemical conversion of OH after a specific reaction time. For this purpose, a high concentration of sulfur dioxide is added at two injection points, so that OH is converted to sulfuric acid. After the OH titration, a high concentration of propane is injected to scavenge any OH present. The injection of sulfur dioxide is alternately switched between these two points. The reaction time is determined by adding known amounts of OH reactivity (e.g. propane) in front of the flow tube. OH wall losses from the flow tube are quantified by using humidified synthetic air. If the OH lifetime in the instrument is of the order of the travel time between the two injection points, no reasonable measurement is possible. In the current set-up, an upper limit of OH reactivity values of 40 s −1 is achieved. Additionally, measurements by the CIMS instrument can also be affected by OH recycling from the reaction of ambient NO with HO 2 that is concurrently produced with OH by water photolysis. Corrections for OH recycling in the CIMS are based on laboratory characterisation at the Hohenpeissenberg station (ambient pressure ∼ 900 hPa). An empirical function corrects for the systematic underestimation seen in CIMS OH reactivity measurements, which is dependent on both the magnitude of OH reactivity and the levels of NO present. The function has been derived for propane, isoprene and ethene for NO concentrations up to 15 ppbv (Muller et al., 2017). Under the assumption that any complex mixture in the SAPHIR chamber behaves like the mixture of these three OH reactants, the NO correction was applied to the SAPHIR campaign data set for k OH larger than 2.5 s −1 . The fit function optimised for OH reactivity up to 40 s −1 and NO ranging from 0 to 15 ppbv leads to a systematic overestimation of OH reactivities below 2.5 s −1 (Fig. S3), not representing laboratory observations. Therefore no correction is applied to k OH < 2.5 s −1 . The OH recycling efficiency is partly dependent on the reaction time between the two injection zones. As the NO correction was determined at the laboratory at Hohenpeissenberg Observatory at a pressure of 900 hPa, an uncertainty of +10 % exists for its application at the SAPHIR chamber, as a result of lower flow rates (i.e. longer reaction time in CIMS) at 1000 hPa.
In addition to ambient NO, the CIMS measurements were influenced by an NO impurity in the SO 2 cylinder, leading to the presence of 0.14 ppbv NO in the CIMS flow tube at all times in this campaign. The presence of the NO impurity became evident from the inspection of the CO experiments (7 and 15 April 2016) where a systematic, repeatable underestimation in OH reactivity was found for reactivities above 20 s −1 . Therefore, an NO correction function was applied to the whole data set, also for experiments without NO in the chamber, e.g. in experiments with monoterpenes (13 April 2016) and sesquiterpenes (15 April 2016).
The DWD CIMS instrument is a relatively new instrument that had only been used in a remote environment at the monitoring station at Hohenpeißenberg. Therefore, the correction procedures were developed for the chemical conditions experienced in this campaign and were further refined after the first data submission. They would also be required if the instrument took measurements in similar environments.
The wall loss of OH in the instrument and the time in which the air travelled between the two titration points were initially determined from zero-air phases of the experiments on each day. In order to provide data which are fully independent from the experiments, measurements were revised after data from the other instruments were known. The parameters were then determined only once before the start of the campaign, using an external flow tube with propane in synthetic air. This resulted not only in a constant change in the data due to the change in the zero decay time (wall loss) but also in a scaling of the data due to the change in the calculated reaction time (Table S1 in the Supplement). Final data are on average 10 % lower than initially submitted.
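How the two parameters enter the result can be seen from a minimal two-point evaluation. This is a sketch, not the DWD evaluation code: it assumes a pseudo first-order OH decay between the two titration points, and the function and parameter names are hypothetical.

```python
import math

def two_point_reactivity(oh_first, oh_second, reaction_time_s, k_wall):
    """OH reactivity (s^-1) from the OH signals at the first and second
    titration point, the travel time between them and the wall loss rate.
    Assumption: OH decays exponentially between the two points."""
    k_total = math.log(oh_first / oh_second) / reaction_time_s
    return k_total - k_wall
```

A change in `k_wall` shifts all values by a constant, whereas a change in `reaction_time_s` rescales the total loss rate, which matches the two kinds of change described for the revised data.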

Results and discussion
A summary of OH reactivity measurements of all instruments together with calculated OH reactivity is shown in Figs. 1 to 4 and the results are discussed in detail in the following subsections. For comparing data, the calculated reactivity is taken as the reference value if no oxidation products were formed during the experiment. In all other cases, one of the LP-LIF instruments (FZJS) is taken as reference. This instrument was chosen because its measurements have a high precision and time resolution. Regression lines were determined using the fitexy procedure by Press et al. (1992). This method takes into account the measurement errors of both instruments and is symmetric, i.e. the fitted parameters are independent of whichever of the two instruments is assigned to be the dependent or independent variable.
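The idea behind a straight-line fit with errors in both coordinates can be sketched as follows. This is a simplified stand-in, not the fitexy routine of Press et al. (1992): it minimises the same kind of merit function, the sum of (y − a − b x)² / (σ_y² + b² σ_x²), but uses a coarse scan over the slope angle followed by a golden-section refinement. All function names and search settings are assumptions.

```python
import math

def chi2_for_slope(b, x, y, sx, sy):
    """Return (chi2, intercept) for slope b; the optimal intercept for a
    fixed slope is a weighted mean of y - b*x."""
    w = [1.0 / (syi ** 2 + (b * sxi) ** 2) for sxi, syi in zip(sx, sy)]
    a = sum(wi * (yi - b * xi) for wi, xi, yi in zip(w, x, y)) / sum(w)
    chi2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return chi2, a

def fit_line_xy_errors(x, y, sx, sy, n_grid=2000, n_refine=100):
    """Return (slope, intercept) minimising chi2 with errors in x and y."""
    def f(theta):
        return chi2_for_slope(math.tan(theta), x, y, sx, sy)[0]
    # Coarse scan over the slope angle, excluding exactly vertical lines.
    step = math.pi / (n_grid + 1)
    thetas = [-math.pi / 2 + (i + 1) * step for i in range(n_grid)]
    best = min(thetas, key=f)
    lo, hi = best - step, best + step
    # Golden-section refinement inside the bracketing interval.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    b = math.tan(0.5 * (lo + hi))
    return b, chi2_for_slope(b, x, y, sx, sy)[1]
```

Because the merit function is unchanged when the roles of x and y are exchanged, fitting y against x and x against y gives reciprocal slopes, which is the symmetry property mentioned above.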

OH reactivity measurements with zero air
Ultra-pure air was present in the chamber at the beginning of each experiment. As discussed above, it can be assumed that there was no OH reactivity present in this case. For normal operation of the LIF instruments, ozone and water vapour need to be present. A small contamination from OH reactants could appear during the humidification process of the chamber air. Measurements from previous experiments in the chamber indicate that OH reactivity introduced together with water is most often below the limit of detection of the reactivity instrument (approximately 0.2 s −1 , e.g. Fuchs et al., 2013) but always less than 1 s −1 . This is likely due to either contaminants in the water or contaminants coming off the Teflon film of the chamber with increasing humidity. Therefore, these periods are ideal for testing the instrument zeros and the precision of the measurements.
If an instrument zero needed to be taken into account, it was determined independently of the zero-air phases of the experiments for all instruments except the MDOUAI CRM instrument. The instrument zeros were typically measured on a daily basis. No systematic change in the value was observed over the course of the campaign for these instruments. The instrument zero of the Leeds LP-LIF instrument was determined only once at the end of the campaign and the zero of the CIMS instrument once at the beginning of the campaign. The same air supply as for the chamber was used in these experiments, except for the CIMS, for which bottled air was used (Linde, purity 99.999 %). The derived values were used to correct all data. No instrument zero is expected for the CRM instruments (except for the contamination in the MDOUAI instrument), because only differences between measurement modes are used to calculate the OH reactivity. Figure 5 shows the histogram of measurements during all zero-air parts in the two campaigns. A Gaussian function is fitted to the distribution in order to determine a potential bias in measurements and to estimate the precision of the measurements (Table 5). Overall, the distributions of zero measurements have a Gaussian shape. If all data are put together, none of the instruments exhibit a significant bias. Some exceptions are observed for specific experiments for some instruments. This result also demonstrates that no significant OH reactivity was present during these parts of the experiments. Partly due to the small number of data points, the distribution is noisier for the measurements by the CRM instruments compared to the distribution for the LP-LIF instruments. No bias of the MDOUAI instrument can be determined because of the use of zero-air phases of experiments to determine an instrument zero (see above). The bias in the other two CRM instruments varies between experiments (Fig. 1): the day-to-day variability is between ±3 and ±5 s −1 with maximum values of ±10 s −1 .
A small bias is also observed in a few experiments for measurements of the Leeds LP-LIF instrument (smaller maximum value around 2 s −1 ). A small positive bias of approximately 2 s −1 is also seen in the PSU measurements after 13 October 2015 for unknown reasons. However, this change is likely affected by the amplification of zero variability and errors due to the dilution procedure that was used.
The width of the distribution can only be regarded as an upper limit for the precision because of the deviation of zero measurements from a Gaussian distribution for some instruments (Table 5). Alternatively, the width of the distribution was calculated for a distribution of data after subtraction of the bias observed for each instrument for each individual experiment. The width of this distribution gives a more realistic estimate of the precision of the measurements (Fig. S5). The widths of the corrected distributions for the CRM instruments give a precision of approximately 2 s −1 at a time resolution of 10 (LSCE-CRM, MDOUAI-CRM) or 15 min (MPI-CRM), slightly higher than the stated limits of detection of 1 to 1.5 s −1 (Table 1). The widths of the distributions give a precision between 0.1 and 0.3 s −1 for LP-LIF instruments at time resolutions between 30 and 160 s for the different instruments, and a precision of 0.4 s −1 for the CIMS (60 to 300 s time resolution) in agreement with their stated limits of detection (Table 1). The PSU flow-tube instrument gives a precision of 0.9 s −1 . The precision in Table 1 of 0.5 s −1 is stated for normal operation of the instrument without the dilution and becomes 2.5 s −1 when corrected for the dilution amplification.
Table 5. Fit results of the distribution of the measurements to a Gaussian function when no OH reactants were present in the chamber. The distribution is either calculated by taking all data as they are measured or by forcing the mean values on each day to zero for an individual instrument. For the MDOUAI instrument, no independent instrument zero was determined.
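The two ways of estimating the width of the zero-air distribution can be compared with a short sketch. As a simplification, it uses the sample mean and standard deviation in place of the Gaussian fit to the histogram used in the paper, and the function names and the per-day grouping of the input are assumptions.

```python
import statistics

def bias_and_width(values):
    """Mean (bias) and standard deviation (width) of all zero-air data."""
    return statistics.fmean(values), statistics.stdev(values)

def width_after_daily_debias(values_by_day):
    """Width of the distribution after forcing the mean of each day to zero.
    values_by_day maps a day label to the zero-air measurements of that day."""
    residuals = []
    for day_values in values_by_day.values():
        day_mean = statistics.fmean(day_values)
        residuals.extend(v - day_mean for v in day_values)
    return statistics.stdev(residuals)
```

If the bias changes from day to day, the width of the pooled data contains this variability, so the de-biased width is the smaller of the two and is closer to the measurement precision.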

OH reactivity measurements in the presence of CO and CH 4
During several experiments, only CO and CH 4 were present in the chamber for the entire experiment or part of the experiment. These experiments were performed in the dark, so that there was no photochemistry. The linearity of instruments and behaviour for a chemically simple system can be investigated from these experiments. As can be seen in Figs. 1 and 3, measurements of all instruments followed the expected changes (expressed by k calc ) due to the additions of CO and CH 4 . In the tested range of up to 150 s −1 , the agreement is mostly very good. Measurements by the DWD instrument show a clear upper limit of measurable reactivity of 40 s −1 as expected from the measurement principle (see above). Some instruments exhibited large transient deviations from the expected values (e.g. MPI on 6 October 2015 between 10:00 and 11:00 UTC) but otherwise agree well during the CO and CH 4 experiments.
Although linearity appears not to be a problem for all instruments, even for exceptionally high reactivity values (Fig. S6 and Table S2), the discussion of the results focuses on OH reactivity values below 60 s −1 , which are more relevant for atmospheric measurements. Figure 6 shows the correlation of measured and calculated OH reactivities and Table 6 gives the result of a regression analysis for all periods when only CO and/or CH 4 were present in the dark chamber. High linear correlation coefficients (R 2 > 0.8) are calculated for all instruments. A linear regression analysis gives slopes between 0.98 and 1.17 for most instruments. Errors of the fitted slopes were always smaller than 0.01, because the precision of the data is higher than the scatter of data around the regression line. These results demonstrate the ability of instruments to measure the correct reactivity values. Only the regression analysis for one CRM instrument (LSCE) gives a higher slope of 1.31, mostly due to measurements during the first experiment, whereas better agreement (Table 6) is achieved in other experiments. The larger deviation for this instrument is likely due to the correction for deviations from pseudo first-order behaviour (Table 4). This was determined from characterisation measurements with a mixture of isoprene and propane, which might not represent chemical conditions well with only CO.
Although the slopes of the regression lines indicate on average a good agreement of the measurements for these chemical conditions, the scatter in the correlation plots (Fig. 6) is considerably different for the instruments. The time series in Figs. 1 and 3 show that the scatter in the correlation plot is caused by statistical noise, and for some instruments by irregular systematic deviations pointing to instrumental instabilities. The mean of the relative absolute difference between measured and calculated OH reactivity is 32 to 48 % for CRM instruments, 19 % for the PSU LIF instrument and between 8 and 11 % for LP-LIF and the CIMS instruments. If the PSU instrument was operated similarly to how it was in the field without the large dilution flow, measurements would have scattered significantly less (at least a factor of 5). Thus, for the instruments as configured for this comparison study, the PSU LIF and LP-LIF instruments appear to have the highest measurement precision.
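The scatter metric quoted above, the mean of the relative absolute difference between measured and calculated OH reactivity, can be computed as in the following sketch. The function name is hypothetical, and skipping points with zero calculated reactivity is an assumption made to avoid division by zero.

```python
def mean_relative_abs_difference(measured, calculated):
    """Mean of |k_meas - k_calc| / k_calc over all paired data points.
    Points with k_calc <= 0 are skipped (assumption for this sketch)."""
    ratios = [abs(m - c) / c
              for m, c in zip(measured, calculated) if c > 0]
    return sum(ratios) / len(ratios)
```

For example, measured values of 11, 18 and 33 s −1 against calculated values of 10, 20 and 30 s −1 each differ by 10 %, so the metric is 0.1.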

OH reactivity measurements in the presence of isoprene, MVK, MACR and OH reactants in urban environments
In a second set of experiments, chemical conditions included volatile organic compounds, NO 2 and CO (Table 3). The most abundant biogenic species, isoprene, and OH reactants that are representative of alkenes and aromatic compounds found in urban environments (1-pentene, o-xylene, toluene) were tested. Oxygenated VOCs from isoprene oxidation (MVK and MACR) and acetaldehyde were present in separate experiments in 2015 (Fig. 8). In 2016, these species were present in experiments together with isoprene and the urban OH reactant mixture. Similar results are obtained for isoprene and urban OH reactants. Because these experiments partly included oxidation products that were not measured by instrumentation at the chamber, measurements by the LP-LIF FZJS instrument are taken as the reference value. Measurements of this instrument differ by less than 10 % from calculations using measured OH reactant concentrations. This difference is smaller than the 1σ accuracy of the calculation, so that results would not significantly differ if calculated OH reactivity was used.

[Figure caption fragment] ... (Table 6). The grey area indicates the mean relative difference between measurements and the regression line. Measurements of the MDOUAI instrument during the first three experiments (5-7 October 2015) have a higher uncertainty due to technical problems.

Table 6. Results of the correlation analysis (linear correlation coefficient R 2 and slope and intercept of a weighted linear fit) for different subsets of the data. Errors of fit results (not shown here) are not significant within two digits of the fit parameters. |Δk|/k fit gives the mean value of the relative difference between measurements and the regression line.
For most instruments (except for the LSCE CRM and DWD CIMS instruments), the agreement between measurements found for these chemical conditions is about as good as for the experiments with only CO and CH 4 (Fig. 7). High linear correlation coefficients between measured and calculated reactivity values are obtained (R 2 > 0.80) and slopes of the regression lines are between 0.94 and 1.07, showing good absolute agreement (Table 6).
The performance of the LSCE instrument is better than in the experiments with only CO and CH 4, but measurements are lower than the reference in this case, whereas measurements are higher in the CO and CH 4 case. As discussed above, the correction for deviation from pseudo first-order kinetics (which is based on a characterisation with propane and isoprene standards) might better represent chemical conditions during the experiment with alkenes, aromatics and isoprene compared to the CO and CH 4 case. In general, this issue can cause a variability in the agreement between measured and calculated reactivity in this campaign. This indicates that a more intensive characterisation of this correction is required for the specific chemical conditions, specifically if individual OH reactants are studied.

Measurements by the DWD CIMS instrument give larger deviations from calculated reactivity in these experiments compared to results found in the CO and CH 4 case. The experiments on 9 and 11 April 2016 started with high reactivities of about 40 s −1, but only 60 % is measured by the DWD CIMS instrument (Fig. 3). The agreement improves when k OH decreases. For values below 10 s −1, the measurements agreed well with calculated reactivities. The exact behaviour of the relationship between measured and calculated reactivity changed for periods of the experiments with different chemical conditions. In addition, an increase in the measured reactivity with increasing water vapour concentration after starting humidification of the chamber air is observed in these experiments. This is less obvious in other experiments (see below). Part of this large discrepancy could be the result of an instrumental instability, which was seen as an intermittent increase in noise in the CIMS reactant ion counts (NO3−) from 9 April 2016 onwards and coincided with the periods deviating from the FZJS instrument observations (see Supplement). This could be relevant because the reactant ion count is used to normalise the HSO4− counts, thus obtaining the equivalent OH concentration. At high OH reactivities, when OH signals are smaller, higher noise in the (comparatively) large reactant ion concentrations could thus affect the resultant k OH estimation. The exact reason why there was an increase in noise in reactant ion concentrations remains unclear. Additionally, these two experiments have in common the illumination of the chamber by sunlight and presence of NO 2 in the second part of the experiments. Interestingly, measured and calculated reactivity agree better in these parts of the experiment compared to the first parts.
However, there is no obvious reason why these conditions would impact the measurements of this instrument. Chemistry occurring in the inlet system may impact the OH concentration for the more complex chemical composition of air. In the presence of NO (see below), any unaccounted OH recycling would affect the accuracy of the measurements.
In the 2016 experiments, the relationship between measured and calculated reactivities does not change when oxygenated VOCs (MVK, MACR and acetaldehyde) were present compared to the part of the experiments when only the parent VOC was present. This can be seen in the time series of experiments on 9 and 11 April 2016 (Fig. 3). In the 2015 campaign, the impact of the presence of these compounds on the instruments was tested in a separate experiment (16 October 2015, Fig. 2). Because the compounds were consecutively injected, only the observed change in the measured OH reactivity for each instrument is calculated for the analysis (median of 20 min of measurements before and after the injection). This value can be compared to the expected change in the reactivity that is calculated from measured reactant concentrations (Fig. 8).
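The step analysis described above (median of 20 min of measurements before and after an injection) can be sketched as follows. This is a minimal illustration, assuming time stamps in minutes and a single injection time; the function name, the handling of the injection point itself and the example numbers are hypothetical, and a real evaluation may additionally exclude the mixing period directly after the injection.

```python
from statistics import median


def reactivity_step(times_min, k_oh, t_inject_min, window_min=20.0):
    """Change in OH reactivity across an injection.

    Returns median(k_oh after) - median(k_oh before), each taken over
    `window_min` minutes on either side of the injection time.
    """
    before = [k for t, k in zip(times_min, k_oh)
              if t_inject_min - window_min <= t < t_inject_min]
    after = [k for t, k in zip(times_min, k_oh)
             if t_inject_min < t <= t_inject_min + window_min]
    if not before or not after:
        raise ValueError("no data in one of the averaging windows")
    return median(after) - median(before)


# Hypothetical example: injection at t = 20 min raises k_OH by about 5 s^-1.
times = [0, 5, 10, 15, 25, 30, 35, 40]
k_meas = [10.0, 10.2, 9.8, 10.0, 15.0, 15.4, 14.8, 15.0]
print(reactivity_step(times, k_meas, t_inject_min=20))
```

Using the median rather than the mean keeps single outliers within a window from distorting the step, and comparing steps rather than absolute values removes any offset left by reactants injected earlier in the experiment.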
Measurements of LP-LIF instruments are not affected by these species and agree with the change in calculated reactivity. Also, the flow-tube LIF instrument by PSU and the CRM by MPI give similar values within 10 to 20 %. The change in reactivity measured by the LSCE instrument agrees in the case of MVK and MACR but is smaller in the case of acetaldehyde. Changes measured by the MDOUAI CRM instrument are up to a factor of 3 lower than observed by the other instruments. Losses on surfaces in the inlet system may explain part or all of the discrepancy. Both the LSCE and MDOUAI instruments used an additional pump with Teflon surfaces in their inlet system (see also the discussion for monoterpenes/sesquiterpenes).
The largest differences are seen in the presence of acetaldehyde for the MDOUAI CRM instrument. In addition, measurements by the LSCE and MDOUAI instruments are more variable compared to those by the other instruments, as indicated by the large difference between the 25th and 75th percentile values in this case. The presence of oxygenated VOCs may cause additional complications in the reaction system in the CRM that impact the OH concentration. The oxidation of aldehyde species by OH proceeds by H-atom abstraction from the aldehydic group, leading to the formation of acyl peroxy radicals, RC(O)O 2. For instance, the oxidation of acetaldehyde will lead to the formation of the acetyl peroxy radical, CH 3 C(O)O 2, with a yield of approximately 95 % (Cameron et al., 2002; Butkovskaya et al., 2004). The reaction of acyl peroxy radicals with HO 2 is known to efficiently recycle OH in the atmosphere. For the acetyl peroxy radical, Dillon and Crowley (2008) and Winiberg et al. (2016) recently reported an OH yield of 0.5. Also, one reaction pathway in the reaction of MVK with OH forms an acyl peroxy radical that leads to OH reformation in the reaction with HO 2 with a high yield of 0.64 (Praske et al., 2015). These recycling mechanisms can act as a secondary source of OH in CRM instruments, which in turn can mask a fraction of the OH reactivity from aldehyde species for these instruments, leading to a negative bias. Results of model calculations and laboratory investigations (Fig. S4) performed for the MDOUAI instrument confirm that the OH reactivity of acetaldehyde is underestimated by this instrument, which is consistent with observations during the 16 October 2015 experiment (Fig. 2), when acetaldehyde was first introduced in SAPHIR. However, Fig. 2 shows that the two other CRM instruments (LSCE and MPI) are less (or not) impacted by OH recycling from CH 3 C(O)O 2 + HO 2. These different behaviours are not well understood and need more investigation.
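The recycling route for acetaldehyde described above can be written out as a reaction sequence. The branching of about 95 % and the OH yield of about 0.5 are the values quoted in the text; the acetyl radical intermediate and the co-products are standard acetaldehyde oxidation chemistry and are added here for illustration only.

```latex
\begin{align*}
\mathrm{CH_3CHO + OH} &\rightarrow \mathrm{CH_3CO + H_2O} && (\approx 95\,\%)\\
\mathrm{CH_3CO + O_2 + M} &\rightarrow \mathrm{CH_3C(O)O_2 + M}\\
\mathrm{CH_3C(O)O_2 + HO_2} &\rightarrow \mathrm{CH_3C(O)O + O_2 + OH} && (\text{OH yield} \approx 0.5)
\end{align*}
```

Because the last step returns OH to the reactor, part of the OH consumed by acetaldehyde is replaced, so the apparent reactivity is lower than the true value.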
It is noteworthy that concentrations of acetaldehyde and other aldehydes in the atmosphere are typically smaller than in the experiment in this campaign but can constitute a significant fraction (10 to 20 %) of the total reactivity (e.g. Fuchs et al., 2017). The maximum error that is caused by the underestimation of the total reactivity measurement by the CRM instrument would be less than 13 % if the results of the experiment of this campaign are extrapolated to atmospheric conditions.
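One way to see how a bound of this size can arise is sketched below. It assumes, as an illustration only, that aldehydes contribute a fraction f_ald of the total reactivity and that the instrument detects only one third of the aldehyde reactivity, corresponding to the factor of 3 quoted above for the MDOUAI CRM. This reading is not stated explicitly in the text, and the symbols are hypothetical.

```latex
\[
\frac{\Delta k_{\mathrm{OH}}}{k_{\mathrm{OH}}}
  \approx f_{\mathrm{ald}} \left(1 - \tfrac{1}{3}\right)
  = 0.20 \times 0.67 \approx 0.13
  \quad \text{for } f_{\mathrm{ald}} = 0.20
\]
```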

OH reactivity measurements in the presence of monoterpenes and sesquiterpenes
The third type of chemical condition tested in the campaigns was the presence of terpenes. This was done by injecting a mixture of monoterpenes or a sesquiterpene, or by flushing real plant emissions into the SAPHIR chamber. These experiments also included ozonolysis reactions of terpenes. Maximum reactivities (25 s −1) were lower than in other experiments in 2015. Oxidation products of the ozonolysis reactions were not measured, so that calculated reactivities are expected to underestimate the real reactivity. Therefore, one of the instruments (LP-LIF FZJS) is taken as reference for the comparison of the measurements. As seen in the correlation plots (Fig. 9) and the results of the regression analysis for data without the presence of ozone (Table 6), differences between measurements of the LP-LIF instruments and the other instruments are larger in these experiments than in the other experiments. High linear correlation coefficients are obtained (R 2 > 0.96) and slopes of the regression lines between 0.96 and 1.08 are calculated for the LP-LIF instruments. No systematic change in the relationship of measurements is observed whether ozone and hence ozonolysis products are present or not. Similarly, measurements between LP-LIF instruments agree in the presence of the sesquiterpene (with and without the presence of ozone and ozonolysis reaction products). Because this experiment started with the addition of other OH reactants (Fig. 2), only the change in measured reactivity due to the injection of β-caryophyllene observed by each instrument is compared (Fig. 10), as done for the oxygenated VOCs (see above).
Measurements of the flow-tube LIF instrument (PSU) varied more with respect to the reference measurements in the presence of monoterpenes and sesquiterpenes compared to the other chemical conditions. The level of agreement varies among the three experiments (9, 14 and 16 October 2015): when the monoterpene mixture was injected, measurements by the PSU instrument are 10 to 15 % lower than measurements by the FZJS instrument, but they are 20 % higher when plant emissions are transferred into the chamber. During the continuous transfer small inhomogeneities cannot be fully excluded, but the discrepancies between measurements
also remain after the injection and oxidation part of the experiment. The relationship does not depend on the presence of ozone in these two experiments. In the third experiment, changes in the OH reactivity measured by the PSU instrument due to the increase in sesquiterpene concentration (and ozonolysis products) are up to 40 % smaller than the changes observed by the LP-LIF instruments (Fig. 10). The higher and lower values observed in these experiments may not be related to the chemical conditions but instead to the instrument problems. This is indicated by higher values of the PSU instrument compared to measurements by the other instruments in nearly all experiments after 13 October, independent of the chemical conditions (Figs. 7 and 9). Difficulties in maintaining consistent operation of the laser and the electronics driving the movable OH source could have led to much of this variability. As a result, this comparison exercise probably does not represent the capability of the PSU instrument to measure OH reactivity in forest environments. Measurements by the FZJ LP-LIF and DWD CIMS instruments in the experiments with terpenes (Figs. 9 and 11) agree well during the first experiment with monoterpenes (13 April 2016, Fig. 4) and only a small underestimation is seen during the second experiment (14 April 2016). Though no obvious explanation can be provided as to why the CIMS underestimated OH reactivity compared to LP-LIF up to 12 s −1 on 14 April 2016, the CIMS instrument performance might have been influenced by unidentified internal chemical reactions. This also corresponds to observations in the presence of isoprene, MVK, MACR or a mixture of urban OH reactants (see above).
Because NO was present as an impurity in the CIMS sulfur dioxide titration gas mixture (see above and Table 4), a NO correction function was also applied in experiments with monoterpenes (13 April 2016) and sesquiterpenes (15 April 2016). This could explain some of the smaller, but systematic differences compared to LP-LIF measurements (Muller et al., 2017).
Similarly to the CIMS instrument, the agreement between CRM and LP-LIF instruments is worse in the presence of monoterpenes and the sesquiterpene compared to other experiments. Lower linear correlation coefficients (R 2) between 0.48 and 0.72 and a higher scatter of data with relative mean absolute residuum values between 0.34 and 0.45 (numbers only for periods without ozone) are observed. The agreement is even worse during the ozonolysis parts of the experiments, when CRM instruments measure values that are up to five times smaller than measurements of the LP-LIF instruments. Similar results are seen in the experiment with the sesquiterpene (Fig. 10). In all these cases, the level of agreement varies among the CRM instruments and the specific experiment, but the measurements tend to be significantly smaller than those of the other instruments. The residence times in the sampling lines of the CRM instruments were generally longer (5 to 6 s) compared to the sampling lines of the other instruments (0.5 to 4 s) and the volume-to-surface ratio was lower, because CRM instruments used 1/4 in. OD PFA tubing in 2015. In addition, two CRM instruments (MDOUAI and LSCE) used a sampling pump with Teflon surfaces to introduce the sample into the CRM reactor. Oxygenated and low-volatility (monoterpene and sesquiterpene) species may adsorb on these surfaces and the pump may have therefore played a role in the underestimation seen for these instruments. One instrument (MPI-CRM) used a heated inlet line. Other instruments used up to 1 in. OD PFA or Silconert-coated stainless steel tubing (Table 1). Results (Figs. 9 and 10) show that MPI measurements are in part significantly higher than those of the LSCE and MDOUAI instruments, suggesting that the underestimation of the LSCE and MDOUAI CRM instruments could be partly due to a loss of OH reactants in the sampling system (unheated inlet line + pump).
However, an impact of the monoterpene or sesquiterpene chemistry on the CRM measurements cannot be ruled out.
In the experiments with terpenes in 2016 (Figs. 9 and 11), a better agreement between measurements by the MPI CRM and FZJS LP-LIF is found, specifically for the experiment with β-caryophyllene, compared to the experiments in 2015. The reason for this improvement is not clear but could be related to the larger diameter inlet tube used in 2016 compared to 2015 (Table 1), supporting the potential influence of losses in the inlet system for these compounds.

OH reactivity measurements in the presence of NO
The presence of NO can affect measurements of the OH reactivity in all instruments due to the recycling of OH by the reaction of HO 2 with NO that is contained in ambient air (see above). These effects are amplified if OH is produced by water photolysis, because HO 2 is concurrently formed with OH. In 2015, the NO concentration was increased stepwise (up to 120 ppbv) in the presence of CO on 7 and 15 October 2015 (Fig. 1). NO was also present in the two experiments with urban OH reactants (12 and 13 October 2015). Figure 12 shows the dependence of the relative difference between measured and calculated reactivity on the NO mixing ratio in these experiments. In 2016, the presence of NO was tested in two experiments in combination with the presence of pentane and in the urban OH reactant mixture (8 and 12 April 2016, Fig. 13). Due to the lack of OH reactant concentration measurements on 8 April 2016, measurements performed by the FZJS instrument are taken as the reference with which to analyse the impact of NO on the performance of the other two instruments (Fig. 13).
Discrepancies between calculated and measured reactivity are mostly within the range of differences observed in other experiments for LP-LIF instruments. For NO mixing ratios higher than 20 ppbv, median values deviate by up to 20 % from calculated reactivities. The OH production rate from recycling reactions is within the range of the OH destruction rate in the case of CO at the highest NO mixing ratio in the experiment on 7 October 2015. All LP-LIF instruments except the Leeds LP-LIF instrument applied a bi-exponential fit function in this case. Only the FZJM/FZJS instruments applied a bi-exponential fit to measurements in the second experiment with high NO (15 October 2015). However, separating OH reactivity from OH recycling by applying a bi-exponential fit function might still lead to some systematic errors for the experiment with high NO in the presence of only CO, because the OH recycling rate was higher than for typical atmospheric conditions. Therefore, the faster decay rate could deviate from the OH reactivity. The scatter in the measurements increases with increasing NO mixing ratio, as expected from the lower precision of measurements for high reactivity values.
As with all other LP-LIF instruments, OH reactivity from the Leeds LP-LIF agrees well with calculated OH reactivities below 20 ppbv NO. However, values are increasingly lower for higher NO mixing ratios. The lower values of OH reactivity for NO higher than 20 ppbv are caused by the application of a single-exponential fit to the OH decay data rather than a bi-exponential fit as used by other LP-LIF groups. Similar behaviour is achieved if a mono-exponential fit is applied to measurements by the Lille and FZJS/FZJM LP-LIF instruments during the experiment with CO and NO (7 October 2015). A bi-exponential fit, although it gives an OH reactivity closer to the calculated value for this particular experiment, is not necessarily the correct function to apply to fit atmospheric data, and so was not used by Leeds (even though a bi-exponential fit returns a larger value of OH reactivity). Model simulations under relevant conditions indicate that a bi-exponential fit can return an OH reactivity that is greater than the true value. Fitting the data more rigorously requires a modelling approach, similar to that applied when OH recycling was observed in a laboratory kinetics study (Onel et al., 2014). Hence, although application of a bi-exponential fit improves the agreement with the calculated value for this experiment, caution is needed when applying it to atmospheric data at high NO where conditions could be different.
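The behaviour of the two fit functions can be illustrated with a deliberately simple two-species model. This is a sketch under stated assumptions, not the evaluation procedure of any group: OH is lost with the true reactivity `k_oh`, a fraction `hox_yield` of that loss forms HO 2, HO 2 is converted back to OH at a rate `k_rec` (standing in for HO 2 + NO) and is otherwise lost at a rate `k_loss`, and no HO 2 is present at time zero. All names and numbers are hypothetical. In this model the OH decay is exactly bi-exponential, the fast rate lies above the true reactivity, and a single-exponential fit over the whole decay lies below it.

```python
from math import exp, log, sqrt


def decay_rates(k_oh, k_rec, k_loss, hox_yield=1.0):
    """Fast and slow decay rates (s^-1) of the coupled OH/HO2 system.

    d[OH]/dt  = -k_oh * [OH] + k_rec * [HO2]
    d[HO2]/dt = hox_yield * k_oh * [OH] - (k_rec + k_loss) * [HO2]
    """
    total = k_oh + k_rec + k_loss
    det = k_oh * (k_rec * (1.0 - hox_yield) + k_loss)
    root = sqrt(total ** 2 - 4.0 * det)
    return (total + root) / 2.0, (total - root) / 2.0


def oh_decay(t, k_oh, k_rec, k_loss, hox_yield=1.0):
    """Normalised OH signal at time t for [OH](0) = 1 and [HO2](0) = 0."""
    fast, slow = decay_rates(k_oh, k_rec, k_loss, hox_yield)
    a = (k_oh - slow) / (fast - slow)  # amplitude of the fast component
    return a * exp(-fast * t) + (1.0 - a) * exp(-slow * t)


def single_exponential_rate(times, signal):
    """Decay rate from a straight-line least-squares fit to ln(signal)."""
    n = len(times)
    t_mean = sum(times) / n
    y = [log(s) for s in signal]
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times, y))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


# Hypothetical values: true reactivity 10 s^-1, recycling 5 s^-1, other HO2 loss 2 s^-1.
K_OH, K_REC, K_LOSS = 10.0, 5.0, 2.0
times = [0.002 * i for i in range(151)]  # 0 to 0.3 s
signal = [oh_decay(t, K_OH, K_REC, K_LOSS) for t in times]
fast, slow = decay_rates(K_OH, K_REC, K_LOSS)
k_single = single_exponential_rate(times, signal)
print(f"true {K_OH:.2f}, fast {fast:.2f}, slow {slow:.2f}, single-exp {k_single:.2f} s^-1")
```

In this toy model neither fit returns the true reactivity once recycling is significant, which is in line with the remark above that a rigorous evaluation requires a modelling approach.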
Measurements by the PSU LIF instrument, which also uses water photolysis as an OH source, show a tendency to underestimate reactivity values with increasing NO mixing ratios. The maximum median of the relative difference is 20 %. This difference was significantly reduced from 70 to 20 % in the final data compared to the data submitted before reactivity measurements from all groups and OH reactant concentrations were made available. The correction procedure for the presence of NO was changed later from a new procedure to the one described in Shirley et al. (2006) (see above).
Measurements by the MDOUAI and LSCE CRM instruments do not exhibit a clear trend in the relative difference between measured and calculated reactivity with NO. In contrast, measurements by the MPI CRM instrument give lower reactivity values compared to calculated reactivities with increasing NO mixing ratios in both campaigns (Figs. 12 and 13). Measurements by all CRM instruments were corrected by applying an empirical correction function. The magnitude of the correction is of the order of the OH reactivity values (Table 4), making results very sensitive to any systematic error in the correction procedure. The differences in the corrections needed for each CRM instrument emphasise the necessity for a careful characterisation of the instrument.
In the campaign in 2016, the relative difference between measured reactivity by the DWD CIMS instrument and the reference (FZJS LP-LIF instrument) is small for NO mixing ratios lower than 5 ppbv but increases with increasing NO mixing ratio (Fig. 13) up to a factor of 1.8 (median value) for 10 to 20 ppbv NO.
This difference demonstrates that the correction applied to the CIMS measurements leads to systematic errors for NO mixing ratios larger than 5 ppbv, in particular for the urban mixture (12 April 2016). The chemical composition does seem to play a role (Table 3): the correction in the pentane experiment fits well, in contrast to the urban mix experiment, in which it partly produces inaccurate results. The strength of OH recycling by the reaction of HO 2 with NO is also dependent on the CIMS internal abundance of HO 2, which was not measured during the SAPHIR campaign. Also, the correction term can become rather large for high OH reactivity and large NO concentration (up to 30 s −1), which illustrates the limit of the instrument in its current configuration (Fig. 4).

Influence of humidity on OH reactivity measurements
In experiments in 2015, humidity was similar in most experiments and only systematically varied in one experiment on 6 October 2015 (Fig. 1). In contrast, water vapour concentrations were highly variable in experiments in 2016 because of the high dilution flow that was required. Figure 14 shows the dependence of the relative difference of measurements by the instruments (taking measurements by the FZJS instrument as reference) on the water vapour concentration for all experiments in which an overall good agreement is observed (CO, pentane). A clear trend towards overpredicting OH reactivity with increasing water vapour can be seen for measurements by the MPI CRM. This trend is consistent with lower measurements at the lowest water mixing ratios observed in the experiment on 6 October 2015 (Figs. 1 and S7). In contrast, the results from 6 October 2015 do not indicate that the other CRM instruments are affected in the same way by water vapour. No clear trend with water vapour is observed for the LSCE and MDOUAI instruments. Some changes in the relationship between the CIMS instrument and the LP-LIF instrument are seen after water vapour additions in some experiments in 2016 (for example 9 and 11 April 2016, Fig. 3). On 9 and 11 April 2016, the CIMS instrument showed an instrumental instability before the water addition, which could explain some changes in the relationship (Fig. S2). No systematic trend in the entire data set is observed (Fig. 14). Also, on 15 April 2016 CIMS measurements deviate from observations of the LP-LIF instrument when the humidity was increased after the addition of sesquiterpenes, but deviations could be due to errors in either one of the instruments. The LP-LIF instrument observed an increase in OH reactivity, whereas the decreasing trend of the CIMS measurements does not change.
The increase observed by the FZJS LP-LIF instrument could be due to desorption of sesquiterpenes inside the instrument but could also be due to desorption from the chamber wall increasing the reactivity in the chamber. However, the decrease observed by the CIMS instrument would be consistent with the dilution of trace gases in the chamber. Both instruments agree better again after the injection of ozone, when sesquiterpene concentrations have become small.
In the CRM and CIMS instruments, the concentrations of OH and HO 2 depend on the water vapour concentration as they are produced together by water vapour photolysis. In CRM instruments, corrections are applied when the water vapour concentration changes between the different measurement modes which are required to calculate the OH reactivity (Table 4). Also, a humidity dependence in the detection sensitivity of pyrrole is taken into account. Fast changes in the water vapour concentrations, for example during the humidification procedure of the chamber air, can therefore cause systematic errors. However, systematic differences are observed on a longer timescale than the duration of the humidification (< 30 min) in these experiments. Humidity-dependent memory effects in the inlet system could be the reason for this behaviour. Another possibility could be that observations are related to changes in the OH and HO 2 concentrations that depend on the water vapour concentration. OH recycling processes and correction factors depend on the radical concentrations and chemical conditions. For experiments in 2015, the dependence of the correction factor for the deviation from pseudo first-order kinetics was not well characterised for the low OH concentrations at low water vapour concentrations, so that the deviation from calculated OH reactivity might be due to a systematic error in this correction for these conditions. Further investigations will be necessary to understand the exact influence of the water vapour concentration on the OH reactivity measurements by the MPI CRM instrument and the reason for the instrumental instability of the CIMS instrument. It cannot be fully excluded that the observed effects are related to the experimental procedure of humidifying the chamber air that leads to a relatively fast change in the water vapour concentration.

Previous comparisons
Two comparisons of OH reactivity instruments were performed in the past. In one campaign, the MDOUAI CRM and the Lille LP-LIF instruments took measurements on the campus of the University of Lille in October 2012. Either a complex mixture of VOCs or oxygenated VOCs or ambient air was sampled. Experiments with synthetic mixtures of hydrocarbons and OVOCs indicated that the CRM instrument was underestimating the reactivity by 39 and 53 %, respectively, while the LP-LIF measurements were in agreement with the calculated reactivity values within their uncertainty. The discrepancy was attributed to the photolysis of aromatic compounds and oxygenated VOCs in the CRM instrument. This effect is expected to be insignificant in this campaign, because the lamp intensity in the MDOUAI instrument was lowered as a result of the campaign in 2012.
In the present study, a good agreement with calculated reactivity is found in experiments with urban OH reactants including aromatic compounds. This confirms that photolysis processes observed in 2012 for the MDOUAI CRM no longer play a role.
The comparison between the two instruments during ambient measurements showed that the CRM measurements were lower than the LP-LIF measurements by 22 % on average. No dependence in the agreement between the MDOUAI CRM and Lille LP-LIF in the presence of NO was observed in 2012, consistent with results in this campaign for these instruments (Fig. 12). Similar to the results of this campaign, the accuracy of the determination of the instrument zero of the LP-LIF instrument limited the accuracy of the measurements.
Measurements taken with the MDOUAI CRM were also compared to measurements by the LSCE CRM during a field campaign at a remote site in France in summer 2013 (Zannoni et al., 2015). Both instruments sampled either ambient air or emissions from enclosed plants. Measurements by both instruments agreed well overall in that campaign (the regression yielded a slope of 0.96), but the linear correlation coefficient R 2 was only 0.54 because of the large scatter in the data at OH reactivity below 50 s −1. This is consistent with results in this campaign, where measurements from these two instruments were often similar but also differed by 20 to 50 % in some experiments (Figs. 1 and 2).
The MDOUAI instrument was operated under similar conditions to the present study, i.e. with a long sampling line and a pump in the inlet system. The inlet system of the LSCE instrument did not include a pump, unlike in the present study, and consisted of a Teflon sampling line with a small diameter (1/8 in. OD) and a PTFE filter. It is likely that losses of low-volatility compounds during sampling impacted the previous comparison, similarly to observations in the present study.

Summary and conclusions
Measurements of OH reactivity were compared in experiments performed in the atmospheric simulation chamber SAPHIR in two campaigns in 2015 and 2016. All instrument types presently used for atmospheric measurements were used in one or both of the campaigns. A few additional instruments exist worldwide (e.g. Yang et al., 2016), but they are similar to the instruments in these campaigns.

Summary of findings
Not only were many measurements successfully performed in these campaigns, but a number of findings also led to improvements in data quality during the process from measurements to the final data: an ozone-dependent background signal was found for measurements of the MPI CRM; correcting CRM measurements for the deviation from pseudo first-order conditions by empirical correction factors is recommended (Michoud et al., 2014); and misalignment of the photolysis laser beam in the LP-LIF instruments can complicate the data evaluation procedure.
These results will also improve the precision and accuracy of measurements in the future. The findings of the comparison of the final data set are as follows:
-Measurement techniques are capable of measuring OH reactivity for a range of chemical conditions that are relevant for ambient air measurements but with different levels of precision and accuracy. Losses of OH reactants in inlet lines could be of importance.
-Measurements by LIF and CIMS instruments have a higher precision than CRM instruments, leading to a limit of detection better than 1 s −1 at a time resolution of a few minutes. For chemically complex conditions, the scatter of the data is within the range of 10 % for LP-LIF instruments and 10 to 20 % for the CIMS and the flow-tube LIF instruments. The precision of data from the flow-tube LIF instrument was reduced in this campaign compared to typical operation in the field due to the application of a high dilution flow. Measurements by CRM instruments exhibit a higher limit of detection of approximately 2 s −1 at a time resolution of 10 to 15 min. The scatter of the CRM data for chemically complex conditions ranges from 17 to 45 % (mean relative difference between measurements and linear regression with reference values). Additional work is needed on the CRM technique to improve the measurement precision to a level closer to that observed for other instruments.
-Biases in the measurements by the LP-LIF instruments are lower than their limit of detection with a few exceptions in some experiments for the Lille and Leeds instruments. The instrument zero in the PSU instrument varied by 1.3 s −1 , but this value is amplified by the dilution factor of 5 that is not normally used in field measurements. The smaller number of data points for the CRM instruments makes conclusions about a day-to-day variability of a potential bias less accurate. However, the distribution of measurements during zero-air measurements becomes more compact for the LSCE and MPI CRM instruments if an offset is subtracted for each individual experiment (Table 5).
-Maximum absolute deviations of LP-LIF measurements from calculated reactivities or measured reactivities from the instrument taken as the reference (FZJS) are 12 % (mostly less than 5 %). Deviations are smaller than the accuracy of the calculation from measured OH reactant concentrations (Table 2). Results from this campaign demonstrate a high accuracy of LP-LIF instruments.
-The accuracy of CRM instruments varies with chemical conditions. Whereas measurements agree on average with calculated reactivities within 5 % (with a higher deviation of 31 % for the LSCE CRM) if only CO, CH 4 or pentane are present, deviations from measurements with the FZJS LP-LIF instrument are up to a factor of 2 for mixtures containing terpenic compounds. Also, the scatter of data is larger in these cases. While the impact of OH recycling in the terpenic chemistry cannot be ruled out, losses of these compounds in inlet systems can explain the observed discrepancies. The transmission of low-volatility compounds such as terpenes and their oxidation products needs to be improved for CRM instruments. Inlet systems used in this campaign partly differed from deployments in previous campaigns (for example the use of the additional pump in the inlet of the LSCE instrument), so that losses could have been different in campaigns in the past.
-Even in the presence of up to 120 ppbv NO, agreement with calculated reactivity within the accuracy of measurements and calculations is achieved for the MDOUAI and LSCE CRMs, whereas deviations of up to 50 % for the MPI CRM instrument and a factor of 1.8 for the CIMS are observed. All these instruments applied large corrections to account for OH recycling from the reaction of HO 2 with NO. The variability in the accuracy of the correction emphasises the need for a careful characterisation of the instrument-specific operational conditions.
-Measurements by LP-LIF instruments are not affected as much as the other instruments by OH recycling reactions even for NO mixing ratios higher than 20 ppbv. In this case, a bi-exponential fit function to the OH decay curve rather than a single-exponential fit improves the agreement with the calculated value of the OH reactivity. A bi-exponential function was not applied to OH decays measured by the Leeds LP-LIF, so larger deviations were observed for NO mixing ratios higher than 20 ppbv. Although a bi-exponential fit to the data generated a closer agreement with the calculated value of the OH reactivity for the conditions of this particular experiment, it should be noted that it may not represent the best function to fit to atmospheric data at high NO (where the composition is different to this experiment), and careful thought needs to be given as to the optimum function to fit to the data.
-Measurements by the flow-tube LIF instrument give larger deviations (±20 %) in the chemically more complex experiments compared to conditions with single, anthropogenic reactants (deviations ±3 %), although the flow-tube LIF measurements are likely affected by instrument issues related only to the instrument that was assembled for this comparison.
-Experiments in 2016 reveal a so far unrecognised effect of the water vapour concentration on measurements by the MPI CRM instrument (factor of 2 difference at 1 % water vapour mixing ratio), although changes of the humidity and therefore radical concentrations are taken into account in the evaluation. The water vapour correction procedure might not have been applicable here, because humidity changes were faster than is typical in the atmosphere. Water vapour was changed in only one experiment in 2015. Results do not indicate that the other CRM instruments are affected in the same way by water vapour.
-The accuracy of measurements by the CIMS instrument varied between experiments. Compared to the calculated OH reactivity, agreement is observed within the accuracy of measurements and calculations for the experiments with CO and pentane (deviation of the regression slope from the 1 : 1 line of 13 %). For the isoprene and urban reactant mixture cases, lower accuracy is observed with a deviation of the regression slope from the 1 : 1 line of 27 %. In contrast, the regression slope is 1.01 for the monoterpene/sesquiterpene cases when measurements are referenced to the FZJS LP-LIF instrument. On some days a change in the relationship between measurements by the CIMS instrument and the LP-LIF instrument is observed with changing water vapour. Overall, the variability in the level of agreement hints at instrumental instabilities.
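The two comparison metrics quoted in the list above (deviation of the regression slope from the 1 : 1 line, and scatter as the mean relative difference between measurements and the linear regression against reference values) can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the campaigns: an ordinary least-squares fit with intercept is assumed, the relative difference is taken as the absolute residual divided by the fitted value, and the function names and example numbers are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def comparison_metrics(reference, measured):
    """Slope deviation from 1:1 and scatter around the regression line.

    Assumption: scatter = mean of |measured - fitted| / fitted, in percent.
    """
    slope, intercept = linear_fit(reference, measured)
    fitted = [intercept + slope * r for r in reference]
    scatter = sum(abs(m - f) / f for m, f in zip(measured, fitted)) / len(measured)
    return {
        "slope": slope,
        "slope_deviation_percent": abs(slope - 1.0) * 100.0,
        "scatter_percent": scatter * 100.0,
    }


if __name__ == "__main__":
    # Invented example values in s^-1 (reference vs. instrument under test).
    reference = [5.0, 10.0, 20.0, 30.0, 40.0]
    measured = [4.1, 9.4, 16.5, 27.8, 33.9]
    for name, value in comparison_metrics(reference, measured).items():
        print(f"{name}: {value:.2f}")
```

Keeping the slope deviation and the scatter separate mirrors the distinction drawn in the text between accuracy (systematic offset from the reference) and precision (spread around the fitted relationship).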

Conclusions for future instrument operation and measurements in the past
Overall, the comparison demonstrates that OH reactivity measurements by LP-LIF instruments are precise and accurate for a wide range of atmospheric conditions. It is recommended that instrumental parameters such as laser alignment and instrument zero be checked regularly to achieve a high accuracy and to avoid additional complications in the data evaluation.
In this campaign, the flow-tube LIF instrument gives slightly less accurate and precise measurements compared to the LP-LIF instruments, which is related to the different operational conditions compared to previous campaigns. A different laser system was used and a high dilution flow was applied, which reduced the instrument performance. Had it been possible to use the field PSU instrument without dilution, it is likely that its precision and accuracy would have been similar to that of the LP-LIF instruments.
The OH reactivity scheme of the CIMS instrument is relatively new. It has only been deployed at the monitoring station Hohenpeissenberg so far, where OH reactivity values are typically small (2 to 10 s −1 ). Further improvements of the data quality for high NO conditions (> 3 ppbv) are needed to expand the instrument's ability to measure in more polluted regimes.
The accuracy of the current observations depends on the quality of NO concentration measurements and the assumption that the OH decay obeys single-exponential behaviour. All OH recycling processes need to be well characterised. Any deviation from these assumptions leads to systematic errors, and further investigation is needed to capture other unknown complex mixtures of OH reactants under polluted conditions. Additional reaction times (injection points for SO 2 and propane) and concurrent measurements of RO x and HO 2 concentrations could help characterise the OH recycling processes in unknown mixtures of OH reactants in the future.
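To illustrate why the single-exponential assumption matters, the following sketch builds a noise-free synthetic decay with two components and compares a single-exponential (log-linear) evaluation with a bi-exponential fit. Everything here is an assumption made for illustration: the rates (30 and 5 s −1 ), the amplitudes, the time grid and the brute-force grid search over candidate rates are not taken from any instrument in the comparison, and the fitting routines actually used by the groups are not described in this section.

```python
import math


def log_linear_rate(t, s):
    """Decay rate k from a least-squares line through ln(s) versus t."""
    n = len(t)
    y = [math.log(v) for v in s]
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    return -sxy / sxx


def biexp_grid_fit(t, s, rates):
    """Return (k_fast, k_slow) with the smallest residual sum of squares.

    For every candidate rate pair the two amplitudes are obtained by
    linear least squares (2x2 normal equations).
    """
    ordered = sorted(rates)
    basis = {k: [math.exp(-k * ti) for ti in t] for k in ordered}
    best = None
    for i, k_slow in enumerate(ordered):
        e2 = basis[k_slow]
        for k_fast in ordered[i + 1:]:
            e1 = basis[k_fast]
            s11 = sum(a * a for a in e1)
            s22 = sum(b * b for b in e2)
            s12 = sum(a * b for a, b in zip(e1, e2))
            b1 = sum(a * v for a, v in zip(e1, s))
            b2 = sum(b * v for b, v in zip(e2, s))
            det = s11 * s22 - s12 * s12
            if det <= 0.0:
                continue  # degenerate pair, skip
            a1 = (b1 * s22 - b2 * s12) / det
            a2 = (b2 * s11 - b1 * s12) / det
            sse = sum((v - a1 * a - a2 * b) ** 2 for v, a, b in zip(s, e1, e2))
            if best is None or sse < best[0]:
                best = (sse, k_fast, k_slow)
    return best[1], best[2]


def synthetic_decay(t, k_fast, k_slow, a_fast=1.0, a_slow=0.1):
    """Noise-free two-component decay (illustrative amplitudes)."""
    return [a_fast * math.exp(-k_fast * ti) + a_slow * math.exp(-k_slow * ti)
            for ti in t]


if __name__ == "__main__":
    t = [i * 0.001 for i in range(201)]           # 0 to 0.2 s in 1 ms steps
    s = synthetic_decay(t, k_fast=30.0, k_slow=5.0)
    rates = [float(k) for k in range(1, 51)]      # candidate rates in s^-1
    print("single-exponential rate:", round(log_linear_rate(t, s), 2), "s^-1")
    print("bi-exponential rates:", biexp_grid_fit(t, s, rates))
```

The single-exponential rate necessarily falls between the two true rates, so it underestimates the fast component whenever a slower component (for example from OH recycling) contributes to the signal. With real, noisy data a proper non-linear least-squares routine would replace the grid search, and which component represents the OH reactivity has to be decided from the chemistry of the experiment.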
While CRM instruments are less precise and accurate than other techniques, a reasonable agreement is usually observed between the CRM instruments and the other techniques for air mixtures containing simple compounds such as CH 4 , CO and isoprene, and for urban air mixtures containing anthropogenic hydrocarbons and NO x . The correction factors, which depend on the exact instrumental conditions such as the OH, HO 2 and pyrrole concentrations in the reaction volume, are a potential source of systematic errors. In order to minimise these errors, the CRM operating conditions are such that the ratio of pyrrole to OH concentrations ranges from 1.7 to 2, so that corrections for operating under non-pseudo-first-order conditions can be within 10 % (1σ) for different air compositions. The error associated with this correction needs to be propagated to the measurement uncertainty. The largest correction is for the recycling of OH from HO 2 + NO, which is only relevant for urban atmospheres. This correction can be of the same order of magnitude as the measured OH reactivity value and needs to be carefully characterised on each CRM instrument.
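For orientation, the expression commonly used to evaluate CRM measurements is reproduced below. It is not given in this section and is quoted here from general knowledge of the technique, so the symbols are assumptions introduced for illustration: C_1 is the pyrrole mixing ratio without OH, C_2 the pyrrole mixing ratio with OH in zero air, C_3 the pyrrole mixing ratio with OH in sampled air, and k_p the rate coefficient of the reaction of pyrrole with OH.

```latex
% Pseudo-first-order CRM expression (pyrrole assumed to be in large excess over OH)
R_{\mathrm{air}} = \frac{C_3 - C_2}{C_1 - C_3}\, k_p\, C_1
```

The expression follows from assuming that pyrrole is in large excess over OH. At pyrrole-to-OH ratios of 1.7 to 2 this assumption is not met, which is why a correction for non-pseudo-first-order conditions has to be applied to the result.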
The level of agreement is degraded when low-volatility terpenoid compounds and/or their oxidation products are sampled. Although all CRM instruments use the same detection scheme and the same reaction tube, measurements differ between the three CRM instruments and also significantly differ from other LIF-based techniques. While OH recycling in the CRM reactor cannot be ruled out when these species are sampled, losses of OH reactants in inlet lines and sampling pumps (partly different from the inlet lines typically used in field campaigns) could have led to additional systematic errors in this campaign. The quality of the measurements depends on both the instrumental technique and the procedure used to transfer the sample into the instrument. A high flow (short residence time) in the inlet lines and/or the use of inert inlet line materials like Silconert coated steel might help to reduce inlet line effects as indicated by the results of LP-LIF instruments. This improvement is a prerequisite to investigate whether the terpenoid chemistry inside CRM reactors can lead to an underestimation of ambient measurements.
The CRM method is a younger technique compared to the LP-LIF and flow-tube LIF method, but the number of instruments has quickly increased due to the commercial availability of detectors for pyrrole such as the PTR-MS instrument. Results of this campaign emphasise that careful instrument characterisation for the specific operational conditions is required in order to achieve accurate measurements. Future work should focus on improving its performance in terms of precision and limit of detection. In addition, the accuracy of measurements would improve if corrections could be lowered.
The results of this campaign demonstrate that all detection schemes that are currently applied to OH reactivity measurements give reasonable results for a range of chemical conditions which are relevant for ambient air measurements. These first comprehensive comparison campaigns were conducted to assess the current performance of the instruments. The results already led to the implementation of changes in some instruments to achieve better data quality. More work will be done to improve instrument performance with respect to the issues identified here that currently limit the precision or accuracy of measurements. More comparison campaigns, conducted in a formal, blind way and/or under even more realistic conditions with ambient air, could help to further increase confidence in the measurements.
In the field, OH reactivity measurements are often used to identify unexplained reactivity from OH reactants that were not measured as individual species. Large unexplained reactivity (several tens of s −1 ) was found in several field campaigns in biogenic environments, such as the boreal forest in Finland (Nölscher et al., 2012b) and rainforests (Edwards et al., 2013; Nölscher et al., 2016), as well as in urban environments in wintertime (e.g. Dolgorouky et al., 2012; Yoshino et al., 2012). Results here show that measurements by the LIF instruments are accurate. In these comparison campaigns, the deviations, which are seen mostly for CRM instruments under complex conditions involving large concentrations of terpenic compounds, show a tendency for OH reactivity to be underestimated. Therefore, results do not indicate that high, unexplained reactivity values that were measured