Ambient measurements of aromatic and oxidized VOCs by PTR-MS and GC-MS: intercomparison between four instruments in a boreal forest in Finland

Proton transfer reaction mass spectrometry (PTR-MS) and gas chromatography mass spectrometry (GC-MS) are commonly used methods for automated in situ measurements of various volatile organic compounds (VOCs) in the atmosphere. In order to investigate the reliability of such measurements, we operated four automated analyzers using their normal field measurement protocol side by side at a boreal forest site. We measured methanol, acetaldehyde, acetone, benzene and toluene by two PTR-MS and two GC-MS instruments. The measurements were conducted in southern Finland between 13 April and 14 May 2012. This paper presents correlations and biases between the concentrations measured using the four instruments. A very good correlation was found for benzene and acetone measurements between all instruments (the mean R value was 0.88 for both compounds), while for acetaldehyde and toluene the correlation was weaker (with a mean R value of 0.50 and 0.62,

A variety of models can be used to investigate the atmospheric chemistry of VOCs. Some simulate the VOC emissions from vegetation (e.g. Grote and Niinemets, 2008; Smolander et al., 2014), others simulate the degradation of VOCs due to their chemical reactions with e.g. atmospheric oxidants (e.g. Jenkin et al., 1997), and others model their role in new particle formation and other boundary layer and tropospheric processes (e.g. Fast et al., 2006; Holzinger et al., 2007; Makkonen et al., 2012). Such models often involve dozens of chemical species (including VOCs and trace gases) and complicated chemical and physical processes. In order to evaluate the performance of such simulations and models, reliable measurements of atmospheric concentrations of various VOCs are needed.
Traditionally VOC concentrations have been measured by collecting samples into canisters or onto adsorbents with subsequent off-line analysis with gas GC-MS, on the other hand, can be highly specific for compound identification, but it has lower time resolution (typically 30 min or more). Both of these methods have been used for measurements in different environments, ranging from highly polluted urban areas to remote locations with low VOC concentrations (e.g. Karl et al., 2003; Rinne et al., 2005; Jordan et al., 2009; Holst et al., 2010; Molina et al., 2010; Hellén et al., 2012b; Hakola et al., 2012).
Typically, a long-term measurement setup consists of a single analyzer, which is periodically calibrated. Occasionally these instruments are compared with each other either in the laboratory or in the field. The laboratory comparisons are usually conducted by measuring VOC concentrations of a known standard mixture (see e.g. Apel  been conducted before in high latitude boreal forest, where the anthropogenic influence on the concentrations is rather small. The main aim of this study was to evaluate how reliable the real-time measurements of aromatic and oxygenated VOCs are when a single stand-alone instrument is used. This was achieved by comparing VOC concentration measurements of four real-time instruments: two PTR-MSs and two GC-MSs. This study was part of ACTRIS (Aerosols, clouds and trace gases research infrastructure network, http://www.actris.net/, cited on 20 November 2014), which aims to harmonize the European trace gas measurements and to establish a reliable network of continuous long-term measurements. The concentration measurements of three oxygenated VOCs (methanol, acetaldehyde and acetone) and two aromatic VOCs (benzene and toluene) were compared in this study.

Measurement site
The measurements were conducted between 13 April and 14 May 2012 at the SMEAR II site (Station for Measuring Forest Ecosystem-Atmosphere Relations, 61°51′ N, 24°17′ E, 181 m a.s.l.) in Hyytiälä, southern Finland. The site is a well-characterized measurement station located in a rural boreal forest dominated by Scots pine (Pinus sylvestris) (for details see Hari and Kulmala, 2005; Ilvesniemi et al., 2010). In addition to Scots pine, there are some Norway spruces (Picea abies) and broadleaved trees such as European aspens (Populus tremula) and birches (Betula sp.). The annual mean temperature of the site is 3 °C, with the coldest month being January (mean −9 °C) and the warmest July (mean 15 °C). The annual mean precipitation is 700 mm. The nearest village (Korkeakoski) is about 6 km away and the nearest big city (Tampere, ca. 200 000 inhabitants) is about 50 km from the site. The concentrations and sources of oxidized and aromatic VOCs at the site have previously been characterized by Rinne et al. (2005, 2007), Patokoski et al. (2014) and Rantala et al. (2014). Oxidized and aromatic VOCs arrive at the SMEAR II station from both long-range and local anthropogenic sources (Liao et al., 2011; Patokoski et al., 2014). OVOCs are also emitted by the surrounding vegetation at the site and formed in the oxidation of e.g. monoterpenes (Rinne et al., 2005, 2007; Aaltonen et al., 2013; Aalto et al., 2014; Rantala et al., 2014).

The measurement setup
The concentrations were measured with two different gas chromatography-mass spectrometers (GC-MS1 and GC-MS2) and two similar proton transfer reaction quadrupole mass spectrometers, which are hereafter called PTR-MS1 and PTR-MS2. Both PTR-MSs were operated by the University of Helsinki, the GC-MS1 was operated by Empa (Switzerland) and the GC-MS2 was operated by the Finnish Meteorological Institute. The two GC-MSs and the PTR-MS1 used the same ca. 20 m long inlet line (Teflon PTFE, 8 mm i.d.), which sampled 10 m above the ground with a sample air flow of 20 L min⁻¹ (Fig. 1).
The PTR-MS2 is part of the permanent instrumentation of the site and sampled from a tower about 30 m away from the common inlet of the other instruments. It measured the ambient air concentrations during every third hour, as the instrument was used for other measurements during the other two hours (Aalto et al., 2014; Rantala et al., 2014). The measurement cycle included six heights (4.2, 8.4, 16.7, 33.6, 50.4  and get ionized if their proton affinity is higher than that of water. From the drift tube the ions are guided to a quadrupole mass spectrometer for mass selection and are then detected by a secondary electron multiplier. The VOCs gain one proton (H⁺) in the proton transfer reaction, thus their mass increases by one atomic mass unit (amu) and they are singly charged. As PTR-MS has a mass resolution of one Thomson (Th, the unit of mass-to-charge ratio), different compounds with the same nominal mass cannot be distinguished. Therefore it cannot be used for exact identification of the measured compounds (for more details about the instrument, see Lindinger et al., 1998; de Gouw et al., 2003a; Warneke et al., 2003; de Gouw and Warneke, 2007 (Warneke et al., 1996, 2001; Tani et al., 2003; de Gouw and Warneke, 2007; Taipale et al., 2008).
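The one-proton mass shift and the unit mass resolution described above can be illustrated with a minimal sketch. The function names are hypothetical, and the nominal molecular masses are standard values added here for illustration rather than taken from the text; propanal is included only as a well-known isomer of acetone.

```python
# Nominal molecular masses (amu) of the five compared compounds; standard
# values supplied for illustration, not quoted from the paper.
NOMINAL_MASS = {
    "methanol": 32,
    "acetaldehyde": 44,
    "acetone": 58,
    "benzene": 78,
    "toluene": 92,
}


def protonated_mz(nominal_mass):
    """Nominal m/z of a singly charged ion after gaining one proton (M + 1)."""
    return nominal_mass + 1


def indistinguishable(mass_a, mass_b):
    """True if two compounds fall on the same m/z at unit mass resolution."""
    return protonated_mz(mass_a) == protonated_mz(mass_b)


if __name__ == "__main__":
    for name, mass in NOMINAL_MASS.items():
        print(name, protonated_mz(mass))
    # Propanal (C3H6O, 58 amu) shares its nominal mass with acetone.
    print(indistinguishable(NOMINAL_MASS["acetone"], 58))
```

With these values methanol is detected at 33 amu and acetaldehyde at 45 amu, which matches the masses quoted elsewhere in the text.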

The drift tube pressures and voltages of the two PTR-MSs were not the same, as the instruments were optimized individually. PTR-MS1, which is the newer instrument, had a drift tube pressure of 2.2 mbar and voltage of 600 V, while PTR-MS2 ran with a drift course of a few days and sometimes even within hours. This variation is taken into account by normalizing the count rates and sensitivities with the primary ion signals (Taipale et al., 2008). Desorption of impurities inside the instrument or inside the inlet system can cause a notable offset in the count rates of many of the VOCs (Steinbacher et al., 2004). These background signals are taken into account by regularly measuring VOC-free air (hereafter "zero air"). The background signals are then subtracted from the measured signals. During this campaign, zero air was measured every second hour with PTR-MS1 and every third hour with PTR-MS2. The zero air was produced by pumping ambient air through a catalytic converter (Parker Balston zero air generator HPZA-3500, USA and Parker ChromGas Zero Air Generator 3501, USA). Taipale et al. (2008) calibrated the instrument by diluting 50-120 mL min⁻¹ of standard gas to 1000-3000 mL min⁻¹ of zero air, which was done with a set-up that uses a 60 L standard gas bottle (with an initial pressure of 140 bar). The flow is regulated manually with a pressure regulator and is fine-tuned with a needle valve. Hereafter this calibration method is referred to as "manual calibration". During this campaign calibrations were done using an automatic calibration method with mixing units. These mixing units dilute a standard gas flow of ca. 6 mL min⁻¹ to a zero air flow of ca. consist of a 1 L (40 bar) standard gas cylinder and two mass flow controllers, which regulate the standard gas and the zero air flow (Bürkert 8710-10, and Bürkert 8710-03, Bürkert GmbH Germany, respectively) automatically.
The comparability of the manual and the automatic calibration methods was studied in separate calibration method comparison tests, which were performed after the campaign for both PTR-MSs. Both instruments were calibrated three times during the campaign, using the same gas standard mixture (Apel-Riemer Environmental Inc., CO, USA), consisting of 13 different VOCs including methanol, acetaldehyde, acetone, benzene and toluene in the range of 0.84-1.14 ppb.
The detection limits of the PTR-MSs were calculated as three times the SD (3σ) of the background measurement. This background signal varies over time, leading to a change in the detection limit. Possible changes in background signals were taken into account by calculating detection limits separately for all calibration periods of the PTR-MSs.
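The 3σ detection limit and its evaluation per calibration period can be sketched as follows. This is a minimal illustration: the function names and the dictionary layout are hypothetical, and the use of the sample SD (n − 1 in the denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics


def detection_limit(background_values, k=3.0):
    """k times the SD of the background (zero air) values.

    Assumption: the sample SD is used; the text only states "three times
    the SD".
    """
    return k * statistics.stdev(background_values)


def detection_limits_by_period(background_by_period):
    """One detection limit per calibration period, to follow background drift."""
    return {
        period: detection_limit(values)
        for period, values in background_by_period.items()
    }


if __name__ == "__main__":
    # Hypothetical zero air readings (ppb) for two calibration periods.
    backgrounds = {
        "period_1": [0.10, 0.12, 0.08, 0.11],
        "period_2": [0.20, 0.26, 0.17, 0.23],
    }
    print(detection_limits_by_period(backgrounds))
```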

The analysis of VOCs with gas chromatographic techniques relies on the separation of the VOC species in a chromatographic column. Traditionally, the samples have been collected into a canister or adsorbent tubes and analyzed off-line in the laboratory. With more recent in situ GC-MS systems, the samples are collected directly into adsorbent traps at the measurement site, from which they are desorbed by heating the trap in the gas chromatograph. After separating the compounds by their retention times in the chromatographic columns, they are ionized by electron ionization and detected individually with a quadrupole mass spectrometer.

GC-MS1
The instrumental set-up of the adsorption-desorption system coupled to a gas chromatograph-mass spectrometer (GC-MS1) is described in detail by Legreid et al. (2007, 2008). Briefly, every 46 min a 12 min air sample of 350 mL was collected. VOCs were collected in a two-stage adsorbent system connected to a GC-MS (Agilent 6890-5973N, Agilent Technologies, CA, USA). The water removal was performed on the sampling trap (0.6 g of Hayesep D, Supelco, Switzerland) at room temperature. The hydrophobic nature of the adsorbent material allowed most of the water to pass through the trap, and the remaining humidity was removed by dry helium flushing. Thereafter, the compounds were refocused on a microtrap (14 mg of Hayesep D, Supelco, Switzerland) at −40 °C to improve the separation of the compounds on the analytical column.
The compounds were rapidly desorbed from the trap (180 °C) and transferred to the GC. The chromatographic separation was performed on a 25 m × 0.32 mm CP-Porabond U column (Varian Inc., CA, USA) with 7 µm film thickness. Individual compounds were detected by operating the MS in single ion monitoring (SIM) mode for an improved signal-to-noise ratio. The compounds were identified by their mass spectra and quantified using a 24-compound OVOC standard gas mixture in the range of 350-450 ppb (Apel-Riemer Environmental Inc., CO, USA), and a 30-compound VOC standard gas mixture in the range of 1-10 ppb (National Physical Laboratory, UK). Calibration was performed once every 23 h by filling a calibrated sample loop (127 µL), which was flushed with helium into the adsorbent trap. Methanol was only recovered at 45 %, and this was corrected for in the campaign data. The detection limit for each compound was calculated as three times the SD of five zero air samples.

GC-MS2
Measurements of GC-MS2 were conducted using an in situ thermal desorption unit (Markes Unity, Markes International Ltd, UK) with a gas chromatograph (Agilent 7890A, Agilent Technologies, CA, USA) and a mass spectrometer (Agilent 5975N, Agilent Technologies, CA, USA). The column used was a 60 m long DB-5 with an inner diameter of 0.235 mm and a film thickness of 1 µm. One 60 min sample was taken every other hour. Ozone was removed by a heated stainless steel inlet (temp 150 °C, length

Uncertainty of the instruments
The uncertainty of PTR-MS or of GC-MS measurements is affected by several factors. The total uncertainty (ΔU) can be estimated by using the Gaussian propagation of uncertainty when the uncertainties of different steps of the data processing are known. In the following sections, the total uncertainty calculations of the PTR-MSs and the GC-MSs are described separately. One should keep in mind that, in addition to the total uncertainty described in this section, the measurements may still have an additional constant error of unknown magnitude, which can bias the measured concentrations.

Uncertainty of the PTR-MS measurement
The total measurement uncertainty of PTR-MS consists of two parts: the uncertainty of the signal (ΔU_signal) and the uncertainty of the calibration (ΔU_calibration).
The measured signal and the background signal are normalized with the primary ion signal for the VMR calculation. The normalized background signal (I_zero) is subtracted from the normalized measured signal (I_meas) and this background-corrected normalized signal is divided by the normalized sensitivity, S, which is obtained from the calibration. Thus, $\mathrm{VMR} = (I_{\mathrm{meas}} - I_{\mathrm{zero}}) / S$.
The uncertainty of the signal in Eq. (1) contains the uncertainties of the measured signal (ΔU_meas) and the background signal (ΔU_zero). Measured count rates (cps, counts per second) and count rates of the zero measurement were converted to counts (I_counts and I_counts,zero) by multiplying by the dwell time (2 s for each molecule). As the PTR-MS statistics follow the Poisson distribution, the uncertainty of a single measurement point (ΔI_meas) is simply the square root of the counts (√I_counts). One background measurement consisted of 11 measurement points, from which the average background signal was derived, and the nearest background value was subtracted from each individual ambient measurement point. The uncertainty of one background measurement (ΔI_zero) was calculated as the SD of the 11 measurement points.
In order to normalize I_counts and I_counts,zero, they both need to be divided by the primary ion (H₃O⁺ and H₃O⁺·H₂O) counts, which are obtained by multiplying the count rates of the primary ions by their dwell times. However, the primary ion signal is much higher than the measured signals and the zero signals. In addition, it remained approximately constant during the time when the I_counts and the nearest I_counts,zero were measured. Thus, the primary ion signal uncertainty is less than 1 % and it was neglected.
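The signal part of this calculation can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the sample SD is assumed for the background, and combining the measured and background terms in quadrature is an assumption based on the stated use of Gaussian propagation, not a formula quoted from the text.

```python
import math
import statistics

DWELL_S = 2.0  # dwell time per measured mass, as stated in the text


def to_counts(rate_cps, dwell_s=DWELL_S):
    """Convert a count rate (cps) to counts accumulated over the dwell time."""
    return rate_cps * dwell_s


def vmr(i_meas, i_zero, sensitivity):
    """Background-corrected normalized signal divided by the sensitivity."""
    return (i_meas - i_zero) / sensitivity


def meas_uncertainty(rate_cps, dwell_s=DWELL_S):
    """Poisson uncertainty of one measurement point: square root of the counts."""
    return math.sqrt(to_counts(rate_cps, dwell_s))


def zero_uncertainty(zero_rates_cps, dwell_s=DWELL_S):
    """SD of the points of one background measurement, in counts.

    Assumption: sample SD.
    """
    return statistics.stdev([to_counts(r, dwell_s) for r in zero_rates_cps])


def signal_uncertainty(d_meas, d_zero):
    """Assumed quadrature sum of the measured and background uncertainties."""
    return math.hypot(d_meas, d_zero)
```

For example, a count rate of 50 cps with a 2 s dwell time gives 100 counts and a Poisson uncertainty of 10 counts.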
The uncertainty of the calibration (ΔU_cal) is due to the uncertainty of the sensitivity (ΔS) and the uncertainty of the calibration gas standard (ΔU_stdgas), which arises from the uncertainty of the concentrations in the calibration gas standard (Δχ_cal). PTR-MS sensitivity for a certain compound is determined by calibrating the instrument with a known concentration of that compound. When the ratio of the sensitivity and its uncertainty is assumed to be constant, the sensitivity's uncertainty can be determined from the SD of a series of calibrations performed using the same instrument settings. The laboratory tests for the similarity of the two calibration methods were done under the same instrument conditions, making the relative sensitivity uncertainty (ΔS) obtainable from those measurements. The manufacturer of the calibration gas standard reports a relative uncertainty (Δχ_cal) of ±5 % for the concentration of each VOC in the calibration gas mixture.
By combining Eqs. (1) to (4) and using the Gaussian propagation of error, the total uncertainty of PTR-MS for one measurement point is obtained. For N measurement points, the total relative uncertainty is calculated relative to VMR, the average volume mixing ratio of the N measurements. Because different measurement points are independent, the total precision can be calculated using the Gaussian propagation of error. However, as ΔS and Δχ_cal are constant, the total systematic error is calculated as a linear sum of the errors of single measurement points. Total uncertainties of one hour were calculated for PTR-MS1 and PTR-MS2, as the data comparison was mostly done using one-hour averages.
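The different treatment of random and systematic contributions when averaging N points can be sketched as below. The function name and arguments are hypothetical, and two steps are assumptions rather than statements from the text: the sensitivity and standard gas terms are combined in quadrature into one relative calibration error, and the averaged random and systematic parts are combined in quadrature at the end.

```python
import math


def relative_uncertainty_of_mean(vmrs, signal_abs_unc, rel_sensitivity, rel_std_gas):
    """Relative total uncertainty of the mean of N independent points.

    vmrs            -- volume mixing ratios of the single points
    signal_abs_unc  -- absolute signal (random) uncertainty of each point
    rel_sensitivity -- relative sensitivity uncertainty (constant)
    rel_std_gas     -- relative calibration gas uncertainty (constant)
    """
    n = len(vmrs)
    mean_vmr = sum(vmrs) / n

    # Random part: independent points add in quadrature (Gaussian propagation).
    random_part = math.sqrt(sum(u ** 2 for u in signal_abs_unc)) / n

    # Systematic part: the same relative error applies to every point,
    # so the errors add linearly instead of averaging out.
    rel_cal = math.hypot(rel_sensitivity, rel_std_gas)  # assumed quadrature
    systematic_part = sum(rel_cal * v for v in vmrs) / n

    # Assumption: random and systematic parts are finally combined in quadrature.
    return math.hypot(random_part, systematic_part) / mean_vmr
```

The sketch shows why more data points per hour reduce only the signal term: the calibration term stays the same however many points are averaged.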

Uncertainty of the GC-MS1 measurement
The total uncertainty is divided into two components: precision (ΔU_precision) and systematic error (ΔU_systematic). The precision is calculated from DL, the detection limit, χ, the mole fraction (peak area) of the considered peak, and σ_sample,rel, the relative SD of the sample. The first term of Eq. (8) considers the resolution of the instrument (e.g. background noise) and the second term considers the reproducibility of the instrument. For low mole fractions the first term dominates, while for high mole fractions the second term dominates.
The systematic error of GC-MS1 includes the error due to uncertainty of the calibration standard's mole fractions (Δχ_cal), systematic integration errors due to peak overlap or poor baseline separation (Δχ_int), systematic errors due to blank correction (Δχ_blank), and potential further instrument problems (Δχ_instrument) caused by e.g. sampling line artefacts, possible non-linearity of the detector or changes of split flow rates. The systematic error due to the calibration gas uncertainty (Δχ_cal) is calculated from A_sample, the peak area of the sample measurement, A_cal, the peak area of the calibration standard measurement, V_sample, the sample volume, V_cal, the sample volume of the calibration standard, and δχ_cal, the certified standard uncertainty of the calibration standard including its potential drift. The systematic integration error (Δχ_int) depends on δA_cal, the relative error in peak area due to integration of the calibration measurement, δA_sample, the integration error of the sample measurement, and χ_cal, the mole fraction of the calibration standard peak. If a blank correction has to be applied, the error of this correction is described as the deviation from the mean blank value, where σ_blank is the SD of the zero gas measurements and N is the number of those zero gas measurements.
For more details on the uncertainty calculation of GC-MS1 see Hoerger et al. (2014). The precision of acetone, acetaldehyde, benzene, and toluene was around 5 %, whereas the precision for methanol was 10 %. The total expanded uncertainty was around 15 % for acetone, benzene, and toluene, 23 % for acetaldehyde, and 28 % for methanol (Table 2). These values are in good agreement with previous studies (Apel et al., 2008

Uncertainty of the GC-MS2 measurement
The total uncertainty (ΔU) of the GC-MS2 is calculated from four terms: Δχ_cal, the uncertainty of the standard preparation; Δχ_blank, the uncertainty of the blank level; Δχ_analysis, the uncertainty of the analysis; and Δχ_flow, the uncertainty of the sample flow. Uncertainties of the standard preparation and sample flow were given, respectively, by the manufacturers of the calibration gas standard and the mass flow controller. The blank level uncertainty was calculated as the SD of all blank values measured during the campaign. The uncertainty of the analysis was obtained from the relative SD of the analysis of 15 identical calibration standards. Analytical uncertainties calculated from partial uncertainties at a concentration level of 2 ppb were 17 % for acetone, 4 % for benzene and 5 % for toluene.

Data processing
The concentrations measured with different instruments had temporal discrepancies, as all of the instruments had different sampling intervals. PTR-MS1 measured several compounds sequentially, each with an integration time of 2 s, which led to a 1 min resolution. The ambient concentrations were measured 43 times during each hour, after which the background was sampled 11 times. PTR-MS2 measured ambient concentrations every third hour, during which each of the six measurement heights was sampled every sixth minute. PTR-MS2 also measured the background during the same hour as the ambient concentrations. Each measurement height was sampled eight times, followed by 11 background samples. In this analysis, the concentrations measured at 8.4 m height were used. GC-MS1 collected a sample for 12 min, after which the sample was analysed for 34 min. In order to make the instrument comparison as consistent as possible, the measurements were averaged for the same time periods whenever possible. For the comparison between the two PTR-MSs and between PTR-MS1 and GC-MS2, hourly averages were used. For the comparison between PTR-MS1 and GC-MS1, PTR-MS1 data were averaged for the same 12 min time periods when GC-MS1 samples were taken.
The detection limits of all the instruments were determined as three times the SD of the instrument noise (i.e. the zero air sample concentration). Values below the detection limits were removed from the GC-MS data. When hourly or 12 min average values were calculated from PTR-MS data, the averages were calculated for all data points. If an average value was below the detection limit, it was removed from further analysis. Data points below the detection limits were not removed before average value calculation, in order to avoid biasing the average.
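The averaging and screening order described above (average all points first, then compare the average with the detection limit) can be sketched as follows. The function names, the time representation in minutes and the half-open window convention are assumptions made for illustration.

```python
def window_average(times_min, values, start_min, duration_min):
    """Mean of all values with start <= t < start + duration; None if empty."""
    selected = [
        v for t, v in zip(times_min, values)
        if start_min <= t < start_min + duration_min
    ]
    if not selected:
        return None
    return sum(selected) / len(selected)


def screened_average(times_min, values, start_min, duration_min, detection_limit):
    """Average every point in the window, then drop the average if it is below
    the detection limit. Single points below the limit are kept in the mean,
    so the average is not biased upwards."""
    average = window_average(times_min, values, start_min, duration_min)
    if average is None or average < detection_limit:
        return None
    return average


if __name__ == "__main__":
    # Hypothetical 1 min PTR-MS data averaged over a 12 min GC sampling window.
    times = list(range(20))
    data = [0.1 + 0.01 * t for t in times]
    print(screened_average(times, data, start_min=0, duration_min=12,
                           detection_limit=0.12))
```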

PTR-MS sensitivities
The sensitivities and uncertainties of sensitivities of the two PTR-MSs and the performance of the two different calibration methods were evaluated in separate laboratory tests after the field campaign. The laboratory tests were done by performing a series of calibrations with both automatic and manual calibration methods, while keeping all instrument parameters constant. The same calibration tests were performed separately for both PTR-MSs. A constant ratio was assumed for the sensitivity and its uncertainty, thus the latter was determined as the SD of the sensitivity measurement series. The results of the calibration tests are presented in Table A1.
Generally, PTR-MS2 had higher sensitivity than PTR-MS1 for all compounds except isoprene. This was particularly the case for the larger molecules (xylenes, trimethylbenzene, naphthalene and α-pinene). The higher sensitivity of PTR-MS2 can be partly For most of the compounds, calculated sensitivities of both automatic and manual calibration methods agreed within the sensitivity uncertainty (Table A1). However, for methanol and methyl vinyl ketone, the two sensitivities obtained with the two different methods were divergent for both PTR-MSs. For acetonitrile, the two calibration methods resulted in different sensitivities in the case of PTR-MS2. For naphthalene, the two methods resulted in different sensitivities in the case of PTR-MS1.
The sensitivity uncertainties of both calibration methods were lower for PTR-MS2. Regarding the manual calibration method, the pump used to generate zero air for the calibration of PTR-MS1 caused some fluctuation in the zero air flow and thus increased the sensitivity variation (i.e. the SD) between different calibrations. The sensitivity uncertainty of methanol obtained with the automatic calibration system was clearly higher than the uncertainties of all other compounds, 63 % for PTR-MS1 and 25 % for PTR-MS2. Methanol calibration is difficult due to its strong interaction with metal surfaces, as evidenced by the mass flow controller (de Gouw et al., 2003a). Higher methanol sensitivities and sensitivity uncertainties were obtained with the manual calibration method, which contains fewer metal surfaces than the automatic calibration system. It had also been used for a longer time, and the surfaces of the pressure regulator and needle valve were evidently more saturated with methanol than the metal surfaces of the mixing units that were used for the automatic calibration.
In the case of PTR-MS1, the sensitivity uncertainties were higher than the uncertainties of the signal statistics or the concentration uncertainty of the calibration gas standard (Table 1). The signal uncertainty was 1 % or less for all compounds for PTR-MS1, while for PTR-MS2 the signal uncertainties were higher, and contributed to the total uncertainty. The higher signal uncertainties of PTR-MS2 were due to its rather low sampling frequency (eight samples per hour). The signal uncertainty of toluene was particularly high (65 %).

Total uncertainties of the concentration measurements
The total uncertainties of all instruments were below 30 %, with the exception of the methanol uncertainty of PTR-MS1 and the toluene uncertainty of PTR-MS2 (Table 2). GC-MS2 had low total uncertainties for benzene and toluene concentrations. However, uncertainties of GC-MS2 were defined at a concentration of 2 ppb, which is higher than the concentrations measured for benzene and toluene during this campaign. Thus, these uncertainty values are likely too low for the campaign conditions. GC-MS1 and the two PTR-MSs had somewhat similar uncertainties for benzene. However, the PTR-MS1 uncertainty for toluene concentration was only 2 % while the PTR-MS2 uncertainty for toluene was 45 %. The high total toluene uncertainty of PTR-MS2 follows from the high signal uncertainty.
For acetone and acetaldehyde, the concentration uncertainties of the PTR-MSs were lower than those of the GC-MSs. In the case of methanol, GC-MS1 and PTR-MS2 had similar uncertainties, while PTR-MS1 had a very high total uncertainty (61 %). The high methanol uncertainty of PTR-MS1 was a consequence of the high sensitivity uncertainty. The different location of the PTR-MS2 inlet could partly explain the higher concentrations observed for methanol, acetaldehyde and acetone. Acetaldehyde and acetone are formed in the oxidation of e.g. monoterpenes and methylbutenol (Kesselmeier et al., 1997; Goldstein and Schade, 2000; Villanueva-Fierro et al., 2004; Millet et al., 2010) and acetaldehyde, acetone and methanol are emitted by the surrounding vegetation (Rinne et al., 2007; Aalto et al., 2014; Rantala et al., 2014). The local biogenic contribution of methanol and acetone is likely low as the compounds have relatively long atmospheric lifetimes (4, 16 and 33 days, respectively, in the spring) and high background concentrations originating from distant sources (Patokoski et al., 2014). As such, their concentrations have relatively low small-scale spatial variability at the site.
Occasional traffic at the measurement site may have caused short pollution events of benzene and toluene and concentration differences between the two inlets. However, these episodes were rare and their influence on the hourly average values was prob- wood combustion, as well as distant anthropogenic sources (Hakola et al., 2003; Hellén et al., 2006; Patokoski et al., 2014).

Differences between the instruments by compound
In order to analyze in more detail how consistent the concentration measurements were, boxplots representing the medians and quartiles were drawn for all compounds (Fig. 4). The concentration ranges of different instruments were determined from the boxplots. Accordingly, concentration range is hereafter defined as the range between the 25th and 75th percentiles.
Correlations between different instruments were studied using scatter plots and by calculating Pearson's correlation coefficients (R) between the instruments (Table 3). As PTR-MS2 used a different inlet than the other three instruments did, its measurements were compared only with PTR-MS1.
Additionally, the overall consistency of the concentration measurements of the four different instruments was investigated by calculating: (1) the mean of all correlation coefficients, (2) the root mean square (RMS) difference of the scatter plot slopes from the 1 : 1 line, and (3) the RMS of the intercepts for each compound. The RMS difference between the slopes and the 1 : 1 line (RMS_slope) was calculated as $\mathrm{RMS}_{\mathrm{slope}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\mathrm{slope}_i - 1)^2}$, where slope_i is the slope of a scatter plot and N is the number of slopes used for the calculation. In an ideal case, the scatter plot slopes are close to unity, and the RMS_slope is close to zero. The slope and intercept values of a scatter plot depend on the positioning of the two datasets on the x and y axes, thus all slopes and intercepts were calculated for both axis configurations.
Generally, the measurements of PTR-MS2 were most scattered for all the compounds (Fig. 4). This was at least partly due to the discontinuous measurement cycle of the instrument, which meant that fewer data points (8 per hour) were available for calculating the hourly average than were available when using the PTR-MS1 (43 per hour). When fewer data points are used, individual divergent values have larger effects on the average value, as the SD of the mean is inversely proportional to the square root of the number of data points. Data from the GC-MS2, which had the longest sampling time, were least scattered.
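The three consistency measures can be sketched with plain Python. The function names are hypothetical, and ordinary least-squares regression is an assumption: the text does not state which regression method was used for the scatter plots.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def fits_both_ways(a, b):
    """Fit with each dataset on the x axis in turn (both axis configurations)."""
    return [linear_fit(a, b), linear_fit(b, a)]


def rms_slope(slopes):
    """RMS difference of the slopes from the 1 : 1 line."""
    return math.sqrt(sum((s - 1.0) ** 2 for s in slopes) / len(slopes))


def rms(values):
    """Plain RMS, used here for the intercepts."""
    return math.sqrt(sum(v ** 2 for v in values) / len(values))
```

Fitting both axis configurations matters because the two slopes are not reciprocal unless the correlation is perfect, so a single configuration would favour whichever instrument happens to be on the x axis.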
In the following sections, the concentration distributions and correlations between different instruments are discussed separately for all five compounds.

Methanol
Methanol was measured with three out of four instruments: PTR-MS1, PTR-MS2 and GC-MS1. There were large differences in the concentration ranges of the methanol measurements (Fig. 4). PTR-MS2 measured the highest concentrations, varying from 2.6 to 5.5 ppb. The measurements of PTR-MS1 and GC-MS1 were less scattered and the ranges were more congruent: 1.0-6.0 and 0.7-3.3 ppb respectively. Also, the median methanol concentration of PTR-MS2 (3.6 ppb) was clearly higher than the median of PTR-MS1 (2.2 ppb), whereas the median concentration measured with GC-MS1 was the lowest (1.3 ppb). It is important to note that the measurement uncertainty of PTR-MS1 was very high for methanol (Table 2).
As Fig. 5 and Table 2 show, the correlation of the two PTR-MSs was very good (R = 0.96), but the linear regression slope was 1.80. Thus, concentrations measured with PTR-MS2 are almost two times as high as those measured with PTR-MS1. The correlation between PTR-MS1 and GC-MS1 was also good (0.84), but between these two instruments there was a constant offset and the slope was far from one (0.42). The mean correlations and RMS values of the slopes and intercepts are presented in Table 3, which shows that the measured methanol concentrations correlated well but the RMS slope of 0.87 was far from the ideal 1 : 1 slope.
For methanol the correlation coefficients of this study agreed with those found in prior research. De Gouw et al. (2003b, 2004  above 0.92 and slope values between 1.03 and 1.16 for PTR-MS vs. GC-MS, PTR-MS vs. PTR-MS and PTR-MS vs. PTR-Tof-MS, respectively. In this study the slopes were clearly less robust than in previous studies, indicating that the time trends of methanol can be captured well with all instruments, but also suggesting that the quantitative concentration values of all three instruments should be regarded with suspicion. Methanol measurements are known to encounter some challenges. Calibrating PTR-MS for methanol is difficult because methanol deposits on the metal surfaces of the calibration system (de Gouw et al., 2003a), reducing the sensitivity and potentially making the concentrations seem higher than they actually are. Furthermore, an oxygen isotope (O¹⁷O) is detected with the same mass (33 amu) as methanol in the PTR-MS.

10
However, this is not a problem as it is taken into account in the VMR calculation (Taipale et al., 2008). Apart from the oxygen isotope, no significant interference of any other species has been reported in the literature (de Gouw and Warneke, 2007). The solubility of methanol in water can introduce problems to the GC-MS measurements, because when water is removed from the sample, part of the methanol could 15 be removed as well. An intercomparison campaign was done in 2005 in Germany, during which OVOCs were measured with several GC-MSs at the SAPHIR chamber at Forschungszentrum Jülich (see Apel et al., 2008 for details). During the campaign, the SAPHIR chamber was filled with ambient air and spiked with an unknown number of compounds. The results of the GC-MS1 showed overall good agreement with 20 the other instruments, though a tendency to underestimate the mole fractions in the chamber was observed. For methanol, the loss was around 40 % and it was suspected to occur in the bulk trap during the water removal step. Since the intercomparison in the SAPHIR chamber, the material in the bulk trap of GC-MS1 has aged, and the loss of methanol has increased. During an ACTRIS OVOC intercomparison at Hohenpeissenberg (Germany) in October 2013, the methanol loss was 55 % (unpublished). The methanol concentrations measured during this campaign were corrected for the 55 % loss.
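The effect of such a loss correction can be sketched as follows. The sketch assumes that the correction simply divides the measured mole fraction by the recovered fraction (1 - loss); the exact procedure is not spelled out in the text, and the function name and the example value are hypothetical.

```python
def correct_for_loss(measured_ppb: float, loss_fraction: float) -> float:
    """Scale a measured mole fraction up to compensate for a known fractional loss.

    Assumption: the instrument recovers (1 - loss_fraction) of the analyte,
    so the corrected value is measured / (1 - loss_fraction).
    """
    if not 0.0 <= loss_fraction < 1.0:
        raise ValueError("loss_fraction must be in [0, 1)")
    return measured_ppb / (1.0 - loss_fraction)


# Hypothetical example: 0.45 ppb recorded by an instrument with a 55 % loss.
corrected = correct_for_loss(0.45, 0.55)
print(f"corrected mole fraction: {corrected:.2f} ppb")
```

Under this assumption a 55 % loss corresponds to a scaling factor of 1/0.45, i.e. roughly 2.2, so any uncertainty in the loss estimate propagates strongly into the corrected concentrations.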

Acetaldehyde
Three instruments out of four, PTR-MS1, PTR-MS2 and GC-MS1, measured acetaldehyde. The concentration range was very similar for all the instruments, between 0.3 and 0.6 ppb (Fig. 4). Also, the median concentrations of 0.4, 0.4 and 0.5 ppb for PTR-MS1, PTR-MS2 and GC-MS1, respectively, were within a 25 % range of each other. Despite this agreement, the correlation between the instruments was weak (mean R of 0.50). PTR-MS measures acetaldehyde at a mass of 45 amu, but in air masses that are strongly influenced by biogenic emissions, several other compounds with the same mass (isomers) exist (de Gouw et al., 2003a). Furthermore, de Gouw et al. (2003a) have reported that the acetaldehyde concentration in the calibration gas may decrease over time, which again would lead to an overestimated concentration. The calibration gas standard used in this study was less than one year old during the measurement campaign, so the acetaldehyde concentration in the calibration gas had probably not decreased considerably.

Acetone
Acetone concentrations were measured with all four instruments. GC-MS1 and PTR-MS2 measured similar acetone concentrations, ranging from 0.9 to 1.3 ppb, whereas the range of PTR-MS1 was slightly lower, between 0.8 and 1.1 ppb. The lowest concentrations were measured with GC-MS2, 0.4-0.6 ppb. The median concentrations of PTR-MS1 (0.9 ppb), PTR-MS2 (1.0 ppb) and GC-MS1 (1.1 ppb) were within 20 % of each other, while the median for GC-MS2 was clearly lower (0.5 ppb).
As in previous comparison studies (de Gouw et al., 2003b, 2004; Kaser et al., 2013; Warneke et al., 2015), acetone measurements correlated well in this study. The best correlation coefficient was between the two PTR-MSs (0.97). PTR-MS1 also correlated well with both GC-MS1 (0.88) and GC-MS2 (0.91). The different sampling times of the two GC-MSs could cause at least part of their lower correlation (0.77), as the acetone concentration can vary within one hour. Furthermore, the slope for PTR-MS1 against GC-MS1 was very good (1.03). However, the intercept was 0.2 ppb, indicating a difference in the background levels of acetone for these two instruments. The slope between PTR-MS1 and PTR-MS2 was rather good (1.25). The slopes between GC-MS2 and both PTR-MS1 and GC-MS1 were rather low, 0.56 and 0.47, respectively. This was probably due to the long sampling time, causing acetone to break through the micro trap. Consequently, even though GC-MS2 measured the time trends of acetone as well as the other instruments, it underestimated the quantitative concentrations. The average correlation coefficient for acetone was good (0.88), but the low slope values of GC-MS2 plotted against both PTR-MS1 and GC-MS1 (Fig. 5) increased the RMS slope (0.54). When the RMS slope was calculated only for the PTR-MS1 vs. PTR-MS2 and PTR-MS1 vs. GC-MS1 pairs, it was very close to zero (0.02).
PTR-MS measurements of acetone can be affected by propanal, which is detected at the same mass (59 amu) as acetone. GC-MS1 measured propanal concentrations, and during the whole campaign the propanal concentration was less than 5 % of the acetone concentration. Hence, in this campaign it could be assumed that the PTR-MS signal at mass 59 amu was due to acetone.

Benzene
The measured benzene concentrations of all four instruments were in good agreement, as also found in previous studies (e.g. de Gouw et al., 2003b). In general, the correlations between different instrument pairs were good. The two GC-MSs had the highest correlation (0.92), yet the slope was not close to unity (0.77). The low slope value could be due to the different sampling times of these instruments. However, as benzene does not have local sources at SMEAR II, changes in benzene concentration are slow and different sampling times should not have a great effect on the measured concentrations. PTR-MS1 correlated equally well with both PTR-MS2 and GC-MS2 (0.88 and 0.89, respectively). The slope of PTR-MS1 vs. GC-MS2 was reasonably good (0.84), while the slope between the two PTR-MSs was rather high (1.38). Between PTR-MS1 and GC-MS1, the correlation was 0.84 and the slope was very good (0.99). The average correlation coefficient of benzene was the same as that of acetone (0.88), and the RMS slope (0.23) was lower than it was for the other compounds.
Good correlations were expected for benzene, as the temporal and spatial changes in benzene concentration are small at the site and there are no known problems concerning benzene measurements with either GC-MS or PTR-MS. The PTR-MS signal at mass 79 amu has been reported to originate from benzene only; thus, the PTR-MS benzene measurements are not subject to interference from other VOCs.

Toluene
Toluene was measured with all four instruments. The concentration ranges of PTR-MS1, GC-MS1 and GC-MS2 were the same, from 0.01 to 0.08 ppb, with a median of 0.03 ppb. Due to high detection limits for toluene, the toluene concentrations measured with PTR-MS2 were higher (0.02-0.16 ppb), and the median value (0.07 ppb) was more than twice as high as that measured by the other instruments.
Although the concentrations of the three instruments agreed well, their correlation values were only moderate. The mean R was 0.62, while the RMS slope was rather far from zero, at 0.45. The best correlation was between the two GC-MSs (0.77). Similarly to benzene, toluene does not have local sources at the site, so the effect of the different sampling times of the two GC-MSs should not be considerable. Yet, the slope of GC-MS1 vs. GC-MS2 was far from unity (0.60). Between PTR-MS1 and GC-MS2 the slope was good (0.92) and the correlation coefficient of 0.69 was fairly good, but the slope had a rather wide confidence interval (±0.18). Both the correlation and the slope between PTR-MS1 and GC-MS1 were low, 0.53 and 0.55, respectively. The lowest correlation was between the two PTR-MSs (0.50). Their slope was 1.36, with a wide confidence interval of ±0.52. The toluene concentration remained below the detection limits of the PTR-MSs for a large part of the campaign, biasing the concentrations towards higher values. The number of data points used for the correlation analysis of toluene was less than half of the number of data points used for the other compounds.
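The bias introduced by a detection limit can be illustrated with a minimal sketch. It assumes that values below the detection limit are simply discarded before averaging, which may differ from the data treatment used in the campaign; the mole fractions and the detection limit below are hypothetical.

```python
from statistics import mean

# Hypothetical, synthetic toluene mole fractions (ppb); these are not campaign data.
true_values = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.08]
detection_limit = 0.03  # hypothetical detection limit (ppb)

# Keep only the values at or above the detection limit.
detected = [c for c in true_values if c >= detection_limit]

print(f"all data:      n = {len(true_values)}, mean = {mean(true_values):.3f} ppb")
print(f"above the LOD: n = {len(detected)}, mean = {mean(detected):.3f} ppb")
```

Discarding the low values both raises the mean of the remaining data and reduces the number of points available for the correlation analysis.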
In the study by de Gouw et al. (2003b), the correlation between PTR-MS and GC-MS was stronger (R > 0.98 and slope = 1.08) than the correlations found in this study. Additionally, the correlation coefficients between PTR-MS and PTR-Tof-MS reported by Kaser et al. (2013) and Warneke et al. (2015) were stronger than the ones measured in this study.
It has been suggested that a p-cymene fragment is detected at the same mass (93 amu) as toluene with PTR-MS (Ambrose et al., 2010). Kaser et al. (2013) reported that after correcting the toluene signals for p-cymene, the linear regression between PTR-Tof-MS and another mass spectrometer improved from 0.72 to 0.98. During this campaign, p-cymene parent ion concentrations were not measured with the PTR-MSs. Earlier measurements at SMEAR II showed that between 12 April and 15 May 2011, the toluene concentration was on average 15 times higher than the p-cymene concentration. The mean p-cymene concentration was 8 ppt, while the maximum concentration was 107 ppt (Hakola et al., 2012). Consequently, p-cymene may occasionally have an effect on the toluene concentrations measured with PTR-MS. High p-cymene concentrations could be expected e.g. during the monoterpene pollution episodes (Liao et al., 2011) from the nearby sawmill.

Conclusions
Ambient concentrations of methanol, acetaldehyde, acetone, benzene and toluene were measured using two PTR-MSs and two GC-MSs at a rural boreal forest site in the spring of 2012. Additionally, two different calibration methods, automatic and manual, were tested for the PTR-MSs.
The calibration tests showed that both calibration methods resulted in similar sensitivities for acetaldehyde, acetone, benzene and toluene. For methanol, the automatic method resulted in lower sensitivities than the manual calibration method did. Also, the sensitivity uncertainties of both PTR-MSs were higher for methanol than for the other compounds.
Very good correlation was found for benzene and acetone measurements between all instrument pairs. The mean correlation coefficient was 0.88 for both compounds. In the case of acetone, however, the RMS difference from the 1 : 1 line was high. Probably due to the long sampling time of GC-MS2, acetone broke through the adsorbent trap, resulting in measured concentrations that were too low. When the acetone data of GC-MS2 were omitted from the calculation, the RMS difference from the 1 : 1 line was close to zero. To measure acetone or other very volatile OVOCs using GC-MS2, it is recommended to use a shorter sampling time, a lower flow, or a stronger or cooled adsorbent trap. The correlation coefficients of acetaldehyde and toluene were quite far from unity, with respective averages of 0.50 and 0.62. The cause of the poor correlation in the case of acetaldehyde remains unresolved. Toluene concentrations were below the detection limits of the PTR-MSs for a considerable part of the time, which biased the concentrations towards higher values and also reduced the number of data points used for analysis.

Methanol measurements showed a robust correlation between the instruments. However, the slope values were far from unity, with an RMS difference of 0.87 from the 1 : 1 line. Hence, all the instruments measured the same time trends of methanol, but the quantitative concentration values must be regarded with caution. It should be noted that the uncertainty in the sensitivity of the instruments, manifesting as the deviation of the correlation slopes from unity, leads directly to a similar uncertainty in the emission measurements of these compounds. This applies to e.g. eddy covariance, surface layer gradient and chamber techniques. Thus, it can be easily estimated that e.g. any emission measurement of methanol has an uncertainty of 50-100 % due to the sensitivity of the instrument used. The results of this study show that when doing long-term measurements of ambient air, occasional comparison measurements are needed to validate the measured concentrations, even if the instrument is calibrated regularly.
