Towards standardized processing of eddy covariance flux measurements of carbonyl sulfide

Carbonyl sulfide (COS) flux measurements with the eddy covariance (EC) technique are growing in popularity with the recent development in using COS to estimate gross photosynthesis at the ecosystem scale. Flux data intercomparison would benefit from standardized protocols for COS flux data processing. In this study, we analyze how various data processing steps affect the final flux and provide a method for gap-filling COS fluxes. Different methods for determining the lag time between COS mixing ratio and the vertical wind velocity (w) resulted in a maximum of 12 % difference in the cumulative COS flux. 5 Due to limited COS measurement precision, small COS fluxes (below approximately 3 pmol m−2 s−1) could not be detected when the lag time was determined from maximizing the covariance between COS and w. We recommend using a combination of COS and carbon dioxide (CO2) lag times in determining the COS flux, depending on the flux magnitude compared to the detection limit of each averaging period. Different high frequency spectral corrections had a maximum effect of 10 % on COS flux calculations and different detrending methods only 1.2 %. Relative total uncertainty was more than five times higher for 10 low COS fluxes (lower than ±3 pmol m−2s−1) than for low CO2 fluxes (lower than ±1.5 μmol m−2s−1), indicating a low signal-to-noise ratio of COS fluxes. Due to similarities in ecosystem COS and CO2 exchange, and the low signal-to-noise ratio of COS fluxes that is similar to methane, we recommend a combination of CO2 and methane flux processing protocols for COS EC fluxes.


Introduction
Carbonyl sulfide (COS) is the most abundant sulfur compound in the atmosphere, with tropospheric mixing ratios around 500 ppt (Montzka et al., 2007). During the last decade, studies on COS have grown in number, mainly motivated by the use of COS exchange as a tracer for photosynthetic carbon uptake (also known as gross primary productivity, GPP; Sandoval-Soto et al., 2005;Blonquist et al., 2011;Asaf et al., 2013;Whelan et al., 2018). COS shares the same diffusional pathway in leaves as carbon dioxide (CO 2 ), but in contrast to CO 2 , COS is destroyed completely by hydrolysis and is not emitted. This one-way flux makes it a promising proxy for GPP (Asaf et al., 2013;Yang et al., 2018;Kooijmans et al., 2019).
Eddy covariance (EC) measurements are the backbone of gas flux measurements at the ecosystem scale (Aubinet et al., 2012). Protocols for instrument setup, monitoring, and data processing have been recently harmonized for CO 2 Sabbatini et al., 2018) as well as for methane (CH 4 ) and nitrous oxide (N 2 O; Nemitz et al., 2018) within the Integrated Carbon Observation System (ICOS) flux stations (Franz et al., 2018). The EC data processing chain includes despiking and filtering raw data, rotating the coordinate system to align with the prevailing streamlines, determining the time lag between the sonic anemometer and the gas analyzer signals, removing trends to separate the turbulent fluctuations from the mean trend, calculating covariances, and correcting for flux losses at low and high frequencies. After processing, fluxes are quality-filtered and flagged according to atmospheric turbulence characteristics and stationarity.
Studies on ecosystem COS flux measurements with the EC technique are still limited (Asaf et al., 2013;Billesbach et al., 2014;Maseyk et al., 2014;Commane et al., 2015;Gerdel et al., 2017;Wehr et al., 2017;Yang et al., 2018;Kooijmans et al., 2019;Spielmann et al., 2019), and there is no standardized flux processing protocol for COS EC fluxes. Table 1 summarizes the processing steps used in earlier studies. Most studies do not report all necessary steps and in particular often ignore the storage change correction. COS EC flux measurements and data processing have similarities with other trace gases (e.g., CH 4 and N 2 O) that often have low signal-to-noise ratios, especially regarding time lag determination and frequency response corrections. Time lag determination is essential for aligning wind and gas concentration measurements to minimize biases in flux estimates. Frequency response corrections, on the other hand, are needed for correcting flux underestimation due to signal losses at both high and low frequencies (Aubinet et al., 2012). Unlike for CH 4 or N 2 O, there are no sudden bursts or sinks expected for COS, and in that sense some processing steps for COS are more like those for CO 2 (e.g., despiking, storage change correction, and friction velocity, u * , filtering). Gerdel et al. (2017) describe the issues of different detrending methods, high-frequency spectral correction, time lag determination, and u * filtering. However, there has not been any study comparing different methods for time lag determination or high-frequency spectral correction in terms of their effects on COS fluxes. This weakens our ability to assess uncertainties in COS flux measurements.
In this study, we compare different methods for detrending, time lag determination, and high-frequency spectral correction. In addition, we compare two methods for storage change flux calculation, discuss the nighttime low-turbulence problem in the context of COS EC measurements, introduce a method for gap-filling COS fluxes for the first time, and discuss the most important sources of random and systematic errors. Through the evaluation of these processing steps, we aim to settle on a set of recommended protocols for COS flux calculation.

Materials and methods
In this study we used COS and CO 2 EC flux datasets collected at the Hyytiälä ICOS station in Finland from 26 June to 2 November 2015. The site has a long history of flux and concentration observations , and a COS analyzer was introduced to the site in March 2013. In this section, we describe methods used in the reference and alternative data processing schemes.

Site description
Measurements were made in a boreal Scots pine (Pinus sylvestris) stand at the Station for Measuring Forest Ecosystem-Atmosphere Relations (SMEAR II) in Hyytiälä,Finland (61 • 51 N,24 • 17 E; 181 m above sea level). The Scots pine stand was established in 1962 and reaches at least 200 m in all directions and about 1 km to the north . The site is characterized by modest height variation, and an oblong lake is situated about 700 m to the southwest of the forest station (Rannik, 1998;Vesala et al., 2005). Canopy height was 17 m, and the all-sided leaf area index (LAI) was approximately 8 m 2 m −2 in 2015. EC measurements were done at 23 m height. Sunrise time varied from 02:37 in June to 07:55 in November, while sunset was at 22:14 at the beginning and 16:17 at the end of the measurement period. All results are presented in Finnish winter time (UTC+2), and nighttime is defined as periods with a photosynthetic photon flux density (PPFD) of < 3 µmol m −2 s −1 .

EC measurement setup
The EC setup consisted of an ultrasonic anemometer (Solent Research HS1199, Gill Instruments Ltd., England, UK) for measuring wind speed in three dimensions and sonic temperature; an Aerodyne quantum cascade laser spectrometer (QCLS; Aerodyne Research Inc., Billerica, MA, USA) for measuring COS, CO 2 , and water vapor (H 2 O) mole fractions; and an LI-6262 infrared gas analyzer (LI-COR, Lincoln, NE, USA) for measuring H 2 O and CO 2 mole fractions. All measurements were recorded at 10 Hz frequency and were made with a flow rate of approximately 10 L min −1 for the QCLS and 14 L min −1 for the LI-6262. The PTFE sampling tubes were 32 and 12 m long for QCLS and LI-6262, respectively, and both had an inner diameter of 4 mm. Two PTFE filters were used upstream of the QCLS inlet to prevent any contaminants from entering the analyzer sample cell: one coarse filter (0.45 µm, Whatman) followed by a finer filter (0.2 µm, Pall Corporation) at approximately 50 cm distance to the analyzer inlet. The Aerodyne QCLS used an electronic pressure control system to control the pressure fluctuations in the sampling cell. The QCLS was run at 20 Torr sampling cell pressure. An Edwards XDS35i scroll pump (Edwards, England, UK) was used to pump air through the sampling cell, while the LI-6262 had flow control by an LI-670 flow control unit.
Background measurements of high-purity nitrogen (N 2 ) were done every 30 min for 26 s to remove background spectral structures in the QCLS (Kooijmans et al., 2016). In addition, a calibration cylinder was sampled each night Table 1. Processing steps used in previous COS eddy covariance studies. Detrending methods include linear detrending (LD), block averaging (BA), and recursive filtering (RF). Spectral corrections include an experimental method and a theoretical method by Moore (1986). Processing steps not specified in the original articles are marked with a dash. The last row summarizes the recommended processing options in this study.  (Kooijmans et al., 2016). The standard deviation calculated from the cylinder measurements was 19 ppt for COS mixing ratios and 1.3 ppm for CO 2 at 10 Hz measurement frequency. It has previously been shown that water vapor in the sample air can affect the measurements of COS through spectral interference of the COS and H 2 O absorption lines (Kooijmans et al., 2016). This spectral interference was corrected for by fitting the COS spectral line separately from the H 2 O spectral line.
The computer embedded in the Aerodyne QCLS and the computer that controlled the sonic anemometer and logged the LI-6262 data were synchronized once a day with a separate server computer. Analog data signals from the LI-6262 were acquired by the Gill anemometer sensor input unit, which digitized the analog data and appended them to the digital output data string. Digital Aerodyne data were collected on the same computer with a serial connection and recorded in separate files with custom software (COSlog).

Profile measurements
Atmospheric concentration profiles were measured with another Aerodyne QCLS at a sampling frequency of 1 Hz. Air was sampled at five heights: 125, 23, 14, 4, and 0.5 m. A multiposition Valco valve (VICI, Valco Instruments Co. Inc.) was used to switch between the different profile heights and calibration cylinders. Each measurement height was sampled for 3 min each hour. One calibration cylinder was measured twice for 3 min each hour to correct for instrument drift, and two other calibration cylinders were measured once for 3 min each hour to assess the long-term stability of the measurements. A background spectrum was measured once every 6 h using high-purity nitrogen (N 7.0; for more details, see Kooijmans et al., 2016). The overall uncertainty of this analyzer was determined to be 7.5 ppt for COS and 0.23 ppm for CO 2 at 1 Hz frequency (Kooijmans et al., 2016). The measurements are described in more detail in Kooijmans et al. (2017).

Eddy covariance fluxes
In this section, we describe the processing steps of EC flux calculation from raw data handling to final flux gap filling and uncertainties. Figure 1 provides a graphical outline of all processing steps. The different processing options presented here are applied and discussed in Sect. 3. In the next section, the different processing schemes are compared to a "reference scheme", which consists of linear detrending, planar-fit coordinate rotation, using CO 2 time lag for COS, and experimental spectral correction. A subset of the data -nighttime fluxes processed with the reference scheme -was published in Kooijmans et al. (2017).

Preprocessing
For flux calculation, the sonic anemometer and gas analyzer signals need to be synchronized. This is particularly relevant for fully digital systems where digital data streams are Figure 1. Different EC processing steps summarized. Yellow boxes refer to steps only used for COS data processing, blue boxes to steps used only for CO 2 data, and green boxes to steps that are relevant for both gases. Recommended options are written in bold. Options that are used in the reference processing scheme for COS in this study are planar-fit coordinate rotation, linear detrending, CO 2 time lag, experimental high-frequency correction, low-frequency correction according to Rannik and Vesala (1999), and storage change flux from measured concentration profile. The abbreviations most commonly used throughout the text are written in parentheses. gathered from different instruments that can be completely asynchronous to each other . The following procedure was used to combine two data files of 30 min length (of which one includes sonic anemometer and LI-6262 data and the other includes Aerodyne QCLS data): (1) the cross-covariance of the two CO 2 signals (QCLS and LI-6262) was calculated; (2) the QCLS data were shifted so that the cross-covariance of the CO 2 signals was maximized. Note that this will result in having the same time lag for the QCLS and LI-6262. The time shift between the two computers was a maximum of 10 s, with most varying between 0 and 2 s during 1 d. It is also possible to shift the time series by maximizing the covariance of CO 2 and w, which will then already account for the time lag (Fig. S1 in the Supplement) or combine files according to their time stamps and allow a longer window in which the time lag is searched. However, in this case it is important that the time lag (and computer time shift) is determined from CO 2 measurements only as using COS data might result in several covariance peaks in longer time frames due to low signal-to-noise ratios and small fluxes.
Raw data were then despiked so that the difference between subsequent data points was a maximum of 5 ppm for CO 2 , 1 mmol mol −1 for H 2 O, 200 ppt for COS, and 5 m s −1 for w. After despiking, the missing values were gap-filled by linear interpolation.
We used the planar-fit method to rotate the coordinate frame so that the turbulent flux divergence is as close as possible to the total flux divergence. In this method, w is assumed to be zero only on longer timescales (weeks or even longer). A mean streamline plane is fitted to a long set of wind measurements. Then the z axis is fixed as perpendicular to the plane, and the v wind component is fixed to be zero (Wilczak et al., 2001). In addition, we used 2D coordinate rotation for coordinate rotation in two processing schemes to determine the flux uncertainty that is related to flux data processing (Sect. 2.4.6). First, the average u component was forced to be along the prevailing wind direction. The second rotation was performed to force the mean vertical wind speed (w) to be zero (Kaimal and Finnigan, 1994). In this way, the x axis is parallel and z axis perpendicular to the mean flow. While 2D coordinate rotation is the most commonly used rotation method, the planar-fit method brings benefits especially in complex terrain  and is nowadays recommended as the preferred coordinate rotation method .
To separate the mixing ratio time series into mean and fluctuating parts, we tested three different detrending options: (1) 30 min block averaging (BA), (2) linear detrending (LD), and (3) recursive filtering (RF) with a time constant of 30 s. BA is the most commonly used method for averaging the data with the benefit of damping the turbulent signal the least. On the other hand, BA may lead to an overestimation of the fluctuating part (and thus overestimation of the flux), for example due to instrumental drift or large-scale changes in atmo-spheric conditions that are not related to turbulent transfer (Aubinet et al., 2012). The LD method fits a linear regression to the averaging period and thus gets rid of instrumental drift and to some extent weather changes but may lead to underestimation of the flux if the linear trend was related to actual turbulent fluctuations in the atmosphere. The third method, RF, uses a time window (here 30 s) for a moving average over the whole averaging period. RF brings the biggest correction and thus lowest flux estimate compared to other methods but effectively removes biased low-frequency contributions to the flux. An Allan variance was determined for a time period when the instrument was sampling from a gas cylinder (Werle, 2010). The time constant of 30 s for RF was determined from the Allan plot ( Fig. S4) as the system starts to drift in a nonlinear fashion at 30 s following the approach suggested by Mammarella et al. (2010). The effect of different detrending methods is shown and discussed in Sect. 3.

Time lag determination
The time lag between w and COS signals was determined using the following five methods: 1. The time lag was determined from the maximum difference of the cross-covariance of the COS mixing ratio and w (w χ COS ) to a line between covariance values at the lag window limits (referred to hereafter as COS lag ). This also applies to other covariance methods explained below and prevents the time lag from being exactly at the lag window limits. Lag window limits (from 1.5 to 3.8 s) were determined based on the nominal time lag of 2.6 s calculated from the flow rate and tube dimensions. More flexibility was given to the upper end of the lag window as time lags have been found to be longer than the nominal time lag (Massman, 2000;Gerdel et al., 2017).
2. The time lag was determined from the maximum difference of the cross-covariance of the CO 2 mixing ratio and w (w χ CO 2 ) to a line between covariance values at the lag window limits within the lag window of 1.5-3.8 s (referred to hereafter as CO2 lag ).
3. The time lag was determined using a constant time lag of 2.6 s, which was the nominal time lag and the most common lag for CO 2 with our setup (referred to hereafter as Const lag ).
4. The time lag was determined from the maximum difference of the smoothed w χ COS to a line between covariance values at the lag window limits. The crosscovariance was smoothed with a 1.1 s running mean (referred to hereafter as RM lag ). The averaging window was chosen so that it provided a more distinguishable covariance maximum while still preventing a shift in the timing of the maximum. 5. The time lag was determined from a combination of COS lag and CO2 lag . First, the random flux error due to instrument noise was calculated according to Mauder et al. (2013): where instrumental noise σ noise c was estimated from the method proposed by Lenschow et al. (2000), σ w is the standard deviation of the vertical wind speed, and N is the number of data points in the averaging period. The random error was then compared to the raw maximum covariance. If the maximum covariance was higher than 3 times the random flux error, then the COS lag method was used for time lag determination. If the covariance was dominated by noise (the covariance being smaller than 3 times the random error) or COS lag was at the lag window limit, then the CO2 lag lag method was selected as proposed in Nemitz et al. (2018; referred to hereafter as the DetLim lag ).

Frequency response correction
Some of the turbulence signal is lost at both high and low frequencies due to losses in sampling lines, inadequate frequency response of the instrument, and inadequate averaging times among other reasons (Aubinet et al., 2012). In this section, we describe both high-and low-frequency loss corrections in detail. We tested two high-frequency correction methods, described below, simultaneously correcting for low-frequency losses. One run was performed with neither low-frequency nor high-frequency response corrections.

High-frequency correction
Especially in closed-path systems, the high-frequency turbulent fluctuations of the target gas damp at high frequencies due to long sampling lines. Other reasons for high-frequency losses include sensor separation and inadequate frequency response of the sensor. In turn, high-frequency losses cause the normalized cospectrum of the gas with w to be lower than expected at high frequencies, resulting in lower flux. The flux attenuation factor (FA) for a scalar s is defined as where w s meas and w s are the measured and unattenuated covariances, respectively; T ws (f ) is the net transfer function, specific to the EC system and scalar s; and Co ws (f ) is the cospectral density of the scalar flux w s . For solving FA, a cospectral model and transfer function are needed. In this study, we tested the effect of high-frequency spectral correction by applying either an analytical correction for highfrequency losses (Horst, 1997) or an experimental correction (Aubinet et al., 2000). The analytical correction was based on scalar cospectra defined in Horst (1997), the experimental approach was based on the assumption that temperature cospectrum is measured without significant error, and the normalized scalar cospectra were compared to the normalized temperature cospectrum (Aubinet et al., 2000;Wohlfahrt et al., 2005;Mammarella et al., 2009). In the analytical approach by Horst (1997) the integral in Eq.
(2) is solved analytically by assuming a model cospec- The flux attenuation then results in where α = 7/8 for neutral and unstable stratification, and α = 1 for stable stratification in the surface layer, τ s is the overall EC system time constant, and f m is the frequency of the logarithmic cospectrum peak estimated from f m = n m u z m −d , where n m is the normalized frequency of the cospectral peak, u the mean wind speed, z m the measurement height, and d the displacement height. The normalized frequency of the cospectral peak n m is dependent on atmospheric stability ζ = z m −d L (Horst, 1997): where L is the Obukhov length (Fig. S8). The model cospectrum for the analytical high-frequency spectral correction was adapted from Kaimal et al. (1972) and given as In the experimental approach, we solved Eq. (2) numerically and used the fitting of the measured temperature cospectra to define a site-specific scalar cospectral model (De Ligne et al., 2010) as where the stability dependency of the cospectral peak frequency n m (Fig. S8) followed the equation In both approaches (analytical and experimental), the time constant τ s was empirically estimated by fitting the transfer function T ws (f ) to the normalized ratio of cospectral densities: where N θ and N s are normalization factors, and Co ws and Co wθ are the scalar and temperature cospectra, respectively. The estimated time constant was 0.68 s for the Aerodyne QCLS and 0.35 s for the LI-6262.

Low-frequency correction
Detrending the turbulent time series, especially with LD or RF methods, may also remove part of the real low-frequency variations in the data (Lenschow et al., 1994;Kristensen, 1998), which should be corrected for in order to avoid flux underestimation. Low-frequency correction in this study for different detrending methods was done according to Rannik and Vesala (1999).

Flux quality criteria
The calculated fluxes were accepted when the following criteria were met: the second wind rotation angle (θ) was below 10 • , the number of spikes in 1 half hour was less than 100, the COS mixing ratio was higher than 200 ppt, the CO 2 mixing ratio ranged between 300 and 650 ppm, and the H 2 O mixing ratio was higher than 1 ppb. We used a similar flagging system for micrometeorological quality criteria as Mauder and Foken (2006) for COS and CO 2 : flag 0 was given if flux stationarity was less than 0.3 (meaning that covariances calculated over 5 min intervals deviate less than 30 % from the 30 min covariance), kurtosis was between 1 and 8, and skewness was within a range from −2 to 2. Flag 1 was given if flux stationarity was from 0.3 to 1 and if kurtosis and skewness were within the ranges given earlier. Flag 2 was given if these criteria were not met.
In addition to these filtering and flagging criteria, we added friction velocity (u * ) filtering to screen out data collected under low-turbulence conditions. A decrease in measured EC flux is usually observed under low-turbulence conditions, although it is assumed that gas exchange should not decrease due to low turbulence. While this assumption holds for CO 2 , it may not be justified for COS (Kooijmans et al., 2017), as is further discussed in Sect. 3.5. The appropriate u * threshold was derived from a 99 % threshold criterion (Papale et al., 2006;Reichstein et al., 2005). The lowest acceptable u * value was determined from both COS and CO 2 nighttime fluxes.

Storage change flux calculation
Storage change fluxes were calculated from gas mixing ratio profile measurements and by assuming a constant profile throughout the canopy using EC system mixing ratio measurements. Storage change fluxes from mixing ratio profile measurements were calculated using the formula where p is the atmospheric pressure, T a air temperature, R the universal gas constant, and χ c (z) the gas mixing ratio at each measurement height. The integral was determined from the hourly measured χ c profile at 0.5, 4, 14, and 23 m by integrating an exponential fit through the data (Kooijmans et al., 2017). When the profile measurement was not available, storage was calculated from the COS (or CO 2 ) mixing ratio measured by the EC setup. Another storage change flux calculation was done assuming a constant profile from the EC measurement height (23 m) to the ground level. A running average over a 5 h window was applied to the COS mixing ratio data to reduce the random noise of measurements.
The storage change fluxes were used to correct the EC fluxes for storage change of COS and CO 2 below the flux measurement height.

Flux uncertainty
The flux uncertainty was calculated according to ICOS recommendations presented by Sabbatini et al. (2018). First, flux random error was estimated as the variance of covariance according to Finkelstein and Sims (2001): where N is the number of data points (18 000 for 30 min of EC measurements at 10 Hz) and m the number of samples sufficiently large to capture the integral timescale (18 000 was used in this study).γ w,w is the variance andγ w,c the covariance of the measured variables w and c (in this case, the vertical wind velocity and gas mixing ratio).
As the chosen processing schemes affect the resulting flux, the uncertainty related to the used processing options have to be accounted for. This uncertainty was estimated as where F c,j is the flux calculated according to j = 1, . . ., N different processing schemes, and N is the number of possible processing scheme combinations that are equally reliable but cause variability in fluxes. For simplicity, the processing steps we considered here are detrending, coordinate rotation, and high-frequency spectral correction, leading to N = 8 processing schemes: BA with 2D coordinate rotation and experimental spectral correction, BA with 2D rotation and Horst (1997) spectral correction, BA with planar fitting and experimental correction, BA with planar fitting and Horst (1997) correction, LD with 2D rotation and experimental correction, LD with 2D rotation and Horst (1997) correction, LD with planar fitting and experimental correction, and LD with planar fitting and Horst (1997) correction. As all the different processing schemes will lead to slightly different random errors as well, the flux random error was estimated to be the mean of Eq. (12) for different processing schemes: The combined flux uncertainty is then the summation of rand and proc in quadrature: To finally get the total uncertainty as the 95th percentile confidence interval, the total uncertainty becomes total = 1.96 comb .
It should be noted, though, that this uncertainty estimate holds for single 30 min fluxes only. When working with fluxes averaged over time, the total uncertainty cannot be directly propagated to the long-term averages because the two uncertainty sources behave differently. The random uncertainty is expected to decrease with increasing number of observations, while processing-related uncertainty would not be much affected by time averaging. The random uncertainty of a flux averaged over multiple observations is obtained as where N is the number of observations and rand,i the random uncertainty of each observation . In this study, we calculated the time-averaged processing uncertainty from Eq. (13) with time-averaged fluxes from the four different processing schemes. The total uncertainty of timeaveraged flux was then calculated similarly from Eqs. (15) and (16) with time-averaged random and processing uncertainties.

Flux gap filling
Missing CO 2 fluxes were gap-filled according to Reichstein et al. (2005), while missing COS fluxes were replaced by simple model estimates or by hourly mean fluxes if model estimates were not available in a way comparable to gap filling of CO 2 fluxes (Reichstein et al., 2005). The COS gap-filling function was parameterized in a moving time window of 15 d to capture the seasonality of the fluxes. To calculate gap-filled fluxes, the parameters were interpolated daily. Gaps where any driving variable of the regression model was missing were filled with the mean hourly flux during the 15 d period.
We tested different combinations of linear or saturating (rectangular hyperbola) functions of the COS flux on PPFD and linear functions of the COS flux against vapor pressure deficit (VPD) or relative humidity (RH). The saturating light response function with the mean nighttime flux as a fixed offset term explained the short-term variability of COS flux relatively well, but the residuals as a function of temperature, RH, and VPD were clearly systematic. Therefore, for the final gap filling, we used a combination of saturating function on PPFD and linear function on VPD that showed good agreement with the measured fluxes while having a relatively small number of parameters: where I is PPFD (µmol m −2 s −1 ); D is VPD (kPa); and a (pmol m −2 s −1 ), b (µmol m −2 s −1 ), c (pmol m −2 s −1 kPa −1 ), and d (pmol m −2 s −1 ) are fitting parameters. Parameter d was set as the median nighttime COS flux over the 15 d window, and other parameters were estimated accordingly.
3 Results and discussion

Detrending methods
In order to check the contribution of different detrending methods to the resulting flux, we made flux calculations with different methods: block averaging (BA), linear detrending (LD), and recursive filtering (RF) using the same time lag (CO 2 time lag) for all runs (Fig. S5). The largest median COS flux (most negative) was obtained from RF (−12.0 pmol m −2 s −1 ), while the smallest median (least negative) flux resulted from BA (−10.7 pmol m −2 s −1 ), and LD (−11.3 pmol m −2 s −1 ) differed by 5.3 % from BA ( Table 2). The range of the 30 min COS flux was the largest (from −183.2 to 82.56 pmol m −2 s −1 ) when using the BA detrending method, consistent with a similar comparison in Gerdel et al. (2017). In comparison, it was from −107.3 to 73.1 pmol m −2 s −1 for LD and from −164.9 to 36.8 pmol m −2 s −1 for RF. While it was surprising that RF resulted in a more negative median flux than BA, it is likely explained by the large variation in BA with compensating high negative and positive flux values as the positive flux values with BA and LD are higher than with RF. In addition, it has to be kept in mind that the fluxes were corrected for low-frequency losses, which in part also accounts for the effects of detrending.
The most commonly recommended averaging methods are BA Nemitz et al., 2018;Moncrieff et al., 2004) and LD (Rannik and Vesala, 1999) because they have less of an impact on spectra (Rannik and Vesala, 1999) and require fewer low-frequency corrections. These are also the most used detrending methods in previous COS EC flux studies (Table 1). RF may underestimate the flux (Aubinet et al., 2012), but as spectroscopic analyzers are prone to fringe effects under field conditions, for example, the use of RF might still be justified (Mammarella et al., 2010). Regular checking of raw data provides information on instrumental drift and helps to determine the optimal detrending method. It is also recommended to check the contribution of each detrending method to the final flux to better understand what the low-frequency contribution in each measurement site and setup is.

Effect of time lag correction
Different time lag methods resulted in slightly different time lags and COS fluxes. The most common time lags were 2.6 s from the COS lag , CO2 lag , and DetLim lag methods and 1.5 s from RM lag , which was the lag window limit (Fig. 2). Time lags determined from the COS lag and RM lag methods were distributed evenly throughout the lag window, whereas the lags from the CO2 lag and DetLim lag methods were normally distributed, with most lags detected at the window center.
We also tested determining time lags from the most commonly used method of maximizing the absolute covariance. If the time lag was determined from the absolute covariance maximum instead of the maximum difference to a line between covariance values at the lag window limits, the COS lag and RM lag had most values at the lag window limits (Fig. S2). This resulted in a "mirroring effect" (Langford et al., 2015); i.e., fluxes close to zero were not detected as often as with other methods since the covariance is always maximized, and the derived flux switches between positive and negative values of similar magnitude (Fig. S3). This issue should be taken into account in COS EC flux processing as absolute COS covariance maximum is by far the most commonly used method in determining the time lag in COS studies (Table 1). To make the different methods more comparable, the time lags were, in the end, determined from the maximum difference to a line between covariance values at the lag window limits in this study. In this way, time lags were not determined at the window borders, and most of the methods had the final flux probability distribution function (PDF) peak approximately at the same flux values and had otherwise small differences in the distributions (Fig. 3). The only method that was clearly different from the others was DetLim lag , which produced higher fluxes than other methods.
A constant time lag has been found to bias the flux calculation as the time lag can likely vary over time due to, for example, fluctuations in pumping speed (Langford et al., 2015;Taipale et al., 2010;Massman, 2000). However, as the CO2 lag was often the same as the chosen constant lag of 2.6 s, we did not observe major differences between these two methods. A reduced bias in the flux calculation with smoothed cross-covariance was introduced by Taipale et al. (2010), who recommended using this method for any EC system with a low signal-to-noise ratio. However, we do not recommend this as a first choice since the time lags do not have a clear distribution, and if the maximum covariance method is used, we find a mirroring effect with the RM lag in the final flux distributions.
By using the DetLim lag method, the COS time lag was estimated for 54 % of cases from COS lag , while the CO2 lag was used as a proxy for the COS time lag in about 46 % of cases. Figure 3 shows that the raw covariance of COS only exceeds the noise level at higher COS flux values, and thus the COS lag is chosen by this method only at higher fluxes, as expected. At lower flux rates, and especially close to zero, the COS fluxes are not high enough to surpass the noise level, and thus the CO2 lag is chosen.
The median COS fluxes were highest when the time lag was determined from the DetLim lag (−13.1 pmol m −2 s −1 ) and COS lag (−11.5 pmol m −2 s −1 ) methods (Table 2). Using the COS lag results in both higher positive and negative fluxes and might thus have some compensating effect on the median fluxes. Const lag and RM lag produced the smallest median uptake (−11.1 pmol m −2 s −1 ), while the CO2 lag had a median flux of −11.3 pmol m −2 s −1 . The difference between the reference flux and the flux from the DetLim lag method was clearly higher (15.9 %) than with other methods (1.8 %) and had the largest variation (from −113.8 to 81.6 pmol m −2 s −1 ) and standard deviation (16.1 pmol m −2 s −1 ). This difference might become important at the annual scale, and as the most commonly used covariance maximization method does not produce a clear time lag distribution for DetLim lag or COS lag , we recommend using the CO2 lag for COS fluxes as in most ecosystems the CO 2 cross-covariance with w is more clear than the cross-covariance of COS and w signals.

High-frequency response correction
The mean COS cospectrum was close to the normal mean CO 2 cospectrum (compare Fig. 4a and c). The power spectrum of COS was dominated by noise as can be seen from the increasing power spectrum with increasing frequency for normalized frequencies greater than 0.2, which is similar to what was measured for COS by Gerdel et al. (2017) and for N 2 O by Eugster et al. (2007). The fact that COS measurements are dominated by noise at high frequencies means that those measurements are limited by precision and that they likely do not capture the true variability in COS turbulence signals. This is less of a problem for CO 2 , where white noise only starts to dominate at higher frequencies (normalized frequency higher than 3). Cospectral attenuation was found for both COS and CO 2 at high frequency.
High-frequency losses due to, for example, attenuation in sampling tubes and limited sensor response times are expected to decrease fluxes if not corrected for (Aubinet et al., Table 2. Median COS fluxes ( pmol m −2 s −1 ) and CO 2 fluxes ( µmol m −2 s −1 ) during 26 June to 2 November 2015 and their difference to the reference fluxes when using different processing options. Median reference fluxes are −11.3 pmol m −2 s −1 and −0.56 µmol m −2 s −1 for COS and CO 2 , respectively, and median daytime (nighttime) fluxes are −16.8 (−4.1) pmol m −2 s −1 and −4.58 (1.23) µmol m −2 s −1 when using linear detrending, CO 2 time lag, and experimental high-frequency spectral correction. NA denotes that data are not available.  2012). The median COS flux, when using the CO 2 time lag and keeping low-frequency correction and quality filtering the same, was the least negative without any high-frequency correction (−9.7 pmol m −2 s −1 ), most negative with the experimental correction (−11.3 pmol m −2 s −1 ), and in between with the analytical correction (−11.0 pmol m −2 s −1 ; Fig. S6). Correcting for the high-frequency attenuation thus made a maximum of 14.2 % difference in the median COS flux. In addition, daytime median flux magnitudes increased from −15.0 to −16.9 pmol m −2 s −1 with the analytical correction and to −16.8 pmol m −2 s −1 with the experimental high-frequency correction. However, the relative difference was larger during nighttime, when flux magnitudes increased from −3.4 to −4.0 and −4.1 pmol m −2 s −1 with analytical and experimental methods, respectively. Similar results were found for the CO 2 flux, but the differences were smaller: without any high-frequency correction the median flux was the least negative at −0.48 µmol m −2 s −1 , most negative with experimental correction (−0.56 µmol m −2 s −1 ), and in between with the analytical correction (−0.54 µmol m −2 s −1 ), thus making a maximum of 14.3 % difference in the median CO 2 flux, similar to COS. Very similar results were found for CH 4 and CO 2 fluxes in Mammarella et al. (2016), where the highfrequency correction made the largest difference in the final flux processing for closed-path analyzers. Similar to COS, CO 2 flux magnitudes were also increased more during nighttime due to spectral correction than during daytime. Daytime median flux magnitude increased from −4.18 to −4.77 and −4.58 µmol m −2 s −1 , and nighttime fluxes increased from 1.02 to 1.18 and 1.23 µmol m −2 s −1 when using analytical and experimental high-frequency spectral corrections, respectively. Flux attenuation was dependent on stability and wind speed for both correction methods, as also found in Mammarella et al. (2009; Fig. S7).
The site-specific model captures the cospectrum better than the model cospectrum by Horst (1997), as shown in Fig. 4. For high-frequency spectral corrections, it is recommended to use the site-specific cospectral model, as has been done in most previous COS studies (Table 1).

Storage change fluxes
In the following, storage change fluxes based on profile measurements are listed as default, with fluxes based on the constant profile assumption listed in brackets.
The COS storage change flux was negative from 15:00 until 06:00, with a minimum of −1.0 pmol m −2 s −1 (−0.6 pmol m −2 s −1 ) reached at 20:00. A negative storage change flux of COS indicates that there is a COS sink in the ecosystem when the boundary layer and effective mixing layer is shallow. Neglecting this effect would lead to overestimated uptake at the ecosystem level later when the air at the EC sampling height is better mixed. The COS storage change flux was positive from around 06:00 until 15:00 and peaked at 09:00 with a magnitude of 1.9 pmol m −2 s −1 (0.8 pmol m −2 s −1 ). The storage change flux made the highest relative contribution to the sum of measured EC and storage change fluxes at midnight with 18 % (13 %; Fig. 5c). The difference between the two methods was a minimum of 13 % at 11:00 and a maximum of 56 % at 09:00. The two methods made a maximum of 7 % difference in the resulting cumulative ecosystem flux, as already reported in Kooijmans et al. (2017). . Cospectrum and power spectrum for COS (panels a and b, respectively) and CO 2 (panels c and d, respectively) in July 2015. All data were filtered by the stability condition −2 < ζ < −0.0625, and COS data were only accepted when the covariance was higher than 3 times the random error due to instrument noise (Eq. 1). The cospectrum models by the experimental method and Horst (1997) that were used in the high-frequency spectral correction are shown in continuous and dashed gray lines, respectively.
The CO 2 storage change flux was positive from 15:00 until around 04:00, with a maximum value of 0.62 µmol m −2 s −1 (0.38 µmol m −2 s −1 ) reached at 21:00. A positive storage change flux indicates that the ecosystem contains a source of carbon when the boundary layer is less turbulent and accumulates the respired CO 2 within the canopy. As turbulence would increase later in the morning, the accumulated CO 2 would result in an additional flux that could mask the gas exchange processes occurring at that time step. The CO 2 storage change flux minimum was reached with both methods at 08:00, with a magnitude of −1.01 µmol m −2 s −1 (−0.52 µmol m −2 s −1 ) when the boundary layer had already started expanding, and leaves are assimilating CO 2 . The maximum contribution of the storage change flux was as high as 89 % (36 %) compared to the EC flux at 18:00, when the CO 2 exchange turned to respiration, and storage change flux increased its relative importance (Fig. 5d). The difference between the two storage change flux methods for CO 2 was a maximum of 53 % at 21:00 and a minimum of 13 % at midnight. The maximum difference of 5 % was found in the cumulative ecosystem CO 2 flux when the different methods were used.
In conclusion, the storage change fluxes are not relevant for budget calculations, as expected, and have not been widely applied in previous COS studies (Table 1) even though storage change flux measurements are mandatory in places where the EC system is placed at a height of 4 m or above according to the ICOS protocol for EC flux measurements . In addition, storage change fluxes are important at the diurnal scale to account for the delayed capture of fluxes by the EC system under low-turbulence conditions.

u * filtering
Calm and low-turbulence conditions are especially common during nights with stable atmospheric stratification. In this case, storage change and advective fluxes have an important role, and the measured EC flux of a gas does not reflect the atmosphere-biosphere exchange, typically underestimating the exchange. This often leads to a systematic bias in the annual flux budgets (Moncrieff et al., 1996;Aubinet et al., 2000;Aubinet, 2008). Even after studies of horizontal and vertical advection, u * filtering still keeps its place as the most efficient and reliable tool to filter out data that are not representative of the surface-atmosphere exchange under low turbulence .
For COS, nighttime filtering is a more complex issue than it is for CO 2 . In contrast to CO 2 , COS is taken up by the ecosystem during nighttime (Kooijmans et al., 2017(Kooijmans et al., , 2019 depending on stomatal conductance and the concentration gradient between the leaf and the atmosphere. When the atmospheric COS mixing ratio decreases under low-turbulence conditions (due to nighttime COS uptake in the ecosystem), the concentration gradient between the leaf and the atmosphere goes down such that a decrease in COS uptake can be expected (Kooijmans et al., 2017). Thus, the assumption that fluxes do not go down under low-turbulence conditions, as is the case for respiration of CO 2 , does not necessarily apply to COS uptake. Gap-filling the u * -filtered COS fluxes may therefore create a bias due to false assumptions if the gap filling is only based on data from periods with high turbulence. However, as we did not see u * dependency disappearing even with a concentration-gradient-normalized flux, the u * filtering and proceeding gap filling were applied here as usual (Papale et al., 2006) to overcome the EC measurement limitations under low-turbulence conditions. We determined u * limits of 0.23 m s −1 for COS and 0.22 m s −1 for CO 2 (Fig. 6). Filtering according to these u * values would remove 12 % and 11 % of data, respectively. If the storage change flux was excluded when determining the u * threshold, the limits were 0.39 and 0.24 m s −1 from CO 2 and COS fluxes, respectively. The increase in the u * threshold with CO 2 is because the fractional storage change flux is larger for CO 2 than for COS ( Fig. 5c and f). On the other hand, the u * limit for COS stayed similar to the previous one. With these u * thresholds the filtering would exclude 30 % and 13 % of the data for COS and CO 2 respectively.
If fluxes are not corrected for storage before deriving the u * threshold, there is a risk of flux overestimation due to double accounting. The flux data filtered for low turbulence would be gap-filled, thereby accounting for storage by the canopy, but then accounted for again when the storage is released and measured by the EC system during the flushing hours in the morning (Papale et al., 2006). Thus, it is necessary to make the storage change flux correction before deriving u * thresholds and applying the filtering.

Gap filling
Three combinations of environmental variables (PAR, PAR and RH, and PAR and VPD) were tested using the gap-filling function (Eq. 18). These environmental parameters were chosen because COS exchange has been found to depend on stomatal conductance, which in turn depends especially on radiation and humidity (Kooijmans et al., 2019). Development of the gap-filling parameters a, b, c, and d over the measurement period is presented in Fig. S10. While the saturating function of PAR alone already captured the diurnal variation relatively well, adding a linear dependency on VPD or RH made the diurnal pattern even closer to the measured one, although some deviation is still observed, especially in the early morning (Fig. 7). Therefore, the combination of saturating light response curve and linear VPD dependency was chosen. Furthermore, we chose a linear VPD dependency instead of a linear RH dependency due to smaller residuals in the former (Fig. S9). The mean residual of the chosen model was −0.54 pmol m −2 s −1 , and the root mean square error (RMSE) was 18.7 pmol m −2 s −1 , while the saturating For COS fluxes, 44 % of daytime flux measurements were discarded due to the above-mentioned quality criteria (Sect. 2.4.4) and low-turbulence filtering. As expected, more data (66 %) were discarded during nighttime. Altogether, 52 % of all COS flux data were discarded and gapfilled with the gap-filling function presented in Eq. (18). The average of the corrected and gap-filled COS fluxes during the whole measurement period was −12.3 pmol m −2 s −1 , while without gap filling the mean flux was 14 % more negative, −14.3 pmol m −2 s −1 . This indicates that most gap filling was done for the nighttime data, when COS fluxes are less negative than during daytime.
For CO 2 , 41 % of daytime CO 2 fluxes were discarded, while 67 % of fluxes were discarded during nighttime, altogether comprising 53 % of all CO 2 flux data. CO 2 fluxes were gap-filled according to standard gap-filling procedures presented in Reichstein et al. (2005). The average CO 2 flux after all corrections and gap filling was −2.14 µmol m −2 s −1 , while without gap filling the mean flux was 39 % more negative, −3.53 µmol m −2 s −1 . Similar to COS, CO 2 fluxes are also mostly gap-filled during nighttime. As nighttime CO 2 fluxes are positive, gap-filled fluxes include more positive values than non-gap-filled fluxes, thus making the mean flux less negative.
Although the COS community has not been interested in the cumulative COS fluxes or yearly COS budget so far, it is important to fill short-term gaps in COS flux data to properly capture the diurnal variation, for example. The gap-filling method presented here is one option to be tested at other measurement sites as well.

Errors and uncertainties
The uncertainty due to the chosen processing scheme was determined from a combination of eight different processing schemes, as described in Sect. 2.4.6, that were equally reliable but caused the most variation in the final COS flux (Table 2). This processing uncertainty contributed 36 % to the total uncertainty of the 30 min COS flux, while the rest was composed of the random flux uncertainty (Fig. 8). For the CO 2 flux uncertainty, the processing was more important than for COS (48 %), but the random uncertainty still dominated the combined flux uncertainty. The random error of the CO 2 flux was found to be lower than in Rannik et al. (2016) for the same site, probably related to differences in the gas analyzers and overall setup. The mean noise estimated from Lenschow et al. (2000) was 0.06 µmol m −2 s −1 for our QCLS CO 2 fluxes, while in Rannik et al. (2016) it was approximately 0.08 µmol m −2 s −1 for LI-6262 CO 2 fluxes at the same site. Gerdel et al. (2017) found the total random uncertainty of COS fluxes to be mostly around 3-8 pmol m −2 s −1 , comparable to our results.
The relative flux uncertainty for COS was very high at low flux (−3 pmol m −2 s −1 < F COS < 3 pmol m −2 s −1 ) values (8 times the actual flux value) but leveled off to 45 % at fluxes higher (meaning more negative fluxes) than −27 pmol m −2 s −1 (Fig. 8c). The total uncertainty of the CO 2 flux was also high at low fluxes (−1.5 µmol m −2 s −1 < F CO 2 < 1 µmol m −2 s −1 , uncertainty reaching 130 % of the flux at 0.17 µmol m −2 s −1 ) and decreased to 15 % at fluxes more negative than −11 µmol m −2 s −1 (Fig. 8d). Higher relative uncertainty at low flux levels is probably due to the detection limit of the measurement system.
The median relative random uncertainty of COS flux decreased from 0.35 for single 30 min flux to 0.013 for monthly averaged flux (Fig. 8e). The processing uncertainty had a less prominent decrease, from 0.15 for 30 min fluxes to 0.05 for monthly fluxes. The strongest decrease in processing uncertainty was when moving from single 30 min flux values to daily average fluxes, after which the averaging period did not affect the processing uncertainty. This is probably due to the large scatter between the two detrending methods, which levels off when averaging over several flux values (Fig. S11a). Relative random uncertainty of CO 2 flux decreased from 0.11 for 30 min fluxes to 0.021 for monthly fluxes. The processing uncertainty, however, did not change significantly between the different averaging periods, as would be expected. Figure 8. Uncertainty of (a) COS and (b) CO 2 fluxes, binned into 15 equal-sized bins that represent median values. Error bars show the 25th and 75th percentiles. Total uncertainty is represented as the 95 % confidence interval (1.96 comb ). Panels (c) and (d) represent the relative uncertainty, i.e., the uncertainty divided by the flux for COS and CO 2 , respectively. Panels (e) and (f) represent the relative uncertainties of time-averaged COS and CO 2 fluxes, respectively.

Conclusions
In this study, we examined the effects of different processing steps on COS EC fluxes and compared them to CO 2 flux processing. COS fluxes were calculated with five time lag determination methods, three detrending methods, two highfrequency spectral correction methods, and with no spectral corrections. We calculated the storage change fluxes of COS and CO 2 from two different concentration profiles and investigated the diurnal variation in the storage change fluxes. We applied u * filtering and introduced a gap-filling method for COS fluxes. We also quantified the uncertainties of COS and CO 2 fluxes.
The largest differences in the final fluxes came from time lag determination and detrending. Different time lag methods made a difference of a maximum of 15.9 % in the median COS flux. Different detrending methods, on the other hand, made a maximum of 6.2 % difference in the median COS flux, while it was more important for CO 2 (10.7 % difference between linear detrending and block averaging). Omitting high-frequency spectral correction resulted in a 14.2 % lower median flux, while different methods used in high-frequency spectral correction resulted in only 2.7 % difference in the final median fluxes. We suggest using CO 2 time lag for COS flux calculation so that potential biases due to a low signal-tonoise ratio of COS mixing ratio measurements can be eliminated. The CO 2 mixing ratio is measured simultaneously with the COS mixing ratio with the Aerodyne QCLS, and in most cases it has a higher signal-to-noise ratio and more clear cross-covariance with w than COS. Experimental highfrequency correction is recommended for accurately correcting for site-specific spectral losses. We recommend comparing the effect of different detrending methods on the final flux for each site separately to determine the site-and instrumentspecific trends in the raw data.
Flux uncertainties of COS and CO 2 followed a similar trend of higher relative uncertainty at low flux values and random flux uncertainty dominating over uncertainty related to processing in the total flux uncertainty. The relative uncertainty was more than 5 times higher for COS than for CO 2 at low flux values (absolute COS flux of less than 3 pmol m −2 s −1 ), while at higher fluxes they were more similar. When averaging fluxes over time, the relative random uncertainty decreased with increasing averaging period for both COS and CO 2 . Relative processing uncertainty decreased from single 30 min COS fluxes to daily averages but remained at the same level for longer averaging periods.
We emphasize the importance of time lag method selection for small fluxes, whose uncertainty may exceed the flux itself, to avoid systematic biases. COS EC flux processing follows similar steps as other fluxes with low signalto-noise ratios, such as CH 4 and N 2 O, but as there are no sudden bursts of COS expected, and its diurnal behavior is close to CO 2 , some processing steps are more similar to CO 2 flux processing. In particular, time lag determination and high-frequency spectral corrections should follow the protocol of low signal-to-noise ratio fluxes (Nemitz et al., 2018), while quality assurance and quality control, despiking, u * filtering, and storage change correction should follow the protocol produced for CO 2 flux measurements . Our recommendation for time lag determination (CO 2 cross-covariance) differs from the most commonly used method so far (COS cross-covariance), while experimental high-frequency spectral correction has already been widely applied before (Table 1). Many earlier studies have neglected the storage change flux, but we emphasize its importance in the diurnal variation in COS exchange. In addition, we encourage implementing gap filling in future COS flux calculations for eliminating short-term gaps in data.
Data availability. The flux data used in this study are available in Kohonen (2020; https://doi.org/10.5281/zenodo.3907342). Environmental data are available on the AVAA open research data publishing platform (https://avaa.tdata.fi/web/smart/smear/download, last access: 17 March 2020; Junninen et al. , 2009). The metadata of the environmental observations are available via the ETSIN service. Raw data and flux data processed with different schemes other than the standard scheme are available upon request from the first author.
Author contributions. K-MK and IM designed the study, and K-MK processed and analyzed the data. PK developed the gap-filling function for COS and participated in data processing. IM, LMJK, US, WS, and HC participated in field measurements. K-MK and IM wrote the manuscript, with major participation in editing by WS and LMJK and contributions from all coauthors.