Performance of open-path GasFinder3 devices for CH4 concentration measurements close to ambient levels

Open-path measurements of methane (CH4) with GasFinder systems (Boreal Laser Inc, Edmonton, Canada) have frequently been used for emission estimation with the inverse dispersion method (IDM), particularly from agricultural sources. In many IDM applications, the concentration enhancement related to CH4 sources is small, typically between 0.05 and 0.5 ppm, so accurate measurements of CH4 concentrations are needed close to ambient levels. The GasFinder3-OP (GF3) device for open-path CH4 measurements is the latest version of the commercial GasFinder systems by Boreal Laser Inc. We investigated the uncertainty of six GF3 devices from side-by-side intercomparison measurements and comparisons to a closed-path quantum cascade laser device. The comparisons were made at near-ambient levels of CH4 (85 % of measurements below 2.5 ppm) with occasional phases of elevated concentrations (max. 8.3 ppm). Relative biases as high as 8.3 % were found, and a precision for half-hourly data between 2.1 and 10.6 ppm-m (half width of the 95 % confidence interval) was estimated. These results deviate from the respective manufacturer specifications of 2 % and 0.5 ppm-m. Intercalibration of the GF3 devices by linear regression to remove measurement bias was shown to be of limited value due to drifts and step changes in the recorded GF3 concentrations.


Introduction
The experimental determination of methane (CH4) emission rates from agricultural sources is a key element for emission inventories and for the development of mitigation strategies. A large diversity of approaches to derive emission rates from measurements is available. Micrometeorological methods can broadly be divided into flux-based and concentration-based approaches. The latter combine measurements of the concentration enhancement downwind of or above the source with modeling of the dispersion of the gas released by the source. One frequently applied concentration-based approach is the inverse dispersion method (IDM; Flesch et al., 2005) where, generally, two concentration measurements are used in parallel, placed up- and downwind of the source under investigation. It is common to many IDM applications that the concentration enhancement related to CH4 sources is small, typically between 0.05 and 0.5 ppm.
In recent years, optical open-path instruments have become commercially available that determine the path-integrated CH4 concentration over measurement path lengths of up to several hundred meters. For the IDM, path-integrated concentration measurements are preferable to point measurements, since they capture a larger fraction of the emission-related plume and, therefore, are less sensitive to variation and uncertainty in the measured wind direction.
On the other hand, it is more difficult to assess and control the quality of measurements by open-path gas analyzers in comparison to closed-path instruments. The latter can be checked or recalibrated periodically during a field campaign using common cylinder standards (also for multiple spatially separated instruments). This is usually not possible for open-path devices with longer measurement paths. The use of cylinder standard gases is feasible for very short path lengths (few meters), but the corresponding calibration may not be representative for other setups with longer path lengths (DeBruyn et al., 2020). Therefore, the quality of open-path measurements in the field with path lengths of 10 to 100 m (or longer) needs to be tested in other ways using instrument internal quality indicators, plausibility checks and intercomparisons of two or more instruments.
In this paper, we focus on the GasFinder3-OP (GF3) system for CH4 measurements (Boreal Laser Inc, Edmonton, Canada; "Lo-Range" methane variant, i.e., detection range between 2 and 8500 ppm-m). This open-path system has a very user-friendly design and is in the lower cost range of available instruments. It is an improved version of the GasFinder2 system, which has been frequently used to measure emission rates with the IDM (e.g., Flesch et al., 2007; Harper et al., 2010; McGinn et al., 2019; VanderZaag et al., 2014). The aim of this study is to characterize the stability and accuracy of the GF3 instruments for CH4 measurements close to ambient levels. We present an overview of several field campaigns including (i) intercomparisons between GF3 devices and a fast-response quantum cascade laser spectrometer (QCL) considered to be a state-of-the-art reference and (ii) direct intercomparisons between various GF3 instruments. These campaigns served to generate a basis for correcting the measurement data of individual GF3 instruments placed up- and downwind of emitting sources that induced a low concentration enhancement, a situation in which instrument stability and accuracy are particularly important. This article is written from the point of view of a GF3 instrument's end user.

GasFinder3-OP instrument
The GF3 instrument from Boreal Laser Inc. is an open-path instrument with a tunable laser diode emitting in the infrared centered around 1654 nm, where CH4 shows a distinct absorption line. The measurement output of the GF3 is provided as path-integrated concentration C_PI in units of parts per million meter (denoted ppm-m) that reflects the concentration integrated over the one-way path length (distance between laser source and reflector). The output data in units of ppm-m were converted to the path-averaged concentration C in units of parts per million (i.e., divided by the one-way path length) and corrected with temperature and pressure correction functions provided by the manufacturer. Six different open-path GF3 devices were used in this study (Table 1). The two devices OP-Ext and OP-1, as well as OP-3 and OP-5, had identical pressure and temperature correction functions.
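The conversion from the path-integrated output to a path-averaged concentration can be illustrated with a minimal Python sketch. The function name and the example numbers are hypothetical, and the manufacturer's temperature and pressure correction functions are deliberately left out because they are not given in the text.

```python
def path_averaged_ppm(c_pi_ppm_m: float, path_length_m: float) -> float:
    """Convert a path-integrated concentration (ppm-m) to a path average (ppm).

    Only the division by the one-way path length is shown; the temperature
    and pressure corrections described in the text are not reproduced here.
    """
    if path_length_m <= 0:
        raise ValueError("one-way path length must be positive")
    return c_pi_ppm_m / path_length_m


# Hypothetical reading: 81.4 ppm-m over a 37 m one-way path.
print(round(path_averaged_ppm(81.4, 37.0), 2))
```

The same relation, read in the other direction, converts a concentration difference in ppm to ppm-m by multiplying with the one-way path length.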
The "Lo-Range" version of the GF3 for CH4 measures in the range of 2 to 8500 ppm-m with a sensitivity (precision) of 0.5 ppm-m at a sample rate of 1 to 1/3 Hz as stated by the manufacturer. The accuracy of the GF3 system is specified as 2 % of the reading (Boreal Laser Inc., 2018a) with a lower value for the "typical accuracy" of 0.5 % of the reading (Boreal Laser Inc., 2018b). Details on the instrument are given in DeBruyn et al. (2020).
Together with the concentration measurement, the supporting parameters "received power" (of the reflected incoming beam) and "R2" (the goodness of fit between the sample and the calibration waveform) are provided as standard outputs of the GF3 instruments. According to the manufacturer, a valid concentration measurement can be expected if the following constraints are met: received power is in the range of 50 to 3000 µW and R2 is above 0.85 (Boreal Laser Inc., 2018b). We decided to be stricter and kept data for further analysis only if the received power was in the range of 100 to 2500 µW (as suggested in Boreal Laser Inc., 2016) and R2 was equal to or greater than 0.98. The quality-assessed data were aggregated to 1 and 30 min average concentrations. Only averages resulting from a data coverage of 90 % or more of the respective time interval were retained for further evaluation.
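The filtering and averaging steps described above could be sketched as follows. This is only an illustration under stated assumptions: the record layout (time in seconds, concentration, received power, R2), the function names and the 1 Hz example data are hypothetical, and data coverage is interpreted here as the number of valid samples divided by the number of samples expected in the interval.

```python
from statistics import mean

POWER_MIN_UW, POWER_MAX_UW = 100.0, 2500.0  # received-power window from the text
R2_MIN = 0.98                                # waveform-fit threshold from the text
MIN_COVERAGE = 0.90                          # required data coverage per interval


def passes_quality(power_uw, r2):
    """True if a sample meets the stricter constraints used in this study."""
    return POWER_MIN_UW <= power_uw <= POWER_MAX_UW and r2 >= R2_MIN


def interval_averages(records, interval_s, expected_per_interval):
    """Average valid samples per interval and drop poorly covered intervals.

    records: iterable of (time_s, conc_ppm, power_uw, r2) tuples (assumed layout).
    Returns a dict mapping interval start time (s) to the mean concentration.
    """
    bins = {}
    for t, conc, power, r2 in records:
        if passes_quality(power, r2):
            bins.setdefault(int(t // interval_s), []).append(conc)
    return {
        idx * interval_s: mean(vals)
        for idx, vals in sorted(bins.items())
        if len(vals) / expected_per_interval >= MIN_COVERAGE
    }


# Hypothetical 1 Hz data: minute 0 is fully valid, minute 1 loses 10 samples
# to low received power and therefore falls below 90 % coverage.
records = [(t, 2.0, 500.0, 0.99) for t in range(60)]
records += [(60 + t, 2.1, 50.0 if t < 10 else 500.0, 0.99) for t in range(60)]
averages = interval_averages(records, interval_s=60, expected_per_interval=60)
print(averages)
```

Calling the same function with `interval_s=1800` and the matching expected sample count would give the 30 min averages.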

Intercomparison campaigns
In total, eight intercomparison campaigns were conducted at different sites in Switzerland with varying ranges of near-ambient concentrations of CH4 (Table 2). Two campaigns, P16 and P17, with a focus on the comparison between GF3 devices and a QCL (QC-TILDAS, Aerodyne Research Inc.) as a reference system, were conducted in Posieux (46°46′4.22″ N, 7°6′27.65″ E) close to an animal housing facility (approx. 100 m north). The QCL is a closed-path instrument with a 20 m inlet tube flushed by a vacuum pump at 13 sL min⁻¹. The sample air is analyzed in a multi-pass cell (0.5 L) with a fixed optical path length of 76 m. The cell is kept at constant temperature (294 K) and pressure (31 Torr). Due to the stabilized operation, the instrument exhibits a high precision (1 s) around 0.004 ppm or 0.2 % (Nelson et al., 2004; Wang et al., 2020).
Seven intercomparison campaigns including various GF3 instruments placed side by side were carried out at the following locations: A18 in Aadorf ( Table 2.

Table 1. [Table body not recoverable from the extraction. Columns: name used in this study, unit number, year of manufacture, and participation in the intercomparison campaigns P16, P17, A18, K19, I19, H19-1, H19-2 and H19-3.]
In the campaigns P16, P17 and A18, the seven-corner cube array type was used; in H19-1, H19-2, H19-3 and I19, the 12-corner cube array type was used; and in K19 both types were used. During side-by-side intercomparisons, the laser beams of the GF3 devices were always aligned in parallel with small lateral distances of 1 to 2 m. Instrument and laser beam heights were between 1.3 and 1.7 m above ground. For the comparison to the QCL measurements, the QCL inlet was located approx. 4 to 12 m from the center of the laser beams, 1.9 m above ground.
For the temperature and pressure correction of the GF3 instruments (Sect. 2.1) during the field campaigns, the temperature and pressure data from a close-by weather station were used. In A18, the weather station was situated 1.2 km away with a negligible difference in elevation of approx. 6 m. At all other sites, the weather station was within 100 m of the devices. All measurements were conducted continuously, i.e., during day and night, in regions characterized by agricultural activities related to livestock production.
Footnote 1. In 2016, when the first devices of GF3 (OP-1 to OP-3) were ordered, Boreal Laser Inc. recommended seven-corner cube array reflectors for path lengths up to 200 m. Meshes of different grid sizes could be installed in front of the corner cubes for path lengths that are shorter than the specified range. Prior to the second order in 2019 (devices OP-4 and OP-5), the recommendation was adapted to use the 12-corner cube array reflectors for path lengths up to 200 m.

Data evaluation
For a valid concentration comparison between the parallel instruments, the internal clocks of the individual devices were adjusted such that all concentration data were synchronous. This time synchronization was done by maximizing the covariance of the high-frequency concentration data in parts per million between the individual instruments. For each day, the data were resampled to 1 s resolution (i.e., repeating values where necessary), and the time shift with the highest covariance was determined. From these daily estimates of time shifts, a constant time lag was estimated and corrected for each device and each campaign individually. Time lags of around 2 to 5 s d⁻¹ between the devices were observed and corrected for.
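The search for the time shift with the highest covariance can be sketched in a few lines of Python. This is a simplified illustration, not the procedure actually implemented for the study: the function names, the search window and the synthetic series are assumptions, and the step from daily shifts to one time lag per device and campaign is not shown.

```python
from statistics import mean


def covariance_at_lag(ref, other, lag):
    """Sample covariance of ref[i] and other[i + lag] over the overlap."""
    pairs = [(ref[i], other[i + lag])
             for i in range(len(ref))
             if 0 <= i + lag < len(other)]
    if len(pairs) < 2:
        return float("-inf")
    mx = mean(p[0] for p in pairs)
    my = mean(p[1] for p in pairs)
    return sum((x - mx) * (y - my) for x, y in pairs) / (len(pairs) - 1)


def best_lag(ref, other, max_lag):
    """Shift (in samples) of `other` relative to `ref` that maximizes covariance.

    A positive result means that `other` lags behind `ref`.
    """
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: covariance_at_lag(ref, other, lag))


# Synthetic 1 s series: a flat background with three hypothetical peaks,
# and a second series that records the same signal 3 s later.
ref = [2.0] * 50
for i, bump in ((10, 0.5), (11, 0.3), (25, 0.8)):
    ref[i] += bump
other = [2.0] * 3 + ref[:-3]
print(best_lag(ref, other, max_lag=5))
```

In practice the search window would have to be wide enough to cover the accumulated clock drift of a full day.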
In two intercomparison campaigns (P16 and P17) four different GF3 devices (OP-Ext, OP-1, OP-2 and OP-3) were compared to the closed-path point measurements by the QCL instrument based on the 30 min averaged concentrations.
Based on the synchronized time series, the concentration difference ΔC between the parallel instruments was calculated for each averaging interval. The ΔC data partly showed significant deviations (asymmetry, outliers) from an ideal Gaussian distribution. Thus, for analyzing the difference between devices, the median of ΔC and the "median absolute deviation" (MAD) of ΔC over each campaign were determined for each pair of devices. The two quantities are robust estimates of the mean and variability of ΔC that are insensitive to outliers and do not rely on prescribed data distributions. For the ideal case of a Gaussian distribution, the MAD can be related to twice the standard deviation (comprising 95 % of the data) by multiplication with a factor of 2.9. The resulting value represents an estimate of the (random) precision of ΔC, whereas the median of ΔC represents the (systematic) bias between the two instruments. The estimates of bias and precision of ΔC can be partitioned equally between the concentrations of both intercompared devices by dividing by the square root of 2 (according to Gaussian error propagation). Thus, the relative bias and the precision of an individual GF3 device for a campaign period were estimated as
\text{relative bias} = \frac{1}{\sqrt{2}} \, \frac{\operatorname{median}(\Delta C)}{C_\text{avg}}
\text{precision} = \frac{1}{\sqrt{2}} \cdot 2.9 \cdot \operatorname{MAD}(\Delta C) \cdot l_\text{path}
where the relative bias was expressed relative to the concentration average of the two devices C_avg, and the precision was converted back to path-integrated concentrations C_PI using the one-way path length l_path of the GF3 device (in the case of the intercomparison of two GF3 devices, the path lengths were averaged).
In addition to the concentration differences, the parallel measurements were also analyzed concerning their linear relationship using Deming regression, which considers measurement errors from both instruments. The GF3 devices were analyzed with reference to OP-1.
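The robust bias and precision estimates described above can be written as a short Python sketch. It follows our reading of the text: the median and the unscaled MAD of the paired differences, the factor of 2.9, the division by the square root of 2 and the conversion to ppm-m with the one-way path length. The function names and the example values are hypothetical, and C_avg is taken here as the mean over both series, which is an assumption.

```python
from math import sqrt
from statistics import median

MAD_TO_95 = 2.9  # factor quoted in the text for a Gaussian distribution


def mad(values):
    """Unscaled median absolute deviation around the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def device_bias_and_precision(conc_a, conc_b, path_length_m):
    """Relative bias (fraction) and precision (ppm-m) attributed to one device.

    conc_a, conc_b: synchronized interval averages (ppm) of two instruments.
    path_length_m: one-way path length used to convert ppm to ppm-m.
    """
    if len(conc_a) != len(conc_b) or not conc_a:
        raise ValueError("series must be non-empty and of equal length")
    delta = [a - b for a, b in zip(conc_a, conc_b)]
    # Assumption: C_avg is the mean over both devices and all intervals.
    c_avg = (sum(conc_a) + sum(conc_b)) / (2 * len(delta))
    rel_bias = median(delta) / (sqrt(2) * c_avg)
    precision_ppm_m = MAD_TO_95 * mad(delta) * path_length_m / sqrt(2)
    return rel_bias, precision_ppm_m


# Hypothetical 30 min averages (ppm) of two side-by-side devices, 37 m path.
conc_a = [2.10, 2.05, 2.20, 2.00, 2.15]
conc_b = [2.00, 2.00, 2.05, 1.95, 2.05]
rel_bias, precision = device_bias_and_precision(conc_a, conc_b, 37.0)
print(f"relative bias: {100 * rel_bias:.1f} %, precision: {precision:.1f} ppm-m")
```

Because both statistics are based on medians, a few outlying intervals change the result far less than they would change a mean and a standard deviation.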
Coefficients from the linear regression and the predicted ΔC at OP-1 concentration levels of 2 and 4 ppm were reported for each device (OP-2, OP-3, OP-4 and OP-5) and campaign if the number of observations exceeded 20 and the concentration range was large enough (difference between the 0.025 and 0.975 quantiles greater than 0.4 ppm).

Results and discussion

Intercomparison between GF3 and QCL
During the two intercomparison campaigns P16 and P17, the magnitude and temporal course of the GF3 concentrations measured by the devices OP-Ext, OP-1, OP-2 and OP-3 compared well to the concentration measured by the QCL, specifically for high-frequency structures. Figure 1 shows 1.5 d of parallel QCL and OP-Ext measurement in campaign P16. However, when focusing on the lower end "baseline" concentrations near 2.2 ppm, the OP-Ext signal shows drifts and steps relative to the more stable QCL signal on the order of 0.2 ppm (shaded phases in Fig. 1). This corresponds to instrument-related changes in the path-integrated concentration of about 7.4 ppm-m (path length of 37 m).
At the 26 h timestamp, a drift occurred, dropping the concentration of OP-Ext from roughly 0.2 ppm above to roughly 0.1 ppm below the QCL concentration. There is no indication of a deterioration of the measurement quality of the GF3 values during this period. The received laser beam power was always above 100 µW, and the R2 value for the waveform fit was greater than 0.98 (Sect. 2.1). Further, there was no correlation of the drift with the local weather data (air temperature, wind direction, wind speed, relative humidity, etc.; data not shown). The same applies to step changes and drifts of GF3 devices, typically over several hours, during other phases of the intercomparison campaigns. In some cases, step changes in the concentration occurred when there was activity related to device handling during operation (such as downloading data, checking the reference cell state, etc.), as observed at hour 46 in Fig. 1. However, such device handling should not affect the measurements, and it remains unclear what exactly causes the signal changes. Since these drifts and step changes cannot be distinguished from real changes in the ambient concentration without the information from a further parallel measurement, they affect the uncertainty in the GF3 measurements.
Figure 2. Histograms of recorded 1 min average concentrations of GF3 devices OP-1, OP-2, OP-3, OP-4 and OP-5. A few values greater than 3.5 ppm are not shown. Blue: values > 1.88 ppm; red: values ≤ 1.88 ppm. Grey: data from device OP-4 during the campaign H19-3 that passed the quality check but have been omitted in the analysis due to an obvious jump in the concentration (Fig. 3).
Bias and precision of the GF3 devices (Sect. 2.3) were estimated and compared to the accuracy (2 % of reading) and sensitivity (0.5 ppm-m) specified in the GF3 operation manual. The magnitude of the relative bias of the GF3 is higher than the stated 2 %, with values ranging from −2.7 % to −8.3 % (Table 3). The C_PI precision of the GF3 devices was estimated at 2.1 to 10.6 ppm-m, which is between 4 and 21 times higher than the specified sensitivity of 0.5 ppm-m.

GF3 side-by-side intercomparisons
A cumulative dataset of 60 d in total with GF3 side-by-side measurements that passed the enhanced quality checks was produced within the seven intercomparison campaigns P17, A18, K19, I19, H19-1, H19-2 and H19-3. It contains the periods during which at least two devices were running in parallel, i.e., the reference device OP-1 and at least one further instrument (OP-2, OP-3, OP-4 or OP-5). Data from device OP-4 measured during the campaign H19-3 passed the quality check but have been omitted in the further analysis due to an obvious jump in concentration (Figs. 2 and 3). The overall average CH4 concentration was 2.1 ppm. The 1 min averages ranged between 1.3 and 40.3 ppm, with most of the data centered around 2.0 ppm.
Figure 3. CH4 concentrations recorded by OP-1 (30 min averages) and the corresponding differences to OP-2, OP-3, OP-4 and OP-5. Grey dots: data from device OP-4 during the campaign H19-3 that passed the quality check but have been omitted in the analysis due to an obvious jump in the concentration.
Extended periods of CH4 concentrations constantly below 1.88 ppm, the minimum of the monthly average background concentration in Switzerland since 2016 (BAFU, 2019), could be observed with devices OP-1, OP-2 and OP-3. Overall, the shares of measured CH4 concentrations (1 min averages) below 1.88 ppm ranged from 0 % (OP-5) and 13 % (OP-4) to 27 % (OP-2), 35 % (OP-3) and 41 % (OP-1), whereas values above 3.5 ppm rarely occurred: 1 % (OP-2), 2 % (OP-1) and 3 % (OP-3, OP-4 and OP-5). This agrees with the systematically lower concentrations measured with the GF3 devices compared to the measurement by the QCL device in the previous section. Figure 3 shows the 30 min averages of the recorded OP-1 concentration with the corresponding differences between the concentration measured by the individual devices and the OP-1 concentration. The differences are generally small, but larger deviations, as during the A18 campaign, occur. Table 4 provides statistics on the differences between the GF3 devices OP-2 to OP-5 and the reference device OP-1 regarding directly comparable 30 min concentration averages. The differences were determined in units of parts per million and transformed to ppm-m using the path length of the GF3 device that was compared to OP-1. The relative bias ranged from −1.7 % to 8.0 % and the precision of C_PI between 2.6 and 8.8 ppm-m, which lies within the range of the precision estimates in Sect. 3.1. A large offset in the concentration, reflected by the relative bias, could be observed for OP-4 and OP-5 compared to concentration measurements from OP-1 (on average > 0.15 ppm higher).
Devices OP-4 and OP-5 were acquired 2 years later than instruments OP-1 to OP-3, and this offset may be due to a difference in the internal calibration by the manufacturer between the instruments acquired in 2017 and the instruments acquired in 2019.
The devices OP-1 and OP-3 episodically showed dents in the concentration output that coincide with step decreases in the received power. Figure 4 shows an example of such a dent recorded by OP-1 with OP-3 measuring in parallel as a reference. The rapid loss of received power at 27.1 h after device start seems to have triggered a gradual loss of up to 0.15 ppm in the concentration of OP-1. A few minutes later a step change in the concentration by almost 0.2 ppm occurred, while the received power was still low. We attribute these concentration variations to an erroneous concentration determination by OP-1, as the OP-3 concentration remained constant at the ambient background value slightly above 1.8 ppm. This indicates that a constant threshold for the received power (50 or 100 µW) may not be sufficient for quality filtering. We noticed that the "optimal" threshold varied between individual instruments and campaigns, with threshold values ranging up to 400 µW.
Frequently, linear regression is used to correct for differences between instruments. There are two problems, however, that can occur with this correction method for GF3 devices in the case of CH4 concentration measurements close to ambient level. One problem arises if the dataset contains drifts and steps as shown in Figs. 1 and 4. Inspecting the A18 intercomparison between OP-1 and OP-2 more closely (intercept: −0.04, slope: 1.04), a period of approximately 5.4 continuous days is apparent (around intervals 550 to 750 in Fig. 3) where OP-2 (and OP-3) recorded systematically higher concentrations than OP-1. If we separate this "offset" period from the remaining part of the campaign (Fig. 5), we see that the regression results are systematically different. The offset period shows an intercept of 0.04 and a slope of 1.05, whereas we get an almost perfect 1 : 1 relationship for the remaining time (intercept: 0.01, slope: 1.00). Using the overall regression results for the entire period (Table 5) instead of two separate periods thus introduces a bias in the evaluation.
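The effect of pooling two periods with different instrument responses can be illustrated with a small Deming regression sketch in Python. It is not the code used for Table 5: the closed-form slope below assumes equal error variances in both instruments (delta = 1), which the text does not state, and the data are synthetic values constructed to mimic the two reported relationships (intercept 0.04 with slope 1.05, and intercept 0.01 with slope 1.00).

```python
from math import sqrt
from statistics import mean


def deming(x, y, delta=1.0):
    """Deming regression of y on x, returning (intercept, slope).

    delta is the ratio of the error variance in y to that in x;
    delta = 1 is an assumption made for this illustration only.
    """
    mx, my = mean(x), mean(y)
    n = len(x)
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return my - slope * mx, slope


# Synthetic reference concentrations (ppm) and two hypothetical responses.
x = [1.9, 2.0, 2.2, 2.5, 3.0]
y_offset = [0.04 + 1.05 * xi for xi in x]  # "offset" period
y_rest = [0.01 + 1.00 * xi for xi in x]    # remaining time

print("offset period:", deming(x, y_offset))
print("remaining time:", deming(x, y_rest))
print("pooled:", deming(x + x, y_offset + y_rest))
```

The pooled fit returns coefficients that lie between those of the two periods and therefore describe neither of them correctly.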
The second problem is the observed rather large variation in the intercalibration from one campaign to another (Table 5). Such a variation between different campaigns was also observed with GF3 devices for ammonia measurements by Baldé et al. (2019). Concentration response of the instru- calibrating OP-2 with OP-1 would provide significantly different 30 min concentration estimates at concentration levels of 2 and 4 ppm. Even though, in theory, an intercalibration of the devices after an IDM measurement campaign could solve the issue of differences in the measurements, the necessary change in the setup to perform such an intercalibration could lead to a change in the response of the devices, and the intercalibration would then be useless.

Conclusion
We found that the uncertainty in the measurements of several GasFinder3-OP instruments is higher than given in the specification provided by the manufacturer when measuring concentrations close to ambient levels. From on-site intercomparisons at various field sites (side-by-side intercomparisons and comparisons to a reference QCL instrument), we estimate a bias of up to 8.3 % of the reading and a precision between 2.1 and 10.6 ppm-m for our devices. The latter is 4 to 21 times higher than the sensitivity specified by the manufacturer. A large part of the inferior precision is attributed to low-frequency drifts, whereas high-frequency changes in the concentration are often well captured, as the similarity of the small features between hours 25 and 27 in Fig. 1 demonstrates. Drifts and step changes in the concentration of up to 0.3 ppm occur (Fig. 1). Most critical are changes in the concentration that can hardly be distinguished from fluctuations of the atmospheric concentrations. Some of the step changes coincide with activity related to the handling of the GF3 device (e.g., downloading data, checking time, checking reference cell quality). It remains unclear, though, which activity causes these step changes, since none of the activities consistently causes them. The internal calibrations of the GF3 seem to differ between devices. Devices OP-1, OP-2 and OP-3 show systematically lower concentration measurements than the devices OP-4 and OP-5. Applications with paired devices require an intercalibration of the devices. However, it remains unclear to what extent a side-by-side intercalibration can be transferred to the actual measurement setup, since relocation of the devices might cause systematic changes, as indicated by the different regression coefficients for different intercomparison campaigns.