Building the COllaborative Carbon Column Observing Network (COCCON): long-term stability and ensemble performance of the EM27/SUN Fourier transform spectrometer

In a 3.5-year long study, the long-term performance of a mobile, solar absorption Bruker EM27/SUN spectrometer, used for greenhouse gas observations, is checked with respect to a co-located reference Bruker IFS 125HR spectrometer, which is part of the Total Carbon Column Observing Network (TCCON). We find that the EM27/SUN is stable on timescales of several years; the drift per year between the EM27/SUN and the official TCCON product is 0.02 ppmv for XCO2 and 0.9 ppbv for XCH4, which is within the 1σ precision of the comparison, 0.6 ppmv for XCO2 and 4.3 ppbv for XCH4. The bias between the two data sets is 3.9 ppmv for XCO2 and 13.0 ppbv for XCH4. In order to avoid sensitivity-dependent artifacts, the EM27/SUN is also compared to a truncated IFS 125HR data set derived from full-resolution TCCON interferograms. The drift is 0.02 ppmv for XCO2 and 0.2 ppbv for XCH4 per year, with 1σ precisions of 0.4 ppmv for XCO2 and 1.4 ppbv for XCH4, respectively. The bias between the two data sets is 0.6 ppmv for XCO2 and 0.5 ppbv for XCH4. With the presented long-term stability, the EM27/SUN qualifies as a useful supplement to the existing TCCON network in remote areas. To achieve consistent performance, such an extension requires careful testing of any spectrometers involved by application of common quality assurance measures.


Introduction
Precise measurements of atmospheric abundances of greenhouse gases (GHGs), especially carbon dioxide (CO2) and methane (CH4), are of utmost importance for the estimation of emission strengths and flux changes (Olsen and Randerson, 2004). Furthermore, these measurements offer the prospect of being usable for the evaluation of emission reductions as specified by international treaties, e.g., the Paris COP21 agreement (https://unfccc.int/resource/docs/2015/cop21/eng/l09r01.pdf, last access: 4 March 2019). The Total Carbon Column Observing Network (TCCON) measures total columns of CO2 and CH4 with reference quality. TCCON achieves a calibration accuracy with a 1σ error of 0.2 ppmv for XCO2 and 2 ppbv for XCH4 and a total uncertainty budget of below 1 ppmv for XCO2 and below 5 ppbv for XCH4, respectively (Wunch et al., 2010). However, the instruments used by this network are rather expensive, need large infrastructure to be set up, and require expert maintenance that has to be performed on site. Therefore TCCON stations have sparse global coverage, especially in Africa, South America and large parts of Asia. Current satellites like the Orbiting Carbon Observatory-2 (OCO-2) (Frankenberg et al., 2015) and the Greenhouse Gases Observing Satellite (GOSAT) (Morino et al., 2011), on the other hand, offer global coverage. Nonetheless, they suffer from coarse temporal resolution (the repeat cycle of OCO-2 is 16 days), and in the case of GOSAT from sparse spatial sampling as well as limited precision of a single measurement. These limitations mostly inhibit a straightforward estimation of the emission strength of localized sources of CO2 and CH4, like cities, landfills, swamps or fracking and mining areas, from satellite observations. Recently OCO-2 data were used for estimating the source strength of power plants (Nassar et al., 2017) and urban emissions (Ye et al., 2017).
However, this can only be done for power plants and urban areas that lie directly under the OCO-2 overpass locations. TCCON stations are also the primary validation for OCO-2 (https://ocov2.jpl.nasa.gov/files/ocov2/OCO-2_SciValPlan_111005_ver1_0_revA_final_signed1.pdf; last access: 4 March 2019), and comparing the satellite observations with ground-based data at different locations is critical for the validation effort.
The previously described Bruker EM27/SUN portable FTIR spectrometer (Frey et al., 2015; Hedelius et al., 2016) is a promising instrument to overcome the above-mentioned shortcomings, as it is a mobile, reliable, easy-to-deploy and low-cost supplement to the Bruker IFS 125HR spectrometer used in the TCCON network. So far the EM27/SUN was mainly used in campaigns for the quantification of local sinks and sources (Chen et al., 2016). In this work the long-term performance of the EM27/SUN with respect to a reference high-resolution TCCON instrument is investigated. Additionally, the ensemble performance of several EM27/SUN spectrometers is tested. During 2014-2018, 30 EM27/SUN were tested at the Karlsruhe Institute of Technology (KIT) before being shipped to the customers. Several instruments that were distributed before this calibration routine at KIT was established were upgraded with a second channel for CO observations at Bruker Optics™ and were afterwards also checked at KIT. This results in a unique data set, as all EM27/SUN are directly compared to a reference EM27/SUN, continuously operated at KIT, as well as to a co-located TCCON instrument. From this data set an EM27/SUN network precision and accuracy can be estimated. The COllaborative Carbon Column Observing Network (COCCON) is intended to be a lasting framework for creating and maintaining a greenhouse gas-observing network based on common instrumental standards and data analysis procedures. Currently, about 18 working groups operating EM27/SUN spectrometers are contributing. We expect that COCCON will become an important supplement of TCCON, as the logistic requirements are low and the spectrometers are easy to operate. It will increase the global density of column-averaged greenhouse gas observations and, because the spectrometers are portable, will especially contribute to the quantification of local sources.
Atmos. Meas. Tech., 12, 1513-1530, 2019, www.atmos-meas-tech.net/12/1513/2019/

TCCON data set
As part of the TCCON, the Karlsruhe Institute of Technology (KIT) operates a high-resolution ground-based spectrometer at KIT, Campus North (CN) near Karlsruhe (49.100° N, 8.439° E, 112 m a.s.l.). Standard TCCON instruments have been described in great detail elsewhere (Washenfelder et al., 2006; Wunch et al., 2011). The Karlsruhe instrument, in the following called HR125, is the first demonstration of synchronized recordings of TCCON near-infrared (NIR) and NDACC mid-infrared (MIR) spectra using a dedicated dichroic beamsplitter (BS) arrangement (Optics Balzers Jena GmbH, Germany) with a cut-off wavenumber of 5250 cm⁻¹. It uses an InGaAs (indium-gallium-arsenide) detector in conjunction with an InSb (indium-antimonide) detector; details can be found in Kiel et al. (2016b). The TCCON measurements cover the relevant wavenumber region 4000-11000 cm⁻¹, corresponding to wavelengths λ between 0.9 and 2.5 µm, so that, among other species, O2, CO2, CH4, CO and H2O can be retrieved. A figure showing the spectral range of TCCON and the EM27/SUN can be found in Hedelius et al. (2016), Fig. 1. The TCCON measurements were chosen as reference measurements because these gases are also measured by the EM27/SUN spectrometer. For TCCON measurements in the NIR the HR125 records single-sided interferograms with a resolution of 0.014 cm⁻¹ (Δλ = 3.5 pm) or 0.0075 cm⁻¹ (Δλ = 1.9 pm), corresponding to a maximum optical path difference (MOPD) of 64 and 120 cm. The recording time for a typical measurement consisting of two forward and two backward scans is 212 and 388 s, respectively. The applied scanner velocity is 20 kHz. The TCCON site Karlsruhe participated in the Infrastructure for Measurement of the European Carbon Cycle (IMECC) aircraft campaign (Geibel et al., 2012). The spectrometer has been used for calibrating all gas cells used by TCCON for instrumental line shape (ILS) monitoring (Hase et al., 2013).
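The unit conversions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the common convention that the spectral resolution equals 0.9 divided by the MOPD, and assuming a reference wavenumber of 6300 cm⁻¹ for converting a resolution into a wavelength interval; neither assumption is stated explicitly in the text, and the function names are illustrative.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm1


def resolution_from_mopd(mopd_cm: float) -> float:
    """Spectral resolution in cm^-1, ASSUMING resolution = 0.9 / MOPD."""
    return 0.9 / mopd_cm


def wavelength_width_pm(resolution_cm1: float, wavenumber_cm1: float) -> float:
    """Wavelength interval in pm matching a wavenumber interval at wavenumber_cm1."""
    width_cm = resolution_cm1 / wavenumber_cm1 ** 2  # |d(lambda)| = d(nu) / nu^2
    return width_cm * 1.0e10  # 1 cm = 1e10 pm


if __name__ == "__main__":
    for nu in (4000.0, 11000.0):
        print(f"{nu:.0f} cm^-1 -> {wavenumber_to_wavelength_um(nu):.2f} um")
    for mopd in (64.0, 120.0):
        res = resolution_from_mopd(mopd)
        # 6300 cm^-1 is an assumed reference wavenumber, not taken from the text.
        width = wavelength_width_pm(res, 6300.0)
        print(f"MOPD {mopd:.0f} cm -> {res:.4f} cm^-1, about {width:.1f} pm")
```

Under these assumptions the two MOPD settings reproduce the quoted resolutions to within rounding.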
TCCON data processing is performed using the GGG Suite software package. In this study, the current release version, GGG 2014, is used. The software package includes a pre-processor correcting for solar brightness fluctuations (Keppel-Aleks et al., 2007) and performing a fast Fourier transform, including a phase error correction routine, to convert recorded interferograms into solar absorption spectra. Note that forward and backward scans are split by the preprocessing software and analyzed separately. The central part of the software package is the nonlinear least-squares retrieval algorithm GFIT. It performs a scaling retrieval with respect to an a priori profile, and then integrates the scaled profile over height to calculate the total column of the gas of interest. The software package additionally uses meteorological data from the National Centers for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) (Kalnay et al., 1996) and provides daily a priori gas profiles. TCCON converts the retrieved total column abundances $VC_\mathrm{gas}$ of the measured gases into column-averaged dry air mole fractions (DMFs), where the DMF of a gas is denoted as $X_\mathrm{gas} = \frac{VC_\mathrm{gas}}{VC_{\mathrm{O}_2}} \times 0.2095$. In this representation several errors that affect both the target gas and O2 cancel out. However, a residual bias with respect to in situ measurements still persists, as well as a residual spurious dependence of the retrieval results on the apparent airmass. Therefore the GGG suite also includes a postprocessing routine applying an empirical airmass-dependent correction factor (ADCF) and an airmass-independent correction factor (AICF). The AICF is deduced from comparisons with in situ instrumentation on aircraft (Wunch et al., 2010).
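As a small illustration of the DMF definition above, the following sketch computes X_gas from a pair of total columns and shows that an error scaling both columns equally drops out of the ratio. The column values are hypothetical and chosen only to give a plausible order of magnitude; the function name is not part of GGG or PROFFIT.

```python
O2_DRY_MOLE_FRACTION = 0.2095  # dry-air mole fraction of O2 used in the definition


def column_averaged_dmf(vc_gas: float, vc_o2: float) -> float:
    """Column-averaged dry-air mole fraction X_gas = VC_gas / VC_O2 * 0.2095."""
    if vc_o2 <= 0.0:
        raise ValueError("O2 column must be positive")
    return vc_gas / vc_o2 * O2_DRY_MOLE_FRACTION


if __name__ == "__main__":
    # Hypothetical total columns in molec. m^-2 (illustrative, not measured values).
    vc_co2, vc_o2 = 8.6e25, 4.5e28
    xco2 = column_averaged_dmf(vc_co2, vc_o2)
    print(f"XCO2 = {xco2 * 1e6:.2f} ppm")

    # An error that scales both columns by the same factor cancels in the ratio.
    scaled = column_averaged_dmf(0.99 * vc_co2, 0.99 * vc_o2)
    print(f"relative change after common 1 % error: {scaled / xco2 - 1:.1e}")
```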

HR125 low-resolution data set
In addition to the aforementioned TCCON data product, a second data product from the HR125 is used in this work, in the following called HR125 LR. For this product the raw interferograms are first truncated to the resolution of the EM27/SUN, 0.5 cm⁻¹. At 0.5 cm⁻¹ resolution, the ILS of the HR125 is expected to be nearly nominal. However, to avoid any systematic bias of the HR125 LR data with respect to the EM27/SUN results, the same procedure for ILS determination from H2O signatures in open path lab air spectra was applied and the resulting ILS parameters were adopted for the trace gas analysis. The analysis procedure will be explained in detail in Sect. 2.3; the retrieval software used for this data set is PROFFIT Version 9.6 (Hase et al., 2004). The reason for constructing this HR125 LR data set is that the analysis for the two instruments can then be performed in exactly the same way. The resolution is harmonized; the averaging kernels for a given airmass are nearly identical. Differences between the EM27/SUN and the HR125 LR data set can then be attributed to instrumental features alone and do not need to be disentangled from differences in retrieval software, resolution and airmass dependency. Note that for the low-resolution data set, forward and backward scans are averaged and then analyzed, whereas they are analyzed separately for the TCCON data set. Therefore the EM27/SUN data set has fewer coincident measurements with the HR125 LR data set than with the TCCON data set.
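The truncation step can be pictured as discarding all interferogram samples beyond the optical path difference that corresponds to the target resolution. The following is a minimal sketch of that idea only, not the processing chain used for HR125 LR: it assumes the interferogram is available as parallel lists of optical path difference and signal, and that 0.5 cm⁻¹ corresponds to an MOPD of 1.8 cm. The function name and the tolerance are illustrative.

```python
def truncate_interferogram(opd_cm, samples, mopd_cm, tol=1e-9):
    """Keep only the samples whose |OPD| does not exceed mopd_cm.

    opd_cm and samples are parallel sequences; tol guards against
    floating-point round-off at the cut-off position.
    """
    if len(opd_cm) != len(samples):
        raise ValueError("opd_cm and samples must have the same length")
    kept = [(x, y) for x, y in zip(opd_cm, samples) if abs(x) <= mopd_cm + tol]
    return [x for x, _ in kept], [y for _, y in kept]


if __name__ == "__main__":
    # Synthetic single-sided interferogram sampled every 0.1 cm up to 5 cm OPD.
    opd = [i / 10 for i in range(51)]
    signal = [1.0 / (1.0 + x) for x in opd]  # placeholder values, not real data
    opd_lr, signal_lr = truncate_interferogram(opd, signal, mopd_cm=1.8)
    print(len(opd), "->", len(opd_lr), "samples, last OPD", opd_lr[-1], "cm")
```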

EM27/SUN data set
The EM27/SUN spectrometer, which was developed by KIT in collaboration with Bruker Optics™, is utilized for the acquisition of solar spectra. The instrument has been described in great detail in Gisi et al. (2012); in the following a short overview is given. The central part of this Fourier transform spectrometer (FTS) is a RockSolid™ pendulum interferometer with two cube corner mirrors and a CaF2 beamsplitter. The EM27/SUN routinely records double-sided interferograms; the compensated BS design minimizes the curvature in the phase spectrum. This setup achieves high stability against thermal influences and vibrations. The retroreflectors are gimbal-mounted, which results in frictionless and wear-free movement. In this respect the EM27/SUN is more stable than the HR125 high-resolution FTS, which suffers from wear because of the use of friction bearings on the moving retroreflector. Over time this leads to shear misalignment and requires regular realignment (Hase, 2012). The gimbal-mounted retroreflectors move a geometrical distance of 0.45 cm, leading to an optical path difference of 1.8 cm, which corresponds to a spectral resolution of 0.5 cm⁻¹.
In a first pre-processing step, a solar brightness fluctuation correction is performed similarly to Keppel-Aleks et al. (2007). Furthermore, the recorded interferograms are Fourier transformed using the Norton-Beer medium apodization function (Davis et al., 2010). This apodization is useful for reducing sidelobes around the spectral lines, an undesired feature in low-resolution spectra, which would complicate the further analysis. A quality control, which filters interferograms with intensity fluctuations above 10 % and intensities below 10 % of the maximal signal range, is also applied.
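The quality control described above amounts to two threshold tests per interferogram. The following sketch is one possible reading, under stated assumptions: the intensity fluctuation is taken as the peak-to-peak variation of the DC signal relative to its mean, and the intensity is taken as the mean DC level relative to the maximal signal range. Both definitions, and all names, are hypothetical; the actual pre-processing may define these quantities differently.

```python
def passes_quality_control(dc_signal, signal_range,
                           max_fluctuation=0.10, min_fraction=0.10):
    """Return True if an interferogram passes both 10 % criteria.

    dc_signal: DC (slowly varying) interferogram level sampled during the scan.
    signal_range: maximal signal range of the detector chain, same units.
    The two definitions below are assumptions made for this sketch.
    """
    mean_level = sum(dc_signal) / len(dc_signal)
    if mean_level <= 0.0:
        return False
    fluctuation = (max(dc_signal) - min(dc_signal)) / mean_level
    fraction = mean_level / signal_range
    return fluctuation <= max_fluctuation and fraction >= min_fraction


if __name__ == "__main__":
    print(passes_quality_control([0.50, 0.51, 0.49], signal_range=1.0))  # steady sun
    print(passes_quality_control([0.50, 0.30, 0.50], signal_range=1.0))  # passing cloud
    print(passes_quality_control([0.05, 0.05, 0.05], signal_range=1.0))  # too dim
```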
In this work, spectra were analyzed utilizing PROFFIT Version 9.6, a nonlinear least-squares spectral fitting algorithm, which gives the user the opportunity to provide the measured ILS as an input parameter, an option chosen for this study (Hase et al., 2004). This code is in wide use and has been thoroughly tested in the past for the HR125 as well as the EM27/SUN, e.g., Schneider and Hase (2009), Sepúlveda et al. (2012), Kiel et al. (2016a), and Chen et al. (2016). Due to the low resolution of the EM27/SUN, the atmospheric spectra were fitted by scaling of a priori trace gas profiles, although PROFFIT has the ability to perform a full profile retrieval (Dohe, 2013). As the source of the a priori profiles, the TCCON daily profiles introduced in Sect. 2.1 are utilized to be consistent with the TCCON analysis. Also for the daily temperature and pressure profiles, the approach from TCCON was adopted, using NCEP model data together with on-site ground pressure data from a meteorological tall tower (http://www.imk.kit.edu/messmast/; last access: 4 March 2019).
For the evaluation of the O2 column the 7765-8005 cm⁻¹ spectral region is used, which is also applied in the TCCON analysis (Wunch et al., 2010). For CO2 we combine the two spectral windows used by TCCON into one larger window ranging from 6173 to 6390 cm⁻¹. CH4 is evaluated in the 5897-6145 cm⁻¹ spectral domain. For H2O the 8353-8463 cm⁻¹ region is used. This differs from TCCON, which deploys several narrow spectral windows, a strategy which is more in line with high-resolution spectral observations. For consistency reasons, and to reference the results to the WMO scale, the EM27/SUN retrieval also performs a postprocessing. The AICFs from TCCON are adopted, and similarly to Wunch et al. (2010), an airmass dependency correction is performed, although other numerical values for the correction parameters are used. Details can be found in Frey et al. (2015) and Klappenbach et al. (2015).
3 Long-term performance

ILS analysis
Accurate knowledge of the real ILS of a spectrometer is extremely important because errors in the ILS lead to systematic errors in the trace gas retrieval. For this reason regular ILS measurements were performed from the beginning of this study 4 years ago to detect possible misalignments and alignment drifts. The source of a de-adjustment is mostly mechanical shock, e.g., impacts or vibrations, especially during transportation of the instruments. For the analysis of the measured data, version 14.5 of the retrieval software LINEFIT (Hase et al., 1999; Hase, 2012) is used. Because the EM27/SUN is equipped with a circular field stop aperture, the ILS is nearly nominal. Therefore, to keep the treatment concise, we use the simple two-parameter ILS model offered by LINEFIT. A detailed description of the ILS analysis is given in Frey et al. (2015). The time series of the ILS measurements is shown in Fig. 1; the modulation efficiency (ME) at maximum optical path difference (MOPD) ranges between 0.9835 and 0.9896, with a mean value of 0.9862 and a standard deviation of 0.0015. The phase error is close to zero for the whole time series, with a mean value of 0.0019 ± 0.0018. This modulation efficiency is significantly different from nominal, which is surprising, as great care was taken to align the instrument. Therefore open path measurements were also performed for the HR125 at a resolution of 0.5 cm⁻¹ to investigate whether this method shows a bias. For this small optical path difference, the alignment of the HR125 should be very close to nominal. However, the LINEFIT analysis shows an ME of 0.9824 at MOPD. From this result it is concluded that this method shows an overall low bias of around 1.5 %-2 %, probably due to a slight underestimate of the pressure-broadening parameters of H2O in the selected spectral region.
There is no overall trend apparent in the time series; the remaining differences in the modulation efficiency are probably due to the remaining uncertainty of the measurement technique. As is indicated by the more frequent measurements in 2017, there is also no seasonality in the results of the open path measurements. It should be noted that the measurement routine was refined in the course of this work. In particular, in the beginning (2014) it was assumed that the inside of the EM27/SUN is free of water vapor, so the instrument was not vented during the lamp measurements. However, sensitivity studies as presented in Frey et al. (2015) revealed that the influence of the water vapor column inside the spectrometer cannot always be neglected. After this discovery the instrument was vented during the open path measurements. This is why the 2014 calculations show larger scatter, as here the amount of water vapor inside the spectrometer is not known. For this analysis it was assumed that, also for the 2014 measurements, the total pressure inside the spectrometer is the same as that of the surrounding air, which is a sensible assumption as the spectrometer is not evacuated. This also explains why the deviations become smaller in 2017. A further test to verify the stability of the instrument is the Xair parameter, which is the surface pressure divided by the measured column of air. This test will be shown in Sect. 3.3. The grey lines in Fig. 1 denote transportation of the spectrometer over longer distances for field campaigns in Berlin (northeastern Germany), Oldenburg (northern Germany) and Paris (France) and for maintenance at Bruker Optics. Note that no realignment of the interferometer was performed during this maintenance. Only the reference HeNe laser was exchanged due to sampling instabilities during interferogram recordings. More specifically, the laser wavelength was unstable, resulting in a corruption of parts of the measured spectra.
Later in 2016 and 2017 this instrument was not used for campaigns since it has been chosen as the reference EM27/SUN for comparison measurements next to the HR125 spectrometer in order to take measurements at Karlsruhe as continuously as possible. The instrument was not realigned during the whole comparison study.
An error estimation for the open path measurements is given in Table 1. For the temperature and pressure error, the accuracies stated by the data logger manufacturer were used. For the other potential error sources reasonable estimates were made. The total error, given by the root-sum-square of the individual errors, is 0.29 % in ME amplitude, consisting of several errors of approximately the same magnitude.
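Combining independent error contributions as the square root of the sum of their squares can be sketched as follows. The individual values below are placeholders chosen only so that equal contributions add up to the quoted 0.29 %; they are not the entries of Table 1, and the number of contributions is likewise an assumption.

```python
import math


def root_sum_square(errors):
    """Combine independent error contributions in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


if __name__ == "__main__":
    # Placeholder contributions in % of ME amplitude (NOT the values of Table 1).
    contributions = [0.145, 0.145, 0.145, 0.145]
    print(f"total error: {root_sum_square(contributions):.2f} %")
```

With contributions of similar size, the total grows only with the square root of their number, which is why no single term dominates such a budget.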

Total column time series
In this section the total column measurements from the EM27/SUN are compared to the reference HR125 spectrometer. For the measurements, the EM27/SUN was moved to a terrace on the top floor of the IMK-ASF, building 435 KIT CN (49.094° N, 8.436° E; 133 m a.s.l.) on a daily basis if weather conditions were favorable. The spectrometer was moved from the lab on the fourth floor to the roof terrace on the seventh floor, thus being exposed to mechanical stress. The instrument was coarsely oriented north, without effort for levelling. If further orientation was needed, the spectrometer was manually rotated so that the solar beam was centered onto the entrance window. The CamTracker program was then able to track the sun. The spectrometer was operated at ambient temperatures. During summer, the spectrometer heated up to temperatures above 40 °C. In order to protect the electronics from the heat, a sun cover for the EM27/SUN was built, which reduced the temperatures inside the spectrometer by about 10 °C. In winter the temperatures were as low as −4 °C at the start of measurements. Double-sided interferograms with 0.5 cm⁻¹ resolution were recorded. With 10 scans and a scanner velocity of 10 kHz, one measurement takes about 58 s. For precise time recording, a GPS receiver was used.
The full time series from March 2014 to November 2017 is shown in Fig. 2 for the three data sets. For better visibility only coincident data points measured within 1 min between EM27/SUN and the other data sets are shown. There are 8349 paired measurements between EM27/SUN and TCCON and 4624 between EM27/SUN and HR125 LR; in total there are 50 550 EM27/SUN and 25 361 TCCON measurements.
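Selecting coincident data points within 1 min can be done with a nearest-neighbour search on sorted timestamps. The following is a minimal sketch of one such pairing rule, assuming timestamps in seconds and a reference series that is already sorted; the exact coincidence criterion used for the figures is not specified beyond the 1 min window, and in this sketch one reference point may be paired with several EM27/SUN points.

```python
from bisect import bisect_left


def pair_coincident(times_a, times_b_sorted, max_dt_s=60.0):
    """Pair each time in times_a with the nearest time in times_b_sorted.

    Returns a list of (index_a, index_b) for pairs no more than max_dt_s apart.
    """
    pairs = []
    for i, t in enumerate(times_a):
        j = bisect_left(times_b_sorted, t)
        # Only the neighbours just before and just after t can be the nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(times_b_sorted)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(times_b_sorted[k] - t))
        if abs(times_b_sorted[best] - t) <= max_dt_s:
            pairs.append((i, best))
    return pairs


if __name__ == "__main__":
    em27_times = [0.0, 58.0, 116.0, 400.0]  # synthetic timestamps in seconds
    tccon_times = [30.0, 242.0]
    print(pair_coincident(em27_times, tccon_times))
```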
All gases show a pronounced seasonal cycle, where the variability in water vapor is strongest, with values below 1×10²⁶ molec. m⁻² in winter and up to 14×10²⁶ molec. m⁻² in summer. Furthermore, the seasonal cycle of water vapor is shifted with respect to the other species. Another feature is an offset in the EM27/SUN (red squares) and HR125 LR (blue squares) total column data with respect to the TCCON data (black squares). The occurrence of a systematic bias when reducing the spectral resolution has been observed by several investigators (Petri et al., 2012; Gisi et al., 2012). The observed offset between EM27/SUN and HR125 LR measurements is smaller. The remaining difference can be attributed to the different measurement heights of the HR125 (112 m) and EM27/SUN (133 m). For a quantitative analysis we do not utilize the total column measurements, but rather use Xgas, as in this representation systematic errors, e.g., ILS errors, timing errors, tracking errors and nonlinearities, mostly cancel out. Furthermore, the height dependence largely cancels out in this representation. The comparison will be presented in the following sections.
First, a sensitivity study is provided demonstrating the effect of changes in the ILS on the gas retrieval. For this, 1 h of measurements around solar noon on 1 August 2016 and on 15 February 2017, corresponding to solar elevation angles (SEAs) of 60° and 30°, were analyzed with artificially altered ILS values. The results are shown in Table 2. An increase of 1 % in the modulation efficiency leads to a decrease of 0.35 % (0.37 %) in the retrieved O2 column, 0.31 % (0.31 %) in H2O, 0.26 % (0.28 %) in CH4 and 0.50 % (0.57 %) in CO2 for an SEA of 60° (30°). The change in the retrieved total column is therefore not the same for all gases; it is a characteristic of each species and is also slightly airmass-dependent. As the decrease in the CO2 column is larger than the decrease in the O2 column, XCO2 decreases with increasing ME, by 0.16 % (0.19 %) for a 1 % increase in ME, whereas XCH4 increases by 0.10 % (0.09 %). This contrasts with prior studies (Gisi et al., 2012; Hedelius et al., 2016) reporting an increase in XCO2 and a decrease in XCH4 for an increase in the modulation efficiency, but agrees with the findings of Hase et al. (2013) for the HR125 spectrometer, who report that a change in the modulation efficiency results in a larger relative decrease in the CO2 column than in the O2 column.
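The sign and approximate size of the Xgas changes follow directly from the column changes, because Xgas is proportional to the ratio of the gas column to the O2 column. The sketch below redoes this arithmetic with the rounded column sensitivities quoted above; because those inputs are rounded to two decimals, the results can differ from the quoted Xgas sensitivities in the last digit. The function name is illustrative.

```python
def xgas_change_percent(gas_change_percent: float, o2_change_percent: float) -> float:
    """Relative change (%) of X_gas = VC_gas / VC_O2 * 0.2095 for given column changes."""
    gas_factor = 1.0 + gas_change_percent / 100.0
    o2_factor = 1.0 + o2_change_percent / 100.0
    return (gas_factor / o2_factor - 1.0) * 100.0


if __name__ == "__main__":
    # Column changes (%) for a 1 % increase in modulation efficiency, from the text.
    cases = {
        "SEA 60 deg": {"O2": -0.35, "CO2": -0.50, "CH4": -0.26},
        "SEA 30 deg": {"O2": -0.37, "CO2": -0.57, "CH4": -0.28},
    }
    for label, change in cases.items():
        dxco2 = xgas_change_percent(change["CO2"], change["O2"])
        dxch4 = xgas_change_percent(change["CH4"], change["O2"])
        print(f"{label}: XCO2 {dxco2:+.2f} %, XCH4 {dxch4:+.2f} %")
```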

Xair
In this section the column-averaged amount of dry air (Xair) is investigated. This quantity is a sensitive test of the stability of a spectrometer because for Xair there is no compensation of possible instrumental problems, in contrast to the DMFs, where errors can partially cancel out. Xair compares the measured oxygen column ($VC_{\mathrm{O}_2}$) with surface pressure measurements ($P_S$):
$$X_\mathrm{air} = \frac{0.2095}{VC_{\mathrm{O}_2}} \left( \frac{P_S}{g \, \mu} - \frac{\mu_{\mathrm{H_2O}}}{\mu} \, VC_{\mathrm{H_2O}} \right)$$
Here $\mu$ and $\mu_{\mathrm{H_2O}}$ denote the molecular masses of dry air and water vapor, respectively, $g$ is the column-averaged gravitational acceleration and $VC_{\mathrm{H_2O}}$ is the total column of water vapor. The correction with $VC_{\mathrm{H_2O}}$ is necessary as the surface pressure instruments measure the pressure of the total air column, including water vapor. For an ideal measurement and retrieval with accurate O2 and H2O spectroscopy, as well as accurate surface pressure, Xair would be 1. However, due to insufficiencies in the oxygen spectroscopy, this value is not obtained. For TCCON measurements Xair is typically ∼ 0.98. For the EM27/SUN prior studies showed a factor of ∼ 0.97 (Frey et al., 2015; Klappenbach et al., 2015). Large deviations (∼ 1 %) from these values indicate severe problems, e.g., errors with the surface pressure, pointing errors, timing errors or changes in the optical alignment of the instrument. As mentioned in Sect. 3.1, here Xair is used to check whether the small changes in the modulation efficiency indicated by the open path measurements are due to actual alterations in the alignment of the EM27/SUN or due to the residual uncertainty of the calibration method.
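The Xair check can be sketched numerically from the quantities defined above. The form used here, the dry-air column implied by the surface pressure divided by the dry-air column implied by the O2 column, is an assumption built from those definitions and from the statement that an ideal measurement gives 1; the constants, the fixed value of g and the example columns are likewise illustrative rather than taken from the retrieval code.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
M_DRY_AIR = 28.9644e-3     # kg per mole of dry air (assumed value)
M_H2O = 18.01528e-3        # kg per mole of water vapor
O2_DRY_MOLE_FRACTION = 0.2095


def x_air(vc_o2, vc_h2o, surface_pressure_pa, g=9.81):
    """Ratio of the pressure-derived dry-air column to the O2-derived one.

    Columns are in molec. m^-2, pressure in Pa, g in m s^-2 (column average).
    """
    mu_dry = M_DRY_AIR / AVOGADRO  # mass of one "dry air molecule" in kg
    mu_h2o = M_H2O / AVOGADRO
    dry_column_from_pressure = surface_pressure_pa / (g * mu_dry) - vc_h2o * mu_h2o / mu_dry
    return O2_DRY_MOLE_FRACTION * dry_column_from_pressure / vc_o2


if __name__ == "__main__":
    # Build a self-consistent synthetic atmosphere (illustrative numbers only).
    g = 9.81
    dry_column, vc_h2o = 2.1e29, 5.0e26
    pressure = g * (dry_column * M_DRY_AIR + vc_h2o * M_H2O) / AVOGADRO
    vc_o2 = O2_DRY_MOLE_FRACTION * dry_column
    print(f"ideal case:            Xair = {x_air(vc_o2, vc_h2o, pressure, g):.4f}")
    # An O2 column retrieved too high by about 3.4 % lowers Xair to about 0.967.
    print(f"O2 column biased high: Xair = {x_air(vc_o2 / 0.967, vc_h2o, pressure, g):.4f}")
```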
Panel (a) of Fig. 3 shows the Xair time series of TCCON, the EM27/SUN and HR125 LR. For clarity, only coincident data points that were measured within 1 min between the different data sets are shown. Grey areas denote periods where the EM27/SUN was moved over long distances for campaigns or maintenance. The absolute values of Xair differ for the data sets, with 0.9805 ± 0.0012 for TCCON, 0.9669 ± 0.0010 for the EM27/SUN and 0.9670 ± 0.0011 for HR125 LR. The difference between the EM27/SUN and the HR125 LR is within 1σ precision. The difference between the EM27/SUN and TCCON data sets, which is commonly observed as previously noted, is a consequence of the different resolution together with the different retrieval algorithm (Gisi et al., 2012). All data sets exhibit a seasonal variability, which is more prominent in the TCCON data, as can also be seen from the higher standard deviation. From this higher variability it can be concluded that the airmass dependency in the official TCCON O2 retrieval is higher than for the PROFFIT retrieval on reduced-resolution TCCON measurements, a finding also observed by Gisi et al. (2012) between the TCCON retrieval and the PROFFIT retrieval at full resolution. For the PROFFIT retrieval, it is suspected that part of the variability stems from insufficiencies in the utilized HITRAN 2008 H2O linelist. It was reported by Tallis et al. (2011) that in the 8000-9200 cm⁻¹ region, line intensities are low by up to 20 % compared to other wavenumber regions. This in turn will lead to a systematic overestimation of the water column, which also affects Xair.
To test the sensitivity of Xair with respect to the measured H2O column, in panel (b) of Fig. 3 [...] There are no obvious steps and there is no significant drift between the EM27/SUN and the HR125 LR data sets, so it can be concluded that the EM27/SUN was stable during the complete course of the comparison, which lasted more than 3 years, and that the differences seen in the modulation efficiency are introduced by the remaining uncertainty in the calibration method.
[Figure caption fragment: ... shows the XCO2 ratio between the EM27/SUN and the two HR125 data sets. A linear fit was applied to investigate a possible trend in the ratios.]

XCO2
In Fig. 4 the XCO2 time series of the three data sets are shown together with the offsets between the data sets. The general characteristics of the data sets are similar. The yearly increase in XCO2 of about 2 ppmv due to anthropogenic emissions can be seen, as well as the seasonal cycle with a decrease in XCO2 of approximately 10 ppmv during summer due to photosynthesis, characteristic of mid-latitude stations. Despite this agreement in the general trend, there are also differences between the data sets. Relative to the TCCON data the EM27/SUN and the HR125 LR data sets are biased high (0.98 % and 0.84 %, respectively). The scaling factors are calculated by taking the mean of all individual coincident point ratios (EM27/SUN/TCCON and EM27/SUN/HR125 LR). Together with these ratios a standard deviation is also derived; see Table 3. A high bias was also observed by Gisi et al. (2012) and Frey et al. (2015), albeit with smaller absolute differences. This is due to the fact that (1) in the Gisi et al. paper the TCCON data were retrieved with an earlier version of GFIT (GGG2012) and (2) after the publication of the Frey et al. paper the Karlsruhe TCCON data were reprocessed with a customized GFIT retrieval accounting for baseline variations (Kiel et al., 2016b). The offset between EM27/SUN and TCCON shows a seasonal variability. The reasons for this are mainly the differences in airmass correction, averaging kernels and retrieval algorithm. These effects have been investigated before (Gisi et al., 2012; Frey et al., 2015; Klappenbach et al., 2015; Hedelius et al., 2017; Kiel et al., 2016a). The averaging kernels of the EM27/SUN have been previously presented and compared to TCCON in a study by Hedelius et al. (2016).
It has to be noted that the level of uncertainty for XCO2 is significantly higher between COCCON and TCCON than within the EM27/SUN ensemble itself. According to Table 3, a current calibration uncertainty with respect to TCCON of 0.6 ppmv is estimated.
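The scaling factor and its scatter, as described for the coincident point ratios, reduce to a mean and a standard deviation. The sketch below assumes already paired measurements and uses the sample standard deviation; which estimator was used for Table 3 is not stated, and the example values are synthetic.

```python
from statistics import mean, stdev


def ratio_statistics(test_values, reference_values):
    """Mean and sample standard deviation of the point-by-point ratio test/reference."""
    if len(test_values) != len(reference_values) or len(test_values) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [t / r for t, r in zip(test_values, reference_values)]
    return mean(ratios), stdev(ratios)


if __name__ == "__main__":
    # Synthetic coincident XCO2 pairs in ppmv (illustrative, not measured data).
    em27 = [404.1, 405.9, 403.2, 404.8]
    tccon = [400.0, 402.0, 399.5, 400.9]
    factor, scatter = ratio_statistics(em27, tccon)
    print(f"scaling factor {factor:.4f}, 1 sigma {scatter:.4f}")
    print(f"bias {100 * (factor - 1):.2f} %")
```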
For the long-term stability of the EM27/SUN the focus lies on the comparison with the HR125 LR data set, where the above-mentioned differences cancel out. There is a small offset between the two data sets, resulting in a calibration factor of 1.0014, which is assumed to be constant over the analyzed time period. To test this assumption a linear fit was applied to the XCO2 ratios; see panel (b) of Fig. 4. The slope coefficient is listed in Table 3. For both comparisons the yearly trend in the ratio is well within the 1σ precision (0.44 ppmv) of the data set. In absolute numbers the slope per year is ≈ −0.02 ppmv for both ratios, or a drift smaller than 0.1 ppmv over the whole comparison period of around 3.5 years. Figure 5 shows the data sets in a different representation. In panel (a) the EM27/SUN is compared to the HR125 LR; the colorbar indicates the date of measurement and the dashed line is the 1 : 1 line. There is no trend in the data apart from the overall increase in time due to anthropogenic emissions. In panel (b) the EM27/SUN is compared to the TCCON data set; the colorbar shows the SEA. This representation is chosen so that the remaining airmass dependency of the ratio can be seen. It is also interesting to note that omitting the TCCON AICF for our analysis would move the data set significantly closer to the 1 : 1 line. The scaling factor would change from 1.0098 to 0.9995. As this finding is not true for XCH4 and is probably coincidental, we maintain the AICF.

XCH4
Figure 6 shows the XCH4 time series of the different data sets. As for XCO2, the general features are in agreement for all data sets. There is a slight annual increase of about 10 ppbv. There is also a seasonal cycle with a variability of ≈ 30 ppbv; however, compared to XCO2, the strength and phase of the seasonal cycle vary significantly between years due to the many different variable sinks and sources of methane, e.g., Dlugokencky et al. (1997). The differences between the data sets largely resemble the differences observed for XCO2. The bias between EM27/SUN and TCCON is 0.72 %; see Table 4. This bias is close to the bias of 0.75 % observed by Hedelius et al. (2016), who used the GGG software package for the analysis of EM27/SUN spectra. Although a single bias is reported, the offset is not constant, as was also observed for XCO2, but rather shows a seasonality. The calibration uncertainty between COCCON and TCCON is estimated to amount to 5 ppbv for XCH4; see Table 4. The retrievals between EM27/SUN and HR125 LR agree within 1σ precision (0.9997 ± 0.0008). Panel (a) of Fig. 7 shows the ratio between EM27/SUN and HR125 LR color-coded with the observation date. As for XCO2, no trend is apparent. An explicit linear fit to the XCH4 ratio produces a slope coefficient of 0.0001, about 1 order of magnitude smaller than the 1σ precision of the ratio (0.0008).
An interesting feature is observed in the ratio between EM27/SUN and TCCON data sets; see panel (b) of Fig. 7. In general the pattern is similar to that of XCO 2 , with a slight dependence on the SEA. The ratio in the figure is color-coded with the date of observation rather than the SEA. It can be seen that for 1 and 14 March 2016 (shaded area in Fig. 7), the XCH 4 ratio significantly differs from the other observations. Previous work by Ostler et al. (2014) has shown that stratospheric intrusion, caused for example by the subsidence of the polar vortex, has a different effect on MIR and NIR retrievals, even when using the same a priori profile. This is due to the differing sensitivity of the retrievals with respect to altitude. Therefore, differences between the true atmospheric profile and the assumed a priori profiles on these days could cause the differences seen. This effect will also lead to larger differences between EM27/SUN and TCCON XCH 4 because of the different impact on the retrieved columns due to differing sensitivities. A spread of the polar vortex to mid-latitudes could lead to significantly altered CH 4 profiles compared to the a priori profiles, explaining the observed differences in the XCH 4 ratio. For dates without measurements, the data were interpolated using a weighted average. The dotted black lines denote 1 and 14 March 2016, the dates on which the XCH 4 ratio between EM27/SUN and TCCON shows an anomaly. The changed profile shape during that period is clearly visible. As this station is south of Karlsruhe, it is expected that also for Karlsruhe the CH 4 profile will show considerable downwelling, explaining the observed anomaly in the XCH 4 ratio.

Ensemble performance
Having investigated the long-term stability of the EM27/SUN with respect to a reference spectrometer in the previous section, here the level of agreement of an ensemble of EM27/SUN spectrometers is presented. The procedure is the same as for the comparison between the reference EM27/SUN and the HR125. First, the ILS is analyzed, followed by calibration factors for XCO 2 and XCH 4 .

ILS measurements and instrumental examination
The measurement of the ILS is a valuable diagnostic for detecting misalignments of spectrometers. Differences in the ILS of the EM27/SUN spectrometers due to misalignment can lead to biases in the data products between the instruments. Here the spread of ILS values of all EM27/SUN spectrometers that were checked at KIT in the past 4 years is estimated. Numerical values are given in Table 5; the results are shown in Fig. 9. The black square denotes an ILS measurement of the HR125 spectrometer, also with 1.8 cm MOPD. This test was done to check for an absolute offset of our method. The HR125 would be expected to show an ideal ILS for short optical path differences, but a value of 0.9824 was obtained. From this measurement it is concluded that our method shows an absolute offset and that values between 0.98 and 0.99 are desired. In general, the agreement between the 30 tested EM27/SUN spectrometers is good, with an ensemble mean of 0.9851 ± 0.0078, which does not differ significantly from the value obtained for the HR125, but there are exceptions. Instrument SN 44 was checked at KIT only after an upgrade with the second channel was performed at Bruker Optics. Before realignment, the instrument showed a very low ME value of 0.9374. A realignment of the instrument raised the ME to 0.9714. This is still significantly below the EM27/SUN ensemble mean, but the difference was drastically reduced. The second instrument showing strong deviations from the ensemble mean is SN76 with an ILS of 1.0160, the only instrument showing overmodulation. The ILS was even higher (1.0350) when the first ILS measurements were performed. Due to our findings, the manufacturer exchanged the beamsplitter, which reduced the overmodulation, but it partly remained. In the meantime it was recognized that the error arose during assembly of the instrument, when the manufacturer omitted the spacer that is intended to set the correct detector position with respect to the beamsplitter.
The beamsplitter is coated, and the coating is applied on both sides of the beamsplitter over half the surface area. If the optical axis of the detector element coincides with the transition region of the two coating areas, detrimental effects occur. For this reason the detector element needs to be raised with respect to the interferometer. This problem occurred for instrument SN 77, but there it was diagnosed and corrected by KIT (ILS before lifting: 1.0340; ILS after correction: 0.9855).
The above-mentioned problems show the benefit of the calibration routine at KIT. Imperfections from nonideal alignments were diagnosed and corrected. Also, other detrimental effects, e.g., double-passing, channeling, nonlinearity issues, solar tracker problems, inaccurate positioning of the second detector, or camera issues, were corrected or minimized for a number of instruments. Finally, we checked whether the linear interpolation method suppressing sampling ghosts was activated.
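The acceptance check implied by this subsection can be illustrated with a short sketch that compares the values quoted above with the desired range (0.98 to 0.99) and with the ensemble statistics (0.9851 ± 0.0078). The function names and the labels "below", "within" and "above" are hypothetical; the text does not define a formal pass/fail criterion, so this only illustrates the comparison.

```python
# Values quoted in the text above.
DESIRED_ME_RANGE = (0.98, 0.99)
ENSEMBLE_MEAN = 0.9851
ENSEMBLE_STD = 0.0078


def classify_me(me, desired=DESIRED_ME_RANGE):
    """Label a value relative to the desired range (hypothetical labels)."""
    low, high = desired
    if me < low:
        return "below"
    if me > high:
        return "above"
    return "within"


def deviation_in_sigma(me, mean=ENSEMBLE_MEAN, std=ENSEMBLE_STD):
    """Distance from the ensemble mean in units of the ensemble standard deviation."""
    return (me - mean) / std


# Latest values reported in the text for the three instruments discussed.
reported = {"SN 44": 0.9714, "SN76": 1.0160, "SN 77": 0.9855}
for serial, me in reported.items():
    print(serial, classify_me(me), f"{deviation_in_sigma(me):+.2f} sigma")
```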

XCO 2 and XCH 4 comparison measurements
After checking the alignment and performing lamp measurements, side-by-side solar calibration measurements were performed on the terrace on top of the KIT-IMK office building with each spectrometer with respect to the reference EM27/SUN and also a co-located HR125 spectrometer. Calibration measurements started in June 2014 and continue whenever new spectrometers arrive for testing. The aim is to have at least 1 day of comparison measurements so that the spectrometers can be scaled to TCCON via the reference EM27/SUN. TCCON is extensively compared to measurements on the WMO scale. Dates of the comparison measurements for the different spectrometers as well as number of coincident measurements are given in Table 6. On 21 January 2016, our reference spectrometer suffered from laser sampling errors after approximately 1 h of measurements. Therefore the coincident measurements for SN62 and 63, which were checked on this date, are sparse. A typical calibration day is depicted in Fig. 10.
The calibration factors and standard deviations for all instruments with respect to the reference spectrometer are also listed in Table 6. Calibration factors and standard deviations were obtained using the methods described in Sect. 3.4. The calibration factors are close to nominal for all species and instruments. For XCO 2 the ensemble mean is high compared to the reference EM27/SUN, with a mean calibration factor of 0.9993 and a standard deviation of 0.0007. In Fig. 11 histograms of the calibration factor distributions are depicted for XCO 2 , XCH 4 , and O 2 , respectively. The histograms show no conspicuous features.
Applying the mean calibration factor to all calculated calibration factors centers the data around the ensemble mean. As an estimate for the spread of the calibration factors, $\frac{1}{n}\sum_{i=1}^{n}|\mathrm{XGas}_{\mathrm{factor},i} - 1|$, we arrive at an average bias between the instruments of 0.20 ppmv. From Table 6 we can also calculate an average standard deviation $\frac{1}{n}\sum_{i=1}^{n}|\sigma_i|$ of 0.13 ppmv. For XCH 4 the ensemble mean is closer to the reference EM27/SUN (0.9997 ± 0.0006) as compared to XCO 2 . This results in an average bias of 0.8 ppbv. The average standard deviation is 0.6 ppbv. These values are comparable to results obtained in a study from Hedelius et al. (2017). They checked the intercomparability of the four United States TCCON sites using an EM27/SUN as a traveling standard. They report average biases of 0.11 ppmv for XCO 2 and 1.2 ppbv for XCH 4 ; for the average standard deviations they obtain 0.34 ppmv (XCO 2 ) and 1.8 ppbv (XCH 4 ). It has to be noted that for the Hedelius et al. (2017) study only data within ±2 h of local noon were taken into account, whereas here no constraints regarding the time of measurement were applied. As another sensitive test the O 2 total column calibration factors are given. In contrast to XCO 2 and XCH 4 , there is no canceling of errors in this quantity. The ensemble mean is slightly high compared to the reference EM27/SUN (0.9999 ± 0.0014). The average bias is 0.11 % O 2 with an average standard deviation of 0.04 % O 2 .
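The average-bias estimate in this paragraph (center the factors on the ensemble mean, then average the absolute deviations from 1) can be sketched as below. The calibration factors in the example are hypothetical placeholders, not the entries of Table 6. The conversion to ppmv by multiplying with a nominal XCO 2 of 400 ppmv is also an assumption, since the text does not state how the dimensionless spread was converted.

```python
from statistics import mean


def normalized_factors(factors):
    """Divide each calibration factor by the ensemble mean factor."""
    m = mean(factors)
    return [f / m for f in factors]


def average_bias(factors, nominal_abundance):
    """Mean absolute deviation of the normalized factors from 1,
    scaled to abundance units (e.g. ppmv) by a nominal abundance."""
    deviations = [abs(f - 1.0) for f in normalized_factors(factors)]
    return nominal_abundance * mean(deviations)


# Hypothetical XCO2 calibration factors (placeholders, NOT Table 6 values).
example_factors = [0.9985, 0.9990, 0.9993, 0.9996, 1.0001]
NOMINAL_XCO2_PPMV = 400.0  # assumed nominal abundance for the conversion

print(f"average bias: {average_bias(example_factors, NOMINAL_XCO2_PPMV):.2f} ppmv")
```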
Note that for our setup this average bias is a worst case scenario. The bias only applies if no calibration factor is used in the subsequent analysis. The strength of this calibration routine is that the computed calibration factors can be used, thereby significantly lowering the bias between different EM27/SUN spectrometers. The remaining bias is then given by the long-term drift of the individual instrument (see Sect. 3.4 and 3.5) and sudden alignment drifts due to mechanical strain from, e.g., transport and campaign use. To estimate this drift, we utilize the calibration factors before and after the Berlin campaign performed in 2014. There the drifts between five instruments were below 0.005 % XCO 2 and 0.035 % XCH 4 (Frey et al., 2015).
Ideally, we would expect identical calibration factors as we took the real ILS of the instruments into account. As this is not the case, we investigate whether the remaining differences can be attributed to the uncertainties of the open path measurements, which are summarized in Table 1. The results are incorporated into Fig. 12. Panel (a) shows the correlation between O 2 and XCO 2 calibration factors. Black squares denote the empirical calibration factors derived from  the side-by-side measurements. The red squares show calculated calibration factors based on the ME uncertainty budget. The dashed red line is a linear fit through the calculated factors. About half the measured empirical factors are within the bounds of the factors derived from the ME error budget. Furthermore, the slopes of the calculated and empirical factors are in good agreement, confirming that the ME uncertainty is contributing to the uncertainty of the calibration factors. The other contributions for this uncertainty are due to a superposition of various small device-specific imperfections. Panel (b) of Fig. 12 shows the correlation between O 2 and XCH 4 calibration factors. The findings mentioned above for the O 2 and XCO 2 correlation also hold true here.

Conclusions and outlook
Based on a long-term intercomparison of column-averaged greenhouse gas abundances measured with an EM27/SUN FTIR spectrometer and with a co-located 125HR spectrometer, respectively, we conclude that the EM27/SUN offers highly stable instrument characteristics on timescales of several years. The drifts on shorter timescales reported by Hedelius et al. (2016) were probably due exclusively, as conjectured by the authors of that study, to a deviation from the instrumental design as originally recommended. The application of a wideband detector suffering from nonlinearity, together with steadily decreasing signal levels due to ageing of the tracker mirrors, seems to be the reason for the observed drifts.
The favorable instrument stability, which is preserved even during transport events and operation under ambient conditions, suggests that the EM27/SUN spectrometer is well suited for campaign use and long-term deployment at very remote locations as a supplement to the TCCON. A deployment at remote sites is further facilitated by the recent development of an automated enclosure for the EM27/SUN, which enables unattended remote operation (Heinle and Chen, 2018; Dietrich and Chen, 2018). An annual to biennial check of the instrument performance by performing a side-by-side intercomparison with a TCCON spectrometer seems adequate for quality monitoring. To separate out instrumental drifts from atmospheric signals, the addition of low-resolution spectra derived from the TCCON measurements is highly useful, because in this kind of comparison, the smoothing error and any possible resolution-dependent biases of the analysis software cancel out. The ensemble performance of 30 EM27/SUN spectrometers turns out to be very uniform, supported by a centralized acceptance inspection performed at KIT before the spectrometers are deployed. When using the empirical ILS parameters derived for each spectrometer, the scatter in XCO 2 amounts to 0.13 ppmv, while it is 0.6 ppbv for XCH 4 . The standard deviation of the oxygen columns is 0.04 %. We expect that the conformity of measurement results will be even better than indicated by this scatter, if the remaining empirical calibration factors are taken into account. These empirical calibration factors are likely composed of several small device-specific error contributions; a major contribution was identified to stem from the uncertainty of the ILS measurements. Continuation and further development of the COCCON activities seem highly desirable for achieving the optimal performance of the growing EM27/SUN spectrometer network. The implemented pre-deployment procedures of testing, optimizing, and calibrating each device, executed by experts at a central facility, help to ensure consistent results from EM27/SUN spectrometers operated in any part of the world. This approach is corroborated by the proven excellent long-term stability of instrumental characteristics, and the proven high degree of stability under thermal and mechanical burdens as they occur during transport. In order to maintain the reliability of the EM27/SUN spectrometers, we suggest investigators send the instrument to KIT for a biennial inspection.
Figure 12. Correlation of O 2 calibration factors and XCO 2 (a) as well as XCH 4 (b) calibration factors. Black squares show the empirical calibration factors from the side-by-side measurements, red squares show calculated factors derived from the total ME uncertainty shown in Table 1, and the dashed red line is a linear fit through the calculated factors. The slope of empirical and calculated factors is in good agreement.
The EM27/SUN spectrometer does not require continuous expert maintenance and it is very simple to operate; we therefore expect that many investigators world-wide who are not keen on becoming FTIR experts will be attracted by this measurement device, operating it as a side activity. Current COCCON work supported by ESA in the framework of the COCCON PROCEEDS project will result in an easy-to-handle preprocessing tool optimized for the EM27/SUN spectrometer. This tool will generate quality-checked spectra from raw interferograms, which then are forwarded to a central data analysis facility. A demonstration setup of the central facility will be part of COCCON PROCEEDS. When finally implemented on an operational level, the facility will remove the whole burden of the quantitative trace gas analysis from the operator and ensure the consistency of the trace gas analysis chain to the utmost degree. Furthermore, it will enable a timely reanalysis of all submitted spectra after upgrades of the retrieval procedures and minimize the risk of data loss if operators stop their activity for some reason. Finally, this centralized facility will serve as a unique contact point for the data users.
Data availability. TCCON Karlsruhe data are available from the TCCON data archive, hosted by CaltechDATA: https://tccondata.org/. EM27/SUN data are available upon request to the authors.
Author contributions. MF performed measurements, data analysis, and paper writing. MKS performed measurements and contributed to data analysis. FH performed measurements, data analysis, and paper writing. MK contributed to data analysis. TB performed measurements and contributed to calibration efforts. RH contributed to calibration efforts. GS contributed to calibration efforts. NMD contributed to calibration efforts. KS contributed to calibration efforts. JF contributed to calibration efforts. HB contributed to calibration efforts. JC contributed to calibration efforts. MG contributed to calibration efforts. HO contributed to calibration efforts. YS contributed to calibration efforts. AB contributed to calibration efforts. GMT contributed to calibration efforts. DE contributed to calibration efforts and provided evidence of XH 2 O bias. DW contributed to calibration efforts. ZC contributed to calibration efforts. OG contributed to calibration efforts. MR contributed to calibration efforts. FV contributed to calibration efforts. JO supported the advance of the project and contributed to calibration efforts.