Strategy for high-accuracy-and-precision retrieval of atmospheric methane from the mid-infrared FTIR network

Abstract. We present a strategy (MIR-GBM v1.0) for the retrieval of column-averaged dry-air mole fractions of methane (XCH4) with a precision < 0.3 % (1-σ diurnal variation, 7-min integration) and a seasonal bias < 0.14 % from mid-infrared ground-based solar FTIR measurements of the Network for the Detection of Atmospheric Composition Change (NDACC, comprising 22 FTIR stations). This makes NDACC methane data useful for satellite validation and for the inversion of regional-scale sources and sinks in addition to long-term trend analysis. Such retrievals complement the high-accuracy, high-precision near-infrared observations of the younger Total Carbon Column Observing Network (TCCON) with time series dating back about 15 years before TCCON operations began.


Introduction
Methane (CH4) in the gas phase is characterized spectroscopically by its highly infrared-active eigenmodes, which arise from the tetrahedral molecular structure. As a result, it has a global warming potential of 72 relative to carbon dioxide in the earth's atmosphere when calculated over a period of 20 years (Forster et al., 2007). Therefore, it is the second most important anthropogenic greenhouse gas in spite of its still relatively small atmospheric abundance compared to carbon dioxide.
Published by Copernicus Publications on behalf of the European Geosciences Union.
The main sources are natural wetlands, anthropogenic activities (livestock production; rice cultivation; production, storage, transmission, and distribution of fossil fuels; waste waters and landfills), and biomass burning, both natural and human-induced. About 90 % of the CH4 loss in the atmosphere is due to destruction by OH in the troposphere (Lelieveld et al., 2004).
Methane concentrations in the atmosphere have more than doubled since the beginning of industrialization, and column-averaged mole fractions reached more than 1780 ppb on a global average in 2009 (Frankenberg et al., 2011; Schneising et al., 2011). After a period of nearly stable concentrations at the beginning of this century, attributed to the collapse of the former USSR economy (Dlugokencky et al., 2003), the growth rate of atmospheric methane has recently started to increase again (Rigby et al., 2008). This increase could be attributed to emissions from natural wetlands due to interannual anomalies in temperature and precipitation (Dlugokencky et al., 2009; Bousquet et al., 2011). For the future, however, there is concern that large positive feedbacks on climate warming can arise from releases of CH4 from marine hydrates or melting permafrost.
In order to assess the effectiveness of emission reduction schemes within the frame of the Kyoto process, it is necessary to quantify the sources and sinks on regional scales. One way to do so is the inverse modeling of atmospheric concentration measurements. This approach has recently been based upon methane surface measurements from global surface monitoring networks (Bousquet et al., 2011), or on column-averaged methane data from ENVISAT/SCIAMACHY satellite retrievals (Bergamaschi et al., 2009).
Ground-based solar FTIR measurements of methane have the potential to contribute to trend studies as well as to the quantification of sources and sinks. The latter comprises two complementary tasks: (i) validation of satellite retrievals of methane by FTIR (e.g. Sussmann et al., 2005a; Morino et al., 2011), which is important because spatio-temporal biases of the satellite data can be misinterpreted as sources or sinks by the inversion; (ii) direct use of the ground-based FTIR network data for inverse modeling. As to (ii), it had been stated by Bergamaschi et al. (2007) that the precision of the mid-infrared FTIR measurements of 3 % and the relative accuracy of 7 % shown by Dils et al. (2006) fell significantly short of the precision and (relative) accuracy targets of <1-2 % for SCIAMACHY measurements. Therefore, in situ measurements of CH4 from a comprehensive global air sampling network would be preferred since they have very high (and sufficient) precision and absolute accuracy (≈0.1 %).
In situ measurements probe the earth's surface, and, therefore, additional information about the vertical distribution of CH4 is required to render these measurements amenable to satellite validation or inverse modeling. Ground-based FTIR spectrometry, on the other hand, can directly measure the same quantity as the satellite (columns). Furthermore, measured columns contain direct information on sources and sinks. It has been shown that a single column-measurement station can provide significantly more information on sources and sinks than several surface stations together (Olsen and Randerson, 2004).
Column measurements for many species have been performed within the Network for the Detection of Atmospheric Composition Change (NDACC, http://www.ndacc.org) for about two decades by ground-based solar FTIR spectrometry in the mid-infrared, currently at 22 NDACC FTIR stations. NDACC mid-infrared spectra have also been utilized for methane retrievals (e.g. Zander et al., 1989; Sussmann et al., 2005a; Warneke et al., 2006). These activities have recently been complemented by the Total Carbon Column Observing Network (TCCON, http://www.tccon.caltech.edu/), which has been designed to provide high-quality column measurements of CO2 (and CH4) in connection with the OCO (Orbiting Carbon Observatory) mission (Wunch et al., 2011). TCCON is based on near-infrared solar FTIR measurements utilizing a normalization to simultaneous oxygen measurements to achieve very high precision, for methane on the order of 0.3 % for a 1.5-min integration time (Washenfelder et al., 2003). Currently, there are 15 operational TCCON stations. Most of them began operations during the last couple of years.
It is the goal of this paper to develop a strategy to infer methane also from NDACC mid-infrared FTIR measurements with an accuracy and precision on the order of a few tenths of one per cent, to make the data useful for satellite validation and for the inversion of regional-scale sources and sinks in addition to long-term trend analysis. This gives the possibility to complement the TCCON (near-infrared) observations as to spatial coverage and to provide the link to trend investigations dating back about 15 years before TCCON operations began.
Our paper is organized as follows. Section 2 is about the retrieval strategy. Section 2.1 describes the conceptual approach to developing the retrieval strategy and Sect. 2.2 introduces the 3 different FTIR sites used to test the strategy. Focus is then put on selecting the best subset of spectral micro windows (MW) out of a candidate set of 5 MW (Sect. 2.3). Selecting the best available compilation of spectroscopic line parameters is the subject of Sect. 2.4: erroneous spectroscopy can lead to airmass-dependent artifacts impacting methane seasonality, as described for methane retrievals from SCIAMACHY (Frankenberg et al., 2008a). Some effort is undertaken to optimize precision for methane via a special inverse method (Sect. 2.5) and a dedicated data quality selection (Sect. 2.6). Accuracy for methane turns out to be heavily impacted by water-vapor interference errors where a non-optimum retrieval strategy is used (Sect. 2.7). A related water-vapor-methane interference problem had been described for satellite retrievals (Frankenberg et al., 2008b). For this reason the whole study is performed using data from 3 FTIR stations (Wollongong, Garmisch, and Zugspitze) located in strongly differing climatic zones to cover all water
Table 1. Spectral micro windows and molecular line parameter compilations investigated in this study to find an optimum retrieval strategy.

Goal and approach
A first technical goal in optimizing the retrieval strategy is to investigate how retrievals of vertical profiles of methane can lead to improved precision for total columns compared to a simple scaling of a volume mixing ratio (vmr) profile with an altitude-constant factor. A requirement for such a profile inversion method is that it be based upon a robust regularization scheme that can easily be transferred to all NDACC-FTIR stations in a consistent manner. Another strategic goal is to identify a favorable selection out of five mid-IR candidate spectral micro windows (MW, see Table 1) which were established previously within the EC project UFTIR (http://www.nilu.no/uftir/). Furthermore, we want to find out which of the 3 most recent, official-release HITRAN¹ line parameter compilations (Table 1) is best suited for our mid-infrared micro windows. Finally, a concept for the quality selection of the methane retrievals shall be developed, since this is crucial to obtain a data set with optimum precision.
The conceptual approach is to perform an error characterization for multi-annual data sets prepared with varied retrieval strategies, i.e. varied subsets of the 5 candidate micro windows, using each of the 3 recent HITRAN versions in every case. A focus is on H2O/HDO-CH4 interference errors, which turned out in the course of our study to be the dominant errors in mid-IR methane retrievals that are not carefully optimized (up to ≈4 % for some HITRAN versions).
¹ HIgh-resolution TRANsmission molecular absorption database
As Table 2 shows, these test sites cover humidity levels from extremely wet to very dry due to their differing climatic locations. This range of integrated water vapor is representative of the clear-sky measurement conditions of the whole NDACC FTIR network.

Zugspitze FTIR system
The Zugspitze (47.42° N, 10.98° E, 2964 m a.s.l.) solar FTIR system was set up in 1995 as part of the "Alpine Station" of the NDACC network. It is operated by the Group "Variability and Trends" at IMK-IFU, Karlsruhe Institute of Technology, together with a variety of additional sounding systems at the Zugspitze site. These include an AERI (Atmospheric Emitted Radiance Interferometer), GPS (Global Positioning System for water vapor soundings), and two water vapor lidars (both a differential absorption lidar and a Raman lidar). The FTIR team contributes to satellite validation and studies of atmospheric variability and trends (e.g. Sussmann and Buchwitz, 2005; Sussmann et al., 2005a,b; Sussmann and Borsdorff, 2007; Sussmann et al., 2009; Vogelmann et al., 2011). The FTIR system is based upon a Bruker IFS 125/HR interferometer; details can be found in Sussmann and Schäfer (1997). The interferograms used for the methane retrievals have been recorded with an InSb detector using an optical path difference of typically 175 cm, averaging 6 scans (≈7-min integration time).

Garmisch FTIR system
The Garmisch solar FTIR system (47.48° N, 11.06° E, 743 m a.s.l.) was set up in 2004 and is part of the TCCON network operating in the near-infrared for high-precision retrieval of column-averaged mixing ratios of carbon dioxide and methane. The system performs mid-IR NDACC-type measurements in parallel (in alternating mode on the time scale of several minutes). The latter are utilized for this study. The system is operated together with a variety of additional sounding systems by IMK-IFU at the Garmisch site, which comprises, e.g. an NDACC aerosol lidar, GPS, and an ozone lidar. The Garmisch FTIR data contribute to satellite validation and studies of atmospheric variability and trends (e.g. Morino et al., 2011; Borsdorff and Sussmann, 2009).
The FTIR system is similar to the mid-IR Zugspitze setup, with additional InGaAs and Si diodes (dual recording) for the near-IR measurements plus high-precision solar tracking (Bruker A547N, ±2 min of arc) and high-accuracy pressure measurement devices. The measurement settings for the Garmisch mid-IR methane measurements are the same as detailed for the Zugspitze above.

Wollongong FTIR system
The Wollongong solar FTIR system (34.45Griffith et al., 1998). During 2007 a Bruker IFS 125/HR instrument replaced the Bomem DA8 as part of an upgrade of the FTIR measurement program to expand the measurement capability to both the mid-IR and near-IR (Jones et al., 2011; Wunch et al., 2011). For this paper only the Bruker data were used. Spectra used for the CH4 retrievals were obtained from interferograms recorded with an InSb detector and either a KBr or a CaF2 beamsplitter. The optical path difference was 257 cm, co-adding 2 consecutive interferograms, giving an integration time of approximately 4 min.

Spectral information and interfering species
The spectral contributions from methane and all relevant interfering species to our candidate spectral micro windows within the measured solar absorption spectrum are plotted in Fig. 1. The most important interfering species, i.e. water vapor and its isotopologue HDO, can vary by factors >200 between dry NDACC sites like Zugspitze and humid sites like Wollongong (see Table 2). The spectral effect of this huge dynamic range is demonstrated in Fig. 1 (dashed versus solid lines). This provides evidence that it is important for  (2002), where a minimization of interference errors was achieved without joint fitting of interfering species. This was performed by extensive micro window cutting, aiming to minimize the inclusion of spectral signatures of interfering species while preserving the main features of the target species.

Spectroscopic line data and spectral fitting residuals
Figure 2 shows the spectral residuals (measured minus calculated) averaged over more than one year of solar absorption measurements analyzed by the non-linear least squares spectral fitting software SFIT2 (Pougatchev et al., 1995)   verified by comparing the residuals to the contribution plots (Fig. 1).
iii. HITRAN 2008 shows problems similar to those of HITRAN 2004 for the non-methane line parameters. Additionally, the residuals due to methane have increased with HITRAN 2008. In particular, a huge error in the line strength of the 2921.33 cm⁻¹ methane line was introduced.
Figure 2b proves that HITRAN 2000 behaves comparably well for all three test sites in spite of their strongly differing humidity levels.

Profile retrieval optimizing precision for columns
An inverse method for methane retrievals from ground-based, mid-infrared solar FTIR spectrometry has been set up via the spectral fitting software SFIT2 (Pougatchev et al., 1995) -1 .
The classical approach to retrieving total columns from ground-based FTIR spectrometry has used least squares spectral fitting with iterative scaling of a volume mixing ratio (vmr) a priori profile via one (unconstrained) altitude-constant factor. This had been implemented in non-linear least squares spectral fitting software like SFIT1 (e.g. Rinsland et al., 1984) or GFIT (e.g. Toon et al., 1992). Because of the free profile scaling, this approach has the advantage that it does not damp true scaling-type column variability in the retrieval. However, it frequently leads to significant spectral residuals. This is because of (i) likely discrepancies between the shape of the true profile and the a priori profile (e.g. caused by variability of the tropopause altitude) and (ii) possible spectral line shape errors in the forward calculation and/or the measurement. Both effects can introduce significant biases to the retrieved columns. A strategy to reduce this problem is to derive total columns from profile retrievals, which helps to better integrate the area of the measured absorption line shape and thereby obtain a more accurate estimate of the total column integral.
⁵ Infrared Working Group
Up to now, most profile retrievals from solar FTIR spectrometry have been regularized via diagonal a priori covariance matrices with the magnitude of the variances tuned empirically to avoid profile oscillations.This is sometimes referred to as "empirical implementation of optimal estimation" (e.g.Pougatchev et al., 1995).This type of profile retrieval, however, has the tendency to damp variability in the retrieved total columns as a result of profile smoothing at the cost of any deviation between the retrieved profile and the a priori profile.This damping effect is easily overlooked, and may even be misinterpreted to be an indication of good precision of the measurement.It tends to become critical in cases where the a priori covariance matrix elements are set to small values, allowing for small profile variability only and/or the retrieval contains only low information content.
Therefore, we favor a more robust retrieval that combines the advantages of both a profile scaling and a profiling approach while avoiding their disadvantages at the same time.For this purpose, we construct a regularization matrix that allows for some (constrained) flexibility in profile shape (degree of flexibility to be tuned) and also guarantees that pure profile-scaling type variations remain unconstrained.This can be achieved as follows.
The forward model F maps the profiles to be retrieved from state space x into measurement space y. The retrieval is the (ill-posed) inverse mapping from y to x, which is formulated as a least squares problem. Due to the non-linearity of F, a Newtonian iteration is applied, and a regularization term R ∈ ℝ^(n×n) (inverse model with n layers) is used that allows one to constrain the solution and thereby avoid oscillating profiles:

$$x_{i+1} = x_i + \left(K_x^T S_\varepsilon^{-1} K_x + R\right)^{-1} \left\{ K_x^T S_\varepsilon^{-1} \left[ y - F(x_i) \right] - R\,(x_i - x_a) \right\}, \tag{1}$$

where the subscript i denotes the iteration index and x_a is the a priori profile. Here K_x = ∂F/∂x are the Jacobians and S_ε is the measurement covariance (assumed to be diagonal with a signal-to-rms-noise ratio of 500 in our formulation). Using first order Tikhonov regularization (Tikhonov, 1963), R is set up by the relation

$$R = \alpha\, L_1^T L_1, \tag{2}$$

where α is the regularization strength and L_1 is the discrete first derivative operator

$$L_1 = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{pmatrix} \in \mathbb{R}^{(n-1)\times n}, \tag{3}$$

which constrains x in such a way that a constant profile is favored for the difference x − x_a. The prior x_a for methane and the interfering species (including H2O) was constructed from a multi-annual average output from the Whole Atmosphere Chemistry Climate Model (WACCM, Garcia et al., 2007), and from the US Standard Atmosphere for species which were not available from WACCM (e.g. HDO). Pressure-temperature profiles have been obtained from NCEP (National Center for Environmental Prediction).
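The regularized iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SFIT2 implementation: it assumes the standard regularized Gauss-Newton form of Eq. (1) and R = α L1ᵀ L1, and the function names (`first_difference_operator`, `tikhonov_matrix`, `gauss_newton_step`) as well as the toy numbers are hypothetical. The check at the end illustrates the key property of the L1 constraint: a constant offset from the prior incurs no penalty, whereas an oscillating deviation does.

```python
import numpy as np

def first_difference_operator(n):
    """Discrete first-derivative operator L1 with shape (n-1, n)."""
    L1 = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    L1[idx, idx] = -1.0
    L1[idx, idx + 1] = 1.0
    return L1

def tikhonov_matrix(n, alpha):
    """First-order Tikhonov regularization matrix R = alpha * L1^T L1."""
    L1 = first_difference_operator(n)
    return alpha * L1.T @ L1

def gauss_newton_step(x_i, x_a, y, forward, jacobian, S_eps_inv, R):
    """One regularized Gauss-Newton update (assumed standard form)."""
    K = jacobian(x_i)
    lhs = K.T @ S_eps_inv @ K + R
    rhs = K.T @ S_eps_inv @ (y - forward(x_i)) - R @ (x_i - x_a)
    return x_i + np.linalg.solve(lhs, rhs)

# Illustrative check with made-up numbers: deviations from the prior,
# expressed as fractional changes per layer.
n = 5
R = tikhonov_matrix(n, alpha=100.0)
scaling_offset = np.full(n, 0.03)                        # pure profile scaling
shape_change = np.array([0.03, -0.03, 0.03, -0.03, 0.03])  # oscillating profile

penalty_scaling = scaling_offset @ R @ scaling_offset  # unconstrained: zero
penalty_shape = shape_change @ R @ shape_change        # penalized: positive
```

The design point is that constant vectors lie in the null space of L1, so scaling-type column variability passes through the retrieval undamped while profile oscillations are suppressed.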
Tests have shown that it is a good choice to apply the Tikhonov L 1 regularization to percentage changes (or scaling factors) of the vmr of the individual profile layers to be retrieved.Another choice would be the application of R to the state vector given in units of absolute vmr, which implies a differing altitude dependency of the regularization strength.An argument for the implementation of L 1 in units of percentage profile changes is that this leads to the limiting case of a vmr-profile scaling, in case the regularization strength α is tuned towards infinity leading to 1 degree of freedom for signal (dofs, see Rodgers, 2000 for a definition): vmr-profile scaling is one of the best-tested retrieval approaches and well known to yield very robust retrieval results for total columns.
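The limiting case mentioned above (dofs approaching 1 as α grows) can be illustrated numerically. The sketch below is an assumption-laden toy: the Jacobian is random rather than a real radiative-transfer Jacobian, the averaging kernel is taken in its usual linear form A = (Kᵀ S_ε⁻¹ K + R)⁻¹ Kᵀ S_ε⁻¹ K with dofs = trace(A) (Rodgers, 2000), and all names and numbers are hypothetical.

```python
import numpy as np

def degrees_of_freedom(K, S_eps_inv, R):
    """dofs = trace of the averaging kernel A = (K^T S^-1 K + R)^-1 K^T S^-1 K."""
    KtSK = K.T @ S_eps_inv @ K
    A = np.linalg.solve(KtSK + R, KtSK)
    return float(np.trace(A))

rng = np.random.default_rng(0)
n, m = 6, 40                       # layers, spectral points (toy sizes)
K = rng.normal(size=(m, n))        # hypothetical Jacobian, not from a real forward model
S_eps_inv = np.eye(m) * 500.0**2   # diagonal noise covariance, signal-to-noise 500

L1 = np.diff(np.eye(n), axis=0)    # first-difference operator, shape (n-1, n)
LtL = L1.T @ L1

dofs_free = degrees_of_freedom(K, S_eps_inv, 0.0 * LtL)    # no constraint
dofs_mid = degrees_of_freedom(K, S_eps_inv, 1e7 * LtL)     # intermediate alpha
dofs_stiff = degrees_of_freedom(K, S_eps_inv, 1e12 * LtL)  # very strong constraint
```

With no constraint the toy problem retains all n degrees of freedom; with a very large α only the constant (scaling-type) component survives, so dofs tends to 1, which is the vmr-profile scaling limit.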
The details of the retrieval grid chosen impact the altitude dependency of the regularization strength. For an altitude-constant retrieval grid Eqs. (2) and (3) result in an altitude-constant regularization strength. This turned out to work robustly for water vapor (Sussmann et al., 2009) as well as for the methane retrievals and other species. In order to preserve this altitude-constant regularization in case a non-altitude-constant retrieval grid is used, a transformation T has to be applied to Eq. (2), i.e. where and z_i is the vertical thickness of a layer with index i of the non-equidistant retrieval grid. Finally, the regularization strength α can be optimized to achieve minimum diurnal variation (optimum precision) of the retrieved CH4 columns. Figure 3a shows the L-curve, and Fig. 3b its second derivative, which shows an optimum for an α corresponding to dofs ≈ 2. Figure 3c shows that the minimum of the diurnal variation (0.23 %, 1σ) is obtained at the same dofs ≈ 2. This is nearly a factor of 2 lower than the diurnal variation of 0.39 % obtained with a simple vmr-profile scaling approach (dofs ≡ 1, see the point on the very left hand side of Fig. 3c). Together with the L-curve this provides evidence that the optimized Tikhonov profile retrieval accounts for true profile variations in a way that helps to better integrate the measured absorption-line profile, i.e. it is closer to an equivalent width retrieval than a simple vmr-profile scaling approach. See Appendix A for ensembles of retrieved profiles and total-column averaging kernels.

Quality selection
Final quality selection of the methane retrievals is crucial, e.g. for obtaining a data set of methane columns with best possible precision.Any quality selection is a trade off between improving the overall quality of the data and losing too much data.Therefore, we present an approach that optimizes this problem for ground-based FTIR spectrometry.
First of all we use a threshold for χ² as a measure of the goodness of fit, as shown in Fig. 4. However, we found that there are still some outliers among low-χ² spectra with bad quality. To remove these we added another quality selection threshold for the spectral rms-noise divided by the information content (dofs), as outlined in Fig. 4. The reason for using the spectral rms-noise to dofs ratio is as follows. Figure 5a (upper trace) shows that the time series of spectral noise contains a seasonal cycle with a maximum in winter (minimum in summer), which is due to the changing solar zenith angle. This means that a classical quality criterion using a simple threshold for the rms-noise would eliminate more measurements in winter than in summer. However, Fig. 5a (lower trace) indicates that the dofs shows a seasonal cycle with the same phase. This is a result of the absorption line depth changing with solar zenith angle in such a way that during winter the dofs is higher, as the lines are deeper. Deeper lines from (winter) spectra with a higher noise level (less sunlight than during summer) can be analyzed at a comparable quality (retrieval noise) level as the weaker lines from summer spectra, which show a lower average noise level. Thus, an optimized quality criterion can be obtained by using a threshold for the ratio of the spectral rms-noise and dofs, see Fig. 5b. Another advantage is that this threshold is more generic, as it is no longer sensitive to the average zenith angle of a specific site. Therefore the same threshold can be used for sites at differing geographic locations. We used one common quality threshold of 0.15 % (red line in Fig. 5b) for all three test sites. In addition, to remove a few obvious outliers, we added a threshold of <1.8 % for the deviation of an individual methane-column measurement from the daily mean.
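The three selection criteria above can be combined into a single boolean mask. The following is a hedged sketch rather than the authors' code: the function name `quality_mask` and the sample values are hypothetical, the χ² threshold is left as a required argument because its value is only given in Fig. 4, and computing the daily mean from the measurements that pass the first two criteria is an assumption of this sketch.

```python
import numpy as np

def quality_mask(chi2, rms_noise, dofs, column, day, chi2_max,
                 ratio_max=0.15, daily_dev_max=1.8):
    """Boolean mask of retrievals passing all three criteria.

    rms_noise is the spectral rms-noise in percent, so rms_noise / dofs is
    compared against ratio_max (0.15 % in the text); daily_dev_max is the
    allowed deviation from the daily mean column in percent (1.8 %).
    """
    chi2, rms_noise, dofs, column, day = map(
        np.asarray, (chi2, rms_noise, dofs, column, day))
    keep = (chi2 < chi2_max) & (rms_noise / dofs < ratio_max)
    for d in np.unique(day):
        in_day = day == d
        passed = keep & in_day
        if not passed.any():
            continue
        # Assumption: daily mean taken over retrievals passing the first two tests.
        daily_mean = column[passed].mean()
        deviation = 100.0 * np.abs(column / daily_mean - 1.0)
        keep &= ~(in_day & (deviation >= daily_dev_max))
    return keep

# Made-up example: index 3 fails chi2, index 4 is a daily outlier,
# index 5 fails the rms-noise/dofs ratio.
day = [0, 0, 0, 0, 0, 1, 1]
chi2 = [1, 1, 1, 9, 1, 1, 1]
rms = [0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2]
dofs = [2, 2, 2, 2, 2, 2, 2]
column = [100.0, 100.2, 99.8, 100.0, 104.0, 100.0, 100.0]
mask = quality_mask(chi2, rms, dofs, column, day, chi2_max=5.0)
```

Because the noise threshold is applied to the ratio rather than to the rms-noise itself, the same `ratio_max` can be reused across sites with different average solar zenith angles, which is the motivation given in the text.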

Information content
Table 4 (first rows) shows that MW 5 is the most important micro window within the set of 5 candidate MWs: excluding MW 5 leads to a stronger drop of dofs than excluding any of the other MWs. For example, for Wollongong the dofs drops from 1.94 to 1.75 when micro window 5 is dropped.

Diurnal variation
The precision of the retrieved CH4 columns (mostly limited by the impact of clouds) is estimated from the 1-σ diurnal variation of retrievals from single spectra (derived from the average of several scans, ≈4-7-min integration), averaged over all individual days of the multi-annual time series. the precision of remote sounding column measurements of CH4. In reality, part of the diurnal variation will be caused by real variations in CH4 over the day. Therefore this method gives an upper limit for the precision (see, e.g. Warneke et al., 2006). Table 4 (second rows) shows that an average precision of 0.25 % is achieved for Wollongong, 0.23 % for Garmisch, and 0.29 % for Zugspitze using all 5 candidate micro windows. Furthermore, the table shows that dropping individual micro windows leads to a slight increase of the diurnal variation, related to the corresponding drop in dofs. Using HITRAN 2004 or HITRAN 2008 instead of HITRAN 2000 also leads to a slight but significant increase of the diurnal variation (Table 5). In order to minimize interference errors, our final recommendation will be to use only micro windows 1, 3, and 5 together with HITRAN 2000 (HIT00 MW (135) strategy). The resulting diurnal variations are 0.27 % for Wollongong, 0.26 % for Garmisch, and 0.30 % for Zugspitze (Table 4). A note on the HIT08 MW (1234) strategy: it might be favored for the "esthetic" reason that it uses the latest HITRAN version. However, it will be shown in Sect. 2.7.3 that this strategy causes significantly higher interference errors than our recommended HIT00 MW (135) strategy based on HITRAN 2000.
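The precision estimate described at the start of this subsection (the 1-σ scatter within each day, averaged over all days) can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the sample standard deviation (ddof=1) is used, and days with fewer than two retrievals are skipped; the text does not specify these details.

```python
import numpy as np

def mean_diurnal_variation(column, day, min_per_day=2):
    """Average over days of the relative 1-sigma scatter (in percent)
    of single-spectrum column retrievals within each day."""
    column = np.asarray(column, dtype=float)
    day = np.asarray(day)
    daily = []
    for d in np.unique(day):
        c = column[day == d]
        if c.size >= min_per_day:
            daily.append(100.0 * c.std(ddof=1) / c.mean())
    return float(np.mean(daily))

# Made-up columns (arbitrary units) on two days.
precision = mean_diurnal_variation([99.0, 101.0, 100.0, 100.0], [0, 0, 1, 1])
```

Since real intra-day changes in CH4 also contribute to this scatter, the number returned should be read as an upper limit on the measurement precision, as noted in the text.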
Table 6 shows two further disadvantages of using HIT08 MW (1234): for Garmisch (Wollongong) there is an increased diurnal variation of ±0.28 % (±0.31 %) compared to the HIT00 MW (135) strategy, which leads to ±0.26 % (±0.27 %). Also, the information content is lower using HIT08 MW (1234), namely 1.75, compared to 1.80 attainable by using HIT00 MW (135). This means that a precision of <0.3 % is attainable for total column methane from mid-IR NDACC-type measurements, which is comparable to the TCCON state of the art for methane of <0.3 % for single spectra. However, two points have to be considered for a more detailed quantitative comparison:
i. The integration time for one single TCCON spectrum is about 1.6 min while the integration time of the mid-IR NDACC measurements of our study is ≈4-7 min. Recalculating the TCCON precision for a 7-min integration would lead to ≈0.3 %/sqrt(7/1.6) = 0.14 %, which would be a factor of ≈2 better than the mid-IR precision of <0.3 % for total column methane.
ii. However, in the case of the mid-IR retrievals no correction for variability induced by clouds is performed, while the TCCON retrievals use a normalization by simultaneous O2 column measurements (Washenfelder et al., 2003; Wunch et al., 2011). The TCCON retrievals additionally include a correction for solar intensity fluctuations via the DC signal of the interferograms according to Keppel-Aleks et al. (2007). Both measures lead to a reduction of the diurnal variation (caused, e.g. by clouds), but they are applied only to the TCCON measurements, not to the mid-IR NDACC measurements.
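The integration-time scaling used in point (i) can be reproduced with a short calculation. The sketch assumes that precision improves with the square root of the integration time (uncorrelated noise), which is the scaling implied by the expression in the text; the function name is hypothetical.

```python
import math

def scale_precision(precision, t_from, t_to):
    """Rescale a precision from integration time t_from to t_to,
    assuming noise averages down with the square root of time."""
    return precision / math.sqrt(t_to / t_from)

tccon_7min = scale_precision(0.3, t_from=1.6, t_to=7.0)  # in percent, about 0.14
ratio_to_mid_ir = 0.3 / tccon_7min                       # about a factor of 2
```

The square-root scaling only holds for random noise; cloud-induced variability, discussed in point (ii), does not necessarily average down in this way.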
Therefore, our interpretation of the relatively good mid-IR precision (only a factor of ≈2 lower compared to TCCON) is that both the Tikhonov profile retrieval optimized for minimum diurnal variation (Fig. 3) and the dedicated quality selection (Figs. 4 and 5) help to bring the precision of the mid-IR columnar methane retrievals to this unexpectedly high quality level.
www.atmos-meas-tech.net/4/1943/2011/

Interference errors
In the recent paper by Sussmann and Borsdorff (2007) we introduced a general formulation for a class of "interference errors" which could not be described by any of the classical four error categories of remote sounding (e.g. Rodgers, 2000, Eq. 3.16); i.e. errors in the retrieval of a target species (e.g. methane) resulting from the smoothing effect of interfering species (e.g. HDO). Additional interference effects can be due to errors in forward model parameters of the interfering species (e.g. erroneous HITRAN parameters for HDO), which can be propagated into the retrieval of the target species (e.g. methane). The latter class of errors can be described by the existing concept of "(forward) model parameter errors" (second term in Rodgers, 2000, Eq. 3.16). We will hereafter present an empirical interference error analysis and, in doing so, a separation of these two differing classes of errors is neither needed nor possible. Therefore, we will use the term "interference errors" in this paper to designate either or both of the two interference phenomena.
Figure 6 shows the ratio of Garmisch year-2007 methane time series derived with two differing retrieval strategies, i.e. (i) the retrieval strategy HIT08 MW (12345) using all 5 candidate micro windows and HITRAN 2008 line parameters (which was the starting point of our study), and (ii) the retrieval strategy HIT00 MW (135) using HITRAN 2000 and using only micro windows 1, 3, and 5, which will be the final recommendation resulting from our study.The ratio time series shows a significant seasonal discrepancy between these two retrieval strategies.
The reason for this seasonal discrepancy can be understood from Fig. 7, which shows the same ratio as above as a function of the HDO column level (also for the other test sites). The HDO columns were taken from the joint HDO retrieval of the HIT00 MW (135) run. A strong HDO-CH4 interference error is evident for all test sites (up to ≈5 % for Wollongong), caused by either or both of the two retrieval strategies. HDO, together with H2O, is the strongest interfering species in our set of 5 micro windows (Fig. 1), and the origin of the seasonal artifact of Fig. 6 can therefore be understood to be HDO-CH4 (and H2O-CH4) interference in combination with the well-known seasonal cycle of columnar HDO (like that of H2O, with a large-amplitude summer maximum). We will show in the following that this interference effect is due to the HIT08 MW (12345) retrieval strategy, i.e. it can practically be eliminated by using the HIT00 MW (135) retrieval strategy. In order to show this we perform a systematic study as indicated in Fig. 8. Using HITRAN 2000 and dropping stepwise each of the 5 candidate micro windows, we quantify the resulting interference effect relative to the HIT00 MW (12345) reference run: e.g. it can be quantified from Fig. 8d that dropping micro window 4 leads to a total relative interference error of +0.46 %, which is indicated in blue in the figure. Additionally, an overall bias of +0.27 % relative to the 5-micro-window run results (indicated in red in Fig. 8d). Note that this overall bias has two major contributions in general, (i) from the average interference error (e.g. dominant contribution in Fig. 8d), and/or (ii) from methane line strength errors (e.g. dominant contribution in Fig. 8e).
Table 4 gives numbers for all the relative interference errors for all 3 test stations; e.g. for Wollongong there is a significant relative interference error upon dropping MWs 2 and 4 (+0.31 % and +072 %, respectively), while dropping MWs 1, 3 and 5 leads only to minor interference errors of −0.07 %, +0.10 %, and −0.17 %, respectively.Similar results are obtained for the other test stations (Table 4).
use micro windows number 1, 3, and 5, and drop MWs 2 and 4 as long as no better spectroscopy than HITRAN 2000 is available. (It will be shown below that HITRAN 2004 and HITRAN 2008 give worse results than HITRAN 2000.) In the following we characterize the absolute magnitude of the interference error of the optimum (HIT00 MW (135)) retrieval strategy; e.g. for Garmisch the HIT00 MW (135) retrieval strategy leads to a relative interference error of +0.91 % derived from the ratio series MW (135)/MW (12345) (see Table 4). We give a twofold logical argument that this relative interference error is caused by the MW (12345) reference run, and not by the recommended MW (135) run, as follows. Assume that the HIT00 MW (12345) run were free of absolute interference error, so that the relative interference error MW (135)/MW (12345) of +0.91 % would be due to the HIT00 MW (135) run. The logical consequence would be that MWs 2 and 4 cause an absolute interference error of similar magnitude but opposite sign; i.e., the relative interference error MW (24)/MW (12345) would be ≈ −0.9 %. This is not the case: according to Table 4 the relative interference error MW (24)/MW (12345) for Garmisch is +0.14 (8) %, i.e. on the order of ≈0.1 %. This means that our starting assumption is erroneous: the absolute interference effect of the MW (135) run is only on the order of ≈0.1 %, and the observed relative interference effect of +0.91 % is mainly due to an absolute interference error of the MW (12345) run of ≈ −0.9 %. This conclusion can be double checked by assuming the opposite, namely that the MW (135) run is free of absolute interference error. The MW (12345) run would then have an absolute interference error of about −0.9 % caused by MWs 2 and 4, and the relative interference error MW (24)/MW (12345) should be around zero. In reality it is +0.14 (8) % for Garmisch (Table 4), i.e. our assumption is valid to a good approximation. In other words, the Garmisch HIT00 MW (135) run is interference free at the ≈0.1 % level.
In continuation of the previous considerations we define the "absolute interference error" of a certain retrieval strategy as the negative of the sum of the relative interference errors found by dropping, one at a time, the micro windows that make up this strategy; e.g., the absolute interference error for the HIT00 MW (135) run is the negative of the sum of the relative interference errors derived from the ratio time series MW (2345)/MW (12345), MW (1245)/MW (12345), and MW (1234)/MW (12345), i.e. the absolute interference error for the Garmisch HIT00 MW (135) run is −[(+0.01 %) + (+0.12 %) + (−0.03 %)] = −0.10 % (see Table 4). In an analogous way, absolute interference errors have been derived for Wollongong (+0.14 %) and for Zugspitze (+0.2 %). Note the improvement of these MW (135) runs relative to the MW (12345) runs, since the latter show an absolute interference error of −0.89 % for Garmisch, −1.96 % for Wollongong, and −0.35 % for Zugspitze (Table 4).
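The definition above can be illustrated with a few lines of Python. This is only a minimal sketch and not part of any retrieval software; the function name is hypothetical, and the input values are the relative interference errors quoted in the text for dropping MWs 1, 3, and 5 (HITRAN 2000).

```python
def absolute_interference_error(relative_errors_percent):
    """Negative of the sum of the relative interference errors (all in percent).

    Each entry is the relative interference error obtained by dropping one of
    the micro windows that make up the strategy under test.
    """
    return -sum(relative_errors_percent)


# Relative interference errors (percent) from dropping MW 1, MW 3 and MW 5,
# as quoted in the text for HITRAN 2000.
garmisch_mw135 = [+0.01, +0.12, -0.03]
wollongong_mw135 = [-0.07, +0.10, -0.17]

garmisch_abs = absolute_interference_error(garmisch_mw135)
wollongong_abs = absolute_interference_error(wollongong_mw135)

print(f"Garmisch   HIT00 MW (135): {garmisch_abs:+.2f} %")
print(f"Wollongong HIT00 MW (135): {wollongong_abs:+.2f} %")
```

Rounded to two decimals, the two results agree with the −0.10 % and +0.14 % stated above.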
It is a crucial result (from Table 4) that the quality ranking of the various retrieval strategies, according to the absolute interference error, is the same for our 3 test sites in spite of their strongly differing water vapor levels. This means that it does make sense, indeed, to recommend one joint (optimum) retrieval strategy for all NDACC sites. According to Table 4 this is the HIT00 MW (135) strategy.
However, for a final recommendation we still have to investigate the impact of varied spectroscopy upon interference errors. All hitherto discussed results of this paper used HITRAN 2000 (Table 4). Table 5 shows the increase of interference errors upon using HITRAN 2004 or HITRAN 2008 instead. Using HITRAN 2004 leads to an unacceptable value for the absolute interference error of −3.79 % for the MW (12345) run and still −3.21 % for the HIT04 MW (135) run. The major reason for this can be seen from the relative interference error of +3.95 % for the HIT04 MW (1234) run. This is clearly a result of the large residual at the left hand side of MW 5 due to HDO with HITRAN 2004 (rms of 0.27 %, see Fig. 2a), which is significantly larger than the residual of 0.13 % with HITRAN 2000 (Fig. 2a). Also a large relative interference error of −0.81 % arises from the HIT04 MW (2345) run, which is caused by increased residuals in MW 1 with HITRAN 2004 compared to the case with HITRAN 2000 (Fig. 2a).
From Table 5 one might wonder whether the HIT04 MW (1234) retrieval strategy would be a reasonable alternative, since its absolute interference error is +0.16 %, which is larger than, but of the same order of magnitude as, the proxy of −0.10 % for the recommended HIT00 MW (135) retrieval strategy. However, the relatively small (+0.16 %) absolute interference error for the HIT04 MW (1234) run comprises large relative interference error components with opposite sign from the individual micro windows. These cancel out by chance, see Table 5: −0.16 % = −0.81 % (from MW (2345), i.e. due to MW 1) + 0.23 % (from MW (1345), i.e. due to MW 2) + 0.07 % (from MW (1245), i.e. due to MW 3) + 0.35 % (from MW (1235), i.e. due to MW 4). This means that the HIT04 MW (1234) retrieval strategy contains strong "internal tension", i.e. retrievals using the individual micro windows alone would lead to strongly differing retrieval results. This is not a recommendable, stable retrieval approach.
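The chance cancellation can be made explicit with a small numerical sketch. The "gross" sum of magnitudes used below is a hypothetical diagnostic introduced here purely for illustration; it is not a quantity defined in this paper. The component values are those quoted in the text for Garmisch.

```python
def net_and_gross(components_percent):
    """Return (net, gross) for a list of relative interference errors in percent.

    net   : plain sum of the components (its negative is the absolute
            interference error as defined in the text)
    gross : sum of the magnitudes, a hypothetical indicator of how much
            the components pull against each other
    """
    net = sum(components_percent)
    gross = sum(abs(c) for c in components_percent)
    return net, gross


# Relative interference errors (percent) quoted in the text for Garmisch.
hit04_mw1234 = [-0.81, +0.23, +0.07, +0.35]   # due to MW 1, 2, 3, 4
hit00_mw135 = [+0.01, +0.12, -0.03]           # due to MW 1, 3, 5

for name, components in [("HIT04 MW (1234)", hit04_mw1234),
                         ("HIT00 MW (135)", hit00_mw135)]:
    net, gross = net_and_gross(components)
    print(f"{name}: absolute error {-net:+.2f} %, gross {gross:.2f} %")
```

Both strategies end up with absolute interference errors of similar magnitude, but the sum of magnitudes is roughly nine times larger for HIT04 MW (1234), which is one way to see the "internal tension".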
Finally, we investigate the use of HITRAN 2008. Similarly to HITRAN 2004, the use of HITRAN 2008 leads to an unacceptably large absolute interference error: it is −3.75 % for the HIT08 MW (12345) run and still −2.75 % for the HIT08 MW (135) run. The major reason for this can be seen from the relative interference error of +3.31 (9) % for the HIT08 MW (1234) run, which clearly is a result of the large MW-5 residual due to the CH 4 line at 2921.33 cm⁻¹ (rms of 0.41 %, see Fig. 2a), which is significantly larger than the residual of 0.27 % with HITRAN 2004 and the residual of 0.13 % with HITRAN 2000 (Fig. 2a).
Just for completeness, the HIT08 MW (1234) retrieval strategy is no alternative, since its absolute interference error is −0.40 % for Garmisch (Table 5), which is significantly larger than the value of −0.10 % achieved with our recommended HIT00 MW (135) retrieval strategy (for Garmisch). In addition, Table 5 shows that the HIT08 MW (1234) run also comprises large relative interference error components with opposite sign from the individual micro windows (−0.64 %, +0.22 %, +0.08 %, +0.74 % for MWs 1-4, respectively). This means the HIT08 MW (1234) retrieval strategy also contains strong "internal tension", similar to what has been found for the HIT04 MW (1234) strategy.
These numbers show that the HIT00 MW (135) strategy is favorable over the HIT08 MW (1234) strategy for the medium-humidity site Garmisch. We expect that the disadvantages of the HIT08 MW (1234) strategy become even more pronounced for the wettest site Wollongong. To show this we calculated analogous numbers for Wollongong, see Table 6. Indeed, the absolute interference error of the HIT08 MW (1234) strategy approaches the unacceptable 1 % level for Wollongong (−0.82 %). Also the "internal tension" is even higher compared to Garmisch, with a strong negative interference error contribution from MW 1 (−0.87 (3) %) and a strong positive contribution from MW 4 (+1.18 (2) %), see Table 6.
To conclude this section, we validate our concept of calculating absolute interference errors.
We give 4 validation examples. First, we derived for Garmisch from the HIT00 MW (135) strategy an absolute interference error of −0.10 % (Table 6), and for the HIT08 MW (12345) strategy an absolute interference error of −3.71 % (Table 6).
All validation results are summarized in Table 7. The overall validation result is that our method of absolute interference error estimation yields results with an accuracy at the ≈0.2 % level or better. This confirms the validity of the concept to calculate the absolute interference error from the negative of the sum of the relative interference errors. This also means that our quality ranking of the different retrieval strategies with respect to absolute interference

Fig. 9. Ratio plots showing significant relative HDO-CH 4 interference errors which are dominated by the unfavorable HIT08 MW (1234) retrieval strategy while the recommended HIT00 MW (135) retrieval strategy is practically interference free (see Sect. 2.7.3).

Calculation of column-averaged dry-air mole fractions
The retrieved individual methane total columns were divided by dry air columns to obtain column-averaged dry-air mole fractions of methane (XCH 4 ). Dry air columns were calculated using NCEP pressure-temperature-humidity (PTU) profiles by calculating the total air column from the PT profiles and subtracting water vapor columns obtained from integrating the NCEP water vapor profiles. Not all NDACC sites perform quality controlled surface pressure measurements as TCCON sites do. Therefore we investigated the quality of NCEP pressure information and its interpolation to an elevated site. For this purpose we performed a multi-year comparison of NCEP-derived pressure for the Garmisch station (743 m a.s.l.) versus the TCCON pressure sensor (1-min values from a high-quality pressure transducer which is regularly quality checked against a mercury barometer). We found a bias of −0.21 hPa with a standard deviation of 1.6 hPa. NCEP PTU profiles are available four times a day (06:00, 12:00, 18:00, 24:00 GMT) and were interpolated to the time of the FTIR measurement. This is recommended because of the strong diurnal cycle of water vapor columns above most sites. Alternatively, information on the water vapor column can also be retrieved from the FTIR measurements. However, the water vapor retrieval is in another spectral domain than the methane retrieval. These domains are measured sequentially, not coincidentally, so that water vapor measurements may not be available for all days. Additionally, an optimized water vapor retrieval has not yet been implemented by all NDACC FTIR groups. For this reason, and because comparing NCEP water vapor columns to the FTIR retrievals of water vapor has shown very good agreement (Fig. 10), we favor the use of NCEP water vapor information for a harmonized NDACC retrieval strategy of methane. The FTIR retrievals of integrated water vapor used in Fig. 10 have been performed with the retrieval strategy of Sussmann et al. (2009) and have recently been inter-compared against differential absorption lidar measurements showing excellent agreement (Vogelmann et al., 2011).
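The calculation can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used for this paper: instead of integrating the NCEP PT profiles, the total air column is approximated hydrostatically from surface pressure with an assumed constant gravity, and all function names and example input values (pressures, water vapor amounts, methane column) are hypothetical. The conversion from mm of integrated water vapor to molecules per cm² is the one given with Table 2.

```python
N_A = 6.02214076e23     # Avogadro constant, 1/mol
M_DRY = 28.9644e-3      # molar mass of dry air, kg/mol
M_H2O = 18.01528e-3     # molar mass of water, kg/mol
G = 9.81                # assumed column-averaged gravity, m/s^2
MM_TO_CM2 = 3.345e21    # molecules/cm^2 per mm of integrated water vapor


def interpolate(t, t0, v0, t1, v1):
    """Linear interpolation between two 6-hourly values (times in hours)."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def dry_air_column_cm2(p_surf_hpa, iwv_mm):
    """Dry air column in molecules/cm^2 from surface pressure and water vapor.

    Hydrostatic shortcut (an assumption of this sketch): the weight of the
    column, p/g, is split into dry air and water vapor contributions.
    """
    p_pa = p_surf_hpa * 100.0
    column_if_all_dry = p_pa * N_A / (G * M_DRY) / 1.0e4   # per cm^2
    h2o_column = iwv_mm * MM_TO_CM2
    return column_if_all_dry - h2o_column * M_H2O / M_DRY


def xch4_ppb(ch4_column_cm2, p_surf_hpa, iwv_mm):
    """Column-averaged dry-air mole fraction of methane in ppb."""
    return 1.0e9 * ch4_column_cm2 / dry_air_column_cm2(p_surf_hpa, iwv_mm)


# Hypothetical example: FTIR measurement at 09:30 between the 06:00 and
# 12:00 NCEP time steps.
t_meas = 9.5
p_surf = interpolate(t_meas, 6.0, 930.0, 12.0, 931.2)   # hPa
iwv = interpolate(t_meas, 6.0, 8.0, 12.0, 12.0)         # mm
ch4_column = 3.50e19                                    # molecules/cm^2

print(f"p = {p_surf:.1f} hPa, IWV = {iwv:.2f} mm")
print(f"XCH4 = {xch4_ppb(ch4_column, p_surf, iwv):.0f} ppb")
```

In this sketch the water vapor term changes the dry air column by roughly 0.1 %, which is comparable to the precision targeted here and shows why the interpolation of the humidity information to the measurement time matters.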

Recommended retrieval strategy MIR-GBM v1.0
We make the point that it does make sense to recommend one joint mid-IR retrieval strategy for all NDACC sites. Our conclusion is based upon two findings: (i) the outcome from Sect. 2.7 that the quality ranking of the 24 differing retrieval strategies (using differing micro windows and HITRAN versions) according to the absolute H 2 O/HDO-CH 4 interference error is the same for all three test sites of our study; (ii) the 3 test sites cover strongly differing levels of integrated water vapor which are representative of the whole NDACC network. This means we could not find any indication that there could be a retrieval strategy, optimized for the highest-humidity NDACC sites, that would not also be the optimum for the driest sites. The outcome from Sect. 2.7 (Tables 4 and 5) is that the optimum strategy for all sites is using HITRAN 2000 and micro windows 1, 3, and 5 (HIT00 MW (135)). This selection, together with our inverse method (Sect. 2.5) and the scheme for quality selection (Sect. 2.6) as well as the scheme for calculating column-averaged dry-air mole fractions (Sect. 2.8), constitutes our new strategy for mid-IR ground-based methane retrievals. We refer to it as MIR-GBM v1.0 hereafter, see Table 8 for its definition. Table 9 gives our overall quality estimates. In another, ongoing study we perform an inter-calibration of MIR-GBM v1.0 XCH 4 to

Table 9. Information on accuracy and precision for methane column-averaged dry-air mole fractions retrieved from mid-infrared solar spectra with the retrieval strategy MIR-GBM v1.0.

Seasonality and comparison to SCIAMACHY data
We quantify the seasonality of XCH 4 retrieved with MIR-GBM v1.0 and compare it to SCIAMACHY data. Our motivation is threefold. (i) It is of interest to quantify and understand the true seasonality of column-averaged mole fractions of methane. This seasonality is a complex interplay between the seasonality of the emissions (most with a maximum in summer), the OH sink (maximum in summer), and the varying contribution of the stratospheric component (the tropopause altitude also shows a maximum in summer). The latter is the reason why the seasonality of XCH 4 differs from the seasonality of surface concentrations. (ii) Furthermore, we have seen in Sect. 2.7.3 that interference errors can significantly impact the retrieved seasonal cycle. Therefore, we want to validate the seasonality retrieved with our recommended retrieval strategy. (iii) Seasonalities reported from SCIAMACHY retrievals have changed over the different processor versions in amplitude and phase (Frankenberg et al., 2008a,b; Schneising et al., 2009, 2011) and it is of interest to see what the current state of the art is compared to our ground-based retrieval MIR-GBM v1.0.
Figure 11 shows the de-trended seasonality of XCH 4 derived from the Zugspitze time series. From the full Zugspitze time series covering 1995-2011 we have taken into account only the time interval from the beginning of 2004 to the end of 2009 [2004, 2009], covering the period for which SCIAMACHY data are available. Red points in Fig. 11 are Zugspitze multi-annual monthly means for this time span. Underlying individual-year monthly means are based on 42 individual FTIR measurements on average. Since methane has shown a renewed increase during 2007-2009 after a near zero-trend period before (Rigby et al., 2008; Dlugokencky et al., 2009; Frankenberg et al., 2011; Schneising et al., 2011), we performed a linear de-trending for the period [2007, 2009] before calculating the multi-annual monthly means from the full [2004, 2009] interval. The de-trending was performed by the approach described in Gardiner et al. (2008).
An intra-annual function (2nd order Fourier series) was fitted to the de-trended multi-annual monthly means (red line in Fig. 11). The seasonal amplitude is 16.3 ± 2.9 ppb or 1.0 ± 0.2 %. The phase of the seasonality can be characterized by an approximate minus-sine-type behavior with a minimum in March/April and a maximum in September. The maxima are somewhat broader and the minima narrower than a simple sine function, however, and are described by the Fourier coefficients of Table 10 more quantitatively.

Fig. 11. Red points: multi-annual mean seasonality of column-averaged dry-air mole fractions of methane retrieved from Zugspitze FTIR. It has been derived from the de-trended monthly-mean time series in the interval [2004, 2009] with the retrieval strategy MIR-GBM v1.0 (Table 10). Error bars are standard errors of the multi-annual monthly means for 95 % confidence. Red line: Fit of a 2nd order Fourier series. Grey squares: Same as red points but using only [2004, 2005] FTIR data. Grey line: Seasonality derived from SCIAMACHY WFMD v2.0 data taken from Fig. 12 of Schneising et al. (2011) for the [2004, 2009] time interval. Black line: same as grey line but using only SCIAMACHY [2004, 2005] data.
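The fit of the intra-annual function and the amplitude definition, (max − min)/2 of the fitted function, can be sketched as below. This is a minimal illustration under assumptions, not the code used for this paper: it assumes 12 equally spaced, mid-month sampling points, for which the least-squares coefficients reduce to simple projections, and it removes the mean before fitting. The function names and the synthetic coefficients are hypothetical and are not the values of Table 10.

```python
import math


def seasonal_function(coeffs, t):
    """a1 cos(2 pi t) + a2 sin(2 pi t) + a3 cos(4 pi t) + a4 sin(4 pi t)."""
    a1, a2, a3, a4 = coeffs
    return (a1 * math.cos(2 * math.pi * t) + a2 * math.sin(2 * math.pi * t)
            + a3 * math.cos(4 * math.pi * t) + a4 * math.sin(4 * math.pi * t))


def fit_second_order_fourier(monthly_means):
    """Least-squares 2nd order Fourier fit to 12 multi-annual monthly means.

    t is the fraction of the year at mid-month. For 12 equally spaced points
    the basis functions are orthogonal, so each coefficient is a projection.
    """
    n = len(monthly_means)
    if n != 12:
        raise ValueError("expected 12 monthly means")
    mean = sum(monthly_means) / n
    t = [(k + 0.5) / n for k in range(n)]

    def project(func, harmonic):
        return 2.0 / n * sum((y - mean) * func(2 * math.pi * harmonic * ti)
                             for y, ti in zip(monthly_means, t))

    return (project(math.cos, 1), project(math.sin, 1),
            project(math.cos, 2), project(math.sin, 2))


def seasonal_amplitude(coeffs, samples=3650):
    """(max - min) / 2 of the fitted function over one year."""
    values = [seasonal_function(coeffs, i / samples) for i in range(samples)]
    return (max(values) - min(values)) / 2.0


# Synthetic test data with hypothetical coefficients (ppb); the dominant
# negative sine term gives a spring minimum and an autumn maximum.
true_coeffs = (2.0, -15.0, 1.0, 3.0)
monthly = [1780.0 + seasonal_function(true_coeffs, (k + 0.5) / 12)
           for k in range(12)]

fitted = fit_second_order_fourier(monthly)
print("fitted coefficients:", [round(c, 3) for c in fitted])
print(f"seasonal amplitude: {seasonal_amplitude(fitted):.1f} ppb")
```

For real data with unequal sampling or missing months, a general least-squares solver would be needed in place of the projections.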
SCIAMACHY monthly mean XCH 4 data for the northern hemisphere retrieved with WFMD v2.0 for the years 2004-2009 were taken from Schneising et al. (2011, Fig. 12 therein). These retrievals utilize CH 4 absorption features in channel 6 (1000-1750 nm) along with HITRAN 2008. SCIAMACHY data are a useful set for comparison because they are sufficiently sampled in time to show a significant seasonality. In addition, the WFM-DOAS total column averaging kernels are close to 1 in a range between well above the tropopause and the surface. This means that the retrievals integrate the column with a high sensitivity similar to the characteristics of the ground-based soundings (see averaging kernels in Fig. A1). We calculated de-trended multi-annual means from this data set by the same procedure used for the FTIR data (grey line in Fig. 11). The agreement with the Zugspitze FTIR result is good. It is even better (black line) if only SCIAMACHY data for 2004-2005 are used. This is probably due to detector degradation in the spectral range used for the methane column retrieval and the corresponding availability of considerably fewer detector pixels, as data since November 2005 exhibit larger scatter by about a factor of two (see Fig. 12 in Schneising et al., 2011). There is good agreement both for the amplitude and the phase (see Table 10 for numbers). This indicates some breakthrough in the quality of SCIAMACHY retrievals, since earlier SCIAMACHY scientific processor versions (WFMD v1.0 or IMAP-DOAS v49) had shown a differing, "frown shape" seasonality for the Northern Hemisphere, see, e.g. Fig. 12 in Schneising et al. (2009). The much more FTIR-compatible phase of the WFMD v2.0 retrievals relative to WFMD v1.0 is probably due to the use of improved methane and water vapor spectroscopy (Frankenberg et al., 2008a,b) and/or the use of an updated Carbon Tracker version (to correct the retrieved methane mole fractions for CO 2 seasonal variability), see Sect. 3.2 in Schneising et al. (2011) for a detailed discussion of the latter improvements.

Summary
We have developed a strategy for retrieval of atmospheric methane from mid-infrared solar absorption spectra with minimized seasonal bias <0.14 % and optimized precision <0.3 % (1-σ diurnal variation, 7-min integration). This optimum strategy is designated as MIR-GBM v1.0. If other, non-optimum micro window selections and/or spectroscopy selections are used, dominant systematic errors up to ≈5 % arise which are due to interference by water vapor and HDO. Because of this finding, our study was performed in parallel with data from 3 FTIR sites (Zugspitze, Garmisch, and Wollongong) located in differing climatic zones with strongly differing mean levels of precipitable water ranging from 0.2 mm to 44.9 mm for clear sky conditions. This spans the range of the humidity levels of all NDACC sites. We derived a concept for empirical estimation of the absolute interference error of a certain retrieval strategy. Performing a systematic study with 24 different retrieval strategies (8 different micro window selections and 3 different HITRAN versions) we found that the quality ranking of these 24 strategies with respect to the absolute interference error is the same for all three test sites. (Precision is only weakly impacted by the retrieval strategy.) This means that it does make sense, indeed, to agree upon one joint mid-IR retrieval strategy for all NDACC sites.
The cornerstones of the recommended retrieval strategy MIR-GBM v1.0 have been summarized in Table 8. The best available spectroscopy for the mid-infrared methane retrievals is currently HITRAN 2000 including the 2001 update release. Intriguingly, HITRAN 2004 and HITRAN 2008 lead to worse spectral residuals in our 5 candidate spectral micro windows. These spectral residuals are due to line parameter errors for methane, HDO, and H 2 O. However, even using HITRAN 2000, only 3 of the candidate micro windows are suitable for a retrieval (first, third, fifth, if indexed for increasing wave number). The other two micro windows (second and fourth) lead to significant H 2 O/HDO-CH 4 interference errors up to several per cent. For some retrieval strategies with moderate overall interference errors strong "internal tension" is observed; i.e. there are significant interference errors due to the individual micro windows, with opposite sign. These cancel out partly in the combined multi-window retrieval. In these cases strongly differing retrieval results from stand-alone retrievals with the individual micro windows are observed. Examples of such non-recommendable retrieval strategies are those using the first 4 micro windows together with HITRAN 2004 or 2008.
The recommended retrieval constraint is the robust Tikhonov L 1 scheme set up with an altitude-constant regularization strength for a state vector given in units of per cent of an a priori methane volume mixing ratio profile (taken from the WACCM model). The overall regularization strength is tuned to achieve optimum precision (minimum diurnal variation). Thereby a high precision of <0.3 % is achieved for all test sites. This is also the result of an innovative final quality selection of the retrievals, which does not use the usual threshold for the root-mean-square residual of the spectral fit but a threshold (0.15 %) for the ratio between the rms-residual and the information content (dofs) of the individual retrievals. Both quality measures show a similar zenith angle dependency, and thereby the elimination of too much data from winter is avoided, when signal-to-noise is worse than in summer but information content is higher. Another benefit is that the same quality threshold can be used for all NDACC site locations with differing mean zenith angles.
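The quality selection can be sketched in a few lines. This is a minimal illustration, assuming that the rms-residual is given in per cent and that retrievals with a ratio strictly below the 0.15 % threshold are kept; the function name and the three example retrievals are hypothetical.

```python
RATIO_THRESHOLD_PERCENT = 0.15   # threshold for rms-residual / dofs


def passes_quality_selection(rms_residual_percent, dofs,
                             threshold=RATIO_THRESHOLD_PERCENT):
    """Keep a retrieval if rms-residual divided by dofs is below the threshold."""
    return rms_residual_percent / dofs < threshold


# Hypothetical retrievals: rms-residual of the spectral fit (percent) and
# degrees of freedom for signal.
retrievals = [
    {"label": "summer", "rms": 0.20, "dofs": 1.6},     # ratio 0.125
    {"label": "winter", "rms": 0.30, "dofs": 2.2},     # ratio about 0.136
    {"label": "poor fit", "rms": 0.40, "dofs": 1.8},   # ratio about 0.222
]

accepted = [r["label"] for r in retrievals
            if passes_quality_selection(r["rms"], r["dofs"])]
print(accepted)
```

In this made-up example a plain rms threshold of, say, 0.25 % would have rejected the winter retrieval despite its higher information content, whereas the ratio criterion keeps it.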
Based on the retrieved total columns of methane, a method for harmonized calculation of column-averaged dry-air mole fractions (XCH 4 ) at all sites has been set up. It utilizes 4-times-daily information on pressure-temperature-humidity profiles from the National Center for Environmental Prediction (NCEP) interpolated to the time of the FTIR measurement.
Finally, as stated above, due to the strong seasonality of water vapor, related interference effects potentially introduce errors in the methane seasonality, in particular for high-humidity sites. With the retrieval strategy MIR-GBM v1.0 we could eliminate H 2 O/HDO-interference errors down to the <0.14 % level according to our empirical interference error analysis. To double check this result we have investigated the seasonality of XCH 4 retrieved with the Zugspitze FTIR with MIR-GBM v1.0 in some detail. The outcome is a minus-sine-type seasonality with an amplitude of 16.2 ± 2.9 ppb (0.94 ± 0.17 %). Comparison to newest-generation SCIAMACHY satellite retrievals (WFMD v2.0) for the northern hemisphere showed very good agreement in amplitude and phase.
Another outcome of our paper is that it has laid the cornerstone for obtaining a minimized station-to-station bias, i.e. improved relative accuracy for methane retrievals of the NDACC network. A considerable improvement became an obvious need after a study by Dils et al. (2006) had shown unacceptable numbers for the quality of NDACC methane retrievals, which induced Bergamaschi et al. (2007) to comment that "the precision of the mid-infrared FTIR measurements of 3 % and the relative accuracy of 7 % is significantly below the precision and (relative) accuracy targets of <1-2 % of SCIAMACHY measurements". The problem in the Dils et al. (2006) study was the strongly differing retrieval strategies used by the participating groups from different stations (e.g. inconsistent HITRAN versions and priors). Therefore, we have demonstrated in our paper how to implement a harmonized retrieval strategy comprising one common spectroscopic line list, one consistent source of prior information, one regularization approach, one common source of pressure-temperature information, one set of to-be-retrieved interfering species for all stations, and one common quality selection approach. We have described and applied such a harmonized retrieval strategy to the 3 test sites Wollongong, Garmisch, and Zugspitze in this paper. The benefit of these measures with respect to improved station-to-station accuracy is currently being quantified (see end of next section).

Outlook
One outcome of this paper is that improved spectroscopic parameters for methane, HDO, and H 2 O in the 2613-2922 cm⁻¹ spectral domain are urgently needed. If such parameters become available in the future, the concept for empirical interference-error quantification suggested in this paper could be applied again and our micro-window selection be updated. This could open the possibility that all 5 of our tested candidate micro windows can be used in future. This would lead to another improvement in information content and precision compared to the current version MIR-GBM v1.0.
In the short comment to our paper by F. Hase it has been argued that using different a priori profiles for the joint scaling of HDO and H 2 O (namely from a pre-determination on a daily basis via an independent HDO and H 2 O profile fit using other micro windows) impacts the retrieved CH 4 columns compared to our setup using a fixed-shape a priori profile both for H 2 O (from WACCM) and HDO (US Standard Atmosphere). There are two different effects which may be responsible for such a finding. (i) Pre-determination of HDO and H 2 O profiles may help to reduce part of the interference error, namely the one from propagation of smoothing errors due to the variability of the interfering species to the target species, as described in detail by Sussmann and Borsdorff (2007). However, such improvement is the dominant effect only in the limiting case that spectroscopy for the interfering species is perfect, the kernels are ideal, and/or the a priori does not differ from the retrieval. If there is a spectroscopic inconsistency between the interfering H 2 O (HDO) line(s) and the H 2 O (HDO) line(s) used for pre-determination, the pre-determination will increase the systematic part of the interference error, because the local residuum of the interfering species around the target line will be increased and mis-interpreted by the target species retrieval. Unfortunately, this effect is difficult to quantify because the error analysis of the pre-determination would have to be propagated through the subsequent methane retrieval. (ii) Additionally, the difference between the HDO and H 2 O profiles (delta-D) varies with season (because depletion correlates with the H 2 O amount), while our scaling retrieval of the interfering species uses fixed-shape a priori profiles. We conjecture that the error from effect (i) dominates over the one from effect (ii) because our paper has proven a relative accuracy on the ≈0.1 % level using fixed a priori profiles. Anyway, a high-quality, seasonally varying delta-D a priori profile may lead to improvements. However, for future tests independent (non-FTIR) information on delta-D should be utilized in addition; first of all to estimate the magnitude of the systematic interference error from an FTIR pre-determination (effect (i)), but also because retrievals of H 2 O and HDO profiles from FTIR spectra have been performed by only a few groups, and cannot be exploited for an NDACC standard approach at present.
Future tests may also include a refined method of fitting interfering species which show signatures in different micro windows. The standard retrieval software used within the community allows a joint fit of a certain interfering species in multiple micro windows, but this fit is linked between the different micro windows. However, as spectroscopic inconsistencies between micro windows exist, an independent fit per micro window may be advantageous. This option should therefore be implemented into the standard code as a basis for future tests.
We were able to show that NDACC methane retrievals are possible with a precision of <0.3 % for a 7-min integration time. This is only about a factor of 2 lower compared to TCCON retrievals if compared for the same integration time. This is an unexpectedly good result. Our finding that a Tikhonov L 1 methane profile retrieval can be tuned to optimize precision contributes to this result. It would be interesting to test this approach on TCCON retrievals as well (TCCON retrievals are hitherto based on scaling of a volume mixing ratio profile).
The paper described a harmonized methane retrieval strategy for multiple stations. The benefit of this strategy with respect to an improved station-to-station accuracy for the NDACC network is currently being quantified using the TCCON network as an inter-calibration standard (TCCON stations are inter-calibrated via aircraft measurements). This study is performed for a subset of multiple stations that perform NDACC- and TCCON-type measurements at the same time (Forster et al., 2011). First results of this study show that a station-to-station accuracy of the order of 0.5 % is the result of the harmonized retrieval strategy described in this paper. An outcome of this study will also be an inter-calibration of the absolute XCH 4 levels of the NDACC MIR-GBM v1.0 retrievals to TCCON data. This will be the basis for a possible joint exploitation of NDACC and TCCON data with the potential for extended trend analyses (NDACC operations started about 15 years before TCCON) and an improved global coverage by stations of the (joint) ground-based FTIR network for satellite validation and the inversion of sources and sinks.

Retrieved CH 4 profiles and averaging kernels
Figure A1 shows ensembles of retrieved methane vmr profiles using two differing micro window sets, i.e. MW (12345) (average dofs = 1.94) and MW (135) (average dofs = 1.80); HITRAN 2000 has been used in both cases. Note there is no obvious difference in profile shapes. However, what cannot be seen from this figure is that the total columns derived with the HIT00 MW (12345) retrieval strategy are significantly impacted by H 2 O/HDO-CH 4 interference errors (absolute interference error proxy = −0.86 %, see Table 5 and Sect. 2.7.3) while our recommended strategy HIT00 MW (135) is practically interference-free (absolute interference error proxy = −0.1 %).

Fig. 1. Spectral contribution plot for a solar zenith angle of 65°. Solid lines are for a H 2 O column of 44.9 mm (Wollongong maximum); dashed lines correspond to a H 2 O column of 0.2 mm (Zugspitze minimum).

Fig. 3. Optimizing regularization strength α of Tikhonov L 1 retrievals of CH 4 (using a diagonal measurement covariance with a signal-to-rms-noise ratio of 500) via a test ensemble of all Garmisch year 2007 measurements. (a) Mean L-curve, i.e. goodness of fit (χ²) as a function of α. The residual term within χ² is the overall rms-residual from the spectral fit and the noise term within χ² is calculated from the wave number interval 2615.25-2615.40 cm⁻¹. (b) Second derivative (curvature) of the L-curve. (c) Mean diurnal variation (1σ) as a function of α. Corresponding numbers for the information content (dofs) are indicated.

Fig. 4. Quality selection criteria and thresholds for goodness of fit (χ²) and spectral quality (rms-noise) relative to information content (dofs). Data points are for 5 years of measurements.

Fig. 5. (a) Upper trace: time series of spectral rms-noise calculated from the out-of-band 2615.25-2615.40 cm⁻¹ wave number interval for the Zugspitze site; lower trace: resulting degrees of freedom for signal (dofs) from fitting micro windows 1, 3, and 5 using HITRAN 2000. (b) Ratio of spectral residuals and dofs. Red line: quality selection threshold.

Fig. 7. Investigation of the seasonal artifact shown in Fig. 6: ratio of about one year of methane retrievals with two differing retrieval strategies as in Fig. 6, now plotted as a function of HDO column level for the 3 test sites (HDO columns are from the joint HDO retrieval of the HIT00 MW (135) run).

Fig. 8. Ratio of Garmisch year-2007 retrievals of columnar methane with one MW out of 5 dropped (a)-(e) versus retrieval with all 5 MWs plotted as a function of HDO column level. HITRAN 2000 was used in all cases. The definition of total relative interference error and overall bias is indicated.

Fig. 10. Intercomparison of integrated water vapor (IWV) above the Zugspitze retrieved from the solar FTIR with the strategy of Sussmann et al. (2009) versus integrated water vapor profiles from NCEP. Four-times-daily NCEP profiles were interpolated to the times of the FTIR measurements.
Fig. A1. (a) Ensembles of retrieved methane profiles using two differing micro window sets, i.e. MW (12345) and MW (135); HITRAN 2000 has been used in both cases. (b) Corresponding total-column averaging kernels.
HITRAN 2000 including the official April 2001 update release on CH 4 and H 2 O. *

Table 2. Test sites of this study and corresponding range of integrated water vapor according to National Center for Environmental Prediction data selected for clear-sky days (FTIR measurement conditions). 1 mm corresponds to 3.345 × 10²¹ cm⁻².

and micro-windows, see Table 4 below). Since the magnitude of the H 2 O/HDO-CH 4 interference errors is expected to depend also on the overall humidity level, all test runs will be performed for data from the 3 different NDACC-FTIR sites Wollongong, Garmisch, and Zugspitze in parallel. As Table

Table 3. Interfering species that have to be taken into account in the 5 candidate micro windows (MW, defined in Table 1).

Table 4. Effect of dropping individual micro windows using data from 3 test sites with strongly differing humidity levels; impact on information content (dofs), interference errors, and diurnal variation. The HITRAN 2000 line parameters compilation was used.
a Diurnal variation of individual days (1σ), averaged over all days of full time series. b "Relative interference error", defined as HDO/H 2 O-CH 4 interference error relative to MW (12345), see Fig. 8; uncertainties in brackets are for 95 % confidence. c "Absolute interference error", defined as the negative of the sum over the rel. IF errors (row above). d Bias rel. to MW (12345), see Fig. 8.

Table 5. Effect of dropping individual micro windows using 3 different HITRAN versions for the test station Garmisch; impact on information content (dofs), interference errors, and diurnal variation.
a Diurnal variation of individual days (1 σ), averaged over all days of the full time series. b "Relative interference error", defined as HDO/H 2 O-CH 4 interference error relative to MW (12345), see Fig. 8; uncertainties in brackets are for 95 % confidence. c "Absolute interference error", defined as the negative of the sum over the rel. IF errors (row above). d Bias rel. to MW (12345), see Fig. 8.

Table 6. Comparison of the HIT08 MW (12345) and HIT08 MW (1234) retrieval strategies versus the recommended strategy HIT00 MW (135). Numbers are for Garmisch (standard font) and Wollongong (bold). Use of HIT08 MW (12345) is out of the question (Wollongong absolute interference error −5.22 %), and use of the HIT08 MW (1234) strategy is also strongly discouraged because of (i) high absolute interference errors (e.g. −0.82 % for Wollongong) and (ii) strong "internal tension" (strong rel. interference error contributions from differing micro windows with opposite sign, e.g. −0.87 % versus +1.18 % for Wollongong).

Table 7. Four validation cases using two independent ways of estimating relative interference errors. The discrepancy is a measure of the accuracy of the method of estimating absolute interference errors. For details see text.

Table 8. Optimum strategy for retrieval of methane from mid-infrared solar spectra designated as MIR-GBM v1.0.

Table 10. Parameters describing the seasonality of column-averaged mole fractions of methane.
a Defined as (max − min)/2 of the 2nd-order Fourier fit to multi-annual monthly means of the de-trended time series, see Fig. 11. b Describing the fitted intra-annual function a_1 cos(2πt) + a_2 sin(2πt) + a_3 cos(4πt) + a_4 sin(4πt), see Fig. 11. c Retrieved with MIR-GBM v1.0, defined in Table 8. d Error for amplitude calculated from combining the standard errors (σ/sqrt(n)) for the minimum (using March and April individual-year monthly means) and the maximum (August-October monthly means).
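The seasonality parameters in the Table 10 footnotes can be reproduced in outline with an ordinary least-squares fit. The following is a minimal sketch and not the authors' code: the function names, the mid-month time grid, and the synthetic coefficients are assumptions for illustration, and t is taken as the fraction of the year. The amplitude error described in the last footnote is not covered.

```python
import numpy as np


def fourier_design(t_years):
    """Design matrix for a1 cos(2 pi t) + a2 sin(2 pi t) + a3 cos(4 pi t) + a4 sin(4 pi t)."""
    t = np.asarray(t_years, dtype=float)
    return np.column_stack([
        np.cos(2 * np.pi * t),
        np.sin(2 * np.pi * t),
        np.cos(4 * np.pi * t),
        np.sin(4 * np.pi * t),
    ])


def fit_second_order_fourier(t_years, monthly_means):
    """Least-squares estimate of (a1, a2, a3, a4) from de-trended monthly means."""
    coeffs, *_ = np.linalg.lstsq(
        fourier_design(t_years), np.asarray(monthly_means, dtype=float), rcond=None
    )
    return coeffs


def seasonal_amplitude(coeffs, n_grid=3651):
    """Amplitude defined as (max - min)/2 of the fitted curve over one year."""
    t = np.linspace(0.0, 1.0, n_grid)
    curve = fourier_design(t) @ np.asarray(coeffs, dtype=float)
    return 0.5 * (curve.max() - curve.min())


# Synthetic check with hypothetical coefficients (arbitrary units):
# mid-month times as fraction of the year, noise-free monthly means.
t_months = (np.arange(12) + 0.5) / 12.0
true_coeffs = np.array([3.0, -2.0, 0.5, 1.0])
monthly_means = fourier_design(t_months) @ true_coeffs

fitted = fit_second_order_fourier(t_months, monthly_means)
amplitude = seasonal_amplitude(fitted)
```

The fit contains no constant term, matching the function quoted in the footnote; this presumes that the de-trended monthly means have zero mean.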