Validation of SCIAMACHY HDO/H2O measurements using the TCCON and NDACC-MUSICA networks

Measurements of the atmospheric HDO/H2O ratio help us to better understand the hydrological cycle and improve models to correctly simulate tropospheric humidity and therefore climate change. We present an updated version of the column-averaged HDO/H2O ratio data set from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY). The data set is extended with 2 additional years, now covering 2003–2007, and is validated against co-located ground-based total column δD measurements from Fourier transform spectrometers (FTS) of the Total Carbon Column Observing Network (TCCON) and the Network for the Detection of Atmospheric Composition Change (NDACC, produced within the framework of the MUSICA project). Even though the time overlap among the available data is not yet ideal, we determined a mean negative bias in SCIAMACHY δD of −35 ± 30 ‰ compared to TCCON and −69 ± 15 ‰ compared to MUSICA (the uncertainty indicating the station-to-station standard deviation). The bias shows a latitudinal dependency, being largest (∼ −60 to −80 ‰) at the highest latitudes and smallest (∼ −20 to −30 ‰) at the lowest latitudes. We have tested the impact of an offset correction to the SCIAMACHY HDO and H2O columns. This correction leads to a humidity- and latitude-dependent shift in δD and an improvement of the bias by 27 ‰, although it does not lead to an improved correlation with the FTS measurements nor to a strong reduction of the latitudinal dependency of the bias. The correction might be an improvement for dry, high-altitude areas, such as the Tibetan Plateau and the Andes region. For these areas, however, validation is currently impossible due to a lack of ground stations. The mean standard deviation of single-sounding SCIAMACHY–FTS differences is ∼ 115 ‰, which is reduced by a factor of ∼ 2 when we consider monthly means.
When we relax the strict matching of individual measurements and focus on the mean seasonalities using all available FTS data, we find that the correlation coefficients between SCIAMACHY and the FTS networks improve from 0.2 to 0.7–0.8. Certain ground stations show a clear asymmetry in δD during the transition from the dry to the wet season and back, which is also detected by SCIAMACHY. This asymmetry points to a transition in the source region temperature or location of the water vapour and shows the added information that HDO/H2O measurements provide when used in combination with variations in humidity.


Introduction
The hydrological cycle plays a key role in the uncertainty of climate change. More specifically, water vapour plays an important role, since it is the strongest natural greenhouse gas.

Published by Copernicus Publications on behalf of the European Geosciences Union.
Water vapour is involved in positive feedback mechanisms (Soden et al., 2005; Randall et al., 2007) as well as in cloud formation processes of which the feedback mechanisms are still poorly understood (Boucher et al., 2013). Observations of water vapour isotopologues, such as HDO, can be used to improve our understanding, as phase changes leave distinct isotopic signatures in the water vapour (Dansgaard, 1964; Craig and Gordon, 1965). Improving global general circulation models (GCMs) to correctly capture these signatures will lead to a better representation of the many interacting processes that control tropospheric humidity (Jouzel et al., 1987), which will eventually lead to more robust climate projections.
Global satellite measurements, as well as ground-based Fourier transform spectrometer (FTS) measurements of HDO, have become available in recent years. The measurements of HDO are normally expressed as a ratio of the HDO abundance to the abundance of the main isotopologue H2^16O (from here on referred to as H2O). We use the standard "delta notation" for the fractionation of column-averaged HDO relative to Vienna Standard Mean Ocean Water (VSMOW):

$$\overline{\delta D} = \left( \frac{\mathrm{VCD(HDO)} / \mathrm{VCD(H_2O)}}{R_s} - 1 \right) \times 1000\ \text{‰},$$

in which VCD stands for the vertical column density and R_s = 3.1152 × 10^−4 is the HDO abundance of VSMOW (Craig, 1961). The bar in the notation indicates that it represents a column-averaged value of δD. In the remainder of this work, however, the column averaging is always implied and we will simply use the notation δD.
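As a minimal sketch of the delta notation, the following Python snippet computes a column-averaged δD from a pair of vertical column densities. The function name `delta_d` and the example columns are hypothetical; only the VSMOW abundance R_s is taken from the text.

```python
R_S = 3.1152e-4  # HDO abundance of VSMOW, as quoted in the text


def delta_d(vcd_hdo, vcd_h2o, r_s=R_S):
    """Column-averaged deltaD in per mil from HDO and H2O vertical column densities."""
    return ((vcd_hdo / vcd_h2o) / r_s - 1.0) * 1000.0


# Hypothetical columns (molec cm^-2): the HDO/H2O ratio is set 10 % below VSMOW
vcd_h2o = 5.0e22
vcd_hdo = 0.9 * R_S * vcd_h2o

print(round(delta_d(vcd_hdo, vcd_h2o), 1))
```

A ratio equal to R_s gives δD = 0 ‰, and a column without any HDO gives the lower limit of −1000 ‰.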
Certain measurements directly retrieve δD, as well as the separate total columns of HDO and H2O, while other measurements retrieve HDO and H2O separately and calculate δD a posteriori. The first satellite retrievals of HDO (and a posteriori calculated δD) were taken by the Interferometric Monitor for Greenhouse gases (IMG, Zakharov et al., 2004), followed by the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY, Frankenberg et al., 2009). The first direct retrievals of δD were taken by the Thermal Emission Spectrometer (TES, Worden et al., 2007), followed by the Infrared Atmospheric Sounding Interferometer (IASI, Herbin et al., 2009; Schneider and Hase, 2011; Lacour et al., 2012; Wiegele et al., 2014). From the ground, very precise measurements of atmospheric HDO are obtained with FTS instruments, organized in networks such as the Total Carbon Column Observing Network (TCCON, Wunch et al., 2011, retrieval of HDO and H2O among other trace gases) and the Network for the Detection of Atmospheric Composition Change (NDACC, formerly the Network for the Detection of Stratospheric Change, Kurylo and Solomon, 1990, retrieval of δD, HDO and H2O among other trace gases). These different observations greatly complement each other, as the satellite measurements have near-global coverage but at relatively low precision and accuracy, while the FTS measurements have higher precision and accuracy but only for about 20–30 locations (and often fewer for older periods of coinciding satellite measurements). First comparisons between measurements of the ratio HDO/H2O in the middle-to-lower troposphere and GCMs have been performed and show the great potential for long time series of tropospheric HDO/H2O ratio measurements to help in the improvement of the models and to gain a better understanding of the hydrological cycle (Risi et al., 2010, 2012a, b; Yoshimura et al., 2011).
With new satellite HDO data sets available, such as from the Greenhouse gases Observing Satellite (GOSAT, Frankenberg et al., 2013; Boesch et al., 2013), or currently being developed, such as from the Tropospheric Monitoring Instrument (TROPOMI, Veefkind et al., 2012), long and global time series of tropospheric δD are becoming a reality. It is crucial, however, that these data sets are properly validated to ensure consistency and intercomparability and to better identify their potential advantages and shortcomings. In this respect it is also important to demonstrate that δD adds information that is not contained in H2O measurements alone and that the satellite data sets are capable of adding this information. A clear example of such activities is the project MUSICA (MUlti-Platform remote Sensing of Isotopologues for investigating the Cycle of Atmospheric water), which consists of a ground-based (NDACC) and a space-based (IASI) remote sensing component and aims at providing quasi-global and homogeneous tropospheric H2O and δD data (Schneider et al., 2012). Recent MUSICA efforts include extensive theoretical data characterization, continuous in situ measurements, aircraft campaigns and comparisons between ground- and space-based data (Schneider et al., 2015; Wiegele et al., 2014).
Here, we present an updated version of the SCIAMACHY δD data set. The European Space Agency's Environmental Research Satellite (ENVISAT) mission, with the SCIAMACHY instrument on board, ended on 8 April 2012. The original SCIAMACHY δD data set, however, covers the years 2003–2005 (Frankenberg et al., 2009). The next available global δD data set, with a high sensitivity in the lower troposphere, is from GOSAT, which started measurements in 2009 (Frankenberg et al., 2013; Boesch et al., 2013). As a first step in our effort to acquire longer and overlapping time series of δD, including a better data characterization, we have expanded the SCIAMACHY δD time series with 2 additional years to cover the period 2003–2007. Although this time series is not long enough to reach overlap with GOSAT, it is long enough to allow for sufficient measurements coinciding with data from ground-based FTS stations to perform a first validation and to study if the measurements of HDO add information to the measurements of H2O alone. Assuming the ground-based FTS measurements remain consistent, they could yield a transfer between SCIAMACHY and GOSAT.
The updated SCIAMACHY δD data set is introduced in Sect. 2, and in Sect. 3 we describe the TCCON and NDACC ground-based FTS data sets that we use as the baseline for our validation study. The spatial and temporal co-location algorithm is explained in Sect. 4. In Sect. 5 we present time series of the differences between the SCIAMACHY and ground-based δD measurements and we discuss the bias, standard deviation and other statistics. Mean seasonalities in δD, the relationship between δD and humidity and the added information of δD are discussed in Sect. 6. We provide a summarizing discussion and the conclusions of our work in Sect. 7.

Atmos. Meas. Tech., 8, 1799–1818, 2015, www.atmos-meas-tech.net/8/1799/2015/

Retrieval algorithm description
The SCIAMACHY HDO/H2O data set and its retrieval algorithm were first described in Frankenberg et al. (2009). Scheepmaker et al. (2013) further improved the original data set and incorporated updated water spectroscopy. The retrieval is based on nadir short-wave infrared spectra in the 2.3 µm range (channel 8 of SCIAMACHY). In a microwindow ranging from 4212 to 4248 cm^−1 (2354 to 2374 nm) the total vertical columns of the species H2O, HDO, H2^18O, CO and CH4 are fitted simultaneously. The retrieval approach follows Rodgers (2000) and uses an iterative maximum a posteriori (IMAP) algorithm, similar to the SCIAMACHY CH4 retrievals (Frankenberg et al., 2005, 2011). No atmospheric scattering or clouds are taken into account, which means that we need to filter for clouds using a posteriori filter criteria (described below). Since the final product consists of the ratio of the retrieved total columns of HDO and H2O, certain light-path modifications due to scattering and remaining instrumental effects that modify the HDO and H2O columns in similar ways are expected to cancel out (Boesch et al., 2013). SCIAMACHY observes backscattered sunlight that has passed through the atmosphere at least twice, which results in a high sensitivity to water vapour in the lower troposphere where the concentration is highest. The spatial footprint of a single measurement is 120 × 30 km, with a local overpass time of 10:00 a.m. at the equator (mean local solar time).
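The correspondence between the wavenumber and wavelength limits of the retrieval microwindow can be checked with a short conversion. The helper name below is hypothetical; the conversion itself is the standard relation λ [nm] = 10^7 / ν [cm^−1].

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1


# Microwindow limits quoted in the text
for wavenumber in (4212.0, 4248.0):
    print(wavenumber, round(wavenumber_to_wavelength_nm(wavenumber)))
```

Note that the lower wavenumber limit maps to the upper wavelength limit and vice versa.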
As mentioned earlier, the values for δD are calculated a posteriori from the retrieved HDO and H2O columns. Although the output of the retrieval consists of total HDO and H2O columns, the algorithm assumes five retrieval layers (equidistant in pressure), of which only the bottom layer is used to fit the partial column density. Measured variations in the HDO and H2O concentrations higher in the atmosphere (where their concentration is low) will be ascribed to the lowest layer. This so-called "bottom-scaling" approach ensures that the sensitivities to real changes in atmospheric HDO and H2O are comparable in the lower troposphere. We will show this in more detail in terms of the total column averaging kernels in Sect. 2.3 below, where we also show the sensitivity to variations in the choice of retrieval layers. Differences between the HDO and H2O retrieval sensitivities at higher layers, however, as well as cross-dependencies between HDO and H2O (e.g. how actual atmospheric H2O affects retrieved HDO and vice versa), could introduce issues in the derived δD if not properly corrected for by use of the full averaging kernels (Schneider et al., 2012). Unfortunately, we did not save the full averaging kernels (including the cross-correlations) due to storage limitations. It is therefore possible that the derived δD not only reflects actual atmospheric δD but partly also actual atmospheric H2O. Besides these cross-dependency effects, which act on a layer-by-layer basis, we also expect some effects from taking column-averaged values. For example, we could observe some variability in column-averaged δD, even if the vertical distribution of δD were to remain constant, due to variability in the vertical distribution of H2O. These caveats have to be considered throughout this work (and will be addressed in a future study). These caveats also apply to the TCCON data, which will be described in Sect. 3, but not to the MUSICA data (which have been corrected for cross-dependencies using the full averaging kernels).

Updated data product
The new HDO/H2O product, from here on referred to as IMAP v2.0, has been updated with respect to the original product (IMAP v1.0, including the updates described in Scheepmaker et al., 2013). First of all, we have used consolidated level 1b files from the Instrument Processing Facility version 7.04-W. Compared to the previous version 7.03-U, the updated files incorporate improved auxiliary files, fixing incorrect in-flight calibration information. The auxiliary information is used as input in the nadc_tools instrument calibration software package, developed at SRON, which was also updated with some minor bug fixes. The nadc_tools package is used for dark current corrections of the data (among other calibrations).
The output format of the HDO/H2O product has been changed to the more user-friendly netCDF4 format and the filter criteria have been updated. Measurements that satisfy the following criteria are deemed suitable for further study and have received a quality flag (QF) of 1: The first two criteria act as a simple cloud filter. The subscript "apriori" refers to the vertical column densities derived from the modelled a priori input profiles from the European Centre for Medium-range Weather Forecasts (ECMWF, for H2O) or the Chemistry Transport Model TM4 (for CH4; Meirink et al., 2006). If more strict (cloud) filtering is preferred, we suggest users add the following criterion: For a description of the other fields included in the netCDF files we refer the reader to the product specification document that is provided with the data set. The SCIAMACHY HDO/H2O ratio product can be requested via www.sciamachy.org.
The extension of the data product beyond the original 2003–2005 time period is complicated due to various instrument issues with the channel 8 detector (Gloudemans et al., 2005). First of all, there is a growing ice layer on the detector window, affecting the instrument spectral line shape (ISLS) in a variable manner. Second, the detectors, particularly those used for the SCIAMACHY channels 6+ to 8, are degrading with time. The indium-gallium-arsenide (InGaAs, EPITAXX, New Jersey) detector material of channels 6+ to 8 is doped with higher amounts of indium, which leads not only to an increased level of infrared sensitivity but also to a higher dark current and susceptibility to radiation damage (Kleipool et al., 2007). This radiation damage can manifest itself in different ways. We see a general increase with time in the number of pixels that suffer from higher noise levels and in the occurrence of dark currents that start to alternate between different modes (i.e. random telegraph signal, or RTS, resulting in bad pixels). Some pixels show a combination of higher noise and RTS. Other pixels become completely unresponsive and are labelled dead. These degradation issues are especially problematic for channel 8, due to the intrinsic low signal-to-noise ratio in this channel.
Nevertheless, we have extended the δD data product through 2007 by using the same fixed dead/bad pixel mask as used for 2003–2005 and a description of the ISLS that accounts for variability due to the ice layer between 2003 and 2007 (both taken from Frankenberg et al., 2009). Extension of the data set to the end of the ENVISAT mission in 2012 would require a detailed analysis of the detector pixel degradation and ice layer in our retrieval window, which is not within the scope of the current validation study. By extending the data set through 2007, however, we remain fairly consistent with respect to the original data set, while simultaneously creating enough overlap with ground-based FTS data to allow for a validation. This validated data set can serve as the baseline for more elaborate future extensions.
In Fig. 1 we show a world map of the 2003–2007 averaged δD distribution (top, QF = 1). The new v2.0 data set shows the same global δD patterns as the previous data set: a latitudinal gradient with the highest values at the equator and decreasing values towards the poles, a continental gradient with δD decreasing further inland (best visible in North America) and an altitude gradient, with δD decreasing over high mountain ranges, such as the South American Andes or the Himalayas. A few small-scale features are more pronounced in IMAP v2.0, e.g. the depletion patterns above Saharan mountain ranges, such as the Tibesti mountains in northern Chad and the Hoggar mountains in Algeria. Also, Lake Victoria in tropical Africa shows a clear depletion pattern amidst the enhanced δD values of the surrounding rain forest (which are due to non-fractionating transpiration of the vegetation). Note that the measurements above oceans rely on light reflected off low-level clouds or sunglint, as the ocean itself is too dark in the near-infrared.

In the bottom panel of Fig. 1 we show the mean single sounding error. The standard error in the mean δD (not shown) decreases with the number of measurements (N) as 1/√N and is only a few per mil for the 2003–2007 average. The mean single sounding error, however, is close to 80 ‰ for most areas and increases to ∼ 100 ‰ above very high and arid regions (such as the Tibetan Plateau and Greenland) or towards very high latitudes, where the signal becomes very low due to high solar zenith angles. Similar to this spatial variability, there is also a temporal variability in the mean single sounding error. Above a fixed location, the driest time periods will have the lowest H2O and HDO signals and therefore the largest single sounding errors. This shows that one has to be careful when applying noise estimates in the filtering or weighting of the data, as spatial or temporal sampling biases against the lowest HDO signals (highest depletions) might be introduced. Therefore, we use non-weighted averages and do not use any error estimates in the filtering of the data.
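The two points made here, the 1/√N scaling of the standard error and the sampling bias that noise weighting can introduce, can be illustrated with a small sketch. The 80 ‰ single-sounding error is the typical value quoted in the text; the numbers of measurements and the two-scene example (a dry, strongly depleted scene with a larger error and a moist scene with a smaller error) are hypothetical values chosen for illustration only.

```python
import math


def standard_error(single_sounding_error, n):
    """Standard error of the mean for n independent soundings with equal error."""
    return single_sounding_error / math.sqrt(n)


# Typical single-sounding error from the text (per mil); the N values are hypothetical
for n in (1, 100, 1600):
    print(n, standard_error(80.0, n))

# Hypothetical two-scene example: the dry scene is more depleted and noisier
delta_d = [-250.0, -150.0]   # per mil
errors = [100.0, 80.0]       # per mil

unweighted = sum(delta_d) / len(delta_d)
weights = [1.0 / e**2 for e in errors]  # inverse-variance weights
weighted = sum(w * d for w, d in zip(weights, delta_d)) / sum(weights)

print(unweighted, round(weighted, 1))
```

In this toy case the inverse-variance weighted mean is less depleted than the unweighted mean, because the weights favour the moist scene. This is the kind of bias against the lowest HDO signals that unweighted averaging avoids.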

Dependency on prior assumptions
The retrieval algorithm uses variable a priori information from ECMWF for the profiles of H2O. The a priori HDO profile is derived from the H2O profile by assuming a depletion of δD = −100 ‰ at the lowest layer, increasing to δD = −500 ‰ at the highest layer. Even though the prior information is variable, the top panel of Fig. 2 shows that the prior column-averaged δD is roughly constant at −150 ‰. The small variations in δD can be explained by small variations in the scale height of H2O. A larger scale height, for example, gives more weight to the higher, and therefore more depleted, HDO layer, resulting in a slightly lower prior column-averaged δD. This also illustrates the issue mentioned in Sect. 2.1 of how variations in the vertical distribution of H2O can affect column-averaged δD even though the δD profile remains the same. The bottom panel of Fig. 2 shows how the prior δD has been changed by the retrieval. The retrieval of H2O and HDO introduces strong variations in the a posteriori calculated δD, which are uncorrelated to the prior information. This shows that the variations in a posteriori δD contain new information and are not simply the result of variations in the a priori H2O in combination with the fixed depletion profile. Additionally, we have tested that the a posteriori calculated δD is not very sensitive to the choice of a priori δD depletion profile by assuming a constant HDO profile with zero depletion. This resulted in some additional scatter in the a posteriori δD but without introducing any bias with respect to the standard retrieval (not shown).
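The dependence of the prior column-averaged δD on the H2O scale height can be sketched with a toy five-layer profile. Because the column-averaged ratio is the sum of the HDO partial columns divided by the sum of the H2O partial columns, the column-averaged δD equals the H2O-weighted mean of the layer δD values. Everything else in the sketch is an assumption made for illustration: the layer altitudes, the exponential H2O profile, and the linear interpolation of the depletion between −100 ‰ and −500 ‰ are not taken from the retrieval itself.

```python
import math

# Hypothetical mid-layer altitudes (km) for five layers equidistant in pressure
LAYER_ALTITUDES_KM = [0.8, 2.7, 5.2, 9.0, 17.3]
# Prior depletion per layer (per mil): -100 at the bottom, -500 at the top;
# the linear interpolation in between is an assumption
LAYER_DELTA_D = [-100.0, -200.0, -300.0, -400.0, -500.0]


def prior_column_delta_d(scale_height_km):
    """H2O-weighted mean of the layer deltaD values for an exponential H2O profile."""
    weights = [math.exp(-z / scale_height_km) for z in LAYER_ALTITUDES_KM]
    weighted_sum = sum(w * d for w, d in zip(weights, LAYER_DELTA_D))
    return weighted_sum / sum(weights)


for scale_height in (1.5, 2.0, 2.5):
    print(scale_height, round(prior_column_delta_d(scale_height), 1))
```

A larger scale height shifts weight to the higher, more depleted layers and therefore lowers the column-averaged δD, even though the δD profile itself is unchanged.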
Figure 3 (top panel) shows the impact in terms of total column averaging kernels of a bottom-scaling retrieval relative to a profile-scaling retrieval. A profile-scaling retrieval keeps the shape of the a priori H2O and HDO profiles constant by fitting a single scaling factor that applies to all five layers. This typically results in an H2O column averaging kernel being > 1.0 in the bottom layer, meaning that the retrieval is too sensitive to real atmospheric H2O variations in this layer. The bottom-scaling retrieval, however, ascribes all measured H2O variations to the bottom layer and is therefore perfectly sensitive to real atmospheric H2O variations in this layer (i.e. the averaging kernel is 1.0 in this layer). Since HDO is a much weaker absorber than H2O, the HDO column averaging kernels are practically equal for both scaling approaches and show almost no variation with altitude.
The bottom panel of Fig. 3 shows the impact of a profile-scaling retrieval relative to a bottom-scaling retrieval in terms of the a-posteriori-derived δD. Overall, the retrieval is quite robust against this change in the layering setup: the highest density of data points is found on the one-to-one line. The few outliers with very large shifts in δD are scenes where both retrieval setups retrieve H2O columns lower than the a priori column (not shown). The > 1.0 averaging kernel for the lowest layer of the profile-scaling setup explains why this approach leads to an even lower H2O column compared to bottom-scaling (while the HDO columns remain similar). This translates into a positive shift in the ratio HDO/H2O. This is another example of how differences in the retrieval sensitivity of HDO and H2O can impact the derived δD. By using the bottom-scaling approach in our retrieval algorithm we limit this impact, since the averaging kernels for HDO and H2O are very similar in the bottom layer where most of the water vapour resides.

Figure 3. Top: typical column averaging kernels of a bottom layer scaling retrieval (blue) and a profile scaling retrieval (magenta). Bottom: impact of the profile scaling approach with respect to a bottom layer scaling approach for a month of measurements above the Sahara. Both approaches assumed no prior depletion profile for δD (the red triangle shows the a priori δD of 0 ‰).

Offset correction
An offset in the humidity total column, i.e. the total column that would be measured under perfectly dry conditions, will lead to a humidity-dependent shift in δD (since a constant offset has a relatively larger impact on smaller total columns). Multiplicative correction factors for the H2O and HDO columns, however, will lead to a constant shift in δD, which is easier to correct. Correcting for an offset in the H2O and HDO columns will have a similar impact on the data set as applying a noise filter (namely larger shifts in δD for drier areas), except that it will not introduce sampling biases, since all measurements are corrected and none are rejected.
Since the existence of humidity-dependent biases can potentially hamper the correct use of the δD data set, we have tested if the SCIAMACHY δD retrievals are affected by offsets in the H2O and HDO columns and what the impact of a correction for these offsets would be on the validation of δD. Frankenberg et al. (2013) found offsets in the H2O and HDO total columns from GOSAT, significantly affecting the δD retrievals. Similar to their method for estimating the offset above Antarctica, we estimate the offset by selecting all 2003–2007 data in a box above Greenland (60–90° latitude, −15 to −60° longitude). Like Antarctica, Greenland is a high-altitude region at high latitude, which results in very dry total columns. In Fig. 4 we show a two-dimensional density distribution of the retrieved H2O and HDO total columns as a function of the a priori H2O total column from ECMWF. The HDO total column has already been divided by the VSMOW abundance of 3.1152 × 10^−4. The assumption is that both the H2O and HDO columns should be zero at an a priori H2O column of zero (even though at nonzero columns the ratio between the H2O and HDO columns can be variable). The data have been filtered according to the QF = 1 criteria mentioned above except for the criterion VCD_H2O / VCD_H2O,apriori > 0.7, as this would lead to a bias against the driest H2O columns while the corresponding HDO column would not be affected. We performed a linear regression between 0 and 0.8 × 10^22 molec cm^−2 and define the offset as the y-intercept of this regression. This gives us H2O and HDO offsets of 6.8 × 10^20 and −5. × 10^20 molec cm^−2, respectively. Subtracting these offsets from the H2O and HDO columns before we calculate δD gives an offset-corrected δD, of which the world map is shown in the second panel of Fig. 1. The third panel shows the difference with the uncorrected world map. It clearly shows that, as expected, the offset correction has the largest impact on the driest regions, such as the Atacama desert and the Tibetan Plateau. In these regions the offset correction can induce shifts in the 2003–2007 averaged δD of up to ∼ 120 ‰, while the impact at the more temperate regions is ∼ 40 ‰. As a robustness test we have also determined the offsets using data above Antarctica. This resulted in slightly larger offsets (9.5 × 10^20 molec cm^−2 for H2O and −2.4 × 10^20 molec cm^−2 for HDO) but no significant difference in the impact on δD.
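The offset estimation and correction described here can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the processing code used for the data set: the function names, the noise level, the constant true δD of −300 ‰ and the use of `numpy.polyfit` for the regression are all choices made for this example. The injected offsets are the values from the Antarctic robustness test, used only as illustrative numbers. Columns are expressed in units of 10^22 molec cm^−2, and the HDO column is assumed to be already divided by R_s.

```python
import numpy as np


def estimate_offset(prior_h2o, retrieved, fit_max=0.8):
    """Y-intercept of a straight-line fit of a retrieved column against the
    a priori H2O column, using only scenes with 0 <= prior_h2o <= fit_max."""
    mask = (prior_h2o >= 0.0) & (prior_h2o <= fit_max)
    slope, intercept = np.polyfit(prior_h2o[mask], retrieved[mask], 1)
    return intercept


def delta_d(hdo_scaled, h2o):
    """deltaD in per mil; hdo_scaled is the HDO column already divided by R_s."""
    return (hdo_scaled / h2o - 1.0) * 1000.0


# Synthetic dry scenes (all values hypothetical)
rng = np.random.default_rng(42)
n = 2000
prior = rng.uniform(0.05, 0.8, n)                 # a priori H2O column
true_h2o = prior
true_hdo = (1.0 - 300.0 / 1000.0) * true_h2o      # constant true deltaD of -300 per mil

# Retrieved columns: truth plus a constant offset plus small random noise
h2o = true_h2o + 0.095 + rng.normal(0.0, 0.002, n)
hdo = true_hdo - 0.024 + rng.normal(0.0, 0.002, n)

off_h2o = estimate_offset(prior, h2o)
off_hdo = estimate_offset(prior, hdo)

uncorrected = delta_d(hdo, h2o)
corrected = delta_d(hdo - off_hdo, h2o - off_h2o)
shift = corrected - uncorrected

dry = prior < np.median(prior)
print("estimated offsets:", off_h2o, off_hdo)
print("mean shift, dry half:", shift[dry].mean())
print("mean shift, moist half:", shift[~dry].mean())
```

In this toy setup the correction recovers the injected offsets and shifts δD much more strongly for the drier half of the scenes, which mirrors the humidity dependence described in the text.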
It is difficult to validate the impact of the offset correction for the driest areas, as either no FTS stations exist in these regions or the stations have a very sparse sampling throughout the year and limited SCIAMACHY coverage due to their high latitudes and resulting high solar zenith angles (SZAs), such as for the dry high (ant-)arctic stations at Eureka, Ny-Ålesund and Arrival Heights. The corrected values of δD for these areas, however, are more consistent with the simulations from isotope-enabled general circulation models such as IsoGSM (Yoshimura et al., 2011) and LMDZ (Risi et al., 2012b), suggesting that the world map of the offset-corrected data set is more realistic. Since it is expected that the offset correction has a relatively larger impact on the driest months in the regions where ground-based FTS data do exist, we consider both the non-corrected and the offset-corrected SCIAMACHY data in the remainder of this validation study, and we test if the offset correction leads to a better data set. In Sect. 7 we come to a recommendation on which data set to use.

Ground-based FTS data
Our validation data set is based on ground-based FTS stations from two different networks with different retrieval approaches: TCCON and NDACC. The NDACC data are produced within the framework of the MUSICA project. From each network, six stations with data available between 2003 and 2007 were selected. Two stations, Bremen and Lauder, are part of both networks, albeit with different temporal coverage. Below we describe the TCCON and MUSICA data sets separately.

TCCON
The TCCON network consists of about 20 operational ground-based FTS stations that use direct solar spectra in the near-infrared to measure the column-averaged abundances of various atmospheric constituents, including H2O and HDO. We have used version GGG2012 of the TCCON data from the stations at Ny-Ålesund (Spitsbergen), Bremen (Germany), Park Falls (USA), Pasadena (Jet Propulsion Laboratory or JPL, USA), Darwin (Australia) and Lauder (New Zealand) (also see Table 1). The TCCON measurements and their calibrations using aircraft profiles are described in detail by Wunch et al. (2011) and, more specific to δD, by Boesch et al. (2013). A characterization of a posteriori calculated δD from TCCON-like measurements is presented by Rokotyan et al. (2014). The HDO and H2O columns are independently retrieved in the near-infrared wavelength region by scaling a priori profiles of the volume mixing ratio (VMR). The a priori profiles are taken from re-analysis data from the National Centers for Environmental Prediction (NCEP). The HDO a priori profile is inferred from the H2O profile with an H2O-dependent fractionation (−40 ‰ at 1 % H2O VMR, decreasing to −600 ‰). Since the a priori profiles are highly variable and close to the true atmospheric profiles, the H2O and HDO retrieval products will be strongly correlated with the a priori. Since δD is not retrieved directly but calculated a posteriori, the TCCON retrievals suffer from the same caveats as mentioned in Sect. 2.1 for the SCIAMACHY retrievals regarding the averaging kernels of H2O, HDO and their cross-dependencies.
Fifteen spectral windows are used for H2O and six for HDO, of which two are partly overlapping with the spectral window used for the SCIAMACHY retrievals. The spectroscopic parameters used by TCCON are different from the ones used by the SCIAMACHY retrievals. For the SCIAMACHY HDO/H2O retrievals the H2O line list compiled by Scheepmaker et al. (2013) was used, while for TCCON a compilation of parameters from various sources was used, where the selection was based on the performance of the spectral fit at each absorption line (Wunch et al., 2011). Wunch et al. (2010) have calibrated the TCCON data using aircraft profiles, from which a correction factor of 1.031 was determined for the total column H2O. All TCCON H2O columns were already divided by this factor. Since no correction factor was applied to HDO (this could not be determined from the aircraft measurements), the H2O correction would introduce a bias in the ratio HDO/H2O. We have therefore first removed the H2O correction by multiplying the H2O columns by a factor 1.031, before deriving δD according to Eq. (1). This implicitly assumes that the (spectroscopic) errors that the factor corrects for are approximately similar for HDO and H2O, so that their effect cancels once the ratio is taken. We have tested that not removing the TCCON H2O correction would lead to an increase in the SCIAMACHY δD bias of ∼ 30 ‰, as expected.
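The size of this effect follows directly from the definition of δD. The sketch below is a worked example under stated assumptions: δD is taken as (HDO/H2O − 1) × 1000 with the HDO column already normalized by VSMOW, the column values are hypothetical, and the function names are illustrative only.

```python
TCCON_H2O_FACTOR = 1.031  # aircraft-derived H2O correction factor quoted in the text

def delta_d(hdo_scaled, h2o):
    """Return delta-D in per mil; hdo_scaled is the HDO column divided by VSMOW."""
    return (hdo_scaled / h2o - 1.0) * 1000.0

def tccon_delta_d(hdo_scaled, h2o_calibrated):
    """Undo the division by the H2O factor before taking the ratio."""
    h2o_uncalibrated = h2o_calibrated * TCCON_H2O_FACTOR
    return delta_d(hdo_scaled, h2o_uncalibrated)

# Hypothetical columns in arbitrary units, chosen so that delta-D is -150 per mil.
hdo = 0.85
h2o_calibrated = 1.0 / TCCON_H2O_FACTOR

with_removal = tccon_delta_d(hdo, h2o_calibrated)   # correction removed
without_removal = delta_d(hdo, h2o_calibrated)      # correction left in place
shift = without_removal - with_removal
```

Leaving the factor in place raises the FTS δD by 31 ‰ times the isotopic ratio (about 26 ‰ for this hypothetical δD of −150 ‰), which makes the SCIAMACHY−TCCON difference more negative by the same amount.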
The average measurement precision of the a posteriori calculated δD varies between 12 and 20 ‰ for the six TCCON stations used in this study.

MUSICA
For every ground station the a priori information is held constant, so all the observed variability in H2O and δD is due to the measured spectra. The retrieval algorithm is described in detail by Schneider et al. (2012). Schneider et al. (2015) have performed a first empirical validation of the H2O and δD products using aircraft profiles, from which a bias in δD was estimated of about 30 ‰ close to the surface, increasing to about 70 ‰ above 5 km altitude. This would translate into a high bias in the column-averaged δD of roughly 35 ‰. The measurement precision is estimated to be around 10 ‰.
We have used the 2012 version of the MUSICA data (column-averaged values for H2O and δD only) from the stations at Kiruna (Sweden), Bremen (Germany), Jungfraujoch (Switzerland), Izaña (Tenerife, Spain), Wollongong (Australia) and Lauder (New Zealand) (also see Table 1). Since the MUSICA project uses data from the already existing NDACC network, time series for some stations go back to the mid-1990s, while the earliest TCCON data are from 2004. The MUSICA data therefore have more temporal overlap with the 2003–2007 SCIAMACHY data. The measurement frequency, however, is lower for MUSICA, which means that per overpassing SCIAMACHY measurement the number of co-located FTS measurements is also significantly lower.

Spatial and temporal co-locating algorithm
For co-locating the SCIAMACHY data with the ground-based FTS data, we first selected all SCIAMACHY measurements within a 500 km radius of the FTS stations. The measurements were filtered according to the QF = 1 criteria from Sect. 2.2, and in addition some outliers were removed by demanding −600 ‰ < δD < 300 ‰. Compared to other atmospheric trace gases, the diurnal variation of water vapour can be very large. We therefore added a temporal matching criterion of ±2 h to our co-locating algorithm. For every SCIAMACHY measurement within the 500 km radius, we take the average δD of all FTS measurements within a ±2 h time window of the SCIAMACHY measurement. In Table 1 we give an overview of all the FTS stations used. The last two columns show the resulting number of SCIAMACHY measurements with at least one spatially and temporally co-located FTS measurement (N_scia) and the average number of FTS measurements per co-located SCIAMACHY measurement (N_fts). The matched pairs of individual SCIAMACHY measurements with averaged FTS measurements form the basis for our bias determination.

[…] a very low surface albedo (Lauder). The MUSICA stations Jungfraujoch and Izaña are high-altitude stations and will therefore measure a lower column-averaged δD as a result of δD decreasing with altitude (the so-called altitude effect). The co-located SCIAMACHY measurements for these high-altitude stations consist mostly of reflected sunlight from nearby lower surfaces, which results in the observed strong positive bias in δD.
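The co-location criteria of Sect. 4 (500 km radius, ±2 h window, outlier filter on δD) can be sketched as follows. This is an illustration rather than the processing code used for this study: the haversine distance on a spherical Earth, the record layout, the station coordinates and all sample values are assumptions made for the example, and the QF = 1 filtering is taken to have been applied beforehand.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def colocate(scia, fts, station, radius_km=500.0, window=timedelta(hours=2)):
    """Pair each SCIAMACHY sounding with the mean of the FTS values within the window.

    scia: list of dicts with keys time, lat, lon, delta_d (per mil)
    fts: list of (time, delta_d) tuples for one station
    station: (lat, lon) of the FTS station
    Returns a list of (scia_delta_d, mean_fts_delta_d, n_fts) tuples.
    """
    pairs = []
    for s in scia:
        if not (-600.0 < s["delta_d"] < 300.0):          # outlier filter
            continue
        if great_circle_km(s["lat"], s["lon"], *station) > radius_km:
            continue
        matches = [d for t, d in fts if abs(t - s["time"]) <= window]
        if matches:                                       # need at least one FTS match
            pairs.append((s["delta_d"], sum(matches) / len(matches), len(matches)))
    return pairs

# Hypothetical station position and measurements, for illustration only.
station = (53.1, 8.85)
t0 = datetime(2005, 6, 1, 10, 0)
scia = [
    {"time": t0, "lat": 53.5, "lon": 10.0, "delta_d": -180.0},                      # kept
    {"time": t0, "lat": 40.0, "lon": 8.85, "delta_d": -170.0},                      # too far
    {"time": t0, "lat": 53.5, "lon": 10.0, "delta_d": 350.0},                       # outlier
    {"time": t0 + timedelta(days=1), "lat": 53.5, "lon": 10.0, "delta_d": -200.0},  # no FTS
]
fts = [
    (t0 - timedelta(hours=1), -100.0),
    (t0 + timedelta(minutes=90), -120.0),
    (t0 + timedelta(hours=5), -90.0),
]
pairs = colocate(scia, fts, station)
```

Averaging the FTS values per SCIAMACHY sounding, rather than the reverse, keeps one matched pair per satellite measurement, as described above.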

Time series
For every station we have determined various statistics, which are printed at the top and bottom of the figures and will be explained below. These statistics are based on the spatially and temporally co-located data points from the algorithm as described in Sect. 4. A summary of the statistics is given in Table 2, including the values before and after correcting for the offset.

Bias
The bias is defined as the mean of the 2003–2007 SCIAMACHY–FTS δD differences. Without the offset correction we find significant negative biases above all ground stations (except the mountain stations). The weighted average of the bias for all six TCCON stations and the four low-altitude MUSICA stations is −35 ± 1.6 and −69 ± 3.9 ‰ respectively. The uncertainties denote standard errors in the mean. The station-to-station standard deviations are 30 ‰ (TCCON) and 15 ‰ (MUSICA). The offset correction introduces an average reduction of the bias of 27 ‰, consistent with the world map that showed the impact of the offset correction in Fig. 1. After the offset correction, the average bias above the TCCON and MUSICA stations is −8.4 ± 1.6 and −42 ± 3.9 ‰ respectively, with station-to-station standard deviations of 26 ‰ (TCCON) and 21 ‰ (MUSICA).
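The distinction between the standard error of the network mean and the station-to-station standard deviation can be made concrete with a short sketch. The weighting scheme is not specified in the text, so inverse-variance weights and an unweighted sample standard deviation are assumed here; the function name and the per-station numbers are invented and are not taken from Table 2.

```python
import numpy as np

def network_bias(station_bias, station_se):
    """Combine per-station biases (per mil) into network-level statistics.

    Returns (weighted mean, standard error of that mean, station-to-station std).
    Assumes inverse-variance weights, which the text does not confirm.
    """
    bias = np.asarray(station_bias, dtype=float)
    weights = 1.0 / np.asarray(station_se, dtype=float) ** 2
    mean = float(np.sum(weights * bias) / np.sum(weights))
    se_mean = float(np.sqrt(1.0 / np.sum(weights)))
    spread = float(np.std(bias, ddof=1))
    return mean, se_mean, spread

# Invented per-station biases and standard errors, for illustration only.
mean, se_mean, spread = network_bias([-10.0, -30.0, -50.0], [2.0, 2.0, 4.0])
```

The standard error shrinks as more co-located measurements are added, whereas the station-to-station spread reflects real differences between sites, which is why the two uncertainties quoted above differ by an order of magnitude.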

Standard deviation
The standard deviation of the SCIAMACHY–FTS differences is defined as σ and has an average value of 115–120 ‰ for single SCIAMACHY measurements (shown in the bottom two lines of Table 2). For 30-day averages the standard deviation is reduced by a factor ∼ 1.5–3. The offset correction does not lead to a reduction of the standard deviation. The standard deviation is larger than the mean measurement noise error in SCIAMACHY δD (typically between 40 and 60 ‰ above the FTS stations). Partly this is explained by the measurement noise error underestimating the true error, since it is based on the measured spectrum and does not include any errors due to calibration. Also, we are taking neither the statistical uncertainty of the FTS measurements into account nor the remaining systematic effects present in the co-location window of ±2 h and distance < 500 km. Within this window, local humidity can show strong temporal and spatial variations, leading to strong variations in δD and increased standard deviations. We have tested that for some stations the standard deviation can indeed be reduced by stricter co-location criteria. However, the sample size also reduces with stricter criteria, so that we do not observe a clear trend of lower biases and standard deviations for stricter co-locations.

Reduced chi-square
The reduced chi-square χ²_ν is a parameter that can be used in the comparison of two data sets, taking into account the statistical errors of the data. We use the following definition of χ²_ν: […] The summation is over all (n) co-located measurements per station, where σ²_scia and σ²_fts denote the uncertainty in δD from SCIAMACHY and the FTS measurements respectively (not to be confused with σ, which denotes the standard deviation of the SCIAMACHY–FTS differences). For the TCCON data, σ²_fts could be determined from the given uncertainties (precision) in the dry-air mole fractions of H2O and HDO. For the MUSICA data no uncertainties in total column δD (nor H2O and HDO) were provided, so we assumed a constant σ_fts = 10 ‰. The average χ²_ν we find for TCCON and MUSICA is 3.9 and 5.0 respectively (printed in Figs. 5 and 6 and summarized in Table 2). Consistent with our observations of the standard deviation in Sect. 5.1.2, this shows that either additional differences are present within the co-location window (due to the variability of water vapour), or that our uncertainties are underestimated. We note that a doubling of the SCIAMACHY δD uncertainty would lead to χ²_ν ≈ 1.1–1.4 (meaning that with these larger uncertainties the observed spread in the differences is practically consistent with the statistical uncertainty of the measurements). Alternatively, χ²_ν ≈ 1 could be reached by adding a separate noise term with σ ≈ 90 ‰, suggesting that this would be the additional variability of water vapour in the co-location window if it was the single cause for the additional spread in the observed differences.

The Izaña station shows the highest χ²_ν of 12, which can be explained by its location close to the Sahara (with a very high albedo), which results in many co-located SCIAMACHY measurements from above the Sahara with very small uncertainties.
Finally, the values of χ²_ν in Table 2 show no improvement due to the offset correction.
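As an illustration of how a reduced chi-square of this kind can be evaluated, the sketch below computes one plausible form. It assumes that the mean bias is subtracted from the differences and that ν = n − 1; neither choice is confirmed by the text. The optional `extra_sigma` argument is a hypothetical way to represent a separate noise term, and the data are synthetic.

```python
import numpy as np

def reduced_chi_square(dd_scia, dd_fts, sig_scia, sig_fts, extra_sigma=0.0):
    """Reduced chi-square of SCIAMACHY-FTS delta-D differences (all in per mil).

    Assumptions: the mean difference (bias) is removed first and nu = n - 1.
    """
    diff = np.asarray(dd_scia, dtype=float) - np.asarray(dd_fts, dtype=float)
    resid = diff - diff.mean()
    var = (np.asarray(sig_scia, dtype=float) ** 2
           + np.asarray(sig_fts, dtype=float) ** 2
           + extra_sigma ** 2)
    return float(np.sum(resid ** 2 / var) / (diff.size - 1))

# Synthetic co-located pairs whose scatter matches the stated uncertainties.
rng = np.random.default_rng(1)
n = 2000
sig_scia = np.full(n, 50.0)
sig_fts = np.full(n, 10.0)
truth = rng.normal(-150.0, 30.0, n)
dd_fts = truth + rng.normal(0.0, sig_fts)
dd_scia = truth - 35.0 + rng.normal(0.0, sig_scia)

chi2 = reduced_chi_square(dd_scia, dd_fts, sig_scia, sig_fts)
chi2_inflated = reduced_chi_square(dd_scia, dd_fts, 2.0 * sig_scia, sig_fts)
```

When the stated uncertainties describe the scatter correctly, as in this synthetic case, the statistic is close to 1. Values well above 1, as found for the real data, indicate either extra variability inside the co-location window or underestimated uncertainties.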

Correlation coefficient
The last statistical parameter we show in Figs. 5, 6 and Table 2 is the linear Pearson correlation coefficient r. Before the offset correction, the average correlation coefficients for TCCON and MUSICA are 0.25 and 0.21 respectively. The offset correction has a negative impact on the correlation, reducing the average coefficients by 0.06. Again, the variability in humidity present in the co-location window could explain why the individual SCIAMACHY data are only weakly correlated with the FTS data. The correlation improves when we take monthly averages, as we will present in Sect. 6.1.
Overall, considering all statistical parameters, we can conclude that the uncorrected SCIAMACHY δD product is low biased by −35 ± 30 ‰ compared to TCCON and −69 ± 15 ‰ compared to MUSICA, where the uncertainties are the station-to-station standard deviations. Considering these averages, it seems that the bias with respect to MUSICA is larger than the bias with respect to TCCON. However, considering the individual Bremen and Lauder stations (which participate in both networks), either the reverse is true (Bremen shows a lower bias with respect to MUSICA instead of TCCON) or the TCCON and MUSICA biases agree within the estimated errors (Lauder). This suggests that the differences in biases between TCCON and MUSICA might be related to the differences in the location of the stations. We study this further in the next section. The offset correction reduces the bias on average by 27 ‰ but leads neither to a reduction of the standard deviation nor to an improvement of the correlation coefficient.

Latitudinal dependence of the bias
As mentioned above, the averaged bias with respect to MUSICA is larger than with respect to TCCON, while for the two individual stations that participate in both networks the bias with respect to MUSICA is either consistent with TCCON, or even smaller. Since the MUSICA and TCCON stations are spread across different geographical locations, this could be the result of a selection effect in combination with a latitudinal bias.
In Fig. 7 we show the bias at all the FTS stations (except the two high-altitude stations) as a function of latitude. For the two stations participating in both networks (Bremen and Lauder), the weighted average of their biases is used. The figure shows evidence for a latitude-dependent bias both for the offset-corrected and uncorrected data set. The bias at the higher latitudes can be 30 to 60 ‰ larger than at the lower latitudes. Although the offset correction leads to a larger shift in δD at higher latitudes, this gradient is not sufficient to remove the latitudinal dependency of the bias.
Three of the four stations at low to moderate latitudes (−40 to +50°) are TCCON stations with relatively small error bars. The only MUSICA station in this latitudinal range is Wollongong, which has the largest error bar on the bias. The other three MUSICA stations are at higher latitudes where the SCIAMACHY δD bias is larger. This explains why the weighted mean bias over all stations is smallest with respect to TCCON, even though the bias at Lauder and Bremen is either comparable or smaller with respect to MUSICA. It also shows that, in order to accurately determine a latitudinal bias of SCIAMACHY δD using the combination of both TCCON and MUSICA networks, more information on the differences between TCCON and MUSICA δD is necessary, including more elaborate cross-validations and their latitudinal dependencies.

Seasonality
As a result of Rayleigh distillation (i.e. the preferential condensation of the heavier isotopologues due to their lower vapour pressure), seasonal variations in δD are primarily tracing variations in humidity. Small departures from this humidity-δD correlation, however, could provide new insights into secondary processes of the hydrological cycle. In Sect. 6.1 we first test whether the SCIAMACHY δD measurements can accurately represent local seasonalities of δD. In Sect. 6.2 we combine δD with humidity and study asymmetries in the primary seasonal correlation with humidity.
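The primary humidity-δD relation can be illustrated with the textbook Rayleigh model, in which the isotopic ratio of the remaining vapour scales as f^(α − 1), with f the fraction of vapour remaining and α the fractionation factor. The sketch below is an idealized illustration, not part of the retrieval or validation: the constant α = 1.08 and the source δD of −80 ‰ are assumed round numbers, and the function name is hypothetical.

```python
def rayleigh_delta_d(remaining_fraction, delta_d_source=-80.0, alpha=1.08):
    """Delta-D (per mil) of vapour after Rayleigh distillation.

    remaining_fraction: fraction f of the original vapour still present (0 < f <= 1)
    delta_d_source: delta-D of the vapour before any condensation (assumed value)
    alpha: constant fractionation factor (assumed; in reality temperature dependent)
    """
    ratio_source = 1.0 + delta_d_source / 1000.0
    ratio = ratio_source * remaining_fraction ** (alpha - 1.0)
    return (ratio - 1.0) * 1000.0

# Drier air (smaller f) is progressively more depleted in HDO.
curve = [(f, rayleigh_delta_d(f)) for f in (1.0, 0.5, 0.25, 0.1)]
```

In this idealized picture δD is a single-valued function of humidity, so any difference in δD between two seasons at the same humidity points to processes beyond simple distillation, such as a change in the source of the vapour.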

Monthly means
We show monthly means for all stations in Figs. 8 and 9. Since we now focus on the mean seasonality, we have relaxed the temporal co-location constraint of matching every FTS measurement within ±2 h to a SCIAMACHY measurement. Instead, we selected all available FTS measurements between 2003 and 2012 that were taken in a ±2 h window around the mean SCIAMACHY overpass time. The mean overpass time for a specific ground station was estimated from all available 2003–2007 SCIAMACHY measurements within a 500 km radius of the station. This results in much more available data and therefore more accurate monthly means, while still sampling the same window in the diel humidity cycle. For SCIAMACHY all available measurements between 2003 and 2007 within a 500 km radius of the FTS station were used, and we required a minimum of 10 measurements for any month to be used as a monthly mean.
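The construction of the mean seasonality and its comparison between two data sets can be sketched as follows. The function names, the input layout of (month, δD) pairs and the sample values are assumptions made for this example. Only the minimum of 10 measurements per month comes from the text, and the Pearson coefficient is computed over the months available in both data sets.

```python
from collections import defaultdict

import numpy as np

def monthly_means(samples, min_count=10):
    """Mean delta-D per calendar month, keeping months with >= min_count samples.

    samples: iterable of (month, delta_d) with month in 1..12
    """
    buckets = defaultdict(list)
    for month, value in samples:
        buckets[month].append(value)
    return {m: float(np.mean(v)) for m, v in sorted(buckets.items())
            if len(v) >= min_count}

def seasonal_correlation(means_a, means_b):
    """Pearson r between two monthly-mean series over their common months."""
    common = sorted(set(means_a) & set(means_b))
    a = [means_a[m] for m in common]
    b = [means_b[m] for m in common]
    return float(np.corrcoef(a, b)[0, 1])

# Invented samples: month 10 has too few SCIAMACHY soundings and is dropped.
scia_samples = ([(1, -200.0)] * 10 + [(4, -150.0)] * 12
                + [(7, -100.0)] * 10 + [(10, -120.0)] * 5)
fts_samples = ([(1, -180.0)] * 10 + [(4, -140.0)] * 10
               + [(7, -110.0)] * 10 + [(10, -130.0)] * 10)

scia_means = monthly_means(scia_samples)
fts_means = monthly_means(fts_samples)
r = seasonal_correlation(scia_means, fts_means)
```

Restricting the correlation to common months avoids comparing a complete FTS seasonality with an incomplete SCIAMACHY one, which matters for the poorly sampled high-latitude stations.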
Figures 8 and 9 show that the SCIAMACHY δD measurements reproduce the shape of the mean seasonalities quite well for most ground stations. This is also visible in the correlation coefficients, which are printed in the figures for both the offset-corrected (r_corr) and uncorrected (r_uncorr) SCIAMACHY data.
As a reference, we have also plotted the seasonalities of the a priori δD as dashed curves. In contrast to MUSICA (which uses a single fixed a priori assumption per station), the TCCON a priori is variable and highly correlated with the posterior δD seasonality. The SCIAMACHY a priori δD, however, is fairly constant at about −150 ‰. We can therefore conclude that the correlation between the SCIAMACHY and FTS δD seasonalities is a result of the retrieval process and not of the a priori assumptions.
For Ny-Ålesund and Lauder (and to a lesser extent Kiruna) the similarity in δD seasonality is poor and the correlation coefficient is even negative. These are the stations with the poorest sampling of SCIAMACHY measurements throughout the year, either due to the high latitude (Ny-Ålesund, Kiruna) or isolated position surrounded by oceans (Lauder), making their seasonalities incomplete. Due to the averaging, the correlation coefficients for the monthly means are much higher than the correlation coefficients for the single co-located observations (as were shown in Figs. 5 and 6 and Table 2). The offset correction, however, generally leads to lower correlation coefficients. This seems to be caused by overcorrecting some of the driest months with the largest uncertainties, such as winter for Park Falls and spring for Ny-Ålesund and Kiruna. The seasonalities for those stations match better without the offset correction. So even though the offset correction leads to an overall smaller bias in δD and more realistic global patterns for the driest areas (such as the Tibetan plateau, see Fig. 1), it slightly deteriorates the shape of local seasonalities.

Interestingly, the shape of the seasonality for the high-altitude Izaña station is reproduced quite well (except for the negative shift due to the altitude effect), even though many SCIAMACHY observations for that station are measured above the Sahara. In contrast, the seasonality measured from the top of the Jungfraujoch mountain is much steeper than observed by SCIAMACHY, possibly related to the higher altitude of Jungfraujoch (compared to Izaña) or differences between Central Europe and the subtropics in the dominating circulation patterns that control moisture transport at high altitudes.
Finally, it has been suggested that SCIAMACHY might overestimate seasonalities in δD due to a selection bias in the observations of high-altitude scenes caused by fractional cloud cover (Risi et al., 2010). We do not see any evidence for such overestimation of the seasonalities from Figs. 8 and 9, although we do acknowledge that fractional cloud cover might play a role in the observed overall low bias in δD, since small fractions of clouds will reflect signal of higher-altitude atmospheric layers (depleted in δD) into the light path.

δD vs. humidity
To study whether δD derived from the SCIAMACHY measurements can also add information with respect to humidity retrievals alone, we show the monthly mean δD as a function of humidity in Figs. 12 and 13. As for the monthly means, all available years were used for the FTS data (within ±2 h of the mean SCIAMACHY overpass time), while for SCIAMACHY we used the years 2003–2007 (offset corrected). For both FTS and SCIAMACHY we only use months with at least 10 measurements. Figure 10 shows a close-up for the TCCON stations Park Falls and Darwin, including the SCIAMACHY and TCCON a priori assumptions (blue and magenta dashed curves). Figure 11 shows a close-up for Bremen and Lauder, which participate in both TCCON and MUSICA networks. Following Schneider et al. (2012), by numbering and connecting the months in sequence, the transition of δD and humidity throughout the year becomes visible. We find significant differences between the transition from winter to summer and summer to winter for most FTS stations. The TCCON station Darwin is the clearest example (bottom panel in Fig. 10), showing lower δD values for the transition from January to July than for the transition from July to January, even though the humidities during these transitions are the same. All FTS stations seem to suggest the same pattern of higher δD in spring and lower δD in fall, and this is also observed by SCIAMACHY above the stations of Bremen, Park Falls, Darwin, Jungfraujoch, Izaña and Wollongong. For reasons unknown, JPL is the only station for which SCIAMACHY observes a reversed pattern compared to FTS of lower δD values from April to June than from September to November. Nevertheless, the general behaviour of asymmetrical spring and fall transitions is captured fairly well. These asymmetries between the seasons point to differences in the distance or temperature of the source of the water vapour and show that the SCIAMACHY δD measurements add information beyond the measurements of humidity alone.
The FTS stations of Jungfraujoch and Izaña show much more depleted δD and humidities than SCIAMACHY due to their high altitude, but the SCIAMACHY observations seem to correctly connect to their Rayleigh curves, including the decreasing slope towards higher humidities. The SCIAMACHY observations above the stations Ny-Ålesund, Kiruna and Lauder show incomplete seasonal cycles due to their geographical locations and are therefore less useful to study the seasonal asymmetries. Also, the first δD observations above these stations in local spring are all overestimated compared to observations later in the season (also visible in the monthly means in Figs. 8 and 9). This leads to the negative correlation coefficients in the monthly means and the wrong orientation (too low or even negative slopes) of the δD vs. humidity curves. It is possible that these overestimations are caused by sampling biases due to the high latitudes. Within the 500 km co-location radius, the lower latitudes will generally have higher humidities with higher δD. Early in the season these lower latitudes will preferentially be sampled by SCIAMACHY due to their lower SZAs and therefore stronger signals. Such a sampling bias could also explain why SCIAMACHY overestimates the lowest humidities (and δD) for Park Falls. SCIAMACHY's underestimation of the very high humidities measured at Darwin, however, might result from a sampling bias against very cloudy and humid conditions in local summer. While the FTS instruments can measure under such conditions in between the clouds, SCIAMACHY needs less cloud-contaminated (and thus less humid) conditions for a measurement to successfully pass the filter criteria.
It is important to note that the seasonal asymmetry is not yet present in the a priori information. Figures 10 and 11 show that the prior SCIAMACHY δD vs. humidity curve is constant at about −150 ‰. Although the prior TCCON δD varies throughout the season depending on humidity, there is no clear difference yet between the spring and fall transition. This shows that the asymmetry of both SCIAMACHY and TCCON is a result of the retrieval and that their posterior correlation has been improved compared to their a priori correlation. The same holds for MUSICA, as the a priori total column δD is constant for every station and the a priori total column humidity varies only slightly with surface pressure. Figure 11 shows that the posterior δD vs. humidity curves are very similar for TCCON and MUSICA, even though their prior information is completely different. The TCCON curves are correlated with the prior information, but especially for Lauder it can be seen that the retrieval has added information, enhancing the seasonal asymmetry and bringing it very close to the asymmetry as seen by MUSICA. For example, the spring transition in Lauder (months 8–12) shows lower δD than the fall transition in the a priori, while this pattern is reversed by the retrieval.
A simultaneous validation in two dimensions (δD and humidity) introduces the complication of possible biases in two dimensions as well. For the SCIAMACHY humidity measurements we have used the total column H2O retrievals. Since these are not ratio retrievals, certain instrumental artefacts and light path modifications (such as scattering from the clouds remaining after filtering) might not cancel out. As a very simple humidity validation exercise we have compared the retrieved H2O columns with the H2O columns from ECMWF in Fig. 14. The figure shows a high correlation (r = 0.96) but also somewhat underestimated columns (the linear regression has a slope of 0.87). The H2O profiles from ECMWF are also used as a priori information for the H2O retrievals. This explains why in the δD vs. humidity diagrams the retrieved SCIAMACHY humidities remain close to their a priori. A similar high correlation between prior and posterior H2O holds for TCCON but not for MUSICA, as the latter uses a priori H2O information based on a constant mixing ratio. For SCIAMACHY and TCCON, the slope in the δD vs. humidity diagram could therefore easily be affected by the choice of the H2O prior. The retrieval of HDO, however, can add significant information independently of its prior assumptions (this was also shown by Fig. 2 in Sect. 2.3). The seasonal asymmetry introduced by this information is therefore a more robust metric of hydrological changes than the slope of the curve itself.

Conclusions and discussion
We have presented a validation study of the SCIAMACHY HDO/H2O ratio product using high-accuracy, ground-based FTS measurements. After updating the SCIAMACHY product with 2 additional years, the 2003–2007 […] After co-locating the data spatially within 500 km and temporally within ±2 h, we find an average bias of −35 ± 1.6 ‰ using TCCON and −69 ± 3.9 ‰ using MUSICA stations (uncertainties denoting standard errors in the mean and assuming that the FTS measurements are the truth). The station-to-station standard deviations are 30 and 15 ‰ respectively. Above the high-altitude stations of Izaña and Jungfraujoch we find considerable positive biases, which are expected from the depletion of HDO with altitude and the fact that SCIAMACHY is biased towards the lower-altitude areas surrounding mountains due to its spatial footprint of 120 × 30 km.
Throughout this work we have also studied the impact of an offset correction on the retrieved total columns HDO and H2O. An offset correction is expected to affect mostly the driest areas, which also have the largest uncertainties in δD, but without introducing sampling biases (as would be the case with weighting or filtering by the uncertainty). The offset correction we derived reduces the overall bias in δD by about 27 ‰ and leads to a much larger increase of δD above very dry and elevated areas. The impact at these elevated and remote areas seems justified in comparison with model data. However, its validity could not be confirmed using ground-based data. Above the FTS stations, the offset correction seems to overcorrect the driest months and thereby reduces the overall correlation. Depending on the study or area of interest, it might therefore be better to use a constant bias correction instead. The latitudinal dependency suggests a bias of −30 ‰ for latitudes between −20 and +50° and a bias of about −70 ‰ at higher latitudes.
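A constant, latitude-band bias correction of this kind could be applied as in the sketch below. The sharp band edges and the treatment of every latitude outside −20 to +50° as "higher latitudes" are simplifying assumptions, and the function names are hypothetical.

```python
def suggested_bias(latitude_deg):
    """Approximate SCIAMACHY delta-D bias (per mil) for a given latitude.

    Assumption: a step function with the two values quoted in the text.
    """
    return -30.0 if -20.0 <= latitude_deg <= 50.0 else -70.0

def bias_corrected_delta_d(delta_d, latitude_deg):
    """Remove the latitude-band bias from an uncorrected SCIAMACHY delta-D."""
    return delta_d - suggested_bias(latitude_deg)

# Two hypothetical soundings with the same retrieved delta-D.
low_latitude = bias_corrected_delta_d(-150.0, 10.0)
high_latitude = bias_corrected_delta_d(-150.0, 68.0)
```

Unlike the offset correction, such a constant shift leaves the shape of the local seasonality unchanged, which is the property that motivates it as an alternative.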
The latitudinal dependency of the bias also explains why the average bias determined using MUSICA (with more stations at higher latitudes compared to TCCON) is larger than the average bias determined using TCCON, even though we find smaller biases with MUSICA at the combined stations of Bremen and Lauder. The retrieval setups of MUSICA and TCCON are considerably different (including different algorithms, wavelength regions, spectroscopy, a priori inputs and calibrations), which seems to result in lower δD from MUSICA than from TCCON. To confirm this, we encourage more validations of the FTS networks using aircraft profiles and a more dedicated intercomparison between δD from both networks. This should become increasingly feasible as more data become available for more recent years, also for stations that participate in both networks. Schneider et al. (2015) have performed a first validation of the MUSICA station at Izaña using aircraft profiles, which suggests a high bias in column-averaged δD of about 35 ‰. This could mean that the SCIAMACHY bias we find is actually overestimated by a similar amount. The aforementioned comparison to TCCON and more aircraft validations at different sites would be useful to confirm this.
Regardless of any bias, the SCIAMACHY HDO/H2O ratio captures local seasonal variabilities quite well, with correlation coefficients between 0.4 and 0.9 for the stations with complete seasonalities. The added benefit of δD measurements, compared to measurements of water vapour alone, becomes particularly clear when combined in δD-humidity diagrams. We showed that SCIAMACHY can observe asymmetries in the seasonal evolution of δD vs. humidity, in line with the observations from the FTS networks, that point towards a shifting pattern of water vapour source region or temperature between seasons. We also showed that this information is not present yet in the prior, even though in the case of SCIAMACHY and TCCON the prior is variable and already quite close to the truth. Therefore, the information was clearly added by the retrieval. We have to consider, however, that in the a posteriori calculation of δD, the SCIAMACHY and TCCON retrievals could be corrected neither for cross-dependencies between HDO and H2O nor for their difference in retrieval sensitivity at higher layers (the bottom-scaling approach was adopted to optimize this sensitivity in the most important lowest layer). This means that to some extent, variations in the calculated δD could have been due to variations in H2O instead of variations in the true δD. This might explain some of the remaining differences in observed δD among SCIAMACHY, TCCON and MUSICA in addition to expected differences due to measurement noise and imperfect matching of the data sets. The aforementioned caveats show that retrieving ratios such as HDO/H2O is challenging and that it is beneficial to retain the full averaging kernels, including cross-correlations.
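The a posteriori calculation of δD from the two retrieved total columns can be sketched as follows, assuming the standard definition δD = (R / R_VSMOW − 1) × 1000 ‰ with R the HDO/H2O column ratio and the VSMOW abundance of 3.1153 × 10⁻⁴ quoted in the Fig. 4 caption. The function names and the offset values in the usage note are hypothetical and chosen only to illustrate why a constant column offset shifts δD most under dry conditions.

```python
R_VSMOW = 3.1153e-4  # HDO/H2O abundance ratio of VSMOW (value quoted in the Fig. 4 caption)

def delta_d(hdo_column, h2o_column):
    """Column-averaged delta-D in per mil from total columns given in the same units."""
    return ((hdo_column / h2o_column) / R_VSMOW - 1.0) * 1000.0

def delta_d_after_offsets(hdo_column, h2o_column, hdo_offset=0.0, h2o_offset=0.0):
    """Delta-D after subtracting assumed constant offsets from both retrieved columns."""
    return delta_d(hdo_column - hdo_offset, h2o_column - h2o_offset)
```

For example, with an illustrative HDO offset of `-0.05 * R_VSMOW * 1e22` molecules cm⁻², a dry column of `1e22` H2O molecules cm⁻² shifts by roughly ten times as much in δD as a humid column of `1e23`, because the same absolute offset is a larger fraction of the dry column.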
There have been a few other studies that compared SCIAMACHY or other satellite retrievals of the ratio HDO/H2O with isotope-enabled GCMs or ground-based FTS measurements. Werner et al. (2011) compared the SCIAMACHY data with modelled δD values from the ECHAM5-wiso model and found unexplained differences in δD of about 30 ‰ for low latitudes and up to 50 ‰ for higher latitudes (up to 60°). Our results show that these differences can almost entirely be attributed to the low bias of SCIAMACHY δD. Risi et al. (2012b) have compared a large number of isotopic data sets with each other and with the Laboratoire de Météorologie Dynamique Zoom (LMDZ) general circulation model. Consistent with our results, they find the HDO/H2O ratio measured by SCIAMACHY too low compared to the measurements of TCCON and MUSICA. Their listed differences between TCCON or MUSICA and SCIAMACHY (their Table 6) are roughly comparable to our results (our Table 2), even though they did not use truly co-located data points but focussed more on zonal averages. From their analysis, however, a latitudinal gradient (or "meridional bias") in the differences between FTS and SCIAMACHY is not very apparent, leading them to conclude that the observed latitudinal gradient in the differences between observations and LMDZ is a shortcoming in the model. A latitude-dependent bias correction of the SCIAMACHY data, as we proposed above, would actually weaken this LMDZ meridional bias. This highlights once more the importance of further cross-validations between TCCON and MUSICA δD, including their latitudinal dependencies, in order to better constrain the latitudinal bias of SCIAMACHY and the models.
Finally, Boesch et al. (2013) have made a comparison between TCCON and the recent HDO/H2O observations of GOSAT. Interestingly, their estimated bias in δD retrieved from GOSAT (about −30 to −70 ‰) is very similar to our findings for SCIAMACHY, even though there are considerable differences in the retrieval setup (e.g. in the spectral fitting window and the assumed profile layers). The standard deviations of the GOSAT-TCCON differences are almost half of what we find for the SCIAMACHY-TCCON differences, showing the improved precision of HDO/H2O retrievals of GOSAT compared to SCIAMACHY. The similarity in the biases gives us confidence in the prospect of using the combination of both instruments to measure long and global time series of tropospheric HDO/H2O in the future. Such time series will likely be further extended with measurements from the TROPOMI instrument onboard ESA's Sentinel-5 Precursor mission, planned to be launched in 2016 (Veefkind et al., 2012).
www.atmos-meas-tech.net/8/1799/2015/ Atmos. Meas. Tech., 8, 1799–1818, 2015

Figure 1. First: map of the new IMAP v2.0 2003–2007 δD data set. The resolution is 1 × 1° over land and 2 × 2° over water (nonweighted average of at least two retrievals per grid cell). Second: the same map after the offset correction. Third: difference between the map with and without the offset correction for the years 2003–2007. Fourth: mean single sounding error.

Figure 2. Top: density distribution of the a priori column-averaged δD based on variable H2O profiles and a fixed HDO depletion profile. Bottom: density distribution of the a posteriori δD derived from the retrieval of total column H2O and HDO. The colour scale in both panels corresponds to the number of measurements in every box, and for both panels all measurements in June 2003 were used.

Figure 3. Top: typical column averaging kernels of a bottom layer scaling retrieval (blue) and a profile scaling retrieval (magenta). Bottom: impact of the profile scaling approach with respect to a bottom layer scaling approach for a month of measurements above the Sahara. Both approaches assumed no prior depletion profile for δD (the red triangle shows the a priori δD of 0 ‰).

Figure 4. Cumulative density distributions (showing the 25, 50 and 75 percentiles) of H2O (blue) and HDO (magenta, divided by the VSMOW abundance of 3.1153 × 10⁻⁴) total columns measured by SCIAMACHY vs. the H2O columns from ECMWF for a box over Greenland. The thick lines show linear regressions, with their lower-left starting points, indicated by the asterisks, being the constant offsets. The error bars are shifted to the left for readability and denote the uncertainty in the offset.
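The idea behind the constant offsets in Fig. 4 can be sketched as a linear regression of the retrieved column against the ECMWF H2O column, where the intercept at zero ECMWF humidity is taken as the offset. This is a simplified illustration: it fits all points with ordinary least squares via `numpy.polyfit`, whereas the caption describes regressions through percentile curves, and the function names are hypothetical.

```python
import numpy as np

def estimate_column_offset(ecmwf_h2o, retrieved_column):
    """Fit retrieved = slope * ecmwf + offset and return (slope, offset).

    The intercept is interpreted as a constant column offset (an assumption
    of this sketch, following the description in the Fig. 4 caption).
    """
    x = np.asarray(ecmwf_h2o, dtype=float)
    y = np.asarray(retrieved_column, dtype=float)
    # polyfit with degree 1 returns the coefficients as [slope, intercept].
    slope, offset = np.polyfit(x, y, 1)
    return slope, offset

def apply_offset_correction(retrieved_column, offset):
    """Subtract the constant offset from every retrieved column."""
    return np.asarray(retrieved_column, dtype=float) - offset
```

For HDO, the columns would first be divided by the VSMOW abundance, as in the figure, so that both species are regressed against the same ECMWF H2O axis.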

Figure 5.
Figures 5 and 6 show the entire 2003–2007 time series of δD measurements from TCCON and MUSICA, respectively (dark grey dots and magenta curve), together with all the spatially co-located SCIAMACHY δD measurements (light grey dots and blue curve). To clearly show all the available data per station, no temporal constraints have been applied to the data points in these figures. A few stations show very sparse sampling throughout the year, either due to their high latitude and resulting high SZAs (e.g. Ny-Ålesund and Kiruna) or due to their location close to oceans which have

Figure 6. Same as Fig. 5 but now for the MUSICA stations.

Figure 7. Bias as a function of latitude for all low-altitude TCCON and MUSICA stations. The letters "T" and "M" indicate a TCCON and MUSICA station, respectively. For Lauder (−45°) and Bremen (+53°) the results from both networks are shown separately, but the curve follows the weighted average of both networks.

Figure 8. Monthly means for the TCCON stations. FTS data within ±2 h of the mean SCIAMACHY overpass time were used from all available years. SCIAMACHY data are for 2003–2007. Error bars denote standard errors of the mean. r_corr and r_uncorr refer to the correlation coefficients between the FTS data and, respectively, the offset-corrected and uncorrected SCIAMACHY data.
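The quantities shown in Fig. 8 (monthly means, standard errors of the mean and a correlation coefficient between two mean seasonal cycles) can be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are taken to be already selected within the ±2 h window, and months without data in either series are simply skipped in the correlation.

```python
import numpy as np

def monthly_means(months, values):
    """Return (means, standard errors) per calendar month as arrays of length 12.

    Months without data are NaN; the standard error needs at least two values.
    """
    months = np.asarray(months)
    values = np.asarray(values, dtype=float)
    means = np.full(12, np.nan)
    errors = np.full(12, np.nan)
    for m in range(1, 13):
        sel = values[months == m]
        if sel.size > 0:
            means[m - 1] = sel.mean()
        if sel.size > 1:
            # Standard error of the mean: sample standard deviation / sqrt(N).
            errors[m - 1] = sel.std(ddof=1) / np.sqrt(sel.size)
    return means, errors

def seasonal_correlation(means_a, means_b):
    """Pearson correlation between two seasonal cycles, using months present in both."""
    means_a = np.asarray(means_a, dtype=float)
    means_b = np.asarray(means_b, dtype=float)
    ok = ~np.isnan(means_a) & ~np.isnan(means_b)
    return np.corrcoef(means_a[ok], means_b[ok])[0, 1]
```

Calling `seasonal_correlation` once with the offset-corrected and once with the uncorrected SCIAMACHY monthly means would give values analogous to r_corr and r_uncorr.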

Figure 9. Same as Fig. 8 but for the MUSICA stations.

Figure 10. Monthly means of δD as a function of humidity for the TCCON stations Park Falls (top) and Darwin (bottom) compared to spatially co-located SCIAMACHY measurements. Subsequent months are numbered and connected by lines. The prior information is shown with dashed lines and black numbers.

Figure 11. Same as Fig. 10 but for the two stations participating in both TCCON and MUSICA networks.

Figure 12. Monthly means of δD as a function of humidity for the six TCCON stations (magenta) compared to spatially co-located SCIAMACHY measurements (blue, corrected for the offset). Subsequent months (numbered) are connected by lines.

Figure 13. Same as Fig. 12 but for the MUSICA stations.

Figure 14. Correlation diagram of the H2O total columns retrieved by SCIAMACHY vs. the coinciding H2O total columns from ECMWF for all measurements in 2003 above the Sahara (latitude 15 to 30°, longitude −15 to 30°). The red dashed line shows a linear regression, while "r" indicates the linear Pearson's correlation coefficient.

consists of (Rothman et al., 2009) operating within the NDACC network. From here on we refer to this data set simply as "MUSICA". MUSICA uses direct-solar FTS spectra in the mid-infrared to retrieve column-averaged abundances and profiles of H2O, HDO and also δD. Ten spectral microwindows are used for H2O and HDO, and the spectroscopic parameters come from a combination of HITRAN 2008 (Rothman et al., 2009) and Schneider et al. (2011). Contrary to the a posteriori δD calculation of TCCON, MUSICA performs a direct retrieval of δD. The a priori profile information for H2O is based on radiosonde measurements at different locations. The a priori δD profile is based on the measurements by

Table 1. Overview of the FTS stations used in the validation study for the period 2003–2007. N_scia is the number of SCIAMACHY measurements which were co-located with at least one FTS measurement within a 500 km radius and a 2 h time window. N_fts is the average number of FTS data points per co-located SCIAMACHY measurement.
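The co-location criterion stated in the Table 1 caption (a 500 km radius and a 2 h time window) can be sketched as below. The haversine great-circle distance, the mean Earth radius of 6371 km, the interpretation of the window as an absolute time difference of at most 2 h, and all function names are assumptions of this illustration rather than details given in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value for this sketch)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_colocated(sat_lat, sat_lon, sat_time_h, fts_lat, fts_lon, fts_time_h,
                 max_km=500.0, max_hours=2.0):
    """True if a satellite sounding and an FTS measurement match in space and time.

    Times are given in hours on a common time axis (hypothetical convention).
    """
    close_in_space = great_circle_km(sat_lat, sat_lon, fts_lat, fts_lon) <= max_km
    close_in_time = abs(sat_time_h - fts_time_h) <= max_hours
    return close_in_space and close_in_time
```

Counting, per station, the soundings for which `is_colocated` is true for at least one FTS measurement would give a number analogous to N_scia, and averaging the number of matching FTS measurements per such sounding would give N_fts.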

Table 2. Results of the SCIAMACHY-FTS comparisons before and after the SCIAMACHY offset correction. The last two rows show the weighted mean of the bias (including error in the mean) for all six TCCON stations and the four low-altitude MUSICA stations. The values within brackets are the station-to-station standard deviation. The other values in the last two rows are unweighted averages for all six TCCON and MUSICA stations.