**Research article**
08 Apr 2019

# Sampling bias adjustment for sparsely sampled satellite measurements applied to ACE-FTS carbonyl sulfide observations

Corinna Kloss^{1,2}, Marc von Hobe^{1}, Michael Höpfner^{3}, Kaley A. Walker^{4}, Martin Riese^{1}, Jörn Ungermann^{1}, Birgit Hassler^{5}, Stefanie Kremser^{6}, and Greg E. Bodeker^{6}

^{1}Institute of Energy and Climate Research (IEK-7), Forschungszentrum Jülich GmbH, Jülich, Germany

^{2}Laboratoire de Physique et Chimie de l'Environnement et de l'Espace (LPC2E), Université d'Orléans, CNRS, Orléans, France

^{3}Institute of Meteorology and Climate Research, Karlsruhe Institute of Technology, Karlsruhe, Germany

^{4}Department of Physics, University of Toronto, Toronto, Ontario, Canada

^{5}Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany

^{6}Bodeker Scientific, Alexandra, New Zealand

**Correspondence**: Corinna Kloss (corinna.kloss@cnrs-orleans.fr)

Received: 13 Jun 2018 – Discussion started: 12 Jul 2018 – Revised: 28 Feb 2019 – Accepted: 21 Mar 2019 – Published: 08 Apr 2019

When computing climatological averages of atmospheric trace-gas mixing ratios
obtained from satellite-based measurements, sampling biases arise if data
coverage is not uniform in space and time. Homogeneous spatiotemporal
coverage is essentially impossible to achieve. Solar occultation
measurements, by virtue of satellite orbit and the requirement of direct
observation of the sun through the atmosphere, result in particularly sparse
spatial coverage. In this proof-of-concept study, a method is presented to
adjust for such sampling biases when calculating climatological means. The
method is demonstrated using carbonyl sulfide (OCS) measurements at 16 km
altitude from the ACE-FTS (Atmospheric Chemistry Experiment Fourier Transform
Spectrometer). At this altitude, OCS mixing ratios show a steep gradient
between the poles and Equator. ACE-FTS measurements, which are provided as
vertically resolved profiles, and integrated stratospheric OCS columns are
used in this study. The bias adjustment procedure requires no additional
information other than the satellite data product itself. In particular, the
method does not rely on atmospheric models with potentially unreliable
transport or chemistry parameterizations, and the results can be used
uncompromised to test and validate such models. It is expected to be
generally applicable when constructing climatologies of long-lived tracers
from sparsely and heterogeneously sampled satellite measurements. In the
first step of the adjustment procedure, a regression model is used to fit a
2-D surface to all available ACE-FTS OCS measurements as a function of
day-of-year and latitude. The regression model fit is used to calculate an
adjustment factor that is then used to adjust each measurement individually.
The mean of the adjusted measurement points of a chosen latitude range and
season is then used as the bias-free climatological value. When applying the
adjustment factor to seasonal averages in 30^{∘} zones, the maximum
spatiotemporal sampling bias adjustment was 11 % for OCS mixing ratios at
16 km and 5 % for the stratospheric OCS column. The adjustments were
validated against the much denser and more homogeneous OCS data product from
the limb-sounding MIPAS (Michelson Interferometer for Passive Atmospheric
Sounding) instrument: both the direction and magnitude of the sampling bias estimated from MIPAS agreed with the adjustment applied to
the ACE-FTS data.

Creating climatologies of atmospheric trace-gas concentrations from satellite-based measurements is usually done by collecting available observations into latitudinal and monthly or seasonal bins and calculating the respective averages (e.g., Jones et al., 2012 and Koo et al., 2017, who compiled comprehensive trace-gas climatologies from Atmospheric Chemistry Experiment Fourier Transform Spectrometer, ACE-FTS, observations). For such methods, an evenly distributed coverage with no significant measurement gaps is desirable to avoid introducing sampling biases when calculating climatological means. Satellite-based instruments, however, perform measurements only on distinct orbits, leaving spatiotemporal measurement gaps. This inhomogeneous sampling in space and time can introduce significant biases when calculating climatological averages (Aghedo et al., 2011; Toohey et al., 2013) if they are calculated in the traditional way. The magnitude of the sampling bias depends on the frequency spectrum of the spatial and temporal structure to be averaged. The bias can become particularly large when analyzing data from solar occultation instruments that typically provide two measurements per orbit, leading to sparse and spatially structured data coverage. The annual solar occultation sampling pattern of ACE-FTS is shown in Fig. 1a.

Recent studies (Aghedo et al., 2011; Sofieva et al., 2014; Toohey et al.,
2013; Millán et al., 2016) have investigated the effects of sampling
biases for various satellite data products. Toohey et al. (2013) quantified
the sampling bias for a number of satellites measuring ozone and water vapor.
Depending on the trace gas, pressure level and latitude, they frequently
found sampling biases as high as 20 % and, in some cases, biases as high
as 40 % in regions with steep spatial and/or temporal gradients, such as
in the vicinity of the polar vortex in both hemispheres. In an effort to
quantify long-term trends in stratospheric ozone between 60^{∘} N and
60^{∘} S, Damadeo et al. (2018) used a regression model (described in
Damadeo et al., 2014) to estimate the sampling biases of several solar
occultation instruments. They found that these biases lead to about 1 %
per decade absolute percentage differences in derived ozone trends. A common
attribute of all previous methods used to estimate the sampling bias is that
they either use additional or multiple data products or atmospheric models
that use a priori knowledge of atmospheric transport and chemistry.

Here, we present a novel approach to adjust measurements to mitigate spatiotemporal sampling biases in climatological averages of carbonyl sulfide (OCS) measured by the solar occultation instrument ACE-FTS. The method does not employ dynamical or chemical atmospheric models (e.g., chemistry transport models, CTMs) that may reflect inaccurate or incomplete understanding of the underlying processes. This approach thus allows the uncompromised application of the adjusted data product to test and validate such models.

The approach is suitable to be used on measurements with a seasonal cycle that is smooth enough to be represented by a low-order expansion in Fourier series. Motivated by efforts to quantify the stratospheric burden of OCS from ACE-FTS observations (Kloss, 2017), we use OCS measurements from ACE-FTS. We introduce these measurements in Sect. 2, together with OCS measurements from Envisat–MIPAS that will be used to evaluate our method. Section 3 describes the method developed to estimate and adjust for spatiotemporal sampling biases in detail, which is then evaluated using the much denser and more homogeneous MIPAS data set in Sect. 4. Limitations of the method and its applicability to other tracers and regimes are discussed in Sect. 5.

## 2.1 ACE-FTS OCS observations

ACE-FTS is an infrared solar occultation spectrometer on the Canadian
satellite SCISAT and has been delivering data since 2004 (Bernath
et al., 2005). It measures in the spectral region from 750 to 4400 cm^{−1}
(2.2 to 13.3 µm) with a spectral resolution of 0.02 cm^{−1}. From
these data, mixing ratio values are derived for over 30 trace gases, together
with temperature and pressure in selected altitude regions. As a solar
occultation spectrometer, ACE-FTS retrieves only 30 profiles per day (two
per orbit, at sunrise and sunset, with orbits spaced about 24^{∘}
longitude apart) and thus exhibits significant data gaps in specific
regions, as shown in Fig. 1a. Measurements of the solar spectrum are made
at tangent altitudes from 150 km down to 5 km (or cloud top) at a vertical
resolution of 3 to 4 km. OCS mixing ratios are retrieved up to about 30 km altitude, above which the concentration typically drops below the detection
limit.

Here, we use version 3.6 ACE-FTS OCS volume mixing ratio measurements between
February 2004 and September 2016 (Boone et al., 2005; Boone, 2013), retrieved
from microwindows in the range 2036 to
2056 cm^{−1}. The average fitting error for OCS is a statistical error
for the retrieval from the fitting process and is between 1 % and 3 %
for the period considered here. A detailed analysis of OCS from ACE-FTS
version 2.2 is presented in Barkley et al. (2008). Stratospheric OCS columns
are calculated by vertically integrating concentration profiles from the
dynamical tropopause to the top of the retrieved OCS profiles, where mixing
ratios decrease to zero. The dynamical tropopause is defined as 380 K
potential temperature in the tropics and 3.5 PV units at latitudes poleward
of 30^{∘} and is calculated from ECMWF ERA-Interim data (Dee et al.,
2011).
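
The column integration described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the retrieval or analysis code used in the study: it assumes pressure and temperature are available on the profile levels, converts mixing ratio to OCS mass density with the ideal gas law, and applies the trapezoidal rule above a supplied tropopause altitude. The function name `stratospheric_column` and the toy isothermal profile are hypothetical.

```python
import numpy as np

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol
M_OCS = 60.07e-3      # molar mass of OCS, kg/mol


def stratospheric_column(z_km, vmr, p_pa, t_k, z_trop_km):
    """Partial OCS column above the tropopause, in kg per km^2.

    z_km: altitudes in km (ascending); vmr: OCS mole fraction (4e-10 = 400 pptv);
    p_pa, t_k: pressure (Pa) and temperature (K) on the same levels.
    """
    z_km, vmr, p_pa, t_k = (np.asarray(a, dtype=float)
                            for a in (z_km, vmr, p_pa, t_k))
    n_air = p_pa / (K_B * t_k)              # air number density, molecules per m^3
    rho_ocs = vmr * n_air * M_OCS / N_A     # OCS mass density, kg per m^3
    keep = z_km >= z_trop_km                # levels at or above the tropopause
    z_m = z_km[keep] * 1.0e3
    rho = rho_ocs[keep]
    # Trapezoidal rule, written out explicitly
    column_kg_m2 = np.sum(0.5 * (rho[1:] + rho[:-1]) * np.diff(z_m))
    return column_kg_m2 * 1.0e6             # 1 km^2 = 1e6 m^2


# Toy isothermal profile (hypothetical values, for illustration only)
z = np.linspace(0.0, 60.0, 601)
p = 101325.0 * np.exp(-z / 6.4)
t = np.full_like(z, 220.0)
x = np.full_like(z, 4.0e-10)
col = stratospheric_column(z, x, p, t, z_trop_km=9.95)
```

Because air density falls off roughly exponentially with height, most of the integral comes from the first few kilometers above the tropopause, so the result is sensitive to the tropopause altitude passed in.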

When calculating climatological means of atmospheric trace-gas mixing ratios
at a given altitude, missing data over large parts of a region of interest
do not automatically prohibit climatological averaging: an average can
theoretically be created from a single data point, even though it may not
be very representative of the true mean over the chosen spatiotemporal
regime. In contrast, when calculating the stratospheric OCS burden over a
particular latitude band and season, data coverage is critical irrespective
of sampling bias because data have to be gridded and added up rather than
being averaged. In our study, partial OCS columns are accumulated into
1^{∘} latitude bands over the chosen time period (e.g., one season:
either DJF, MAM, JJA or SON), and if there is more than one partial column in any
bin, the mean is calculated. When adding up the burden for the chosen
period, all 1^{∘} latitude bands have to contain realistic numbers,
which is rarely the case with the sparse ACE-FTS sampling pattern.
Therefore, bands with no profiles are either linearly interpolated from
adjacent latitude bands or, close to the poles, linearly extrapolated from
the two bands closest to the respective pole. If the gradient over the two
bands closest to the poles is approximately representative for the gradient
all the way to the pole, this procedure already accounts for potential
sampling biases in a simplified way.
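
The binning and gap-filling steps above can be sketched as follows. This is an illustrative reading of the description rather than the code used in the study: the function name `fill_latitude_bands` is hypothetical, band centers at half degrees are an assumption, and the final summation into a burden (which would need band areas) is left out.

```python
import numpy as np


def fill_latitude_bands(lat, column):
    """Average partial columns in 1-degree bands and fill empty bands.

    Empty bands between populated ones are linearly interpolated; bands
    poleward of the last populated band are linearly extrapolated from the
    two populated bands closest to that pole.
    """
    lat = np.asarray(lat, dtype=float)
    column = np.asarray(column, dtype=float)
    centers = np.arange(-89.5, 90.0, 1.0)                   # 180 band centers
    idx = np.clip(np.floor(lat + 90.0).astype(int), 0, 179)
    sums = np.bincount(idx, weights=column, minlength=180)
    counts = np.bincount(idx, minlength=180)
    populated = counts > 0
    means = np.full(180, np.nan)
    means[populated] = sums[populated] / counts[populated]  # mean per band

    have = np.flatnonzero(populated)
    if have.size < 2:
        raise ValueError("need at least two populated bands")
    # Linear interpolation between populated bands (edges held constant for now)
    out = np.interp(centers, centers[have], means[have])
    # Overwrite the constant edges with linear extrapolation toward each pole
    for edge, pair in ((centers < centers[have[0]], have[:2]),
                       (centers > centers[have[-1]], have[-2:])):
        x0, x1 = centers[pair]
        y0, y1 = means[pair]
        out[edge] = y0 + (y1 - y0) * (centers[edge] - x0) / (x1 - x0)
    return centers, out
```

The extrapolation is only as good as the assumption stated in the text, namely that the gradient across the two bands nearest the pole continues all the way to the pole.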

## 2.2 OCS observations by Envisat–MIPAS

The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) is a
mid-infrared spectrometer on board the ESA (European Space Agency) satellite
ENVISAT. It is a limb-sounding instrument analyzing the spectral radiance
emitted by atmospheric trace gases. From its sun-synchronous polar orbit,
MIPAS measures vertical profiles of multiple trace gases, including OCS. From
2002 to 2012 MIPAS operated in the spectral region between 685 and
2410 cm^{−1} (4.1–14.6 µm), at a resolution of 0.025 cm^{−1}
until 2004 and then at 0.065 cm^{−1} from 2005 onwards (Fischer et al.,
2008). The vertical sampling is around 3 km in the altitude range from about
5 to 150 km above the clouds. With a horizontal sampling of about 400 to
500 km along its orbit, MIPAS measured 1000 vertical profiles per day from
2002 to 2004 and 1400 between 2005 and 2012, covering almost all latitudes
from 88^{∘} S to 88^{∘} N. This is about 40 times as many
profiles as can be provided by ACE-FTS. OCS profiles are retrieved in
spectral windows between 839 and 876 cm^{−1} (Glatthor et al., 2015,
2017). The retrieval uncertainty for a single OCS scan is estimated to be
10 % between 10 and 15 km, 26 % at 20 km and increasing up to
195 % at 40 km altitude (Glatthor et al., 2015).

## 2.3 A regression model representation of the OCS field

Adjusting for spatiotemporal sampling biases requires some description of
the gap-free field. The field could be obtained, for example, from
CTM output, or, as mentioned above, from a satellite
data set providing higher spatial and temporal sampling. In this study, we
use the sparse data themselves to create a gap-free OCS field through the
application of a regression model fit. The regression model is used to fit a
continuous, smooth 2-D (time and latitude) surface either to OCS mixing
ratios at a given altitude or to fields of OCS partial columns. In a general
form with OCS represented by *X*, the regression model is as follows:

$$X(d,\mathrm{lat}) = a_{0} + \sum_{k=1}^{N}\left[a_{2k-1}\sin\left(\frac{2\pi k d}{365.25}\right) + a_{2k}\cos\left(\frac{2\pi k d}{365.25}\right)\right], \tag{1}$$

where the Fourier expansion in *N* accounts for the annual cycle in the
compound of interest and *d* is the day of the year. To accommodate the
latitudinal structure in OCS, each of the *a*_{i} coefficients is expanded
in a Legendre series of index *M*. Values for *N* and *M* must be carefully selected
to capture as much of the latitudinal and seasonal structure in OCS as
possible but must also avoid overfitting. For OCS, optimal fits were found
for *N*=1 and *M*=4, resulting in a total of 15 fit coefficients. The output
of Eq. (1), *X*_{est}, is visualized in Fig. 1b.
Using fewer coefficients did not represent the OCS variability
sufficiently, while using more coefficients produced minima and maxima that
are not observed in the ACE-FTS data, a sign of overfitting.
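
A fit of this kind can be set up as an ordinary linear least-squares problem. The sketch below is one possible implementation, not the regression code used in the study: the function names are hypothetical, the year length of 365.25 days and the use of sin(latitude) as the Legendre argument are assumptions, and all coefficients are solved for simultaneously with `numpy.linalg.lstsq`.

```python
import numpy as np
from numpy.polynomial import legendre


def design_matrix(doy, lat_deg, n_fourier=1, m_legendre=4):
    """Columns are products of Fourier terms in day-of-year and Legendre
    polynomials in sin(latitude): (2N + 1) * (M + 1) columns in total."""
    doy = np.atleast_1d(np.asarray(doy, dtype=float))
    lat_deg = np.atleast_1d(np.asarray(lat_deg, dtype=float))
    phase = 2.0 * np.pi * doy / 365.25          # assumed year length
    terms = [np.ones_like(doy)]
    for k in range(1, n_fourier + 1):
        terms += [np.sin(k * phase), np.cos(k * phase)]
    fourier = np.column_stack(terms)                                   # (n, 2N+1)
    leg = legendre.legvander(np.sin(np.deg2rad(lat_deg)), m_legendre)  # (n, M+1)
    # Every Fourier term multiplied by every Legendre term
    return (fourier[:, :, None] * leg[:, None, :]).reshape(doy.shape[0], -1)


def fit_field(doy, lat_deg, values, **kwargs):
    """Least-squares estimate of all coefficients at once."""
    a = design_matrix(doy, lat_deg, **kwargs)
    coeffs, *_ = np.linalg.lstsq(a, np.asarray(values, dtype=float), rcond=None)
    return coeffs


def evaluate_field(coeffs, doy, lat_deg, **kwargs):
    """Evaluate the fitted surface at arbitrary days and latitudes."""
    return design_matrix(doy, lat_deg, **kwargs) @ coeffs
```

With the defaults `n_fourier=1` and `m_legendre=4` the design matrix has 3 × 5 = 15 columns, matching the number of fit coefficients quoted in the text.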

A total of 12.5 years of ACE-FTS OCS mixing ratios at 16 km altitude are
used by the regression model to obtain the 15 fit coefficients (see Fig. 1a). A different set of fit coefficients is obtained from the regression
model when it is fitted to the stratospheric partial columns. Note that
because the regression model provides a value for any arbitrary latitude and
day of the year, it meets the “continuous” requirement for *X*_{est}. The
extent to which the regression model can capture the true underlying
morphology of the latitude vs. time OCS field depends on the OCS measurement
coverage: with too many gaps in the measurements, the regression
model must be restricted to lower *N* and *M* expansions to avoid overfitting in areas of low
data coverage and may then fail to capture subtleties in the OCS field.
As a solar occultation spectrometer with only 30 measurements
per day, ACE-FTS exhibits significant data gaps in specific regions (as seen
in Fig. 1a) that restrict the expansions in
Eq. (1) to *N*=1 and *M*=4.

This Fourier–Legendre fit only reflects the variability in the data with latitude and season that recurs every year. Using the entire 12.5-year data record for the fit yields the most robust result for this purpose. Any additional variability in the spatiotemporal pattern, such as single events, trends, or the impact of El Niño or the quasi-biennial oscillation, is conserved; i.e., it will not be removed by the sampling bias correction. Such variability might be removed if the fit were applied to each year individually.

The estimated regression fits for OCS mixing ratios at a given altitude or OCS partial columns describe the climatological and global state of OCS valid for the 12.5 years of available ACE-FTS observations. The coefficients for the regression fit are calculated by minimizing the sum of the squared differences between the original data (here the ACE-FTS observations) and the complete regression fit. This step in the regression is optimized by minimizing the differences simultaneously with respect to all coefficients used for the Fourier and Legendre expansions.

The regression model fit, together with its uncertainties, is therefore the best representation of the ACE-FTS measurements given the information provided (the original measurements and the chosen orders of the Fourier and Legendre expansions). As a result of the fitting process, each fit coefficient has an associated uncertainty. Estimating the effect of these coefficient uncertainties on the determined sampling biases (see Sect. 3) would require bootstrapping techniques to create many different realizations of the determined OCS climatologies.

Using the gap-free field as described in Sect. 2.3 (*X*_{est}), adjusted
values can then be calculated as follows:

$$X_{\mathrm{adj}} = X_{\mathrm{orig}} \times \frac{\overline{X_{\mathrm{est}}}}{X_{\mathrm{est}}(\mathrm{lat},t)}, \tag{2}$$

where *X*_{adj} is the OCS value adjusted for its representativeness of the
temporal–zonal mean, *X*_{orig} is the unadjusted OCS measurement,
$\overline{X_{\mathrm{est}}}$ is an estimate of the true OCS temporal–zonal mean, and
*X*_{est}(lat,*t*) is the estimated OCS concentration at the location (note
that only the latitude information affects the *X*_{est} calculated by
Eq. 1) and time of the actual OCS measurement, sampled from the same
source as $\overline{X_{\mathrm{est}}}$. *t* in Eq. (2) only represents season (day of
year), so there is only one combination of *X*_{est}(lat,*t*) and
$\overline{X_{\mathrm{est}}}$ for any particular day of the year and latitude. This is used to
adjust corresponding data points in every single year of the data set. Note
that because the regression model provides a value for any arbitrary
latitude and day of the year, it meets the continuous requirement for
*X*_{est}. *X*_{est} does not have to be quantitatively correct – any biases
divide out in Eq. (2). There are several options for obtaining
*X*_{est}. The only prerequisites are that the *X*_{est} field represents the
true underlying temporal and spatial morphology of the OCS field (though, as
pointed out above, the values themselves do not need to be exact), and it
needs to be continuous in so far as spatiotemporal means can be calculated
from the *X*_{est} field without any spatiotemporal sampling gaps. The
procedure for adjusting the sampling bias when calculating an average mixing
ratio for a defined region over a given time period is illustrated in
Fig. 1. As examples, the method is explained in
detail for two representative latitude–time boxes: one at 30–60^{∘} N for JJA (red box in Fig. 1a–c) and one for 60 to
90^{∘} S for DJF (black box in Fig. 1a–c).
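
The adjustment step can be illustrated with a toy field. The sketch below is an assumption-laden example rather than the study's implementation: `box_mean`, `adjust` and `toy_field` are hypothetical names, the box mean is taken as a plain average over a regular latitude and day grid (the text notes it can be obtained analytically from the fit), and the field is an invented linear decrease toward the pole.

```python
import numpy as np


def box_mean(x_est, lat_range, doy_values, n_lat=181):
    """Mean of the gap-free field over a latitude-time box (regular grid average)."""
    lats = np.linspace(lat_range[0], lat_range[1], n_lat)
    dd, ll = np.meshgrid(np.asarray(doy_values, dtype=float), lats, indexing="ij")
    return float(np.mean(x_est(dd, ll)))


def adjust(x_orig, doy, lat, x_est, lat_range, doy_values):
    """Scale each measurement by (box-mean field) / (field at the measurement)."""
    x_bar = box_mean(x_est, lat_range, doy_values)
    local = x_est(np.asarray(doy, dtype=float), np.asarray(lat, dtype=float))
    return np.asarray(x_orig, dtype=float) * x_bar / local


def toy_field(doy, lat):
    """Hypothetical field in pptv: decreases toward the poles, no seasonal cycle."""
    return 400.0 - 2.0 * np.abs(lat) + 0.0 * doy


djf = np.concatenate([np.arange(335, 366), np.arange(1, 60)])
# Samples exist only at the equatorward edge of the 60-90 S box
lat_obs = np.linspace(-68.0, -60.0, 9)
doy_obs = np.full(lat_obs.shape, 350.0)
x_obs = 0.9 * toy_field(doy_obs, lat_obs)   # measurements with a 10 % low offset

x_adj = adjust(x_obs, doy_obs, lat_obs, toy_field, (-90.0, -60.0), djf)
print(x_obs.mean(), x_adj.mean())
```

In this toy case the raw mean of the samples is about 244.8 pptv, while the adjusted mean is 225 pptv, which is 0.9 times the field's box mean of 250 pptv. The 10 % offset between measurements and field survives the adjustment, illustrating that the field only needs the right shape, not the right absolute values.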

Figure 1a shows the OCS mixing ratio values from
12.5 years of ACE-FTS observations as a function of latitude and
time of year. The small year-to-year shifts in the latitudinal coverage of
ACE-FTS cause small offsets between the traces for individual years seen in
Fig. 1a. The red and black boxes in Fig. 1
indicate the selected time and latitude frames used to demonstrate the
application of this method. The boxes were chosen as examples for the
highest (red box) and lowest (black box) ACE-FTS latitude coverage. The
climatological mean OCS pattern, represented as the regression model fit to
the 12.5 years of ACE-FTS measurements, as a function of latitude and
season, is shown in Fig. 1b. Figure 2 shows the
same for the OCS stratospheric columns. Values for $\overline{X_{\mathrm{est}}}$ for the
two example spatiotemporal means, indicated by the red (JJA, 30–60^{∘} N) and black (DJF, 60–90^{∘} S) boxes in
Fig. 1, can be calculated analytically without any spatiotemporal
sampling bias from the regression model fit.

ACE-FTS data (*X*_{orig}) for 2010 are shown in
Fig. 1c. OCS mixing ratios from the regression
model at the same latitudes and times as *X*_{orig} provide
*X*_{est}(lat,*t*), allowing the original data to be adjusted using Eq. (2). The
advantage of applying Eq. (2) rather than simply using
$\overline{X_{\mathrm{est}}}$ as the zonal seasonal mean is that trends and
year-to-year variability observed in the data set are conserved. Equation (2) adjusts each measurement to be more indicative of the zonal seasonal
mean. Figure 1d shows the adjusted ACE-FTS data set
for the example of the red box in Fig. 1c. These data points, now adjusted
for their representativeness of the zonal seasonal mean, can then be used to
calculate a better estimate of the true zonal seasonal mean for the temporal
and spatial domain of the red box. It should be noted that only derived
averages are adjusted and not the individual data points. The average values
should be more representative for the mean of the compound within each
chosen box than without applying the adjustment method. The adjustment
should not be applied to individual data points for any other purpose. Clearly, the sampling bias is a systematic error type that only arises when
deriving spatiotemporal averages and it does not impair the quality of
individual data points at a particular location and time.

## 4.1 Case study results

As seen in Fig. 1a and shown in Barkley et al. (2008), OCS mixing ratios at a specific altitude (here 16 km) decrease with increasing latitude. The stratospheric partial column distribution, shown in Fig. 2, is quite different. Because both pressure and OCS mixing ratios rapidly decrease with height above the tropopause, the major fraction of the stratospheric OCS column resides in the few kilometers just above the tropopause, and thus the significant decrease in tropopause height with latitude leads to lower partial columns in the tropics and higher values closer to the poles. For the same reason, the annual cycle and day-to-day variability of the dynamical tropopause, rather than the annual cycle in OCS mixing ratios, largely controls the temporal variability of the stratospheric OCS partial columns, resulting in a more variable stratospheric OCS partial column field compared to the mixing ratio distribution shown in Fig. 1a, potentially confounding the adjustment procedure.

Figure 3 shows the frequency distribution of
ACE-FTS OCS measurements at 16 km from 2004 to 2016 for the two chosen
latitude bands and time regions. The green histograms show the distribution
of the original measurements and the blue histograms show the distribution
of the adjusted measurements using Eq. (2). Here, all individual
measurements are adjusted for biases in the zonal seasonal mean. The shifts
in the mean values and contraction of the standard deviations provide useful
summary metrics of the effects of the applied spatiotemporal sampling bias
adjustments. The distribution of all 12 years of data between 60 and 90^{∘} S in the southern hemispheric summer (DJF) is shown in
Fig. 3a. This example was chosen because it
displays the highest shift of 28 pptv or 11 % in the mean OCS mixing
ratios after applying the adjustment. The decrease in the mean value from
293 to 265 pptv in the latitude band from 60 to 90^{∘} S
can be explained by the fact that there are large measurement gaps at the
southernmost latitudes, especially in DJF, and no measurements between
85 and 90^{∘} S. Decreasing mixing ratios towards the
poles, and measurement gaps where lower mixing ratios are expected, lead to
a high-biased mean over the chosen box (black box in
Fig. 1a) when only averaging the available
measurements. The true mean over the entire box is expected to be lower than
the mean of only the available data. Thus, the shift of the mean to a lower
value seen in Fig. 3a qualitatively represents an
adjustment of the simple data average towards the true mean of OCS mixing
ratios over the entire box, and therefore at least a partial remedy for the
sampling bias. Because Eq. (2) generally shifts each data point towards
the mean of the distribution, the standard deviation of the adjusted data
will be lower than the standard deviation of the original data set. This is
because in the original data set both measurement uncertainties and actual
variability inside the considered box add on to the resulting standard
deviation. Note that the observed reduction in the standard deviation (8 pptv in our black box example) reflects neither a reduction of the
statistical uncertainty associated with the derived mean nor a reduced
variability over the entire box compared to only the available data. In
fact, if actual observations covering the entire box were available, then
their standard deviation would most likely be higher than that of the
limited data because values would vary over a wider range of mixing ratios.
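
The narrowing of the histogram can be made explicit with a short worked equation. As an illustrative assumption (not stated in the source), write each measurement as the regression value at its location and time multiplied by a relative deviation $\varepsilon_i$ that collects measurement noise and variability not captured by the fit. Applying the ratio form of Eq. (2) then gives:

$$X_{\mathrm{orig},i} = X_{\mathrm{est}}(\mathrm{lat}_i,t_i)\,(1+\varepsilon_i) \quad\Longrightarrow\quad X_{\mathrm{adj},i} = X_{\mathrm{orig},i}\,\frac{\overline{X_{\mathrm{est}}}}{X_{\mathrm{est}}(\mathrm{lat}_i,t_i)} = \overline{X_{\mathrm{est}}}\,(1+\varepsilon_i)$$

Under this assumption the adjusted values scatter around $\overline{X_{\mathrm{est}}}$ only through $\varepsilon_i$, whereas the original values additionally carry the spread of $X_{\mathrm{est}}$ across the sampled latitudes and days. Removing that second contribution is what typically narrows the distribution; it does not imply a smaller uncertainty of the derived mean.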

The histograms in Fig. 3b show the data
distribution for the red box in Fig. 1, i.e.,
between 30 and 60^{∘} N in northern hemispheric summer
(JJA). Here, the adjustment method yields only a small shift in the average
of 6 pptv (1.5 %) because the entire chosen latitude range is covered by
ACE-FTS measurements, which are therefore much more representative of the
true mean value of the entire box compared to the previous example. For the
red box, the original measurement values are more or less evenly distributed
around the regression model mean, and Eq. (2) shifts data towards the
mean from both sides. Consequently, the reduction in the standard deviation
by 32 % is larger than in the previous example.

To assess whether the methodology quantitatively adjusts the sampling bias, a validation against an independent data set was performed and is described in the following section.

## 4.2 A quantitative evaluation using MIPAS observations

To quantify the sampling bias arising from the sparse ACE-FTS sampling for a
chosen latitude–time box, the OCS data product from the MIPAS instrument,
with its much denser data coverage, is used. Because of the dense sampling
pattern and almost complete latitude coverage (down to 88^{∘} S), the
sampling bias of MIPAS is negligible compared to that of ACE-FTS.
Figure 4 visualizes how much denser the MIPAS
sampling is compared to that of ACE-FTS globally (Fig. 4a) and in a
chosen latitude–time box (Fig. 4b). Figure 4a shows that both seasonal
evolution and latitudinal variability of OCS mixing ratios at 16 km altitude
are much better resolved by MIPAS than by ACE-FTS (cf. Fig. 1a).
Overall, seasonality and mixing ratio distribution agree well with the
regression model output in Fig. 1b. Naturally, the regression tends to
produce smoother gradients than the denser observations. For example, the observed
sharp decline in OCS at the southernmost latitudes in June (Fig. 4a) is
smeared out in the regression (Fig. 1b). A notable difference between the
MIPAS observations and the regression output is present at lower latitudes:
while MIPAS OCS shows maximum values in the subtropics around 30^{∘}
in both hemispheres and a moderate local minimum in the tropics, the
regression places the maximum close to the Equator (with some seasonal
variance) and shows decreasing OCS with latitude over all latitude ranges.
The regression clearly inherits this behavior from the individual
ACE traces shown in Fig. 1a, so this appears to be an instrumental
difference between the MIPAS and ACE-FTS OCS data products. A systematic
difference of 75 to 100 ppt lower OCS observed by ACE-FTS compared to MIPAS
in the 14 to 20 km altitude region has been noted by Glatthor et al. (2017).

For the best possible quantitative evaluation, the spatiotemporal box in
the ACE-FTS measurements with the lowest ACE-FTS coverage (Fig. 4b) and
the highest observed sampling bias is chosen: December 2009–February
2010, 60 to 90^{∘} S (i.e., the black box in
Fig. 1). We compare the average of all MIPAS
observations in a particular box to the average of only those MIPAS
observations that are roughly equivalent in space and time to the available
ACE-FTS observations in that box (i.e., only MIPAS measurements from 1 December 2009 to 5 January 2010 between 60 and
68^{∘} S are used). Comparing all ACE-FTS and MIPAS measurement
points between December 2009 and February 2010 in Fig. 4b again shows how
much denser the MIPAS sampling is compared to ACE-FTS. Like Glatthor
et al. (2017), we also find the ACE-FTS mean value between 60
and 90^{∘} S to be 115 ppt (28 %) lower than the mean value of
MIPAS. Therefore, relative rather than absolute mixing ratio differences are
used to quantitatively describe the sampling bias in the comparison below.

Using the chosen spatiotemporal box (black box in
Fig. 1), we show in
Fig. 5 histograms of the relative frequency
distributions of all MIPAS OCS mixing ratios at 16 km observed between
60 and 90^{∘} S in DJF 2009/10 and of only those MIPAS
observations roughly covering the ACE-FTS sampling locations in that
particular box (i.e., only MIPAS measurements from 1 December 2009
to 5 January 2010 between 60 and 68^{∘} S are
used). The histograms in Figs. 3a and
5 look similar in terms of shape and
relative position. When we compare the two histograms in
Fig. 5 it becomes apparent that extending the
sampling space over the entire box (down to 88^{∘} S) changes the
distribution by adding additional lower mixing ratio values that were
measured at the southernmost latitudes. The difference between the mean
values of both histograms is 46 pptv, equivalent to a relative deviation of
about 11 %, with the average of the full data set being lower. Thus, the
difference has the same direction and magnitude as the shift in mean value
when using the adjusted ACE-FTS data compared to the original ACE-FTS data
(Fig. 3a). For this example, the performed
sampling bias adjustment of the climatological mean from ACE-FTS data
appears to work not only qualitatively but also quantitatively.
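
The comparison between the full and the subsampled dense data set can be sketched as follows. This is a hypothetical helper, not the evaluation code used in the study: the name `relative_sampling_bias` is invented, the subsample is selected by a simple latitude range and a set of days, and the choice to normalize by the subsample mean is an assumption, since the text does not state which mean the percentage refers to.

```python
import numpy as np


def relative_sampling_bias(values, lat, doy, sub_lat_range, sub_doy):
    """Relative difference between the mean of a subsample (mimicking the
    sparse instrument's coverage) and the mean of all data in the box.

    Positive values mean the subsample overestimates the full-box mean.
    """
    values, lat, doy = (np.asarray(a, dtype=float) for a in (values, lat, doy))
    in_sub = ((lat >= sub_lat_range[0]) & (lat <= sub_lat_range[1])
              & np.isin(doy, sub_doy))
    if not in_sub.any():
        raise ValueError("subsample is empty")
    full_mean = values.mean()
    sub_mean = values[in_sub].mean()
    return (sub_mean - full_mean) / sub_mean    # normalization is an assumption


# Toy dense data set (hypothetical): field decreasing toward the South Pole
lat_all = np.tile(np.arange(-88.0, -59.0), 2)               # 88 S to 60 S, two days
doy_all = np.repeat([340.0, 40.0], 29)
val_all = 400.0 + 2.0 * lat_all                             # pptv
bias = relative_sampling_bias(val_all, lat_all, doy_all, (-68.0, -60.0), [340])
```

A positive result from such a comparison corresponds to the situation described in the text, where restricting the dense data to the sparse sampling locations yields a higher mean than the full box.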

## 4.3 Significance

To investigate the scientific relevance and applicability of the proposed sampling bias adjustment, climatologies for the seasonal stratospheric OCS columns and OCS mixing ratios at 16 km altitude are calculated with and without sampling bias adjustments.

Due to the satellite orbit, ACE-FTS does not measure in the latitude ranges
85–90^{∘} N and 85–90^{∘} S,
which can lead to a higher sampling bias close to the poles compared to the
tropics and midlatitudes, where almost all latitudes are covered within each
season. Additionally, OCS mixing ratios exhibit lower stratospheric
variability in the tropics. Therefore, the sampling bias is higher towards
the poles and lower in the tropics. For the majority of points, from
60^{∘} N to 60^{∘} S (see Fig. 1), the modifications made
using Eq. (2) have only a minimal effect and are within the measurement
uncertainty calculated using the ACE-FTS error estimates
(see Toohey et al., 2010, for
details on ACE-FTS error estimation).

The largest difference between the seasonal mean calculated using original
OCS measurements and the seasonal mean calculated using the adjusted OCS
measurements occurs in the latitude band 60–90^{∘} S.
Figure 6 shows the seasonal mean of the
stratospheric column (top) and of mixing ratios at 16 km (middle) for this
latitude band as calculated from the adjusted data set in red and the
original ACE-FTS measurements in blue as well as the MIPAS mixing ratio
equivalents (bottom). Due to the lower spatial coverage before 2008, only
MIPAS data between 2008 and 2011 are considered. The relative difference
between the mean values from the original and adjusted data set varies
between 0.1 % and 5.1 % for the stratospheric columns (for the 5.1 %
difference, 1.29 kg km^{−2} instead of 1.36 kg km^{−2})
and between 2 % and
28 % for OCS mixing ratios at 16 km. The largest adjustment of 28 % was
observed in SON 2011, and, unlike in virtually all other years and seasons,
the mean mixing ratio was adjusted upwards from 195 to 250 ppt. In 2011,
the sampling of the 60–90^{∘} S latitude band in the
SON (shown in Fig. 7c) was even sparser than in all other years
(Fig. 7b), and the few valid data points are all located at the high-latitude edge of the region where the regression model predicts the lowest OCS
mixing ratios (Fig. 7a). In addition, the OCS mixing ratios that were
actually measured in SON were significantly higher in 2011 than in other
years (compare Fig. 7b and c). The cause of these elevated OCS mixing
ratios is currently unclear. The important thing to note in the context of
our sampling bias correction is that the anomaly contained in the original
data is conserved in the adjusted mean.
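
The relative differences quoted above can be reproduced with a short calculation. The following is a minimal sketch, assuming that the difference is expressed relative to the mean of the original (unadjusted) data; the text does not state the reference value explicitly, and the function name is purely illustrative.

```python
def relative_difference_percent(original, adjusted):
    """Absolute difference between the two means, in percent of the original mean.

    Assumption: the original (unadjusted) mean is the reference value.
    """
    return abs(adjusted - original) / original * 100.0


# Stratospheric column, case with the largest adjustment (kg km^-2):
# original 1.36, adjusted 1.29.
column_change = relative_difference_percent(1.36, 1.29)

# Mixing ratio at 16 km in SON 2011 (ppt): original 195, adjusted 250.
mixing_ratio_change = relative_difference_percent(195.0, 250.0)

print(f"stratospheric column: {column_change:.1f} %")     # 5.1 %
print(f"mixing ratio, 16 km:  {mixing_ratio_change:.1f} %")  # 28.2 %
```

With the original mean as reference, both values agree with the 5.1 % and 28 % given in the text; taking the adjusted mean as reference instead would give about 5.4 % and 22 %.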

As described in Sect. 2.1, the procedure for the OCS stratospheric column
integration already reduces the sampling bias by extrapolating OCS data into
empty latitude bands. As a consequence, the sampling bias adjustment for the
stratospheric burden is lower than for the mixing ratios. In this particular
case (Fig. 6), there is a marginal impact on the amplitude of the seasonal
cycle as the adjustment most significantly reduces the austral summer OCS
maximum at 16 km in virtually all years. No significant trends are apparent
in either the original or adjusted data ($-1.9 \pm 2.3 \times 10^{-3}$ kg km^{−2} per year for the ACE-FTS stratospheric column and
$-4.3 \pm 2.5 \times 10^{-3}$ kg km^{−2} per year for the corrected column;
$-0.2 \pm 0.8$ ppt per year for the ACE-FTS mixing ratios and $1.1 \pm 0.9$ ppt per year for the corrected mixing ratios; $0.05 \pm 0.43$ ppt per year
for the MIPAS equivalent chosen according to the ACE-FTS sampling and
$0.01 \pm 0.46$ ppt per year for the full mixing ratio data set).
Theoretically, if a sparse sampling pattern reoccurs each year (as for ACE),
then the sampling bias does not affect long-term (seasonal) trends but does affect
absolute climatological averages (such as the total burden). Trends related
to dynamic changes in one particular region and season would also show up in
both data sets if data from that region and season existed.
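
The theoretical argument can be illustrated with a purely synthetic example. The sketch below is not the regression model or the data of this study: the field `synthetic_ocs`, its gradient and trend, the latitude grids and the sampling pattern are all hypothetical values chosen for illustration, and area weighting is ignored. It only shows that a sampling pattern repeated identically each year shifts the band mean by a constant offset while leaving the fitted linear trend unchanged.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def synthetic_ocs(abs_lat, year):
    """Hypothetical mixing ratio (ppt): decreases poleward, small linear trend."""
    return 300.0 - 3.0 * (abs_lat - 60.0) + 0.5 * (year - 2004)


years = list(range(2004, 2012))
full_lats = [60.5 + i for i in range(30)]  # dense coverage of a 60-90 degree band
sparse_lats = [61.0, 63.0, 65.0]           # same sparse pattern in every year


def yearly_means(lats):
    """Unweighted band mean for each year, using only the given latitudes."""
    return [sum(synthetic_ocs(lat, yr) for lat in lats) / len(lats) for yr in years]


full = yearly_means(full_lats)
sparse = yearly_means(sparse_lats)

# Sampling bias of the yearly mean: constant in time for a repeating pattern.
offsets = [s - f for s, f in zip(sparse, full)]

slope_full = ols_slope(years, full)
slope_sparse = ols_slope(years, sparse)

print("bias per year (ppt):", [round(o, 3) for o in offsets])
print("trend, full sampling   (ppt per year):", round(slope_full, 3))
print("trend, sparse sampling (ppt per year):", round(slope_sparse, 3))
```

In this idealized case the sparse sampling overestimates the band mean by the same amount in every year, and both trend estimates are identical. If the sampling pattern differed from year to year, as in SON 2011, the offset would vary with time and could alias into the fitted trend.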

In this study, we present a method to adjust the spatiotemporal sampling
bias in climatologies calculated from sparsely sampled satellite
observations without requiring additional observational evidence beyond the
data set used. Because this method is based exclusively on
observations and is independent of the parameterization of atmospheric models,
it is suitable for producing sampling-bias-corrected climatologies that can be used
to test and improve such atmospheric models. Generally, the method can be
applied to any atmospheric compound or property of which the variability
follows defined seasonal and latitudinal patterns and can therefore be
sufficiently well described using a regression model approach. The method
has been shown to quantitatively adjust the sampling bias in seasonal
30^{∘} latitude band climatologies of OCS mixing ratios at 16 km altitude and OCS stratospheric column constructed from ACE-FTS observations.
Our results show that, at least for OCS, the influence of the sampling bias
is too small to significantly alter scientific conclusions drawn from
climatological trends.

ACE-FTS, with its solar occultation viewing geometry, and therefore sparse and heterogeneous sampling pattern, is particularly sensitive to the occurrence of a sampling bias when calculating climatologies (Toohey et al., 2013). OCS with its atmospheric variability in the stratosphere and upper troposphere limited to large spatial (100s of km) and temporal (i.e., seasons) scales (Barkley et al., 2008) provides an ideal tracer to investigate and demonstrate the sampling bias adjustment method. Note that the method would not work in the presented form (i.e., with a relatively simple regression model that is reasonably well determined by the data) for an OCS data product reflecting the lower tropospheric and boundary layer variability with complex regional patterns and to some extent distinct day–night differences such as the Infrared Atmospheric Sounding Interferometer (IASI) tropospheric OCS product described by Vincent and Dudhia (2017).

In the stratosphere, and often in the upper troposphere–lower stratosphere (UTLS), many long-lived trace
gases (e.g., N_{2}O, chlorofluorocarbons) behave qualitatively similarly
to OCS, with variability on similar scales. We expect the method to work
well in the construction of climatologies for such tracers, explicitly
including most compounds for which climatologies from ACE-FTS data have been
compiled by Jones et al. (2012) and Koo et al. (2017). Toohey et al. (2013)
addressed the sampling bias issue specifically for ozone (O_{3}) and
water vapor (H_{2}O) measured by a wide range of satellites.
Considering that the variability of both gases in the stratosphere, and to a
large extent the UTLS, is dominated by distinct altitudinal, latitudinal and
seasonal gradients, we expect a regression model such as the one described in
Sect. 2.3 to adequately capture the largest part of this variability and,
consequently, our sampling bias correction method to be applicable for both
gases. Theoretically, with a denser satellite data product and a more
elaborate version of the regression model that captures longitudinal and
other variabilities, the sampling bias correction scheme can be extended to
climatologies that include dependencies other than just
latitude and season. A detailed
investigation and application of the method to O_{3}, H_{2}O
and other gases is beyond the scope of this proof-of-concept study and is left for
future work.

The IMK/IAA-generated (Institute of Meteorology and Climate Research/Instituto de Astrofísica de Andalucía) MIPAS data used in this study are available for registered users at http://www.imk-asf.kit.edu/english/308.php (last access: 29 March 2019). The ACE-FTS level 2 (V3.6) data used in this study are available upon request at https://databace.scisat.ca/l2signup.php (last access: 29 March 2019).

CK and MvH designed the research and analyzed and interpreted the results. GEB had the original idea upon which the research presented in this paper is based. MH and KAW provided data and data analysis. MR, JU, BH, SK and GEB contributed ideas and were involved in discussion. All authors contributed to writing the paper.

The authors declare that they have no conflict of interest.

Measurements used in this study are from the ACE-FTS instrument and MIPAS
together with the dynamical tropopause data from ECMWF. The Atmospheric
Chemistry Experiment (ACE), also known as SCISAT, is a Canadian-led mission
mainly supported by the Canadian Space Agency and the Natural Sciences and
Engineering Research Council of Canada. MIPAS spectra used for deriving OCS
vertical profiles at the Karlsruhe Institute of Technology have been provided by
the European Space Agency. Corinna Kloss has been supported
by the graduate school of Forschungszentrum Jülich HITEC (Helmholtz
Interdisciplinary Doctoral Training in Energy and Climate Research) and
ANR-17-CE01-0015 (TTL-Xing). Marc von Hobe was supported by the German
Federal Ministry of Education and Research through the project ROMIC-SPITFIRE
(BMBF-FKZ: 01LG1205C). The authors thank Kage Nesbit, Ben Lewis and Christian
Rolf for their programming contribution.

The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

This paper was edited by Justus Notholt and reviewed by two anonymous referees.

Aghedo, A. M., Bowman, K. W., Shindell, D. T., and Faluvegi, G.: The impact of orbital sampling, monthly averaging and vertical resolution on climate chemistry model evaluation with satellite observations, Atmos. Chem. Phys., 11, 6493–6514, https://doi.org/10.5194/acp-11-6493-2011, 2011.

Barkley, M. P., Palmer, P. I., Boone, C. D., Bernath, P. F., and Suntharalingam, P.: Global distributions of carbonyl sulfide in the upper troposphere and stratosphere, Geophys. Res. Lett., 35, L14810, https://doi.org/10.1029/2008GL034270, 2008.

Bernath, P. F., McElroy, C. T., Abrams, M. C., Boone, C. D., Butler, M., Camy-Peyret, C., Carleer, M., Clerbaux, C., Coheur, P. F., Colin, R., DeCola, P., DeMaziere, M., Drummond, J. R., Dufour, D., Evans, W. F. J., Fast, H., Fussen, D., Gilbert, K., Jennings, D. E., Llewellyn, E. J., Lowe, R. P., Mahieu, E., McConnell, J. C., McHugh, M., McLeod, S. D., Michaud, R., Midwinter, C., Nassar, R., Nichitiu, F., Nowlan, C., Rinsland, C. P., Rochon, Y. J., Rowlands, N., Semeniuk, K., Simon, P., Skelton, R., Sloan, J. J., Soucy, M. A., Strong, K., Tremblay, P., Turnbull, D., Walker, K. A., Walkty, I., Wardle, D. A., Wehrle, V., Zander, R., and Zou, J.: Atmospheric Chemistry Experiment (ACE): Mission overview, Geophys. Res. Lett., 32, L15S01, https://doi.org/10.1029/2005GL022386, 2005.

Boone, C. D.: Version 3 Retrievals for the Atmospheric Chemistry Experiment Fourier Transform Spectrometer (ACE-FTS), in: The Atmospheric Chemistry Experiment ACE at 10: A Solar Occultation Anthology, edited by: Bernath, P. F., A. Deepak Publishing, Hampton, Virginia, USA, 103–127, 2013.

Boone, C. D., Nassar, R., Walker, K. A., Rochon, Y., McLeod, S. D., Rinsland, C. P., and Bernath, P. F.: Retrievals for the atmospheric chemistry experiment Fourier-transform spectrometer, Appl. Opt., 44, 7218–7231, 2005.

Damadeo, R. P., Zawodny, J. M., and Thomason, L. W.: Reevaluation of stratospheric ozone trends from SAGE II data using a simultaneous temporal and spatial analysis, Atmos. Chem. Phys., 14, 13455–13470, https://doi.org/10.5194/acp-14-13455-2014, 2014.

Damadeo, R. P., Zawodny, J. M., Remsberg, E. E., and Walker, K. A.: The impact of nonuniform sampling on stratospheric ozone trends derived from occultation instruments, Atmos. Chem. Phys., 18, 535–554, https://doi.org/10.5194/acp-18-535-2018, 2018.

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Holm, E. V., Isaksen, L., Kallberg, P., Kohler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J. J., Park, B. K., Peubey, C., de Rosnay, P., Tavolato, C., Thepaut, J. N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteorol. Soc., 137, 553–597, 2011.

Fischer, H., Birk, M., Blom, C., Carli, B., Carlotti, M., von Clarmann, T., Delbouille, L., Dudhia, A., Ehhalt, D., Endemann, M., Flaud, J. M., Gessner, R., Kleinert, A., Koopman, R., Langen, J., López-Puertas, M., Mosner, P., Nett, H., Oelhaf, H., Perron, G., Remedios, J., Ridolfi, M., Stiller, G., and Zander, R.: MIPAS: an instrument for atmospheric and climate research, Atmos. Chem. Phys., 8, 2151–2188, https://doi.org/10.5194/acp-8-2151-2008, 2008.

Glatthor, N., Höpfner, M., Baker, I. T., Berry, J., Campbell, J. E., Kawa, S. R., Krysztofiak, G., Leyser, A., Sinnhuber, B. M., Stiller, G. P., Stinecipher, J., and von Clarmann, T.: Tropical sources and sinks of carbonyl sulfide observed from space, Geophys. Res. Lett., 42, 10082–10090, 2015.

Glatthor, N., Höpfner, M., Leyser, A., Stiller, G. P., von Clarmann, T., Grabowski, U., Kellmann, S., Linden, A., Sinnhuber, B.-M., Krysztofiak, G., and Walker, K. A.: Global carbonyl sulfide (OCS) measured by MIPAS/Envisat during 2002–2012, Atmos. Chem. Phys., 17, 2631–2652, https://doi.org/10.5194/acp-17-2631-2017, 2017.

Jones, A., Walker, K. A., Jin, J. J., Taylor, J. R., Boone, C. D., Bernath, P. F., Brohede, S., Manney, G. L., McLeod, S., Hughes, R., and Daffer, W. H.: Technical Note: A trace gas climatology derived from the Atmospheric Chemistry Experiment Fourier Transform Spectrometer (ACE-FTS) data set, Atmos. Chem. Phys., 12, 5207–5220, https://doi.org/10.5194/acp-12-5207-2012, 2012.

Kloss, C.: Carbonyl Sulfide in the stratosphere: airborne instrument development and satellite based data analysis, Ph.D., Chemistry Department, Bergische Universität Wuppertal, Wuppertal, 2017.

Koo, J. H., Walker, K. A., Jones, A., Sheese, P. E., Boone, C. D., Bernath, P. F., and Manney, G. L.: Global climatology based on the ACE-FTS version 3.5 dataset: Addition of mesospheric levels and carbon-containing species in the UTLS, J. Quant. Spectrosc. Ra., 186, 52–62, 2017.

Millán, L. F., Livesey, N. J., Santee, M. L., Neu, J. L., Manney, G. L., and Fuller, R. A.: Case studies of the impact of orbital sampling on stratospheric trend detection and derivation of tropical vertical velocities: solar occultation vs. limb emission sounding, Atmos. Chem. Phys., 16, 11521–11534, https://doi.org/10.5194/acp-16-11521-2016, 2016.

Sofieva, V. F., Kalakoski, N., Päivärinta, S.-M., Tamminen, J., Laine, M., and Froidevaux, L.: On sampling uncertainty of satellite ozone profile measurements, Atmos. Meas. Tech., 7, 1891–1900, https://doi.org/10.5194/amt-7-1891-2014, 2014.

Toohey, M., Strong, K., Bernath, P. F., Boone, C. D., Walker, K. A., Jonsson, A. I., and Shepherd, T. G.: Validating the reported random errors of ACE-FTS measurements, J. Geophys. Res.-Atmos., 115, D20304, https://doi.org/10.1029/2010JD014185, 2010.

Toohey, M., Hegglin, M. I., Tegtmeier, S., Anderson, J., Anel, J. A., Bourassa, A., Brohede, S., Degenstein, D., Froidevaux, L., Fuller, R., Funke, B., Gille, J., Jones, A., Kasai, Y., Kruger, K., Kyrola, E., Neu, J. L., Rozanov, A., Smith, L., Urban, J., von Clarmann, T., Walker, K. A., and Wang, R. H. J.: Characterizing sampling biases in the trace gas climatologies of the SPARC Data Initiative, J. Geophys. Res.-Atmos., 118, 11847–11862, 2013.

Vincent, R. A. and Dudhia, A.: Fast retrievals of tropospheric carbonyl sulfide with IASI, Atmos. Chem. Phys., 17, 2981–3000, https://doi.org/10.5194/acp-17-2981-2017, 2017.

sampling bias and investigate its influence on derived long-term trends. The method is illustrated and validated for a long-lived trace gas (carbonyl sulfide), and it is shown that the influence of the sampling bias is too small to change scientific conclusions on long-term trends.