We evaluate the uncertainties of methane optimal estimation retrievals from single-footprint thermal infrared observations from the Atmospheric Infrared Sounder (AIRS). These retrievals are primarily sensitive to atmospheric methane in the mid-troposphere through the lower stratosphere
(

Advances in remote sensing and global transport modeling, together with an increasingly dense
network of surface measurements, have substantially improved our ability to evaluate the components and
error structure of the global methane budget and the processes controlling this budget. For example,
Frankenberg et al. (2005, 2011) showed that total column methane estimates could be derived from
near-infrared (NIR) radiances at

The goal of this paper is to evaluate the uncertainties of new methane retrievals from AIRS single-footprint, original (non-cloud-cleared) radiances using aircraft measurements from the HIAPER
Pole-to-Pole Observations (HIPPO) and Atmospheric Tomography Mission (ATom) campaigns and National
Oceanic and Atmospheric Administration (NOAA) Global Monitoring Laboratory (GML) aircraft network,
taken between 2006 and 2017. Evaluation of these uncertainties is needed to determine if AIRS
methane data can characterize and improve errors in global chemistry transport models. For example,
a recent paper by Zhang et al. (2018) combined synthetic CrIS and TROPOMI methane retrievals and a
global inversion system to show that it would be possible to infer the north–south gradient of OH,
the primary methane sink, to within 10 %, and temporal variations of OH concentrations. However,
knowing the accuracy of the methane data is important for inferring the uncertainty in the
spatiotemporal variability of

In this paper we present an evaluation of methane retrievals derived from AIRS single-footprint
radiances. We follow an optimal estimation approach (Rodgers, 2000), based on the heritage of the
Aura Tropospheric Emission Spectrometer (TES) algorithm (Bowman et al., 2006), now called the
MUlti-SpEctra, MUlti-SpEcies, MUlti-Sensors (MUSES) algorithm (Worden et al., 2006, 2013b; Fu et al., 2013, 2016, 2018,
2019). The MUSES algorithm uses radiances from one or multiple instruments to quantify and
characterize geophysical parameters derivable from those radiances. The optimal estimation method
provides the vertical sensitivity (i.e., the averaging kernel matrix) and estimates of the
uncertainties due to noise and to radiative interferences such as temperature,

The quantities of interest that we validate in this paper are (a) the AIRS

The AIRS instrument is a nadir-viewing, scanning infrared spectrometer (Aumann et al., 2003;
Pagano et al., 2003; Irion et al., 2018; DeSouza-Machado et al., 2018) onboard the NASA Aqua
satellite, which was launched in 2002. AIRS measures the thermal radiance between approximately 3–12

Measurements from the HIPPO (Wofsy et al., 2012) and ATom (Wofsy et al., 2018) aircraft campaigns
provide excellent datasets for satellite validation, due to their wide latitudinal coverage, the
large vertical extent of the profiles (up to 9–12

We compare AIRS to observations from the ATom aircraft campaigns 1–4 (Wofsy et al., 2018). This
comparison provides validation

Location of aircraft profile measurements used for validation. The upside-down triangles
show HIPPO,

The NOAA GML aircraft network observations (Cooperative Global Atmospheric Data Integration Project,
2019) are taken twice per month at fixed sites primarily in North America and also Rarotonga (RTA)
at 21

Figure 1 shows the locations of all the aircraft data used for the comparisons described in this paper. Most of the ocean measurements are from the HIPPO and ATom campaigns that span a range of latitudes, whereas most of the land measurements are taken over North America.

Worden et al. (2012, 2019) describe in detail the forward model and retrieval approach used for
estimating methane from TES and AIRS radiances. The radiative transfer forward model used for this
work is the Optimal Spectral Sampling (OSS) fast radiative transfer model (RTM) (Moncet et al., 2008, 2015). In particular, radiances from the thermal infrared bands at 8 and
12

Here are the recommended cutoffs to select good quality and sensitivity flagging for AIRS

The radiance residual rms is

The absolute value of the radiance residual mean is

The absolute value of KdotdL is

The surface temperature minus the near-surface atmospheric temperature
value is

Cloud top pressure is

Cloud optical depth is

Cloud variability vs. wavenumber is

The degrees of freedom are

The tropospheric degrees of freedom are

The stratospheric degrees of freedom are

The predicted error on the column above 750

Detailed descriptions of the use of optimal estimation (OE) to infer trace gas profiles from remote
sensing radiance measurements are included in numerous publications (e.g., Rodgers, 2000;
Worden et al., 2006; Bowman et al., 2006). However, we present a partial description here as it is
relevant for comparing the AIRS methane retrievals and aircraft profile measurements. As discussed
in Rodgers (2000), the estimate for a trace gas profile inferred (or inverted) from a radiance
spectrum is described by the following linear equation:

The rows of an averaging kernel for

The degrees of freedom, DOFs, describe the sensitivity of

Finally, we look at the quantity of interest,

A challenge in comparing the satellite-based AIRS measurements to aircraft data is that the aircraft
will typically measure only a section of the atmosphere (e.g., the troposphere), whereas the AIRS
measurements are sensitive, to varying degrees (see Fig. 2), to the entire atmosphere. To account
for these differences, we divide the atmosphere into two parts

We compare our AIRS observation,

Equation (7a) is the predicted bias between

For the purpose of evaluating the AIRS methane measurement uncertainties and comparing the AIRS
methane to aircraft in situ measurements, we refer to the four terms on the right side of Eq. (7b)
as follows:

Figure 3 shows the predicted errors for the AIRS partial column

Calculated errors for AIRS measurements shown in this paper. The total error shown is the smoothing error (Eq. 5) plus the observation error (Eq. 7b). The measurement error is the last term of Eq. (7b) and the only fully random error.

A typical aircraft profile will only measure part of the troposphere and rarely measure into the stratosphere. However, the AIRS methane profile measurements are sensitive to methane variations over the whole atmosphere, as shown by the averaging kernel matrix in Fig. 2. Similarly, the true state in the troposphere influences retrieved values in the stratosphere. Options for dealing with this are (a) extending the true profile with the AIRS prior or (b) extending the true profile with a model profile value.
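The two extension options can be sketched as follows. This is a minimal illustration, not the processing code used for the comparisons in this paper: the function name `extend_profile`, the six-level pressure grid, and all methane values (in ppb) are hypothetical, and levels without an aircraft measurement are marked with `None`.

```python
from typing import List, Optional, Sequence


def extend_profile(
    aircraft: Sequence[Optional[float]],
    fill: Sequence[float],
) -> List[float]:
    """Keep aircraft values where measured; use `fill` where aircraft is None."""
    if len(aircraft) != len(fill):
        raise ValueError("aircraft and fill profiles must share one pressure grid")
    return [f if a is None else a for a, f in zip(aircraft, fill)]


# Hypothetical pressure grid (hPa) and methane values (ppb), for illustration only.
pressure_hpa = [1000.0, 750.0, 500.0, 300.0, 100.0, 10.0]
aircraft = [1850.0, 1840.0, 1830.0, 1820.0, None, None]  # ceiling near 300 hPa
airs_prior = [1800.0, 1800.0, 1790.0, 1780.0, 1600.0, 1200.0]
model = [1845.0, 1835.0, 1825.0, 1815.0, 1650.0, 1100.0]

extended_with_prior = extend_profile(aircraft, airs_prior)  # option (a)
extended_with_model = extend_profile(aircraft, model)       # option (b)

# The two options differ only above the aircraft ceiling.
difference = [m - p for m, p in zip(extended_with_model, extended_with_prior)]
print(difference)
```

Running a comparison once with each extension and differencing the results gives a rough indication of how strongly the unmeasured part of the atmosphere affects the AIRS minus aircraft difference.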

This section estimates this uncertainty by calculating the difference of

Simulated comparison between AIRS and aircraft in which the LMDz model

These differences provide an estimate for how knowledge error in the stratosphere projects to
uncertainties in our methane retrievals. For example, this uncertainty varies with latitude, similar
to the residual bias between the AIRS estimate and aircraft (next section). Furthermore, the
variability over small latitudinal ranges of 10

The methane profile has a strong variable negative vertical gradient in the stratosphere. Models in
general have a positive bias in the extratropical stratosphere (Patra et al., 2011). In GEOS-Chem

AIRS

Comparison of AIRS methane VMR to aircraft for all HIPPO comparisons over the partial
column

Bias vs. pressure with and without bias correction. The bias correction was developed on HIPPO-4 and tested on HIPPO-4; HIPPO-1, HIPPO-2, HIPPO-3, and HIPPO-5; and the NOAA aircraft network.

We use HIPPO-4 observations to set a bias correction, which we then evaluate with the other HIPPO
campaigns and NOAA aircraft network data. HIPPO-4 was selected because it covers a wide range of
latitudes and because deriving the correction from a single campaign allows it to be set and tested
with independent datasets. To set the bias, we use Eq. (6b) to estimate the aircraft observation as
seen by AIRS and then compare this to AIRS observations. The result (by pressure level) is shown in
Table 1. A bias correction was then applied to AIRS using Eq. (8), with the bias term

Figure 6 shows the effect of bias correction on the average of all HIPPO (1, 2, 3, and 5) AIRS profiles. The bias correction reduces the mean AIRS–aircraft difference and the pressure-dependent skew in the bias (Table 1). The HIPPO data are shown before and after the AIRS averaging kernel is applied (using Eq. 6b), which has the effect of bringing the HIPPO observations towards the AIRS prior. Applying the averaging kernel mimics the imperfect sensitivity of the satellite-based observations, which are similarly influenced by the prior.
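The pull towards the prior can be seen in a small numerical sketch. It assumes the standard linear smoothing relation from Rodgers (2000), in which the smoothed profile is the prior plus the averaging kernel applied to the difference between the true profile and the prior. Whether this matches Eq. (6b) term by term is an assumption here, and the two-level averaging kernel and profile values are hypothetical. The same relation applies whether the state vector is expressed in VMR or in another unit such as log VMR, so the choice is left to the caller.

```python
from typing import List, Sequence


def apply_averaging_kernel(
    avg_kernel: Sequence[Sequence[float]],
    x_true: Sequence[float],
    x_prior: Sequence[float],
) -> List[float]:
    """Return x_prior + A (x_true - x_prior) for a square averaging kernel A."""
    n = len(x_prior)
    if len(x_true) != n or len(avg_kernel) != n:
        raise ValueError("profiles and averaging kernel must have matching sizes")
    if any(len(row) != n for row in avg_kernel):
        raise ValueError("averaging kernel must be square")
    delta = [t - p for t, p in zip(x_true, x_prior)]
    return [
        p + sum(a * d for a, d in zip(row, delta))
        for p, row in zip(x_prior, avg_kernel)
    ]


# Hypothetical two-level example (values in ppb), for illustration only.
avg_kernel = [[0.5, 0.1],
              [0.1, 0.3]]
aircraft_profile = [1850.0, 1700.0]
prior_profile = [1800.0, 1600.0]

smoothed = apply_averaging_kernel(avg_kernel, aircraft_profile, prior_profile)
print(smoothed)  # each level lies between the prior and the aircraft value
```

With an averaging kernel of all zeros the function returns the prior unchanged, and with an identity kernel it returns the aircraft profile unchanged; real kernels fall between these limits.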

Example of the effect of bias correction on the AIRS profile from averaged HIPPO-1, HIPPO-2, HIPPO-3, and HIPPO-5. The blue lines show the AIRS methane profile before (dotted) and after (solid) bias correction. The black lines show the HIPPO measurements before (dotted) and after the averaging kernel is applied (solid).

Figure 5 shows a comparison between all AIRS measurements within 50

Figure 7 shows the same comparisons as Fig. 5 after bias correction (described in Sect. 3.4). The
mean bias is 1

Same as Fig. 5 but after bias correction. The ocean has

Comparison of daily averaged AIRS to HIPPO measurements

Comparison at TGC (27.7

We also compare AIRS

Satellite data are typically averaged in order to improve the precision of a comparison between data
and model. However, as shown in the previous figure, these data contain errors that vary with
latitude. For example, knowledge error of the true profile in the stratosphere as well as errors in
the jointly retrieved AIRS temperature and water vapor retrievals have both a random and a bias
component, both of which vary with latitude. The bias component is approximately the same for all
AIRS methane measurements taken on the same day within 50

On the other hand, averaging AIRS data seasonally can reduce the error further because geophysical
errors such as temperature and water vapor vary over longer timescales. We demonstrate this aspect
of the AIRS uncertainties by comparing averaged AIRS data to the NOAA aircraft methane profiles
taken off the coast near Corpus Christi, Texas (27.7

We look at daily averages vs. aircraft data and find a result similar to that found in the
comparisons to ATom and HIPPO: daily averages have much larger errors than would be predicted if
the errors were purely random. The SD of AIRS minus aircraft at TGC is 24

The NOAA aircraft measurements are usually taken about twice per month. The SD of monthly AIRS
average minus aircraft is 8.2

We average over 3-month scales, where averages must have at least 3

We average matched pairs within each month from any year. AIRS minus aircraft values for these averages have a SD of 5.9

To summarize, averaging AIRS observations within 1 d reduces the error vs. aircraft, but
correlated errors prevent daily averaged errors from dropping below 11.5

Table A3 in Appendix A shows the single-observation SD for all NOAA aircraft sites. The ocean
vs. land observations show similar values, with land and ocean SDs within 2

The bias is estimated by calculating the mean bias for each campaign or station separately then
calculating the mean and SD for all campaigns/stations. The bias vs. HIPPO is

We validate single-footprint AIRS methane by comparing 27 000 AIRS methane retrievals to 396
aircraft profiles from the HIPPO campaign, 719 profiles from the NOAA GML aircraft network, and 289
aircraft profiles from the ATom campaign, taken across a range of latitudes, longitudes, and
times. The AIRS methane retrievals are derived using the MUSES optimal estimation algorithm that has
previously been applied to Aura TES radiances (e.g., Fu et al., 2013). After adjusting the aircraft
profile to account for the AIRS sensitivity (using the averaging kernel and a priori profile), we
compare the mean methane value over the aircraft profile to the mean methane from the AIRS profile
over the same altitude (or pressure) range. We use a subset of validation data to derive a
pressure-dependent bias correction on the order of

After applying the bias correction, from Eqs. (8) and (9), the rms difference between the AIRS and
aircraft data of the partial column

We quantify the AIRS minus validation SD for single observations, daily averages (within
50

These results can be compared to AIRS v6 validation by Xiong et al. (2015), which validated AIRS

We characterize the bias vs. validation data by station, campaign, and pressure level. Table A1
shows biases vs. validation data, after bias correction with Eqs. (8) and (9). In the HIPPO
comparisons, the biases are generally smaller than about 10

The NOAA aircraft network comparisons are sorted by site. Many NOAA aircraft locations are at
land–ocean interfaces, allowing a more direct comparison of the land–ocean biases. On average, the
AIRS land observations are 0–5

Table A2 shows the mean bias for AIRS minus NOAA GML aircraft for land and ocean AIRS
observations. The different rows extend the aircraft using the AIRS prior, the CarbonTracker model
(from

Table A3 shows the SD of AIRS observations minus validation data for land–ocean for different
pressure ranges, for both single observations and AIRS averages. The mean bias at each site is
subtracted prior to calculating the SD.
The predicted error for the daily average is the observation error divided by
the square root of the number of observations and is much smaller than the actual SD, indicating
correlated errors. The predicted error for the monthly, 3-month, and seasonal cycle averages is the
daily SD divided by the square root of the number of days averaged and

Bias by campaign, station, land–ocean, and pressure.

Change in the mean bias of the partial column matching the NOAA aircraft observation using different aircraft profile extensions from the top aircraft measurement to the top of the atmosphere.

SD of AIRS minus validation for land–ocean observations and different pressures or pressure ranges. Rows 1–2 show the SD for single observation, rows 3–4 show the predicted observation error, rows 5–8 show the SD for daily averages, rows 9–10 show the predicted error for daily averages (assuming random error), rows 11–12 show the SD for 3-month averages, rows 13–14 show the SD for seasonal cycle averages (average the same month of all years), rows 15–16 show the predicted error for the seasonal cycle averages, and rows 17–18 show the SD without bias subtraction. The site-dependent biases from Table A1 are subtracted prior to calculating the SD.
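The gap between predicted and actual SDs for averaged data can be illustrated with a short sketch. The first function is the 1/sqrt(N) scaling for independent errors used for the predicted-error rows. The second adds a correlated component that does not average down; this two-component error model is an assumption made here for illustration and is not necessarily the authors' formulation. All numbers are hypothetical.

```python
import math


def predicted_error_of_mean(single_obs_error: float, n_obs: int) -> float:
    """Error of an n_obs average if all errors are independent (random)."""
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return single_obs_error / math.sqrt(n_obs)


def error_of_mean_with_correlated_part(
    random_error: float, correlated_error: float, n_obs: int
) -> float:
    """Error of an average when part of the error is shared by all observations.

    Assumed model: the random part averages down as 1/sqrt(n_obs), while the
    correlated part is common to every observation and is not reduced.
    """
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return math.sqrt(random_error ** 2 / n_obs + correlated_error ** 2)


# Hypothetical values (ppb): 25 observations, each with a 25 ppb random error,
# plus a 10 ppb error component shared by all observations in the average.
random_only = predicted_error_of_mean(25.0, 25)
with_floor = error_of_mean_with_correlated_part(25.0, 10.0, 25)
print(random_only, round(with_floor, 2))
```

In this sketch the correlated component sets a floor that further averaging cannot remove, which is the qualitative behavior indicated when the actual SD of an average exceeds its predicted value.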

AIRS single-footprint methane data will be available at NASA GES DISC (

The supplement related to this article is available online at:

SSK and JRW are responsible for the study design, data analysis, and manuscript writing. VHP was responsible for data analysis and manuscript editing. DF was responsible for implementing AIRS into the MUSES retrieval system. SCW and BCD Jr. were responsible for HIPPO

The authors declare that they have no conflict of interest.

This work is supported by NASA ROSES Aura Science Team NNN13D455T. Part of this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. The NOAA GML aircraft observations are supported by NOAA. The HIPPO aircraft data were supported by NOAA and NSF. Thanks are given to Bruce Daube, Eric Kort, Jasna Pittman, Greg Santoni and others for QCLS

This research has been supported by NASA (ROSES Aura Science Team, NNN13D455T).

This paper was edited by Frank Keppler and reviewed by two anonymous referees.