Establishment of AIRS Climate-Level Radiometric Stability using Radiance Anomaly Retrievals of Minor Gases and SST

Temperature, H2O, and O3 profiles, as well as CO2, N2O, CH4, CFC12, and SST scalar anomalies are computed using a clear subset of AIRS observations over ocean for the first 16-years of NASA’s EOS-AQUA AIRS operation. The AIRS Level-1c radiances are averaged over 16 days and 40 equal-area zonal bins and then converted to brightness temperature anomalies. Geophysical anomalies are retrieved from the brightness temperature anomalies using a relatively standard optimal estimation approach. The CO2, N2O, CH4, and CFC12 anomalies are derived by applying a vertically uniform mul5 tiplicative shift to each gas in order to obtain an estimate for the gas mixing ratio. The minor gas anomalies are compared to the NOAA ESRL in-situ values and used to estimate the radiometric stability of the AIRS radiances. Similarly the retrieved SST anomalies are compared to the SST values used in the ERA-Interim reanalysis and to NOAA’s OISST SST product. These inter-comparisons strongly suggest that many AIRS channels are stable to better than 0.02 to 0.03 K/Decade, well below climate trend levels, indicating that the AIRS blackbody is not drifting. However, detailed examination of the anomaly 10 retrieval residuals (observed minus computed) show various small unphysical shifts that correspond to AIRS hardware events (shutdowns, etc.). Some examples are given highlighting how the AIRS radiances stability could be improved, especially for channels sensitive to N2O and CH4. The AIRS short wave channels exhibit larger drifts that make them unsuitable for climate trending, and they are avoided in this work. The AIRS Level 2 surface temperature retrievals only use short wave channels. We summarize how these short wave drifts impacts recently published comparisons of AIRS surface temperature trends to other 15 surface climatologies.


Introduction
The Atmospheric Infrared Sounder (AIRS) on NASA's AQUA satellite platform (Aumann et al., 2003) measures 2378 highspectral resolution infrared radiances between 650 and 2665 cm -1 with a resolving power (λ/∆λ) of~1200. Launched in 2002 into a sun-synchronous polar orbit with a 13:30 ascending node equator crossing time, AIRS now has been operating almost 20 continuously for 17+ years. The new AIRS Level-1c radiance product (Aumann et al., 2020) is used in this work rather than the standard L1b product. The Level-1c product provides single-footprint radiance estimates for channels in Level-1b that are not functional, or are extremely noisy. Even high quality Level-1b channels can sometimes "pop" or experience radiation hits that invalidate the measurement.
In these extremely rare cases the Level-1c algorithm substitutes an estimated radiance using a principal-components approach.
These corrections are rare enough that they have no effect on the long-term trends under study in this work. Level-1c also 95 includes some channels (in-between detector arrays) that do not exist.
More importantly for this work, the radiances in Level-1c have been corrected for small drifts in the channel center frequencies. These drifts are small, but are large enough to have some minor impact on radiance trends. We emphasize that the channels selected for the anomaly retrievals are all valid Level-1b channels, and most have undergone no corrections other than adjusting the radiances back to a fixed frequency scale.

100
AIRS L1c clear scenes are primarily detected using a uniformity filter. (Throughout this paper the term "scene" refers to a single AIRS nominal 12-by-12 km. footprint or field-of-view.) The BT of each AIRS ocean scene is subtracted from the BT of each of its 8 neighbors for two window channels at 819.3 and 961.1 cm -1 . A scene is initially deemed clear only if the absolute value of all of these differences, averaged over the two channels, is less than 0.4K. The selected scenes are matched to ERA-I model fields and a simulated clear BT for the 961.1 cm -1 channel is computed using a stand-alone version of the 105 AIRS radiative transfer algorithm  called SARTA, implemented using HITRAN 2008 line parameters. If the difference between the observed and computed clear scene BT values is more than ± 4K the scene is discarded from the clear list. This test mostly removes colder scenes made up of very uniform marine boundary layer stratus clouds. The clear yield and mean zonal radiances are quite insensitive to the exact value of this threshold. The uniformity test is not performed on the first and last of the 135 along-track scans in each AIRS granule since they do not have 8 neighbors and we wanted to avoid 110 cross-granule processing. The total number of clear scenes is limited to~40,000 daily clear scenes by randomly sub-setting the detected clear scenes, however this daily limit is almost never reached. In this work we only use descending node observations in order to avoid solar and nonLTE contributions to the AIRS radiances in the short wave. After subsetting for descending (ocean) only the total number of clear scenes detected is~10,000 per day.
The 4K (observed minus computed) BT test removes~20% of the scenes detected with the uniformity filter. A map of these 115 deleted scenes very clearly shows that they are almost all located along the west coasts of the Americas and Africa, where marine boundary layer stratus clouds commonly occur. The (observed minus computed) BT values for the 1231 cm -1 window channel have a nearly Gaussian distribution with a width of~0.6K. Note that this distribution of biases is well separated from the 4K cutoff used to remove marine boundary layer stratus clouds.
Another important characteristic of this clear subset is the stability of the observing times. If the mean observing time 120 changes during this 16-year time period, trends in the SST could be confused with the diurnal cycle of the SST. Due to the high stability of the AQUA orbit, this is not an issue. The short term day-to-day variations in the mean clear subset times can vary by several hours. In addition, there is a seasonal variation of several hours in the clear subset. But, these variation are extremely stable, and the total linear drift of the clear subset over the 16-year observing period, for any given latitude bin in the tropics, is~20 ± 40 (2-σ) seconds per year, effectively zero.
All observing parameters, on a footprint basis, are saved, such as satellite viewing zenith angle and noise (converted to BT units). In addition, the ERA-I model parameters (temperature, H 2 O, and O 3 profiles, and surface temperature with a spatial resolution of approximately 80 km on 60 levels in the vertical from the surface up to 0.1 hPa) are matched to each clear scene and saved along with their associated simulated L1c radiances. This allows our processing to use simulated rather than observed radiances for testing. The ERA-I profiles are also used to compute the anomaly Jacobians used in the retrievals, and 130 are discussed in detail in Sects. 4.3 and 5. temperature, H 2 O, or O 3 trends in that this data set is not necessarily representative of global/zonal climate trends. However, we do assume that the minor gas anomaly trends we are retrieving are uniformly mixed over multi-year time scales. Our anomaly retrieval results show uniform mixing is generally quite accurate over even 16-day time scales. amount was 385 ppm. The window regions (800-1000 cm -1 ) exhibit a bias of~-0.5K, which is quite small and likely some combination of instrument bias, evaporative cooling of the ocean surface relative to the ERA-I SST, incorrect ERA-I water vapor column affecting the H 2 O continuum, and some cloud-contamination. Sampling errors may contribute to the larger biases in the water region beyond 1300 cm -1 . A zoom of the bias in the bottom panel of Fig. 2 highlights the low bias in the 145 700-750 cm -1 region which is sensitive to tropospheric CO 2 , with a mean of~0.2-0.3K. Shows that the region near 700-760 cm -1 , which is most sensitive to CO2, has a mean bias of~0.2-0.3K, and a single-footprint standard deviation of~0.3K. Also shown is the AIRS NEDT, which is barely smaller than the bias standard deviation.

Clear Scene Characteristics
The single-footprint standard deviation of the bias is also shown in Fig. 2, bottom panel along with the average AIRS NEDT for these footprints. The ERA-I bias standard deviation is barely larger than the AIRS noise in this spectral region, indicating that ERA-I temperatures in the mid-troposphere track the AIRS observations very closely with a standard deviation considerably smaller than the AIRS noise. This makes a strong case for the accuracy of the BT Jacobians computed from 150 ERA-I temperature fields. Figure 3 shows the linear trend for the clear dataset averaged over ±50 o latitude. These BT trends prominently exhibit the growth in CO 2 in the tropospheric channels from 700 to 750 cm -1 , which results in a negative change in the observed BT since increasing CO 2 shifts the emission to higher and therefore colder regions of the atmosphere. The growth in CO 2 in the stratospheric channels (a positive BT change) below 700 cm -1 is roughly cancelled by cooling in the stratosphere. All 155 window channels exhibit warming, with larger values in the shortwave past 2450 cm -1 . The non-uniform spatial sampling of these clear scenes precludes any general statements about climate warming, although for these observations we clearly see surface warming in the 800-1250 cm -1 region, if the AIRS radiometry is stable. In addition, the effects of much stronger water vapor absorption in the long wave compared to the short wave windows makes definitive inter-comparisons of the BT trends complicated, which is addressed below by doing retrievals on these data.

Construction of Anomalies
The clear scene radiance subset is sorted into 40 equivalent area latitude bins that cover the full -90 o to 90 o latitude range and are averaged over every 16 days. This results in a data set for the first 16-years of AIRS that has a size of (40 x 2645 x 365) latitude bins, AIRS L1c channels, and the total number of 16-day averages. The following time-series function was fit to these averaged radiances, r obs (t), for each latitude and AIRS L1c channel, where t is AIRS mission times in years, r o is a constant, c i are the amplitudes of the season cycle and three harmonics, and the φ i are their associated phases. At 28 o N, for example, the annual amplitude relative to the mean radiance, c 1 /r o , has a median value (taken over channel) of 4.2%. The median amplitudes of the three harmonics terms, c 2 , c 3 , c 4 relative to r o is 0.32%, 0.45% and 0.23% respectively, all with 2-σ uncertainties of~0.05%. The linear trends a 1 are included in the anomaly 170 time-series fits for simple diagnostic purposes, and are not used directly in the anomaly retrievals.
The radiance anomalies, r a (t) are formed by removing the constant r o , and the sinusoidal terms in Eq. 1, from the observed radiance time series r obs (t). This can be expressed as The radiance anomalies r a (t) were converted to brightness temperature units using The 40 x 2645 x 365 array of BT a vectors are the retrieval inputs y in the retrieval formulation discussed in Sect. 4.  The anomaly BT time series mean BT spectra and their standard deviations are shown in Figure 4 for the 28.4 o N latitude bin. The BT anomaly is set to zero at the mission start, therefore the mean BT in the channels sensitive to tropospheric CO 2 between 700 and 750 cm -1 is -0.5K, which then increases by~1K during the mission. The standard deviation indicates that 180 the SST (and H 2 O continuum) vary by~0.4K during this time period (window region channels from 800-1000 cm -1 ). Some of this is likely due to changes in sampling from day to day and ERA-I errors in SST and column H 2 O. Upper-tropospheric water vapor, which dominates the spectral region between 1350-1615 cm -1 , has the highest variability, which is expected due to both the high temporal variability of water vapor, and our non-uniform sampling.
An example radiance BT anomaly for the 710.141 cm -1 channel is shown in Fig. 5, for the same latitude bin. This channel is 185 heavily influenced by the CO 2 growth, so the AIRS observed trends are becoming more negative, although there is considerable noise, again due to weather and sampling. For comparison we also plot the ERA-I simulated BT anomaly, which does not contain the CO 2 growth, since it is set to a fixed value of 385 ppm in the simulations. The difference between these two BT anomalies will primarily be due to CO 2 growth, and is shown in black. Note that since the ERA-I tracks the atmospheric state quite accurately and most of the time-series "noise" is removed. This helps lend credence to our use of the ERA-I model fields 4 Retrieval Methodology

Approach
Geophysical retrievals are derived from the BT spectral anomalies y(ν), defined in Eq. 3. Using standard retrieval notation the atmospheric state x is derived from the observations y, by minimizing the cost function J where S is a diagonal observation error covariance matrix containing the square of the BT noise, K are the anomaly Jacobians, and R is a regularization matrix. The retrieved atmospheric state x (the geophysical anomalies) are given by S a is the a-priori covariance matrix, and αL is an empirical regularization constraint using Tikhonov L1-type derivative 200 smoothing. This retrieval approach is standard Optimal Estimation (OE) (Rodgers, 1976) enhanced to include both covariance and empirical Tikhonov regularization in R (Steck, 2002). Forward model uncertainty is not included in the measurement error covariance. The mathematical approach is very similar to the author's single-footprint AIRS retrieval algorithm (DeSouza-Machado et al., 2018).
A-priori estimates for x a (t) ≡ 0 for T(z), O 3 (z), H 2 O(z) and T SST were set to zero. Two approaches were used for the minor 205 gas a-priori estimates. The first approach set x a (t) = x a (t − 1) where x a (t = 0) = 0 for the minor gases, iteratively increasing the a-priori gas amount in time based on the previous 16-day retrieval.
Another approach used the known growth rates in the minor gases (from ESRL) by setting x a (t) = g × (t − t o ) for the a-priori minor gas amount, where g is the nominal yearly growth rate for each gas from the NOAA ESRL atmospheric gas trends. For both approaches we set the a-priori covariance to g times one year, the yearly variation in that gas. Nearly identical 210 results are obtained if we used g times five years. The iterative approach for setting the minor gas a-priori produces noisier retrieval anomalies. However, if our retrievals are averaged over ± 50 o latitude, both approaches produced identical differences compared to in-situ measurements, including error uncertainties. The figures and trend results shown here use the a-priori ramp from the ESRL data, although the figures for the iterative ramp are only distinguishable from what is shown for single zonal retrievals (such as the Mauna Loa and Cape Grim comparisons).

215
The temperature, H 2 O, and O 3 profile retrievals use 20 atmospheric layers, selected from the AIRS standard 100-layer pressure grid  by accumulating five of the standard AIRS layers at a time. The lowest layer is about 1.5 km thick, with increasingly wider layers as you go higher in the atmosphere. This layering scheme allows more layers than degrees-of-freedom (DOFs) although it does limit retrievals in the upper-stratosphere. We wish to minimize our sensitivity to the upper-stratosphere since our comparisons to in-situ measurements are made in the troposphere. Consequently we removed 220 all channels peaking above 10 hPa.
Most of the regularization in the retrieval comes from the Tikhonov terms, since we do not want to invoke climatology too strongly for a climate level measurement. Appendix B discusses the profile retrievals, and simulations of these retrievals, in more detail. In summary, after experimentation with Tikhonov regularization we added some a-priori covariance uncertainties in temperature and water vapor of 2.5K and 60% respectively. These are extremely large values for a-priori uncertainties 225 compared to the anomaly variations. For example, the retrieved 400 HPa temperature anomalies shown in Fig. B3 are all less then ± 3K, indicating that the temperature a-priori covariance uncertainty is providing very minimal regularization. This means that almost all of the retrieved temperature variability is coming from the data, and is not damped by the a-priori estimates, a desirable situation for the measurement of climate trends. These a-priori covariance uncertainty terms did, however, improve the profile trends generated in simulation by a slight amount, 3-10%, and thus were retained in our retrieval.

230
The observation error covariances (noise) are the mean AIRS NEDT for each channel, averaged over 16-days, and then divided by the square root of N, the number of scenes averaged. Originally a fixed value of 0.01K observation noise was used, but we found that this noise value depressed the CO 2 anomaly retrievals as they grew in size over time. This problem disappeared once we switched to the true measurement noise values, which are in the range of noise equivalent brightness temperature (NEDT) equal to 0.004K for long wave CO 2 channels from 700-750 cm -1 , about 0.001K in window regions 235 between 800-1250 cm -1 , and 0.001 to 0.002K in the water band that covers the 1300-1615 cm -1 spectra region. These are extremely low noise values, which help explain why the anomaly retrievals have a relatively high number of degrees-offreedom.
As stated earlier, the profile Jacobians used the ERA-I profiles, which were converted to anomaly profiles for each pressure layer. The minor gas Jacobians were computed using our pseudo line-by-line kCARTA radiative transfer algorithm (Strow 240 et al., 1998;DeSouza-Machado et al., 2019). kCARTA allows for extremely accurate Jacobian calculations, including analytic trace gas and temperature Jacobians. Initial retrievals used a fixed value for the minor-gas Jacobians. However, given the large increase in the minor gases (10% for CO 2 ), we determined that the minor-gas Jacobians need to be updated as the gas amounts increase. Therefore we used finite-difference Jacobians, computed using the minor gas amount retrieved from the previous time-step during the anomaly retrievals (or from the gas amount estimated using NOAA ESRL in-situ gas amount data). The 245 minor gas profiles used in the Jacobian calculations are from (Anderson et al., 1986). The CO 2 profile is essentially constant in ppm until you reach the highest atmospheric layer.
There exists a weak dependence of these retrievals on the ERA-I model fields since we use the ERA-I model fields for the temperature, H 2 O, and O 3 profiles in the profile Jacobians, K. While we could retrieve the atmospheric profiles from the full radiance at each time step and latitude zone, ERA-I is so accurate we do not believe this is needed. Section 5.4 250 discusses potential errors introduced by using ERA-I for Jacobian evaluation, where they are shown to be extremely small and unimportant.
The direct retrieval of anomalies from the BT anomaly spectra represents a very different approach than normally used in infrared remote sounding. Although the mathematical approach is the same as in single-footprint retrievals (DeSouza-Machado et al., 2018), the often troublesome problem of static measurement and RTA bias errors is largely removed here since instrument 255 calibration and/or absolute RTA biases do not appear in the retrieval process.

Channel Selection
As discussed in Section 2, only channels that remain A+B throughout the mission are used, noting that the designation A+B does not apply to detectors in the M-11 and M-12 long wave detector arrays. Initial retrievals showed that the AIRS short wave detectors are drifting slightly, so these channels are also excluded from the anomaly fits (except for demonstration tests 260 as discussed below). Unfortunately, the use of only A+B detectors greatly restricts the number of available channels in the important long wave CO 2 temperature sounding region from 710-780 cm -1 , where many channels are either A-only or B-only.
It is important to weight these channels relatively strongly in the retrieval minimization. Since we also wish to de-emphasize stratospheric contributions to the minor-gas rates only every 5th channel from 650-720 cm -1 was included in the retrieval. In addition, any channels in this range with Jacobians that peaked above 10 hPa were excluded.

265
All channels in the M-5 array were excluded since they have relatively poor radiometric stability (as will be shown later).
Several window channels that are sensitive to CFC11 were excluded, although many channels sensitive to CFC12 were included, and CFC12 trends were retrieved. Many H 2 O channels were included, since they are mostly A+B and have been stable throughout the mission. After some experimentation, four channels sensitive to N 2 O were also excluded since they appear to be behaving significantly out-of-family. Three of these channels are located near the end of the M-4c array, which also exhibits 270 some anomalous frequency shifting behavior (Aumann et al., 2020).
A total of 470 channels remained after this pruning process. These channels are nicely distributed throughout the AIRS spectrum and are easily sufficient for 1D-var retrievals. The overall sensitivity of the anomaly retrievals to CO 2 is shown is shown in Figure 6 where the mean CO 2 Jacobian, averaged over all channels, is plotted. The CO 2 sensitivity peaks around 400 hPa, and drops to near zero at the surface. There is some dependence on stratospheric CO 2 , but stratospheric CO 2 trends, especially in the lower stratosphere, should track the tropospheric trends, albeit with growth rates that are slightly influenced by previous years due to age-of-air. This figure also shows the mean CO 2 Jacobian if all channels below 700 cm -1 are removed (all sensitive to the stratosphere). Retrieval tests 280 using these restrictions are discussed later.

Construction of Jacobians
The relatively high accuracy of ERA-I temperature fields was highlighted previously in Fig. 5 which plots the time dependence of the bias between the observed and simulated BT for this channel. This bias, in black, has very little variability (other than the smooth decrease due to increasing CO 2 ) compared to either the observed or simulated BT values due to the high accuracy of the 285 ERA-I temperature profiles. This is not unexpected in a reanalysis product that assimilates a wide range of in-situ measurements (radiosondes) and satellite measurements (microwave and infrared sounders, including AIRS). In principal we could use the AIRS Level-2 atmospheric state for generating the Jacobians for the anomaly retrievals. However, for the large-scale averaging used in this work errors introduced by the relatively large ERA-I spatial grid compared to AIRS are minimized.
Moreover ERA-I is constrained by a large number of instruments and in-situ measurements for the temperature profile.

290
Monthly mean ERA-I observation minus analysis differences for radiosonde temperatures are below 0.2K throughout the troposphere, rising to 0.3K in the lower stratosphere (Simmons et al., 2014). We note that the statistical accuracy of the AIRS Level-2 algorithm is mainly verified by inter-comparisons with ECMWF forecast/analysis fields (Susskind et al., 2014), which are likely even more stable in a reanalysis product. The AIRS Level-2 retrieved temperature and H 2 O global biases relative to ECMWF are very small, well below 0.5K for temperature and 5% for water vapor.

295
In principle we could have performed 1D-var retrievals on each 16-day averaged BT spectrum in each latitude zone, but given the relatively small biases between ERA-I and AIRS shown in Fig. 2 retrievals will produce minimal improvements to the ERA-I fields. Note that the ERA-I bias in the 700-750 cm -1 region with the most sensitivity to tropospheric CO 2 is only in the 0-0.5K range. Moreover, 1D-var retrievals using AIRS will also be limited by uncertainties in the AIRS radiometric calibration, which is estimated to be in the 0.2K range (Pagano and Broberg, 2016).

300
More importantly, since we are only retrieving anomalies, highly accurate Jacobians are unnecessary since the BT variations in the anomalies are so small, especially when applied to trends. A quantitative assessment of errors in our measured anomaly trends from using using ERA-I for Jacobian evaluations is presented in Sections 5.4 and 5.7.

Temperature and Minor Gas Jacobian Co-linearity
A non-standard "correction" is made to the minor gas retrievals that attempts to correct for the co-linearity of the temperature 305 and minor gas Jacobians. We demonstrate that this new approach clearly removes un-physical variability in the CO 2 anomaly retrievals. Co-linearity of the temperature and minor gas Jacobians makes it difficult for the retrieval to separate temperature profile variations from variations in CO 2 , CH 4 , and N 2 O. Usually this is managed by constraining the retrievals with accurate a priori estimates that have small enough covariances to allow some separation of T(z) and CO 2 variability. Kulawik et.al. Kulawik et al. (2010) discuss this problem in the context of CO 2 retrievals using the NASA EOS-AQUA TES instrument, 310 where they describe the selection of constraints as a way to "determine the partitioning of shared degrees of freedom between CO 2 and temperature".
Here we take a different approach based on the fact that we have highly accurate simulated anomalies computed from the ERA-I model fields. The simulated anomalies are derived using our SARTA radiative transfer algorithm (RTA) and were generated using constant values for the minor gases throughout the 16-year time period. Except for the minor gas signatures, 315 the ERA-I spectral anomalies are very similar to the observed anomalies since both AIRS calibration errors and RTA errors are largely removed when forming the anomalies. Fig. 5 shows the excellent agreement between the observed and simulated BT anomalies for the 710.14 cm -1 channel. The only major difference in these anomalies is the downward drift in the observations primarily due to the growth of CO 2 . Note that almost all the high frequency variability in the observed and simulated anomalies is removed when taking their difference, shown in black in Fig. 5, indicating that the ERA-I temperature fields match the AIRS 320 observations very closely.
Given that the ERA-I spectral anomalies are very similar to observed anomalies we can largely determine the effect of the Jacobian co-linearities on the observed CO 2 anomaly retrievals by retrieving a (fictitious, or non-existing) CO 2 anomaly from the ERA-I simulated BT anomalies using an identical retrieval algorithm. Since the simulated anomalies have a constant value for each minor gas the variations in the retrieved CO 2 (or other minor gases) is a measure of the inability of the retrieval to 325 separate the minor gas anomalies from the temperature profile.   the retrieved CO 2 anomaly derived from the ERA-I simulated anomalies. Although close to zero in the mean, this retrieved CO 2 anomaly varies considerably by up to ± 15 ppm. The red curve (labelled AIRS Raw) shows the CO 2 anomaly retrieved from the 330 AIRS observations, which has similar variability superimposed on a linear ramp of~2 ppm/year. The adjusted observed CO 2 anomaly is generated by subtracting the simulated from the observed CO 2 . This is shown in blue (labelled AIRS Adjusted), showing that most of the "noise" has been removed resulting in a very smooth CO 2 anomaly curve.
The right hand panel of Fig. 7 shows similar results but using the average of all latitude bins between ± 50 • latitude. The co-linearity of the temperature and CO 2 Jacobians apparently changes randomly enough with latitude that the simulated CO 2 335 has far less variability than in the left panel. The utility of this approach is nicely illustrated by examining the dip of about 7 ppm in the "Simulated" CO 2 retrieval in early 2010 for the ± 50 • latitude bin. This dip is also visible in the observed anomaly curve (AIRS Raw). The "Simulated" CO 2 anomaly is subtracted from the "AIRS Raw" curve to obtain the final observed CO 2 anomaly (AIRS adjusted) and it is quite evident that the dip in early 2010 has cancelled out, as desired.
The above adjustments to the CO 2 anomaly retrievals have little effect on estimates of AIRS stability over 16 years, as 340 outlined later in Sect. 5.4, although it does increase the statistical uncertainty in the AIRS BT trends by a factor 2.4. More importantly, the application of these adjustments greatly reduces the apparent noise in the derived CO 2 trends, making the detection of instrument shifts in the AIRS BT time series much more sensitive.

345
Evaluation of the anomaly retrievals requires some knowledge of the AIRS mission events. Table 1 summarizes the major events during the AIRS mission that had thermal consequences for either the spectrometer or the focal plane arrays. While most of these events were minor, recent measurements of the AIRS frequency shifts (Aumann et al., 2020) highlight that these events are associated with small shifts in the AIRS frequency scale. These shifts are indicative of very small movements of the detectors relative to the instrument spectrometer axis and could, for example, slightly alter the detector's view of the blackbody and cold scene. Any small non-uniformities in these calibration looks could affect the absolute radiometry. We will refer to these events during discussions of the anomaly retrieval results. data for CO 2 , N 2 O, and CH 4 . Monthly anomalies for these in-situ datasets were computed using the same methods used to compute the BT anomalies for consistency. We focus mainly on the global CO 2 ESRL anomalies since they are derived from a wide geographical range and sites and carefully merged to avoid local sources. The N 2 O ESRL anomalies provide information on AIRS channels in the 1250 -1310 cm -1 region that are distinct from the main CO 2 channels below 780 cm -1 . (There are also strong N 2 O channels in the short wave band of AIRS.) The CH 4 anomalies mostly probe AIRS channels from 1230 to 1360 360 cm -1 . There is some concern that CH 4 anomaly trends may have more spatial variability than CO 2 and N 2 O, however we find good overall agreement with the ESRL global CH 4 trends, and CH 4 provides some sensitivity to channels that overlap with N 2 O, but extend a bit further into the water band.
We focus mostly on the use of CO 2 for AIRS stability estimations since CO 2 is so well measured and has the largest BT signal in the AIRS spectrum (relative to N 2 O and CH 4 ). In addition, the N 2 O and CH 4 spectra overlap strongly in the AIRS BT 365 spectrum, possibly introducing some retrieval uncertainty relative to CO 2 . Absolute errors in the ESRL CO 2 data are estimated to be~0.2 ppm (https://www.esrl.noaa.gov/gmd/ccl/ccl_uncertainties_co2.html), with yearly growth rate uncertainties of~0.07 ppm/year (https://www.esrl.noaa.gov/gmd/ccgg/trends/gl_gr.html). Anomaly growth rate errors averaged over 16 years are likely much lower since yearly sampling errors should diminish over time. Moreover, most absolute errors will not be applicable to the CO 2 anomaly, which is a relative measurement. Therefore it is difficult to definitively estimate the ESRL anomaly trend 370 uncertainty. If the yearly growth rate uncertainties of 0.07 ppm/year are random, then the average of 16 of these growth rates would be 0.018 ppm/year, which corresponds to a percentage uncertainty of 0.8% in the anomaly trend.
Estimates for N 2 O and CH 4 anomaly trend uncertainties using the ESRL stated uncertainties in yearly growth rates, and assuming these are random errors each year, are 3.5% and 2.4%. These larger uncertainties, and the smaller total impact of these two gases on the AIRS BT anomalies, suggest that the best estimates for AIRS stability are likely derived from the CO 2 375 anomalies. Most of the anomaly retrievals performed here only included AIRS channels located below 1615 cm -1 , avoiding the short wave channels in the 2181 to 2665 cm -1 region. Early retrievals showed that the AIRS short wave channels exhibit a positive trend compared to the longer wave channels. Moreover, anomaly fits to just the short wave channels return SST trends that are 380 significantly larger than both the long wave channels and both the ERA-I (OSTIA) and OISST SST products.

Short Wave Trends
The behavior of the AIRS short wave channel relative to the long wave is easily seen in the anomaly retrieval fit residuals. Figure 8 shows the mean value (taken over the 365 16-day time steps for ± 30 o latitude) for the residuals. All AIRS L1c channels are plotted, which includes many bad channels, and channels that do not exist but are filled during L1c creation (Aumann et al., 2020). The channels selected for the anomaly fits (see Sect. 4.2) are shown in red circles. The fit residuals for channels used in 385 these retrievals are almost all well below 0.02K. However, the short wave channels show anomalies inconsistent with the long wave of up to~0.07K in the window channels past 2450 cm -1 .
The anomaly retrievals can respond to drifts/offsets in the AIRS radiances by retrieving geophysical variables (CO 2 , temperature, etc.) that vary incorrectly in time. Alternatively, un-physical changes in the radiances could also be reflected in larger non-zero fit residuals. This could happen when the forward model Jacobians cannot model time-dependent radiance errors, 390 especially for jumps in the radiometric calibration that happen due to AIRS events (shutdowns). One way to examine this possibility is to look for any remaining trends in the anomaly fit residuals. These are shown for the same data set used in Fig. 8 Figure 10. Retrieved CO2 anomalies compared to ESRL global in-situ data. The CO2 anomaly difference between AIRS and ESRL is shown in yellow. The magenta curve is that difference converted into BT units.
The main observation in Fig. 9 is a clear positive trend in the short wave relative to the longer wave channels used in the 395 retrievals. The (AIRS -ERA) SST trend plotted as a solid horizontal line in this figure (discussed in Sect. 5.7) shows that the AIRS short wave trends are more different from the ERA-I SST trends than the long wave channels. Most of the short wave channels, including those in the mid-troposphere, exhibit positive trends relative to the long wave, except for some channels that are peaking very high in the stratosphere, below 10 hPa, that are marked in gray.
Consequently, unless otherwise noted, all the remaining results presented here avoid short wave channels, and use the channel 400 set (470 channels) denoted in these figures. Figure 10 shows the retrieved CO 2 anomalies averaged over ± 50 o latitude in blue and the ESRL global anomaly product in red. The correspondence over time is excellent. The AIRS minus ESRL anomaly differences are shown in yellow. In order to  convert the variation in the gas anomalies to an equivalent AIRS BT anomaly temperature we computed anomaly retrievals 405 with the observed AIRS BT anomaly spectra modified by a 0.01K/year ramp, for all channels. This 0.01K/year ramp is divided by the resulting changes in the CO 2 anomaly linear trends (ppm/year) to obtain the sensitivity of the retrieval to a trend in the AIRS radiances, in K/ppm. For CO 2 this sensitivity is -0.073K/ppm. This is about 2X larger than the largest column Jacobians in the AIRS spectra, which have a value of~0.030K/ppm. This is not unexpected, since the CO 2 column measurement is partially a relative measurement, especially for weak CO 2 channels in the window region where the absolute BT errors are 410 mostly accounted for by (incorrect) adjustments in the SST that minimize the effect of the 0.01K/year applied ramp. It is also possible that the temperature profile could also adjust to minimize sensitivity of the ramp on the CO 2 ppm values. In addition, this sensitivity estimate assumes all AIRS channels are drifting, which is clearly an approximation given the results shown
The magenta curve in Fig. 10 is the (AIRS minus ESRL) anomaly differences converted to BT units using the -0.073K/ppm 415 sensitivity factor. This curve has been slightly smoothed for clarity. The right-hand side vertical axis shows the variations in  (Strow et al., 2006) and were subsequently corrected in the AIRS L1c product (Aumann et al., 2020;Manning et al., 2019). In addition, as reported in (Strow et al., 2006) interference fringes in the AIRS entrance filters 420 shifted after the Nov. 2003 AQUA shutdown because AIRS was restarted at a slightly different spectrometer temperature. The fringes change the AIRS spectral response functions, which has not yet been corrected in the AIRS L1c product radiances. Figure 11 illustrates the differences between the AIRS and ESRL CO 2 linear growth rates. The growth rates for both our CO 2 retrievals and the ESRL CO 2 time series were computed by re-using the fitting function in Eq. 1, but now applied to the retrieved CO 2 anomalies, ie.
where b 1 are the CO 2 trends in ppm/year. Later this equation will be used to fit the N 2 O, CH 4 , and SST anomalies, instead of the CO 2 anomalies as shown here. Figure 11 plots the fitted values for the AIRS growth rates (the b 1 term in Eq. 6), computed as a function of latitude. The CO 2 growth rates are not completely uniform from year-to-year, so Eq. 6 cannot perfectly fit the trend data. However, it provides 430 a convenient metric for inter-comparing these two CO 2 anomalies. Note that the error bars shown for AIRS are slightly overestimated because of the fact that Eq. 6 does not perfectly fit the slightly non-linear anomaly curve. The error estimates are 95% confidence intervals and they have been corrected for serial correlations in the anomaly time series using the popular lag-1 auto-correlation approach detailed in (Santer et al., 2000).
The Mauna Loa and Cape Grim growth rates are also shown, also derived using Eq. 6, as is the ESRL global rate, indicated the ITCZ and higher rates in regions of descending air. We do not examine this latitude dependence in this work, not only is it small, it could also be related to small inaccuracies in our retrieval algorithm.

440
Since the CO 2 linear growth rate measurements are not sensitive to year-to-year variability in the CO 2 anomaly, we instead use the (AIRS -ESRL) global anomaly differences shown in Fig. 10 to quantify the AIRS stability. Any linear trend differences between the AIRS and ESRL CO 2 in Fig. 10 are quantified by fitting the (AIRS -ESRL) CO 2 anomaly differences to Eq. 6.  Loa, and Cape Grim sites. The uncertainties are as before, 95% confidence intervals corrected for lag-1 auto-correlations. As 445 one might expect, the global trends agree the best, and Cape Grim the worst. The higher errors for Cape Grim may be related to our clear subset having fewer samples at -40 o latitude relative to the 20 o latitude zone occupied by Mauna Loa. These mean differences are extremely small, corresponding, for global, to 1.5 ± 0.6% trend differences. Table 3 shows the conversion of the CO 2 ppm trend differences to equivalent BT differences using the -0.073 K/ppm sensitivity conversion. The baseline entry, first line of the table, represents the final configuration for the anomaly retrievals 450 and is our best estimate for the differences between the ESRL and AIRS CO 2 anomaly trends, -0.023 ± 0.009 K/decade. This is an exceedingly small trend difference. While suggesting that AIRS is extremely stable, for channels sensitive to CO 2 and temperature, systematic errors may be larger than the differences reported here. Our estimate of the ESRL global anomaly trend uncertainty discussed in Section 5.2, 0.8%, is equivalent to 0.017 ppm/year. The AIRS minus ESRL global trend difference shown in Table 2 is about 2X times larger than this estimate for the ESRL uncertainty and slightly larger than the statistical 455 uncertainty in this trend difference. In BT units, this potential uncertainty in the ESRL global CO 2 anomaly trend is~0.012 K/Decade.
The sensitivity of these results to uncertainties in the Jacobians are derived from the second partials derivative of BT as follows, 460 where M is the quantity being measured (here, CO 2 anomalies and trends), X unc is the uncertainty in the profile variables used to compute the Jacobians, and Y meas is either the maximum anomaly or the mean trend measured for Y . These are quantified in Table 4.
The first entry accounts for errors in ∂BT /∂CO 2 due to uncertainties in the CO 2 spectroscopy. The HITRAN database (Gordon et al., 2017) reports uncertainties in the CO 2 line strengths of 1-2%. These uncertainties would translate into the 465 same percentage error in the Jacobians. In addition, atmospheric spectra are sensitive to line widths, line shape, line mixing, often at temperatures that are not measured in laboratory spectra. Characterizing the combination of these errors is essentially impossible, so here we assume a 1% uncertainty in the CO 2 Jacobians, using the line strength uncertainty only. The maximum CO 2 anomaly error occurs at the end of the time series when the CO 2 anomaly is highest (35 ppm). Therefore the max anomaly error is 1% × 35 ppm = 0.35 ppm. Using the retrieval sensitivity of -0.073K/ppm, this translates into an effect max error in 470 the BT anomaly error of 0.026K. Dividing this anomaly uncertainty by the 16 year time period under study gives a trend uncertainty due to CO 2 spectroscopy errors of 0.016 K/decade as shown in Table 4. This value is slightly larger than the statistical uncertainty in the baseline CO 2 trend shown in Table 3, and slightly smaller than the derived trend differences versus ESRL CO 2 .
The second entry in Table 4 lists estimated uncertainties in the CO 2 anomalies and trends (converted to BT units) that could 475 arise due to errors in the ERA-I temperature profile. The second partial derivative was computed with finite differences using a fixed temperature offset for all levels and then summed over all levels, a worst case scenario. The mean of these second order derivatives, taken over the retrieval channels in the~700-750 cm -1 spectral region that has high sensitive to CO 2 , represents an effective scalar value for ∂ 2 (BT )/(∂X∂Y ) in Eq. 7. This term is multiplied by an assumed uncertainty in the ERA-I temperature of 0.5K and by the maximum anomaly value of 35 ppm to obtain a maximum uncertainty of 0.0035K in the 480 CO 2 anomaly. The maximum effect on the CO 2 trend is again this value divided by 16 years giving an uncertainty of 2.2 × 10 -3 K/decade, an insignificant uncertainty. Note that our assumed uncertainty of 0.5K is higher than ERA-I error estimates discussed in Section 4.3.
Clearly the estimated 1% uncertainty in the CO 2 spectroscopy is the dominant source of error in our CO 2 retrievals. If the ESRL 0.8% uncertainty is combined in quadrature with the 1% HITRAN uncertainty, a total minimum expected uncertainty 485 in the CO 2 anomaly trends is 1.3%. This translates to a BT uncertainty of 0.02 K/decade, close to our derived mean trend difference between AIRS and ESRL based on the CO 2 anomaly measurements. This may be a more accurate uncertainty estimate for this measurement rather than the 0.009 K/Decade statistical uncertainty derived from fitting the AIRS minus ESRL anomalies.
The second entry in Table 3 lists the mean trend difference and its uncertainty if the adjustment for co-linearity discussed in 490 Sect. 4.4 is not applied, which leads to a larger trend uncertainty by a factor of 2.4. The resulting trend difference is somewhat smaller, but with a different sign. The baseline retrievals with and without the co-linear CO 2 adjustments do not quite overlap within their respective 2σ uncertainties, missing statistical agreement by 0.013 K/Decade, which is relatively small. However, based on the discussion in Sect. 4.4 we believe that the application of the co-linear CO 2 adjustment improves the accuracy of the AIRS CO 2 anomaly.
495 Table 4. Anomaly and trend error estimates for CO2 and SST due to uncertainties in BT Jacobians via their second derivatives with respect to possible ERA-I uncertainties. As noted, the maximum effect on the CO2 anomalies would be at the time of the largest anomaly, which is  Table 3 also shows the results of a number of fit testing the sensitivity of the retrievals to various retrieval alternatives. The "No Strat" entry removed all channels that primarily sense the stratosphere by removing all channels below 700 cm -1 . Fig. 6 shows how this modifies the mean CO 2 Jacobian used in the retrieval, essentially removing all sensitivity to CO 2 above 60 hPa. Unfortunately channels above 700 cm -1 have some residual sensitivity to CO 2 in the stratosphere, and removing channels below 700 cm -1 may make it more difficult to properly minimize the retrieval residuals for some channels above 700 cm -1 . If 500 S a is completely removed, removing a-priori profile regularization, the CO 2 anomaly trend difference increases by a factor of two. Removing the L1c frequency calibration adjustments increases the anomaly trend differences by nearly a factor of three, and changes their sign. If only short wave channels are fit (excluding channels that peak above 10 hPA, and some channels sensitive to both carbon monoxide), the mean trend differences are more than three times larger than the baseline, again with a sign change.

505
The last test, labeled "ERA-I T(z)", examines the need for performing simultaneous retrievals of temperature profiles while retrieving the CO 2 anomalies by using the ERA-I temperature profiles anomalies, instead of fitting for them from the observed anomalies. This test increased the anomaly differences between AIRS and ESRL by almost a factor of three, with a significant increase in the uncertainty of the trend, giving 0.35 K/decade instead of close to 0.009 K/decade for the baseline. Table 3 also shows the Mauna Lao anomaly difference, which is close to the global result, although accompanied by a higher 510 uncertainty of 0.017 K/decade compared to the 0.009 K/decade for the global anomaly. Cape Grim anomaly differences are almost two times higher than the global trend differences, but this is not surprising given the much lower number of observations at that latitude.
The retrieved AIRS global CO 2 anomalies did exhibit a small seasonal pattern for latitudes above 40 o N of with an amplitude of~0.5 ppm. This is due to the residual of the seasonal cycle of CO 2 that is not completely removed when constructing the BT 515 anomalies.
Note that radiometric shifts or drifts in the AIRS BT time series could be either reflected in incorrect geophysical trends, or partially buried in the anomaly fit residuals. The high quality of the anomaly retrievals for CO 2 and the small fit residuals for  Figure 12. Retrieved N2O anomalies compared to ESRL global in-situ data. The N2O anomaly difference between AIRS and ESRL is shown in yellow. The magenta curve is that difference converted into BT units. CO 2 channels strongly suggest that the AIRS blackbody is extremely stable, at least for long and mid wave A+B channels. The SST retrievals discussed later reinforce this conclusion. However, we do see evidence of radiometric shifts due to discrete AIRS 520 events (especially for N 2 O and CH 4 ) that might be amenable to correction. Future work will include careful examination of both the anomaly retrievals and their residuals, likely in an iterative fashion, in order to determine what channels are responsible for unphysical shifts in the anomaly products.

N 2 O Anomaly Retrievals
The N 2 O retrieved anomaly time series is shown in Fig. 12 and primarily senses the 1240-1325 cm -1 spectral region. Clearly 525 the observed N 2 O anomaly is growing slightly faster than the ESRL values. The N 2 O anomalies are converted to equivalent BT variations just as for CO 2 , but with a derived sensitivity of 0.140 K/ppb. Table 5 tabulates the derived trend for the (AIRS minus ESRL) anomaly by fitting the difference to Eq. 6, and then converting to BT units.
The trend differences here are much larger than for CO 2 . Examination of either the AIRS minus ESRL anomalies in ppb, or their equivalent in BT units (left hand y-axis) suggest that two unphysical steps might be present in the time series, one in 530 mid-2005 and another one in mid-to-late 2010. Unfortunately, these steps do not closely coincide with AIRS events, possibly appearing more than one year after the Nov. 2003 event and and slightly less than one year after the Jan. 2010 event.  Figure 13. Retrieved CH4 anomalies compared to ESRL global in-situ data. The CH4 anomaly difference between AIRS and ESRL is shown in yellow. The magenta curve is that difference converted into BT units. To illustrate the effect of these two discrete shifts on the anomaly trend differences we empirically introduce a step in our retrieved N 2 O time series of -0.6 ppb on July 1, 2005 and another step on Jan. 18, 2010 of -0.5 ppb. The trend difference between this empirically modified time series and ESRL, in BT units, becomes -0.022 ± 0.009 K/decade, very similar to the 535 CO 2 trend differences. The main point of this exercise is to illustrate that just two small discrete radiometric shifts could be responsible for the higher trend differences between AIRS and ESRL for N 2 O. More work is needed to map these discrete non-physical events in the retrieved N 2 O anomaly time series back into steps in the AIRS BT time series. The hope is that careful examination of the anomaly time series residuals during this process would highlight specific channels (or cluster of channels) that are behaving non-physically.

CH 4 Anomaly Retrievals
The CH 4 retrieved anomalies have some similarities to the N 2 O anomalies, since the spectra of both gases occur in the same general spectral region. The CH 4 the region of sensitivity is~1210-1380 cm -1 . Figure 13 shows the CH 4 results using the same approach as for CO 2 and N 2 O. The ppb to BT conversion for CH 4 was measured to be 0.023 K/ppb, significantly lower than for CO 2 or N 2 O, although total BT trend due to CH 4 is only marginally lower than CO 2 and N 2 O.

545
The high variability of atmospheric CH 4 growth is well known, as can be seen in the ESRL curve in Fig. 13. The AIRS derived anomalies follow that variable growth rate quite nicely overall. It should be noted that the ESRL CH 4 curve is more variable than CO 2 and N 2 O, and may be less uniform globally, making CH 4 a less ideal gas for testing AIRS stability. However, the AIRS minus ESRL anomaly differences are valuable in that they, like N 2 O, highlight discrete jumps that can often be However, this apparent jump seems to fade within one year for both gases. We believe this might be caused by AIRS frequency shifts that occurred in the M-4a and M-4c detector modules after this event. Those frequency shifts appeared to disappear within one year, and at present they are not corrected for in the AIRS L1c product. Table 6 lists the trend differences between AIRS and ESRL for CH 4 , showing trends differences that similar to those for 555 N 2 O, presumably since both gases occur in the same spectral region.

SST Retrievals
The SST anomaly retrievals are compared to the ERA-I supplied SST (mostly OSTIA) and to NOAA's OISST operational SST product. Although both of these SST products are tied to the ARGO floating buoy network, they are gridded SST products using interpolation derived from satellite data such as AVHRR.

560
A recent study (Fiedler et al., 2019) compared various SST products to the buoy network and found differences for OSTIA of 1.1 mK/year, and 7.8 mK/Year for OISST. This establishes a rough estimate of the differences in these products when evaluating them relative to our retrieved SST anomalies.   Table 7 summarizes the AIRS minus (ERA-I and OISST) anomaly trend differences, computed using Eq. 6. The trend 570 differences are quite small for both SST products. The (AIRS minus ERA-I) trend has the same magnitude as the trend derived using CO 2 , but with the opposite sign. Overlap of the CO 2 and ERA-I SST within their stated uncertainty estimates is missed by 0.01K/decade, which is very small. The CO 2 and OISST trend estimates miss overlap by slightly more, 0.02K/decade. However, this overlap difference is small compared to the differences between OISST and the buoy network reported by (Fiedler et al., 2019). Overall the excellent agreement of these two extremely independent assessments (CO 2 versus SST) 575 to within 0.02K/decade is very encouraging given the complexity of the CO 2 measurement and the uncertainties in the SST product trends.
Comparisons between AIRS-derived SST and ERA-I or OISST products will contain biases due to time aliasing between the AIRS observations and daily means used in the SST products. Although these time dependent biases can have random and seasonal variations of several hours the observed linear drift in the AIRS local observing time over the 16-year observation 580 period was less than one minute/year, far too small to introduce any drifts in the AIRS SST relative to the ERA-I or OISST daily averages.
Uncertainties in the SST anomaly retrievals due to our use of ERA-I fields for the evaluation of the SST Jacobians were estimated using the same approach for the CO 2 anomaly retrievals. The BT Jacobians (dBT/dSST) for channels sensitive to in the lower troposphere. We computed the partial derivatives of the BT Jacobians with respect to all three of these variables, again using finite differences and a constant offset for the air temperature profile and constant percentage offsets for the H 2 O profile. The partial derivatives were averaged for all AIRS channels used in our retrievals in the 800-1235 cm -1 region that is sensitive to surface temperature. The uncertainties assumed in the ERA model fields (X unc in Eq. 7) are listed in column three of Table 4 and are likely higher than the estimated uncertainties summarized in Section 4.3. The uncertainties in the 590 BT Jacobians are then multiplied by Y meas in Eq. 7 which is either 0.4 K (the maximum SST anomaly, see Fig. 14), or 0.0096K/year (our retrieved trend in SST).
The results shown in columns four and five of Table 4 clearly indicate that using ERA profile fields for estimated BT surface temperature Jacobian is extremely accurate. The highest uncertainties are due to H 2 O, but even these are far below the statistical uncertainties shown in Table 7. He used a statistical approach to remove trends in water vapor that affect the 1231 cm -1 channel radiances, which he concedes could introduce artifacts if there is a shift in the mean vertical distribution of water vapor. Our approach does not contain this limitation in principle, although we have not carefully examined the retrieved water vapor trends, mainly because there is no truth for comparison. An intercomparison of our results to his are not strictly possible since we used different SST products 600 for truth and our SST anomalies used many channels. However, the trend of the 1231 cm -1 channel in our retrievals can be derived by adding the slope of our fit residual for the 1231 cm -1 channel (-0.7 mK/year) to our derived SST trends for ERA-I and OISST. Using Aumann's units of mK/year, the result is a trend of 1.5 mK/year and 2.7mK/year for ERA-I and OISST respectively, with respective uncertainties of 1.2 and 2.1 mK/year. These two trends compare favorably with Aumman's night trend for 1231 cm -1 of +2.9 ± 0.4 mK/year. It is interesting that our OISST trend differences agrees more closely with his 605 RTGSST trend difference since these two data sets have similar heritage. Of course the extremely low statistical errors reported by Aumann do not allow overlap of these two results, but that is not necessarily expected since we use different SST products.
Agreement for AIRS radiometric trends at the several mK/year level for at least a single channel should be considered quite remarkable.
We also derived AIRS minus (ERA-I, OISST) SST trend differences using AIRS short wave only anomaly retrievals. For 610 tropical latitudes, ± 30 o , the (AIRS -ERA-I) trend is 0.078 ± 0.040 K/decade and 0.065 ± 0.09 K/decade for OISST. These represent significantly higher trend than observed using long and mid wave channels only. The trend difference between (AIRS long wave minus AIRS short wave) anomaly fits is -0.058 ± 0.026 K/decade, clearly indicating the short wave positive drift relative to the long wave.
The latitude dependence of the AIRS derived SST trends versus ERA-I and OISST may eventually help determine the 615 source of some of these differences. Figure 16 shows these trends between ± 60 o latitude. The uncertainties in these trends arẽ 0.005K/year, but are not shown since these uncertainties are primarily geophysical in nature (how linear is the SST trend) and affect each SST product identically. Agreement is quite good among all products in the northern hemisphere, while OISST is systematically lower than AIRS and ERA-I in the southern hemisphere. Also shown are the AIRS SST trends using only the short wave channels (gray curve), which are always higher than the long wave AIRS trends except at the highest latitudes and 620 near the equator.
Unfortunately, the AIRS Level 2 retrieval algorithm only uses short wave channels for surface temperature retrievals (Susskind et al., 2014). A recent inter-comparison of surface temperature trends from the AIRS Level 2 retrievals to three established surface temperature climate products (Susskind et al., 2019) concluded that the AIRS surface temperature trends were 0.24 K/decade, slightly higher than GISTEMP's (Hansen et al., 2010)  The results presented here conclude that the AIRS short wave channels are drifting positive by about 0.058 K/decade relative to the long wave channels, which appear to be in extremely good agreement with established SST climate products as discussed above. If we subtract this 0.058 K/decade AIRS short wave drift from the AIRS 0.24 K/decade trend presented in (Susskind et al., 2019) we obtain a corrected AIRS trend of 0.18 K/decade, much more in line with the HadCRUT4 and C+W values. In 630 this case GISTEMP is now the only outlier. A more straightforward way to validate the reported AIRS Level 2 surface trends reported by (Susskind et al., 2019) would be to directly compare them to other SST products such as OISST, but unfortunately this was not part of the (Susskind et al., 2019) analysis.

CFC12 Retrieval
All anomaly retrievals presented here included CFC12 retrievals. Although these are not used for quantitative assessments of 635 AIRS radiometric stability, the retrieved CFC12 anomaly is shown in Fig. 17  These results give us confidence that the SST retrievals have not been compromised by CFC12 contamination, since there are a number of channels sensitive to both. Note that the trend of~40 ppt of CFC12 derived here from AIRS is equivalent to only corresponds to about 0.11K in brightness temperature for the channel with the highest CFC12 Jacobian.

Retrieval BT Breakouts and Residuals
The anomaly fit residuals provide a wealth of information on the behavior of each AIRS channel versus time. As stated earlier, unphysical shifts in the AIRS radiance time series can be reflected in either the retrieved geophysical anomalies or in the fit residuals. Jumps in the fit residuals will generally take place when the shifted radiances cannot be "adjusted away" by the BT 645 Jacobians, which require a reasonably accurate physical response to radiance jumps. We believe that the anomaly retrieval approach presented here will allow objective corrections to AIRS radiances, especially for radiance jumps that can be tied to instrument events. The excellent agreement between the CO 2 and SST anomalies and in-situ data strongly suggests that the AIRS blackbody is very stable, which is key to climate-level trend measurements.
There are several likely causes for some of the differences seen here between our observed anomalies and the N 2 O and largely been removed in the AIRS L1c product, although some transient shifts in the AIRS M-4a and M-4c arrays (that cover N 2 O and CH 4 channels) have not yet been corrected in L1c (see (Aumann et al., 2020)). The AIRS frequency shifts imply that detector views of the blackbody and cold scene targets have also shifted during the mission. While these shifts are very small, radiometric drifts/shifts could arise from these focal plane movements if the blackbody and cold scene targets are not perfectly 655 uniform. As mentioned in Sec. 5.4, shifts of interference fringes in some of the AIRS entrance filters when Aqua was restarted in Nov. 2003 may also contribute to the observed anomaly shifts. These fringe shifts have been modeled by the authors and future work may include modification of AIRS radiances before Nov. 2003 to remove the effects of these small shifts in the instrument spectral response function.
Here we present several views of the AIRS anomaly fits and their residuals as examples on how future work might proceed 660 to potentially correct the AIRS radiances for small remaining radiometric drifts/shifts.

Retrieved Anomalies in BT Units
First to provide some context, Figure 18 shows the contribution of the various geophysical trends to the observed BT anomalies for channels sensitive to different geophysical variables. This is done by multiplying the BT Jacobian for some particular geophysical variable by its retrieved anomaly over time. For illustration purposed we averaged the trends over the latitude bins 665 from ± 50 • latitude. The upper panel shows that the retrieved CO 2 anomaly translates into a BT trend for the 722.1 cm -1 channel of more than -1K. Channels very sensitive to the retrieved CH 4 and N 2 O anomalies have BT trends that are lower than CO 2 . The anomaly for a channel sensitive to SST in this panel has an upward trend due to increasing SST values, but these are quite small compared to the minor gas trends.

670
The bottom panel of Fig. 18 plots the BT anomalies due to the retrieved temperature, H 2 O, and O 3 anomalies. The profile anomalies have been summed over all levels for this figure. The same channel chosen to illustrate the BT anomaly due to the CO 2 anomaly, 722.1 cm -1 , is also used to illustrate the contribution of the temperature anomaly. The BT trend for the 722.1 cm -1 channel due to the temperature anomaly is far smaller than for CO 2 , is slightly noisier, and has a small positive trend that mostly occurs after 2014. This would be expected since there is also a positive trend for SST with the same general time dependence.

675
The BT trend due to the retrieved H 2 O anomaly is plotted for the 1418.6 cm -1 channel sensitive to mid-tropospheric H 2 O. This BT anomaly moves in the opposite direction to the BT anomaly due to temperature, which is expected since on a large scale increasing temperatures raise H 2 O amounts, which leads to lower BT values. This figure clearly shows that CO 2 dominates the changes in the longwave region, as expected. The N 2 O and CH 4 BT anomalies are concentrated in the 1230-1400 cm -1 region with significant overlap, which is largely separable in the retrieval.

685
On this scale the BT changes due to SST are quite small. In the lower panel the temperature, H 2 O, and O 3 BT anomaly trends are derived from the sum of the profile Jacobians over all layers. In many regions of the spectrum the temperature and H 2 O BT anomaly trends are dominant, an indication that our anomaly retrievals successfully accounted for variability in those parameters. BT trends in the channels sensitive to tropospheric temperature (700-750 cm -1 ) are in the range of 0.01-0.02K/year (after dividing the plotted mean anomaly by eight), nominally consistent with global warming during this period.

690
The H 2 O greenhouse effect is clearly seen in the bottom panel of Fig. 19. The increased emission in the water band (1200-1615 cm -1 ) due to higher atmospheric temperatures is largely negated by the decrease in emission due to increasing amounts of H 2 O, which shifts the emission in any given channel to higher altitudes where the temperature is lower.
Also note that channels sensitive to stratospheric temperatures in the 650-690 cm -1 region have a negative trend, indicating stratospheric cooling. This is also an expected result for global warming, but great care should be taken in using this data set 695 for general conclusions since the sampling is non-uniform, and the trend standard deviation (Fig. 20) is almost 2X larger than the trend. For completeness the standard deviation of the nominal linear anomaly trends shown in Fig. 19 are plotted in Fig. 20 using the same breakouts of geophysical anomalies. The CO 2 BT anomaly trend maximum standard deviation of~0.1 K near 730 cm -1 is only slightly higher than the standard deviation expected if it was solely due to a linear trend in CO 2 , 0.08K. The air 700 temperature stratospheric standard deviation is large, as previously noted, presumably due to the effects of the quasi-biennial oscillation (QBO) and possibly ENSO variability. The variability due to air temperature and H 2 O produces standard deviations in the water region (1250-1615 cm -1 ) that are generally larger than variability due to trends in CH 4 and N 2 O, but apparently our retrieval successfully removes those interferences. Note the relatively high O 3 variability, which we do retrieve but have not examined carefully. It is important to remember that these are anomaly standard deviations, so they do not include seasonal 705 variability.

Anomaly BT Residuals
The anomaly fits shown above are summed and then subtracted from the observed BT anomalies to obtain the the BT anomaly fit residuals. Any trends in these residuals can also be examined to search for channels that changed characteristics during the 16-year time period.  versus B-only drifts are largely cancelled when A+B channels are used. Since the SST retrievals are quite good, and because the surface channels near 1200 cm -1 agree with the A+B channels, we conclude that the A-only and B-only drifts are real, and possibly due to drifts, or offsets, in the exact part of the blackbody and/or cold target scenes observed by these detectors.
Since the N 2 O retrieved anomalies exhibit some small unphysical behaviors, we examine the fit residuals for the 24 channels 720 (used in the retrievals) that are most sensitive to N 2 O. Visual inspection of these channel's residual time series clearly indicated that 12 of them had easily identifiable features due to AIRS events. Figure 22 shows three different averages of these residual time series; (a) 12 good channels, with no strong evidence of AIRS events, (b) 12 bad channels which clearly exhibit jumps at the time of AIRS events, and (c) the mean time series for all 24 channels used in the anomaly fits. We see that the good channel mean (blue) is very flat, with a slight indication of a jump near the Nov. 2003 event. The bad channel curve (red) shows a large jump near Nov. 2003, possibly some longer-term drifts, and a feature in March 2014 that seems to last for 1 to 1 1/2 years.
This last feature can change sign depending on which bad channel is observed, making it very likely that this is due to the M-4a/M-4c frequency calibration shift that is not yet corrected in the L1c product.
A new set of anomaly retrievals were produced, but with the 12 bad N 2 O channels removed. When compared with the ESRL N 2 O anomalies, this change produced slightly better agreement with ESRL after Nov. 2013. The slope of the (AIRS

730
-ESRL) anomaly difference curve was reduced from -0.141 K/Decade (as reported in Table 5) to -0.113 K/Decade, a slight improvement. This drift relative to ESRL reduces to -0.069 K/Decade if anomaly data before Nov. 2013 is ignored. This illustrates that improvements to the AIRS products can be achieved by removing channels with residuals that have non-physical jumps. If the Nov. 2013 radiometric jumps can be removed (whether due to frequency shifts, fringe shifts, or pure radiometric jumps) even higher stability is possible. However, one could presently begin the AIRS time series, say on Jan. 1, 2004 and 735 retain a stability approximately 2X better than climate trends.
These results illustrate a simple case for how the anomaly fit residuals can be used to improve AIRS trend products. In this work we have not looked for non-physical jumps in the retrieved temperature, H 2 O, and O 3 profile anomalies. These products likely exhibit some of these behaviors and need to be included in any comprehensive study to further improve the AIRS radiance stability. Some sort of iterative approach will likely be needed in order to ensure that these small remaining 740 radiometric jumps become undetectable in both the retrieved anomalies and in the anomaly residuals.

Conclusions
A framework for establishing stability of the AIRS radiances has been introduced that uses retrievals of minor gas and SST trends from BT anomaly spectra. Extremely good agreement between retrieved CO 2 trends (or anomalies) and in-situ trends from NOAA ESRL to -0.023 ± 0.009 K/Decade illustrates that a large fraction of AIRS channels are extremely stable, well 745 below climate trends. The SST anomaly retrievals also compare favorably to the ERA-I reanalysis and to NOAA's OISST SST product, with differences of less than 0.022 K/Decade, and slightly higher values for comparisons to OISST. Such good agreement for a wide range of detectors strongly suggests that the AIRS blackbody is very stable.
Unphysical radiometric jumps are observed in all the retrieved anomaly time series, but especially for N 2 O and CH 4 . These jumps can largely be related to AIRS events, and we illustrate how the anomaly fit residuals, combined with inter-comparisons 750 to truth anomaly trends such as N 2 O, may provide a way to correct small remaining jumps in some AIRS channels.
This work emphasizes that users of AIRS radiances (both Level-1b and Level-1c) for climate applications must pay careful attention to channel selection, since certain detector arrays and channels are presently not suitable for climate trending, including all of the AIRS short wave channels. However, establishment of such a high level of stability for so many remote sensing observations/channels is unusual, and should lead to a high level of trust in AIRS climate trends that pay careful attention to 755 only using validated climate-level channels.
to ERA-I and generating a simulated radiance. This simulated dataset was used to set the regularization parameters for the profile inversions. The measurement of anomalies largely removes systematic errors in both the radiance observations (radio-low. It appears that their main impact is again for high latitudes under conditions where we have higher noise due to low number of clear samples. Pressure (hPa) Figure B2. H2O kernels for the anomaly retrievals. These are taken from a random day for the zonal bin centered at 28.3 o N.
The temperature and water vapor retrieval kernels are shown in Figs. B1,B2. They exhibit a very regular spacing in the troposphere with roughly 12 well-separated kernels.
790 Figure B3 illustrates the 400 hPa temperatures retrieved from the AIRS data (top panel) along with the ERA-I anomalies computed directly from the model fields. We do not expect these two data sets to compare perfectly, since for example, the ERA-I anomalies are from relatively large gridded data and the AIRS measurement are from a nominal 15 x 15 km field-of- Temperature profile trends retrieved from the AIRS observed anomalies. The middle panel simulation assumes that RTA is perfectly accurate.
view. Given the non-uniform sampling of this data set we do not think detailed examination of the observed versus ERA-I anomalies is warranted for scientific purposes. However, note that there are many similarities in time and latitude that give 795 some measure of validation to our profile retrievals. Similar results are seen with water vapor profiles. Figure B4 summaries the temperature trend simulations and comparisons between ERA-I trends, our anomaly retrievals from the ERA-I generated radiances, and those observed with the AIRS clear subset. The trends are computed from the anomaly retrievals (or model fields) using Eq. 6, where the input is the layer temperature instead of a CO 2 amount.
These results have been slightly smoothed to make visual inter-comparisons easier. The left panel shows the vertical trends 800 versus latitude directly computed from the ERA-I temperature fields. The middle panel shows our simulated temperature trend retrievals. These simulations agree quite well with the ERA-I model fields, the largest differences are seen in the lower troposphere at the higher latitudes, and near the boundary layer in the tropics. The simulated retrievals are also placing the tropopause too high, not surprising given the lack of sensitivity of the infrared to the tropopause height and our limited number of vertical layers. The right panel shows the temperature anomaly trends retrieved from the AIRS observed anomalies. Clearly 805 there are significant differences between the ERA-I temperature profile trends and those we retrieved from AIRS, although the basic structure is relatively similar. Note that the uncertainties in these trends are quite high in the stratosphere (not shown) due to variations in the quasi-biennal oscillation (QBO), especially in the tropics, with errors larger than the observed trends in the vicinity of the tropopause. However, these uncertainties are largely present in both ERA-I and the AIRS observations.
The AIRS observed anomalies may also be impacted by errors in the BT Jacobians. The middle panel in Fig. B4 used similar 810 RTAs for both simulations and the retrieval. The version of SARTA used for the radiance simulations is based on HITRAN2008 while the Jacobians used in the retrieval used kCARTA which is based on HITRAN2016 and a slightly modified version of CO 2 line-mixing. We expect that these spectroscopy differences have little impact since the CO 2 line strengths for the strong 15 µm bands have not changed between HITRAN versions. In addition, no noise was added to the simulated anomalies.
We believe that these results show that the anomaly retrievals used for measuring minor-gas trends exhibit realistic behavior 815 and given our simulation testing this retrieval approach is likely to give accurate minor-gas trends. The impact of some of the regularization choices are discussed in Sect.5.4.
Author contributions. LLS led the study and made the comparisons between the anomaly fits and in-situ data. SDM developed the anomaly retrieval algorithm. LLS and SDM together optimized the anomaly retrieval regularization.
Competing interests. The authors declare that they have no conflict of interest.