TROPESS/CrIS carbon monoxide profile validation with NOAA GML and ATom in situ aircraft observations


Abstract. The new single-pixel TROPESS (TRopospheric Ozone and its Precursors from Earth System Sounding) profile retrievals of carbon monoxide (CO) from the Cross-track Infrared Sounder (CrIS) are evaluated using vertical profiles of in situ observations from the National Oceanic and Atmospheric Administration (NOAA) Global Monitoring Laboratory (GML) aircraft program and from the Atmospheric Tomography Mission (ATom) campaigns. The TROPESS optimal estimation retrievals are produced using the MUSES (MUlti-SpEctra, MUlti-SpEcies, MUlti-Sensors) algorithm, which has heritage from retrieval algorithms developed for the EOS/Aura Tropospheric Emission Spectrometer (TES). TROPESS products provide retrieval diagnostics and error covariance matrices that propagate instrument noise as well as the uncertainties from sequential retrievals of parameters such as temperature and water vapor that are required to estimate the carbon monoxide profiles. The validation approach used here evaluates biases in column and profile values as well as the validity of the retrieval error estimates using the mean and variance of the compared satellite and aircraft observations. CrIS-NOAA GML comparisons had biases of 0.6 % for partial column average volume mixing ratios (VMRs) and (2.3, 0.9, −4.5) % for VMRs at (750, 511, 287) hPa vertical levels, respectively, with standard deviations from 9 % to 14 %. CrIS-ATom comparisons had biases of −0.04 % for partial column and (2.2, 0.5, −3.0) % for (750, 511, 287) hPa vertical levels, respectively, with standard deviations from 6 % to 10 %. The reported observational errors for TROPESS/CrIS CO profiles have the expected behavior with respect to the vertical pattern in standard deviation of the comparisons. These comparison results give us confidence in the use of TROPESS/CrIS CO profiles and error characterization for continuing the multi-decadal record of satellite CO observations.

Introduction
Carbon monoxide (CO) is a useful tracer of atmospheric pollution, with direct emissions from incomplete combustion such as biomass and fossil fuel burning as well as secondary production from the oxidation of methane (CH4) and volatile organic compounds (VOCs). Atmospheric CO distributions have a seasonal cycle that is mainly driven by photochemical destruction, which allows CO to build up over winter and early spring in higher latitudes. The lifetime of CO of weeks to months (e.g., Holloway et al., 2000) is long enough to allow observations of pollution plumes and their subsequent long-range transport, but short enough to distinguish the plumes against background seasonal distributions (e.g., Edwards et al., 2004, 2006; Hegarty et al., 2009, 2010). As a dominant sink for the hydroxyl radical (OH), CO plays a critical role in atmospheric reactivity (e.g., Lelieveld et al., 2016) and is considered a short-lived climate pollutant (SLCP) because of its impacts on methane lifetime as well as carbon dioxide and ozone formation (e.g., Myhre et al., 2014; Gaubert et al., 2017).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Global observations of tropospheric CO from satellites started in 2000 with the NASA Earth Observing System (EOS) Measurements of Pollution in the Troposphere (MOPITT) instrument on Terra (Drummond et al., 2010), followed by the EOS Atmospheric Infrared Sounder (AIRS, McMillan et al., 2005) on Aqua launched in 2002, the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY, de Laat et al., 2006) on Envisat launched in 2002, the EOS Tropospheric Emission Spectrometer (TES, Beer et al., 2006) on Aura launched in 2004, the Infrared Atmospheric Sounding Interferometer (IASI, Clerbaux et al., 2009) on the MetOp series beginning in 2006, the Cross-track Infrared Sounder (CrIS, Gambacorta et al., 2014) on the Suomi National Polar-orbiting Partnership (SNPP) satellite launched in 2011, and most recently the Joint Polar Satellite System (JPSS) series, TROPOMI on the Sentinel-5 precursor in 2017 (Borsdorff et al., 2018), and the Fourier transform spectrometer (FTS-2) on the Greenhouse gases Observing SATellite-2 (GOSAT-2, Suto et al., 2021) launched in 2018. Satellite CO observations are assimilated for reanalyses and operational air quality forecasting (e.g., Gaubert et al., 2016; Inness et al., 2019; Miyazaki et al., 2020) and have been used in inverse modeling analyses to estimate emissions and attribute sources for co-emitted species such as CO2 (e.g., Kopacz et al., 2010; Jiang et al., 2017; Liu et al., 2017; Zheng et al., 2019; Gaubert et al., 2020; Byrne et al., 2021; Qu et al., 2022). Trend analyses of satellite CO observations (e.g., Worden et al., 2013; Buchholz et al., 2021) show a general decline of atmospheric CO over the satellite record globally and in most regions, but with a slowing of this decrease in recent years that emphasizes the need for continued satellite CO observations that are validated and have reliable error characterization.
In this study, we evaluate the biases and reported uncertainties of single-field-of-view (FOV) CO retrievals from the Cross-track Infrared Sounder (CrIS) on board the SNPP satellite launched in October 2011. CrIS is a Fourier transform spectrometer (FTS) that has continuation instruments on the current and planned JPSS series with JPSS1/NOAA-20 launched in 2017 and planned launches in 2022, 2028, and 2032 (https://www.nesdis.noaa.gov/about/our-offices/joint-polar-satellite-system-jpss-program-office, last access: 14 September 2022). The CrIS CO retrievals evaluated here use the MUSES (MUlti-SpEctra, MUlti-SpEcies, MUlti-Sensors) algorithm (Fu et al., 2016, 2018, 2019) and are processed with the TROPESS (TRopospheric Ozone and its Precursors from Earth System Sounding) Science Data Processing System (Bowman, 2021). TROPESS is a NASA project that provides a framework for consistent data processing of ozone and ozone precursors across different satellite instruments. TROPESS retrievals use single-FOV radiances in sequential optimal estimation retrievals (Rodgers, 2000) of temperature, water vapor, effective cloud parameters, ozone, CO, and other trace gases, allowing for full characterization of the vertical retrieval sensitivity with an averaging kernel and error covariance (Bowman et al., 2006). TROPESS/CrIS CO products differ from other available CrIS CO data products that combine nine FOVs to obtain a single cloud-cleared radiance and corresponding retrieval of atmospheric parameters such as the NOAA Unique Combined Atmospheric Processing System (NUCAPS) (Gambacorta et al., 2014, 2017; Nalli et al., 2020) and the Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) (Smith and Barnet, 2020).
TROPESS data products report a separate matrix for the observational error terms along with the total retrieval error covariance that includes the contribution of smoothing error. This is important for evaluation of retrieval errors using in situ profiles since the validation comparison removes the effect of smoothing in the retrieval by applying the retrieval averaging kernel and a priori to the in situ profile before differencing (Rodgers and Connor, 2003). Similar comparisons were performed in the recent validation study for the MUSES single-FOV CO retrievals from the Aqua Atmospheric Infrared Sounder (AIRS) of Hegarty et al. (2022).
Section 2 describes the TROPESS retrievals and CO data products in more detail, and Sect. 3 describes the validation in situ data from the National Oceanic and Atmospheric Administration (NOAA) Global Monitoring Laboratory (GML) aircraft network and the Atmospheric Tomography Mission (ATom) campaigns. The validation methods are presented in Sect. 4, and results are shown in Sect. 5 with a summary and conclusions in Sect. 6.

TROPESS/CrIS single-field-of-view CO profile retrievals
The first Cross-track Infrared Sounder (CrIS) was launched 28 October 2011 on the SNPP satellite into a sun-synchronous polar orbit with an altitude near 830 km and an Equator-crossing time (ascending node) near 13:30 LT. CrIS is a Fourier transform spectrometer (FTS) operating in three spectral bands between 648 and 2555 cm−1. This includes the R-branch of the thermal infrared (TIR) CO (0-1) fundamental band above 2155 cm−1. After launch, spectral radiance data that included the CO band were collected using a spectral resolution of 2.5 cm−1. This resolution was relatively coarse and significantly limited the vertical sensitivity of CO retrievals (Gambacorta et al., 2014). Following the decision to collect data at full spectral resolution (δ = 0.625 cm−1), these finer-resolution spectral radiances have been available since 4 December 2014. Here we only utilize the full-spectral-resolution CrIS data.

TROPESS retrieval approach
TROPESS data processing (Bowman, 2021) produces retrievals of temperature, water vapor, and trace gases such as ozone (O3), methane (CH4), carbon monoxide (CO), ammonia (NH3), and peroxyacetyl nitrate (PAN) from single and multiple instruments including AIRS and OMI as well as CrIS and TROPOMI. The MUSES retrieval algorithm used in TROPESS was developed with heritage from Aura/TES retrieval processing. Bowman et al. (2021) describe the sequential MUSES retrievals of temperature, water vapor, and effective cloud properties for each FOV that are necessary for the retrieval of CO. Each step in the sequence includes an iterative retrieval with a forward model and updated estimate of the state vector of atmospheric parameters following the maximum a posteriori (MAP) method. The forward model for radiative transfer at CrIS TIR wavelengths uses optimal spectral sampling (OSS, Moncet et al., 2015), which includes effective cloud optical depth and height parameters (Eldering et al., 2008; Kulawik et al., 2006).
Here we analyze TROPESS/CrIS TIR-only CO retrievals that use the 2181-2200 cm−1 spectral range. A priori profiles for TROPESS CO retrievals are taken from the model climatology used in Aura/TES processing (MOZART, Brasseur et al., 1998), with monthly variation over a 30° latitude and 60° longitude grid. The a priori uncertainty covariance matrix used to constrain the retrieval is the same as used for MOPITT profiles (Deeter et al., 2010) with 30 % uncertainty for vertical CO parameters at all levels and correlation lengths corresponding to 100 hPa between them in the troposphere.
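To make the a priori constraint described above concrete, the following is a minimal Python sketch of a covariance matrix with 30 % uncertainty at every level and a 100 hPa correlation length. The Gaussian form of the correlation, the pressure grid, and the function name are assumptions made for illustration; the text only states the uncertainty and the correlation length, and this is not the TROPESS or MOPITT code.

```python
import numpy as np

def prior_covariance(pressure_hpa, frac_sigma=0.30, corr_length_hpa=100.0):
    """Illustrative a priori covariance on a pressure grid.

    Assumption: correlations fall off as a Gaussian in pressure difference.
    Only the 30 % uncertainty and 100 hPa length come from the text.
    """
    p = np.asarray(pressure_hpa, dtype=float)
    dp = p[:, None] - p[None, :]                  # pairwise pressure differences
    corr = np.exp(-(dp / corr_length_hpa) ** 2)   # 1 on the diagonal
    return frac_sigma ** 2 * corr

# Hypothetical tropospheric grid with 100 hPa spacing.
levels = np.array([1000.0, 900.0, 800.0, 700.0, 600.0, 500.0, 400.0, 300.0])
S_a = prior_covariance(levels)
print(np.sqrt(np.diag(S_a)))  # fractional uncertainty at each level
```

With this form, adjacent levels on the hypothetical grid are correlated by a factor of exp(−1), and the correlation becomes negligible for levels separated by several hundred hPa.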
The TROPESS CO products have quality flags for screening cases that did not converge or that have unphysical results. This screening checks the magnitude and spectral structure of radiance residuals, cloud retrieval characteristics, and deviation of surface emissivity from a priori values. Specifically, retrievals with good data quality of 1 have radiance residual standard deviation less than 12 times the radiance error, an absolute value of the radiance residual mean less than 0.7 times the radiance error, KdotDL (the normalized dot product of the Jacobians and the radiance residual) less than 0.8, LdotDL (the normalized dot product of the radiance and the residual) less than 0.6, cloud-top pressures below 90 hPa, mean cloud optical depths less than 50, cloud variability (variation with respect to wavenumber) less than 3, and mean surface emissivity that did not change by more than 0.06. These threshold values are based on comparisons with in situ data and other satellite data to determine when retrievals are valid.
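The screening thresholds listed above can be collected into a single check, sketched below in Python. The class and field names are hypothetical and do not correspond to TROPESS product variable names. The cloud-top criterion is interpreted here as a cloud top located below the 90 hPa level (pressure greater than 90 hPa), which is an assumption about the wording in the text.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Diagnostics:
    # Hypothetical field names, chosen for this illustration only.
    residual_sd_ratio: float       # radiance residual SD / radiance error
    residual_mean_ratio: float     # radiance residual mean / radiance error
    kdotdl: float                  # normalized Jacobian-residual dot product
    ldotdl: float                  # normalized radiance-residual dot product
    cloud_top_pressure_hpa: float
    mean_cloud_od: float
    cloud_variability: float
    emissivity_change: float       # retrieved minus a priori mean emissivity

def is_good_quality(d: Diagnostics) -> bool:
    """Return True when every threshold quoted in the text is satisfied."""
    checks = (
        d.residual_sd_ratio < 12.0,
        abs(d.residual_mean_ratio) < 0.7,
        d.kdotdl < 0.8,
        d.ldotdl < 0.6,
        d.cloud_top_pressure_hpa > 90.0,   # assumed reading of "below 90 hPa"
        d.mean_cloud_od < 50.0,
        d.cloud_variability < 3.0,
        abs(d.emissivity_change) <= 0.06,
    )
    return all(checks)

example = Diagnostics(1.1, 0.2, 0.1, 0.05, 650.0, 0.3, 0.5, 0.01)
too_cloudy = replace(example, mean_cloud_od=80.0)
print(is_good_quality(example), is_good_quality(too_cloudy))
```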

TROPESS/CrIS CO data examples
Figure 1 shows an example of TROPESS/CrIS CO data for 12 September 2020 when there were significant fires in the western US. These retrievals are from a special data collection that processed scenes selected from 0.25° × 0.25° latitude-longitude subsampling to enable throughput with the available computing capacity (Bowman et al., 2021). The data in this collection are pre-filtered for quality (see Sect. 2.1), and Fig. 1a shows all available day and night retrievals. Figure 1b shows the data after higher cloudy scenes are removed (i.e., cloud tops with pressure < 700 hPa and cloud effective optical depth > 0.1). For reference, Fig. 1c shows the mid-tropospheric average CO volume mixing ratio (VMR) for the a priori profiles used in the retrievals, and Fig. 1d shows a NASA Worldview (https://worldview.earthdata.nasa.gov/, last access: 14 September 2022) image from SNPP/VIIRS (Visible Infrared Imaging Radiometer Suite) with clouds and smoke shown in true color and red areas indicating fire and thermal anomalies. Since vertical profile retrievals using TIR radiances have sensitivity to CO mainly in the free troposphere, Fig. 1 shows individual retrievals with average VMR from vertical layers between 700 and 350 hPa. When all scenes are included, the average number of degrees of freedom for signal (DFS) is 0.99 for the CrIS CO observations in Fig. 1a, and when cloudy scenes are removed (Fig. 1b) the average DFS is 1.14 for the remaining CrIS observations.
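The cloud screening and the 700 to 350 hPa layer average used for Fig. 1 can be illustrated with a short Python sketch. The function names and the example profile are hypothetical, and the layer average is taken here as an unweighted mean over the retrieval levels inside the layer, which is an assumption; the text does not state how the levels are weighted.

```python
import numpy as np

def keep_scene(cloud_top_hpa, cloud_od):
    """Drop scenes with high cloud: top pressure < 700 hPa and optical depth > 0.1."""
    return not (cloud_top_hpa < 700.0 and cloud_od > 0.1)

def mid_trop_mean_vmr(pressure_hpa, vmr_ppb, p_bottom=700.0, p_top=350.0):
    """Unweighted mean VMR over levels between p_bottom and p_top (assumed weighting)."""
    p = np.asarray(pressure_hpa, dtype=float)
    v = np.asarray(vmr_ppb, dtype=float)
    in_layer = (p <= p_bottom) & (p >= p_top)
    return float(v[in_layer].mean())

# Hypothetical retrieval levels (hPa) and CO values (ppb).
pressure = [825.0, 681.0, 562.0, 464.0, 383.0, 316.0]
co_ppb = [120.0, 100.0, 90.0, 80.0, 70.0, 60.0]
print(mid_trop_mean_vmr(pressure, co_ppb))  # mean over 681, 562, 464, 383 hPa
```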
As stated in the Introduction, the TROPESS single-FOV products are different from the NUCAPS and CLIMCAPS products that combine nine FOVs in a retrieval from a single cloud-cleared radiance (Susskind et al., 2003). These multiple-FOV products have the advantage of increased global coverage in the presence of partially cloudy scenes but with coarser spatial resolution. Figure 2 shows an example of SNPP CLIMCAPS (Barnet, 2019) compared to SNPP TROPESS/CrIS CO products (daytime only) on 13 September 2018 over the Pole Creek fire in Utah. For CLIMCAPS, trace gas products with less than 1 DFS report mass mixing ratio (MMR) on a single level at the retrieval pressure with peak sensitivity, which is 500 hPa for CO. We converted MMR to VMR for Fig. 2. This is compared to the tropospheric column average VMR from TROPESS, so the background VMR values are close but do not represent the same retrieved quantities. CrIS retrieval center locations are shown by the circles in Fig. 2a and b, which are not intended to represent the spatial extent of the observations. The CLIMCAPS retrievals show elevated CO from the fire, but these combined FOV retrievals would give an overestimate of the plume width and do not distinguish the larger plume from the smaller fires to the east in Colorado.
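For reference, the MMR to VMR conversion mentioned above amounts to scaling by the ratio of molar masses. The Python sketch below assumes the MMR is defined relative to dry air; the constant and function names are illustrative and are not taken from the CLIMCAPS or TROPESS software.

```python
M_DRY_AIR = 28.9647  # g/mol, mean molar mass of dry air
M_CO = 28.0101       # g/mol, molar mass of carbon monoxide

def mmr_to_vmr(mmr):
    """Convert a CO mass mixing ratio (kg/kg, dry air assumed) to a volume mixing ratio."""
    return mmr * M_DRY_AIR / M_CO

# A hypothetical CO mass mixing ratio of 1e-7 kg/kg, expressed in ppb by volume.
print(mmr_to_vmr(1.0e-7) * 1.0e9)
```

Because the molar masses of CO and dry air are close, the VMR is only about 3 % larger than the MMR.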

NOAA GML aircraft network
Spanning 3 decades, NOAA GML aircraft network vertical profile observations are taken on semi-regular flights (approximately one per month) at fixed sites mostly in North America except for one site in Rarotonga, Cook Islands (Sweeney et al., 2015).These flights collect air samples using an automated flask system to obtain vertical profiles for each trace gas measured from near the surface to around 400 hPa,  1).

ATom aircraft campaigns
The Atmospheric Tomography Mission (ATom) was designed to study air masses in the most remote regions of the Pacific and Atlantic Ocean in each season (Thompson et al., 2022), which also makes the data valuable for validating satellite CO observations over a range of latitudes, with mostly background CO concentrations, except for where transported pollution plumes were encountered (Deeter et al., 2019, 2022; Martínez-Alonso et al., 2020; Hegarty et al., 2022). We use CO profiles from the quantum cascade laser spectrometer (QCLS) on ATom campaigns 1-4 (see Table 1). These NASA DC-8 flights obtained vertical profiles from 0.2 to 12 km altitude (∼ 200 hPa) by ascending or descending approximately every 220 km. CO was measured at 1 Hz with QCLS reproducibility around 0.15 ppbv (McManus et al., 2010; Santoni et al., 2014). The QCLS data were calibrated to the X2014A CO WMO scale maintained by the NOAA GML.
We then find all eligible CrIS and aircraft profile pairs within 9 h and 50 km distance. This has been a standard coincidence distance criterion for several validation studies (e.g., Deeter et al., 2019, 2022; Hegarty et al., 2022).
... uncertainty introduced by this extension explicitly using NOAA AirCore in situ balloon profiles that sample into the stratosphere (Karion et al., 2010). This uncertainty is computed for validation using aircraft profiles (with top samples around 400 hPa for NOAA/GML) by comparing MOPITT profiles to truncated and extended AirCore profiles vs. the true full AirCore profiles. The comparison error introduced by the extension was at most 3 % around 300 hPa and much less than the standard deviation of MOPITT and full AirCore profile differences (∼ 7 %-10 %) in the upper troposphere. We also note that for ATom profiles, the highest-altitude samples are normally taken around 12 km (∼ 200 hPa), and the profile extension therefore has a minimal impact on tropospheric validation results.
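The 9 h and 50 km coincidence criteria described above can be sketched in Python as follows. The haversine great-circle distance, the spherical Earth radius, and the dictionary keys are assumptions for illustration; the text does not specify how the distance is computed.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_coincident(sat, air, max_hours=9.0, max_km=50.0):
    """True when a satellite and an aircraft profile meet both criteria.

    `sat` and `air` are dicts with hypothetical keys "time", "lat", "lon".
    """
    close_in_time = abs(sat["time"] - air["time"]) <= timedelta(hours=max_hours)
    close_in_space = great_circle_km(sat["lat"], sat["lon"], air["lat"], air["lon"]) <= max_km
    return close_in_time and close_in_space

t0 = datetime(2018, 5, 1, 20, 30)
sat = {"time": t0, "lat": 40.0, "lon": -105.0}
air = {"time": t0 + timedelta(hours=3), "lat": 40.3, "lon": -105.0}
print(is_coincident(sat, air))
```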

Comparison of TROPESS satellite and aircraft observations
In order to account for the satellite observational and retrieval approach, including prior information, when comparing satellite retrieval products to in situ measurements of CO, we apply the instrument operator to convert the in situ profile into the values that would be retrieved for the same air mass assuming the satellite instrument and retrieval (Jones et al., 2003; Rodgers and Connor, 2003; Worden et al., 2007):

$$\hat{x}_{val} = x_a + A \, (x_{val} - x_a), \qquad (1)$$

where $x_{val}$ is the aircraft or sonde in situ profile being used for validation (following extension, described above, and linear interpolation to the satellite vertical grid), $x_a$ is the a priori profile used in the TROPESS retrieval, $A$ is the averaging kernel matrix that describes the observation and retrieval vertical sensitivity to the true state, and $\hat{x}_{val}$ is the in situ validation profile transformed by the satellite instrument operator.
This operation accounts for both the broad vertical resolution (or "smoothing") of remotely sensed measurements and the influence of the a priori, which is especially important in the vertical ranges where satellite observations have low sensitivity to CO abundance. Figure 3 shows an example of the averaging kernel A and a validation comparison with Eq. (1) applied to an ATom in situ profile.
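As an illustration of the instrument operator in Eq. (1), the Python sketch below applies an averaging kernel and a priori profile to an in situ profile that has already been interpolated to the retrieval grid. The function name and the two-level example values are hypothetical. The inputs are assumed to be in the same state representation as the averaging kernel (for example VMR or its logarithm), which the text here does not specify.

```python
import numpy as np

def apply_instrument_operator(x_val, x_a, A):
    """Return x_a + A (x_val - x_a) for profiles on the retrieval grid."""
    x_val = np.asarray(x_val, dtype=float)
    x_a = np.asarray(x_a, dtype=float)
    A = np.asarray(A, dtype=float)
    return x_a + A @ (x_val - x_a)

# Two-level toy example: the retrieval recovers half of each departure from the prior.
x_a = np.array([100.0, 90.0])
x_val = np.array([120.0, 80.0])
A = 0.5 * np.eye(2)
print(apply_instrument_operator(x_val, x_a, A))
```

The limiting cases show the role of the averaging kernel: an identity matrix returns the in situ profile unchanged, while a zero matrix (no retrieval sensitivity) returns the a priori profile.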

Evaluating TROPESS CO reported observational errors
Following Bowman et al. (2006, 2021), for retrieved parameter $x$ (e.g., CO abundance) with a priori covariance $S_a$, radiance measurement covariance $S_e$, Jacobian matrix $K = \partial L / \partial x$, radiance $L(x)$, gain matrix $G = (K^T S_e^{-1} K + S_a^{-1})^{-1} K^T S_e^{-1}$, and averaging kernel $A = GK$, the a posteriori error covariance can be written as the sum of

$$S_{total} = S_{smoothing} + S_{obs}, \qquad (2)$$

with $S_{smoothing} = (I - A_{xx}) \, S_a \, (I - A_{xx})^T$ and

$$S_{obs} = S_{noise} + S_{sys} + S_{cross}, \qquad (3)$$

where

$$S_{noise} = G S_e G^T, \quad S_{sys} = G K_b S_b (G K_b)^T, \quad S_{cross} = A_{x b_{ret}} \, S_a^{b_{ret}} \, A_{x b_{ret}}^T. \qquad (4)$$

In this notation, $b$ variables are parameters that are held constant in the CO retrieval (such as temperature and water vapor) but affect the radiance observation and are propagated through Jacobian $K_b$ with covariance $S_b$, while $b_{ret}$ variables are retrieved along with CO (such as surface emissivity), have a priori covariance $S_a^{b_{ret}}$, and have corresponding off-diagonal terms $A_{x b_{ret}}$ in the full retrieval averaging kernel matrix. When we apply the satellite instrument operator in Eq. (1) to the in situ aircraft profile, we are accounting for the smoothing error term. Thus, we expect differences between $\hat{x}_{val}$ and our retrieved $x$ to be due to observational error terms (Eq. 3) and to geophysical differences from the sampling of different air masses and surface locations because of imperfect coincidence.
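The relationship between the gain matrix, the averaging kernel, and the smoothing and measurement-noise terms can be checked numerically. The Python sketch below uses a small random linear problem with hypothetical dimensions and covariances. It includes only the smoothing and noise terms, so the systematic and cross-state contributions described above are omitted by assumption.

```python
import numpy as np

def error_budget(K, S_e, S_a):
    """Gain, averaging kernel, and smoothing/noise covariances for a linear retrieval."""
    S_e_inv = np.linalg.inv(S_e)
    S_a_inv = np.linalg.inv(S_a)
    S_post = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)   # a posteriori covariance
    G = S_post @ K.T @ S_e_inv                            # gain matrix
    A = G @ K                                             # averaging kernel
    I = np.eye(A.shape[0])
    S_smoothing = (I - A) @ S_a @ (I - A).T
    S_noise = G @ S_e @ G.T
    return G, A, S_smoothing, S_noise, S_post

rng = np.random.default_rng(0)
K = rng.normal(size=(6, 3))   # hypothetical Jacobian: 6 channels, 3 state elements
S_e = 0.1 * np.eye(6)         # hypothetical radiance noise covariance
S_a = 0.09 * np.eye(3)        # hypothetical prior covariance (30 % uncertainty)

G, A, S_smoothing, S_noise, S_post = error_budget(K, S_e, S_a)
dfs = np.trace(A)             # degrees of freedom for signal
print(dfs, np.allclose(S_smoothing + S_noise, S_post))
```

In this simplified case, the smoothing and noise covariances sum to the a posteriori covariance, and the trace of the averaging kernel gives the DFS quoted for the retrievals in Sect. 2.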

TROPESS/CrIS CO comparisons with NOAA GML aircraft data
After extending the in situ profiles vertically (described in Sect. 4.1) and applying Eq. (1), we compute the differences between satellite retrievals and transformed aircraft profiles.
Figure 4 shows the bias (% relative difference) of the CrIS CO retrieved profiles with respect to NOAA GML aircraft profiles ($\hat{x}_{val}$). A similar pattern of positive bias in the lower to middle troposphere and negative bias in the upper troposphere is observed for MUSES/AIRS profiles compared to NOAA GML flights (Hegarty et al., 2022). However, MOPITT (version 9, TIR-only data) comparisons to NOAA GML (Deeter et al., 2022) have almost the opposite vertical bias pattern with a negative bias (−1.6 %) in the lower to middle troposphere and a positive bias (0.6 %) in the upper troposphere. Since TROPESS and MOPITT retrievals both use optimal estimation algorithms and a similar prior CO error covariance, this different vertical bias pattern is most likely due to instrument differences. MOPITT uses gas filter correlation radiometry instead of spectroscopy to detect CO absorption in the atmosphere with corresponding differences in vertical sensitivity that are determined from gas cell pressure rather than spectral resolution. After accounting for retrieval differences in a priori profiles and covariances between MOPITT and IASI (another FTS instrument), George et al. (2015) find a similar positive bias for MOPITT in the upper troposphere. Table 2 gives the mean bias and standard deviations for selected pressures and partial column average VMRs over different observing conditions (land, ocean, day, and night). The partial column refers to the CO column between the minimum and maximum flight altitudes of each aircraft profile. The average VMR over this range is computed by interpolating both the CrIS retrieval and the aircraft $\hat{x}_{val}$ profile to these end points. Since aircraft flights normally occur during daytime, there are fewer coincident pairs for CrIS night retrievals. Tang et al. (2020) find larger bias and variance for nighttime MOPITT data in comparisons with in situ aircraft data, especially for flights over urban regions, suggesting that more night validation flights are needed to properly evaluate night satellite retrievals.
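The bias and standard deviation statistics reported per pressure level can be computed along the lines of the Python sketch below. It assumes that the percent relative difference is normalized by the transformed aircraft profile and that the sample standard deviation is used; the function name and the toy values are hypothetical.

```python
import numpy as np

def bias_stats(retrieved, x_val_hat):
    """Mean and sample SD of percent differences, per vertical level.

    Both inputs have shape (n_pairs, n_levels). The normalization by the
    transformed aircraft profile is an assumption of this sketch.
    """
    retrieved = np.asarray(retrieved, dtype=float)
    x_val_hat = np.asarray(x_val_hat, dtype=float)
    rel = 100.0 * (retrieved - x_val_hat) / x_val_hat
    return rel.mean(axis=0), rel.std(axis=0, ddof=1)

# Two coincident pairs on two levels (ppb), toy values only.
retrieved = np.array([[102.0, 98.0], [104.0, 96.0]])
x_val_hat = np.full((2, 2), 100.0)
mean_bias, sd_bias = bias_stats(retrieved, x_val_hat)
print(mean_bias, sd_bias)
```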
Figure 5 shows how the observed partial column average VMRs and CrIS retrieval bias with respect to NOAA GML $\hat{x}_{val}$ profiles vary with latitude, and Fig. 6 shows how these vary with time. No significant bias dependence on latitude is observed for the NOAA GML flight sites. Although a bias drift of −0.007 ± 0.001 % d−1 is detected, we recognize that our comparison time range is not sufficient for a reliable estimate of bias drift, and more years of comparisons would be required.
... profiles for all latitudes and three latitude ranges: 30° S to 30° N, 90 to 30° S, and 30 to 90° N. The vertical behavior of the bias is similar to the above CrIS comparisons with NOAA GML flights, with a positive bias in the lower troposphere and a negative bias in the upper troposphere, and is also similar to the MUSES/AIRS CO profiles compared to ATom flights (Hegarty et al., 2022). However, for MOPITT V9T comparisons to ATom flights (Deeter et al., 2022), the vertical bias pattern is again mostly opposite, with a negative bias (∼ 4 %) in the lower to middle troposphere and a positive bias (∼ 2 %) in the upper troposphere. This TROPESS/CrIS CO bias also differs from Nalli et al. (2020), who examined the bias of NUCAPS profiles (including CO) with respect to ATom in situ profiles. That study, using the multiple-FOV NUCAPS retrievals, found a small positive bias (∼ 2 %) for SNPP/CrIS CO with respect to ATom CO at all tropospheric vertical levels after applying their averaging kernels. CrIS CO comparisons with ATom have less variance than comparisons with NOAA GML, especially for 90 to 30° S. Table 3 gives the mean bias and standard deviations for selected pressures and partial column average VMRs over different observing conditions (land, ocean, day, and night) and latitude ranges. As described above, the partial column average VMR is computed over the altitude ranges of each aircraft profile. Due to the nature of the ATom campaign, there are fewer observations over land.

TROPESS/CrIS CO validation with ATom
Figure 8 shows how the observed partial column average VMRs and CrIS retrieval bias with respect to ATom $\hat{x}_{val}$ profiles vary with latitude. It appears that tropical and Northern Hemisphere subtropical latitude ranges have a slightly higher positive bias than what is observed for higher latitudes, potentially indicating a TROPESS/CrIS retrieval issue with water vapor or some other interferent that is not fully characterized and requires further investigation. For example, Deeter et al. (2018) found that an empirical correction to MOPITT radiances resulting from a linear dependence on water vapor removed most of the latitude-dependent bias in MOPITT CO profiles. Another gas interferent in the TIR CO band is N2O, and we will also need to consider the latitude-dependent N2O anomalies observed by ATom (Gonzalez et al., 2021) when assessing the contributions to this latitude dependence in TROPESS/CrIS CO bias.
In Fig. 9, we examine the seasonal behavior of CO sampled by ATom and CrIS in mostly remote ocean regions. In the high-latitude Southern Hemisphere (SH), we see the lowest values in summer and fall (Jan-Feb and Apr-May) as expected due to the chemical destruction of CO in a region with few local combustion sources. In the tropics, we find high values corresponding to African and South American biomass burning plumes over the Atlantic in all seasons except Northern Hemisphere (NH) spring. Lower values of CO in the tropics for NH summer and winter correspond to profiles over the Pacific Ocean (e.g., Strode et al., 2018; Bourgeois et al., 2020). The close alignment of the CrIS and ATom $\hat{x}_{val}$ partial column average values in Fig. 9 indicates that CrIS is able to capture the seasonal, latitudinal, and hemispherical variations observed by ATom.

Dependence on CO amount
For both the NOAA GML and ATom flights we find a small negative dependence of TROPESS/CrIS retrieval bias with respect to CO amount, with magnitude less than 0.1 % ppb−1. Figure 10 shows how the partial column average VMR bias varies with CO VMR for the two validation data sources, and we can also see how ATom flights sampled air with lower CO concentrations. Figure 10 indicates that TROPESS/CrIS CO average column VMRs have very little dependence on CO amount, and we find similar results for CrIS-retrieved CO at vertical levels 511 and 750 hPa (shown in the Supplement).
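A dependence of bias on CO amount in units of % ppb−1 can be estimated with an ordinary least-squares line, as in the Python sketch below. The use of a simple unweighted linear fit and the synthetic values are assumptions for illustration; the text does not describe the fitting method.

```python
import numpy as np

def bias_slope(co_ppb, bias_percent):
    """Least-squares slope (% per ppb) and intercept (%) of bias versus CO amount."""
    slope, intercept = np.polyfit(np.asarray(co_ppb, dtype=float),
                                  np.asarray(bias_percent, dtype=float), 1)
    return slope, intercept

# Synthetic, exactly linear example: bias falls by 0.05 % for each ppb of CO.
co_ppb = np.array([60.0, 80.0, 100.0, 120.0])
bias_percent = 5.0 - 0.05 * co_ppb
slope, intercept = bias_slope(co_ppb, bias_percent)
print(slope, intercept)
```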

Evaluation of TROPESS/CrIS CO retrieval observational errors
Here we compare the observed variance of differences between retrieved CrIS CO profiles and in situ aircraft profiles, after applying Eq. (1), with the TROPESS reported observational errors defined in Eqs. (3) and (4). As described in Sect. 4.3, we expect the differences between retrieved CrIS and aircraft CO profiles ($\hat{x}_{val}$) to have a variance due to the combination of observational errors and geophysical variation from imperfect coincidence. Figure 11 shows comparisons of individual and average computed observational fractional errors to the standard deviation (SD) of CrIS − $\hat{x}_{val}$ profile differences as well as the diagonal for the a priori covariance and the SD of prior − $\hat{x}_{val}$ profile differences.
... SD(CrIS − $\hat{x}_{val}$), but in some vertical ranges, they are much less and could be underestimated via instrument and systematic error assumptions in the TROPESS retrieval as Hegarty et al. (2022) suggest. Additional studies to test the sensitivity of the comparison variance to a range of coincidence criteria are needed to confirm a retrieval underestimate, but these would require several repeated validation measurements for the same observing conditions. Despite the potential for underestimated observational errors, the general behavior of the error comparison is what we expect from Eq. (1), and we can see the retrieval influence on the shape of SD(CrIS − $\hat{x}_{val}$). Near the surface, where there is less retrieval sensitivity as indicated by the averaging kernel, we see that SD(prior − $\hat{x}_{val}$) becomes smaller than SD(CrIS − $\hat{x}_{val}$). This is expected for vertical ranges with less retrieval sensitivity since the a priori contribution becomes more dominant in $\hat{x}_{val}$. In contrast, for the middle troposphere where we have the most sensitivity for TIR remote sensing, it is clear that SD(CrIS − $\hat{x}_{val}$) represents an improvement over SD(prior − $\hat{x}_{val}$). In Fig. 12, the error comparison is shown separately for three ATom latitude ranges, and we can see that the agreement between observational errors and SD(CrIS − $\hat{x}_{val}$) is closest for ATom flights in the mostly clean middle- to high-latitude Southern Hemisphere, where it is most likely that the aircraft and satellite are observing similar air masses with background CO concentrations. These results give confidence that TROPESS single-retrieval error characterization can be used to weight data for averaging and inverse analysis applications.
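The comparison between reported observational errors and the spread of retrieval minus aircraft differences can be sketched in Python as below. The averaging of reported errors as a root mean square over retrievals, the use of fractional differences, and all names and values are assumptions for illustration, not the procedure used to produce Figs. 11 and 12.

```python
import numpy as np

def error_consistency(S_obs_list, frac_diffs):
    """Compare reported and observed fractional errors per level.

    S_obs_list: observational error covariances, one (n_levels, n_levels) matrix per retrieval.
    frac_diffs: fractional retrieval-minus-aircraft differences, shape (n_pairs, n_levels).
    Returns (reported, observed, ratio); a ratio well below 1 would suggest
    underestimated errors, geophysical mismatch, or both.
    """
    variances = np.array([np.diag(S) for S in S_obs_list])
    reported = np.sqrt(variances.mean(axis=0))          # RMS of reported errors (assumed)
    observed = np.std(np.asarray(frac_diffs, dtype=float), axis=0, ddof=1)
    return reported, observed, reported / observed

# Toy example: two retrievals, two levels, 10 % reported error at each level.
S_obs_list = [0.01 * np.eye(2), 0.01 * np.eye(2)]
frac_diffs = np.array([[0.1, -0.2], [-0.1, 0.2]])
reported, observed, ratio = error_consistency(S_obs_list, frac_diffs)
print(reported, observed, ratio)
```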

Summary and conclusions
This study used in situ observations from routine NOAA GML flights and the four ATom campaigns to evaluate TROPESS single-pixel CO retrievals from the SNPP/CrIS FTS instrument.We find the following.
1. The single-FOV CrIS product provides improved representation of CO in smoke plumes compared to retrievals that combine multiple FOVs.
2. Comparisons with aircraft in situ profiles (after extension, interpolation, and application of Eq. 1) show that biases have a vertical dependence in the troposphere that is consistent for both sets of in situ data with average biases that are positive (∼ 2.3 %) in the lower troposphere and negative (∼ −4.5 %) in the upper troposphere.
3. Small biases (0.6 % and −0.04 % for NOAA GML and ATom, respectively) are observed for the CrIS CO partial column average VMR corresponding to the aircraft profile vertical ranges.
4. No significant latitude dependence of CrIS CO column bias is found for the NOAA GML comparisons, but comparisons with ATom, which better covered a range of latitudes, have a slightly more positive bias for tropical scenes that could indicate a small, uncharacterized retrieval dependence on water vapor or another interferent species.
5. CrIS CO retrievals capture the seasonal and spatial variations observed by ATom.
6. There is a small negative dependence (magnitude < 0.1 % ppb−1) of CrIS bias on CO amount.
7. Comparisons of computed observational errors and standard deviations of retrieval-aircraft comparison differences show expected vertical behavior and demonstrate significant improvement over the standard deviation of prior-aircraft differences in vertical ranges with higher retrieval sensitivity.
TROPESS/CrIS CO biases detected in this study are in general much smaller than comparison standard deviations. We therefore make no recommendations for automated bias corrections in data processing, similar to other validation studies for satellite CO retrievals (e.g., Deeter et al., 2019, 2022). This is unlike other TROPESS products such as CH4 (Kulawik et al., 2021) for which a bias correction is more appropriate given the size of bias detected as well as the atmospheric lifetime (∼ 10 years for methane) and reduced atmospheric variability compared to CO. Each analysis using TROPESS/CrIS CO data must consider the variability of CO over the domain of interest and ascertain whether the biases observed here could affect numerical conclusions. The biases reported from this study will need to be included when long-term records of satellite CO observations are harmonized and used together for computing trends, data assimilation, or other analyses. For example, with the 22-year record of MOPITT CO profiles, this is especially important when combining datasets since the vertical bias pattern for MOPITT data with respect to in situ observations has a positive bias in the upper troposphere and negative bias in the lower to middle troposphere with the opposite behavior compared to the TROPESS/CrIS vertical bias pattern. Future validation of the TROPESS/CrIS CO products will include a longer time record of comparisons and quantification of bias drift for CrIS on SNPP and on the JPSS satellite series. The validation results presented here demonstrate that these products are suitable for tropospheric CO data analyses. The bias at all vertical levels is < 10 %, and error characterization for single retrievals can be used to weight data for averaging and applications such as data assimilation and inverse modeling.
... et al., 2018). TROPESS/CrIS CO products are available via the GES DISC from the NASA TRopospheric Ozone and its Precursors from Earth System Sounding (TROPESS) project at https://doi.org/10.5067/I1NONOEPXLHS (Bowman, 2021).
Author contributions. HMW, GLF, SSK, JDH, KCP, ML, and VHP designed the study, and HMW prepared the paper. GLF analyzed the satellite-aircraft comparisons and prepared the figures. SSK, KB, DF, VK, ML, KCP, VHP, and JRW developed the MUSES algorithm and provided the CrIS CO retrievals. RC and KM participated in the ATom campaign and provided guidance in the use of the measurements. KM provided the NOAA GML aircraft data. All authors reviewed and edited the paper.
Competing interests. At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figure 1. SNPP TROPESS/CrIS and SNPP/VIIRS observations for 14 September 2020. Panel (a) shows the average CO VMR for 700 to 350 hPa for all processed TROPESS CO retrievals with good data quality (see text). Panel (b) shows the same free troposphere CO averages as (a) but with cloudy scenes removed (see text). Panel (c) shows the average TROPESS a priori CO VMR for 700 to 350 hPa. Panel (d) shows the NASA Worldview SNPP/VIIRS image for 14 September 2020 with clouds and smoke (true color) as well as fire thermal anomalies (red).

Figure 2. SNPP observations of the Pole Creek fire in Utah, USA, on 13 September 2018. The Great Salt Lake is in the upper left of each panel, and state borders with Idaho, Wyoming, and Colorado are indicated by solid straight lines. Dotted lines indicate a 1° latitude by 1° longitude grid, with the top left corner at 42° N, −113° E. Panel (a) shows CLIMCAPS/CrIS CO at 500 hPa (MMR converted to VMR). Panel (b) shows the TROPESS/CrIS tropospheric CO column average VMR, and panel (c) shows the corresponding NASA Worldview SNPP/VIIRS image with clouds and smoke (true color) as well as fire thermal anomalies (red).
Juncosa Calahorrano et al. (2021) showed how SNPP/CrIS single-pixel MUSES retrievals of acyl peroxy nitrates, also known as PAN, along with CO, can be used to follow fire plume chemical evolution. After subtracting background amounts, the normalized excess mixing ratios (NEMRs) of PAN with respect to CO, computed from the CrIS observations for this plume, were consistent with in situ aircraft observations of smoke plumes from the summer 2018 WE-CAN (Western Wildfire Experiment for Cloud Chemistry, Aerosol Absorption, and Nitrogen) campaign.
selection, coincidence criteria, and vertical extension of aircraft profiles
TROPESS/CrIS CO profiles are selected for comparison if they have a retrieval quality of 1 and an effective cloud optical depth less than 0.1 to ensure non-cloudy CrIS observations.
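The screening described above can be sketched as a simple boolean mask. This is a minimal illustration only: the function name, argument names, and sample values are hypothetical and do not correspond to variable names in the TROPESS product files.

```python
import numpy as np

def select_clear_retrievals(quality, cloud_od, max_cloud_od=0.1):
    """Boolean mask of retrievals that pass the screening in the text.

    quality  : retrieval quality flag (1 = good), hypothetical input name
    cloud_od : effective cloud optical depth, hypothetical input name
    """
    quality = np.asarray(quality)
    cloud_od = np.asarray(cloud_od, dtype=float)
    # Keep retrievals with quality flag 1 and cloud optical depth below 0.1.
    return (quality == 1) & (cloud_od < max_cloud_od)

# Illustrative values, not taken from the data products.
quality = np.array([1, 1, 0, 1])
cloud_od = np.array([0.02, 0.35, 0.01, 0.09])
mask = select_clear_retrievals(quality, cloud_od)
```

Only the first and last samples pass here: the second is too cloudy and the third fails the quality flag.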

Figure 3. Example of a TROPESS/CrIS CO averaging kernel (A) (a) and of the validation process (b). The colors of the averaging kernel indicate the pressure level (66 levels from 1017.45 to 0.1 hPa) corresponding to each row, with the surface-level row also indicated by the squares. The number of degrees of freedom for signal (DFS), given by the sum of the diagonal (i.e., trace) of this averaging kernel, is 1.26. The right panel shows the CrIS CO profile retrieval (solid red line) with total error (dashed red lines), observation error (dotted red lines), a priori profile (solid cyan line with squares), and diagonal uncertainty (dashed cyan lines). The closest ATom aircraft profile had a coincidence of 10.4 km and 3.5 h. The original ATom profile (dashed grey line) is interpolated to the CrIS vertical grid (solid grey with squares) and transformed by the instrument operator to give ATom xval (Eq. 1) (solid black line with squares).
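The two quantities named in this caption, the DFS and the operator-transformed profile xval, can be illustrated with a short sketch. It assumes the standard optimal estimation form x_val = x_a + A (x − x_a), applied to ln(VMR); this form and the log-space choice are assumptions here, and Eq. (1) in the paper should be consulted for the exact definition. The function names and the three-level toy values are hypothetical.

```python
import numpy as np

def degrees_of_freedom(avg_kernel):
    """DFS is the trace (sum of diagonal elements) of the averaging kernel."""
    return float(np.trace(avg_kernel))

def apply_instrument_operator(x_insitu, x_prior, avg_kernel, log_space=True):
    """Smooth an in situ profile (already on the retrieval grid) with A.

    Assumed form: x_val = x_a + A (x - x_a), evaluated in ln(VMR) when
    log_space is True. This is an assumption, not the paper's Eq. (1) verbatim.
    """
    x_insitu = np.asarray(x_insitu, dtype=float)
    x_prior = np.asarray(x_prior, dtype=float)
    if log_space:
        ln_val = np.log(x_prior) + avg_kernel @ (np.log(x_insitu) - np.log(x_prior))
        return np.exp(ln_val)
    return x_prior + avg_kernel @ (x_insitu - x_prior)

# Toy three-level example (values are illustrative, not from the paper).
A = np.array([[0.5, 0.2, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.1, 0.3]])
x_prior = np.array([100.0, 90.0, 80.0])    # ppb
x_insitu = np.array([130.0, 95.0, 70.0])   # ppb
dfs = degrees_of_freedom(A)
x_val = apply_instrument_operator(x_insitu, x_prior, A)
```

Two limiting cases are useful checks: if the in situ profile equals the prior, xval equals the prior, and if A were the identity matrix, xval would reproduce the in situ profile exactly.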

Figure 4. Relative differences (%) in single CrIS retrievals with coincident NOAA GML xval profiles (grey) and the average percent difference with 1σ horizontal bars (red).Both day and night CrIS observations are included for coincidence search, with 1866 day and 266 night comparison pairs found.
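The statistics plotted in this figure (per-level relative differences, their mean, and their 1σ spread) can be sketched as follows. This is a minimal illustration assuming the differences are normalized by the aircraft xval values and that the sample standard deviation is used; both choices, as well as the function name and toy numbers, are assumptions rather than details confirmed by the text.

```python
import numpy as np

def percent_difference_stats(cris, xval):
    """Mean bias (%) and standard deviation (%) per vertical level.

    cris, xval : arrays of shape (n_pairs, n_levels) holding the retrieved
                 and operator-transformed aircraft profiles, respectively.
    """
    cris = np.asarray(cris, dtype=float)
    xval = np.asarray(xval, dtype=float)
    # Relative difference of each pair, normalized by xval (assumed).
    rel_diff = 100.0 * (cris - xval) / xval
    bias = rel_diff.mean(axis=0)
    # Sample standard deviation (ddof=1) across comparison pairs (assumed).
    spread = rel_diff.std(axis=0, ddof=1)
    return bias, spread

# Two comparison pairs on two levels (illustrative values only).
cris = np.array([[102.0, 99.0],
                 [98.0, 101.0]])
xval = np.full((2, 2), 100.0)
bias, spread = percent_difference_stats(cris, xval)
```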

Figure 5. Latitude dependence of CO partial column average VMR (ppb) for TROPESS/CrIS retrievals and NOAA GML xval (a) as well as bias difference statistics (b) shown by box-whisker symbols representing minimum and maximum values (whisker), lower quartile (box bottom), median (white stripe), and upper quartile (box top).A minimum of five comparisons per bin was required.

Figure 7 shows the bias (% relative difference) of the CrIS CO retrieved profiles with respect to ATom xval in situ pro-

Figure 6. Time dependence of CO partial column average VMR (ppb) for TROPESS/CrIS retrievals and NOAA GML xval (a) as well as bias difference statistics (b) shown by box-whisker symbols representing minimum and maximum values (whisker), lower quartile (box bottom), median (white stripe), and upper quartile (box top). A minimum of five comparisons per bin was required. The dashed line indicates a fit for bias drift (see text).

Figure 7. Relative differences (%) in single CrIS retrievals with coincident ATom xval profiles (grey) and the average percent difference with 1σ horizontal bars (red).Latitude ranges are indicated in each panel along with the number of comparison pairs.Both day and night CrIS observations are included.

Figure 8. Latitude dependence of CO partial column average VMR (ppb) for TROPESS/CrIS retrievals and ATom xval (a) and bias difference statistics (b) shown by box-whisker symbols representing minimum and maximum values (whisker), lower quartile (box bottom), median (white stripe), and upper quartile (box top).A minimum of five comparisons per bin was required.

Figure 9. Latitude dependence of partial column average CO for each ATom campaign.Black squares show ATom xval partial column average values over Atlantic Ocean scenes; black circles indicate ATom values over Pacific Ocean scenes.Blue triangles indicate CrIS CO partial column average values over land and Atlantic Ocean scenes; red diamonds indicate CrIS values over Pacific Ocean scenes.

Figure 10. Bias of CrIS partial column average CO vs. CO amount for NOAA GML flights in the top panel and ATom flights in the bottom panel with box-whisker symbols in 5 ppb bins. Linear regression results are shown in the legend boxes.

Figure 11. Error comparison of CrIS observational error estimates and the standard deviation (SD) of CrIS xval (in black) for NOAA GML flights (a) and ATom flights (b). Single-profile CrIS observational error estimates are plotted in red, with the average in dark blue with triangles. For reference, the standard deviation of CrIS prior with aircraft xval is in cyan, and the a priori fractional uncertainty (0.3) is shown in cyan with triangles.

Figure 12. Same as Fig. 11 but for three ATom latitude ranges.

Table 1. Aircraft in situ validation observations used in this study.
(Sweeney et al., 2015) cgg/aircraft/ (last access: 14 September 2022); https://espo.nasa.gov/atom/content/ATom (last access: 14 September 2022).
depending on aircraft limitations at each site. Flask samples are then sent for laboratory analysis of a multitude of trace gases including CO, which was measured with vacuum UV fluorescence spectroscopy during the time period of this analysis. CO mixing ratios are reported relative to the WMO X2014A scale (https://gml.noaa.gov/ccl/co_scale.html, last access: 14 September 2022) and have a reproducibility of ∼ 1 ppb (Sweeney et al., 2015). NOAA GML aircraft profiles of CO have been used for the long-term validation of the MOPITT CO record, with updated validation for each new data version (Deeter et al., 2019, and references therein). For the current analysis, we use NOAA GML aircraft network observations of CO collected during 2016 and 2017 from seven locations (Table
Tang et al. (2020) found very little sensitivity in MOPITT CO validation results for 25, 50, 100, and 200 km coincidence, except that the 25 km radius resulted in an insufficient number of matches for meaningful statistics. The Tang et al. (2020) study also tested the time coincidence criterion (12, 6, 2, and 1 h) with similar conclusions. Application of the 9 h and 50 km coincidence criteria yielded 2092 CrIS-aircraft profile pairs for NOAA GML flights from 2016 and 2017 and 1052 profile pairs for the ATom 1-4 campaigns. Since the aircraft profiles used for validation do not span the full vertical range of satellite-retrieved profiles, we must extend these with a reasonable approximation of atmospheric CO to facilitate the comparison as described below in Sect. 4.2. Here we use the TROPESS a priori profiles (from model climatology, described above) to extend the in situ profiles above the highest altitude sampled. The a priori profile is scaled to match the CO abundance of the aircraft measurement at the highest altitude. The choices of model and approach for extending the aircraft profiles are examined further in Tang et al. (2020) and Hegarty et al. (2022), with similar conclusions that the impacts apply mostly to bias estimates in the middle to upper troposphere. Martínez-Alonso et al. (2022) compute the un
https://doi.org/10.5194/amt-15-5383-2022 Atmos. Meas. Tech., 15, 5383-5398, 2022
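The coincidence test and the upward extension of aircraft profiles described above can be sketched as follows. This is a minimal sketch under stated assumptions: the haversine distance on a spherical Earth, the inclusive thresholds, the NaN convention for unsampled levels, and all function names are illustrative choices, not the processing code used for this study. Only the extension above the highest sampled level is shown.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def is_coincident(dist_km, dt_hours, max_km=50.0, max_hours=9.0):
    """True where a satellite-aircraft pair meets the 9 h and 50 km criteria."""
    return (dist_km <= max_km) & (np.abs(dt_hours) <= max_hours)

def extend_profile(aircraft_vmr, prior_vmr):
    """Extend an aircraft profile upward with a scaled a priori profile.

    Both inputs are on the retrieval grid, ordered from the surface upward,
    with NaN where the aircraft did not sample (assumed convention).
    """
    aircraft_vmr = np.asarray(aircraft_vmr, dtype=float)
    prior_vmr = np.asarray(prior_vmr, dtype=float)
    top = np.flatnonzero(~np.isnan(aircraft_vmr))[-1]  # highest sampled level
    # Scale the prior so that it matches the aircraft value at that level.
    scale = aircraft_vmr[top] / prior_vmr[top]
    extended = aircraft_vmr.copy()
    extended[top + 1:] = scale * prior_vmr[top + 1:]
    return extended

# Illustrative five-level profile (ppb); the top two levels were not sampled.
aircraft = np.array([120.0, 110.0, 100.0, np.nan, np.nan])
prior = np.array([100.0, 90.0, 80.0, 60.0, 40.0])
extended = extend_profile(aircraft, prior)
```

In this toy case the prior is scaled by 100/80 = 1.25 above the third level, so the shape of the a priori profile is preserved while the extended profile stays continuous with the aircraft data at the highest sampled level.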

Table 2. Bias and standard deviation (SD) for comparisons of SNPP TROPESS/CrIS CO retrievals and in situ CO profiles from NOAA GML flights.

Table 3. Bias and standard deviation (SD) for comparisons of SNPP TROPESS/CrIS CO retrievals and in situ CO profiles from ATom flight campaigns 1-4.