A new stratospheric and tropospheric NO2 retrieval algorithm for nadir-viewing satellite instruments: applications to OMI

We describe a new algorithm for the retrieval of nitrogen dioxide (NO2) vertical columns from nadir-viewing satellite instruments. This algorithm (SP2) is the basis for the Version 2.1 Ozone Monitoring Instrument (OMI) NO2 Standard Product and features a novel method for separating the stratospheric and tropospheric columns. The approach estimates the stratospheric NO2 directly from satellite data without using stratospheric chemical transport models or assuming any global zonal wave pattern. Tropospheric NO2 columns are retrieved using air mass factors derived from high-resolution radiative transfer calculations and a monthly climatology of NO2 profile shapes. We also present details of how uncertainties in the retrieved columns are estimated. The sensitivity of the retrieval to assumptions made in the stratosphere-troposphere separation is discussed and shown to be small, in an absolute sense, for most regions. We compare daily and monthly mean global OMI NO2 retrievals using the SP2 algorithm with those of the original Version 1 Standard Product (SP1) and the Dutch DOMINO product. The SP2 retrievals yield significantly smaller summertime tropospheric columns than SP1, particularly in polluted regions, and are more consistent with validation studies. SP2 retrievals are also relatively free of modeling artifacts and negative tropospheric NO2 values. In a reanalysis of an INTEX-B validation study, we show that SP2 largely eliminates an ∼20 % discrepancy that existed between OMI and independent in situ springtime NO2 SP1 measurements.


Introduction
Nitrogen oxides are important atmospheric trace gases that have significant impacts on human health. The two principal nitrogen oxides, nitric oxide (NO) and nitrogen dioxide (NO2) (collectively NOx), play key roles in atmospheric aerosol formation and tropospheric ozone chemistry (e.g., Finlayson-Pitts and Pitts, 1999; Seinfeld and Pandis, 1998). Major sources of tropospheric NOx include combustion, soil emissions, and lightning. In the lower troposphere, NO2 is a toxic gas and a precursor to tropospheric ozone through the reaction of NOx with volatile organic compounds (VOCs). In the stratosphere, NOx contributes to both production and loss cycles of ozone and may indicate long-term changes in tropospheric emissions of nitrous oxide (N2O), an important greenhouse gas. Stratospheric NOx is produced mainly by the reaction of N2O with O(¹D).
NO2 has an easily observable spectral signature with strong absorption lines in the visible, infrared, and near ultraviolet. In particular, its broad, highly structured absorption feature in the blue-violet range can be exploited for remote sensing (Platt and Perner, 1983; Platt, 1994).
Satellite and in situ measurements of tropospheric nitrogen oxides are used with chemical transport models (CTMs) to quantify sources and transport of NO2 pollution from power plants, automobiles, ships, and aircraft (e.g., Martin et al., 2003, 2006; Zhang et al., 2007; Beirle et al., 2004, 2011; Jaeglé et al., 2005; Frost et al., 2006; Boersma et al., 2008; Lin et al., 2010; Russell et al., 2010). Instruments on satellite platforms are particularly valuable, since they can obtain NO2 measurements over large geographical regions. Top-down NO2 measurements are helpful in constraining emissions for global- and regional-scale atmospheric models (Martin et al., 2003; Choi et al., 2008; Lamsal et al., 2010). Multiyear, consistent time-series measurements allow the study of interannual variability and long-term trends (Richter et al., 2005), which have been used to assess the effectiveness of emission control regulations and the effects of economic trends on industrial activity (Frost et al., 2006; Kim et al., 2006; Castellanos and Boersma, 2012). NOx produced by lightning (LNOx) contributes an additional 10-15 % to total NOx production in the troposphere (Schumann and Huntrieser, 2007), and LNOx measurements are helpful in estimating the global NOx budget (Tie et al., 2002; Martin et al., 2007).
In unpolluted areas, the stratospheric NO2 can exceed 90 % of the total NO2 column (Martin et al., 2002a). The partitioning of NOx and NOy in the stratosphere is sensitive to photochemical conditions; thus, NO2 has a strong diurnal dependence that varies as a function of latitude and season (Dirksen et al., 2011). Although NO2 in the stratosphere is more zonally symmetric than in the troposphere, there is still spatial structure that is important for understanding the morphology of stratospheric NO2 itself, while complicating the retrieval of tropospheric NO2 from satellite-derived slant columns. The accuracy of the inferred tropospheric contribution critically depends on the characterization and separation of stratospheric NO2. The procedure used to determine the two components of the NO2 vertical column will be referred to as the stratosphere-troposphere separation (STS) algorithm.
Determining the relative amounts of stratospheric and tropospheric NO2 from a given absorption spectrum is inherently difficult. Although the shape of the NO2 absorption cross section varies with altitude (due to temperature), cross sections at different temperatures are not orthogonal. Therefore, the stratospheric and tropospheric NO2 amounts cannot be independently determined from the spectral fit. Instead, most STS algorithms rely on spatial information from multiple slant columns measured over a wide geographic area. All such algorithms are prone to errors associated with the a priori information assumed about the stratospheric vertical column. The reference-sector (RS) method, discussed by Richter and Burrows (2002) and Boersma et al. (2004), assumes zonal invariance. The stratospheric vertical column at any latitude is set equal to the measured total column at the same latitude in the central Pacific Ocean. Because the central Pacific contains small background amounts of tropospheric NO2, the RS method can slightly overestimate the stratospheric fraction of the column. Martin et al. (2002a) corrected this by using model estimates of Pacific tropospheric NO2. More importantly, the real stratospheric NO2 varies with longitude, leading to potential inaccuracies in both the stratospheric and derived tropospheric vertical column. Other methods, such as the image processing technique (IPT) of Leue et al. (2001) and Velders et al. (2001), and the wave-2 stratospheric model of Bucsela et al. (2006), allow for some longitudinal variation in stratospheric NO2. However, like the RS method, the IPT and wave-2 algorithms required relatively simplistic assumptions about which regions to use in constructing the global NO2 stratospheric field. The wave-2 model, in particular, can introduce stratospheric artifacts, especially at high latitudes (Dirksen et al., 2011).
Some approaches have tried to capture more realistic structure in the stratospheric NO2 field by using CTMs to estimate the spatial variation in stratospheric NO2. In the Dutch OMI NO2 (DOMINO) product (Boersma et al., 2011; Dirksen et al., 2011), OMI NO2 measurements are assimilated in a CTM. CTM-based algorithms require daily model runs and relatively complex assimilation schemes. As will be shown, CTM-based algorithms can also introduce occasional modeling artifacts. If independent stratospheric measurements are available, a more observation-based approach can be used. Beirle et al. (2010) and Hilboll et al. (2013) have described methods for combining nadir measurements from OMI or SCIAMACHY with limb measurements of stratospheric NO2 from SCIAMACHY. Because the limb measurements are sparsely sampled, these approaches require significant spatial interpolation to obtain a continuous stratospheric field.
In this paper, we describe a new algorithm for the retrieval of NO2 vertical columns using only nadir-viewing satellite slant-column measurements, simple tropospheric climatologies, masking and interpolation. The algorithm is now used to produce NASA's Version 2 OMI NO2 (OMNO2) Standard Product (SP2), and could be employed for other satellite measurements. For OMI data, SP2 is a significant improvement over the original SP1, which was based on the wave-2 STS algorithm. SP2 continues the philosophy of minimizing the use of model information in retrievals, but includes a number of features not present in SP1. The SP2 stratospheric slant column is estimated from the total slant column using an a priori monthly tropospheric NO2 model climatology, but only where tropospheric contamination of the observed NO2 column is below a threshold. The threshold is set by an upper limit on the amount that tropospheric NO2 absorption may contaminate the observed stratospheric vertical column. The SP2 algorithm also features improved air mass factors based on new radiative transfer calculations and terrain reflectivities and uses monthly, rather than annual, mean NO2 profile shapes. Cloud properties are obtained from the OMI OMCLDO2 data product, which has recently been updated to include better wavelength calibration, look-up tables using sun-normalized radiances, and cloud pressures clipped at the surface pressures (Maarten Sneep, private communication). The purpose of this paper is twofold: (1) to introduce the new STS algorithm and (2) to discuss the additional, more incremental changes that distinguish SP2 from SP1. We describe the algorithm in Sect. 2 and list the differences between the old and new retrieval approaches. We present error analysis in Sect. 3 and discuss additional considerations and comparisons with other datasets in Sect. 4. Although a thorough treatment of validations comparing the new algorithm with independent datasets is beyond the scope of this paper, a validation example is included in Sect. 4. Numerous additional validation studies will be presented separately by Lamsal et al. (2013). Section 5 contains a summary and conclusions.
Table 1 (recovered fragments). Differences between the SP1 and SP2 algorithms.
De-striping. SP2: based on data from 30° S-5° N of 5 orbits.
Stratosphere-troposphere separation. SP1: stratospheric NO2 field based on a global analysis, assuming a zonal wave-2 structure. SP2: in regions of tropospheric pollution, the stratospheric column is inferred using a local analysis of the stratospheric field.
Air mass factor (AMF) NO2 profile shape. SP1: GEOS-Chem annual mean tropospheric NO2 profiles for the year 1997 coupled with a single stratospheric profile. SP2: monthly mean NO2 profile shapes derived from GMI CTM multiannual (2005-2007) [...]

Algorithm description
The architecture of the algorithm is summarized in the flow diagram in Fig. 1. Spectral data are fitted to obtain raw NO2 slant columns, S (Sect. 2.1), and are corrected for instrumental artifacts (also referred to as striping; see Sect. 2.3) to yield the de-striped slant columns, S. The data are analyzed to separate stratospheric and tropospheric NO2 partial vertical columns, V_strat and V_trop, and to obtain total column amounts, V_total (Sect. 2.4). The stratospheric and tropospheric air mass factors, A_strat and A_trop (Sect. 2.2), used in the calculations are based on a priori information from radiative transfer (RT) models and CTMs. The RT calculations used to process the OMI data in this study were carried out using TOMRAD (Davé, 1965). Some details in the SP2 algorithm are similar to the approach used in SP1, but there are many important changes. The similarities and differences are summarized in Table 1, and details of the SP2 algorithm are presented in Sects. 2.1-2.4.

OMI spectral fitting
The NO2 slant columns used in this study were extracted from OMI spectra. The OMI instrument is a UV-VIS hyperspectral, push-broom, nadir-viewing satellite spectrometer (Levelt et al., 2006) on the NASA EOS Aura satellite (Schoeberl et al., 2006), launched in July 2004. Aura has an Equator crossing time of 13:30 LST and an orbital period of 99 min so that OMI views the entire sunlit portion of the Earth in ∼14.5 orbits. On each orbit, OMI makes simultaneous measurements in a swath width of ∼2600 km, divided into 60 fields of view (FOVs) or pixels. One swath is measured every two seconds, for approximately 1650 swaths from southern to northern terminator on the sunlit side of the Earth. Swaths in adjacent orbits are nearly contiguous at the Equator and overlap elsewhere. LST differences from the west to east sides of a swath range from approximately 1.5 h at the Equator to several hours at mid- to high latitudes.
Fig. 1 (caption fragment). [...] respectively. The section outlined in blue is OMI-specific. TOMRAD is a forward vector radiative transfer model (Davé, 1965).
The NO2 slant columns are estimated by spectral fitting of OMI earthshine radiances. The fitting algorithm uses the Differential Optical Absorption Spectroscopy (DOAS) method (Platt and Stutz, 2006), applied in the spectral range of 405 nm to 465 nm (Boersma et al., 2002; Bucsela et al., 2006). The earthshine radiances are normalized by a reference OMI-measured solar irradiance spectrum [R(λ) = I(λ)/F(λ)]. The use of a static measured solar reference spectrum reduced much of the calibration-induced striping that was discovered soon after OMI operations began (Dobber et al., 2008). (The removal of residual striping is described in Sect. 2.3.) The normalized spectra, R(λ), are fitted to laboratory-measured trace gas absorption spectra at a fixed stratospheric temperature (T_0 = 220 K), a reference Ring spectrum (Chance and Spurr, 1997), and a polynomial function that models the spectrally slowly varying scattering by clouds and aerosols and reflection by the Earth's surface. In the current version, the only trace gas absorption spectra considered are those of NO2 (Vandaele et al., 1998), O3 (Bass and Johnsten, 1975), and H2O (Harder and Brault, 1997). The temperature dependence of the NO2 cross section is accounted for later in the algorithm. The trace gas absorption spectra used were produced by convolving high-resolution, laboratory-measured absorption spectra with the OMI slit function, measured pre-launch by Dirksen et al. (2006). The result of the spectral fit is the raw slant-column density for each OMI pixel.
The calibrations of the 60 cross-track FOVs have relative biases that are observed to be persistent on time scales of several orbits to several days. As a result, the retrieved NO2 slant columns show a pattern of stripes running along each orbital track. This instrumental artifact can be corrected to some extent using the "de-striping" procedure described in Sect. 2.3. A more severe effect is the "row anomaly" (RA), which was first noticed in the data in June 2007 and is likely caused by an obstruction in part of OMI's aperture. The extent of the RA has increased since 2007 and currently affects approximately half of the FOVs. Current RA information is available at http://www.knmi.nl/omi/research/product/rowanomaly-background.php. Users of OMI data are discouraged from using FOVs flagged as RA-affected.

Air mass factors
In DOAS retrievals, the air mass factor, A, is the ratio of a slant column, S, to the vertical column, V, that we want to retrieve. We write this relationship generically as A = S/V. The air mass factor is assumed to be wavelength-independent across the slant-column fitting window. In a given partial atmospheric region (stratosphere or troposphere), the air mass factor is computed as the ratio of the sum over layers of the slant sub-columns S_i to the sum of vertical sub-columns V_i:

$$A = \frac{\sum_i S_i}{\sum_i V_i}, \tag{1}$$

where i is the layer index. The summation combines all layers in the appropriate partial atmospheric column. Temperature is assumed to be constant within a layer. Slant and vertical sub-columns can be represented as integrals over all pressures p within layer i:

$$S_i = \kappa \int_{\text{layer } i} m(p)\,\alpha(p)\,\zeta(p)\,\mathrm{d}p \tag{2}$$

and

$$V_i = \kappa \int_{\text{layer } i} \zeta(p)\,\mathrm{d}p. \tag{3}$$

Here, m(p) is the atmospheric scattering weight (also referred to as the "box" or "layer" air mass factor), α(p) is a temperature-correction factor for the NO2 absorption cross section, ζ(p) is the a priori NO2 mixing ratio, and κ is a constant equal to the reciprocal of the weight of an air molecule.
The formulation in Eqs. (2) and (3) implicitly decouples atmospheric scattering and NO2 absorption, as described by Palmer et al. (2001), so that the m(p) are independent of NO2 amount. The temperature factor α(p) is needed to correct for the fixed-temperature NO2 cross section (T_0 = 220 K) used in the slant-column fitting and can be written as a function of the local temperature T(p) as

$$\alpha(p) = 1 - 0.003\,[T(p) - T_0]. \tag{4}$$

The coefficient 0.003 (units K^-1) was obtained empirically by fitting synthetic radiance spectra with NO2 cross sections measured at several temperatures. This coefficient is in line with temperature correction coefficients proposed in Boersma et al. (2002, 2004).
For partly cloudy scenes, we use an independent-pixel approximation for the air mass factor (e.g., Martin et al., 2002a) and express scattering weights as the weighted sum of cloudy and clear components, m(p)_cloudy and m(p)_clear, respectively:

$$m(p) = w\,m(p)_{\mathrm{cloudy}} + (1 - w)\,m(p)_{\mathrm{clear}}. \tag{5}$$

Here the weighting factor, w, denotes the cloud/aerosol radiance fraction (CRF), the fraction of the measured radiation that comes from clouds and aerosols. In the SP1 and SP2 algorithms, aerosols are not distinguished from clouds, since weakly absorbing aerosols can have similar effects on the air mass factor in some circumstances (Boersma et al., 2011). The effect of stratospheric aerosols is also not explicitly considered in the algorithm. The value of w is generally larger than the O2-O2 geometrical cloud fraction at 470 nm since the clouds are assumed to be optically thick with an effective Lambertian albedo of 0.8 (Stammes et al., 2008).
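The air-mass-factor calculation described above can be sketched in a few lines. This is a minimal illustration, not the operational code: it assumes the layer integrals reduce to products of layer-constant quantities (scattering weight, temperature factor, vertical sub-column), that the temperature factor has the linear form 1 − 0.003 (T − T_0), and that the cloudy and clear scattering weights are blended linearly by the radiance fraction w. The function names and dictionary keys are hypothetical.

```python
T0 = 220.0          # K, fixed temperature of the NO2 cross section used in the fit
TEMP_COEFF = 0.003  # K^-1, empirical coefficient quoted in the text


def temperature_correction(temp_k):
    """Assumed linear temperature factor: alpha = 1 - 0.003 * (T - T0)."""
    return 1.0 - TEMP_COEFF * (temp_k - T0)


def blended_scattering_weight(m_clear, m_cloudy, crf):
    """Independent-pixel approximation: weight cloudy and clear parts by w."""
    return crf * m_cloudy + (1.0 - crf) * m_clear


def air_mass_factor(layers, crf):
    """Ratio of summed slant sub-columns to summed vertical sub-columns.

    Each layer is a dict with hypothetical keys:
      v         vertical NO2 sub-column (any consistent unit)
      temp      layer temperature in K
      m_clear   clear-sky scattering weight
      m_cloudy  cloudy-sky scattering weight
    """
    slant = 0.0
    vertical = 0.0
    for layer in layers:
        m = blended_scattering_weight(layer["m_clear"], layer["m_cloudy"], crf)
        # Layer-constant approximation of the slant sub-column integral.
        slant += m * temperature_correction(layer["temp"]) * layer["v"]
        vertical += layer["v"]
    return slant / vertical
```

Restricting `layers` to those above or below the tropopause gives a stratospheric or tropospheric air mass factor, respectively.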
The cloudy and clear scattering weights for a given observation depend on parameters including viewing geometry, surface (terrain or cloud) pressure, and surface reflectivity.
Scattering weights are computed and stored a priori in six-dimensional look-up tables (LUTs) generated from a radiative transfer model. For clear-sky scattering weights, the six LUT parameters are solar zenith angle (SZA), viewing zenith angle (VZA), relative azimuth angle (RAA), terrain reflectivity (R_t), terrain pressure (P_t), and atmospheric pressure level (p). For cloudy scattering weights, we treat clouds as opaque Lambertian surfaces and replace the terrain reflectivity and terrain pressure with cloud reflectivity (R_c = 0.8) and cloud optical centroid pressure (P_c), respectively. The latter is estimated with the OMI O2-O2 cloud algorithm (Acarreta et al., 2004; Sneep et al., 2008; M. Sneep et al., private communication, 2012). The SP2 scattering weights are computed from parameter sets that have been improved relative to SP1. In particular, the terrain reflectivities, which were derived from GOME (Koelemeijer et al., 2003) in SP1, are now based on OMI measurements (Kleipool et al., 2008). Terrain pressures are obtained as described by Boersma et al. (2011) from a 3 km digital elevation model provided with the Aura data. The reflectivities and other parameters are no longer assumed to vary linearly between tabulated values, as was the case in SP1, and are now interpolated using Lagrange polynomials. The resolution in the six-dimensional parameter space has also been increased. In the new algorithm, the number of nodal points in SZA, VZA, RAA, R_t, P_t, and p are 9, 6, 5, 8, 6, and 35, respectively. These improvements reduce interpolation errors noted in SP1 (Dirksen et al., 2011) by up to 15 %.
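To illustrate the difference between linear and Lagrange-polynomial interpolation of tabulated values, the sketch below interpolates along a single LUT dimension. It is only a one-dimensional illustration under assumed conditions: the operational table is six-dimensional, and the polynomial order and node selection used there are not stated in the text. The function name is hypothetical.

```python
def lagrange_interpolate(nodes, values, x):
    """Evaluate the Lagrange polynomial through (nodes[j], values[j]) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(nodes, values)):
        basis = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                basis *= (x - xk) / (xj - xk)
        total += yj * basis
    return total
```

For a curved dependence tabulated at nodes 0, 1, 2 with values 0, 1, 4, the polynomial gives 2.25 at x = 1.5, whereas linear interpolation between the two nearest nodes would give 2.5.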
The a priori NO2 mixing ratio profiles for the air mass factor calculations in SP2 are obtained from the Global Modeling Initiative (GMI) CTM (Duncan et al., 2007; Strahan et al., 2007). The model simulates the stratosphere and troposphere and includes emissions, aerosol microphysics, chemistry, deposition, radiation, advection, and other important chemical and physical processes, such as lightning NOx production (Duncan et al., 2007). The GMI chemical mechanism combines the stratospheric mechanism described by Douglass et al. (2004) with a detailed tropospheric O3-NOx-hydrocarbon chemistry originating from the Harvard GEOS-Chem model (Bey et al., 2001) and is driven by GEOS-5 meteorological fields at the resolution of 2° latitude × 2.5° longitude (Rienecker et al., 2008). The vertical extent of the model is from the surface to 0.01 hPa, with 72 levels and a vertical resolution ranging from ∼150 m in the boundary layer to ∼1 km in the free troposphere and lower stratosphere. The tropopause pressure is defined in the GEOS-5 reanalysis driving the CTM using a combination of Ertel's potential vorticity (EPV) and potential temperature. The tropopause pressure is taken as the higher of the EPV = 3.6 × 10^-6 K kg^-1 m^2 s^-1 and 385 K theta surfaces. For the NO2 SP2 algorithm, the use of alternative tropopause definitions (e.g., a chemical tropopause) was found to have little effect on the retrieval, since the differences in pressure were generally small and occurred in regions where NO2 concentrations are low.
Model outputs were sampled at the LST of OMI overpass, and monthly mean profiles were derived using four years (2004-2007) of simulation. In contrast, SP1 used annual mean tropospheric profiles for 1997 from a GEOS-Chem simulation (Bey et al., 2001; Martin et al., 2002b), with only a single profile used for the stratosphere (Bucsela et al., 2006). Unlike the stratospheric air mass factor, which depends mainly on the viewing geometry, the air mass factor in the troposphere is particularly sensitive to the NO2 profile shape. Model profile shapes vary by geographic region and exhibit daily variability as well, as validated by in situ measurements (e.g., Boersma et al., 2008; Bucsela et al., 2008). Our sensitivity studies indicated that monthly mean profiles captured the seasonal variation sufficiently well so that daily profiles were not included in the SP2 algorithm; however, use of 30-day running mean NO2 profile shapes is being considered for a future version of the algorithm.

De-striped slant columns
As described in Sect. 2.1, an instrumental artifact introduces a bias in the retrieved OMI NO2 slant columns, resulting in the appearance of orbital "stripes" when the data are mapped. The de-striping algorithm computes the mean cross-track biases using raw NO2 slant columns and stratospheric air mass factors from five consecutive orbits over clean regions (30° S to 5° N). This approach relies on identifying and estimating cross-track bias in slant columns from cross-track variation in the stratospheric air mass factors. An initial estimate of the bias δ_i for each cross-track position i is computed from the mean slant column ⟨S⟩_i and stratospheric air mass factor ⟨A_strat⟩_i for that cross-track position and the average slant column ⟨⟨S⟩_i⟩ and average stratospheric air mass factor ⟨⟨A_strat⟩_i⟩ over the entire swath from 30° S to 5° N. The computation of the entire swath averages excludes all scan positions that have extreme values of ⟨S⟩_i/⟨A_strat⟩_i (> 10^17 cm^-2) and those known, a priori, to be affected by the row anomaly. The initial bias estimate is

$$\delta_i = \langle S \rangle_i - \langle A_{\mathrm{strat}} \rangle_i \, \frac{\langle\langle S \rangle_i\rangle}{\langle\langle A_{\mathrm{strat}} \rangle_i\rangle}. \tag{6}$$

The final value of the cross-track bias is recomputed from Eq. (6) by applying an additional screening criterion in the calculation of ⟨⟨S⟩_i⟩ and ⟨⟨A_strat⟩_i⟩. The cross-track scan positions whose δ_i values lie outside a ±2σ interval are excluded to ensure that very high or low values of the bias in any of the cross-track scan positions (including those affected by the RA) do not affect the average. The resulting cross-track bias for a given OMI orbit is a set of 60 correction constants. At each pixel in the orbit, the corresponding bias is subtracted from the measured slant column S to give a corrected ("de-striped") slant column S.
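The two-pass bias estimate can be sketched as follows. This is an illustration under stated assumptions rather than the operational procedure: the expected slant column at each cross-track position is taken to be that position's mean stratospheric air mass factor scaled by the ratio of the swath-mean slant column to the swath-mean air mass factor, and the ±2σ interval is assumed to be centred on the mean of the first-pass biases. All function and argument names are hypothetical.

```python
import statistics


def cross_track_bias(mean_s, mean_a, excluded=()):
    """Bias per cross-track position, given per-position mean slant columns
    (mean_s) and mean stratospheric air mass factors (mean_a).
    Positions listed in `excluded` are left out of the swath averages."""
    used = [i for i in range(len(mean_s)) if i not in excluded]
    swath_s = sum(mean_s[i] for i in used) / len(used)
    swath_a = sum(mean_a[i] for i in used) / len(used)
    return [s - a * swath_s / swath_a for s, a in zip(mean_s, mean_a)]


def destripe_biases(mean_s, mean_a, row_anomaly=(), n_sigma=2.0, max_ratio=1e17):
    """Two-pass estimate: screen extreme and row-anomaly positions first,
    then also screen positions whose first-pass bias is an outlier."""
    bad = set(row_anomaly)
    bad |= {i for i, (s, a) in enumerate(zip(mean_s, mean_a)) if s / a > max_ratio}
    first = cross_track_bias(mean_s, mean_a, bad)
    kept = [first[i] for i in range(len(first)) if i not in bad]
    centre = statistics.mean(kept)
    spread = statistics.pstdev(kept)
    outliers = {i for i, d in enumerate(first) if abs(d - centre) > n_sigma * spread}
    return cross_track_bias(mean_s, mean_a, bad | outliers)
```

With a uniform air mass factor and one position reading 2 units high, the second pass removes that position from the swath average, so its bias is recovered as 2 and the remaining positions have zero bias.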

Stratosphere-troposphere separation (STS)
The STS scheme described in this study takes advantage of the fact that, over most of the Earth, the NO2 absorption contributing to the slant-column measurements is almost entirely stratospheric. Therefore, a simple and reasonable initial estimate of the stratospheric vertical column is the ratio of the de-striped measured slant column to the (nearly geometric) stratospheric air mass factor:

$$V_{\mathrm{init}} = \frac{S}{A_{\mathrm{strat}}}. \tag{7}$$

In areas with relatively little tropospheric NO2, we obtain the value of the stratospheric vertical column by subtracting a fixed model estimate of the (small) tropospheric column from V_init and applying spatial smoothing to the resultant geographic field. Where there is substantial tropospheric NO2 pollution, the stratosphere is estimated by spatial interpolation from the surrounding clean regions. The tropospheric vertical column is then computed as the difference between S and the stratospheric slant column, divided by a tropospheric air mass factor. Figure 2 illustrates the steps of the STS algorithm for one day of data, beginning with the spectrally fitted slant columns (Fig. 2a) and initial vertical columns V_init (Fig. 2b). The following seven steps summarize subsequent computations.
1. Subtract an a priori troposphere from V_init to get the initial stratospheric vertical column.
2. Mask the field wherever tropospheric contamination exceeds a threshold.
3. Bin this initial stratospheric vertical-column estimate onto a geographic grid.
4. Interpolate the binned vertical columns over masked areas.
5. Identify and eliminate "hot spots" in the stratospheric field.
6. Smooth and interpolate to pixel-center coordinates to give the final V_strat at each FOV.
7. Subtract the stratospheric contribution to get the tropospheric vertical column.
For steps 1 and 2, we first compute an a priori tropospheric slant column, S_trop, at each satellite pixel:

$$S_{\mathrm{trop}} = V_{\mathrm{trop\_a\_priori}} \times A_{\mathrm{trop}}, \tag{8}$$

where V_trop_a_priori is a geographically gridded (2° latitude × 2.5° longitude), monthly mean model climatology of NO2 tropospheric vertical columns, and A_trop is the tropospheric air mass factor. The NO2 climatology used in computing V_trop_a_priori is the same as that used in the calculation of A_trop.
The initial estimate of the stratospheric field V°_strat (Fig. 2c) is computed as

$$V^{o}_{\mathrm{strat}} = V_{\mathrm{init}} - \frac{S_{\mathrm{trop}}}{A_{\mathrm{strat}}}. \tag{9}$$

The algorithm then masks V°_strat for all pixels in which the tropospheric contamination of the NO2 column is large. Masked pixels, shown as white areas in Fig. 2d, are eliminated from the stratospheric field calculation. The masking threshold is chosen to exclude pixels where V_init would exceed the actual stratospheric vertical column by more than a value ε. We require

$$\frac{S_{\mathrm{trop}}}{A_{\mathrm{strat}}} < \varepsilon. \tag{10}$$

In the current algorithm, we chose an absolute threshold of ε = 0.3 × 10^15 cm^-2 to limit the stratospheric vertical-column uncertainty introduced by the a priori troposphere to a value of 0.2 × 10^15 cm^-2 or less. This uncertainty is consistent with previous estimates of uncertainty in the stratospheric NO2 column (see Sect. 3.2) and is comparable to pixel noise associated with the slant-column uncertainty (see Sect. 3). Using this masking scheme allows polluted pixels to remain unmasked where the lower troposphere is obscured by clouds. These unmasked pixels provide a more robust stratospheric retrieval in polluted areas than would be possible if all polluted regions were automatically masked. Leue et al. (2001) similarly made use of cloudy pixels to construct a stratospheric field relatively free of tropospheric contamination. Conversely, in regions where amounts of tropospheric NO2 are relatively small (∼0.5 × 10^15 cm^-2), tropospheric NO2 can still contaminate the measurements if skies are clear and surface reflectivities are high. Examples are the Sahara and southern Arabian Peninsula, which require more masking than similarly unpolluted ocean regions (see Fig. 2d). We have chosen an absolute, rather than relative, threshold to assess tropospheric contamination of the observed column, since using a relative threshold leads to unnecessary masking of areas where the magnitude of small stratospheric columns begins to approach the absolute measurement uncertainty. Globally, the fraction of pixels masked is approximately constant throughout the year and ranges from about 10 % in the Southern Hemisphere to nearly 35 % in the Northern Hemisphere.
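Steps 1 and 2 for a single pixel can be sketched as below. The sketch assumes the relations implied by the text: V_init = S/A_strat, S_trop = V_trop_a_priori × A_trop, an initial stratosphere of V_init − S_trop/A_strat, and masking wherever S_trop/A_strat is not below ε. The function name and the use of `None` for a masked pixel are illustrative choices.

```python
EPSILON = 0.3e15  # molecules cm^-2, absolute masking threshold quoted in the text


def initial_stratosphere(s, a_strat, a_trop, v_trop_prior, eps=EPSILON):
    """Return (v_init, v0_strat); v0_strat is None when the pixel is masked.

    s             de-striped slant column
    a_strat       stratospheric air mass factor
    a_trop        tropospheric air mass factor
    v_trop_prior  a priori tropospheric vertical column from the climatology
    """
    v_init = s / a_strat
    s_trop = v_trop_prior * a_trop         # a priori tropospheric slant column
    contamination = s_trop / a_strat       # its footprint in the V_init field
    if contamination >= eps:
        return v_init, None                # too contaminated: mask
    return v_init, v_init - contamination  # step 1: subtract a priori troposphere
```

A polluted pixel with a priori column 5 × 10^15 cm^-2 is masked under clear skies (A_trop = 1, A_strat = 2), but stays unmasked if clouds reduce A_trop to 0.1, which mirrors the behavior described for cloud-obscured pollution.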
Steps 3-6 are performed with the stratospheric field data binned on a uniform 1° × 1° geographic grid. A separate global stratospheric field is constructed for each orbit by forming weighted averages of the data in each 1° × 1° bin and including data from the adjacent ±7 orbits. Largest weights are assigned to data from the "target" orbit, so that adjacent orbits are essentially used only when data from the target orbit are unavailable. The weighting scheme minimizes the effects of mixing data from different local times in overlapping orbits with the data from the target orbit. Any unfilled bins in this vertical-column field are then interpolated using a 2-D averaging function in the form of a rectangular window of dimensions δ_Lon degrees in longitude and δ_Lat degrees in latitude. At middle latitudes, we use a window of δ_Lon ∼30° and δ_Lat ∼20°, but we modify these values at low and high latitudes. In particular, the longitude dimension near the Equator is increased to 360° to reduce synoptic-scale contamination of the stratospheric field by NO2 enhancements due to tropical lightning. Martin et al. (2007) have discussed the existence of longitudinally broad tropospheric LNO2 enhancements in the tropics between South America and Africa. The binned, interpolated field is shown in the grid-cell map of Fig. 2e. Note the scattered grid cells containing unmasked data within the smooth, interpolated (i.e., masked) field of Eastern Europe. These grid cells contain information responsible for the stratosphere's structure in regions that otherwise may look to be uniformly masked in the orbital map of Fig. 2d.
To further reduce contamination of the stratosphere by tropospheric NO2 not accounted for in the climatology, we use statistical criteria to identify and mask tropospheric hot spots (step 5). These may include unknown anthropogenic sources or intense, localized soil- and lightning-related emissions. For hot-spot removal, we employ a smaller averaging window of δ_Lon ∼15° and δ_Lat ∼10°. The V_strat value in the bin at the center of the window is masked and replaced by the mean if its V_strat exceeds the mean by more than 1.5 standard deviations. A comparison of Fig. 2e and f shows the result of the hot-spot removal. Note the removal of small areas of locally enhanced NO2 in Western Canada, the eastern Gulf of Mexico, and in various locations throughout Asia.
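The hot-spot criterion can be sketched on a small gridded field as follows. Several details are assumptions made for illustration: the window mean and standard deviation include the central bin, the window is clipped at the grid edges rather than wrapped in longitude, and the default half-widths (5 and 7 cells) merely approximate a ∼10° × ∼15° window on a 1° grid. The function name is hypothetical.

```python
import statistics


def remove_hot_spots(grid, half_lat=5, half_lon=7, n_sigma=1.5):
    """Return a copy of `grid` with hot-spot bins replaced by the window mean.

    grid: list of latitude rows of V_strat values; None marks an empty bin.
    """
    n_lat, n_lon = len(grid), len(grid[0])
    cleaned = [row[:] for row in grid]
    for i in range(n_lat):
        for j in range(n_lon):
            centre = grid[i][j]
            if centre is None:
                continue
            window = [
                grid[ii][jj]
                for ii in range(max(0, i - half_lat), min(n_lat, i + half_lat + 1))
                for jj in range(max(0, j - half_lon), min(n_lon, j + half_lon + 1))
                if grid[ii][jj] is not None
            ]
            if len(window) < 2:
                continue
            mean = statistics.mean(window)
            spread = statistics.pstdev(window)
            # Only positive excursions are treated as tropospheric hot spots.
            if centre - mean > n_sigma * spread:
                cleaned[i][j] = mean
    return cleaned
```

Statistics are always taken from the input field, so replacing one bin does not influence the test applied to its neighbors.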
Finally, the stratospheric vertical columns are smoothed using a small window of δ_Lon ∼5° and δ_Lat ∼3° and interpolated from the 1° × 1° grid back to the pixel-center coordinates. The smoothing step effectively degrades the spatial scale of resolvable stratospheric features to approximately 300 km so that any smaller-scale features in the V_init field will be interpreted as tropospheric.
The NO2 tropospheric column at each pixel is the difference between the total and the stratospheric columns, computed as follows:

$$V_{\mathrm{trop}} = \frac{S - V_{\mathrm{strat}}\,A_{\mathrm{strat}}}{A_{\mathrm{trop}}}, \tag{11}$$

where S is the de-striped total measured slant column (Sect. 2.3) and A_strat and A_trop are the air mass factors (derived from a priori and cloud information as described in Sect. 2.2). Tropospheric values are generally positive, as seen in Fig. 2h, but local negative values may occur at any pixel where the binning, interpolation, and/or smoothing in the STS algorithm results in a V_strat value larger than V_init. The total column is the sum of the tropospheric and stratospheric columns:

$$V_{\mathrm{total}} = V_{\mathrm{trop}} + V_{\mathrm{strat}}. \tag{12}$$

Note that V_total is generally larger than V_init, since A_trop is typically smaller than A_strat, especially where tropospheric NO2 is concentrated in the boundary layer and/or hidden by clouds.
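The final step can be checked with a short numerical sketch. It assumes, as the text describes, that the tropospheric column is the slant column minus the stratospheric slant column (V_strat × A_strat), divided by A_trop, and that the total is the sum of the two partial columns. The function names are hypothetical.

```python
def tropospheric_column(s, v_strat, a_strat, a_trop):
    """Tropospheric vertical column from the de-striped slant column."""
    return (s - v_strat * a_strat) / a_trop


def total_column(v_trop, v_strat):
    """Total vertical column as the sum of the two partial columns."""
    return v_trop + v_strat
```

For S = 8 × 10^15 cm^-2, V_strat = 3 × 10^15 cm^-2, A_strat = 2 and A_trop = 1, the tropospheric column is 2 × 10^15 cm^-2 and the total is 5 × 10^15 cm^-2, which exceeds V_init = S/A_strat = 4 × 10^15 cm^-2 because A_trop is smaller than A_strat. A V_strat larger than V_init yields a negative tropospheric column.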

Error estimates
The uncertainties in the total column amounts result from uncertainties in (1) the fitted NO2 slant columns, (2) the stratospheric and tropospheric air mass factors, and (3) the algorithm used to separate the stratosphere and troposphere (STS). Descriptions of these errors in the context of OMI NO2 retrievals may be found in Boersma et al. (2004, 2011) and Wenig et al. (2008). The uncertainties in the slant-column amounts have been described previously (Boersma et al., 2004, 2011) and will not be discussed in detail here.
For data collected during the first two to three years of the mission, the rms fitting error in the OMI NO 2 slant column had a median value of approximately 10 15 cm −2 , which is on the order of 10 % of the total slant column for polluted regions.For swath positions affected by the row anomaly (see Sect. 2.1), we calculate NO 2 values but do not estimate uncertainties.We treat the S, A, and STS errors as statistically independent and discuss the latter two in Sects.3.1 and 3.2.The combined errors for the vertical-column retrievals are given in Sect.3.3.

Errors in air mass factors
The air mass factor ($A_{\mathrm{strat}}$ or $A_{\mathrm{trop}}$) is computed as shown in Eqs. (1-5). A general expression for the air-mass-factor uncertainty, $\sigma_A$, can be written as a sum of variances:
$$\sigma_A^2 = (\sigma_A^{m})^2 + (\sigma_A^{\mathrm{CTM}})^2, \quad (13)$$
where $\sigma_A^{m}$ is the net air-mass-factor error associated with the scattering weights, $m$, and $\sigma_A^{\mathrm{CTM}}$ is the net air-mass-factor error associated with the CTM used for the NO$_2$ profile shape. The parameters that most affect the scattering weights are the terrain reflectivity, $R$, the cloud radiance fraction, $w$ ($w$ also implicitly accounts for aerosols; Boersma et al., 2011), and the effective cloud pressure (also referred to as optical centroid pressure), $P_c$. The parameters relating to the CTM are the NO$_2$ sub-column profile, $V_i$, and temperature profile, $T_i$. In general, the uncertainties in these quantities are not independent; e.g., an overestimation of $R$ can lead to an underestimation of $w$, and the derived cloud pressure $P_c$ can also be related to $w$ in cloud retrieval algorithms (Sneep et al., 2008). Likewise, the temperature profile $T_i$ affects the model's prediction of NO$_2$ mixing ratios, $\zeta(p)$. Uncertainties in the viewing geometry and terrain pressure are neglected in this error formulation, although errors in the latter can affect integrated profile amounts, particularly over mountainous terrain (Schaub et al., 2007; Boersma et al., 2008; Hains et al., 2010; Russell et al., 2011).
In spite of these interdependencies, we assume, for computational convenience, that these parameters can be decoupled as follows:
$$(\sigma_A^{m})^2 = (\sigma_A^{R})^2 + (\sigma_A^{w})^2 + (\sigma_A^{P_c})^2 \quad (14)$$
and
$$(\sigma_A^{\mathrm{CTM}})^2 = (\sigma_A^{\zeta})^2 + (\sigma_A^{T})^2, \quad (15)$$
where $\sigma_A^{R}$, $\sigma_A^{w}$, and $\sigma_A^{P_c}$ are the air-mass-factor errors due to errors in terrain reflectivity, $R$, cloud radiance fraction, $w$, and cloud pressure, $P_c$, respectively. Similarly, $\sigma_A^{\zeta}$ and $\sigma_A^{T}$ are the respective air-mass-factor errors due to errors in the model NO$_2$ mixing-ratio profile, $\zeta_i$, and temperature profile, $T_i$.
We compute the terms on the right-hand sides of Eqs. (14) and (15) from Eqs. (1-5), from a priori estimates of the uncertainties $\sigma_R$, $\sigma_w$, $\sigma_{P_c}$, $\sigma_\zeta$, and $\sigma_T$ in terrain reflectivity, cloud radiance fraction, cloud pressure, NO$_2$ profile, and temperature profile, respectively, and from the sensitivities of $A$ to each of these parameters. Using Eqs. (1-5), we can write simplified expressions for the variances $(\sigma_A^{\beta})^2$ in the five parameters $\beta = R$, $w$, $P_c$, $\zeta$, or $T$. If the atmosphere is divided into $N$ vertical layers ($i = 1, \ldots, N$), we define an $N$-element Jacobian column vector $J_\beta$ and its (row) transpose $J_\beta^{T}$. Each element $(J_\beta)_i$ is the derivative of $A$ (Eq. 1) with respect to parameter $\beta$ in layer $i$. With these definitions, the five variances can be written in compact matrix notation, together with the corresponding explicit expressions for the Jacobian elements (Eqs. 16-20). In Eqs. (16-20), $U$ is defined as an $N \times N$ unit matrix (matrix of all elements equal to one). $S_\zeta$ and $S_T$ are the $N \times N$-element covariance matrices for the a priori model NO$_2$ mixing-ratio and temperature profiles, respectively. In general, for parameter $\beta$, the $(i, j)$ element of the covariance matrix is the expectation value of the product of the deviations $(\sigma_\beta)_i$ and $(\sigma_\beta)_j$ from their respective mean values, $(S_\beta)_{i,j} = \langle (\sigma_\beta)_i \, (\sigma_\beta)_j \rangle$. The $\zeta$ and $T$ covariance matrices can be estimated with daily profiles from the CTM, by considering the respective average covariances of $\zeta$ and $T$ within each layer.
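The matrix notation above can be made concrete with a small sketch. It assumes the standard first-order quadratic form, variance $= J_\beta^{T} S_\beta J_\beta$, which is suggested by the definitions of the Jacobian vector and covariance matrix but may differ in detail from the explicit expressions in Eqs. (16-20). The function names and the Jacobian values are hypothetical; the covariance is estimated from an array of daily profiles, as the text suggests.

```python
import numpy as np

def profile_covariance(daily_profiles):
    """Estimate an N x N layer covariance from daily profiles.

    daily_profiles : array of shape (n_days, n_layers), e.g. daily CTM output.
    """
    return np.cov(daily_profiles, rowvar=False)

def amf_variance(jacobian, covariance):
    """Assumed quadratic form J^T S J for the air-mass-factor variance."""
    jacobian = np.asarray(jacobian, dtype=float)
    return float(jacobian @ covariance @ jacobian)

# Scalar parameter (e.g. reflectivity): with a matrix of ones scaled by
# sigma^2, the quadratic form reduces to (sum of Jacobian elements)^2 * sigma^2.
J = np.array([0.1, 0.2, 0.3])          # hypothetical dA/dbeta per layer
sigma_beta = 0.5                        # hypothetical parameter uncertainty
U = np.ones((3, 3))
var_scalar = amf_variance(J, sigma_beta**2 * U)
```

For a profile-type parameter, `profile_covariance` would supply the full matrix so that correlated errors between layers are carried into the variance.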
Combining Eqs. (13-20), we summarize the net variance in the air mass factor in Eq. (21). In this expression, we have omitted the uncertainty due to temperature, since its contribution relative to the other terms was found to be negligible. The fourth term, involving the covariance of the a priori NO$_2$ vertical profile shapes, also neglects any unresolved horizontal variation in the profiles. This variation can significantly affect the magnitudes of air mass factors within the $2^\circ \times 2.5^\circ$ CTM grid cells, as shown by Heckel et al. (2011) and Lamsal et al. (2013).
Equation (21) can be applied to both the stratospheric and tropospheric air mass factors. In practice, however, $A_{\mathrm{strat}}$ is very nearly geometrical and has a very small uncertainty. For simplicity in calculation, we assume a fixed nominal 2 % error in this value: $\sigma_{A_{\mathrm{strat}}} = 0.02 \, A_{\mathrm{strat}}$. Under clear skies (ignoring uncertainties related to clouds), the error in $A_{\mathrm{trop}}$ is a function of uncertainties in terrain reflectivity and the NO$_2$ profile shape. Assuming a nominal uncertainty in terrain reflectivity of $\sigma_R \sim 0.015$ (Wenig et al., 2008), the associated error in $A_{\mathrm{trop}}$ is on the order of 10 to 15 %, and a similar uncertainty results from errors in the profile (Bucsela et al., 2008). Therefore, a conservative estimate of clear-sky relative uncertainty in $A_{\mathrm{trop}}$ is 20 %. When clouds are present, we compute uncertainties in $A_{\mathrm{trop}}$ of 30 to 80 %.
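As a rough consistency check on the clear-sky figure, the two contributions quoted above can be combined in quadrature. This sketch assumes that the reflectivity and profile-shape terms are independent and that each contributes a relative error between 10 and 15 %; the symbols $\epsilon_R$ and $\epsilon_\zeta$ are introduced here for illustration only.

```latex
\frac{\sigma_{A_{\mathrm{trop}}}}{A_{\mathrm{trop}}}
  \approx \sqrt{\epsilon_R^{2} + \epsilon_\zeta^{2}}, \qquad
\sqrt{0.10^{2} + 0.10^{2}} \approx 0.14, \qquad
\sqrt{0.15^{2} + 0.15^{2}} \approx 0.21 .
```

The upper end of this range is close to the 20 % value adopted as the conservative clear-sky estimate.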

Errors in the estimated stratosphere
The stratospheric vertical-column uncertainty, $\sigma_{V_{\mathrm{strat}}}$, from the STS algorithm depends on a number of factors, including the conditions associated with the slant-column measurement, the STS algorithm parameters, and errors associated with the a priori tropospheric model. Measurement errors relate to the geographic region of the measurement, the local cloud parameters (cloud radiance fraction and cloud pressure), and the degree of tropospheric pollution affecting the region. Sources of retrieval-parameter error include the masking thresholds (for the initial masking and hot-spot removal) and the widths of the geographical averaging functions. Finally, the a priori tropospheric estimate from the CTM introduces both random-type errors, due to differences between the monthly mean climatology and daily tropospheric profiles, and any systematic errors affecting the model. In clean regions, errors in the climatological tropospheric vertical columns can bias the stratospheric estimate. Effects of such CTM errors are examined further in Sect. 4.1 and discussed by Lamsal et al. (2013).
Because of the multiple dependencies, the stratospheric error is difficult to quantify. However, we can make a reasonable estimate by combining the effects of the three largest independent sources of uncertainty: (1) $\sigma^{\mathrm{CTM}}_{S_{\mathrm{trop}}}$ are errors in the a priori $S_{\mathrm{trop}}$ due to the (mostly unknown) uncertainty in … Although the CTM errors are not known, we estimate the uncertainties from sources (1) and (2) (first two terms) to be approximately 50 %. Therefore, by Eq. (10), with $\varepsilon = 0.3 \times 10^{15}$ cm$^{-2}$, these errors are $< 0.15 \times 10^{15}$ cm$^{-2}$ each. The third term in Eq. (22) applies only to masked areas (see Sect. 4.1.2). An estimate of its value was obtained from an analysis of simulated data. Using the monthly mean GMI model NO$_2$ profiles and daily views of geometry, pressure, reflectivity, and cloud parameters from OMI, a set of simulated slant-column measurements was constructed.
The STS algorithm was then applied to these slant columns and the retrieved stratospheric vertical columns were compared to those of the original model. Histograms of the stratospheric errors in the masked regions for four months are shown in Fig. 3, which indicates $1\sigma$ errors of approximately $0.1 \times 10^{15}$ cm$^{-2}$. Therefore, the combined stratospheric vertical-column error computed from Eq. (22) is on the order of $0.2 \times 10^{15}$ cm$^{-2}$. Errors in masked (polluted) regions can be slightly larger than this value, while errors in the cleanest areas (e.g., high-latitude, unpolluted areas) are typically significantly smaller. The use of $0.2 \times 10^{15}$ cm$^{-2}$ as an approximate value for stratospheric uncertainty is consistent with previous estimates by Boersma et al. (2004) and Bucsela et al. (2006). The masking threshold of $0.3 \times 10^{15}$ cm$^{-2}$ was chosen, in part, to make the total stratospheric vertical-column uncertainty close to this value. Section 4.1 offers further discussions of the effects of errors associated with the STS algorithm.
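The combined value quoted above can be checked with simple arithmetic. This sketch assumes that Eq. (22) is a sum in quadrature of the three independent terms, and it uses the upper bounds of the first two terms together with the histogram-based estimate for the third, all in units of $10^{15}$ cm$^{-2}$.

```latex
\sigma_{V_{\mathrm{strat}}}
  \lesssim \sqrt{0.15^{2} + 0.15^{2} + 0.10^{2}}
  = \sqrt{0.055}
  \approx 0.23 .
```

Because the first two terms enter as upper bounds, this is consistent with a combined error on the order of $0.2 \times 10^{15}$ cm$^{-2}$.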

Vertical-column errors
The uncertainties in the retrieved vertical-column amounts are calculated by treating the uncertainties in $S$, $V_{\mathrm{strat}}$, $A_{\mathrm{strat}}$, and $A_{\mathrm{trop}}$ as independent. Based on the definitions of tropospheric and total columns in Eqs. (11) and (12), this assumption yields the variances in the tropospheric and total vertical columns given in Eqs. (23) and (24), where $\sigma_S$, $\sigma_{V_{\mathrm{strat}}}$, $\sigma_{A_{\mathrm{strat}}}$, and $\sigma_{A_{\mathrm{trop}}}$ are the uncertainties in $S$, $V_{\mathrm{strat}}$, $A_{\mathrm{strat}}$, and $A_{\mathrm{trop}}$, described in Sects. 3.1 and 3.2. Global mean tropospheric vertical-column errors are typically on the order of $1 \times 10^{15}$ cm$^{-2}$ for clear skies and can be greater than $3 \times 10^{15}$ cm$^{-2}$ as the cloud radiance fraction approaches unity, as shown in Fig. 4. The tropospheric errors are driven mainly by the slant-column uncertainty when pollution is low and by both slant-column and air-mass-factor uncertainties under polluted conditions. Relative tropospheric vertical-column uncertainties in cloudy, polluted regions can approach 100 %.
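A first-order propagation of independent errors through the column definitions can be sketched as follows. It assumes the tropospheric column has the form $(S - A_{\mathrm{strat}} V_{\mathrm{strat}})/A_{\mathrm{trop}}$ and the total column is its sum with $V_{\mathrm{strat}}$; the resulting terms may be arranged differently from Eqs. (23) and (24). The function name and the example values (in units of $10^{15}$ cm$^{-2}$) are hypothetical.

```python
import math

def column_uncertainties(s, v_strat, a_strat, a_trop,
                         sig_s, sig_v_strat, sig_a_strat, sig_a_trop):
    """Return (sigma_V_trop, sigma_V_total) from first-order propagation."""
    v_trop = (s - a_strat * v_strat) / a_trop  # assumed column definition
    # Terms shared by both columns: partial derivatives w.r.t. S, A_strat, A_trop.
    var_common = ((sig_s / a_trop) ** 2
                  + (v_strat * sig_a_strat / a_trop) ** 2
                  + (v_trop * sig_a_trop / a_trop) ** 2)
    # dV_trop/dV_strat = -A_strat/A_trop; dV_total/dV_strat = 1 - A_strat/A_trop.
    var_trop = var_common + (a_strat * sig_v_strat / a_trop) ** 2
    var_total = var_common + ((1.0 - a_strat / a_trop) * sig_v_strat) ** 2
    return math.sqrt(var_trop), math.sqrt(var_total)

# Hypothetical clear-sky polluted pixel: 2 % error in A_strat, 20 % in A_trop.
sig_trop, sig_total = column_uncertainties(
    s=12.0, v_strat=3.0, a_strat=2.5, a_trop=1.0,
    sig_s=1.0, sig_v_strat=0.2, sig_a_strat=0.05, sig_a_trop=0.2)
```

In this example the slant-column and tropospheric air-mass-factor terms dominate, in line with the statement that both matter under polluted conditions.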

Discussion and comparisons with other datasets
The main components that distinguish the SP2 algorithm from SP1 and other satellite NO$_2$ retrieval schemes involve the STS and tropospheric air mass factors. In this section, we further examine the STS algorithm, compare with OMI retrievals from other algorithms, and re-examine validation using in situ measurements.

Retrieval effects of a priori assumptions in the STS algorithm
The a priori tropospheric NO$_2$ columns and the masking, interpolation, and smoothing components of the algorithm affect NO$_2$ retrieval accuracy. In general, over the cleanest areas (open-ocean and high-latitude regions), the SP2 algorithm yields $V_{\mathrm{strat}}$ values that are approximately as accurate as the NO$_2$ slant columns and contain only small amounts of a priori model information from the tropospheric climatology. Relatively little independent tropospheric information is retrieved from these regions, but local enhancements relative to the a priori troposphere can be observed. The retrieval over clean regions consists of small-scale (smaller than $\sim 3^\circ$) tropospheric features and measurement noise. Over polluted (masked) regions, the SP2 algorithm provides $V_{\mathrm{trop}}$ retrievals that are mainly dependent on the assumed local profile shapes and surface reflectivity climatology via the air mass factors. Two scenarios in particular are challenging to the SP2 STS algorithm (and for nadir-viewing satellite retrievals in general). Firstly, over clean, unmasked regions, non-localized (broad) variations in tropospheric NO$_2$ relative to the a priori climatology will necessarily appear as stratospheric NO$_2$. Secondly, in polluted, masked areas, stratospheric features that depart significantly from the mean stratosphere in surrounding unmasked areas will be aliased into the tropospheric column. For example, a small-scale stratospheric enhancement over a cloud-free polluted region on the US East Coast would be retrieved as a tropospheric enhancement. The inherent ambiguity in these scenarios cannot be resolved without additional information about the partitioning of stratospheric and tropospheric NO$_2$.
To examine the behavior of the STS algorithm under such conditions, we consider idealized, noise-free retrievals over unmasked and masked regions. These are discussed and illustrated in Sects. 4.1.1-4.1.3. For our simulations, we assume that the "true" stratospheric air mass factor, $\overline{A}_{\mathrm{strat}}$, is equal to its a priori estimate, $A_{\mathrm{strat}}$, and is invariant on scales smaller than the widths of the smoothing windows: $\langle A_{\mathrm{strat}} \rangle \approx A_{\mathrm{strat}}$ (on this scale we ignore the effects of viewing geometry). In this discussion, we use an overline for variables to represent "true" (as opposed to a priori) atmospheric values, angle brackets $\langle\,\rangle$ to indicate window averages, and a prime ($'$) to designate values in masked (polluted) areas.

Retrievals in unmasked (clean) regions
If the measured slant column is the sum of the true stratospheric and tropospheric slant columns, $S = \overline{S}_{\mathrm{strat}} + \overline{S}_{\mathrm{trop}}$, then it can be shown from Eq. (9) that the retrieved stratospheric vertical column in unmasked (clean) regions is given by Eq. (25), in which $\langle \overline{V}_{\mathrm{strat}} \rangle$ is the window-averaged true stratospheric vertical column. Equation (25) states that the retrieved stratospheric vertical column is the average of the true stratospheric vertical column plus an error term from the difference between the true and a priori tropospheric slant columns. When the a priori tropospheric slant column is correct and the true stratospheric field is smooth on the scale of the smoothing window, the stratospheric retrieval is unbiased. If the true stratospheric field is homogeneous within the averaging window, then from Eq. (11) the retrieved troposphere in unmasked regions is given by Eq. (26). When the a priori tropospheric air mass factors are accurate ($A_{\mathrm{trop}} \approx \overline{A}_{\mathrm{trop}}$) and slowly varying, Eq. (26) becomes
$$V_{\mathrm{trop\_RET}} \approx \langle V_{\mathrm{trop}} \rangle + \left( \overline{V}_{\mathrm{trop}} - \langle \overline{V}_{\mathrm{trop}} \rangle \right). \quad (27)$$
Equation (27) shows that, in unmasked regions, the retrieved tropospheric vertical column is approximately equal to the a priori mean value (first term). However, the retrieval has additional fine-scale structure equal to the difference between the true tropospheric vertical column and its mean (second term).
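The behavior described for unmasked regions can be reproduced with a toy one-dimensional experiment. This is only an idealized sketch under stated assumptions: constant air mass factors, a periodic boxcar as the window average, a stratospheric estimate formed as the window mean of $(S - S_{\mathrm{trop}})/A_{\mathrm{strat}}$ with $S_{\mathrm{trop}}$ the a priori tropospheric slant column, and a tropospheric column of the form $(S - A_{\mathrm{strat}} V_{\mathrm{strat}})/A_{\mathrm{trop}}$. All names and numbers (columns in units of $10^{15}$ cm$^{-2}$) are hypothetical.

```python
import numpy as np

def window_mean(x, half_width):
    """Periodic boxcar average over 2 * half_width + 1 points."""
    shifts = range(-half_width, half_width + 1)
    return np.mean([np.roll(x, k) for k in shifts], axis=0)

def retrieve_unmasked(s, s_trop_apriori, a_strat, a_trop, half_width):
    """Toy STS step: smoothed stratosphere, then residual troposphere."""
    v_strat_ret = window_mean((s - s_trop_apriori) / a_strat, half_width)
    v_trop_ret = (s - a_strat * v_strat_ret) / a_trop
    return v_strat_ret, v_trop_ret

n = 360
x = np.arange(n)
a_strat, a_trop = 2.5, 1.2                      # assumed constant AMFs
v_strat_true = np.full(n, 3.0)
v_trop_apriori = np.full(n, 0.4)
# "True" troposphere: 50 % larger than the a priori, plus fine-scale structure.
v_trop_true = 1.5 * v_trop_apriori + 0.1 * np.sin(2 * np.pi * x / 6.0)
s = a_strat * v_strat_true + a_trop * v_trop_true

v_strat_ret, v_trop_ret = retrieve_unmasked(
    s, a_trop * v_trop_apriori, a_strat, a_trop, half_width=15)
```

In this setup the retrieved stratosphere is biased high by the broad tropospheric excess, while the retrieved troposphere keeps the fine-scale structure but reverts to the a priori mean.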

Retrievals in masked (polluted) regions
The retrieved stratospheric vertical column in masked areas, $V_{\mathrm{strat\_RET}}$, is actually the interpolation-window average of the retrieved stratosphere in surrounding unmasked areas. We define the "true" stratosphere in masked areas as the similarly averaged true stratosphere from surrounding regions plus an amount, $\delta V_{\mathrm{strat}}$, which varies from point to point inside the mask. The standard deviation of $\delta V_{\mathrm{strat}}$ is $\sigma_{\delta V_{\mathrm{strat}}}$, which was derived in Sect. 3.2 from the histogram widths in Fig. 3. With these definitions, the tropospheric retrieval in a masked region can be shown to be given by Eq. (28), where $A_{\mathrm{strat}}$ and $A_{\mathrm{trop}}$ are the a priori stratospheric and tropospheric air mass factors in the masked area, and $\overline{A}_{\mathrm{trop}}$ is the true tropospheric air mass factor. As before, $\langle S_{\mathrm{trop}} \rangle$ and $\langle \overline{S}_{\mathrm{trop}} \rangle$ are the smoothed a priori and true tropospheric slant columns, respectively, in the surrounding unmasked areas. We want our retrieval $V_{\mathrm{trop\_RET}}$ to be as close as possible to the true troposphere $\overline{V}_{\mathrm{trop}}$, and the three terms in Eq. (28) identify three possible sources of error. The first arises from potential mismatch of the true and a priori tropospheric air mass factors, $\overline{A}_{\mathrm{trop}}$ and $A_{\mathrm{trop}}$. The true tropospheric vertical column will be scaled by their ratio. The second term shows errors due to the incorrect estimation of the a priori tropospheric slant columns in the surrounding areas. The third term describes tropospheric errors resulting from differences between the true stratosphere in the masked region and the mean stratosphere estimated from the surrounding regions.
The second and third terms increase as the tropospheric air mass factor in the masked region (the denominator of each) decreases due to increasing aerosol or cloud fraction, for example. The result is that any biases caused by non-zero values for $\delta V_{\mathrm{strat}}$ or for $\langle S_{\mathrm{trop}} \rangle - \langle \overline{S}_{\mathrm{trop}} \rangle$ outside the mask will be amplified as the cloud fraction for a given pixel inside the mask increases. The bias is bounded, because measurements with large cloud fractions generally switch to the unmasked case described in Sect. 4.1.1.

Examples using simulated data
Figure 5 shows retrievals of simulated OMI slant-column data. The plots represent nadir pixels along sections of two OMI orbits, with viewing geometry and cloud parameters taken from the orbital data. In this simulation, we assume that all a priori air mass factors are correct, i.e., the true air mass factors are the same as those used in the retrieval. Figure 5a and c illustrate stratospheric and tropospheric retrievals, respectively, in an unmasked (clean) part of the eastern Pacific. The retrieved stratospheric vertical column (red curve in Fig. 5a) is biased high because the simulated tropospheric data were intentionally made 50 % larger than the a priori troposphere in the unmasked regions. This bias affects the retrieved stratosphere via the second term in Eq. (25).
Figure 5c shows the corresponding tropospheric retrieval. As expected from Eq. (27), the retrieval (red) follows the a priori (blue) rather than the true data (black), on average. However, it is evident that some of the smaller-scale differential structure in the original data is preserved in the retrieval. Retrievals in a masked region of the Eastern US are shown in Fig. 5b and d. The differences between the true and retrieved stratospheres (Fig. 5b) are similar to the unmasked case (Fig. 5a), except that some regional variability is apparent. These deviations are due to intentional non-zero values of $\delta V_{\mathrm{strat}}$ used in constructing the stratospheric NO$_2$ field. An example can be seen for the latitudes $30^\circ$ N-$37^\circ$ N, where the stratospheric retrieval (interpolation from surrounding unmasked regions) underestimates the stratospheric data. The corresponding troposphere in the masked region is shown in Fig. 5d. Unlike the unmasked case, the retrieved troposphere here (red) is mostly independent of the a priori troposphere (blue) and is generally a good estimate of the true data (black). An obvious exception around $30^\circ$ N-$37^\circ$ N latitude results from the previously noted stratospheric underestimation, which leads to a tropospheric overestimation. Elsewhere, the tropospheric vertical column is slightly underestimated by an absolute amount comparable to that of the unmasked troposphere (Fig. 5c). As in the unmasked case, this underestimation is due to the error in the a priori troposphere for the clean regions. The relative effect in this case appears small, since overall tropospheric columns are much larger in the masked region (note the difference in y-axis scaling for Fig. 5c and d).
In summary, the absolute stratospheric retrieval errors are generally small in most areas. The magnitude of the error depends on the magnitude of the bias between the a priori and true tropospheric fields. For OMI, we estimate that this bias introduces a stratospheric uncertainty of $\sim 0.2 \times 10^{15}$ cm$^{-2}$ or < 10 %. When tropospheric air mass factors are accurate, our simulations show that the absolute tropospheric vertical-column errors due to stratospheric errors are also small (on the order of $0.5 \times 10^{15}$ cm$^{-2}$) in both masked and unmasked regions. The corresponding relative tropospheric errors in unmasked regions may be large due to the small tropospheric background amounts in those regions. Errors in tropospheric air mass factors will lead to proportional increases in both the absolute and relative tropospheric vertical-column errors in all areas.

Masking and interpolation sensitivity tests
The retrieved stratospheric field was examined for sensitivity to parameters that control the initial masking and interpolation in the STS algorithm. These steps are illustrated in Fig. 2d and e. We modified the masking threshold and the dimensions of the interpolation function (window) and computed the resultant stratospheric fields. In each case, the field based on the nominal parameters (see Sect. 2.4) was subtracted from the modified field. The results, shown in Fig. 6, suggest that the retrieval is fairly robust with respect to the threshold and window dimensions. In Fig. 6a, the masking threshold was reduced from its nominal value of $0.3 \times 10^{15}$ cm$^{-2}$ to $0.2 \times 10^{15}$ cm$^{-2}$, and in Fig. 6b, the threshold was increased to $0.4 \times 10^{15}$ cm$^{-2}$. The reduced threshold increases the number of masked pixels by a factor of $\sim 2$, while the threshold increase approximately halves the masked-pixel count. The difference in the resultant $V_{\mathrm{strat}}$ is generally smaller than $0.5 \times 10^{15}$ cm$^{-2}$ and is less than $0.1 \times 10^{15}$ cm$^{-2}$ over most of the Earth. The biggest effects are seen for the smaller threshold ($0.2 \times 10^{15}$ cm$^{-2}$), since this threshold results in more than half of Northern Hemisphere pixels being masked, leaving little data from which to accurately interpolate the stratospheric field. Tests involving the interpolation algorithm are shown in Fig. 6c and d. In these figures, the latitude and longitude dimensions of the window were approximately halved and doubled, respectively. Results show that effects on the stratospheric field are even smaller than those seen in the threshold tests. We have also found that changing the shape of the interpolation function (e.g., from boxes to circles) makes a negligible difference in the resultant stratosphere.

Comparisons with other NO$_2$ retrieval algorithms and models
Using NO$_2$ slant columns from one day of OMI data for illustration, we compare SP2 retrievals with those from the SP1 algorithm (Bucsela et al., 2006) and the DOMINO v2 algorithm (Boersma et al., 2011). The retrieved stratospheric fields for one day are shown in Fig. 7, along with the GMI model field for the same date. The GMI stratosphere was sampled at the OMI overpass time and adjusted by empirical scaling factors to give approximate agreement in magnitude with the retrieved OMI stratosphere over the Pacific. The scaling factors were assumed to be linear functions of latitude and to vary between about 1.1 and 1.4. Linear variation was assumed for simplicity only; the actual ratio between the OMI and GMI stratospheres is more complicated, as evidenced by the discrepancy between the two stratospheres (e.g., near the Equator). Nonetheless, a comparison of Fig. 7b and d shows that synoptic-scale structures in the model stratosphere are qualitatively similar to those retrieved by the SP2 algorithm. The stratosphere of the SP1 retrieval lacks structure on this scale and contains artifacts associated with the wave-2 assumption in the SP1 retrieval. Examples are the low SP1 stratospheric NO$_2$ values in Western Asia and the band of enhanced NO$_2$ across the Southern US and parts of the Atlantic Ocean and North Sea. In some of these regions, the SP1 stratospheric values exceed the values of $V_{\mathrm{init}}$ (Fig. 2b), which should not occur, apart from measurement noise, since tropospheric NO$_2$ amounts must be positive. Such artifacts are also evident in the v2 DOMINO stratospheric field over southern high latitudes, the North Atlantic and parts of Siberia. Note the anomalously high DOMINO stratospheric values in eastern Siberia associated with the breakup of the polar vortex (Dirksen et al., 2011). DOMINO also shows stronger cross-track diurnal variation in comparison to SP2 and $V_{\mathrm{init}}$ (Fig. 2b).
Tropospheric retrievals for the same day in March are shown in Fig. 8. The tropospheric fields for all three OMI products (Fig. 8a, b, and c) are qualitatively similar to the GMI March monthly mean field shown in Fig. 8d. The SP1 field shown in Fig. 8a has been recomputed using an off-line version of the SP1 algorithm that retains any negative values of tropospheric NO$_2$. The SP2 tropospheric retrieval shows relatively few instances of negative tropospheric NO$_2$ compared to the other two OMI products. Monthly means from January and July (not shown) indicate that approximately 8 to 9 % of $V_{\mathrm{trop}}$ columns retrieved from SP2 are significantly negative, defined here as $V_{\mathrm{trop}} < -0.2 \times 10^{15}$ cm$^{-2}$. DOMINO tropospheric vertical columns have a somewhat higher frequency of negative values, and these occur predominantly in regions that are cloudy or snow-covered (and thus flagged as unreliable). Approximately 21 % of the mostly cloudy (cloud radiance fraction > 0.5) $V_{\mathrm{trop}}$ retrievals from DOMINO are significantly negative, compared to $\sim 15$ % of DOMINO retrievals in relatively cloud-free regions. A comparison of Figs. 7c and 8c shows that some of the negative tropospheric values in DOMINO are associated with the strong cross-track variation in the DOMINO stratospheric field. Additional comparisons among the three OMI data products may be found in the Supplement. Maps are shown in Figs. S1-S5, and statistical comparisons in the form of PDFs are given in Fig. S6. The figures provide further evidence of the relative scarcity of negative tropospheric column amounts in SP2 compared to SP1 and DOMINO.
Larger-scale similarities and differences in the NO$_2$ retrievals can be seen by examining monthly means. Figure 9 compares monthly zonal means of stratospheric NO$_2$ in January and July from SP1, SP2 and DOMINO, and the longitudinal variation of monthly mean tropospheric NO$_2$ in northern mid-latitudes for the same two months. In January and July, the stratospheric zonal means (Fig. 9a and b) are similar in all three products, with the SP2 slightly higher and the SP1 slightly lower than DOMINO. Larger differences are evident in the tropospheric means shown in Fig. 9c and d. Although the mean values of SP2 and DOMINO are similar (DOMINO is slightly lower in January and slightly higher in July at most longitudes), SP1 is consistently higher than SP2 by almost a factor of two in July. This discrepancy is likely due to the tropospheric air mass factors used in the SP1 retrieval, which did not account for the seasonal variability in NO$_2$ profile shape (Lamsal et al., 2010).

Comparison with in situ measurements from INTEX-B
Validation of the OMNO2 SP2 is the subject of ongoing studies and will be covered in detail in a separate paper (Lamsal et al., 2013). Preliminary results suggest improved agreement with independent datasets for SP2 relative to SP1. The following example shows how NO$_2$ from SP2 compares with data from the Intercontinental Chemical Transport Experiment (INTEX-B). The INTEX-B campaign was conducted from March to May 2006 and included in situ data from the airborne Laser-Induced Fluorescence (LIF) instrument (Thornton et al., 2000, 2003), which measured NO$_2$ mixing ratios with an estimated accuracy of ±10 % or ±5 ppt. Mixing-ratio profiles were obtained at a number of locations in and near the Gulf of Mexico and parts of the Pacific Ocean. The former region included land measurements at polluted locations near Mexico City and Houston.
The LIF data were analyzed and compared with OMI data by Bucsela et al. (2008), Boersma et al. (2008), and Hains et al. (2010). In the present study, we have employed a similar approach to that of Bucsela et al. (2008). LIF data were selected for analysis based on the cloud/aerosol amount, altitude range of the aircraft, and temporal (< 3 h) and spatial (< 20 km) proximity to the OMI overpass data. The profiles were integrated for comparison with OMI NO$_2$ tropospheric columns. All profiles required extrapolation in altitude, both above and below the actual measurements, to cover the full tropospheric column, and the amount of extrapolation was accounted for in the uncertainties assigned to each profile. Comparisons with OMI were made using x- and y-error weighted linear regression.
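A straight-line fit that weights errors in both coordinates can be sketched as follows. The text does not specify which scheme was used; this example assumes the iterative least-squares approach commonly attributed to York, with uncorrelated x and y errors, purely for illustration. The function name `york_fit` and the sample data are hypothetical.

```python
import numpy as np

def york_fit(x, y, sx, sy, n_iter=50):
    """Fit y = a + b*x with per-point 1-sigma errors sx, sy (uncorrelated).

    Returns (slope b, intercept a). Assumed iterative scheme, not
    necessarily the one used in the analysis described in the text.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    wx = 1.0 / np.asarray(sx, float) ** 2
    wy = 1.0 / np.asarray(sy, float) ** 2
    b = np.polyfit(x, y, 1)[0]  # ordinary least-squares slope as first guess
    for _ in range(n_iter):
        w = wx * wy / (wx + b ** 2 * wy)          # combined point weights
        x_bar = np.sum(w * x) / np.sum(w)
        y_bar = np.sum(w * y) / np.sum(w)
        u, v = x - x_bar, y - y_bar
        beta = w * (u / wy + b * v / wx)
        b = np.sum(w * beta * v) / np.sum(w * beta * u)
    a = y_bar - b * x_bar
    return b, a

# Hypothetical columns (1e15 cm^-2): in situ on x, satellite on y.
x_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_obs = 2.0 * x_obs + 1.0
slope, intercept = york_fit(x_obs, y_obs,
                            sx=[0.2, 0.3, 0.2, 0.4, 0.3],
                            sy=[0.5, 0.4, 0.6, 0.5, 0.7])
```

A fixed iteration count keeps the sketch short; a production version would instead stop when the change in slope falls below a tolerance.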
The following information describes our analysis of three different sets of OMNO2 data. The data are from (1) SP1 applied to collection 2 OMI data (as in the original study of Bucsela et al., 2008), (2) SP1 applied to collection 3, and (3) SP2 applied to collection 3. The collection 2 slant columns were retrieved from the OMI spectra based on pre-launch calibrations. Improved post-launch calibrations were used to construct a collection 3 dataset in 2007, as described by Dobber et al. (2008). All OMI examples shown previously in this study are based on collection 3 data. A summary of the comparison results is shown in Fig. 10. In general, the agreement between the LIF and OMI data is good in all cases. Figure 10a shows that the SP1 algorithm, using the original collection 2 OMI slant columns, yields vertical columns slightly below those of the in situ LIF columns (see also Bucsela et al., 2008). The regression of OMI vs. in situ yields slope = 0.9, y-intercept = $0.1 \times 10^{15}$ cm$^{-2}$, and Pearson's correlation coefficient r = 0.83. Using the collection 3 spectral radiances and the same SP1 algorithm (Fig. 10b), we obtain slope = 1.2, y-intercept = $0.2 \times 10^{15}$ cm$^{-2}$, and r = 0.72. This result implies a modest overestimate of NO$_2$ by OMI relative to the in situ measurements. Figure 10c shows the reanalysis of the same collection 3 data using the new SP2 algorithm. In this case, the slope and intercept are approximately unity and zero, respectively, and the correlation coefficient is r = 0.76. The slope and intercept in the latest OMNO2 dataset indicate the best agreement with the INTEX-B data of the three comparison figures. Although this analysis was based on springtime rather than summertime data, the smaller tropospheric columns in SP2 relative to SP1 (for the same collection 3 dataset) appear consistent with Fig. 9d and the results of Lamsal et al. (2013).

Summary and conclusions
The retrieval algorithm described in this paper represents an improvement on many previous and existing methods for retrieving NO$_2$ vertical columns from nadir-viewing satellites. SP2 provides a more realistic, detailed stratosphere and troposphere and has a relatively small dependence on a priori information and assumptions. In the stratospheric retrieval, there is no assumption of global zonal invariance as in reference-sector methods, no assumption of wave-2 zonal variation as in SP1, no use of ancillary stratospheric limb measurements (e.g., Beirle et al., 2010), and no stratospheric CTM scaling or assimilation as employed in the OMI DOMINO product. The stratosphere computed in SP2 requires a monthly tropospheric climatology, but this is applied only to clean regions or to areas cloudy enough to effectively block the satellite's view of tropospheric NO$_2$. Here the errors associated with the climatology are comparable to nominal measurement uncertainties in the stratosphere. In other regions, the stratosphere is interpolated with the introduction of small additional errors in the stratospheric field. The effect on the retrieved stratosphere of modest changes in the interpolation parameters (e.g., the extent of masking or the interpolation-window size and shape) is relatively small. Tests using simulated data show reasonable accuracy in the OMI SP2 retrievals, with clear-sky tropospheric errors over most regions on the order of $10^{15}$ cm$^{-2}$. Recent validation studies comparing OMNO2 SP2 with independent measurements also suggest improvement over SP1-based validations. We note fewer instances of negative tropospheric vertical columns relative to SP1 and the DOMINO product. However, the general agreement between OMNO2 and DOMINO has improved with the introduction of the SP2 algorithm. This agreement is noteworthy, given the very different STS algorithms used in the two products. The discrepancies between polluted summertime tropospheric vertical columns from OMI and those from independent measurements have been greatly reduced in SP2 compared to SP1 and are also small relative to DOMINO.
The quality of the SP2 data is currently being established by independent measurements in ongoing validation campaigns from ground, aircraft, and satellite instruments. Better error estimates in the SP2 product should facilitate the comparisons, and future work will help to refine the error estimates. Differences in the behavior of the SP2 algorithm under clear/cloudy, polluted/clean, and masked/unmasked conditions should also be kept in mind when comparing OMI datasets. Versions of the SP2 algorithm are also planned for testing with data from other satellite instruments, including SCIAMACHY, GOME-2 and TROPOMI.

Fig. 1. Flow diagram of retrieval algorithm for stratospheric and tropospheric NO$_2$ columns. $S$, $V$, $A$, and SW represent slant-column density, vertical-column density, air mass factor, and scattering weight ($m$), respectively. The section outlined in blue is OMI-specific. TOMRAD is a forward vector radiative transfer model (Davé, 1965).

Fig. 3. Histograms of the difference between estimated and original stratospheric NO 2 columns over masked (polluted) areas based on simulated-data retrievals for January, March, July, and October 2005. Each histogram has a 2-sigma level deviation of approximately 0.2 × 10^15 cm^−2.

Fig. 4. Mean tropospheric NO 2 vertical-column errors (solid circles) as a function of cloud radiance fraction, based on SP2 data from 21 March 2005. Vertical lines show error standard deviations and open circles indicate median values.

Fig. 5. NO 2 vertical-column retrievals (red), using simulated data (black). The mean value of the tropospheric data (truth, black) was defined to be larger than that of the algorithm's a priori troposphere (blue). Shown are nadir pixels along two orbital segments: (a) and (c) are stratosphere and troposphere, respectively, for an unmasked segment in the eastern Pacific; (b) and (d) are stratosphere and troposphere, respectively, for a masked segment over eastern North America. In these simulations, geolocation, viewing geometry, and cloud parameters were taken from two orbits on 21 March 2005.

Fig. 9. Monthly mean NO 2 comparisons of SP1 (blue), SP2 (red), and DOMINO (green): (a) January stratospheric zonal mean, (b) July stratospheric zonal mean, (c) January NH troposphere (averaged over latitudes 35° N to 55° N), (d) July NH troposphere (averaged over latitudes 35° N to 55° N). In (c) and (d), the three regions of enhanced tropospheric NO 2 represent, from left to right, the USA, Europe, and E. Asia, respectively. All measurements used for the tropospheric means were screened to exclude cloud radiance fractions greater than 50 %.

Fig. 10. Comparisons of tropospheric OMI NO 2 retrievals vs. integrated in situ LIF NO 2 measurements obtained during the INTEX-B field campaign over the Gulf of Mexico and clean Pacific locations. Shown are (a) SP1, but with collection 2 slant columns, (b) SP1 (current collection 3), and (c) SP2 (current collection 3). The solid line is an error-weighted least squares fit, and the dotted line is 1:1.
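As a rough illustration of the kind of error-weighted straight-line fit mentioned in the Fig. 10 caption, the following minimal sketch fits y = intercept + slope · x with weights 1/σ². It assumes uncertainties in y only, which may differ from the fitting procedure actually used for the figure; the function name `weighted_linear_fit` and its arguments are hypothetical.

```python
def weighted_linear_fit(x, y, sigma_y):
    """Weighted least-squares line y = intercept + slope * x.

    Each point is weighted by 1 / sigma_y**2 (errors assumed in y only).
    Returns (slope, intercept).
    """
    w = [1.0 / s ** 2 for s in sigma_y]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    # Determinant of the 2x2 normal equations; zero if all x are identical.
    det = sw * swxx - swx ** 2
    if det == 0:
        raise ValueError("x values must not all be identical")

    slope = (sw * swxy - swx * swy) / det
    intercept = (swxx * swy - swx * swxy) / det
    return slope, intercept
```

A slope near 1 and an intercept near 0 would correspond to the dotted 1:1 line; points with large uncertainties pull the fitted line less than precisely measured ones.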