Quantitative bias estimates for tropospheric NO2 columns retrieved from SCIAMACHY, OMI, and GOME-2 using a common standard

For the intercomparison of tropospheric nitrogen dioxide (NO2) vertical column density (VCD) data from three different satellite sensors (SCIAMACHY, OMI, and GOME-2), we use a common standard to quantitatively evaluate the biases for the respective data sets. As the standard, a regression analysis using a single set of collocated ground-based Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) observations at several sites in Japan and China from 2006 to 2011 is adopted. Examinations of various spatial coincidence criteria indicate that the slope of the regression line can be influenced by the spatial distribution of NO2 over the area considered. While the slope varies systematically with the distance between the MAX-DOAS and satellite observation points around Tokyo in Japan, such a systematic dependence is not clearly seen and correlation coefficients are generally higher in comparisons at sites in China. On the basis of these results, we focus mainly on comparisons over China and estimate the biases in SCIAMACHY, OMI, and GOME-2 data (TM4NO2A and DOMINO version 2 products) against the MAX-DOAS observations to be −5 ± 14 %, −10 ± 14 %, and +1 ± 14 %, respectively, which are all small and insignificant. We suggest that these small biases now allow for analyses combining these satellite data for air quality studies, which are more systematic and quantitative than previously possible.


Introduction
Three satellite sensors, SCIAMACHY (SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY) (Bovensmann et al., 1999), OMI (Ozone Monitoring Instrument) (Levelt et al., 2006), and GOME-2 (Global Ozone Monitoring Experiment-2) (Callies et al., 2000), were all in orbit together until April 2012, observing tropospheric nitrogen dioxide (NO2) pollution on a global scale and providing long-term data records (since 2002) of vertical column densities (VCDs). Observations by these satellite sensors were performed at different local times, and the diurnal variation pattern seen in the NO2 data has been reported for various locations around the world (Boersma et al., 2008). However, the diurnal cycle observed by SCIAMACHY and OMI has been validated only over the Middle East, a region with highly active photochemistry (Boersma et al., 2009). The observations of the diurnal variation are expected to provide additional constraints to improve models, beyond a single VCD data set at a specific local time (e.g., Lin et al., 2010). The combined use of SCIAMACHY, OMI, and GOME-2 data is desirable to improve our understanding of short-term variations in chemistry, emissions, and transport of pollution. There have been, however, few studies attempting to quantify the biases in SCIAMACHY, OMI, and GOME-2 data in a consistent manner based on comparisons with independent observations. In East Asia, validation comparisons for specific satellite data sets are very limited, except for the NASA OMI standard product (Irie et al., 2009). Here we present a consistent data set based on Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) observations performed at several sites in Japan and China from 2006 to 2011. Because MAX-DOAS provides continuous measurements during the daytime, its data are used as a common reference to validate all three satellite data sets. The present work focuses on estimating the representative bias between satellite and MAX-DOAS NO2 VCD data over East Asia for each satellite data set.

Satellite observations
The present study targets tropospheric NO2 VCD data from SCIAMACHY, OMI, and GOME-2, all of which are UV/visible sensors measuring sunlight backscattered from the Earth's atmosphere and reflected by the surface, as well as the direct solar irradiance spectrum. SCIAMACHY was launched onboard the ENVISAT satellite in March 2002. It passes over the equator at about 10:00 LT and achieves global coverage in six days, with a spatial resolution of 60 × 30 km^2. OMI was launched aboard the Aura satellite in July 2004. The equator crossing time is about 13:40-13:50 LT. Daily global measurements are achieved by a wide field of view (FOV) of 114°, in which 60 discrete viewing angles (at a nominal nadir spatial resolution of 13 × 24 km^2) are distributed perpendicular to the flight direction. The GOME-2 instrument, launched aboard a MetOp satellite in June 2006, has a ground-pixel size of 80 × 40 km^2 (240 × 40 km^2 for the back scan) over most of the globe. With its wide swath, near-global coverage (with an equator crossing time around 09:30 LT) is achieved every day. While observation specifications are thus somewhat different between the three sensors, tropospheric NO2 VCD data retrieved with the same basic algorithm (DOMINO products for OMI and TM4NO2A products for SCIAMACHY and GOME-2) (Boersma et al., 2004, 2007, 2011) are compared in detail with MAX-DOAS data below. The error in the satellite tropospheric NO2 VCD data includes uncertainties in the slant column, the stratospheric column, and the tropospheric air mass factor (AMF) (Boersma et al., 2004), and can be expressed as ∼ 1 × 10^15 molecules cm^-2 + 30 % for polluted situations. Comparisons are made for the version 2 retrievals under cloud-free conditions, i.e., cloud fraction (CF) less than 20 %.
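The quoted error expression and the cloud screening can be illustrated with a short sketch. It assumes that the error is the simple sum of the absolute and relative terms and that pixels are held as plain dictionaries; the function names, field names, and sample values are hypothetical and are not taken from the DOMINO or TM4NO2A product files.

```python
# Illustrative sketch only: names and sample values are hypothetical.

ABS_ERR = 1.0e15  # molecules cm^-2, additive term quoted in the text
REL_ERR = 0.30    # 30 % relative term quoted in the text
CF_MAX = 0.20     # cloud-fraction threshold used for "cloud-free" scenes


def satellite_vcd_error(vcd):
    """Approximate error of a tropospheric NO2 VCD (molecules cm^-2).

    Assumes the two quoted terms are simply added.
    """
    return ABS_ERR + REL_ERR * abs(vcd)


def is_cloud_free(cloud_fraction):
    """True when the cloud fraction is below the 20 % threshold."""
    return cloud_fraction < CF_MAX


# Made-up pixels, for demonstration only.
pixels = [
    {"vcd": 1.0e16, "cf": 0.05},
    {"vcd": 2.0e16, "cf": 0.35},  # rejected: cloud fraction too high
    {"vcd": 5.0e15, "cf": 0.19},
]

kept = [p for p in pixels if is_cloud_free(p["cf"])]
errors = [satellite_vcd_error(p["vcd"]) for p in kept]
```

Under this reading, a polluted pixel of 1 × 10^16 molecules cm^-2 carries an error of about 4 × 10^15 molecules cm^-2, i.e., roughly 40 % of the column.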

MAX-DOAS observations
Here we briefly describe ground-based MAX-DOAS measurements (scattered sunlight observations in the UV/visible at several elevation angles between the horizon and zenith; e.g., Hönninger and Platt, 2002; Hönninger et al., 2004) performed at three sites in Japan and three sites in China (Table 1 and Fig. 1). As can be seen in Fig. 1, the MAX-DOAS measurements were conducted at various levels of NO2 pollution, covering urban (Yokosuka) and suburban (Tsukuba) areas around Tokyo and a remote area (Hedo) in Japan, and the northernmost (Mangshan), middle (Tai'an), and southernmost (Rudong) parts of the highly polluted area in China. This set of observations extends the data set used by Irie et al. (2009) for the validation of the NASA OMI NO2 standard product. The present study additionally uses data for 2009-2011 and data from the Mangshan and Rudong sites. The observations at Tai'an, Mangshan, and Rudong were made as part of intensive observation campaigns for a limited time period of about 1 month for each site (Table 1). The instrumentation and retrieval algorithm used for all the sites have been described in detail elsewhere (e.g., Irie et al., 2008, 2009, 2011; Takashima et al., 2011, 2012). The retrieval utilizes absorption features of NO2 and the oxygen dimer (O4) at 460-490 nm. The NO2 absorption cross section data of Vandaele et al. (1998) at 294 K were used. The quality of our DOAS analysis is supported by formal semi-blind intercomparison results indicating agreement to within ∼ 10 % with other MAX-DOAS instruments for both NO2 and O4 differential slant column densities (ΔSCD) and for both the UV and visible regions (Roscoe et al., 2010).
The O4 SCD values derived from the DOAS analysis are converted using our aerosol retrieval algorithm (e.g., Irie et al., 2008) to aerosol optical depth and the vertical profile of the aerosol extinction coefficient. At the same time, the so-called box AMF is uniquely determined, as it is a function of the aerosol profile. Using this AMF information and a nonlinear iterative inversion method, the NO2 SCD values are converted to the tropospheric VCD and the vertical profile of NO2. Error analysis for the retrieved NO2 VCDs has been done based on the method described by Irie et al. (2011). For an NO2 VCD of about 100 × 10^14 molecules cm^-2, typical random errors were estimated to be 5 × 10^14 molecules cm^-2 (5 %). Systematic errors due to uncertainty in the AMF determination, which is likely the dominant source of systematic error in our profile retrieval method, were estimated to be 7 × 10^14 molecules cm^-2 (7 %). For the present study, additional sensitivity analysis is performed using a different fitting window for NO2 (425-450 nm) and different NO2 cross section data (at 220 K). The errors were estimated in a manner similar to Takashima et al. (2012) to be about −3 % (the VCD retrieved from 425-450 nm is smaller) and −23 % (the VCD retrieved using the cross section at 220 K is smaller).
Scaling the latter estimate to the actual temperature variation below 2 km (possibly cooled down to ∼ 260 K at an altitude of 2 km) yields −11 %. This value could be smaller, since NO2 should be abundant near the surface, where the temperature is usually warmer than 260 K and occasionally can exceed 294 K. However, we quantified the overall uncertainty to be 14 % as the root-sum-square of all the above estimated errors (5 %, 7 %, 3 %, and 11 %). The representative horizontal distance for air masses observed by MAX-DOAS was estimated to be about 10 km (Irie et al., 2011), a scale comparable to or finer than the spatial resolution of the satellite observations. The temporal resolution was 30 min, which corresponds to a complete sequence of elevation angles. In the present study, a comparison is made only when the time difference between MAX-DOAS and satellite observations was less than 30 min.
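The arithmetic behind the −11 % and 14 % values can be checked with a brief sketch. It assumes that the cross-section error scales linearly with temperature between 294 K and 220 K, which is not stated explicitly in the text, and that the individual terms are combined in quadrature; the variable names are illustrative.

```python
import math

# Error terms quoted in the text, in per cent of the NO2 VCD.
random_err = 5.0     # random error
amf_err = 7.0        # systematic error from the AMF determination
window_err = -3.0    # 425-450 nm fitting window instead of 460-490 nm
xs_220k_err = -23.0  # cross section at 220 K instead of 294 K

# Assumption: the cross-section effect scales linearly with temperature.
T_REF, T_COLD, T_2KM = 294.0, 220.0, 260.0
xs_err = xs_220k_err * (T_REF - T_2KM) / (T_REF - T_COLD)

# Combine all terms in quadrature (root-sum-square).
terms = (random_err, amf_err, window_err, xs_err)
total_err = math.sqrt(sum(e ** 2 for e in terms))

print(f"scaled cross-section error: {xs_err:.1f} %")
print(f"overall uncertainty: {total_err:.1f} %")
```

With these inputs the scaled cross-section term rounds to −11 % and the combined uncertainty rounds to 14 %, consistent with the values given above.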

Results and discussion
Here we compare MAX-DOAS observations performed in Japan and China from 2006 to 2011 with all three types of satellite products in a consistent manner. The regression analysis has been made with the intercept forced to be zero, in order to simplify the interpretation of changes in the bias estimated from the slope of the regression line under various conditions. When the intercept is set as a variable, it is calculated to be 1.4 × 10^15 molecules cm^-2. This is small compared to the range of NO2 VCD data plotted in Fig. 2 but is larger than the error quoted for the satellite retrieval (∼ 1.0 × 10^15 molecules cm^-2 + 30 %). Furthermore, comparisons between MAX-DOAS and GOME-2 under the same conditions as in Fig. 2 yield an intercept as large as 3.0 × 10^15 molecules cm^-2, which is inconsistent with that estimated from the comparisons with OMI. We found that the intercept, and therefore the slope, tends to be affected at least when the number of comparisons is too small. This could complicate the interpretation of the change in slopes over different sensors and various coincidence criteria.
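The difference between a fit forced through the origin and a fit with a free intercept can be sketched as follows. This is a minimal ordinary least-squares illustration, not the authors' analysis code; it assumes that MAX-DOAS VCDs are the independent variable and satellite VCDs the dependent variable, and the sample values (in units of 10^16 molecules cm^-2) are invented.

```python
import math


def fit_through_origin(x, y):
    """Least-squares slope of y = a*x and its standard error."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx
    residuals = [yi - slope * xi for xi, yi in zip(x, y)]
    dof = len(x) - 1  # one fitted parameter
    variance = sum(r * r for r in residuals) / dof
    return slope, math.sqrt(variance / sxx)


def fit_with_intercept(x, y):
    """Least-squares slope and intercept of y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented sample: MAX-DOAS (x) and satellite (y) VCDs, 10^16 molecules cm^-2.
maxdoas = [0.5, 1.0, 2.0, 3.0, 4.0]
satellite = [0.6, 1.1, 1.9, 3.2, 4.1]

slope0, slope0_se = fit_through_origin(maxdoas, satellite)
slope1, intercept1 = fit_with_intercept(maxdoas, satellite)
```

With only a handful of points, the free intercept can absorb part of the signal and shift the slope, which is the motivation given above for constraining the intercept to zero when comparing sensors and coincidence criteria.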
In Fig. 2, we find that the slope (± its 1σ standard deviation) is almost unity at 1.03 ± 0.02 for the China case. The correlation coefficient (R^2) is as high as 0.78. On the other hand, for the Tokyo case the slope and R^2 are 0.63 ± 0.01 and 0.69, respectively. When we perform the correlation analysis under conditions similar to those for the comparisons at the Chinese sites, in terms of the number of data points and season, the slope and R^2 are found to be essentially unchanged at 0.68 ± 0.04 and 0.64, respectively. Also, analysis made under the same aerosol conditions using the MAX-DOAS aerosol optical depth (AOD) data at a wavelength of 476 nm reveals an insignificant impact by aerosols; at AOD smaller (greater) than 0.8 the slopes for the China and Tokyo cases are 1.04 ± 0.04 (1.02 ± 0.05) and 0.64 ± 0.02 (0.62 ± 0.05), respectively. The AOD threshold of 0.8 is taken from the statistics of retrieved AOD values at Tai'an (Table 3), in order to evenly distribute the data to high and low AOD cases.
For the Tokyo case, it can be seen that the slopes of the regression lines tend to be smaller when a looser coincidence criterion (a larger x) is used, for all comparisons with SCIAMACHY, OMI, and GOME-2. It is thought that tropospheric NO2 VCD values in the areas surrounding the Yokosuka and Tsukuba sites usually drop quickly with distance, owing to limited NOx source regions. For a larger x, there should be a higher probability that the satellite footprints include clean air masses, and this can lower both the slope and R^2. These expected features of the spatial distribution are confirmed using the satellite data alone (Fig. 6). In Fig. 6, the satellite tropospheric NO2 VCD values selected for regression analysis are plotted as a function of a given coincidence criterion x for each measurement site. Only data compared with MAX-DOAS are used.
The NO2 VCD values are expressed as differences from the NO2 VCD at x = 0.50°. While values other than 0.50° can be used to check the dependence of NO2 VCD on x, we choose the value of 0.50°, which provides robust statistics as the standard for all sensors at a relatively small x.
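The normalization to x = 0.50° can be sketched as below. The exact geometric definition of the coincidence criterion is not given in this section, so the sketch assumes that a satellite pixel is selected when its centre lies within ±x degrees of the site in both latitude and longitude; the site coordinates, pixel values, and function names are hypothetical.

```python
def within_criterion(site, pixel, x_deg):
    """Assumed definition: pixel centre within +/- x_deg in lat and lon."""
    return (abs(pixel["lat"] - site["lat"]) <= x_deg
            and abs(pixel["lon"] - site["lon"]) <= x_deg)


def mean_vcd(site, pixels, x_deg):
    """Mean VCD of the pixels selected by the criterion, or None if empty."""
    selected = [p["vcd"] for p in pixels if within_criterion(site, p, x_deg)]
    return sum(selected) / len(selected) if selected else None


def percent_diff_from_reference(site, pixels, x_values, x_ref=0.50):
    """Difference (%) of the mean VCD at each x from the mean at x_ref."""
    ref = mean_vcd(site, pixels, x_ref)
    result = {}
    for x in x_values:
        mean = mean_vcd(site, pixels, x)
        if mean is None or ref is None:
            result[x] = None
        else:
            result[x] = 100.0 * (mean - ref) / ref
    return result


# Hypothetical site and pixels: VCD decreasing away from the site.
site = {"lat": 35.0, "lon": 140.0}
pixels = [
    {"lat": 35.05, "lon": 140.05, "vcd": 2.0e16},
    {"lat": 35.30, "lon": 140.20, "vcd": 1.0e16},
    {"lat": 35.80, "lon": 140.00, "vcd": 0.4e16},
]

diffs = percent_diff_from_reference(site, pixels, [0.10, 0.50, 1.00])
```

In this invented example the difference is positive at x = 0.10° and negative at x = 1.00°, the pattern expected for a site surrounded by cleaner air.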
The Yokosuka site is surrounded by industrial facilities, ocean (Tokyo Bay), heavy ship activity, etc., resulting in a large range of tropospheric NO2 VCDs but more scatter in the correlation, compared to the Tsukuba data (Figs. 2 and 3). To better address such influences of spatial inhomogeneity within a satellite pixel, validation observations covering several points in a satellite pixel at the same time would be desirable (e.g., Piters et al., 2012).
In Fig. 7, monthly-mean values of tropospheric NO2 VCDs retrieved from satellite observations over Yokosuka are plotted. The color represents the spatial grid size for
Results of the estimated slopes and R^2 for the China case are shown in Fig. 5. Results with an insufficient number of comparisons (less than 3) at Chinese sites have been omitted. It can be seen that the slopes vary slowly with x, but the variations are not as systematic as those of the Tokyo case. R^2 values are greater than 0.6 for all comparisons and usually higher than those for the Tokyo case (Figs. 4 and 5). Furthermore, dependencies of satellite-retrieved tropospheric NO2 VCDs on x are not as systematic as those seen at sites of the Tokyo case (Fig. 6). These results suggest that the spatial distributions of tropospheric NO2 VCDs around the Chinese sites during the observation periods were rather homogeneous and therefore appropriate for bias estimates.
For the China case, by simply averaging the slopes over the entire x range, the biases with respect to MAX-DOAS observations are estimated to be 0 ± 14 %, −8 ± 14 %, and −10 ± 14 % for SCIAMACHY, OMI, and GOME-2, respectively (Table 5). The error is calculated as the root-sum-square of the uncertainty of the slope and the uncertainty of the MAX-DOAS NO2 retrieval. It is expected, however, that the validation comparison can be more precise using a stricter coincidence criterion, owing to the increased probability that a satellite sensor and MAX-DOAS observe the same air masses. Considering this, our best estimates of the biases from slopes at a strict x range below 0.50° are −5 ± 14 %, −10 ± 14 %, and +1 ± 14 % for SCIAMACHY, OMI, and GOME-2, respectively (Table 5). Thus, we conclude that the biases are less than about 10 % and insignificant for all three data sets.
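The conversion from a regression slope to a bias and its error can be sketched as follows. The sketch assumes that the bias is the deviation of a zero-intercept slope from unity and that the slope uncertainty and the 14 % MAX-DOAS retrieval uncertainty are combined in quadrature, as described above; the function name is illustrative. The input is the single OMI slope quoted for the China case in Fig. 2 (1.03 ± 0.02 at x = 0.20°), not the averaged slopes behind Table 5.

```python
import math

MAXDOAS_UNCERTAINTY = 14.0  # %, overall MAX-DOAS NO2 VCD uncertainty


def bias_from_slope(slope, slope_sd):
    """Bias (%) relative to MAX-DOAS and its root-sum-square error (%)."""
    bias = (slope - 1.0) * 100.0
    error = math.sqrt((slope_sd * 100.0) ** 2 + MAXDOAS_UNCERTAINTY ** 2)
    return bias, error


# OMI, China case, x = 0.20 deg (values quoted in the text for Fig. 2).
bias, error = bias_from_slope(1.03, 0.02)
print(f"bias = {bias:+.0f} +/- {error:.0f} %")
```

Because the slope uncertainties are a few per cent at most, the combined error is dominated by the MAX-DOAS term, which is why every bias estimate carries essentially the same ± 14 %.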
Note that, considering the error quoted for satellite retrievals (∼ 1 × 10^15 molecules cm^-2 + 30 %), the estimated biases may be invalid, at least for NO2 VCD values smaller than ∼ 1 × 10^15 molecules cm^-2. Also, the estimated biases could vary with the choice of the x range, but we note that the biases derived from the slopes are all less than 20 % in magnitude, irrespective of the choice of x (Fig. 5). In the above bias estimates, the slopes for the Tokyo case are not included, as the slopes vary significantly with x. However, very similar slopes are obtained for all comparisons with SCIAMACHY, OMI, and GOME-2 in the Tokyo case (Fig. 4), supporting the conclusion that differences among biases for all sensors are small, as found from the China case. Thus, our study confirms the hypothesized consistent quality of the KNMI products retrieved with the new method of Boersma et al. (2011).
Finally, there is the possibility that the biases between satellite and MAX-DOAS data are not constant over location and time. To address this issue, precise validation of MAX-DOAS retrievals and/or more systematic MAX-DOAS observations would be essential.

Conclusions
To quantify the biases in the tropospheric NO2 VCD data from SCIAMACHY, OMI, and GOME-2 in a consistent manner, we created a single data set from MAX-DOAS observations performed at three sites in Japan and three sites in China from 2006 to 2011. Regression analysis between satellite and MAX-DOAS tropospheric NO2 VCDs showed that the slope of the regression line tends to be biased by the distance between MAX-DOAS and satellite observation points, due to a difference in the spatial representativeness between MAX-DOAS and satellite observations under loose coincidence criteria. This feature is more clearly seen around Tokyo, with strong spatial gradients in air pollution. These results serve as a guideline for future satellite validation, in terms of the choice of coincidence criteria and validation sites. We recommend conducting validation observations under relatively homogeneously polluted conditions. From the slopes of the regression lines for strict coincidence criteria, we estimated biases in SCIAMACHY, OMI, and GOME-2 data to be −5 ± 14 %, −10 ± 14 %, and +1 ± 14 %, respectively, compared to the MAX-DOAS data. Thus, we conclude that the biases are less than about 10 % and insignificant for all three data sets. With a consideration of these characteristics, the present study encourages the combination of these satellite data to realize air quality studies that are more systematic and quantitative than previously possible.
Fig. 2. Correlations between tropospheric NO2 VCDs (10^16 molecules cm^-2) from OMI and MAX-DOAS observations at a coincidence criterion (x) of 0.20°. Comparisons over Tsukuba, Hedo, and Yokosuka are shown in blue, green, and gray, respectively, and campaign-based short-term observations in China are shown in red. Error bars for both OMI and MAX-DOAS data are shown only for comparisons over Tsukuba at MAX-DOAS NO2 VCDs larger than 1 × 10^16 molecules cm^-2, for clarity. Linear regression analysis has been performed for the respective cases 1 (Tokyo case; blue) and 2 (Chinese case; red), where the slopes of their regression lines are constrained mainly by comparisons made around Tokyo (Tsukuba and Yokosuka) and at China sites (Tai'an, Mangshan, Rudong), respectively. Hedo data are used in both regression analyses but do not constrain the slope much, since the comparisons at other sites are made over a wide range of NO2 VCD values. For each case, the slope, correlation coefficient (R^2), and number of data points (N) are given in the plot.

Fig. 4. (a) Slopes and (b) R^2 of the regression lines as a function of coincidence criterion x between satellite and MAX-DOAS observations for case 1 (Tokyo case).

Fig. 6. Dependence of satellite-retrieved tropospheric NO2 VCDs on the coincidence criterion x over each measurement site. Differences from VCDs at x = 0.50° are shown in per cent. Mean values for data compared with MAX-DOAS are plotted. Error bars represent 1σ standard deviations.


Fig. 7. Dependence of satellite-retrieved tropospheric NO2 VCDs on the spatial grid size used for averaging (corresponding to the coincidence criterion x) around Yokosuka for each month in 2010. All available cloud-free satellite data are used.

Table 1. Site information for MAX-DOAS observations.

Table 5. Estimated biases in satellite tropospheric NO2 products for different coincidence criterion thresholds.