Validation of six years of TES tropospheric ozone retrievals with ozonesonde measurements: implications for spatial patterns and temporal stability in the bias

In this analysis, Tropospheric Emission Spectrometer (TES) V004 nadir ozone (O3) profiles are validated with more than 4400 coinciding ozonesonde measurements taken across the world from the World Ozone and Ultraviolet Radiation Data Centre (WOUDC) during the period 2005–2010. The TES observation operator was applied to the sonde data to ensure a consistent comparison between TES and ozonesonde data, i.e. without the influence of the a priori O3 profile needed to regularize the retrieval. Generally, TES V004 O3 retrievals are biased high by 2–7 ppbv (7–15 %) in the troposphere, consistent with validation results from earlier studies. Because TES has up to two degrees of freedom for signal in the troposphere, we can distinguish between upper and lower troposphere mean biases, which range from −0.4 to +13.3 ppbv in the upper troposphere and from +3.9 to +6.0 ppbv in the lower troposphere. Focusing on the 464 hPa retrieval level, broadly representative of free tropospheric O3, we find differences in the TES biases for the tropics (+3 ppbv, +7 %), sub-tropics (+5 ppbv, +11 %), and northern (+7 ppbv, +13 %) and southern mid-latitudes (+4 ppbv, +10 %). The relatively long-term record (6 yr) of TES–ozonesonde comparisons allowed us to quantify temporal variations in TES biases at 464 hPa. We find no discernible trends in the biases in any of these latitudinal bands; temporal variations in the bias are typically within the uncertainty of the difference between TES and ozonesondes. Establishing these bias patterns is important in order to make meaningful use of TES O3 data in applications such as model evaluation, trend analysis, or data assimilation.


Introduction
Satellite measurements provide a relatively new perspective on global tropospheric ozone (O3) distributions and their changes in time. The Tropospheric Emission Spectrometer (TES) on board the EOS-Aura satellite has provided retrievals of three-dimensional global distributions of tropospheric O3 since the second half of 2004 (Beer et al., 2001; Beer, 2006). TES measures upwelling radiances in the infrared part of the spectrum and uses the absorption by O3 around 9.6 µm to infer the vertical distribution of O3 with 4-6 degrees of freedom (DOFs) for signal, including 1-2 in the troposphere, as compared to 0.5 to 1 DOF for UV/VIS sensors (e.g. Liu et al., 2010; Zhang et al., 2010). Despite its limited coverage (global in a month) compared to its European counterpart, the Infrared Atmospheric Sounding Interferometer (IASI; global coverage in 1-2 days) (Clerbaux et al., 2009), which has been available since 2007, the 6 yr record makes TES in principle a suitable sensor to investigate multiannual changes in tropospheric O3 from space.
Here we validate the TES O3 data for their observational performance and fitness for model evaluation, and investigate whether their quality has improved after retrieval updates from version 2 to version 4/5, using a relatively long-term data set. Another aim of our study is to revisit the assertion by Nassar et al. (2008) that biases in TES do not appear to depend on location or season. Such statements about the quality are important in view of the need for optimal bias corrections when using the TES O3 data in model evaluation studies. Another important quality indicator of the TES O3 data is whether any biases are constant in time or reflect some degree of instrument degradation. This is especially important for users interested in applying the TES O3 data in trend analyses.
Initial validation of TES O3 data was carried out by Worden et al. (2007), who compared a limited number of early-mission retrievals to ozonesonde measurements (taken between September and November 2004), using the first version of TES nadir O3 data (V001). A more elaborate validation analysis of version 2 (V002) data was conducted by Nassar et al. (2008), who examined approximately 1600 TES and ozonesonde coincidences from October 2004 to October 2006. Apart from comparisons with ozonesondes, Richards et al. (2008) used aircraft observations during the Intercontinental Chemical Transport Experiment-B (INTEX-B) to validate TES tropospheric O3 profiles, and Osterman et al. (2008) evaluated TES measurements with OMI-derived total, stratospheric, and tropospheric O3 columns. The results of these validation studies all pointed to TES tropospheric O3 concentrations overestimating the reference data, with an average high bias of 3-11 ppbv. Boxe et al. (2010) performed a study in which the Aura TES instrument took several measurements corresponding to ozonesonde launches timed to match the Aura overpass. Their study confirmed the previously quantified bias between TES and ozonesondes, and also showed that the calculated random error of the TES O3 estimates is consistent with the actual error (RMS difference between TES O3 estimate and sonde).
Our purpose is to extend the TES versus ozonesonde comparison exercise from Worden et al. (2007) and Nassar et al. (2008) by using version 4 (V004) data ranging from 2005 to 2010 and by using ozonesondes extracted from the WOUDC database (http://www.woudc.org/). A proper comparison between TES and ozonesondes requires that the vertical resolution and sensitivity of TES are taken into account by applying the TES observation operator (the TES averaging kernel and a priori constraint) to the sonde data. This operation yields a vertical profile for the sonde data that represents what TES would estimate if the sonde profile can be considered representative of the true atmospheric state (Rodgers and Connor, 2003). Differences between TES O3 profiles and ozonesonde profiles with the TES operator applied are compared for different latitudinal zones, and TES-sonde O3 biases are estimated. We also evaluate the temporal stability of the O3 biases by testing the null hypothesis of zero slope in a linear regression fitted to the compiled time series of the TES-sonde bias for the period 2005-2010.

Tropospheric ozone retrieval from TES
TES is an infrared Fourier transform spectrometer (Beer et al., 2001; Beer, 2006) on board the NASA Earth Observing System Aura (EOS-Aura) platform. The Aura satellite is in a near-polar, sun-synchronous orbit with an Equator crossing time of 13:40 local mean solar time for the ascending part of the orbit. TES is predominantly nadir viewing and measures radiance spectra of Earth's atmosphere at frequencies between 650 and 2250 cm⁻¹ (3.3-15.4 µm) with a spectral resolution of 0.1 cm⁻¹. Along with O3 profiles, atmospheric temperature, and water vapour concentrations, other atmospheric and surface variables are also derived from TES radiance spectra. The nadir vertical profiles are spaced 1.6° apart along the orbit track and have a footprint of approximately 5 × 8 km² (Beer et al., 2001; Beer, 2006). The TES data are available at the TES website: http://tes.jpl.nasa.gov/data/. TES retrievals (described in detail by Bowman et al., 2006; Clough et al., 2006; Kulawik et al., 2006a) are based on iteratively fitting radiative transfer forward model simulations, which relate the atmospheric state to radiance values using atmospheric temperature and constituent profiles as well as the surface properties, to the radiances observed at the TES sensor. The measured at-sensor radiance values contain information on upwelling radiation, attenuated surface emission and reflected downwelling radiation. In order to retrieve the concentration of species like O3 in the atmospheric profile, a cost function is minimized, such that the differences between simulated and observed radiances, and between the retrieved and initial guess (a priori) state vectors, are as small as possible. The state vector, containing surface pressure, gridded temperature, constituent mixing ratios (including O3), nadir viewing angle, instrument line shape and others, specifies the elements of the state of the atmosphere being measured and of the instrument characteristics. Non-linear spectral fitting of retrieval parameters is applied based upon a Levenberg-Marquardt minimization algorithm described in Bowman et al. (2006).
The a priori O3 profile information is taken from monthly mean simulations from the MOZART model (Brasseur et al., 1998) and averaged over 10° × 60° grids (latitude by longitude). Thus, TES retrievals contain a mixture of observed information where TES sensitivity is high, and assumed a priori information where TES sensitivity is low. Using the concept of the averaging kernel (AK), the vertical sensitivity of TES-retrieved O3 can be evaluated. The averaging kernel defines the relative contribution of each element of the true state to the retrieved estimate at a particular pressure level (Rodgers, 2000). The trace of the averaging kernel matrix defines the number of degrees of freedom (DOFs) for signal. The DOFs are not constant for all TES retrievals but depend on the temperature contrast between surface and atmosphere, the retrieved state (Bowman et al., 2006), and cloud optical depth (Kulawik et al., 2006b; Eldering et al., 2008). In general, the TES vertical resolution for O3 profiles is 6-7 km, corresponding to ∼ 1-2 degrees of freedom in the troposphere, with the highest number of DOFs for the clear-sky tropics and subtropics, where TES can distinguish between lower and upper tropospheric O3 (Jourdain et al., 2007). A more formal description of the averaging kernel is given in Sect. 4. Worden et al. (2007) showed that TES errors are dominated by the TES radiance noise and the interference from other species in the retrieval, with error budgets of up to 15-20 %.
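As a minimal illustration of the DOFs definition above, the following Python sketch sums the diagonal of an averaging kernel matrix. The function name `degrees_of_freedom`, the optional `rows` argument and the 3-level kernel are hypothetical and invented for illustration; restricting the sum to tropospheric levels is a common convention that is assumed here, not something taken from the TES processing code.

```python
def degrees_of_freedom(avg_kernel, rows=None):
    """Return the trace of a square averaging kernel (list of lists).

    rows: optional iterable of level indices. Passing only the
    tropospheric level indices gives a partial trace, used here as an
    assumed definition of the tropospheric DOFs.
    """
    n = len(avg_kernel)
    if any(len(row) != n for row in avg_kernel):
        raise ValueError("averaging kernel must be square")
    indices = range(n) if rows is None else rows
    return sum(avg_kernel[i][i] for i in indices)


# Hypothetical 3-level kernel; the values are invented, not TES data.
A = [[0.5, 0.2, 0.0],
     [0.2, 0.5, 0.1],
     [0.0, 0.1, 0.25]]

total_dofs = degrees_of_freedom(A)               # all levels
lower_dofs = degrees_of_freedom(A, rows=[0, 1])  # two lowest levels only
```

A real TES kernel has 67 levels; the partial trace would then run over the levels between the surface and the tropopause.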
Figure 1 shows the seasonal mean (June-July-August, hereafter JJA) global distribution of tropospheric O3 (at 464 hPa) averaged for the period 2005-2010 as retrieved by TES when using a global and annual mean (MOZART) a priori O3 profile. Using a universal a priori O3 profile, instead of the MOZART a priori that changes incrementally every 60° longitude and 10° latitude, removes structure in the retrieval that is not actually measured. Reprocessing the TES profiles using a universal a priori is based on the method described in Zhang et al. (2006): the difference between both a priori profiles (MOZART 10° × 60° and MOZART global mean), multiplied by the difference between the unity matrix and the averaging kernel matrix, is added to the original O3 retrievals. In Fig. 1, we see relatively low O3 values over the tropical oceans, where O3 is generally destroyed through photolysis and subsequent reactions with water vapour (e.g. Crutzen, 1979). In contrast, O3 concentrations are high over and downwind of polluted regions with strong precursor emissions (i.e. NOx and volatile organic compounds, VOCs), ample sunlight, and favourable dynamical conditions, such as the Mediterranean, the Middle East, eastern China, and central Africa (e.g. Liu et al., 2009; Worden et al., 2009).
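The a priori replacement described above can be sketched in a few lines of Python. This is only an illustration of the linear relation x_new = x_old + (I − A)(x_a,new − x_a,old); the function name `swap_prior`, the use of natural-log volume mixing ratios and the two-level example values are assumptions made here, not details taken from Zhang et al. (2006) or the TES software.

```python
import math


def swap_prior(x_ret, x_prior_old, x_prior_new, avg_kernel):
    """Re-reference a retrieved profile to a different a priori profile.

    Implements x_new = x_ret + (I - A) (x_prior_new - x_prior_old).
    All profiles are assumed to be in ln(VMR) on the same pressure grid.
    """
    n = len(x_ret)
    delta = [new - old for new, old in zip(x_prior_new, x_prior_old)]
    out = []
    for i in range(n):
        # (A @ delta)[i]: the part of the prior change seen through the kernel
        smoothed = sum(avg_kernel[i][j] * delta[j] for j in range(n))
        out.append(x_ret[i] + delta[i] - smoothed)
    return out


# Hypothetical two-level example (values in ppbv are invented).
old_prior = [math.log(50.0), math.log(70.0)]
new_prior = [math.log(40.0), math.log(60.0)]
retrieved = [math.log(55.0), math.log(72.0)]

identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# Perfect sensitivity (A = I): the retrieval does not depend on the prior.
unchanged = swap_prior(retrieved, old_prior, new_prior, identity)
# No sensitivity (A = 0): the retrieval shifts by the full prior change.
shifted = swap_prior(old_prior, old_prior, new_prior, zeros)
```

The two limiting cases show why the reprocessing only alters the retrieval where TES sensitivity is low.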

Ozonesonde data from WOUDC
Vertical profiles of O3 concentrations are routinely measured in situ from balloon sondes launched from stations around the world, typically between 10:00 and 12:00 UTC, but also at other times (∼ 05:30, ∼ 24:00 UTC, etc.). The sondes provide measurements of O3, temperature, pressure, and humidity up to 35 km altitude with a vertical resolution of approximately 150 m. Here we take all O3 profiles measured from 2005 to 2010 that have been reported to the World Ozone and Ultraviolet Radiation Data Centre (WOUDC, http://www.woudc.org). WOUDC contains the largest amount of data on O3 and surface ultraviolet radiation measured by instruments mounted on ground-based, shipborne or airborne platforms. The majority of sonde measurements originate from Europe and North America, but sondes are also launched from a few stations in South America, Asia and Africa.
WOUDC O3 profiles have been measured with electrochemical concentration cell (ECC) (Komhyr, 1969) and Brewer-Mast (BM) sondes. Recent comparisons between tropospheric O3 measured by ECC and BM sondes confirmed the level of agreement to within 5 % and indicated that the observed differences between the sonde types are not significant at a 90 % confidence level (Stübi et al., 2008). Logan et al. (2012) reported mean biases of 0.9 ± 2.8 ppbv (one sigma) in the lower troposphere (681-580 hPa) between sonde data and MOZAIC time series at Frankfurt and Munich (1999-2008). In the higher troposphere (501-430 hPa), the biases are larger (1.7 ± 3.8 ppbv) for the same time period (Logan et al., 2012). Generally, WOUDC O3 data are sufficiently accurate and, irrespective of the sensor type (WOUDC, 2007), can be used to validate the TES O3 retrievals.

Application of the TES averaging kernel
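The TES estimate of the O3 profile, $\mathbf{x}_{\mathrm{TES}}$, is written as (Bowman et al., 2002)

$$\mathbf{x}_{\mathrm{TES}} = \mathbf{x}_a + \mathbf{A}\,(\mathbf{x}_{\mathrm{true}} - \mathbf{x}_a) + \boldsymbol{\varepsilon}, \quad (1)$$

with $\mathbf{x}_a$ the a priori O3 profile (as predicted by the MOZART CTM), $\mathbf{A}$ the averaging kernel that provides the sensitivity of the retrieval to the true O3 profile $\mathbf{x}_{\mathrm{true}}$, and $\boldsymbol{\varepsilon}$ an error term representing the propagation of spectral noise and the interference effects from other absorbing species into the retrieved state. The estimate, $\mathbf{x}_{\mathrm{TES}}$, as well as the true state and a priori are in units of log(VMR), where VMR is the volume mixing ratio.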
Since the original sonde data are provided on various irregular pressure grids, all sonde data are interpolated to a fine-level pressure grid (800 levels from 1260 hPa to 0.46 hPa). Subsequently a mapping matrix is used to interpolate the sonde data to the 67-level pressure grid (from 1212 to 0.1 hPa) used in the TES retrievals. The profile $\mathbf{x}_{\mathrm{sonde}}$ measured by the ozonesonde, interpolated vertically to the TES vertical pressure grid, is then

$$\mathbf{x}_{\mathrm{sonde}} = \mathbf{x}_{\mathrm{true}} + \boldsymbol{\delta}, \quad (2)$$

with $\boldsymbol{\delta}$ describing the sonde errors, which are assumed to be uncorrelated, i.e. diagonal-only covariance (Worden et al., 2007). We then apply the TES averaging kernel to represent the O3 profile estimate that TES would retrieve if the instrument observed the same air mass as probed by the sonde:

$$\mathbf{x}_{\mathrm{sonde}}^{\mathrm{TES}} = \mathbf{x}_a + \mathbf{A}\,(\mathbf{x}_{\mathrm{sonde}} - \mathbf{x}_a). \quad (3)$$

The difference between TES and the sonde profile estimates is given by

$$\mathbf{x}_{\mathrm{TES}} - \mathbf{x}_{\mathrm{sonde}}^{\mathrm{TES}} = \mathbf{A}\,(\mathbf{x}_{\mathrm{true}} - \mathbf{x}_{\mathrm{sonde}}) + \boldsymbol{\varepsilon} = \boldsymbol{\varepsilon} - \mathbf{A}\boldsymbol{\delta}, \quad (4)$$

and this comparison is now independent of assumptions on the a priori profiles, making it a useful metric to identify other biases in the TES O3 profiles. Since the errors from the sonde profiles are much smaller ($\boldsymbol{\delta}$ is ±5 %) than the TES retrieval errors, we can safely assume that the TES-sonde difference (Eq. 4) is primarily an indicator of errors in the TES retrieval, as long as TES and sonde sampled very similar atmospheric air masses.
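The smoothing step described above, in which the sonde profile is passed through the TES averaging kernel and a priori, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the sonde profile is taken to be already interpolated to the TES pressure grid, log(VMR) is taken to be the natural logarithm, and the function name `apply_tes_operator` and all numerical values are hypothetical.

```python
import math


def apply_tes_operator(sonde_vmr, prior_vmr, avg_kernel):
    """Return the sonde profile as TES would see it, in VMR units.

    Computes x_a + A (x_sonde - x_a) in ln(VMR) space and converts the
    result back to VMR. Inputs must share the same pressure grid.
    """
    ln_sonde = [math.log(v) for v in sonde_vmr]
    ln_prior = [math.log(v) for v in prior_vmr]
    n = len(ln_prior)
    smoothed = []
    for i in range(n):
        increment = sum(avg_kernel[i][j] * (ln_sonde[j] - ln_prior[j])
                        for j in range(n))
        smoothed.append(math.exp(ln_prior[i] + increment))
    return smoothed


# Hypothetical two-level profiles in ppbv (invented values).
sonde = [60.0, 80.0]
prior = [50.0, 70.0]
tes = [58.0, 79.0]

# With a diagonal kernel of 0.5 the result is the geometric mean of
# sonde and prior at each level, because the operator acts on ln(VMR).
kernel = [[0.5, 0.0], [0.0, 0.5]]
sonde_as_tes = apply_tes_operator(sonde, prior, kernel)

# TES minus smoothed sonde, the quantity whose mean defines the bias.
difference_ppbv = [t - s for t, s in zip(tes, sonde_as_tes)]
```

Because the same a priori and kernel enter both the TES retrieval and the smoothed sonde, the a priori contribution cancels in `difference_ppbv`.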
Figure 2 illustrates a typical vertical profile of the TES averaging kernel, showing how the O3 retrieved at one specific pressure level is influenced by the total O3 profile over De Bilt, the Netherlands, on 14 July 2005 (11:23 UTC) for a clear day with a DOF of 3.98. This AK indicates that the sensitivity in the troposphere peaked well below 464 hPa (at approximately 600 hPa). The right panel of Fig. 2 shows the corresponding O3 profile retrieved by TES, the a priori O3 profile used in the TES retrieval, the O3 profile measured by the sonde launched from De Bilt on 14 July 2005, and the sonde estimate as TES would have observed it in the absence of any spatial or temporal sampling differences. We see that the O3 profiles from TES and from the sonde smoothed with the TES AK correspond closely, and deviate substantially from the a priori. The vertical detail in the original sonde profile is obviously smoothed by the AK. Only in the very lowest atmospheric layers (1000-800 hPa) does TES revert to its a priori values, reflecting the low sensitivity to O3 near the surface, but at 464 hPa TES retrievals are still sensitive to free tropospheric O3.

TES-sonde coincidence criteria and TES data screening
Conducting a proper evaluation of TES measurements with ozonesonde data as reference requires data pairs from both the satellite and sonde that return information from very similar air parcels. We follow the spatial and temporal coincidence criteria of 300 km and ±9 h proposed by Nassar et al. (2008), as these criteria are sufficiently loose to provide a large number of profiles for a statistically meaningful comparison, but sufficiently strict to warrant a high probability that TES and sonde sampled similar air masses. Worden et al. (2007), Osterman (2007), Nassar et al. (2008), and Boxe et al. (2010) all limited the temporal and spatial separation of both measurements while retaining a sufficient number of data pairs for statistical comparison. Worden et al. (2007), validating V001 TES data, used criteria of 600 km for the spatial and ±48 h for the temporal separation, which resulted in a total of 55 TES-sonde data pairs. Since more TES and sonde data were available for the validation study of V002 TES data, conducted by Nassar et al. (2008), more rigid constraints could be applied by using a maximum range of 300 km and a maximum time difference of ±9 h, resulting in approximately 1600 coincidences. We also used these criteria for validating V004 TES data with WOUDC sonde O3 data. Figure 3 (upper panel) presents a map of ozonesonde stations with their coinciding TES retrievals, showing that most TES-sonde pairs occur in the northern mid-latitudes because this region has the largest number of sonde launches (and stations). Before the coincidence criteria can be applied, the TES data are subject to screening, since clouds and suboptimal O3 retrievals degrade the quality of the TES O3 measurements. Profiles with thick high clouds in the field of view were removed because these obscure the infrared emission from the lower troposphere, greatly reducing TES sensitivity. Profiles with a cloud top pressure less than 750 hPa (cloud top height above approximately 2.5 km) and with an effective (cloud) optical depth larger than 2.0 are considered to be obscured by clouds. Moreover, TES retrievals with a radiance root mean square error (RMS) of more than 1.75 are also excluded from the analysis (Osterman, 2007), as this reflects a too large difference between observed and simulated radiances (e.g. a cost function that is not well minimized). Applying all these constraints in the TES O3 master quality flag resulted in a total of 4460 "good" coincidences for the full time range of the study from 2005 to 2010.
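The screening and coincidence rules in this section can be written down compactly. The Python sketch below is an illustration only: the thresholds (300 km, ±9 h, 750 hPa, optical depth 2.0, radiance RMS 1.75) come from the text, while the function names, the dictionary keys, the use of a haversine great-circle distance with a mean Earth radius of 6371 km, and the example records are assumptions. The actual TES master quality flag contains further checks that are not reproduced here.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def passes_screening(cloud_top_hpa, cloud_optical_depth, radiance_rms):
    """Reject thick high clouds and poorly fitted radiances."""
    cloud_obscured = cloud_top_hpa < 750.0 and cloud_optical_depth > 2.0
    return (not cloud_obscured) and radiance_rms <= 1.75


def is_coincident(tes, sonde, max_km=300.0, max_hours=9.0):
    """True if a TES profile and a sonde launch are close in space and time."""
    distance = great_circle_km(tes["lat"], tes["lon"],
                               sonde["lat"], sonde["lon"])
    time_gap = abs(tes["time"] - sonde["time"])
    return distance <= max_km and time_gap <= timedelta(hours=max_hours)


# Hypothetical records: a sonde at the De Bilt coordinates and two
# invented TES footprints.
sonde = {"lat": 52.10, "lon": 5.18, "time": datetime(2005, 7, 14, 11, 23)}
tes_near = {"lat": 53.0, "lon": 6.0, "time": datetime(2005, 7, 14, 12, 40)}
tes_far = {"lat": 56.0, "lon": 5.18, "time": datetime(2005, 7, 14, 12, 40)}

matched = is_coincident(tes_near, sonde)
too_far = is_coincident(tes_far, sonde)
```

Screening is applied before matching, so in practice `passes_screening` would filter the TES profiles first and `is_coincident` would then be evaluated only for the remaining ones.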

Comparison of TES and sonde ozone measurements
Previous versions of TES O3 retrievals have been compared to ozonesonde and aircraft-based measurements over relatively short periods, showing that TES O3 profiles are generally biased high by 3-10 ppbv in the troposphere (Worden et al., 2007; Nassar et al., 2008; Osterman et al., 2008; Richards et al., 2008). Here, we focus on validating version 4 (V004) TES O3 data retrieved for the six-year period from 2005 to 2010. Compared to the V002 TES data, improvements to the temperature and water retrievals resulted in slightly better agreement between calculated and actual uncertainties of the vertical O3 profile (Boxe et al., 2010). However, it is not clear that this changed any bias characteristics of the TES data. We define the TES bias as the mean difference between collocated TES and ozonesonde data pairs over a certain area of interest. We will return to the validity of this definition later. Figure 4 presents TES-sonde O3 (profile) differences for a number of latitude zones, taking into account all coincidences within the six-year (2005-2010) period, irrespective of season. We follow the comparison approach by Nassar et al. (2008) and show in the left panels the absolute O3 differences (TES − sonde) in the troposphere (1000-300 hPa). The right panels show the relative differences ((TES − sonde) × 100/sonde) for the full O3 profile (1000-1 hPa). Figure 4 shows that TES is generally biased high within the troposphere over all latitude zones by up to 10 ppbv, corresponding to relative differences up to +15 %. The TES bias varies as a function of pressure. For the cold Antarctic and Arctic, TES appears to be unbiased with respect to the sondes in the lower troposphere, but this actually reflects the next-to-zero sensitivity of TES to O3 in the lower atmosphere for situations with low brightness temperature. In such cases, the TES retrieval mostly provides the a priori information in the lower troposphere (see Eq. 1 and Fig. 2). The TES bias and standard deviations increase with altitude for the polar and northern mid-latitude regions, probably because of low tropopause levels (suppressing the DOFs for tropospheric O3) and strong variability in lower stratospheric and upper tropospheric O3 in those regions (because of stratosphere-troposphere exchange). Positive upper troposphere (UT) biases are also observed for other infrared sounders, for example IASI (Dufour et al., 2012). In contrast, over the tropics, where stratospheric O3 generally has little influence on UT O3 concentrations, the bias in the UT is much smaller than at higher latitudes.
In line with the suggestion by Nassar et al. (2008), we now analyse the TES bias for two vertical regimes: the lower troposphere (LT, 1000 to 500 hPa) and the upper troposphere (UT, 500 hPa to tropopause). The 1-2 degrees of freedom for signal in the troposphere should make such an analysis meaningful. From a linear regression of all TES vs. sonde O3 data pairs in the lower troposphere, we find generally better agreement between TES and ozonesondes for the tropics (slope = 1.3, r = 0.8, bias = +5 ppbv) than for the mid-latitude regions (slope = 1.5, r = 0.7, bias = +7 ppbv) (for more details see Zörner, 2012). This suggests that TES retrievals are more sensitive to lower tropospheric O3 over the tropics, where the thermal contrast is generally higher than over mid-latitudes. A similar analysis shows much better correlation between the TES and sonde measurements in the upper troposphere (r > 0.84 and slopes < 1.25 for all regions) than for the lower troposphere (Zörner, 2012). This is indicative of the higher TES sensitivity to upper tropospheric O3 (as witnessed also by the averaging kernels in Fig. 2). The above-mentioned numbers are summarized in Table 1, which gives an overview of the correlation between TES and sonde O3 for the UT and LT for different zonal bands. Scattergrams are available in Zörner (2012). The linear relationship between TES and sonde measurements for the different latitudes for both UT and LT gives confidence to users of TES data that relative variations as observed on a global map are significant, even though biased (Worden et al., 2007). Consequently, TES data sets can be used to study variability and trends in tropospheric O3.
Our results confirm those found in previous validation exercises for earlier TES versions with fewer available validation data. Nassar et al. (2008) reported biases in UT O3 for TES V002 data ranging from +3 to +10 ppbv, excluding the Arctic and Antarctic zones where TES sensitivity is low. As discussed earlier, errors in the ozonesonde measurements are very likely smaller than 5 % (Smit et al., 2007). The sonde data are unlikely to systematically underestimate O3, so they cannot explain the TES high bias found in this study. This is supported by Richards et al. (2008), who used a completely independent measurement technique but arrived at the similar conclusion that TES tropospheric O3 is biased high by +7 ppbv (over western hemisphere mid-latitudes). The mean TES-ozonesonde differences are not driven by the mismatches in space and time between TES and ozonesondes either, but these are probably important in explaining the standard deviation around the mean differences. We suggest that the mean differences between TES and ozonesondes reported in this study, apart from TES sensitivity, are related to instrumental artefacts, spectroscopy or forward model errors, and consequent retrieval difficulties in TES. This may also explain why the bias has not appreciably decreased between successive versions of the TES retrieval algorithm.
The idea that thermal contrast is driving the sensitivity and the quality of the TES retrievals led us to analyse the TES bias as a function of season in the northern mid-latitudes. We find some support for this in the better correlation (r = 0.5-0.6) between TES and sonde LT O3 in the warm spring (March-April-May, MAM) and summer (June-July-August, JJA) seasons than in the cold winter (December-January-February, DJF; r = 0.3). Although the absolute bias remains +5 ppbv irrespective of season, the relative errors are smaller in the summer (high O3 concentrations in the Northern Hemisphere) than in the winter (low concentrations), reflecting higher signal-to-noise ratios in summer.
We now turn our attention to the 464 hPa retrieval level, where TES generally shows good sensitivity to O3 in the troposphere, with little influence from stratospheric O3 in all regions and seasons except for the winter at the northern mid-latitudes. At 464 hPa, the TES sensitivity to free tropospheric O3 is substantial in all regions. The 464 hPa kernel is sufficiently sharp to limit the influence of lower stratospheric O3 above, but still broad enough to include O3 contributions from the lower troposphere (Fig. 2). This makes 464 hPa the appropriate level for evaluating the skill of O3 simulations by chemistry transport models, and for the analysis of trends in free tropospheric O3. Figure 5 indicates that the TES-sonde bias is clearly smallest, at +3 (±2) ppbv, in the tropics, and increases with latitude to +7 (±1) ppbv for mid-latitudes, possibly reflecting the generally weaker vertical sensitivity to tropospheric O3 (the lower DOFs) at higher latitudes. Since the bias on the estimate depends on the sensitivity (Worden et al., 2011), low sensitivity affects the bias. Biases from not completely resolving variability in temperature and H2O vertical profiles will also have effects on the TES O3 because they cannot be completely reduced through averaging. The TES mean biases in the tropics and in the northern mid-latitudes are different to within their standard errors (σ_mean = σ/√N), but it is difficult to make [...] for the observed bias if for instance O3 series from global chemistry transport models are evaluated with TES observations. One way to do this is to subtract the mean bias values for individual latitude zones, as indicated in Table 2 and in Figs. 5 and 6 (see next section), from the TES measurements. Worden et al. (2008) applied a similar correction by lowering the TES observations by 15 % to account for the known high bias of TES compared with ozonesondes in their study area (45° N-45° S). Another approach is to use the suggested linear relationship between the O3 concentrations measured by coincident sondes and TES retrievals in order to analyse the nature of the bias. There is considerable support for such a method because of the observed linear relationship between TES and sonde O3 concentrations reported by Nassar et al. (2008) and confirmed in this study. Assuming that the (kernel-convolved) sonde measurements are a close approximation of the "true" O3 concentrations at 464 hPa, we perform a reduced major axis regression (Clarke, 1980) of the TES data (at 464 hPa) to the sonde data, and interpret the regression coefficients (intercept and slope) as correction factors to the TES data (x_sonde = a + b x_TES). The regression coefficients a and b could be used to correct TES O3 retrievals for situations without coinciding ozonesondes. Table 3 gives an overview of correction functions based on the linear regression between sonde and TES O3 data at the 464 hPa pressure level. Applying the slopes and intercepts of Table 3 to the TES values moves the corrected TES values closer to the sonde values, reducing the mean biases by up to 80 % of their uncorrected values. The results in Table 3 suggest that the bias in the tropics and sub-tropics is mostly additive in nature (slopes: 0.94 to 1.01; intercepts: −5.1 to −0.4 ppbv), in contrast to [...] trends in the other regions in a similar manner, we use seasonal mean biases for the other latitude zones. Table 2 lists the slopes and intercepts from linear regressions to seasonally averaged TES-sonde biases for the period 2005-2010 at the 464 hPa pressure level for all latitude zones, and for completeness we also include the trend results for the northern mid-latitudes based on the seasonal mean biases. For none of the regions do we find trends (slopes in Table 2) in the TES bias that differ substantially from the null hypothesis. In any case, all p values (> 0.05, thus not significant based on a 95 % confidence interval) show that none of the slopes are significant to the extent that the null hypothesis (no trend in the TES bias) can be rejected (p values not given in Table 2). We therefore conclude that the bias in TES O3 retrievals is stable over the 2005-2010 period, and that the TES V004 record is appropriate for analysing year-to-year changes (or "trends") in tropospheric O3, at least for the 464 hPa level.
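The two regression steps used in this section, a reduced major axis fit to derive a correction x_sonde = a + b x_TES and an ordinary least-squares fit to test the bias time series for a trend, can be sketched in Python as follows. The function names `rma_fit` and `ols_slope_t` and all data values are hypothetical. The sketch returns a t statistic for the slope rather than a p value, on the assumption that it is compared against a tabulated critical value of Student's t distribution with N − 2 degrees of freedom; it is not the procedure used to produce Tables 2 and 3.

```python
import math
import statistics


def rma_fit(x, y):
    """Reduced major axis regression y = a + b x; returns (a, b).

    The slope is sd(y)/sd(x) with the sign of the covariance.
    """
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov_sum = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if cov_sum == 0:
        raise ValueError("x and y are uncorrelated; RMA slope undefined")
    slope = math.copysign(statistics.stdev(y) / statistics.stdev(x), cov_sum)
    return mean_y - slope * mean_x, slope


def ols_slope_t(t, y):
    """Least-squares trend of y against t; returns (slope, std_err, t_stat)."""
    n = len(t)
    mean_t, mean_y = statistics.fmean(t), statistics.fmean(y)
    s_tt = sum((ti - mean_t) ** 2 for ti in t)
    slope = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y)) / s_tt
    intercept = mean_y - slope * mean_t
    rss = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    std_err = math.sqrt(rss / (n - 2) / s_tt)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, std_err, t_stat


# Hypothetical 464 hPa pairs in ppbv: TES reads 5 ppbv above the sonde.
tes_ppbv = [50.0, 60.0, 70.0, 80.0]
sonde_ppbv = [45.0, 55.0, 65.0, 75.0]
a, b = rma_fit(tes_ppbv, sonde_ppbv)       # purely additive bias: b = 1
corrected = [a + b * v for v in tes_ppbv]

# Hypothetical seasonal mean biases (ppbv) against a season index.
season = [0, 1, 2, 3]
bias = [5.0, 6.0, 5.0, 6.0]
slope, std_err, t_stat = ols_slope_t(season, bias)
```

With only four invented points the t statistic is far below the two-sided 95 % critical value for two degrees of freedom (about 4.3), so the null hypothesis of no trend would not be rejected in this toy case.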

Conclusions
Tropospheric Emission Spectrometer (TES) V004 nadir O3 profiles were validated with more than 4400 coinciding ozonesonde measurements taken across the world during the period 2005-2010. We applied the TES operator (averaging kernel and vector constraints) to the sonde data to ensure that the influence of the TES a priori profile cancels from the comparison, and find that TES V004 O3 retrievals are generally biased high by 2-7 ppbv in the troposphere, consistent with validation results from earlier studies. Because TES has up to two degrees of freedom for signal in the troposphere, we distinguish upper troposphere and lower troposphere mean biases. The lower troposphere O3 biases ranged from +3.9 to +6.0 ppbv, excluding the Arctic and Antarctic, where sensitivity was very low and hence no valid conclusions could be drawn. In the upper troposphere, TES O3 biases range from −0.4 to +13.3 ppbv. Because our results are highly consistent with the findings of Nassar et al. (2008), who validated TES V002 retrievals, we conclude that V004 retrievals have not improved much over V002, at least not in terms of accuracy. Focusing on the 464 hPa retrieval level, broadly representative of free tropospheric O3, we find significant differences between the TES biases for the tropics, sub-tropics, and mid-latitudes. The TES bias is generally smallest (+3 ppbv) and mostly additive over the tropics, and highest over the northern mid-latitudes (+7 ppbv, with an additive as well as a multiplicative component), possibly reflecting better retrieval sensitivity (enhanced brightness temperature) and less influence from stratosphere-troposphere exchange in the tropics. Establishing such a bias pattern is important in order to make meaningful use of TES O3 data in applications such as model evaluation, trend analysis, or data assimilation. The relatively long-term record of TES-ozonesonde comparisons allowed us for the first time to conduct a time-series analysis of the monthly and annual mean TES biases in free tropospheric O3 at 464 hPa. For none of the regions did we find any significant trend (p > 0.05) over time. Based on the data pairs, linear regression corrections were proposed for free tropospheric O3 retrieved from TES for all the data and per latitude zone. For the northern mid-latitudes, where enough data pairs were available, seasonal corrections were computed. Thanks to (i) the good correlation between TES and ozonesondes, (ii) the robust bias patterns, and (iii) the fact that the time series of the TES-sonde O3 biases do not change over time, it can be concluded that TES is an appropriate instrument for trend analysis of free tropospheric O3 time series.

Figure 1. Seasonal average (June-July-August) of tropospheric ozone concentrations at 464 hPa as observed by TES between 2005 and 2010, averaged on a 3° × 2° (longitude × latitude) grid. The effect of the variable MOZART a priori has been removed by reprocessing the TES retrievals with one global and annual mean (MOZART) a priori O3 profile following the method by Zhang et al. (2006). TES retrievals with residual clouds and errors have been excluded following the standard TES quality flags, leaving typically 50 data points in each grid cell.

Figure 2. Left panel: typical averaging kernel of the vertical ozone profile as provided by TES over De Bilt, the Netherlands (latitude: 52.10°, longitude: 5.18°, altitude: 9 m, 14 July 2005, 11:23 UTC) for a clear day (DOF = 3.98, with high sensitivity in the 1000-400 hPa range). Right panel: corresponding ozone profiles observed by the ozonesonde launched at De Bilt (grey), retrieved by TES (red), and the ozonesonde profile as TES would observe it (after application of the TES AK), in black. The dashed red line indicates the MOZART a priori O3 profile as used in the TES retrieval.

Fig. 3. Upper panel: location of the worldwide ozonesonde profiles of WOUDC (red dots) used in this analysis together with their coincident TES measurements (squares). Lower panel: distribution of the number of TES-ozonesonde matchups available for this study as a function of latitude. The dashed grey lines indicate the regions used in this study to evaluate the TES biases in more detail. In the lower panel, the regions are the northern mid-latitudes (> 35-56° N), the Arctic (> 56-82° N), the northern sub-tropics (> 15-35° N), the tropics (15° S-15° N), the southern mid-latitudes (> 35-56° S), and the Antarctic (> 56-82° S).

Fig. 4. Absolute TES-sonde O3 differences (left panels) and relative differences (right panels) for six latitude zones. Individual difference profiles are shown in grey; the mean difference and 1 standard deviation profiles are overlaid in black. N is the number of valid profiles after flagging TES data.

Table 1. Overview of the correlation between TES and sonde O3 for the lower and the upper troposphere for different zonal bands. The slope, intercept, R, RMS and bias of the reduced major axis regression are provided.

Table 2. Slope and intercept of an (unweighted) linear regression fit to the seasonal mean TES-sonde O3 biases per latitude zone at the 464 hPa pressure level for the period 2005-2010. The standard error (SE) is also given for the slope (ppbv/season) and the intercept (ppbv). The mean annual bias (ppbv) is also shown. The statistical procedure testing if the intercept of the linear relation is significantly different from zero is given by the p values (rejection of null hypothesis at p < 0.05).