Tropospheric ozone and ozone profiles retrieved from GOME-2 and their validation

This paper describes and assesses the performance of the RAL (Rutherford Appleton Laboratory) ozone profile retrieval scheme for the Global Ozone Monitoring Experiment 2 (GOME-2) with a focus on tropospheric ozone. Developments to the scheme since its application to GOME-1 measurements are outlined. These include the approaches developed to account sufficiently for UV radiometric degradation in the Hartley band and for inadequacies in knowledge of instrumental parameters in the Huggins bands to achieve the high-precision spectral fit required to extract information on tropospheric ozone. The assessment includes a validation against ozonesondes (sondes) sampled worldwide over 2 years (2007–2008). Standard deviations of the ensemble with respect to the sondes are considerably lower for the retrieved profiles than for the a priori, with the exception of the lowest subcolumn. Once retrieval vertical smoothing (averaging kernels) has been applied to the sonde profiles there is a retrieval bias of 6 % (1.5 DU) in the lower troposphere, with smaller biases in the subcolumns above. The bias in the troposphere varies with latitude. The retrieval underestimates lower tropospheric ozone in the Southern Hemisphere (SH) (15–20 % or ∼ 1–3 DU) and overestimates it in the Northern Hemisphere (NH) (10 % or 2 DU). The ability of the retrieval to reflect the geographical distribution of lower tropospheric ozone, globally (rather than just ozonesonde launch sites) is demonstrated by comparison with the chemistry transport model TOMCAT. For a monthly mean of cloud-cleared GOME-2 pixels, a correlation of 0.66 is found between the retrieval and TOMCAT sampled accordingly, with a bias of 0.7 Dobson Units. GOME-2 estimates higher concentrations in NH pollution centres but lower ozone in the Southern Ocean and South Pacific, which is consistent with the comparison to ozonesondes.


Introduction
Ozone is an important atmospheric trace gas, absorbing ultraviolet (UV) radiation from the sun that would otherwise damage the cells of living organisms at the Earth's surface. In the stratosphere, where approximately 90 % of ozone is found, the vertical distribution determines heating rates and thereby also dynamics. The vertical distribution of stratospheric ozone is determined by the Chapman cycle (Chapman, 1930) and by catalytic cycles involving nitrogen, hydrogen and halogen radicals. In the troposphere, ozone is produced through complex reaction pathways involving nitrogen oxides (NOx) and volatile organic compounds (VOCs). Ozone is also introduced by exchange from the stratosphere, particularly at mid-latitudes. As a secondary pollutant from anthropogenic and biomass burning sources, it is an environmental hazard, particularly in urban environments, because it is a lung irritant. High levels of ozone have been linked to increased mortality (excess deaths) when associated with localised heat wave events (Gryparis et al., 2004). Tropospheric ozone can also damage agriculture by increasing the failure rate of crops (Holloway et al., 2012). For these reasons, it is vitally important to monitor ozone in the troposphere as well as the stratosphere, but in situ surface observations and ozonesondes are sparse and heavily favour the Northern Hemisphere.
Tropospheric ozone is also a greenhouse gas. The uncertainty in estimates of radiative forcing from tropospheric ozone is as large as that associated with the non-well mixed greenhouse gases (IPCC, 2013), and as such good knowledge of the atmospheric concentration of tropospheric ozone is required. This uncertainty remains in part due to the reliance on atmospheric models and their spread, in addition to uncertainty about the pre-industrial ozone amount. Estimates do not currently incorporate any information from satellites (IPCC, 2013). An accurate, contemporary distribution of tropospheric ozone from satellites would help to verify chemistry transport models (CTMs) and coupled chemistry-climate models (CCMs), and hence their estimates of radiative forcing and the forward projections by CCMs. The MetOp series and its successor MetOp-SG/Sentinel 5 have the potential to monitor tropospheric as well as stratospheric ozone in the decades to come.
The total atmospheric column of ozone has been measured historically via UV nadir-viewing sensors (e.g. BUV, SBUV, TOMS, SBUV-2, GOME, SCIAMACHY, OMI and GOME-2), with accuracies typically between 0.5 and 2 % (Klenk et al., 1982; Loyola et al., 2011; van Roozendael et al., 2012, and references therein). Ozone profiles have also been produced from UV nadir-sounders (e.g. Bhartia et al., 1996); however, retrieving tropospheric ozone presents a significant challenge, because ∼ 90 % of atmospheric ozone resides in the stratosphere above. Tropospheric columns have been derived by subtracting an estimate of the stratospheric component from the measured total column, using knowledge of the tropopause height and making assumptions about the ozone profile shape (e.g. Fishman and Larsen, 1987; Schoeberl et al., 2007; Ziemke et al., 2011). Tropospheric columns have also been derived in the tropics by differencing total columns in cloud-free pixels from those in nearby pixels with thick/high convective cloud (Valks et al., 2014). However, as suitable occurrences are sparse, only monthly averages are useful. Direct retrieval of tropospheric information from temperature-dependent spectral structure in the Huggins bands (320–345 nm) was first proposed by Chance et al. (1997) and has been exploited by several schemes (Munro et al., 1998; van der A et al., 2002; Liu et al., 2005, 2010; Cai et al., 2012), applied to the Global Ozone Monitoring Experiment (GOME) class of instruments.
Here, we describe and assess the performance of the RAL (Rutherford Appleton Laboratory) ozone profile retrieval scheme applied to GOME-2 measurements, with a particular focus on the troposphere. This scheme has been developed directly from that presented by Munro et al. (1998), which was the first to demonstrate retrieval of tropospheric ozone from space. Substantial improvements have been made to that algorithm, and GOME-2, which was launched on MetOp-A in 2006, also improves in certain respects upon its predecessor. The RAL ozone profile optimal estimation (OE) retrieval scheme was selected for the ESA Climate Change Initiative (CCI) (Plumber, 2009) after independent comparison to the GOME-2 operational ozone profile scheme (Keppens et al., 2014). It was selected principally because of its demonstrated sensitivity to tropospheric ozone and its persistently higher number of degrees of freedom for signal (DFS).
In Sect. 2 of this paper, the GOME-2 instrument is briefly introduced, before the RAL ozone profile scheme and the principal improvements since Munro et al. (1998) are described. In Sect. 3, an error assessment is described. Section 4 presents a validation of the ozone profile scheme against global ozonesondes and a comparison to tropospheric ozone distributions from a chemistry transport model. A summary is presented in Sect. 5.

Retrieval algorithm
The RAL ozone profile retrieval scheme is an optimal estimation (OE) algorithm (Rodgers, 1976, 2000), which uses prior information to constrain ill-posed problems such as profile retrievals from nadir-viewing satellite instruments. OE also provides an estimate of the errors associated with retrieved parameters.
The RAL algorithm is a three-step sequential retrieval. The first step performs a fit to the sun-normalised radiance spectrum in Band 1 (using wavelengths between 266 and 307 nm) to utilise information in the long-wave tail of the Hartley band. Band 1b spectra are averaged onto Band 1a spatial pixels to improve their signal-to-noise ratio. The ozone absorption and Rayleigh scattering coefficients both decrease strongly with wavelength across this interval, yielding information predominantly on the mid-to-upper stratospheric ozone profile. In addition to the ozone profile, the retrieved parameters are a wavelength-independent Lambertian effective surface albedo, detector dark (leakage) current (in raw signal units) and a wavelength mis-registration parameter for the Earthshine spectrum with respect to the direct-sun spectrum. Rotational Raman scattering is also accounted for by retrieving a scaling factor for the theoretically calculated spectrum of in-filling by the (singly scattered) Ring effect (as modelled via the approach of Joiner et al., 1995).
The second step is to retrieve an effective surface albedo at 336 nm in Band 2. This step is important because the effective albedo retrieved from the longest wavelengths (< 307 nm) in Band 1 is not appropriate in the Band 2 fit (using wavelengths from 323 to 335 nm) due to the differing fields of view (FoV). The retrieved ozone profile and its associated error covariance matrix from the Band 1 fit, together with the retrieved 336 nm effective albedo, contribute to the prior information for the third and final fit in the Huggins bands (323–335 nm).
The fit in Band 1 is a direct fit of the sun-normalised radiance, r, which is formed from the ratio of I, the measured Earthshine radiance, to I_0, the direct-sun irradiance measurement. As such, accurate (< 1 %) radiometric calibration is required. GOME-2, as with GOME-1 and SCIAMACHY, has experienced degradation of the UV photometric throughput during its lifetime, the effects of which are greater at the shorter wavelengths (Lang et al., 2009; Lacan and Lang, 2011; Cai et al., 2012). To produce self-consistent global ozone distributions over the mission lifetime, it has been necessary to implement an empirical degradation correction to the Band 1 measurements, as outlined below in Sect. 2.3.1.
In order to obtain accurate information on tropospheric ozone, a high fitting precision (< 0.1 % rms) is required in the Huggins bands. To achieve this, the Band 2 retrieval fits the differential wavelength structure arising from temperature-dependent vibration-rotational structure in ozone absorption, using the logarithm of the sun-normalised radiance, with a fourth-order polynomial in wavelength subtracted in order to remove coarse-scale artefacts in the spectrum and reveal the fine-scale ozone differential spectral structure. This method of fitting differential spectral structure is somewhat analogous to the DOAS approach (Platt, 1994) and is robust against instrumental effects (including some aspects of the degradation). The stringent fitting precision requirement necessitates good knowledge of the instrument's slit function, which varies across Band 2. This is achieved by an off-line fit to each direct-sun spectrum, to retrieve a scaling factor to apply to slit function key data from pre-flight characterisation (Siddans, 2003). This is done on a daily basis because the slit functions are observed to change with time (seasonally and over shorter time periods) in association with thermal cycling of the instrument focal plane. This process is discussed further in Sect. 2.3.3.
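The polynomial subtraction described above can be illustrated with a minimal sketch. It assumes a synthetic spectrum and a plain least-squares polynomial fit; the function name, the wavelength scaling and the test spectrum are illustrative assumptions, not the RAL implementation (which fits the differential structure inside the retrieval itself).

```python
import numpy as np

def differential_log_radiance(wavelength_nm, sun_normalised_radiance, order=4):
    """Return ln(r) with a least-squares polynomial in wavelength subtracted."""
    wl = np.asarray(wavelength_nm, dtype=float)
    log_r = np.log(np.asarray(sun_normalised_radiance, dtype=float))
    # Map wavelength onto [-1, 1] so the polynomial fit stays well conditioned.
    t = (wl - wl.mean()) / (0.5 * (wl.max() - wl.min()))
    coeffs = np.polynomial.polynomial.polyfit(t, log_r, order)
    smooth = np.polynomial.polynomial.polyval(t, coeffs)
    return log_r - smooth

# Synthetic example (assumed values): a smooth background multiplied by a
# weak oscillating structure standing in for the ozone differential signal.
wl = np.linspace(323.0, 335.0, 100)
background = 0.05 * np.exp(0.2 * (wl - 323.0))
structure = 1.0 + 0.01 * np.sin(2.0 * np.pi * (wl - 323.0) / 1.5)
d = differential_log_radiance(wl, background * structure)
```

In this toy case the smooth background is removed entirely by the polynomial, while the fine-scale oscillation survives in `d`.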
The state vector for the Band 2 retrieval step is composed of a wavelength mis-registration of the sun-normalised radiance spectrum with respect to the ozone absorption cross-section spectrum in vacuo, a wavelength shift between the Earthshine radiance and direct-sun irradiance spectra, the ozone profile, the Ring effect scaling factor, and the vertical columns of NO2, BrO and formaldehyde. Other species that absorb in this spectral region (such as SO2) are modelled in the fit (based on a climatological profile shape) but not retrieved.
The retrieved ozone profile is represented in the state vector as the logarithm of the volume mixing ratio on a fixed pressure grid: surface pressure, 450, 170, 100, 50, 30, 20, 10, 5, 3, 2, 1, 0.5, 0.3, 0.17, 0.1, 0.05, 0.03, 0.017 and 0.01 hPa. The forward model performs radiative transfer calculations on a finer pressure grid (approximately 2 km throughout the profile) in order to model atmospheric radiative transfer accurately, and uses the assumption that the log of ozone concentration varies linearly with log pressure between the retrieval levels. For convenience, the pressure levels are herein expressed as a pressure-altitude coordinate, where an approximate equivalent altitude is assigned to a pressure level based on the relation

$$Z^* = 16\,(3 - \log_{10} p),$$

where Z* is in kilometres and p in hPa. This predicts approximate equivalent altitudes of the pressure grid of 0, 6, 12 and 16 km, then every 4 km up to 80 km. These values are usually within 2 km of the geometric altitudes calculated for hydrostatic balance. Altitudes expressed herein are Z* altitudes. There are typically 5–6 degrees of freedom for signal (Rodgers, 2000) for the combined Hartley-Huggins bands retrieval. This is almost independent of latitude and season. The retrieval grid oversamples the profiles in terms of the information content of typical GOME-2 measurements, so the retrieval is further constrained using a priori correlations (see below).
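As a quick check on the pressure-altitude coordinate, the sketch below evaluates Z* on the retrieval grid. It assumes the log-pressure form Z* = 16 (3 − log10 p), which matches the 0 km and 80 km end points quoted above, and a nominal surface pressure of 1000 hPa; both are assumptions made for illustration.

```python
import math

def z_star_km(pressure_hpa):
    """Pressure-altitude Z* in km: 16 km per decade of pressure (assumed form)."""
    return 16.0 * (3.0 - math.log10(pressure_hpa))

# Retrieval pressure grid in hPa; 1000 hPa stands in for surface pressure (assumption).
levels_hpa = [1000.0, 450, 170, 100, 50, 30, 20, 10, 5, 3, 2, 1,
              0.5, 0.3, 0.17, 0.1, 0.05, 0.03, 0.017, 0.01]
z = [round(z_star_km(p), 1) for p in levels_hpa]
```

Rounding the entries of `z` to the nearest kilometre gives the nominal level altitudes used in the text.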
G. M. Miles et al.: Tropospheric ozone and ozone profiles retrieved from GOME-2

The ozone a priori profile used is that of the McPeters et al. (2007) climatology, derived in part from ozonesondes, which varies by month and latitude. The diagonal elements of the a priori error covariance matrix (S_a) are set to the larger of the climatological percentage standard deviation and the following values: 0–12 km (100 %), 16 km (30 %), 20–50 km (10 %), 56 km (50 %) and 60–80 km (100 %). In practice, it is these fixed percentage values that apply in the troposphere, except at very high latitudes where the climatological standard deviation is greater. A 6 km Gaussian correlation length is imposed to specify the off-diagonal elements of the a priori covariance for the initial Band 1 step. The retrieved profile and error covariance matrix from the Band 1 step are used as the a priori profile and to define the diagonal elements of the covariance matrix for the Band 2 steps. An 8 km Gaussian correlation length is then applied to further stabilise the Band 2 ozone retrieval in the region of the upper troposphere and lower stratosphere (UTLS).
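The construction of an a priori covariance from fixed fractional uncertainties and a Gaussian correlation length can be sketched as follows. The correlation form exp(−(Δz/L)²), the treatment of the 52 km level and all names are assumptions for illustration; the text specifies only the diagonal values and the correlation length.

```python
import numpy as np

def apriori_covariance(z_km, sigma, corr_length_km):
    """Covariance with given standard deviations and Gaussian correlations (assumed form)."""
    z = np.asarray(z_km, dtype=float)
    s = np.asarray(sigma, dtype=float)
    dz = z[:, None] - z[None, :]
    corr = np.exp(-(dz / corr_length_km) ** 2)
    return corr * np.outer(s, s)

def fractional_sigma(z_km):
    """Fixed fractional a priori uncertainties on the Z* grid."""
    z = np.asarray(z_km, dtype=float)
    s = np.ones_like(z)                 # 100 %: 0-12 km and 60-80 km
    s[z == 16] = 0.3                    # 30 % at 16 km
    s[(z >= 20) & (z <= 52)] = 0.1      # 10 %; assigning 52 km here is an assumption
    s[z == 56] = 0.5                    # 50 % at 56 km
    return s

z = np.array([0, 6, 12] + list(range(16, 81, 4)), dtype=float)  # 20 levels
sigma = fractional_sigma(z)
S_a = apriori_covariance(z, sigma, corr_length_km=6.0)
```

Changing `corr_length_km` to 8.0 gives the analogous correlation structure for the Band 2 step, whose diagonal would instead come from the Band 1 solution covariance.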
To achieve a photometric signal-to-noise ratio adequate to retrieve tropospheric ozone information, it is necessary to average Band 2 spectra from eight adjacent GOME-2 ground pixels. Averaging eight Band 2 pixels (two across-track and four along-track) to create a composite pixel of 160 × 160 km reduces photometric noise to approximately 1/√8 of its single-pixel value. For radiative transfer, the scheme uses a version of GOMETRAN++ (Rozanov et al., 1997) with a number of processing speed improvements (which do not degrade numerical accuracy). A polarisation correction based on scalar/vector LIDORT look-up tables is also implemented, as provided by BIRA (C. Lerot, personal communication, 2012). The retrieval scheme uses ECMWF Interim Re-analysis meteorological products for temperature and pressure profiles, obtained from the ECMWF data server. The solar reference spectrum is that provided by Chance and Kurucz (2010). The ozone absorption cross-sections are those derived by Brion et al. (1993, 1998), Daumont et al. (1992) and Malicet et al. (1995).
Although cloud may be modelled according to information from GOME-2 measurements in the O2 A-band (760 nm) or collocated vis/IR imagery from AVHRR/3 on MetOp, for the purposes of this exercise cloud radiative transfer is not modelled explicitly, and instead an effective Lambertian surface albedo is co-retrieved. With this approach it is expected that the presence of cloud will lead to a negative bias in retrieved ozone at altitudes below the cloud top, from where there is limited information.

Optimal estimation
The retrieval uses the standard optimal estimation algebra for the non-linear problem (Rodgers, 2000), used widely for deriving atmospheric properties from satellite measurements. An estimate of the state vector is obtained by combining measurement and prior information in accordance with their respective error covariance matrices. In the case of ozone profile retrieval from nadir UV spectral measurements such as those of GOME-2, the prior constrains what is otherwise an ill-posed problem. The solution is obtained by minimising a cost function, χ²:

$$\chi^2 = (y - F(x))^T S_y^{-1} (y - F(x)) + (x - x_a)^T S_a^{-1} (x - x_a),$$

where y is the measurement vector, x and x_a are the state vector (or expected solution) and a priori vector, F is the forward model and S_y and S_a the error covariance matrices for the measurement and prior, respectively. The Levenberg-Marquardt method is used to minimise the cost function (summarised in Press et al., 1995), and the state vector is iteratively updated as follows:

$$x_{i+1} = x_i + \left(S_a^{-1} + K_i^T S_y^{-1} K_i + \gamma I\right)^{-1} \left[K_i^T S_y^{-1} (y - F(x_i)) - S_a^{-1} (x_i - x_a)\right],$$

where γ is the step size (multiplying the identity matrix I), depending upon which the iteration tends towards either Newtonian iteration or steepest descent (Rodgers, 2000). K_i is the weighting function at iteration i, defined as

$$K_i = \left.\frac{\partial F(x)}{\partial x}\right|_{x = x_i}.$$

The sensitivity of the retrieval to perturbations in the measurements is characterised by the gain matrix G, of dimensions n by m, where m is the number of measurements (in the sun-normalised radiance spectrum) and n the number of retrieval levels. This is defined as follows (Rodgers, 2000):

$$G = \left(S_a^{-1} + K^T S_y^{-1} K\right)^{-1} K^T S_y^{-1}.$$

The sensitivity of the retrieval to perturbations in the true state is given by the (n by n) averaging kernel matrix A (also herein referred to as AK):

$$A = G K. \tag{9}$$

Errors on the solution are characterised by the covariance matrix:

$$S_x = \left(S_a^{-1} + K^T S_y^{-1} K\right)^{-1}.$$

The square roots of the diagonals of this matrix are referred to here as the estimated standard deviations (ESDs) of the retrieval, and are assumed to provide a reliable measure of the error applicable to each level of the retrieved profile. The extent to which this is true is investigated in Sect. 4.1, below. S_x includes errors arising from measurement noise (as characterised by S_y) and smoothing error deriving from the prior constraint (as characterised by S_a). However, it should be noted that the covariance matrix applies to the profile as represented on the retrieval grid. It does not include smoothing errors related to finer-scale structures than the retrieval grid, and it is not formally possible to estimate errors at finer scales directly from it (as discussed in von Clarmann, 2014). S_x should provide a reasonable characterisation of the difference between retrieved profiles and true profiles which have been interpolated onto the retrieval grid, after having been smoothed to a commensurate resolution (see also Calisesi et al., 2005). Applying averaging kernels to the true profile provides the most appropriate means of comparison with the retrieved profile. In that case smoothing errors (including effects on finer scales) are accounted for, and differences between retrieval and smoothed "true" profiles should ideally be characterised by the retrieval noise covariance:

$$S_n = G S_y G^T.$$

Improvements to ozone profile retrieval scheme for GOME-2
GOME-2 measurements are subject to measurement errors from a variety of sources, which must be characterised on a pixel-by-pixel basis for accurate retrievals using optimal estimation. As an estimate of the photometric and dark current noise was not supplied with the Level 1b data acquired by GOME-2 before 2013, we use a model to estimate the measurement noise. This model is based on calibration key data derived for the GOME-2 error study (Kerridge et al., 2002), now updated with calibration key data for the MetOp-A GOME-2 instrument, and is similar to the model used by Nowlan et al. (2011). The noise model is described in detail in Miles et al. (2012).
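To illustrate how a measurement noise covariance S_y of this kind feeds the optimal estimation diagnostics (Rodgers, 2000), the sketch below evaluates the gain matrix, averaging kernel, solution covariance and retrieval noise covariance for a linear toy problem. The function name, the random Jacobian and the diagonal covariances are assumptions made for illustration only and do not represent the RAL scheme.

```python
import numpy as np

def oe_diagnostics(K, S_y, S_a):
    """Linear optimal-estimation diagnostics for Jacobian K (m x n)."""
    S_y_inv = np.linalg.inv(S_y)
    S_a_inv = np.linalg.inv(S_a)
    S_x = np.linalg.inv(K.T @ S_y_inv @ K + S_a_inv)  # solution covariance (n x n)
    G = S_x @ K.T @ S_y_inv                           # gain matrix (n x m)
    A = G @ K                                         # averaging kernel (n x n)
    S_n = G @ S_y @ G.T                               # retrieval noise covariance
    return G, A, S_x, S_n

# Toy problem (assumed sizes and values): 30 measurements, 5 state elements.
rng = np.random.default_rng(0)
m, n = 30, 5
K = rng.normal(size=(m, n))
S_y = 0.1 ** 2 * np.eye(m)
S_a = np.eye(n)
G, A, S_x, S_n = oe_diagnostics(K, S_y, S_a)
dfs = np.trace(A)                                     # degrees of freedom for signal
esd = np.sqrt(np.diag(S_x))                           # estimated standard deviations
```

The solution covariance splits into the noise term `S_n` and a smoothing term `(I - A) S_a (I - A)^T`, which is why `S_n` alone is the appropriate error measure once averaging kernels have been applied to the comparison profile.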

Correction for degradation to GOME-2 UV throughput
The MetOp-A GOME-2 instrument (and instruments of its class) is subject to throughput degradation over time that is more acute at the shorter UV wavelengths (Lang et al., 2009; Lacan and Lang, 2011; Cai et al., 2012). To accommodate this, a low-order polynomial fit in wavelength and time has been derived empirically from the ratio between a climatological (in this case the same as the a priori) modelled UV sun-normalised radiance (with its associated solar viewing geometry) and the observed sun-normalised radiance spectrum. This is similar to the approach of van der A (2001) for ozone column retrieval. A detector dark current, or leakage current, in raw signal units, which is assumed constant for all detector pixels in Band 1, has been jointly fitted with the low-order polynomial in order to separate the wavelength/time polynomial from this instrumental parameter, since the dark current is co-retrieved with the ozone profile and other parameters from individual Band 1 (Hartley band) measurements. A separate polynomial correction has been derived for each of the West, Nadir and East Band 1 scan positions, sampling only cloud-free data within 30° of the equator on 1 day per week throughout the mission. The empirical degradation correction employed in Band 1 has resulted in a relatively stable stratospheric ozone distribution from that band. A degradation correction has not been applied in the Band 2 (Huggins bands) step, so the retrieval is still sensitive to trends in the total column ozone, although the use of differential structure greatly reduces the sensitivity of the Huggins bands retrieval step to UV radiometry. The more subtle effect on ozone retrieval of differential UV degradation (for the radiance and irradiance) in Band 2 will be a topic of future work.

Systematic residual from spectral fit to the Huggins bands
A systematic residual spectral signature remains from the Huggins band fit that is of the order of 0.2 % amplitude (of sun-normalised radiance). This signature has a characteristic spectral structure, which is quite persistent with both sun-Earth viewing geometry and time. Although its origins in the solar spectral irradiance, atmosphere/surface (polarised) radiative transfer and/or instrument response have yet to be firmly established, the persistence of the spectral residual is amenable to the co-retrieval of a scaling factor, which enables an rms fit precision of < 0.1 % to then be achieved in the Huggins bands, commensurate with the estimated photometric noise level. In practice, the leading six principal components of the systematic residual spectral signature have been determined (considering fit residuals from observations on selected days spanning the mission to date) and scaling factors for each of these included in the retrieval state vector. Variations of the retrieved scaling factors with both time and space give some physical insights into their origin and an opportunity for future development.
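A minimal sketch of how leading principal components might be extracted from a set of fit residuals is given below. It assumes an uncentred singular value decomposition (so that a persistent signature appears in the leading component), synthetic residuals and a simple projection to obtain scaling factors; in the scheme described above the scaling factors are instead elements of the retrieval state vector, and all names here are hypothetical.

```python
import numpy as np

def leading_patterns(residuals, n_components=6):
    """Leading right singular vectors of a (n_spectra x n_wavelengths) residual matrix."""
    R = np.asarray(residuals, dtype=float)
    _, s, Vt = np.linalg.svd(R, full_matrices=False)
    return Vt[:n_components], s[:n_components]

def pattern_scaling_factors(residual, patterns):
    """Least-squares scaling factors for one residual (pattern rows are orthonormal)."""
    return patterns @ np.asarray(residual, dtype=float)

# Synthetic residuals (assumed values): one persistent pattern plus random noise.
rng = np.random.default_rng(1)
n_spectra, n_wl = 200, 60
true_pattern = np.sin(np.linspace(0.0, 6.0 * np.pi, n_wl))
true_pattern /= np.linalg.norm(true_pattern)
amplitudes = 2e-3 * (1.0 + 0.2 * rng.normal(size=n_spectra))
residuals = (np.outer(amplitudes, true_pattern)
             + 1e-4 * rng.normal(size=(n_spectra, n_wl)))
patterns, singular_values = leading_patterns(residuals)
factors = pattern_scaling_factors(residuals[0], patterns)
```

In this toy case the first pattern recovers the persistent signature (up to sign) and the remaining five describe noise.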
Although these principal components of the systematic residual signature should not be spectrally correlated to ozone, some correlation is found between the retrieved scaling factors and tropospheric ozone under conditions that are particularly challenging, such as at high latitudes in the Northern Hemisphere spring, below high columns of stratospheric ozone and where temperature is close to isothermal over a broad layer near the tropopause.
Some quality control of the retrieved product is necessary under these circumstances: if the line-of-sight zenith angle component of the total column ozone in step 1 (Band 1) is greater than 500 DU, the retrieved tropospheric column is unreliable and the pixel should not be used. These conditions usually coincide with an extensive near-isothermal tropopause. Since the information on the ozone profile below the stratospheric peak is principally derived from the temperature-dependent ozone spectral structure, such conditions are particularly unfavourable for high-precision retrievals in this region.

Retrieval of slit function width
In order to achieve the fit precision in the Huggins bands needed to retrieve tropospheric ozone, accurate knowledge of the spectral response function (or slit function) of individual detector pixels is required. The slit functions for the GOME-2 instrument were characterised prior to launch from laboratory measurements (Siddans et al., 2006), but it became apparent while in orbit that they had changed and continue to change (Cai et al., 2012). Failure to adequately characterise the changing slit function leads to a spurious trend (with respect to ozonesondes) in the retrieved ozone, particularly in the troposphere. To account for this, an offline slit function OE retrieval has been added to the fit of daily direct-sun measurements to the high-resolution solar reference spectrum (Chance and Kurucz, 2010), which is used to refine wavelength registration (Sect. 2.2). In addition to the series of wavelength polynomial coefficients for radiometric gain, radiometric offset and a wavelength shift/squeeze, the state vector has been extended to incorporate a single scaling factor for the full width at half maximum (FWHM) of all the slit functions in the Band 2 wavelength interval from 320 to 340 nm. This encompasses the wavelength range needed for ozone retrieval and makes an allowance for edge effects from Legendre polynomials. The retrieved FWHM scaling factor is shown in Fig. 1 from January 2007 to July 2012. Also shown is an example of how a slit function for a single detector pixel is modified by this parameter, demonstrating the effective narrowing of the slit functions with time in this spectral region. The overall change in FWHM is in good agreement with that suggested by others (e.g. Cai et al., 2012).

Error analysis and retrieval characterisation
An extensive simulation study of errors pertaining to ozone profile retrieval by the RAL scheme from the GOME-1 UV spectrometer was reported by Siddans (2003). This was based on retrieval simulations for a set of standard geophysical scenarios which had been defined for the GOME-2 Error Study (Kerridge et al., 2002), which had presented a detailed error budget based on information available at that time. The retrievals for the GOME-2 instrument in flight are found to behave broadly as predicted.

Retrieval characterisation and error analysis
Estimation of the averaging kernel for the three-step process needs to account for the fact that the off-diagonals of the ozone a priori covariance used in step 3 are different to the solution error covariance output from step 1. This is done by considering the sensitivity of the step 3 retrieval to changes in the a priori used in step 3, which in turn is related to the true profile by the averaging kernel for step 1, as well as the sensitivity of step 3 to the measurements used in that step:

$$A = A_3 + (I - A_3)\,A_1, \tag{10}$$

where A_1 and A_3 are the averaging kernel matrices for steps 1 and 3, considered in isolation, applying Eq. (9) to the matrices used in the respective steps, and I is the identity matrix. In this equation, the quantity (I − A_3) is the a priori gain matrix. In practice the equation is slightly complicated by the fact that the full state vector is not identical in the two bands (other state parameters are retrieved).
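The combination of step 1 and step 3 averaging kernels can be sketched numerically as follows, assuming the relation A = A_3 + (I − A_3) A_1 implied by the description above and ignoring the extra (non-ozone) state parameters. The function name and the diagonal example kernels are illustrative assumptions.

```python
import numpy as np

def combined_averaging_kernel(A1, A3):
    """Averaging kernel of a sequential retrieval whose step 3 prior is the step 1 result."""
    n = A3.shape[0]
    return A3 + (np.eye(n) - A3) @ A1

# Assumed toy kernels: step 1 captures half of the signal, step 3 captures 80 %.
A1 = 0.5 * np.eye(4)
A3 = 0.8 * np.eye(4)
A = combined_averaging_kernel(A1, A3)
```

Two limiting cases help to check the expression: if step 1 carries no information (A_1 = 0) the result reduces to A_3, and if step 1 is perfect (A_1 = I) the combined kernel is the identity.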
The equation can be extended to include mapping of the sensitivity of the step 3 retrieval to the surface albedo retrieved in step 2, but this has negligible impact on the averaging kernel for the ozone profile. The retrieval precision, or estimated standard deviation (ESD), as given by the square roots of the diagonals of the solution error covariance matrix, is generally in the few percent range in the stratosphere, increasing to a few tens of percent in the lowest retrieval levels.
S_c, the estimated covariance on subcolumn amounts, is given by

$$S_c = M^T S_x M, \tag{11}$$

where S_x is the solution covariance (from the final, third retrieval step) in volume mixing ratio (VMR) units and M is an n by (n − 1) matrix whose elements transform the mixing ratios on levels to subcolumn amounts between levels. The elements of M are all zero except M(i, i) and M(i + 1, i) (for i = 1, ..., n − 1), which have the necessary weights to perform the integration of the subcolumn, making the same assumption as the forward model (FM) for the variation of ozone with pressure between the retrieval levels. The ESD on the subcolumn amounts is given by the square roots of the diagonal elements of this matrix. Estimated retrieval noise errors can be similarly derived, applying Eq. (11) to S_n. An example is presented in Fig. 2 for a mid-latitude profile in Northern Hemisphere summer. In this case, the ESD on retrieval levels and layer subcolumns is typically much smaller than the a priori error throughout the profile. The retrieval noise error is around a factor of 2 smaller than the ESD. Figure 3c shows an example of how the ESD varies for a typical orbit cross-section; the ESD is also given as a ratio with the prior uncertainty in Fig. 3d. In general, at all altitudes and latitudes a reduction compared to the prior uncertainty is observed. An indication of ESD in the presence of cloud is given later in Sect. 4.
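The mapping from level covariance to subcolumn covariance can be sketched as below, assuming S_c = MᵀS_xM with M of shape n by (n − 1). For simplicity the weights use a trapezoidal rule in pressure rather than the forward model's log-linear interpolation, and the conversion constant is computed from standard physical constants; these choices, the names and the example profile are assumptions for illustration.

```python
import numpy as np

# Conversion from (ppmv x hPa) to Dobson units, assuming constant gravity and dry air.
G0 = 9.80665            # m s-2
M_AIR = 28.9644e-3      # kg mol-1
N_A = 6.02214076e23     # mol-1
DU = 2.6867e20          # molecules m-2 per Dobson unit
D = 100.0 * N_A / (G0 * M_AIR) * 1e-6 / DU   # approx. 0.789 DU per ppmv hPa

def subcolumn_operator(p_hpa):
    """n x (n-1) matrix M so that M.T @ x gives subcolumns between adjacent levels."""
    p = np.asarray(p_hpa, dtype=float)
    n = p.size
    M = np.zeros((n, n - 1))
    for i in range(n - 1):
        w = 0.5 * D * (p[i] - p[i + 1])      # trapezoidal weight (assumption)
        M[i, i] = w
        M[i + 1, i] = w
    return M

def subcolumn_covariance(S_x, M):
    """Covariance of subcolumn amounts from the level covariance."""
    return M.T @ S_x @ M

# Assumed example: four lowest retrieval levels, 30 % uncorrelated errors.
p = np.array([1000.0, 450.0, 170.0, 100.0])
x_ppmv = np.array([0.04, 0.06, 0.10, 0.50])
M = subcolumn_operator(p)
columns_du = M.T @ x_ppmv
S_x = np.diag((0.3 * x_ppmv) ** 2)
S_c = subcolumn_covariance(S_x, M)
esd_du = np.sqrt(np.diag(S_c))
```

Because adjacent subcolumns share a level, `S_c` has non-zero off-diagonal elements even when `S_x` is diagonal.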

Averaging kernels
Figure 2 also shows example averaging kernels for a mid-latitude ozone profile. The AKs for retrieval levels at the surface and in the mid-troposphere show pronounced peaks in the troposphere, while for higher levels the AKs become smoother. The AKs for retrieval levels in the troposphere have tails which extend much higher, indicating an apparent sensitivity of retrieved tropospheric ozone to true perturbations occurring in the stratosphere and mesosphere. However, variability in ozone number density at the altitudes where these tails are large is in practice very small, and therefore so is its influence on the tropospheric ozone retrieval. The influence of realistic variations in stratospheric ozone on retrieved tropospheric ozone is therefore usually small, and retrieval in the troposphere generally reflects realistic tropospheric variability (as evident in the comparison to model fields shown in Sect. 4.3). Where stratospheric perturbations are unusually large, these can cause spurious tropospheric signals. However, this sensitivity is described by the AKs provided along with the retrievals, so this effect can be properly taken into account when using the data. Figure 3a shows a retrieved ozone orbit cross-section, the improvement of retrieval error as compared to prior error and the combined surface and 450 hPa AKs. The largest reduction upon prior uncertainty in the example given here is found in the UTLS and lower stratospheric region (6–20 km) at mid-to-high latitudes, where it is reduced in places to less than 20 % of the prior error. In the tropics, the largest reduction is found in the mid-troposphere. The smallest reduction is found near the surface at high southern latitudes, which in the case of this orbit cross-section coincides with the Southern Ocean off the south coast of Australia, consistent with the averaging kernels for the lowest levels shown in Fig. 3b. It is apparent from this that there is some sensitivity to the lowest 3 km of the atmosphere, although the dominant contribution is from around 500 hPa. Most significantly, this AK has very little contribution from above 10 km and in most circumstances is quite independent of stratospheric ozone. The behaviour of AKs is critical to inter-comparisons with ozonesondes, for validation, and with model distributions, as discussed in the following section.

Validation and model inter-comparison
In this section the performance of the retrieval algorithm as applied to real measurements will be validated against ozonesondes and inter-compared with the global distribution predicted by a chemistry transport model.

GOME-2 ozonesonde comparison
The period of interest considered here is 2007 (start of mission operations) through 2008. This is principally because some of the characteristics of the instrument changed in September 2009 as a result of an instrument throughput test, and it is more straightforward to interpret the results from GOME-2 before that event. The WOUDC/NDACC (www.woudc.org and www.ndsc.ncep.noaa.gov) and SHADOZ (Thompson et al., 2003) ozonesonde databases are used for this analysis, adopting collocation criteria of < 200 km and < 2 h, with cloud screening (effective cloud fraction of < 0.2 and a cloud top pressure of > 700 hPa) unless otherwise stated. All biases are evaluated with respect to the sonde (retrieval minus ozonesonde).
Ozonesonde measurements are known to differ in accuracy with sensor type, time, altitude and launch site. They are currently the focus of effort by the global ozonesonde community to homogenise the quality of the products (SI2N, 2012). Spurious sondes have been eliminated in this analysis by testing whether each 4 km subcolumn for each sonde site is outside 4σ of the monthly mean for that site/subcolumn. This eliminates most aberrant sondes whilst retaining characteristic natural variability at the sonde location. Only sonde profiles that extend above 20 km are considered.
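The 4σ screening can be sketched for a single site, month and subcolumn as follows. The use of the sample standard deviation computed from the same set of sondes (including the candidate outlier), the function name and the example values are assumptions; the text does not specify these details.

```python
import numpy as np

def flag_aberrant(subcolumns_du, n_sigma=4.0):
    """True where a subcolumn lies more than n_sigma standard deviations from the mean."""
    c = np.asarray(subcolumns_du, dtype=float)
    return np.abs(c - c.mean()) > n_sigma * c.std(ddof=1)

# Assumed example: 30 plausible subcolumn values (DU) and one aberrant value.
values = np.append(30.0 + np.sin(np.arange(30)), 80.0)
flags = flag_aberrant(values)
kept = values[~flags]
```

A sonde would be rejected if any of its subcolumns were flagged in this way. Note that with this definition a single outlier can exceed 4σ only when the group contains roughly 18 or more sondes, so small groups would need separate handling.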

Subcolumns and application of averaging kernels
Sonde comparisons are performed in terms of the vertically integrated subcolumn between retrieval levels. Sondes are directly integrated using

$$C_i = D \int_{p_{i+1}}^{p_i} x \, \mathrm{d}p,$$

where $C_i$ is the subcolumn amount between vertical retrieval grid levels $i$ and $i+1$, $p$ is pressure, $x$ is ozone mixing ratio, and $D$ is a constant such that the resulting subcolumns are in Dobson units. GOME-2 subcolumns are first interpolated onto the forward model grid in a manner consistent with that used in the retrieval (see Sect. 2.2). Direct comparisons are made between the retrieved and sonde-derived subcolumns; however, it is also important to account for differences caused by retrieval smoothing using the averaging kernels. These are applied to ozonesonde profiles as described in Deeter et al. (2007), and we apply their Eq. (6) to estimate the volume mixing ratio (VMR) profile expected from the retrieval:

$$\hat{x} = x_a + A^S \left( x^S - x^S_a \right), \tag{12}$$

where $\hat{x}$ is the expected simulated retrieval, $x_a$ the a priori profile, and $x^S$ and $x^S_a$ are the sonde profile and the a priori profile, defined on the vertical grid at which the sonde profile is provided (indicated by superscript $S$). Each row of $A^S$ characterises the expected perturbation to the retrieval at a given level to perturbations in the supposed true profile, which is expressed on the relatively finely spaced sonde grid. Retrieval output files contain the mixing ratio averaging kernel $A$ (square matrix), given directly by Eq. (10), whose rows describe the effect of perturbations to the true profile on the retrieval grid. The transformation of $A$ to $A^S$ (which must account for the different thicknesses of the layers concerned) is achieved by first forming the layer thickness normalised averaging kernel $A^N$ using

$$A^N_{ij} = \frac{A_{ij}}{\Delta p_j},$$

where $\Delta p_j$ is the effective pressure thickness associated with retrieval level $j$. Here index $i$ refers to rows of the kernel (retrieval levels) while $j$ refers to columns (levels in the true profile). The rows of $A^N$ are then linearly interpolated to the vertical grid of the ozonesonde measurement. This is then scaled to give $A^S$ using

$$A^S_{ij} = A^N_{ij} \, \Delta p^S_j,$$

where $A^N_{ij}$ now denotes the interpolated kernel and $\Delta p^S_j$ is the effective thickness of sonde grid index $j$. Applying Eq. (12) will provide estimated mixing ratios on the retrieval grid with vertical smoothing consistent with the satellite vertical sensitivity. These are then integrated to give subcolumn amounts, in the same way as the retrieved mixing ratios (i.e. by first interpolating to the forward model grid in the appropriate manner).
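The kernel transformation and its application to a sonde profile can be sketched as follows. This is an illustration under stated assumptions rather than the RAL implementation: the effective pressure thickness is taken here as half the pressure span between neighbouring levels (the exact definition is not reproduced in this text), interpolation is linear in pressure with end values held constant, and all function names are hypothetical.

```python
def effective_thickness(p):
    """Assumed definition: half the pressure span between neighbouring levels
    (one-sided at the top and bottom). Requires at least two levels."""
    n = len(p)
    return [abs(p[max(j - 1, 0)] - p[min(j + 1, n - 1)]) / 2.0 for j in range(n)]


def interp_linear(x, xp, fp):
    """Linear interpolation of fp(xp) at x; end values held outside the grid."""
    pts = sorted(zip(xp, fp))
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, f0), (x1, f1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return f0 + (f1 - f0) * (x - x0) / (x1 - x0)


def kernel_on_sonde_grid(A, p_ret, p_sonde):
    """Normalise each row of A by retrieval-level thickness, interpolate the
    row to the sonde grid, then scale by the sonde-level thickness."""
    dp_ret = effective_thickness(p_ret)
    dp_sonde = effective_thickness(p_sonde)
    A_S = []
    for row in A:
        row_n = [a / dp for a, dp in zip(row, dp_ret)]       # row of A^N
        A_S.append([interp_linear(p, p_ret, row_n) * dps      # row of A^S
                    for p, dps in zip(p_sonde, dp_sonde)])
    return A_S


def simulate_retrieval(A, p_ret, x_a, p_sonde, x_sonde, x_sonde_a):
    """Expected retrieval: x_a + A^S (x^S - x^S_a), on the retrieval grid."""
    A_S = kernel_on_sonde_grid(A, p_ret, p_sonde)
    diff = [xs - xa for xs, xa in zip(x_sonde, x_sonde_a)]
    return [xa_i + sum(a * d for a, d in zip(row, diff))
            for xa_i, row in zip(x_a, A_S)]
```

Two limiting cases give a quick check of the sketch: with an identity kernel and a sonde grid equal to the retrieval grid the sonde profile is returned unchanged, and when the sonde profile equals the a priori the result is the a priori itself.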

Results
We first consider statistics from an ensemble of all ozonesondes at all sites, and then provide examples in separate latitude bands. Figure 4 shows the bias, standard deviation and correlation coefficient for a priori and retrieved ozone profiles calculated with respect to individual ozonesondes for the full ensemble. The bias is the ensemble average difference between each GOME-2 retrieved profile and the corresponding sonde profile. The fractional bias (also shown) is the bias divided by the mean sonde amount in that layer.
The ensemble standard deviation of differences between GOME-2 retrievals and corresponding sonde profiles is an independent estimate of the (random) error on an individual retrieved profile with respect to the ozonesonde (i.e. ground truth). The bias, fractional bias and standard deviation are also computed for the a priori profiles. When AKs are applied to the sonde profiles, the retrieval is seen to add substantial information to the a priori, except for the lowest subcolumn. This is also the case for the correlation coefficient and is due to atmospheric variability in this lowest layer as sampled by the sondes being generally smaller than the ESD. It is therefore important to note that ozonesondes only partially sample the global variability (as shown in Sect. 5). The retrieval bias with respect to sondes is rather small once AKs are applied (∼ 6 % in the lowest layer and < 5 % in higher layers), and substantially lower than that of the a priori. Figure 5 shows the geographical distribution of ozonesondes and the number of collocated profiles used in this comparison.
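The ensemble statistics described above can be written compactly for a single subcolumn. The sketch below follows the definitions given in the text (bias as the mean difference, fractional bias as the bias divided by the mean sonde amount, standard deviation of the differences, and a Pearson correlation coefficient); the function name and the use of the sample standard deviation are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def layer_statistics(retrieved, sonde):
    """Bias, fractional bias, standard deviation of differences and Pearson
    correlation for one subcolumn over an ensemble of collocations."""
    diffs = [r - s for r, s in zip(retrieved, sonde)]
    bias = mean(diffs)
    r_mean, s_mean = mean(retrieved), mean(sonde)
    cov = sum((r - r_mean) * (s - s_mean) for r, s in zip(retrieved, sonde))
    var_r = sum((r - r_mean) ** 2 for r in retrieved)
    var_s = sum((s - s_mean) ** 2 for s in sonde)
    return {
        "bias": bias,                        # retrieval minus sonde, in DU
        "fractional_bias": bias / s_mean,    # relative to mean sonde amount
        "std": stdev(diffs),                 # sample standard deviation
        "correlation": cov / sqrt(var_r * var_s),
    }
```

The same function can be applied to the a priori subcolumns in place of the retrieved ones, which gives the reference against which the information added by the retrieval is judged.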
Figure 6 shows histograms of the subcolumn error ratio, $ER_c$, defined as the difference between the retrieved ($C^{\mathrm{GOME2}}$) and sonde ($C^{\mathrm{sonde}}$) subcolumns normalised by the ESD$_c$ for the subcolumn:

$$ER_{c,i,k} = \frac{C^{\mathrm{GOME2}}_{i,k} - C^{\mathrm{sonde}}_{i,k}}{\mathrm{ESD}_{c,i,k}},$$

where ESD$_c$ denotes the estimated retrieval error for the subcolumn rather than a retrieval level. Subscripts denote individual layers ($i$) for each collocation ($k$). The analysis is performed both with and without averaging kernels applied to the sonde profile. When averaging kernels are not applied, the standard deviation of the histograms (after accounting for mean bias) is typically only slightly larger than 1, confirming that the reported ESDs provide a good estimate of the retrieval random error. When averaging kernels are applied to the sondes, standard deviations are reduced, although not by as much as might be expected considering that noise-only errors are around a factor of 2 smaller than ESDs (see Fig. 2).

In the 60-90° S band, the retrieval is influenced by an a priori profile (and its associated covariance) which is very unrepresentative of the ozone hole conditions occurring in the Antarctic spring stratosphere. There is seen to be a small persistent positive bias (+2-3 DU) in the stratosphere (< 100 hPa) in all other latitude bands.
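The error-ratio check shown in Fig. 6 can be sketched as below, assuming the ratio is the retrieved minus sonde subcolumn divided by the estimated standard deviation for that subcolumn. A standard deviation of the ratios close to 1 would indicate that the reported ESDs are a fair estimate of the random error. The function names are hypothetical.

```python
from statistics import mean, stdev


def error_ratios(c_gome2, c_sonde, esd_c):
    """Per-collocation error ratio for one layer:
    (retrieved subcolumn - sonde subcolumn) / ESD of the subcolumn."""
    return [(g - s) / e for g, s, e in zip(c_gome2, c_sonde, esd_c)]


def summarise(ratios):
    """Mean and sample standard deviation of the error-ratio ensemble."""
    return mean(ratios), stdev(ratios)
```

For example, `summarise(error_ratios(retrieved, sonde, esd))` returns the pair of numbers quoted per panel in the Fig. 6 caption, under the assumptions above.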

Retrieval performance in the presence of cloud
Retrievals of tropospheric ozone are affected by the presence of cloud. Extensive, thick cloud prevents photons penetrating to lower layers. As discussed in Sect. 2.2, the fitting of a surface albedo in Band 1 (270-308 nm) and in Band 2 (335 nm) partially accommodates cloud sun-normalised radiance and above-cloud scattering, so the remaining impact of cloud is obscuration of the ozone column beneath, as demonstrated in Fig. 8. Cloud information (effective cloud fraction and cloud top pressure) provided in the GOME-2 L1 data for each ground pixel from the FRESCO scheme (Fournier et al., 2004) is included with the RAL height-resolved ozone product, so as to allow filtering by users.

Comparison to the global chemical transport model TOMCAT
Whereas ozonesondes can provide a near-truth in situ at fixed locations, they cannot necessarily indicate how well a satellite product captures the regional or global spatial distribution of ozone. Once the product has been validated quantitatively with ozonesondes, spatial agreement with chemistry transport models (CTMs) can be indicative of this. CTMs are driven by realistic atmospheric circulation (e.g. ECMWF re-analysed winds) and emission inventories, but employ differing schemes for chemistry, surface deposition, boundary layer mixing, convection and other vertical transport processes. Intercomparison of satellite data with a CTM can nonetheless be informative to evaluate both. Here we present a comparison of GOME-2 lower-tropospheric ozone with the TOMCAT CTM. We focus our comparison on the lowest layer, which is the most challenging for ozone retrieval from satellite.

TOMCAT chemistry transport model
A full description of the TOMCAT CTM is given elsewhere (Arnold et al., 2005; Chipperfield, 2006, and summarised in Richards et al., 2013), but it is briefly outlined here. TOMCAT is a 3-D chemical transport model which is optimised to reproduce the composition of the global troposphere. The version used here has a horizontal resolution of approximately 2.8° × 2.8° and has been driven by ECMWF ERA-Interim temperature, winds and humidity (Dee et al., 2011). It operates on 31 hybrid sigma-pressure levels and the chemistry scheme and emission inventories used in this study are detailed in Richards et al. (2013). The model was spun-up for 6 months and then global O3 fields were output four times a day at 00:00, 06:00, 12:00 and 18:00 UT. Model fields were interpolated in time and space to the satellite sampling (MetOp has an overpass time of 09:30 LT) for 2008. Lower-tropospheric ozone retrieved from GOME-2 by the RAL scheme has previously been shown to have excellent agreement with TOMCAT, in particular for the NH summer Mediterranean region (Richards et al., 2013).
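As an illustration of the temporal part of this sampling, the sketch below interpolates six-hourly model output to the satellite observation time. It assumes linear interpolation between output times and an approximate conversion from local solar time to UT based on longitude; neither detail is stated in the text, the spatial interpolation is not covered, and the function names are hypothetical.

```python
def local_to_universal_hour(local_hour, lon_deg):
    """Approximate UT hour from local solar time and longitude in degrees
    east (15 degrees of longitude per hour)."""
    return (local_hour - lon_deg / 15.0) % 24.0


def interpolate_in_time(ut_hour, fields_by_hour):
    """Linearly interpolate a model value to ut_hour.

    fields_by_hour maps output hours (0, 6, 12, 18, 24) to a model value,
    where 24 stands for the 00:00 field of the following day."""
    hours = sorted(fields_by_hour)
    for h0, h1 in zip(hours, hours[1:]):
        if h0 <= ut_hour <= h1:
            w = (ut_hour - h0) / (h1 - h0)
            return (1.0 - w) * fields_by_hour[h0] + w * fields_by_hour[h1]
    raise ValueError("ut_hour outside the supplied output times")
```

For a 09:30 LT overpass at the Greenwich meridian the value is drawn from the 06:00 and 12:00 UT fields, whereas at 90° E the same local time falls between the 00:00 and 06:00 UT fields.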

Model comparison
Figure 9 compares GOME-2 with TOMCAT for the lowest retrieved subcolumn in August 2008. The GOME-2 data have been cloud-screened, based on cloud height and fraction from FRESCO in the L1b data, and GOME-2 AKs have been applied to the model. Geographical structure in the monthly mean distribution is seen to be represented quite consistently by GOME-2 and the model. In particular, there is seen to be agreement in locations of high ozone concentration over the Mediterranean region and Southeast China, which are typically found at this time of year, although peak values observed there by GOME-2 are higher than predicted by TOMCAT.
Consistency between GOME-2 and TOMCAT geographical distributions is indicated quantitatively by the standard deviation (4 DU) and correlation coefficient (0.66) for the August 2008 ensemble in Fig. 9. The global mean bias between GOME-2 and TOMCAT (∼ 0.8 DU) for August 2008 is comparable to that between GOME-2 and ozonesondes in this layer (∼ 1 DU) for the 2 years 2007-2008. Furthermore, the latitudinal dependence of the GOME-2 minus TOMCAT difference in Fig. 9 also mirrors that of the GOME-2 minus ozonesonde bias in Fig. 7, being positive at northern mid/high-latitudes and negative at southern mid-latitudes.

Model time series comparison
Figure 10 shows monthly means for the GOME-2 retrieval and its a priori, and the TOMCAT model (with GOME-2 spatial sampling) in four regions. These are the NH remote Pacific, the USA, the Mediterranean and eastern China. The remote Pacific in particular is not well sampled by ozonesondes. In the four regions selected, there is good agreement between GOME-2 and TOMCAT in the shape of the seasonal cycle in lower-tropospheric ozone. This is particularly the case for the USA and eastern China, where a double peak in the seasonal cycles is seen by both the model and the retrieval, but not the a priori. In the case of eastern China, a higher correlation is found between the model and the a priori than with GOME-2, and the former both predict lower absolute amounts of ozone in the annual cycle. However, the a priori does not capture the seasonal cycle that is present in both the model and the satellite record. In the Mediterranean, the summer peak is found to occur at a similar time in the retrieval and model but several months earlier in the prior.

Summary
The RAL ozone profile retrieval algorithm for nadir-viewing satellite UV spectrometers has been developed to have sensitivity to tropospheric as well as stratospheric ozone. This has been achieved by a three-step retrieval approach in which high fit precision (< 0.1 % RMS) is required in the third step to extract tropospheric information from the temperature-dependent Huggins bands (323-335 nm). The bias with respect to ozonesondes sampled worldwide over 2 years is of the order of 6 % (∼ 1 DU) in the surface to 450 hPa layer and < 5 % in the subcolumns above. The bias in part reflects the extent to which uncertainties in knowledge of the GOME-2 absolute UV (Hartley band) radiometry and (Huggins bands) slit function shape can be accommodated. The bias varies systematically with latitude/solar zenith angle. It is typically less than ±3 DU, except in the tropical UTLS region where there is a positive bias of up to 5 DU, due to smearing of the sharp change in ozone vertical gradient near the tropopause. This corresponds to a bias of less than ±20 % in the troposphere and +10 % in the tropical UTLS. As expected, the retrieval shows a negative bias in the troposphere in the presence of high or pervasive cloud because, for this validation exercise, cloud parameters have not been co-retrieved or explicitly modelled; their effects on UV sun-normalised radiance have been accommodated only through retrieval of an effective Lambertian albedo (and no ghost column has been added).
The GOME-2 retrieval and the CTM TOMCAT show agreement in the August 2008 monthly mean global distribution of lower tropospheric ozone and specifically in the location of high ozone concentrations over the Mediterranean and over Southeast China. Concentrations in the surface-450 hPa layer retrieved from GOME-2 are persistently higher at northern mid/high latitudes and lower at southern mid-latitudes than predicted by TOMCAT, a pattern which is consistent with the GOME-2 ozonesonde bias for 2007-2008. Significant developments to the GOME-2 retrieval scheme are now planned. These include: (a) updating to and evaluating performance with the latest ozone spectroscopy (e.g. Serdyuchenko et al., 2014), as this has been identified as potentially important by e.g. Liu et al. (2007) and others; (b) improved modelling of the slit function shape and related changes with time, which is expected to impact upon tropospheric ozone in particular; (c) improved handling of radiometric degradation occurring in both the Hartley and Huggins UV bands over the mission lifetime, for a more accurate multi-year time series; and (d) addition of the visible Chappuis bands as a fourth retrieval step, to increase ozone sensitivity in the lower troposphere over land, which cannot at present be achieved with UV measurements alone.

Figure 1. Variation with time of retrieved scaling factor for nominal FWHM of slit function (black solid). Red dashed lines indicate discontinuities associated with various in-orbit operations, including the second (and last) throughput test in September 2009. The inset panel shows an example of how the effective shape of the measured slit function is modified for the pixel centred at 317.5 nm, where the black line indicates start of operations (January 2007) and the pink line is the shape in January 2013.

Figure 2. The left panel shows averaging kernels derived in number density units on levels for a nadir pixel at 45° N on 25 August 2008. The averaging kernels themselves are unitless but the magnitude and shape of the off-diagonal elements are very different when evaluated in either VMR or number density. The centre panel shows the associated ESD, noise-only and a priori subcolumn errors, and the panel on the right the errors for the profile.

Figure 3. (a) An ozone cross-section on 25 August 2008 retrieved from the Band 2 (final) step for the nadir pixel. The orbit track is also indicated. (b) The combined surface and 450 hPa (circa 0 and 6 km) averaging kernels. (c) Relative retrieval error. (d) The associated ratio of retrieved to a priori error.

Figure 4. Statistical comparison of RAL GOME-2 ozone profiles with ozonesondes sampled worldwide for 2007-2008. Collocation criteria are given in the text. The standard deviations (left) and biases (centre) in GOME-2 minus ozonesonde values are in absolute (DU) units and as % of sonde value in the top and bottom rows, respectively. The top right panel shows the correlation coefficient. Points denote the mid-point of each subcolumn. In each case, results are shown for the a priori vs. sonde and for the retrieval vs. sonde with and without application of AKs to the ozonesonde profiles. Statistics have been derived from percentage difference calculated with respect to each individual ozonesonde.
Figure 7 shows the a priori and retrieval biases for subcolumns in Dobson units for different latitude bands as well as for the global average. Sonde agreement varies with latitude for a number of reasons, not least because of the changing vertical gradients and amount of ozone present. For the 450-170 hPa layer, the bias is seen to vary from +3 DU in the 30° S-30° N band to −3 DU in the 30-60° S, 60-90° S and 60-90° N bands. The bias exceeds +5 DU in the 60-90° S band for the 50-30 hPa and 30-20 hPa layers, which is due both to the limited vertical sensitivity and to the influence of the a priori profile on the retrieval.

Figure 6. Histograms of differences between retrieved and sonde layer amounts relative to the estimated standard deviation (ESD) on each layer, for the lower-most and second subcolumns (top and bottom), with and without averaging kernels applied (right and left). The mean/standard deviation values are as follows: −0.184/1.12 (top left), 0.2/1.09 (top right), 0.158/1.41 (bottom left), 0.135/1.12 (bottom right).

Figure 7. Bias with respect to ozonesondes as a function of latitude and pressure for subcolumns in Dobson units for the a priori (left) and retrieved (centre) profiles and for retrieved profiles with GOME-2 AKs applied to the sonde profiles (right). The pink lines indicate the averages over all latitude bands, for comparison to the black and green lines in the left hand panel of Fig. 4, which depict the same a priori and retrieval biases as % differences from the ozonesondes.

Figure 8. The lowest subcolumn ozone (surface to 450 hPa) differenced from ozonesonde subcolumn, without AKs applied and without any cloud clearing. In the presence of high/thick cloud where fewer photons can penetrate, there is less sensitivity to the lowermost ozone subcolumn.

Figure 9. (a) GOME-2 surface to 450 hPa layer ozone gridded (1.125) monthly mean for September 2008. Pixels have been strictly cloud-cleared such that only pixels with a cloud fraction of < 0.2 and cloud top pressure of > 700 hPa remain; (b) a priori for GOME-2 retrieval (all pixels); (c) TOMCAT model with satellite sampling; (d) TOMCAT model with GOME-2 averaging kernels also applied; (e) correlation of (a) and (c) with associated bias and standard deviation; (f) correlation of (a) and (d). The vertical and horizontal black lines in panels (e) and (f) indicate the respective standard deviations of those data sampled along the TOMCAT axis, and the numbers of points in log10 units are indicated by the colour bar.

Figure 10. Time series comparison of surface to 450 hPa ozone for four regions of TOMCAT (black), GOME-2 (green) and the GOME-2 retrieval a priori/climatology in 2008. Monthly correlation coefficient of TOMCAT and the a priori (red) and GOME-2 (green) are also given for each region. In all cases GOME-2 averaging kernels have been applied to TOMCAT. Bars and second axis indicate number of measurements in each month for each region.