Interactive comment on “Investigating the long-term evolution of subtropical ozone profiles applying ground-based FTIR spectrometry”

This study focuses on improvements in the application of FTIR remote sensing techniques to assess ozone trends as a function of altitude, using measurements performed at the subtropical site of Izaña from 1999 to 2010. The paper first presents a full description of the implementation of different ozone profile retrievals as well as their influence on ozone data quality. An in-depth discussion of retrieval sensitivity and capability, based on averaging kernel analysis and error estimation, is provided. The retrieved FTIR ozone profiles are then validated by comparing them with ECC sonde data corrected with simultaneous Brewer measurements, in order to analyse and discuss ozone trends and seasonality above subtropical stations.


Introduction
In the coming decades some kind of ozone recovery is expected; however, it is difficult to predict how, when and to what extent it will occur (Weatherhead and Andersen, 2006). How climate change will interact with ozone recovery is currently under discussion. The multiple interactions between the components of the chemistry-climate system complicate a clean attribution of changes in ozone to changes in ODSs (Ozone-Depleting Substances) and other factors such as the Brewer-Dobson circulation, anthropogenic emissions of greenhouse gases, stratospheric temperatures, etc. (WMO, 2011). For example, climate models predict an accelerated stratospheric circulation, leading to changes in the spatial distribution of stratospheric ozone and an increased stratosphere-to-troposphere ozone flux (Hegglin and Shepherd, 2009, and references therein). Furthermore, rising tropospheric temperatures can increase the amount of water injected into the stratosphere, thereby triggering ozone destruction (Anderson et al., 2012). In addition, the effect of the decrease in anthropogenic halogen abundances in the upper stratosphere since the mid-1990s has to be considered (WMO, 2011). In order to confirm or reject the different climate model simulations, consistent long-term observations of the vertical distribution of ozone are required. Since the expected signals are rather small (e.g., expected trends from −3 % to +1 % per decade between 1960 and 2100; Hegglin and Shepherd, 2009; Li et al., 2009), only high precision observations are useful.
Within the NDACC (Network for the Detection of Atmospheric Composition Change, e.g., Kurylo and Zander, 2000), high resolution solar absorption infrared spectra have been measured by ground-based FTIR (Fourier Transform InfraRed) spectrometers for up to two decades at globally distributed sites. It has been shown that these measurements can provide very high quality ozone total column amounts (Schneider and Hase, 2008; Schneider et al., 2008a; Viatte et al., 2011) and profiles (Schneider et al., 2008b). Owing to their long-term character and high precision, the FTIR data are very interesting for trend studies. Vigouroux et al. (2008) (updated in WMO, 2011) estimated ozone trends at several European NDACC FTIR sites. In this work, we examine in detail the FTIR error sources and discuss how they can affect the estimated ozone trends. We present three different FTIR ozone profile retrieval setups, including the setup applied by Vigouroux et al. (2008), and discuss their reliability for providing correct ozone trend estimates. This study is performed for the ozone super-site Izaña Observatory, where since 1999 the FTIR measurements have been performed in coincidence with several other high quality atmospheric ozone measurement techniques (e.g., Brewer spectrometer, Electro Chemical Cell (ECC) sondes, photometric in situ surface measurements).
The Izaña Observatory and its Ozone Program are described in Sect. 2. In Sect. 3, we present the three different FTIR retrieval setups, perform detailed theoretical error estimations and discuss the error sources that can affect trend estimations. In Sect. 4, we briefly discuss the quality of the ECC sonde data and in Sect. 5 we show a day-to-day comparison between the three different FTIR datasets and the ECC dataset. In Sect. 6, we present the ozone seasonality and the trends obtained at different altitudes from the different FTIR datasets and discuss their consistency with the values obtained for the ECC dataset. Finally, the main results are summarised in Sect. 7.

The Izaña Observatory and its ozone program
The Izaña Observatory (28.3° N, 16.5° W) belongs to the Spanish Meteorological Agency (AEMET). It is a subtropical high mountain observatory, located at 2.37 km altitude on Tenerife Island, typically above a temperature inversion layer that acts as a natural barrier for local pollution. Hence, it offers excellent conditions for the remote sensing of the upper atmosphere.
For many years the Izaña Observatory has been a WMO/GAW (World Meteorological Organisation/Global Atmosphere Watch) station and an NDACC site, monitoring a large variety of atmospheric constituents, among them ozone total column amounts and ozone profiles. Different ozone measuring techniques are applied: Brewer spectrometer, FTIR spectrometer, Differential Optical Absorption Spectroscopy (DOAS), in situ ultraviolet photometric analysers, and ECC sondes. In this study, we focus on ozone profiles, and in the following two subsections we briefly describe the techniques that can measure ozone concentrations at different altitudes above the Izaña Observatory.

FTIR program
Ground-based FTIR systems measure solar absorption spectra applying a high resolution Fourier Transform spectrometer. The FTIR activities at Izaña started in 1999 when, in the framework of a collaboration between AEMET and KIT (Karlsruhe Institute of Technology, Germany), a Bruker IFS 120M was installed at the Observatory. In January 2005 KIT scientists replaced this spectrometer with a Bruker IFS 120/5HR, which is one of the best performing FTIR spectrometers commercially available. During March and April 2005 both instruments measured side-by-side, which allows for documenting the consistency of both FTIRs.
Since 1999, the Izaña experiment has provided data to the NDACC network. Currently there are about 25 ground-based FTIR NDACC experiments. For NDACC, the solar absorption spectra are measured in the mid-infrared spectral region (between 740 and 4250 cm⁻¹, corresponding to 13.5 and 2.4 µm), which is covered by six individual measurements applying different filters in order to achieve an optimal signal-to-noise ratio. In this spectral region, the FTIR spectra are recorded using a potassium bromide (KBr) beamsplitter and two liquid nitrogen-cooled detectors: a mercury cadmium telluride (MCT) detector for wavenumbers below 1850 cm⁻¹ and an indium antimonide (InSb) photodiode for higher wavenumbers. For operational ozone measurements the FTIR spectra covering the 1000 cm⁻¹ region are measured with an aperture of 1.5 mm, which corresponds to a field-of-view of only 0.2°. Therefore, the FTIR instrument only analyses sunlight coming from the centre of the solar disc (diameter of 0.5°). In order to increase the signal-to-noise ratio, several scans with a high spectral resolution of 0.005 cm⁻¹ (maximum Optical Path Difference, OPDmax, of 180 cm) are co-added (8 for the ozone measurements). The measurement of one spectrum thus takes about 10 min. At Izaña the FTIR spectra are typically measured on two or three days per week.
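The spectral quantities quoted above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the factor 0.9 relating the nominal resolution to OPDmax is an assumed convention that is not stated in the text.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm1


def nominal_resolution_cm1(opd_max_cm: float, factor: float = 0.9) -> float:
    """Nominal spectral resolution in cm^-1 for a given maximum optical
    path difference. The factor 0.9 is an ASSUMED convention."""
    return factor / opd_max_cm


# Limits of the mid-infrared region and the MCT/InSb detector boundary
for nu in (740.0, 1850.0, 4250.0):
    print(f"{nu:7.1f} cm^-1 -> {wavenumber_to_wavelength_um(nu):5.2f} um")

print(f"OPDmax = 180 cm -> {nominal_resolution_cm1(180.0):.4f} cm^-1")
```

Under the assumed convention, an OPDmax of 180 cm reproduces the quoted resolution of 0.005 cm⁻¹.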

ECC and in situ surface program
The Ozone Sonde Program on Tenerife started in November 1992 and since March 2001 these ECC sonde activities have formed part of the NDACC. The sondes (type: Scientific Pump 6A) are launched weekly very close to the Izaña Observatory: from Santa Cruz de Tenerife (35 km northeast of the Observatory) and, since October 2006, from Güímar (15 km east of the Observatory). Smit et al. (2007) demonstrate that the expected uncertainty of these ECC sonde profiles is ±5-10 %.
Since 1987, in the framework of the GAW Program, in situ surface ozone has been monitored. During this period, different ultraviolet absorption instruments have been applied. Since 1999, two TEI analysers (Thermo Electron corporation environmental Instruments) have been in operation, continuously recording one-minute average ozone values. Zero checks with an activated-carbon absorber are performed on a daily basis to detect instrumental offset drifts. This Surface Ozone Program has been audited by the World Calibration Center for Surface Ozone, Carbon Monoxide and Methane every two or four years since 1996. The expected uncertainty is ±1 ppb (Zellweger et al., 2009). At the high altitude Izaña Observatory, the surface measurements are representative of the free troposphere and are thus well suited for validating the tropospheric ozone concentrations observed by the FTIR system (see also Sepúlveda et al., 2012).

Ozone retrieval strategy
The high resolution spectra allow an observation of the pressure broadening effect and, thus, the retrieval of trace gas profiles. The inversion problems faced in atmospheric remote sensing are in general ill-determined and the solution has to be properly constrained. An extensive treatment of this topic is given in the textbook by C. D. Rodgers (Rodgers, 2000).
We apply the retrieval code PROFFIT (Hase et al., 2004) for retrieving the FTIR ozone profiles and investigate three different retrieval setups (in the following referred to as retrieval setups A, B and C, see Table 1). All setups apply the spectral ozone microwindow suggested by Barret et al. (2002) (see Fig. 1), and the different ozone isotopologues (⁶⁶⁶O₃, ⁶⁸⁶O₃ and ⁶⁶⁸O₃) are retrieved on a logarithmic scale. Likewise, all setups use the same a priori profile, taken from an ECC sonde climatology calculated from measurements between 1996 and 2006 and approximated to the HALOE climatology above 30 km (Schneider et al., 2008b).
Setup A can be considered as the "NDACC" approach except for the logarithmic instead of the linear scale retrieval of ozone. This strategy has already been used for the ozone trend estimations at Izaña and Kiruna as presented in Vigouroux et al. (2008) (updated in WMO, 2011). Setup B is further refined by an additional temperature retrieval, for which we simultaneously fit four CO₂ microwindows between 962 and 970 cm⁻¹. Schneider and Hase (2008) demonstrated that such a temperature retrieval is very important when aiming for high quality ozone data. For the setups A and B the inversion problem is solved using an ad-hoc Tikhonov-Phillips slope constraint (TP1 constraint). This constrains the vertical profile slope and the absolute value for the uppermost atmospheric model altitude. The strength of the constraint is determined by starting with a weak constraint and then increasing it until we observe a significant increase in the residual of the spectral fit (L-curve criterion). This is different to setup C, for which an Optimal Estimation (OE) constraint instead of the ad-hoc TP1 constraint is used. In this case, the a priori covariance matrix, $S_a$, is taken from an ECC sonde climatology (Schneider et al., 2008b). In addition, setup C includes an inter-species constraint between the different ozone isotopologues (⁶⁶⁶O₃, ⁶⁸⁶O₃ and ⁶⁶⁸O₃, Schneider et al., 2006). As a priori for the ozone isotopologue ratios, we assume a heavy ozone enrichment of 10 % throughout the atmosphere (Johnson et al., 2000; Mauersberger et al., 2001). Setup C can be considered as the setup with the most realistic constraint, since the actual covariances of ozone and of the different ozone isotopologues are taken into account. Applying a realistic constraint facilitates a correct interpretation of the day-to-day variability in the measured spectra. Assuming that atmospheric trends will be largest for altitudes with largest atmospheric day-to-day

Table 1. Description of FTIR ozone retrieval setups. Note that the level of refinement increases from setup A to setup C. Footnote 1: four temperature microwindows (MW) with isolated CO₂ signatures [cm⁻¹]: 962.80-963.80, 964.25-965.25, 967.20-968.20, 968.60-969.60.

It is important to mention that we do not vary any of the a priori trace gas profiles with season. Thereby, all variability observed in our retrieved ozone profiles comes from the measurements. As a priori for the temperature retrievals, we use the daily radiosonde data (Vaisala RS92) up to 30 km and extend them with the NCEP (National Centers for Environmental Prediction) 12:00 UT daily temperature profiles. The radiosondes are launched twice per day (23:15 UT and 11:15 UT), just about 15 km southeast of the Izaña Observatory on the coastline.
For all retrievals, we apply the ILS (Instrumental Line Shape: the interferometer's modulation efficiency and phase error) as derived regularly from low pressure gas cell measurements by means of the code LINEFIT (Hase et al., 1999).
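The TP1 slope constraint used for setups A and B can be illustrated with a minimal sketch. It assumes the common textbook form $R = \alpha L_1^T L_1$, with $L_1$ the first-difference operator, plus an optional diagonal term on the uppermost level to mimic the additional constraint on the absolute value there. The function names, the parameters `alpha` and `alpha_top`, and the exact form are assumptions for illustration; this is not PROFFIT's actual implementation.

```python
def first_difference_operator(n):
    """(n-1) x n matrix L1 whose rows compute x[i+1] - x[i]."""
    return [[-1.0 if j == i else (1.0 if j == i + 1 else 0.0)
             for j in range(n)] for i in range(n - 1)]


def tp1_constraint(n, alpha, alpha_top=0.0):
    """Regularisation matrix R = alpha * L1^T L1 (assumed form).

    alpha_top adds a penalty on the absolute value of the uppermost
    level, which a pure slope constraint leaves unconstrained.
    """
    l1 = first_difference_operator(n)
    r = [[alpha * sum(l1[k][i] * l1[k][j] for k in range(n - 1))
          for j in range(n)] for i in range(n)]
    r[n - 1][n - 1] += alpha_top
    return r


def penalty(r, x, x_apriori):
    """Constraint term (x - xa)^T R (x - xa) of the cost function."""
    d = [xi - ai for xi, ai in zip(x, x_apriori)]
    n = len(d)
    return sum(d[i] * r[i][j] * d[j] for i in range(n) for j in range(n))


# A constant offset from the a priori has zero slope and is not penalised,
# whereas a change in the profile slope is.
r = tp1_constraint(3, alpha=1.0)
print(penalty(r, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
print(penalty(r, [0.0, 1.0, 2.0], [0.0, 0.0, 0.0]))
```

Increasing `alpha` step by step and monitoring the spectral fit residual would correspond to the tuning procedure described above.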

Vertical resolution of FTIR ozone profiles
The vertical structures that are detectable by a ground-based FTIR system are given by the averaging kernel matrix (avks, Â). The columns of this matrix describe how an atmospheric perturbation is smoothed out by the remote-sensing system. As an example, Fig. 2 depicts the avks columns for a typical measurement of the IFS 120/5HR and for setups A, B, and C. We can observe that the maxima of these response functions generally peak at the altitude of the perturbations: the green line describes the response to a 1.0 disturbance at 5 km and it peaks close to 5 km, the red line represents the response to a disturbance at 18 km and it peaks close to 18 km, etc. The different widths of the avks and the different sensitivities (sum along each row of the avks, Σ_row) are mainly due to the differences in the applied a priori constraint. For all setups, and for altitudes below 40 km, the sensitivity is better than 80 %. Beyond 40 km the sensitivity significantly decreases for setup C, whereas for the setups A and B it remains very close to 1.0 (which is a typical behaviour for TP1 constraints). Following Rodgers (2000), the number of independent layers of a retrieved trace gas profile can be quantified by the trace of the averaging kernel matrix, the so-called "number of degrees of freedom for signal" (DOFS). For ozone, a mean total DOFS of around four is observed for all setups (see Table 2), meaning that the FTIR system is able to resolve four independent atmospheric layers: the troposphere (2.37-13 km), the tropopause region (12-23 km), the lower/middle stratosphere around the ozone maximum (22-29 km) and the middle/upper stratosphere (28-42 km). We have highlighted the ozone kernels at the altitudes of 5, 18, 29 and 39 km as representative of these layers (see Fig. 2). Note that for all layers the achieved DOFS are typically larger than one, indicating that the FTIR system is sufficiently sensitive in these layers.

Table 2. Mean (M) and standard deviation (σ) of the DOFS time series of the retrieved ozone obtained from the Izaña IFS 120/5HR for all setups. These values are shown for each layer (2.37-13 km, 12-23 km, 22-29 km, 28-42 km) and for the total column (2.37-120 km).
The influence of the retrieval settings is also visible in the DOFS: for setup C the DOFS below 29 km is higher than for the setups A and B. Up to 29 km the setup C profiles have a better vertical resolution than the setup A and B profiles. This is a consequence of the rather loose constraint applied for setup C at these altitudes (the constraint of setup C is based on real ozone data, which show high variability from 15 to 25 km due to a vertically shifting tropopause). In addition, including a simultaneous temperature fit improves the retrieval quality (i.e., reduces the residuals of the fitted spectra), thus allowing for a more detailed interpretation of the spectra (see also the increased DOFS when including the temperature fit).
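The two diagnostics used in this section, the sensitivity (row sums of the averaging kernel matrix) and the DOFS (its trace, in total or restricted to the levels of one layer), can be sketched as follows. The 4 × 4 matrix is a made-up toy example and does not represent a real Izaña averaging kernel; the function names are hypothetical.

```python
def sensitivity(avk):
    """Sum along each row of the averaging kernel matrix."""
    return [sum(row) for row in avk]


def dofs(avk):
    """Degrees of freedom for signal: trace of the averaging kernel matrix."""
    return sum(avk[i][i] for i in range(len(avk)))


def partial_dofs(avk, levels):
    """DOFS contributed by a subset of altitude levels (one layer)."""
    return sum(avk[i][i] for i in levels)


# Toy averaging kernel on four altitude levels (illustrative values only)
avk = [
    [0.5, 0.2, 0.0, 0.0],
    [0.3, 0.4, 0.2, 0.0],
    [0.0, 0.3, 0.5, 0.2],
    [0.0, 0.0, 0.2, 0.6],
]

print("sensitivity per level:", sensitivity(avk))
print("total DOFS:", dofs(avk))
print("DOFS of the two lowest levels:", partial_dofs(avk, [0, 1]))
```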
When interpreting the FTIR ozone profile time series it is important to consider that trends in the sensitivity of the remote-sensing system (avks or DOFS) can influence the ozone trend: with decreasing DOFS, gradually increasing differences between the a priori ozone amounts used as constraint and the real ozone amounts might be underestimated, i.e., the ozone trends will be underestimated. Furthermore, there might be a bias between the climatological ozone data (the a priori) and the FTIR ozone data due to systematic error sources like spectroscopic line parameters. In this case, the magnitude of the bias will decrease with decreasing DOFS, thereby leading to a trend even though real atmospheric ozone remains stable. The variability and the drifts in the DOFS can be observed in Fig. 3, which shows the time series of the total DOFS for setup C. The apparent drift between 1999 and 2004 means that trends estimated for this time period have to be treated with care. The DOFS time series also shows some kind of annual cycle, which is strongly anti-correlated with the annual cycle of the ozone slant column amounts (for low slant columns, i.e., less saturated ozone lines, the DOFS is higher than for high slant columns, i.e., partially saturated ozone lines). Likewise, the change from the IFS 120M to the IFS 120/5HR instrument in 2005 can be observed in the DOFS time series. As expected, we observe that the DOFS obtained from IFS 120M spectra are smaller than the DOFS obtained from the IFS 120/5HR spectra: the total DOFS values differ by 7 %. This difference is smaller in the troposphere (6-7 %) than in the stratosphere (up to 10 %).

Error estimation
The theoretical error estimation is analytically performed by the retrieval code PROFFIT (Hase et al., 2004). It is based on the formalism suggested by Rodgers (2000). We consider three error types: (a) errors due to uncertainties in the input parameters (instrumental characteristics, spectroscopy data, etc.), (b) the smoothing error, and (c) errors due to measurement noise.
As uncertainties in the input parameters we assume the values listed in Table 3. The uncertainties are split into statistical and systematic contributions, 80 % and 20 %, respectively, except for spectroscopic parameters (line strength and pressure broadening coefficient), for which the entire uncertainty in the input parameter is systematic. The assumptions of Table 3 are reasonable for the IFS 120/5HR, but are very likely too optimistic for the IFS 120M (see, for instance, the discussion in Sect. 3.4). Hence, Table 3 only describes the errors for the IFS 120/5HR. For the IFS 120M the errors due to LOS (Line Of Sight) and ILS uncertainties are larger by a factor of 2-3.
The propagation of uncertainty sources for a typical measurement of the IFS 120/5HR, and using the different retrieval setups, is displayed in Fig. 4. The error profiles are shown as the square root of the diagonal elements of the error covariance matrix for the different error sources considered (see Table 3). The error covariance matrices are calculated following the matrix multiplication formalism suggested by Rodgers (2000). The smoothing error (SE), associated with the smoothing of the real vertical distribution of ozone by the FTIR measurement process, is the leading error for all setups. Its covariance is $(\hat{A} - I)\,S_a\,(\hat{A} - I)^T$, where $I$ is the identity matrix, $\hat{A}$ is the averaging kernel matrix, and $S_a$ is the assumed a priori covariance of atmospheric ozone. We use the same $S_a$ matrix for the three setups, which is obtained from an ECC sonde climatology (Schneider et al., 2008b). Note that the inverse of this matrix ($S_a^{-1}$) has been used for the optimal estimation constraint of setup C. The SE reaches about 40 % in the tropopause region, where the ozone concentrations are very variable and the profile might be highly structured. The FTIR system is not able to resolve such fine vertical structures. Excluding the smoothing error, below 20 km the random errors are dominated by the measurement noise, the temperature and the ILS uncertainties. Above 20 km and for setup A the error due to temperature uncertainties notably increases, reaching about 5 % at 40 km. In contrast, for the setups retrieving the temperature (setups B and C) the respective errors are lower than 2.5 %.
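As a minimal illustration of the smoothing error expression given above, the following sketch evaluates $(\hat{A} - I)\,S_a\,(\hat{A} - I)^T$ for small matrices and takes the square root of the diagonal as the error profile. The matrices are toy values chosen for readability, not Izaña data, and the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def smoothing_error_covariance(avk, s_a):
    """(A - I) S_a (A - I)^T for averaging kernel A and a priori covariance S_a."""
    n = len(avk)
    d = [[avk[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return matmul(matmul(d, s_a), transpose(d))


def error_profile(cov):
    """Square root of the diagonal elements of a covariance matrix."""
    return [math.sqrt(max(cov[i][i], 0.0)) for i in range(len(cov))]


# Toy case: level 0 is only half resolved, level 1 is perfectly resolved
avk = [[0.5, 0.0], [0.0, 1.0]]
s_a = [[4.0, 0.0], [0.0, 9.0]]
print(error_profile(smoothing_error_covariance(avk, s_a)))
```

A perfectly resolving system (identity averaging kernel) gives zero smoothing error, whereas a system with no sensitivity at all returns the full a priori variability.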
As a summary, Table 4 shows the random error budget for the different partial column amounts and for the total column amount. It lists the total random errors (TRE), estimated as the root-square-sum of all parameter errors (TPE, input parameters and measurement noise) and the smoothing error. The significant contributors to the TPE (ILS, temperature and measurement noise) are also shown. Note that the inclusion of a simultaneous temperature fit (setups B and C) significantly reduces the random error associated with the temperature for all layers. For setup A, and especially for the higher layers (22-29 km and 28-42 km), the temperature uncertainty accounts for most of the error on the ozone partial columns. This illustrates that a simultaneous temperature retrieval is important when aiming at high quality middle stratospheric ozone data (Schneider and Hase, 2008). Applying the retrieval setup C, the FTIR technique provides ozone partial columns with an overall precision of better than 3 % for the tropopause and middle/upper stratosphere and of better than 6 % for the troposphere.
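The root-square-sum combination of independent error contributions described above can be sketched as follows. The percentages are hypothetical placeholders and are not values from Table 4.

```python
import math


def root_sum_square(*errors):
    """Combine independent error contributions in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Hypothetical relative errors in percent (NOT the values of Table 4)
ils, temperature, noise = 1.0, 0.5, 1.5
smoothing = 3.0

tpe = root_sum_square(ils, temperature, noise)  # total parameter error
tre = root_sum_square(tpe, smoothing)           # total random error
print(f"TPE = {tpe:.2f} %, TRE = {tre:.2f} %")
```

Because the terms add in quadrature, the largest contribution dominates the total: in this toy case the smoothing error alone accounts for most of the TRE.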
Regarding setup B and C, the spectroscopic parameters are responsible for most of the systematic errors, whereas for setup A the temperature uncertainty becomes an important systematic error source, especially above the troposphere.
High quality measurements are very important for trend studies, since they minimise possible artificial trends caused by drifts in the error sources. For example, a simultaneous temperature fit minimises the artificial trend that might be caused by a drift in the temperature uncertainty (e.g., it might be −1 °C in 2000 and gradually improve to ∼0 °C in 2010). Another example is the ILS uncertainty (see Fig. 4). If a possible drift in the ILS is not adequately considered in the retrieval, an artificial trend will be the consequence (in the following subsection, we demonstrate that it is very important to regularly monitor the ILS by laboratory cell measurements). Furthermore, a realistic constraint assures a correct interpretation of the variability as seen in the measured spectra, thereby leading to ozone data of the best possible quality (compare the error budgets of setups B and C).

Long-term consistency of the ILS
Figure 4 shows that the ILS uncertainties are an important error source (in particular in the middle and upper stratosphere). When aiming at a consistent long-term quality of ozone profiles, a continuous and precise documentation of the ILS is mandatory. Therefore, at Izaña we make regular low pressure N₂O cell measurements. These measurements allow for retrieving the actual ILS by means of the LINEFIT code.

Table 4. Estimated random errors relative to actual ozone partial columns and total column [%] for the Izaña IFS 120/5HR for all setups and for the different layers.

Comparison between IFS 120M and IFS 120/5HR
During March and April 2005 both instruments (IFS 120M and IFS 120/5HR) measured side-by-side. The comparison between these coincident measurements is displayed in Fig. 6, where the ozone partial columns as obtained from retrieval setup C are shown.
The agreement is excellent, except for the highest layers, where the FTIR data are more sensitive to instrumental uncertainties (ILS and measurement noise, see Fig. 4). For example, we observe a mean ratio between the IFS 120/5HR and 120M data of 0.98 ± 0.39 × 10⁻² (±1 standard error of the mean value) with a correlation of 99 % in the 2.37-13 km layer and of 0.98 ± 0.65 × 10⁻² with a correlation of 88 % in the 28-42 km layer. These values are for setup C results. When comparing the IFS 120M and IFS 120/5HR results obtained by retrieval setups A and B, we find more scatter and lower correlation coefficients at the highest altitudes. This is expected from the error estimation (more uncertainty in setup A and B than in setup C retrieval data, see Table 4). Note also that higher errors are expected for the IFS 120M ozone retrievals due to its higher ILS uncertainties (see Fig. 5) and its higher measurement noise and, consequently, lower sensitivity. The lower sensitivity is clearly observed by comparing the DOFS time series of retrieved ozone from the two spectrometers (see Fig. 3).
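The comparison statistics quoted above (mean ratio, standard error of the mean and correlation) can be computed as in the following sketch. The partial column values are invented for illustration, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def ratio_statistics(columns_a, columns_b):
    """Mean of the ratios a/b and the standard error of that mean."""
    ratios = [a / b for a, b in zip(columns_a, columns_b)]
    return mean(ratios), stdev(ratios) / math.sqrt(len(ratios))


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented coincident partial columns in DU (illustration only)
hr_columns = [30.1, 28.4, 33.0, 31.2, 29.5]
m_columns = [30.6, 29.1, 33.5, 32.0, 30.0]

ratio, sem = ratio_statistics(hr_columns, m_columns)
r = pearson_correlation(hr_columns, m_columns)
print(f"mean ratio = {ratio:.3f} +/- {sem:.4f}, correlation = {100 * r:.0f} %")
```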
We decided not to correct the ozone partial column time series from the IFS 120M (1999-2004) by the bias derived during the two-month side-by-side intercomparison period. We think that this could introduce artificial trends, since a two-month intercomparison period in 2005 cannot be perfectly representative of the whole IFS 120M time series (1999-2005). Instead, we document the long-term consistency of the FTIR data by an intercomparison with the independent ECC sonde time series (see Sect. 5).
Table 5. Statistics of coincident measurements from the IFS 120/5HR and IFS 120M ozone partial columns for each setup and layer (N = 19).

Consistency of ECC sonde time series
Before using the ECC data as reference for empirically assessing the quality of the FTIR ozone data and their representativeness for annual cycles and trends, it is very important to check the consistency of the ECC ozone time series.In this section, we use the coincident measurements of ozone total column from Brewer spectrometers and of surface ozone from in situ analysers for empirically documenting the quality and the long-term stability of the ECC sonde dataset.

ECC sonde vs. Brewer
The consistency and quality of the ECC sonde ozone profile ($x^{\mathrm{ECC}}_{\mathrm{org}}$) time series can be estimated by comparing it to independent and coincident very high quality measurements of ozone total amounts. At the Izaña Observatory such high quality measurements are performed by the FTIR system and by Brewer spectrometers. The Brewers have been operative since May 1991 and, like the ECC sonde and FTIR Programs, they have been part of NDACC since March 2001. Furthermore, since November 2003 they represent the Regional Brewer Calibration Center for Europe (http://www.rbcc-e.org/) of WMO/GAW, which guarantees the high quality of their ozone total column measurements (better than 1 %, Redondas and Cede, 2006).
The ECC sondes normally burst between 30 and 34 km. In order to homogenise the study, we only consider the ECC data measured up to 29 km. Thus, the ozone total columns from the ECC sondes ($\mathrm{TC}^{\mathrm{ECC}}_{\mathrm{org}}$) have been calculated by integrating the ozone profiles up to 29 km and adding a typical residual ozone column up to the top of the atmosphere. This residual has been estimated from the mean difference between the daily Brewer total ozone columns ($\mathrm{TC}^{\mathrm{Brewer}}$) and the ECC ozone partial amounts up to 29 km ($\mathrm{PC}^{\mathrm{ECC}}_{\mathrm{org}}$) for all 515 Brewer/ECC coincidences from 1999 to 2010 (for more details see Schneider et al., 2008b). We found a mean and a standard deviation (1σ) for the ozone residual of 81.5 and 10.6 DU, respectively. Note that in our study, we apply Brewer measurements in order to remain independent from the FTIR data.
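The Brewer total ozone columns can be used to correct the ECC sonde profiles ($x^{\mathrm{ECC}}$) by:

$$x^{\mathrm{ECC}} = \mathrm{CF} \cdot x^{\mathrm{ECC}}_{\mathrm{org}} \qquad (1)$$

where $\mathrm{CF} = \mathrm{TC}^{\mathrm{Brewer}} / (\mathrm{PC}^{\mathrm{ECC}}_{\mathrm{org}} + 81.5)$ is the daily correction factor. Schneider et al. (2008b) illustrated that this correction can significantly improve the quality of the ECC data.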
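A minimal sketch of this correction follows. It assumes that the correction simply scales the original ECC profile by the daily factor CF, with the 81.5 DU residual added to the ECC partial column as described above. The function names and the numerical values in the example are hypothetical.

```python
RESIDUAL_DU = 81.5  # mean residual ozone column above 29 km quoted in the text


def correction_factor(tc_brewer_du, pc_ecc_org_du, residual_du=RESIDUAL_DU):
    """Daily correction factor CF = TC_Brewer / (PC_ECC_org + residual)."""
    return tc_brewer_du / (pc_ecc_org_du + residual_du)


def correct_profile(profile_org, cf):
    """Scale every level of the original ECC profile by CF (assumed form)."""
    return [cf * value for value in profile_org]


# Hypothetical coincidence: Brewer total column and ECC column up to 29 km
cf = correction_factor(tc_brewer_du=290.0, pc_ecc_org_du=200.0)
print(f"CF = {cf:.3f}")
print(correct_profile([40.0, 55.0, 120.0], cf))
```

A CF above one means that the uncorrected sonde underestimates the Brewer total column, and vice versa.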
The CF time series is displayed in Fig. 7a. We observe a jump in 2005. In fact, the mean CF is 1.02 ± 0.03 (±1σ) and 0.98 ± 0.03 (±1σ) for the periods 1999-2004 and 2005-2010, respectively. Such a systematic change of about −4 % between the ECC records can be expected when the sensing solution type or the ECC sonde type is changed (Smit et al., 2007, and references therein). During 2005 the sensing solutions in the ECC sondes were replaced by a new batch, but the same manufacturer's operating procedures and ratio of cathode sensing solutions have been kept during the whole period (1999-2010). Nonetheless, in the light of the above results, we might assume that some change in the sensing solutions was introduced by using the new batch. The discontinuity in the CF time series is also identified by a rank order change point test (Lanzante, 1996; Romero et al., 2011). This nonparametric method, based on the ranks of the monthly values from a time series, is not particularly affected by gaps and outliers in the time series.
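The idea behind a rank-based change point test can be illustrated with the following sketch. It assumes a Wilcoxon-type rank-sum formulation: the candidate change point is the position where the cumulative rank sum deviates most from its expected value, and its significance is judged from a normal approximation with continuity correction. This is a simplified, single-change-point illustration and not necessarily the exact procedure applied in the study; the function names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_change_point(values):
    """Return (n1, z): the change is located after the n1-th value,
    and z is the normal-approximation test statistic."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    ranks = average_ranks(values)
    best_n1, best_dev, best_sum = 1, -1.0, 0.0
    rank_sum = 0.0
    for n1 in range(1, n):
        rank_sum += ranks[n1 - 1]
        deviation = abs(2.0 * rank_sum - n1 * (n + 1))
        if deviation > best_dev:
            best_n1, best_dev, best_sum = n1, deviation, rank_sum
    n2 = n - best_n1
    expected = best_n1 * (n + 1) / 2.0
    spread = math.sqrt(best_n1 * n2 * (n + 1) / 12.0)
    correction = 0.5 if best_sum < expected else -0.5  # continuity correction
    return best_n1, (best_sum - expected + correction) / spread


# Invented monthly CF values with a drop after the sixth value
cf_series = [1.03, 1.01, 1.02, 1.04, 1.00, 1.05,
             0.98, 0.97, 0.99, 0.96, 0.95, 0.94]
position, z = rank_change_point(cf_series)
print(f"change after value {position}, z = {z:.2f}")
```

Because only ranks enter the statistic, a single outlier can shift a rank sum by at most n − 1, which is why such tests are robust against outliers.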
The time series of ECC ozone total columns, without and with correction ($\mathrm{TC}^{\mathrm{ECC}}_{\mathrm{org}}$ and $\mathrm{TC}^{\mathrm{ECC}}$, respectively), are displayed in Fig. 7b. The corrected ECC records benefit from the synergies of the two techniques: the high vertical resolution of the ECC sondes and the high precision of the Brewer measurements.

ECC sonde vs. surface
At the Izaña Observatory the tropospheric surface ozone is also monitored by photometric in situ analysers. This surface dataset can also be used for documenting the long-term consistency of the ECC sonde profiles at Izaña's altitude.
In order to evaluate possible drifts in the ECC sonde dataset, we examine the time series of the differences in the monthly means between the ECC and the photometric surface data (Fig. 8). To do so, the photometric surface data were averaged over 30 min around the ECC measurements. For the ECC data corrected according to Eq. (1), the drift, i.e., the trend in the differences, is not significant (0.49 ± 0.51 % yr⁻¹). On average, we find a mean bias and a scatter (1σ) of 10 % and 15 %, respectively, between the two techniques. Note that, if the ECC profiles are not corrected, we observe a significant drift in the difference between the ECC and the photometric surface data (0.94 ± 0.51 % yr⁻¹), confirming the importance of correcting the ECC time series before using it for long-term studies.
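A drift estimate of this kind can be sketched as a linear fit to the time series of differences. The text does not state how the quoted uncertainties are obtained, so the ordinary least squares standard error used below is an assumption, as is the reading of the quoted "±" value as the half-width of the confidence interval in `is_significant`. All names are hypothetical.

```python
import math


def linear_trend(times, values):
    """Ordinary least squares slope and its standard error."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean)
                for t, v in zip(times, values)) / sxx
    intercept = v_mean - slope * t_mean
    residual_ss = sum((v - (intercept + slope * t)) ** 2
                      for t, v in zip(times, values))
    return slope, math.sqrt(residual_ss / (n - 2) / sxx)


def is_significant(trend, uncertainty):
    """A trend is taken as significant if it exceeds its quoted uncertainty
    (ASSUMPTION: the uncertainty is the confidence interval half-width)."""
    return abs(trend) > uncertainty


# The two drift values quoted in the text, in % per year
print(is_significant(0.49, 0.51))  # corrected ECC data
print(is_significant(0.94, 0.51))  # uncorrected ECC data
```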
Both the Brewer and the photometric surface measurements identify a jump in the ECC time series.This jump would significantly affect the estimated ozone trends as well as the inter-comparison with another technique.Therefore, in the subsequent analysis we will exclusively apply the corrected ECC sonde time series (corrected according to Eq. 1).

Validation of FTIR ozone profiles
The corrected ECC sonde time series can be used for evaluating the quality of the FTIR ozone profiles.To do so we have to consider that the vertical resolution of the ECC sonde and the FTIR profiles are rather different.Therefore, we only compare the layers that are sufficiently well detectable by the ground-based FTIR system, i.e., the ozone partial columns for the layers 2.37-13 km, 12-23 km and 22-29 km (the DOFS for all these layers is typically larger than one).Since the ECC sonde profiles are not smoothed by the FTIR's avks (e.g., Schneider et al., 2008b), the advantage of the improved vertical resolution observed for the more sophisticated retrieval setups can directly be validated.Even more important, not smoothing the ECC sonde data with the FTIR's avks assures that both the FTIR and the ECC time series remain completely independent and, for instance, the ECC sonde trends become not influenced by possible trends of the FTIR's avks (see Fig. 3).Here we compare two completely independent datasets, which would not be the case if we smooth the ECC data by the FTIR's avks.
The straightforward comparison of ozone partial columns from FTIR and corrected ECC sondes is rather satisfactory, as shown in Fig. 9.We find a mean bias between −8 % and −4 % in troposphere (2.37-13 km), about 1 % in the tropopause region (12-23 km) and about 7 % in middle stratosphere (22-29 km) for all retrieval setups, while the scatter is between 6 % and 9 % for the two first layers and only of 3 % in middle stratosphere (see Fig. 9a and b).These results rather agree with the expected uncertainty for the ECC sondes given by Smit et al. (2007) (±5-10 %) and with our FTIR theoretical error estimation (Sect.3.3).Furthermore, they are in good agreement with other intercomparison studies using ECC sonde data (Calisesi et al., 2005;Nair et al., 2011 and references therein).It illustrates the good agreement between the ECC sonde and FTIR techniques.As example, Fig. 10 shows the direct comparison between the coincident FTIR (setup C) and ECC sonde partial columns from 1999 to 2010 (N = 262).
Part of the discrepancy observed between FTIR and ECC sonde measurements is due to the smoothing error (recall Table 4). Smoothing the ECC sonde profiles with the FTIR avks improves the agreement for all setups by reducing the scatter, on average, to 7 %, 5 % and 2 % for the 2.37-13 km, 12-23 km and 22-29 km layers, respectively. Other sources of discrepancy might be errors in the FTIR and/or ECC data and the observation of different air masses by the FTIR experiment, on the one hand, and the ECC experiment, on the other (Schneider et al., 2008b). The systematic discrepancies might be attributable to systematic ECC and FTIR errors caused, for instance, by uncertainties in the spectroscopic line parameters.
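Smoothing a high-resolution profile with averaging kernels is commonly written as x_smoothed = x_a + A (x − x_a), where x_a is the a priori profile and A the averaging kernel matrix. The sketch below applies this relation, optionally on a logarithmic scale since the avks here are expressed as ln[O3]. The function names, the assumption that the sonde profile has already been interpolated to the retrieval grid, and the `log_scale` switch are illustrative choices, not the implementation used in this work.

```python
import numpy as np


def smooth_with_avk(x_sonde, x_apriori, avk, log_scale=True):
    """Degrade a sonde profile to the FTIR vertical resolution.

    x_sonde   : sonde ozone profile on the retrieval grid (assumed regridded)
    x_apriori : a priori ozone profile on the same grid
    avk       : averaging kernel matrix (n_levels x n_levels)
    """
    x_sonde = np.asarray(x_sonde, dtype=float)
    x_apriori = np.asarray(x_apriori, dtype=float)
    avk = np.asarray(avk, dtype=float)
    if log_scale:
        # Kernels given for ln[O3]: apply them to logarithmic differences.
        ln_smoothed = np.log(x_apriori) + avk @ (np.log(x_sonde) - np.log(x_apriori))
        return np.exp(ln_smoothed)
    return x_apriori + avk @ (x_sonde - x_apriori)


def dofs(avk):
    """Degrees of freedom for signal: the trace of the averaging kernel matrix."""
    return float(np.trace(np.asarray(avk, dtype=float)))
```

With an identity kernel the sonde profile is returned unchanged, and with a zero kernel the result collapses to the a priori, which are the two limiting cases of a perfect and an insensitive retrieval.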
It is important to note that the scatter between FTIR and ECC concentrations decreases as the level of refinement of the setups increases, especially in the troposphere and the tropopause region. Due to the reduction of the temperature error and the use of a realistic constraint, setup C provides the FTIR profiles that best agree with the ECC sonde profiles. This is in good agreement with our theoretical estimation as presented in Table 4 and Fig. 4. As an addendum to the comparison with the ECC data, Fig. 9 shows the comparison with the Brewer data that were used in Schneider et al. (2008a). We observe that the scatter between the FTIR and Brewer ozone total column amounts is significantly reduced (from 1.06 % to 0.71 %) as the FTIR setups become more refined, confirming the results of the FTIR-ECC comparison. Similar results were found for the comparison between FTIR and Izaña's surface data, i.e., the best agreement is obtained for retrieval setup C. We find a scatter (1σ) of 20 % for setup C and of 23 % for setup A. The scatter between the ECC sonde and Izaña's surface data is smaller (15 %, recall Sect. 4.2). The slightly increased scatter when comparing with the FTIR is due to the FTIR's smoothing error.
In order to evaluate a possible artificial trend caused by the FTIR's DOFS time series, we analyse the time series of the differences between smoothed and unsmoothed ECC sonde data. This can be done for the 262 occasions during the 1999-2010 time period on which FTIR and ECC measurements were performed in coincidence. For all setups, we observe that there are no significant trends in the 2.37-13 km and 12-23 km layers, but that there is a significant trend (at the 95 % confidence level) for the 22-29 km layer. For example, for setup C, the trend is 0.12 ± 0.10 DU yr^-1 (i.e., 0.11 ± 0.09 % yr^-1), which might indicate that drifts in the avks affect the Izaña 1999-2010 middle/upper stratospheric FTIR ozone trend. These findings support our strategy of not smoothing the ECC sonde data for trend comparisons. Otherwise, both instruments might show similar trends simply because there is a trend in the avks. However, we also have to consider that the data subset of ECC/FTIR coincidences is not well representative of the whole FTIR time series (the number of coincidences is rather small and most of them occur from April to October).

Ozone trends
Obtaining observational evidence of the projected trends in the vertical distribution of ozone is a difficult task. The trends are rather small, and small instrumental drifts might cause artificial trends. Therefore, it is important to apply different measurement techniques. Our study estimates the ground-based FTIR ozone trends and examines their consistency with trends obtained from ECC sonde and surface in situ analyser datasets.
The ozone trends are estimated by using a bootstrap resampling method (Gardiner et al., 2008), which models the total variation in ozone by a function $F(t)$ and allows for separating the annual cycles from possible long-term trends:

$$F(t) = f_o + f_{trend}\, t + f_{fourier}(t),$$

where $t$ is measured in years, $f_o$ is a baseline constant and $f_{trend}$ the linear trend in change per year. The annual cycle is modelled in terms of a Fourier series,

$$f_{fourier}(t) = \sum_{i=1}^{p} \left[ a_i \cos(\omega_i t) + b_i \sin(\omega_i t) \right],$$

where $a_i$ and $b_i$ are the parameters of the Fourier series to be determined and $\omega_i = 2\pi i/T$ with $T$ = 365.25 days. We consider frequencies up to 3 yr^-1 ($p = 3$), since the third order Fourier series provided the best overall results (Gardiner et al., 2008; Vigouroux et al., 2008). Figure 11 shows the FTIR ozone time series (setup C) and the fitted function $F(t)$ for the different layers.
Our bootstrap model does not capture atmospheric processes such as quasi-biennial oscillations (QBO) or solar cycle variations (e.g., Reinsel et al., 2002). These processes therefore act as a noise source in the linear trend determination and, thus, feed into the uncertainties of the determined trends (Vigouroux et al., 2008). The significance of linear trends is estimated by assuming that the residuals are Gaussian and uniform over the whole analysed time period (Gardiner et al., 2008).
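The trend model and its uncertainty estimate can be sketched as follows. This is a simplified residual bootstrap written for illustration only and is not the implementation of Gardiner et al. (2008): the function names, the number of bootstrap samples and the random seed are assumptions. Time is taken in years, so that the fundamental period T of 365.25 days corresponds to one time unit, and the reported uncertainty is two standard deviations (2σ) of the bootstrap distribution of the trend.

```python
import numpy as np


def design_matrix(t_years, p=3):
    """Columns: constant, linear term, and p pairs of cosine/sine harmonics."""
    t = np.asarray(t_years, dtype=float)
    cols = [np.ones_like(t), t]
    for i in range(1, p + 1):
        omega = 2.0 * np.pi * i  # omega_i = 2*pi*i/T with T = 1 yr
        cols.append(np.cos(omega * t))
        cols.append(np.sin(omega * t))
    return np.column_stack(cols)


def bootstrap_trend(t_years, y, p=3, n_boot=1000, seed=0):
    """Return the fitted linear trend and the 2-sigma bootstrap half-width."""
    y = np.asarray(y, dtype=float)
    X = design_matrix(t_years, p)
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ coef
    resid = y - fitted
    rng = np.random.default_rng(seed)
    trends = np.empty(n_boot)
    for k in range(n_boot):
        # Resample the residuals with replacement and refit the model.
        y_star = fitted + rng.choice(resid, size=resid.size, replace=True)
        trends[k] = np.linalg.lstsq(X, y_star, rcond=None)[0][1]
    return coef[1], 2.0 * trends.std(ddof=1)


# Usage with a synthetic series (time in years since the first measurement).
t = np.linspace(0.0, 12.0, 600)
noise = np.random.default_rng(1).normal(0.0, 1.0, t.size)
series = 100.0 + 0.5 * t + 5.0 * np.cos(2.0 * np.pi * t) + noise
trend, half_width = bootstrap_trend(t, series, n_boot=200)
print(trend, half_width)
```

A trend would be called significant in this sketch when its absolute value exceeds the 2σ half-width. Counting time from the start of the record rather than using calendar years keeps the least-squares problem well conditioned.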
The observed trends from the FTIR time series depend slightly on the applied retrieval setup (Fig. 12). However, these differences are not significant and lie clearly within the 95 % confidence range (see error bars in Fig. 12; note that the confidence range is given by two standard deviations, 2σ, of the bootstrap re-sampled distributions). For all setups, we observe a significant negative trend between 1999 and 2010 for the tropopause region, while in the middle/upper stratosphere the FTIR data document a significant stratospheric ozone increase. The corrected ECC sonde time series confirms the FTIR trends in the middle stratosphere, while for the remaining layers the ECC trends are not significant. In the troposphere, we observe no significant trends when analysing the datasets produced by the setups that include a temperature retrieval (setups B and C), whereas the setup A dataset indicates a positive trend. In order to get more insight into the long-term evolution of this tropospheric layer, we have also estimated the ozone trends from the night-time surface in situ data, which at Izaña are well representative of free troposphere conditions. Thus, the airmass detected by the in situ technique can be compared to the tropospheric airmass remotely sensed by the FTIR system (Sepúlveda et al., 2012). We observe a small but significant negative trend (see Fig. 12), which agrees better with the setups that simultaneously fit the temperature profile. In summary, the trends obtained from the setup C datasets tend to be in better agreement with the trends obtained from the ECC sonde and surface in situ datasets, although the differences to the trends estimated from the setup A and B datasets are rather small.
It is interesting to mention that the significant ozone trends observed by the FTIR system in the upper troposphere/lower stratosphere (negative trend) and in the middle/upper stratosphere (positive trend) are also predicted by current climate models at northern subtropical latitudes (Hegglin and Shepherd, 2009; Li et al., 2009). For example, Li et al. (2009) estimate that a decrease of ozone is to be expected in the lower stratosphere at northern subtropical latitudes, associated with the predicted increase in the stratospheric Brewer-Dobson circulation. An accelerated Brewer-Dobson circulation transports more ozone from the tropics to mid-high latitudes and could delay ozone recovery in the tropics and advance ozone recovery in the extra-tropics. However, we have to be aware that a 12-yr time series might be too short for validating the models due to the large year-to-year variability of this stratospheric circulation (Weber et al., 2011). Likewise, Li et al. (2009) show that the photochemical response to strong cooling induced by increasing greenhouse gas concentrations would lead to an upper stratospheric ozone increase. In addition, the leveling off of anthropogenic halogen components in the upper stratosphere since the mid-1990s has to be considered (WMO, 2007). Similar results were reported by WMO (2011).
For the upper stratosphere, the quality of the FTIR ozone time series has not been empirically validated in this work by day-to-day intercomparisons due to the lack of corresponding ECC data. Previous campaign studies show good agreement between the ozone measurements obtained from FTIR and other measurement techniques in this layer, such as ground-based LIDAR and millimeter-wave radiometer (Kopp et al., 2002; Vigouroux et al., 2008). However, at these altitudes the FTIR ozone data are very sensitive to ILS uncertainties and to measurement noise. Consequently, our upper stratospheric trend estimations should be treated with care, although they are consistent for all setups and are supported by other experimental studies. For example, Steinbrecht et al. (2009) found that upper stratospheric ozone (35-45 km) has been slightly increasing since the late 1990s at other NDACC sites at similar latitudes (Mauna Loa, 19.5° N, 155.6° W, and Table Mountain, 34.5° N, 117.7° W), using satellite- and ground-based LIDAR measurements. This stratospheric ozone recovery has also been documented experimentally by FTIR records at northern subtropical, middle and polar latitudes (updated from Vigouroux et al., 2008, in WMO, 2011).

Ozone annual cycle
The annual ozone cycles have been calculated by producing monthly averages considering the whole time series. Figure 13 shows the annual cycles for the FTIR data (setup A and C) and the corrected ECC sonde data (not smoothed and smoothed by avks from the FTIR setup A and C). We observe that the agreement between the two techniques is rather satisfactory.
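The monthly averaging can be sketched as below, together with the standard error of the mean that is used for the error bars in Fig. 13. The function name `annual_cycle` and the input layout (one calendar month number per measurement) are assumptions made for illustration.

```python
import numpy as np


def annual_cycle(months, values):
    """Monthly means and standard errors over the whole time series.

    months : calendar month (1-12) of each measurement
    values : partial columns (e.g. in DU), one per measurement
    Months without data yield NaN; the standard error needs >= 2 values.
    """
    months = np.asarray(months, dtype=int)
    values = np.asarray(values, dtype=float)
    means = np.full(12, np.nan)
    sems = np.full(12, np.nan)
    for m in range(1, 13):
        v = values[months == m]
        if v.size > 0:
            means[m - 1] = v.mean()
        if v.size > 1:
            sems[m - 1] = v.std(ddof=1) / np.sqrt(v.size)
    return means, sems


# Usage with made-up values (not measurements).
cycle_mean, cycle_sem = annual_cycle([1, 1, 2], [10.0, 20.0, 30.0])
print(cycle_mean[:2], cycle_sem[:2])
```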
The annual cycle for the troposphere (2.37-13 km) reveals a maximum in spring-summer, which indicates the importance of photochemical production of tropospheric ozone. In the tropopause region (12-23 km) we observe a maximum in winter-spring and a minimum in summer and in early autumn, which is due to the large vertical shift of the subtropical tropopause altitude. In winter there is a mid-latitudinal tropopause and this layer belongs to the lower stratosphere, while in summer there is a tropical tropopause and this layer has rather upper tropospheric characteristics. The transformation from mid-latitudinal to tropical characteristics is also observed in the middle/upper stratosphere (22-29 km and 28-42 km). There we observe maximum ozone concentrations in summer-autumn, when the tropical conditions prevail. In the middle stratosphere the FTIR system detects a maximum in spring that is not found in the ECC sonde annual cycle. This is due to the smoothing error: the FTIR ozone partial column in this layer contains information from the lower layers (12-23 km), where the spring maximum is due to the annual cycle in the tropopause altitude.
Note that applying different constraints and temperature retrievals does not significantly affect the ozone seasonality.

Conclusions
In this paper, we document the quality of the ozone profiles obtained from ground-based FTIR systems and discuss their application for long-term studies. We investigate three different retrieval setups: (A) an ad-hoc constraint for ozone and no temperature profile retrieval, (B) an ad-hoc constraint for ozone and a simultaneous temperature profile retrieval, and (C) an ozone constraint based on an ozone climatology (optimal estimation retrieval) and a simultaneous temperature profile retrieval.
Our theoretical error assessment reveals that the measurement noise and the uncertainties in the ILS and the applied temperature profile (for setup A) are the leading error sources. In particular, the retrieved middle/upper stratospheric ozone amounts are strongly affected by ILS and temperature uncertainties. We show that the temperature error can be significantly reduced by performing a simultaneous temperature profile retrieval. Moreover, the ad-hoc constraint retrievals offer more DOFS in the middle/upper stratosphere than the optimal estimation retrieval, whereas at lower altitudes the opposite is the case. An optimal estimation retrieval is supposed to interpret the measured spectra in the best possible manner. Consequently, the ad-hoc constraint retrievals might misinterpret ozone variability (overestimating variability at higher altitudes and underestimating it at lower altitudes).
For an empirical quality assessment, we use a coincident ECC sonde ozone profile dataset as reference, whose quality, in turn, has been checked, independently from the FTIR data, by a comparison to Brewer total column and surface in situ measurements. During the 12-yr period of 1999-2010, the agreement between the vertical ozone distribution obtained by the FTIR and the ECC sondes is very satisfactory. We show empirically that the FTIR system is well able to capture the day-to-day ozone variability in the troposphere, tropopause region and middle stratosphere. Furthermore, both techniques reveal very similar annual seasonality. For the ozone retrieval setup that applies a constraint based on an ozone climatology and includes a simultaneous temperature profile retrieval we observe a slightly better agreement than for the other setups. These observations confirm our theoretical quality assessment.
Regarding ozone trends, we estimate the trends for the 1999-2010 time period for the ECC, surface in situ analyser, and FTIR datasets. In the middle stratosphere we observe a significant positive trend (95 % confidence interval) for the ECC and for all three FTIR datasets (the FTIR also reveals a significant positive trend above 30 km, where there are no ECC data available). In the upper troposphere/lower stratosphere region the FTIR observes a significant negative trend (95 % confidence interval), which cannot be confirmed by the ECC dataset. At these altitudes ozone amounts are very variable and the trend estimates are rather uncertain. This is especially true for the ECC trend estimates, since there is only one ECC observation per week. FTIR observations are made more frequently (several times per week), leading to smaller uncertainties in the trend estimates. In the troposphere we observe no significant trend in either the ECC or the FTIR datasets. The surface in situ data reveal a significant negative trend, which, interestingly, is also indicated by the two FTIR retrieval setups that apply a simultaneous temperature retrieval.
A main reason for this satisfactory agreement is the fact that we take great care in documenting the ILS (see Fig. 5), thereby avoiding artificial trends due to drifts in the ILS. Regular ILS monitoring, applying low pressure gas cell measurements, is very important for FTIR trend studies of stratospheric absorbers. Furthermore, we think that a simultaneous temperature retrieval is important, since it can significantly reduce the risk of artificial trends caused by possible drifts in the temperature uncertainty, thereby theoretically increasing the reliability of the FTIR trends. In our study, we observe that the temperature retrieval modifies the estimated trends. Using a realistic constraint instead of an ad-hoc constraint is important for reproducing the large day-to-day variability (see comparisons in Sect. 5), and it also slightly affects the estimated trends. Finally, one should consider the temporal evolution of the DOFS when using remote-sensing data for trend studies. For example, if there is a bias in the remote-sensing data, this bias will very likely decrease with decreasing DOFS, thereby giving rise to an artificial trend.
In summary, we think that correctly estimating the small expected ozone trends is a very difficult task for any measurement technique. In this context, ozone super-sites like the Izaña Observatory, which concentrate numerous measurement techniques, are important. They allow for intercomparing the techniques, thereby documenting the long-term consistency of the different profile datasets. Small trends that are consistently detected by different and independently working measurement techniques are rather reliable, whereas a small trend detected by an individual technique carries the risk of being artificial and caused by drifts in error sources or instrumental properties.

Fig. 2. Columns of the FTIR averaging kernels, avks, for retrieved ozone values using setup A (a), setup B (b), and setup C (c), expressed as ln[O3], for the Izaña IFS 120/5HR. The curve labelled "row" (black line) is the total sensitivity of the FTIR system, and DOFS are the degrees of freedom for signal.

Fig. 3. Time series of total DOFS of retrieved ozone values using setup C for the whole FTIR time series between 1999 and 2010 (number of data, N, is 1887).

Fig. 4. VMR random and systematic errors relative to actual ozone VMR profiles [%] for the Izaña IFS 120/5HR for the setups A (a), B (b) and C (c). ILS means the joint error due to the modulation efficiency and phase error uncertainties, and TPE (Total Parameter Error, black line) is the sum of all random errors except for the smoothing error.

Fig. 5. Time series of the modulation efficiency [%] at different optical path differences (OPD) for the Izaña spectrometers. Individual data points indicate individual cell measurements. The lines are the smoothed efficiency curves used during the FTIR retrievals. Black at 38 cm, red at 85 cm, green at 133 cm and blue at 180 cm.

Fig. 6. Comparison between the IFS 120/5HR and IFS 120M ozone partial columns [DU] for the 120M-120/5HR coincidences during March and April 2005 (N = 19) and for setup C: (a) 2.37-13 km, (b) 12-23 km, (c) 22-29 km and (d) 28-42 km. The black solid lines are the linear regression lines through the origin, whose parameters are shown in the legend (S is the slope of the regression fit and R the correlation coefficient). The dotted lines are the diagonals.

Fig. 7. Time series of correction factor (CF, according to Eq. 1) at the Izaña Observatory (a) and of ozone total column amounts from ECC sondes without and with correction (TC ECC = CF•TC ECC org ) (b).The mean correction factors for the periods 1999-2004 and 2005-2010 are also shown.The black arrows indicate the change-point date.

Fig. 8. Monthly mean time series of the relative differences [%] between the ozone VMR from the ECC sonde data (corrected, ECC, and not corrected by the daily CF, ECC org) at Izaña's altitude and surface data (N = 112). The solid lines are the linear regression lines of the least-squares fits. The slopes and the 95 % confidence ranges (±2σ) are shown in the legend (S).

Fig. 9. Relative differences [%] between the ozone partial columns (2.37-13 km, 12-23 km and 22-29 km) calculated from the ECC sonde and from the different FTIR setups: (a) mean values (error bars indicate the standard error of the mean) and (b) scatter or standard deviation (1σ). The differences between the FTIR and Brewer ozone columns are also shown. X means the ECC or Brewer data.

Fig. 10. Comparison between the ozone partial columns from FTIR (setup C) and ECC sonde data (N = 262): (a) 2.37-13 km, (b) 12-23 km and (c) 22-29 km. The black solid lines are the linear regression lines of the least-squares fits, whose parameters are shown in the legend (S and B are the slope and the bias of the regression fit, respectively, and R the correlation coefficient). The dotted lines are the diagonals.

Fig. 13. Annual cycle of the ozone partial columns [DU] obtained from the FTIR (setup A and C) and the corrected ECC sonde data (not smoothed and smoothed by avks from the FTIR setup A and C): (a) 2.37-13 km, (b) 12-23 km, (c) 22-29 km and (d) 28-42 km. The error bars indicate the standard error of the mean value.

Table 3. Assumed experimental and temperature uncertainties.