Validation of the IASI FORLI/EUMETSAT ozone products using satellite (GOME-2), ground-based (Brewer–Dobson, SAOZ, FTIR) and ozonesonde measurements

. This paper assesses the quality of IASI (Infrared Atmospheric


Introduction
Ozone (O3) plays a major role in the chemical and thermal balance of the atmosphere. In the stratosphere, O3 protects the biosphere and humans from harmful ultraviolet (UV) radiation. In the troposphere, O3 plays different roles depending on altitude. Near the surface, ozone in excessive amounts is one of the main air pollutants impacting both human health (Brunekreef and Holgate, 2002; Lim et al., 2012) and ecosystems (Fowler et al., 2009). In the upper troposphere, ozone is an important anthropogenic greenhouse gas (IPCC, 2013) and acts as a short-lived climate forcer (Shindell et al., 2012). Tropospheric O3 originates from complex photochemical reactions involving nitrogen oxides (NOx), carbon monoxide (CO) and hydrocarbons (e.g., Chameides and Walker, 1973; Crutzen, 1973), from the stratosphere by downward transport to the troposphere, especially at mid- and high latitudes (e.g., Holton et al., 1995), and from long-range transport (e.g., Stohl and Trickl, 1999). The lifetime of tropospheric ozone varies with altitude and ranges from 1 to 2 days in the boundary layer, where dry deposition is the major sink, to several weeks in the free troposphere, meaning that the transport scale of O3 can be intercontinental and hemispheric (Monks et al., 2015). Therefore, to better understand its variability and impacts, it is crucial to obtain information on its vertical, spatial and temporal distribution. This information can be provided by observations from spaceborne instruments.
The Infrared Atmospheric Sounding Interferometer (IASI) is a nadir-viewing spectrometer (Clerbaux et al., 2015) that has been flying on board EUMETSAT's (European Organisation for the Exploitation of Meteorological Satellites) Metop-A and Metop-B satellites since October 2006 and September 2012, respectively. In order to ensure the continuity of IASI observations for atmospheric composition monitoring, a third satellite (Metop-C) is scheduled to be launched in November 2018. Thanks to the nadir geometry complemented by off-nadir measurements of up to 48.3° on both sides of the satellite track (swath of about 2200 km), each IASI instrument covers the globe twice a day, with a field of view of 4 pixels of 12 km in diameter on the ground at nadir. The two Metop satellites are on the same orbit with Equator crossing times of 09:30 (21:30) local mean solar time for the descending (ascending) part of the orbit. Therefore, there are numerous common observations between two consecutive tracks. However, as Metop-A and Metop-B are 180° out of phase, there is a ∼ 50 min temporal difference between both instruments (one satellite might be before or after the other); thus, the observations are never quite simultaneous. In addition, the geometry of the observations is different and generally off-nadir with opposite angles, meaning that the location of the observation between the two instruments varies and the pixels are not exactly co-located geographically.
With a twice daily coverage and a 12 km diameter footprint at nadir, IASI has the potential to provide global O3 measurements with a high spatial resolution. Previous studies have demonstrated IASI's ability to measure O3 separately in the stratosphere (Scannell et al., 2012; Gazeaux et al., 2013), in the upper troposphere and lower stratosphere (UTLS) (e.g., Barret et al., 2011), and in the troposphere (e.g., Eremenko et al., 2008; Dufour et al., 2010, 2015; Safieddine et al., 2013, 2014). Using the long-term IASI O3 record, the interannual variability of tropospheric ozone and long-term trends can be derived (Wespes et al., 2018; Gaudel et al., 2018). Recently, Wespes et al. (2017) analyzed more than 8 years of IASI O3 data to identify the main geophysical drivers (e.g., solar flux, the Quasi-Biennial Oscillation, the North Atlantic Oscillation, the El Niño-Southern Oscillation) of O3 regional and temporal variability.
Several research groups have developed O3 retrieval algorithms for IASI based on different approaches (e.g., Barret et al., 2011; Dufour et al., 2012; Hurtmans et al., 2012; Oetjen et al., 2016). In particular, ULB & LATMOS have developed the Fast Optimal Retrievals on Layers for IASI O3 (FORLI-O3) software, which uses the IASI Level 1C data to retrieve Level 2 O3 products. A series of validation exercises of IASI O3 products retrieved from different versions of FORLI-O3 (v20100825, v20140922), focusing on a particular region and/or relatively short period of time, were undertaken (e.g., Dufour et al., 2012; Pommier et al., 2012; Scannell et al., 2012; Gazeaux et al., 2013; Safieddine et al., 2016). Boynard et al. (2016) performed an extensive validation of IASI O3 products retrieved from FORLI-O3 v20140922 against a series of independent observations, on the global scale, for the period from 2008 to 2014. This study reported that, on average, FORLI-O3 v20140922 overestimates the total ozone column (TOC) relative to ultraviolet (UV) measurements by 2 %-7 %, with the largest differences found at high latitudes. It is worth mentioning that Boynard et al. (2016) did not perform any comparison with measurements in spectral ranges other than the UV. The comparison with ozonesonde vertical profiles shows that, on average, FORLI-O3 v20140922 underestimates O3 by ∼ 5 %-15 % in the troposphere, while it overestimates O3 by ∼ 10 %-40 % in the stratosphere depending on the latitude. Several algorithm improvements were introduced later in FORLI-O3, including absorbance look-up tables recalculated to cover a larger spectral range using the 2012 HITRAN spectroscopic database (Rothman et al., 2013), with additional numerical corrections. Boynard et al. (2016) evaluated 12 days of the new IASI O3 products retrieved from FORLI v20151001 and found a correction of ∼ 4 % for the TOC positive bias when compared to the UV ground-based (GB) and satellite observations, bringing the overall global comparison to ∼ 1 %-2 % on average. It was shown that this improvement is mainly associated with a decrease in the retrieved O3 concentration in the middle stratosphere (MS, above 30 hPa/25 km). This O3 retrieval algorithm (FORLI-O3 v20151001) is currently being implemented in the EUMETSAT processing facility under the auspices of the "Ozone and Atmospheric Composition Monitoring Satellite Application Facility" (AC SAF) project in order to operationally distribute Level 2 IASI O3 profiles to users through the EUMETCast system in 2018. IASI Level 2 and Level 3 O3 products processed with FORLI v20151001 are part of the European Space Agency O3 Climate Change Initiative (Ozone_cci, http://www.esa-ozone-cci.org, last access: 30 August 2018) and the European Centre for Medium-Range Weather Forecasts (ECMWF) Copernicus Climate Change (C3S) projects, respectively. These programs focus on building consolidated climate-relevant ozone datasets as essential climate variables (ECVs). Therefore, validating the latest version of the IASI O3 products over a long time period and assessing their stability are necessary for decadal trend studies, model simulation evaluation and data assimilation applications. This is one of the main motivations of the present work. The goals of the Ozone_cci project are described in Garane et al. (2018), whilst its requirements in terms of satellite product stability, defined as 1 %-3 % decade⁻¹ based on the requirements formulated by the Global Climate Observing System (GCOS) and the "Climate Modelling User Group" (CMUG) climate modeling community for ozone, are detailed in Van Weele et al. (2016).
Atmos. Meas. Tech., 11, 5125-5152, 2018, www.atmos-meas-tech.net/11/5125/2018/ A. Boynard et al.: Validation of the IASI FORLI/EUMETSAT ozone products
In this paper, we assess the quality of the IASI O3 products retrieved using FORLI-O3 v20151001 (hereafter referred to as "IASI O3 products"), with GOME-2 (Global Ozone Monitoring Experiment-2; also on Metop), ground-based network data (Brewer, Dobson, SAOZ (Système d'Analyse par Observation Zénithale) and FTIR (Fourier transform infrared)) and ozonesonde measurements. Sections 2 and 3 describe the characteristics of the datasets used for the validation and the comparison methodology, respectively. Section 4 presents the intercomparison between IASI-A and IASI-B O3 derived total and tropospheric columns. Section 5 provides the IASI-A and IASI-B TOC and partial ozone column product validation results using independent satellite, GB and ozonesonde observations. Finally, Section 6 summarizes the results from this new validation.
IASI measurements and independent datasets used for the validation

IASI ozone retrievals
IASI ozone retrievals are performed in the 1025-1075 cm⁻¹ spectral range using the optimal estimation method (OEM) (Rodgers, 2000) and tabulated absorption cross sections at various pressures and temperatures to speed up the radiative transfer calculation. The ozone climatology by McPeters et al. (2007) is used as a priori information consisting of one single O3 a priori profile and variance-covariance matrix. The EUMETSAT Level 2 data (pressure, water vapor, temperature and clouds) are used as input in FORLI. It is worth mentioning that the EUMETSAT dataset is not homogeneous, as it has been processed using different versions of the IASI Level 2 Product Processing Facility between 2008 (v4.2) and 2016 (v6.2), as summarized in Van Damme et al. (2017). The error budget of the retrieved O3 profile shows that the dominant errors originate from the limited vertical sensitivity, from the measurement noise and from uncertainties in the fitted (water vapor column) or fixed (e.g., surface emissivity, temperature profile) parameters. In order to avoid cloud-contaminated scenes, retrievals are only performed for clear or almost-clear scenes with a fractional cloud cover below 13 %, identified using the cloud information from the EUMETSAT operational processing (August et al., 2012). In addition, no retrieval is performed for pixels characterized by an error related to the Level 1C IASI data, by missing Level 2 EUMETSAT data associated with Level 1C data or when temperature, water vapor, surface pressure or cloud values are missing in Level 2 EUMETSAT data. The IASI O3 dataset used in this paper covers the period from January 2008 to July 2017. The O3 product is a vertical profile given as partial columns in molecules per square centimeter in 40 layers between the surface and 40 km, with an extra layer from 40 km to the top of the atmosphere.
It also includes other relevant information such as quality flags, the a priori profile, the total error profile and the averaging kernel (AK) matrix, on the same vertical grid. Quality flags were applied to filter the dataset for further validation analysis, and data were excluded under the following specific conditions: (i) when the spectral fit residual root mean square error (RMS) was higher than 3.5 × 10⁻⁸ W (cm² sr cm⁻¹)⁻¹, reflecting a difference that was too large between observed and simulated radiances; (ii) when the spectral fit residual bias was lower than −0.75 × 10⁻⁹ W (cm² sr cm⁻¹)⁻¹ or higher than 1.25 × 10⁻⁹ W (cm² sr cm⁻¹)⁻¹; (iii) when the partial O3 column was negative; (iv) when there were abnormal averaging kernel values; (v) when the spectral fit diverged; (vi) when the total error covariance matrix was ill-conditioned; (vii) when the O3 profiles had an unrealistic C-shape (i.e., abnormal increase in O3 at the surface, such as over desert due to emissivity issues), with a ratio of the surface-6 km column to the total column higher than or equal to 0.085; and (viii) when the DOFS (degrees of freedom for signal) was lower than 2, which was mostly associated with bad quality data in the Antarctic region. A representative IASI-A averaging kernel matrix is illustrated in Fig. 1a, showing the difficulty involved with distinguishing the ozone structures between one level and another. However, it also shows the altitude ranges characterized by peaks of sensitivity: ∼ 5, 12, 18 and 40 km. Another way to visualize the AK matrix is to represent the AK profiles as a function of altitude as shown in Fig. 1b. The AKs are not maximal at their nominal altitudes, which indicates that other altitudes contribute to the ozone value at the individual retrieval altitude. A way to estimate the vertical resolution of IASI O3 profiles is to analyze the DOFS as a function of altitude. The cumulative DOFS, which is presented in Fig. 1c, continuously increases with altitude, given that there is information in the observations for the entire range of altitudes.
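The filtering conditions with explicit thresholds, (i) to (iii), (vii) and (viii), can be illustrated with a short sketch. It is not the FORLI implementation: the function and argument names are hypothetical, the DOFS is taken as the trace of the AK matrix (the usual optimal-estimation definition), and conditions (iv) to (vi) are left out because the text gives no numerical criterion for them. Plain Python lists are used so that the example runs without dependencies.

```python
from itertools import accumulate

# Thresholds quoted in the text; radiance units are W (cm2 sr cm-1)-1.
RMS_MAX = 3.5e-8
BIAS_MIN = -0.75e-9
BIAS_MAX = 1.25e-9
CSHAPE_MAX_RATIO = 0.085  # surface-6 km column / total column
DOFS_MIN = 2.0


def dofs_from_ak(ak):
    """Return (total DOFS, cumulative DOFS per level) from a square AK matrix.

    Assumption: DOFS is the trace of the AK matrix, and the cumulative DOFS
    is the running sum of its diagonal from the surface upwards.
    """
    diagonal = [ak[i][i] for i in range(len(ak))]
    return sum(diagonal), list(accumulate(diagonal))


def passes_quality_filter(rms, bias, partial_columns, surface_6km_column, ak):
    """Return True if a retrieval survives conditions (i)-(iii), (vii), (viii).

    All argument names are hypothetical, not FORLI product field names.
    """
    if rms > RMS_MAX:                               # (i) fit residual RMS
        return False
    if bias < BIAS_MIN or bias > BIAS_MAX:          # (ii) fit residual bias
        return False
    if any(column < 0 for column in partial_columns):   # (iii) negative column
        return False
    total_column = sum(partial_columns)
    if total_column <= 0:                           # guard against division by zero
        return False
    if surface_6km_column / total_column >= CSHAPE_MAX_RATIO:  # (vii) C-shape
        return False
    total_dofs, _ = dofs_from_ak(ak)
    if total_dofs < DOFS_MIN:                       # (viii) low information content
        return False
    return True
```

In practice the same tests would be applied to whole arrays of retrievals at once; the scalar form is kept here only to make each condition easy to map back to the list above.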
The IASI retrieval error on the TOC, including the smoothing and the measurement error, is usually below 2 %, except in the Antarctic (> 4 %), which is due to the particularly weak signal in this region; for the surface-300, 300-150, 150-25 and 25-3 hPa partial columns, it is estimated to be ∼ 15 %, 17 %, 4 % and 3 %, respectively.

The Global Ozone Monitoring Experiment-2 (GOME-2) data
The GOME-2 instrument, also on board the Metop-A and B platforms, is a UV-vis-NIR (visible-near IR) nadir-viewing scanning spectrometer, with an across-track scan time of 6 s and a nominal swath width of 1920 km, which provides global coverage of the sunlit part of the atmosphere within a period of approximately 1.5 days (Hassinen et al., 2016; Munro et al., 2016). GOME-2 ground pixels have a footprint size of 80 km × 40 km, which is larger than that of IASI (pixel diameter of 12 km). In the framework of the EUMETSAT AC SAF project, GOME-2 total ozone data are processed at DLR (Deutsches Zentrum für Luft- und Raumfahrt) operationally, both in near real time and offline, using the GOME Data Processor (GDP) algorithm (Loyola et al., 2011; Hao et al., 2014; Valks et al., 2014). The GOME-2 products have been validated using ground-based measurements (e.g., Loyola et al., 2011; Koukouli et al., 2012, 2015; Hao et al., 2014), which have shown an overall agreement within 1 % in most situations. As shown in Hao et al. (2014), there is an excellent agreement between the GOME-2A and GOME-2B TOCs, with a mean difference of around 0.5 %. Therefore, in this study, the IASI-A and IASI-B validation is limited to the comparison with GOME-2A TOC products. In this comparison, we only use GOME-2A TOC data meeting the valid conditions given in Valks et al. (2017): a TOC value ranging between 75 and 700 Dobson units (DU) and a slant column error lower than 2 %.
Weber et al., 2005; Balis et al., 2007a, b; Koukouli et al., 2012, 2015; Boynard et al., 2016). For the validation of IASI-A and IASI-B TOCs, only direct sun observations are used as GB UV reference data as they are the most reliable for both the Dobson and the Brewer spectrophotometers, the latter offering an accuracy of about 1 % at moderate solar zenith angles (e.g., Kerr, 2002).
TOC measurements are also obtained from SAOZ zenith sky UV-vis spectrometers (Pommereau and Goutail, 1988), which are part of the Network for the Detection of Atmospheric Composition Change (NDACC, http://www.ndacc.org, last access: 22 December 2017). The SAOZ TOC measurements are performed in the visible Chappuis bands between 450 and 550 nm with a medium spectral resolution of 1 nm, and take place twice a day during twilight (sunrise and sunset) at solar zenith angles ranging between 86 and 91°. The retrieval is based on the differential optical absorption spectroscopy (DOAS) procedure (Platt, 1988). Since observations are performed at twilight, SAOZ can be operated throughout the year at all latitudes up to ±67°. At latitudes higher than the polar circle, there is no measurement during permanent night in winter and during permanent day in summer. SAOZ performances have been continuously assessed by regular comparisons with UV-vis independent observations (e.g., Hofmann et al., 1995; Roscoe et al., 1999; Hendrick et al., 2011). The SAOZ total accuracy, including a 3 % cross-section uncertainty, is ∼ 6 % (Hendrick et al., 2011). In this study, eight SAOZ stations deployed at latitudes from the Arctic to the Antarctic (see Table 3 in Boynard et al., 2016 for their locations) are used for IASI-A and IASI-B TOC validation.

Ground-based data
Regular ozone measurements from high-resolution solar absorption spectra recorded by GB FTIR (Fourier transform infrared) spectrometers available for the period from 2008 to 2017 were downloaded from NDACC. The ozone FTIR retrieval principle, which is based on the optimal estimation method (Rodgers, 2000), as for FORLI, is detailed in Vigouroux et al. (2008). Measurements such as these have the advantage of providing not only TOCs with a precision of 2 %, but also low vertical resolution profiles with about four independent partial columns: one in the troposphere and three in the stratosphere up to about 45 km, with a precision of about 5 %-6 % (Vigouroux et al., 2015). Therefore, the FTIR measurements are used to validate not only IASI TOCs but also IASI partial ozone columns. The stations considered in the present work were used in several papers for trend analyses (Vigouroux et al., 2008, 2015; García et al., 2012) and validation studies (Dupuy et al., 2009; Viatte et al., 2011). The latitudinal coverage ranges from 67.8° N to 45° S, so only the southern high latitudes are not covered. The locations of the six FTIR stations used in the comparison are given in Table 1 and presented in Fig. 2. Since these solar absorption measurements require daylight conditions, there is no measurement at Kiruna during polar winter. All stations use the high-resolution spectrometers from Bruker, which can achieve a resolution of 0.0035 cm⁻¹ or better. Details on the harmonized retrieval parameters can be found in Vigouroux et al. (2015). For all stations, the 10 µm spectral region is fitted to retrieve O3 using one of two retrieval algorithms: PROFFIT9 at Kiruna and Izaña, or SFIT2/4 at the other stations. The two algorithms have been compared in Hase et al. (2004). The spectroscopic database used is HITRAN 2008 (Rothman et al., 2009). Each station uses the daily pressure and temperature profiles from NCEP (National Centers for Environmental Prediction) and has one a priori profile, which is obtained from the same model, WACCM4 (Whole Atmosphere Community Climate Model; Garcia et al., 2007).

Ozonesonde data
High-resolution ozone vertical profiles measured by ozonesondes for the period from 2008 to 2017 were downloaded from the WOUDC and NOAA-ESRL (http://www.esrl.noaa.gov/gmd/dv/ftpdata.html, last access: 17 December 2017) archives. The sondes provide measurements of O3 up to 30-35 km with a vertical resolution of ∼ 150 m. Only sonde measurements based on electrochemical concentration cells (ECCs), which measure the oxidation of a potassium iodide (KI) solution by O3 (Komhyr et al., 1995), were used in this study. Their accuracy is generally good (±3 %-5 %) and their uncertainties are about 10 % throughout most of the profile below 28 km (Deshler et al., 2008; Smit et al., 2007), while other types of ozonesondes have a somewhat poorer accuracy (5 %-10 %) (e.g., Hassler et al., 2014; Liu et al., 2013). A total of 56 ozonesonde stations in midlatitudes, polar and tropical regions are considered in the present study. The locations of the ozonesonde stations used in the comparison are presented in Fig. 2.

Comparison methodology
Since the characteristics are not the same from one dataset to another, different comparison methodologies and collocation criteria are applied and described in this section. For all datasets, the differences are calculated as:
$$\mathrm{Difference}\ (\%) = 100 \times \frac{\mathrm{IASI} - \mathrm{DATA}}{\mathrm{DATA}},$$
where DATA corresponds to the independent data used for the validation of the IASI ozone data (i.e., GOME-2, Brewer-Dobson, SAOZ, FTIR and sonde ozone data). The IASI-A and IASI-B O3 products are assessed in terms of TOCs and partial ozone columns. The validation exercise is performed using the same partial columns as those used in . These columns are used because they contain around one piece of information, have maximum sensitivity approximately in the middle of each of the layers and reproduce the well-known cycles related to chemical and dynamical processes characterizing the following layers: surface-300 hPa (TROPO), 300-150 hPa (UTLS), 150-25 hPa (LMS for lower and middle stratosphere) and 25-3 hPa (MS). On average, these pressure columns correspond to the following altitude columns: surface-8, 8-15, 15-22 and 22-40 km, respectively. However, it should be noted that for the comparison between IASI and ozonesonde data, the MS is limited to the 25-10 hPa column as sondes generally burst around 30-35 km (see Sect. 3.2 below). For the assessment of IASI vertical profiles, we refer to Keppens et al. (2018, this issue).
The comparison of IASI-A and IASI-B against DATA is performed over the period from 2008 to 2017 and 2013 to 2017, respectively.

Direct comparison with GOME-2, Brewer, Dobson and SAOZ data
Since only the TOCs are provided in the independent GOME-2A, Brewer, Dobson and SAOZ datasets, a direct IASI/DATA comparison is performed in this validation exercise. The comparison of IASI and GOME-2A TOCs is not straightforward because the pixels are not co-located in time and space and the IASI and GOME-2 instruments have a different pixel size. In order to compare collocated data, a simple method is to calculate the daily average of IASI-A, IASI-B and GOME-2A TOCs along with their relative difference over a constant 1° × 1° grid. As the UV-vis instrument provides daytime observations, only the IASI daytime data (solar zenith angle, SZA, < 90°) are used in this comparison.
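The gridding step can be sketched as follows. This is an illustration under stated assumptions rather than the processing chain used in the study: observations are taken to be (day, latitude, longitude, TOC) tuples, grid cells are indexed by the floor of latitude and longitude, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def grid_daily_mean(observations):
    """Average TOCs per day and per 1 x 1 degree cell.

    observations: iterable of (day, lat, lon, toc) tuples (assumed layout).
    Returns a dict {(day, lat_index, lon_index): mean TOC}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running sum, count]
    for day, lat, lon, toc in observations:
        cell = (day, math.floor(lat), math.floor(lon))
        sums[cell][0] += toc
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


def relative_difference(iasi_grid, reference_grid):
    """Return 100 x (IASI - reference) / reference for cells present in both grids."""
    common_cells = iasi_grid.keys() & reference_grid.keys()
    return {
        cell: 100.0 * (iasi_grid[cell] - reference_grid[cell]) / reference_grid[cell]
        for cell in common_cells
    }
```

Averaging before differencing means that the two instruments only need to sample the same cell on the same day, not the same footprint, which is what makes the comparison tolerant of the different pixel sizes.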
For the comparison of IASI against Brewer and Dobson TOCs, the coincidence criteria are set to a 50 km search radius between the satellite pixel center and the location of the ground-based station as well as to the same day of observations. For each GB measurement, only the closest IASI measurement is kept for the comparison.
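A minimal sketch of this coincidence search is given below. It assumes a spherical Earth with a mean radius of 6371 km and great-circle (haversine) distances; the dictionary keys and function names are hypothetical and only serve the illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of the sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_pixel(station_lat, station_lon, day, pixels, radius_km=50.0):
    """Return the closest same-day pixel within radius_km, or None.

    pixels: iterable of dicts with hypothetical keys "day", "lat", "lon".
    """
    best, best_distance = None, radius_km
    for pixel in pixels:
        if pixel["day"] != day:
            continue  # same-day criterion
        distance = haversine_km(station_lat, station_lon, pixel["lat"], pixel["lon"])
        if distance <= best_distance:
            best, best_distance = pixel, distance
    return best
```

The same function covers other search radii by changing `radius_km`; a time window in hours would need an additional test on the observation time.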
For the comparison of IASI against SAOZ TOCs, sunrise (sunset) SAOZ measurements are compared to collocated daytime (nighttime) IASI daily data averaged in a 300 km diameter semi-circular area located to the east (west) of the ground-based station. Note that since similar results are found for daytime and nighttime measurements, only comparisons for daytime data are shown in the following.

Comparison with FTIR and ozonesonde data
For the comparison of IASI data against FTIR and sonde TOCs and partial ozone columns, the coincidence criteria used in this study are the same as those defined in , except for the time coincidence which is slightly different in order to be more consistent with the temporal variability of tropospheric ozone: we apply coincidence criteria of a 100 km search radius and ±6 h. As the ozonesonde measurements are mainly performed in the morning (local time), this implies that most of the pixels meeting these coincidence criteria correspond to pixels of the IASI morning overpass, which is not the case for FTIR measurements that can be performed throughout the day.
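In the comparison with FTIR data, the FTIR retrieved profiles are adjusted following Rodgers and Connor (2003, their Eq. 10) in order to take the different a priori profiles used in both IASI and FTIR retrievals into account:
$$x'_{\mathrm{FTIR}} = x_{\mathrm{FTIR}} + (A_{\mathrm{FTIR}} - I)\,(x_{a,\mathrm{FTIR}} - x_{a,\mathrm{IASI}}),$$
where x'_FTIR and x_FTIR are the adjusted and original FTIR retrieved profiles, A_FTIR is the FTIR AK matrix, I is the identity matrix, and x_a,FTIR and x_a,IASI are the respective FTIR and IASI O3 a priori profiles.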
In addition, when validating satellite profile products, a proper comparison method is used to account for the difference in vertical resolution. In the present work, the ozonesonde and adjusted FTIR profiles are first interpolated on the corresponding IASI vertical grid and then degraded to the IASI vertical resolution by applying the IASI AKs and a priori O3 profile according to Rodgers (2000):
$$x_{s} = x_{a} + A\,(x_{\mathrm{raw}} - x_{a}),$$
where x_s is the smoothed ozonesonde/FTIR profile, x_raw is the ozonesonde/adjusted FTIR profile interpolated on the IASI vertical grid (referred to as "raw" FTIR), x_a is the IASI a priori profile and A is the IASI AK matrix. Incomplete ozonesonde profiles above ozonesonde burst altitude are filled with the a priori profile. For each ozonesonde/FTIR measurement, we calculate the TOCs (only for the FTIR data) and the four partial columns defined above from all IASI and smoothed ozonesonde/FTIR profiles meeting the coincidence criteria. We then average all IASI and smoothed ozonesonde/FTIR total and partial columns. Once this has been carried out there is one IASI-DATA profile pair per ozonesonde/FTIR measurement. To avoid statistics skewed by extremely low, unrealistic values in the UTLS O3 columns found in the smoothed ozonesonde data, we filter out extreme outliers exceeding 200 % relative differences with IASI (which can be up to ∼ 8 % of the data in the tropical UTLS).
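The a priori adjustment and the AK smoothing can be sketched in a few lines. The sketch assumes the standard forms x' = x + (A_FTIR − I)(x_a,FTIR − x_a,IASI) and x_s = x_a + A(x_raw − x_a), profiles already interpolated on a common vertical grid, and AKs expressed in the same units as the profiles. The function names are hypothetical and plain lists are used so that the example has no dependencies.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def adjust_a_priori(x_ftir, ak_ftir, xa_ftir, xa_iasi):
    """Replace the FTIR a priori by the IASI one.

    Implements x' = x + (A_FTIR - I)(xa_FTIR - xa_IASI),
    expanded as x + A_FTIR * delta - delta with delta = xa_FTIR - xa_IASI.
    """
    delta = [f - i for f, i in zip(xa_ftir, xa_iasi)]
    a_delta = matvec(ak_ftir, delta)
    return [x + ad - d for x, ad, d in zip(x_ftir, a_delta, delta)]


def smooth_with_ak(x_raw, ak_iasi, xa_iasi):
    """Degrade a high-resolution profile to the IASI vertical resolution.

    Implements x_s = x_a + A (x_raw - x_a).
    """
    departure = [r - a for r, a in zip(x_raw, xa_iasi)]
    return [a + s for a, s in zip(xa_iasi, matvec(ak_iasi, departure))]
```

Two limiting cases help to check the logic: with an identity AK matrix the smoothed profile equals the raw profile, and with a null AK matrix it collapses onto the a priori profile.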

IASI-A and IASI-B O3 consistency
Before validating IASI-A and IASI-B O3 products, we assess the consistency between both instruments over the common period from May 2013 to July 2017. For the intercomparison exercise, we first calculate the daily IASI-A and IASI-B averages over a 1° × 1° grid. Then for each 1° × 1° grid cell, we calculate the relative difference as 100 × [(IASI-A − IASI-B) / IASI-B]. Finally we calculate the monthly averaged data from the daily gridded differences. A statistical analysis of IASI-A and IASI-B TOCs and TROPO O3 columns is performed with respect to time and latitude. Figure 3 illustrates the 1° zonal monthly relative differences between IASI-A and IASI-B TOCs (computed from daily gridded differences) for daytime measurements (Fig. 3a) and nighttime measurements (Fig. 3b). IASI pixels are considered as daytime or nighttime data if the solar zenith angle (SZA) is < 90° or ≥ 90°, respectively. An excellent agreement between both IASI-A and IASI-B TOCs is observed, with differences within 0.4 %, except for the polar regions. As previously discussed in Boynard et al. (2016), a possible reason for the larger differences in polar regions is the overlap of consecutive orbits with different times, which equates to different meteorological conditions. Metop, with its polar orbit, makes 14 revolutions per day and passes by the poles on each revolution. This leads to a larger number of observations over the poles each day at different local times for the same grid cell. Therefore, the variability in O3 is much larger, leading to both larger differences between the measurements and a larger standard deviation (not shown). Two interesting features that can be noted from Fig. 3 are (i) the slight increase in the differences in 2015 (April-September) and (ii) the decrease in the differences between the period prior to April 2015 and the period after September 2015. These two points will be discussed in the following.
Figure 4 illustrates the 1° zonal monthly relative differences between IASI-A and IASI-B TROPO O3 columns (computed from daily gridded data) for daytime measurements (Fig. 4a) and nighttime measurements (Fig. 4b). In general, the differences between IASI-A and IASI-B TROPO O3 columns are within ±2 % although larger differences can be found locally, especially in the polar regions. As for the TOC product, the differences decrease from October 2015 with respect to the period from May 2013 to April 2015, and the differences are significantly larger for the period from April to September 2015 (up to 10 %). Another noticeable feature during the period from April to September 2015 is the opposite signs between the differences in TOCs (Fig. 3) and the differences in TROPO O3 columns (Fig. 4).
The reason for these unexpected differences lies in the fact that on 13 April 2015, there was an error in the IASI-A pixel registration, which slightly modified the IASI-A viewing angle. This was corrected in September 2015 and produced a ∼ 5-month period (between April and September 2015) with somewhat larger differences observed between IASI-A and IASI-B O3 products. Furthermore, on 7 October 2015, the IASI's cube corner compensation device, which was shown to generate micro-vibrations and random errors in the IASI spectra, was stopped. As a result, since October 2015, the IASI-A and IASI-B spectra are of better quality/stability (Jacquette et al., 2016).
Because of the changes made in the IASI-A Level 1 data processing, the comparison statistics are performed over two periods, excluding the period between April and September 2015. Over the period from May 2013 to March 2015, the IASI-A TOC product measures 0.3 ± 1.1 % less ozone than IASI-B for both daytime and nighttime measurements. From October 2015, as expected, the overall differences and standard deviation are smaller: the IASI-A TOC product measures 0.1 ± 0.5 % less ozone than IASI-B. Similar results are found for the TROPO O3 column: before April 2015, the IASI-A TROPO O3 product measures 2.4 ± 0.5 % and 2.1 ± 0.4 % less ozone than IASI-B for daytime and nighttime measurements, respectively. From October 2015, the overall difference between both instruments decreases and is equal to 1.4 ± 1.3 %.
The excellent agreement between the current IASI-A and IASI-B TOC and TROPO O3 columns (April-September 2015 excluded) allows the combined use of IASI-A and IASI-B instruments to provide homogeneous total and tropospheric ozone data with full daily global coverage. Whilst the IASI-B O3 products for the period from April to September 2015 are better suited for high quality use, it is worth noting that the IASI-A instrumental issue only affects the TOC by 0.4 % and the tropospheric ozone by 10 %. These differences are lower than the TOC and tropospheric retrieval errors, estimated at 2 % and 15 % on average, respectively, justifying the potential use of the IASI-A data over the April-September 2015 period if it is required. In the validation exercise presented in the next section, the period from April to September 2015 is included.
The interannual variability of IASI-A TOCs and TROPO O3 columns is illustrated in Fig. 5. The highest TOC occurs in the northern mid- and high latitudes during springtime while the lowest TOC values (< 200 DU) occur in the southern high latitudes from September to November. The lowest TROPO O3 occurs south of 70° S as well as in the tropics (values less than 15 DU), whereas the highest monthly mean TROPO O3 values occur in the northern midlatitudes during summer, and are mainly caused by stratosphere-troposphere exchange processes in spring-summer coupled with O3 production from pollution events in summer.

Validation results
Comparison with GOME-2 TOCs
Figure 6 illustrates the 1° zonal monthly relative differences between IASI-A and GOME-2A TOCs (computed from daily data) for the period from 2008 to 2017 and their associated standard deviation. Good agreement is observed between both TOC products, with the lowest differences found in the midlatitudes and tropics and the largest differences found in the polar regions, especially over Antarctica (differences larger than 20 %). In the tropics the differences are mostly positive, while they are negative in the midlatitudes. Figure 7 shows the seasonal distributions of relative differences between IASI-A and GOME-2A TOCs, computed from daily gridded data for the 2008-2017 period (see Table 2 for the associated statistics). The smallest differences are found in the northern midlatitudes during summer (June-July-August) where the IASI sensitivity is the highest, while the largest differences are found over the cold surfaces of Antarctica and Greenland where the IASI sensitivity is the lowest, especially during the March-April-May (MAM) season (3.5 % over Antarctica). The detailed analysis undertaken for different latitude bands given in Table 2 shows that the highest correlation coefficients are found in the midlatitudes and the northern high latitudes, with values higher than 0.93. Lower correlation is found between IASI-A and GOME-2A TOCs in the southern high latitudes during MAM (0.62) and in the tropics during SON (September-October-November) (0.55). However, during the O3 hole season, a high correlation of 0.94 is found in the southern polar region, and IASI-A TOCs are negatively biased (∼ 2 %). This suggests that IASI-A TOC overestimates the extent of O3 depletion (i.e., underestimates the TOCs in the ozone hole) with respect to GOME-2A TOC.
Figure 8 illustrates the time series of the monthly mean relative differences of IASI-A and IASI-B against GOME-2A TOCs along with the standard deviation for the Northern Hemisphere (NH) and the Southern Hemisphere (SH). There is a pronounced seasonality in the differences of IASI-A and IASI-B against GOME-2A TOCs in the SH, with the largest differences found during austral summer (up to 4 %) and the lowest differences during austral winter. Compared to GOME-2A data, IASI-A (IASI-B) TOC shows less O3 in the NH by 0.20 ± 0.74 % (0.15 ± 0.69 %) and more O3 in the SH by 0.42 ± 1.42 % (0.28 ± 1.87 %); these differences are within the total retrieval error bars of the two products. Globally, the IASI-A (IASI-B) TOC product is slightly higher than the GOME-2A TOC product, with a global mean bias of 0.3 ± 0.8 % (0.4 ± 0.8 %). It is worth noting that the previous IASI TOC product (v20140922) was in disagreement by more than 5 %. The global mean bias is now within the total errors of GOME-2 (estimated at 3 %-7 %; Valks et al., 2017) and IASI, which demonstrates the good consistency between the IASI and GOME-2 TOC products.
Despite the global improvement of ∼ 5 % of the new IASI TOC product with respect to the previous IASI TOC product (v20140922), large discrepancies are still observed at high latitudes and are partly explained by the following:
i. The low spectral signal-to-noise ratio due to the very low surface temperatures in high-latitude regions leads to limited information content in the IASI observations in these areas.
ii. A misrepresentation of the wavenumber-dependent surface emissivity, which is a critical input parameter to describe the surface, especially above continental surfaces, can occur in high-latitude regions. FORLI uses the emissivity climatology built by Zhou et al. (2011), which provides weekly emissivity values on a 0.5° × 0.5° latitude/longitude grid for all 8461 IASI spectral channels. However, the Zhou et al. (2011) climatology can have missing values. In such cases, the MODIS climatology built by Wan (2006), which provides values for only 12 channels in the IASI spectral range, is used instead. Furthermore, in cases where there is no correspondence between the IASI pixel and either climatology, the reference emissivity used for the Zhou climatology (Zhou et al., 2011) is used, which can significantly impact the retrievals. This is particularly true in arid or semi-arid regions where variations in emissivity are large on both spectral and spatial scales (Capelle et al., 2012) and also in ice regions, as the reference emissivity does not necessarily reflect the actual snow or sea ice coverage.
iii. The temperature profiles used in FORLI-O3 are less reliable at high latitudes and over elevated terrain (August et al., 2012). As shown in Boynard et al. (2009), the errors introduced by the uncertainties of 2 K on the temperature profile can reach up to 10 % of the total error on the retrieved vertical profile, with the error due to the temperature uncertainty on the TOCs being much lower. Errors on thermal contrast can also have an impact on the retrievals.
iv. The errors associated with TOC retrievals in the UV-vis spectral range increase at high solar zenith angles in high-latitude regions, mostly due to the larger sensitivity of the retrieval to the a priori O3 profile shape.

Figure 7. Seasonal distribution of the relative differences (in percent) between IASI-A and GOME-2A total ozone column products for the 2008-2017 period. The relative difference is calculated as 100 × (IASI-A − GOME-2A) / GOME-2A. DJF represents December-January-February, MAM represents March-April-May, JJA represents June-July-August and SON represents September-October-November.

Table 2. Summary of the correlation (R), the mean bias and the standard deviation values of IASI-A and GOME-2A TOC products computed from daily gridded data, for each season of the 2008-2017 period. The bias and the 1σ standard deviation are given in percent. The correlation coefficients lower than 0.85 are indicated in italics.
In the section below, a detailed analysis of the larger bias found in the Antarctic region is undertaken for individual ground-based Brewer and Dobson stations. Due to several years of GOME-2 instrumental degradation (Dikty et al., 2011), the stability of IASI-A and IASI-B is not assessed from the comparison with GOME-2A. It is explored in the subsections below against the other independent datasets used in this study.

Comparison with Brewer-Dobson TOCs

Figure 9 shows the dependency of the relative differences of IASI-A and IASI-B against GB measurements on latitude for the period from May 2013 to July 2017. For each daily ground-based measurement a relative difference is calculated as 100 × (IASI − GB) / GB [%]. All relative differences are then separated into latitudinal bins of 10° and the mean is calculated. As expected, very similar features can be seen between the IASI-A and IASI-B comparisons, with the Antarctic (80-90° S latitude band) being largely overestimated (∼ 20 %) and the northern middle latitudes driving the mean comparisons to around the 0 % to 2 % level. As shown by the IASI-to-Dobson comparison (Fig. 9a), the dependency on latitude is less visible for the NH due to the high number of collocations, which renders the latitudinal means more representative than those of the SH. The comparisons with Dobson measurements show differences between 0 and 2.5 % for the entire NH (except in the 70-80° N belt where the difference reaches 3.5 % for IASI-A) and for latitudes ranging between 0 and 40° S. South of 40° S, the differences range between 2 % and 4 %, which is partially attributed to the small number of stations, the limited sensitivity in this region (especially for latitudes lower than 60° S) and the larger TOC variability within the southern polar vortex (Garane et al., 2018, this issue; Verhoelst et al., 2015). The comparison with Brewer measurements shows a similar picture for the NH. Note that there are a few Brewer stations in the SH, but they are not evenly distributed (all of them are located in Antarctica), so their measurements are not used. Figure 9 also clearly displays the larger differences for the 20-30° N latitude band (more visible for the comparison with Brewer measurements), where some desert stations, like Tamanrasset, Algeria and Aswan, Egypt (see further discussion in the next paragraph), are located; this suggests that the IASI quality flag established to filter the high values linked with emissivity-related issues (based on the ratio of the surface-6 km column relative to the TOC) is rather loose. Nevertheless, the overall comparison with Dobson and Brewer TOCs shows that the new IASI TOC product is improved by 4 % in comparison with the previous IASI TOC product (v20140922; see Boynard et al., 2016) and is within IASI and GB TOC total error bars.
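The relative difference and 10° latitudinal binning used for the IASI versus ground-based comparison can be sketched as follows. This is a minimal illustration, not the processing code used in this study: the function names and the (latitude, IASI TOC, GB TOC) input layout are assumptions made here, as is the convention of labeling each bin by its lower edge.

```python
import math
from collections import defaultdict


def relative_difference(iasi_toc, gb_toc):
    """Relative difference in percent: 100 * (IASI - GB) / GB."""
    return 100.0 * (iasi_toc - gb_toc) / gb_toc


def latitude_bin(lat, width=10):
    """Lower edge of the latitude bin containing lat (e.g. -89.9 -> -90)."""
    if lat >= 90:
        # Assumption: the pole itself is assigned to the last bin.
        return 90 - width
    return int(math.floor(lat / width) * width)


def binned_mean_differences(pairs, width=10):
    """Mean relative difference per latitude bin.

    pairs: iterable of (latitude_deg, iasi_toc, gb_toc) collocations.
    Returns {bin_lower_edge: mean relative difference in percent}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin -> [sum of differences, count]
    for lat, iasi_toc, gb_toc in pairs:
        acc = sums[latitude_bin(lat, width)]
        acc[0] += relative_difference(iasi_toc, gb_toc)
        acc[1] += 1
    return {edge: total / n for edge, (total, n) in sorted(sums.items())}
```

Called with a list of daily collocations, `binned_mean_differences` returns one mean per occupied bin; bins with few collocations, as in the SH, will be less representative than densely sampled ones.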
To further examine the large discrepancies mentioned above, we analyzed the results obtained for individual stations located in Antarctic and desert regions in more detail. The stations located near desert areas show a diverging behavior with positive (Tamanrasset, Algeria) and negative (Aswan, Egypt and Springbok, South Africa) biases of +7 % to +8 % and −5 % to −4 %, respectively. Over Antarctica, four stations were examined. The bias was found to be extremely high (∼ 20 %) for Amundsen-Scott, which is located at 90° S and at an altitude of 3 km, and lower (but still positive) for the other three stations: Halley Bay, Syowa and Arrival Heights (1.2 %-3.8 %), which are located on the Antarctic Ice Sheet. The comparison of GOME-2A with ground-based TOCs at Amundsen-Scott shows a very small bias of 1 %-2 %, indicating that there is no obvious issue with the ground-based measurements. Furthermore, the scatterplot for that particular station (compared to either Dobson or Brewer; plot not shown) shows that IASI-A has a much higher variability than the GB TOC values. This issue needs to be further explored by investigating, for instance, the impact of potential surface emissivity discrepancies on the retrievals over some regions of Antarctica and over deserts. Additional quality filters, e.g., on ice surface emissivity issues, could also be considered.
Figure 10 shows the time series of the monthly relative differences between IASI-A, IASI-B and GB TOC over the corresponding IASI measurement period for the NH only. For each GB measurement, a daily relative difference is calculated. All the relative differences are then averaged per month. Each month includes more than 180 IASI-GB pairs. As for GOME-2, we see an obvious seasonal variability in the differences, especially for the Dobson measurements: the smallest differences appear in summer and the largest differences in winter.
The larger seasonal variability in the Dobson comparisons is explained by the fact that the Dobson measurements strongly depend on the stratospheric effective temperature. We also see a similar but less pronounced seasonal effect in the Brewer comparison. According to Garane et al. (2018, this issue) and references therein, even though Dobson and Brewer spectrometers basically follow the same principles of operation, TOC measurements from the two types of instruments show differences in the range of ±0.6 %; this is due to the use of different wavelengths in their respective TOC algorithms and the different temperature dependencies of the ozone absorption coefficients. However, it is worth noting that these differences between Brewer and Dobson TOCs are lower than their total uncertainty (∼ 1 %). The mean difference for the NH is lower than 1.1 % for both Dobson and Brewer comparisons to the IASI observations. According to the user requirements given in the "User Requirement Document" of the Ozone_cci project (van Weele et al., 2016), the stability of the ozone measurements must be between 1 % and 3 % decade⁻¹. To assess the long-term stability of the IASI-A TOC products, which is essential for trend studies, we calculate the IASI-A TOC decadal drift from the monthly relative differences between IASI-A and GB TOC over the period from 2008 to 2017 (see Fig. 10). The drift is considered statistically significant if its P value is lower than 0.05 and the drift value is higher than its 2σ standard deviation. For the Dobson comparison, the TOC relative differences exhibit an insignificant drift of 0.68 ± 0.69 % decade⁻¹. For the Brewer comparison, a < 3 % positive drift of 1.38 ± 0.50 % decade⁻¹ is found.
When comparing against Brewer and Dobson measurements, the results show that the IASI-A TOC products are stable; thus, they are reliable for trend studies, as expected from the excellent stability in the Level 1 data.

Comparison with SAOZ TOCs

Figure 11 shows the temporal variation of the daytime monthly mean relative differences between IASI-A and IASI-B against SAOZ TOCs for the eight SAOZ stations for the period from 2008 to 2017. For each daily SAOZ measurement, a relative difference is calculated as 100 × (IASI − SAOZ) / SAOZ [%]. All the relative differences are then monthly averaged. First, we clearly see the systematic seasonality in the differences, with an amplitude that increases with latitude. Compared to SAOZ, the IASI-A and IASI-B TOCs are biased by 0.5 %-2 % (∼ 1 % monthly mean averaged standard deviation) in the tropics and midlatitudes, and are biased high by about 4 ± 3 % inside the polar circle. The results are consistent with those found for the comparison with GOME-2A along with Brewer and Dobson measurements (see Sect. 5.1 and 5.2, respectively). An improvement of 3 %-4 % is found when compared to the previous IASI product (v20140922).
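The monthly averaging of daily relative differences described for the ground-based comparisons can be illustrated with a short sketch. It assumes, purely for illustration, that each collocation is available as an ISO date string paired with a relative difference in percent; the function name and the returned (mean, standard deviation, count) layout are not taken from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def monthly_means(daily):
    """Average daily relative differences per calendar month.

    daily: iterable of (iso_date, rel_diff_percent), iso_date as 'YYYY-MM-DD'.
    Returns {'YYYY-MM': (mean, standard deviation, number of pairs)}.
    """
    by_month = defaultdict(list)
    for iso_date, diff in daily:
        by_month[iso_date[:7]].append(diff)  # 'YYYY-MM' key
    return {
        month: (
            mean(values),
            # The sample standard deviation needs at least two values.
            stdev(values) if len(values) > 1 else 0.0,
            len(values),
        )
        for month, values in sorted(by_month.items())
    }
```

Keeping the number of pairs alongside each monthly mean makes it easy to discard sparsely sampled months before any drift analysis.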
The IASI-A and SAOZ TOC relative differences show small or insignificant negative decadal drifts ranging between −0.05 ± 0.70 % (OHP) and −2.27 ± 0.71 % (Reunion). Bauru station is an exception to these drifts due to a SAOZ retrieval issue still under investigation. The IASI-A TOC temporal stability comfortably satisfies the 1 %-3 % decade⁻¹ Ozone_cci requirements for the long-term stability of total ozone measurements (Van Weele et al., 2016), which again shows that the current IASI-A TOC products are homogeneous and reliable for trend studies.

Comparison with FTIR TOCs and partial ozone columns

Figure 12 shows the temporal variation of the monthly mean relative differences between IASI-A and IASI-B against FTIR TOCs convolved with the IASI averaging kernels according to Eq. (2) for the six FTIR stations (see Table 1 and Fig. 2). It can be noted that the biases between FTIR and IASI-A, and SAOZ and IASI-A for stations that are located close to one another with regards to latitude are very consistent if one takes this spectroscopic bias into account (i.e., UV at Sodankyla lower than IASI-A by 3.9 %, FTIR at Kiruna higher by 1.1 %; UV at OHP lower than IASI-A by 1.0 %, FTIR at Jungfraujoch higher by 3 %; UV at Kerguelen higher than IASI-A by 0.9 %, and FTIR at Lauder higher by 6.2 %). At Zugspitze and more particularly at Jungfraujoch, two jumps are visible in 2010 and 2014, with larger biases before 2011 and after 2014 with respect to the period in between. It is worth noting that these two jumps seem to coincide with changes in IASI Level 2 temperature (in September 2010 and September 2014). The analysis of the temperatures used in the IASI and FTIR retrievals (IASI Level 2 EUMETSAT and NCEP, respectively) shows that the differences between EUMETSAT and NCEP can reach up to 20 K for the surface temperature and vary between −10 and 10 K along the temperature vertical profile at both Jungfraujoch and Zugspitze. However, at the other stations the differences are much lower (less than 5 K). This suggests that IASI Level 2 EUMETSAT temperatures are less reliable above elevated areas. Nevertheless, a more in-depth analysis is needed (and is currently in progress) in order to understand the exact origin of the jumps found in the differences between IASI and FTIR TOCs at these stations.
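The convolution of a high-resolution reference profile with the IASI averaging kernels can be sketched as below. Eq. (2) is not reproduced in this section, so the sketch assumes it takes the form commonly used for this purpose, x_s = x_a + A (x_raw − x_a), where x_a is the a priori profile and A the averaging kernel matrix; the function name and the plain-list representation are illustrative only.

```python
def smooth_with_averaging_kernel(x_raw, x_apriori, ak):
    """Smooth a reference profile with an averaging kernel matrix.

    Assumed form: x_s = x_a + A (x_raw - x_a).
    x_raw, x_apriori: profiles on the same n retrieval levels.
    ak: n x n averaging kernel matrix as a list of rows.
    """
    n = len(x_apriori)
    if len(x_raw) != n or len(ak) != n or any(len(row) != n for row in ak):
        raise ValueError("profiles and averaging kernel must share one grid")
    # Departure of the reference profile from the a priori profile.
    delta = [raw - apriori for raw, apriori in zip(x_raw, x_apriori)]
    return [
        x_apriori[i] + sum(ak[i][j] * delta[j] for j in range(n))
        for i in range(n)
    ]
```

Two limiting cases help to read the result: an identity kernel returns the reference profile unchanged, whereas a kernel of zeros (no sensitivity) returns the a priori profile.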
The dominant systematic uncertainty in FTIR O3 retrievals is due to the spectroscopic parameters (García et al., 2012). The IASI retrieval algorithm uses HITRAN 2012 and the FTIR retrieval algorithm uses HITRAN 2008, although no differences were found in either O3 absorption band (Boynard et al., 2016). We do not expect a significant bias between the IASI and FTIR total columns due to ozone spectroscopy, as both retrieval algorithms use the same ozone spectroscopic parameters and the same fitting spectral range. Except at Lauder and Jungfraujoch, the mean biases between IASI and FTIR TOCs are relatively low and within the total errors of FTIR (e.g., García et al., 2012) and IASI, which once more reinforces the good quality of the IASI TOC data.

Figure 11. Time series of the monthly relative differences (in percent) between IASI-A (blue) and IASI-B (red) against collocated SAOZ TOC measurements for eight stations from north to south. For each daily SAOZ measurement, a relative difference is calculated as 100 × (IASI − SAOZ) / SAOZ [%]. All the relative differences are then monthly averaged. For the period from May 2013 onwards, only the common collocations between IASI-A and IASI-B are shown. The standard deviation of the average is also displayed (vertical bars). Comparison statistics including the mean bias and its 1σ standard deviation (in percent) for the period from 2008 to 2017 (IASI-A) and 2013 to 2017 (IASI-B) are indicated on each panel. The decadal drift, its 2σ standard deviation (in percent) and the P value for the IASI-A time series are also indicated on each panel.
At all stations except for Jungfraujoch and Zugspitze, the IASI-A and FTIR TOC monthly relative differences show insignificant drifts of less than 0.9 % decade⁻¹ (see Fig. 12 and Table 2), which is within the 1 %-3 % decade⁻¹ Ozone_cci requirements for the long-term stability of total ozone measurements (Van Weele et al., 2016). This demonstrates that the current IASI-A TOC products are homogeneous and reliable for trend studies. The significant negative drifts found at Jungfraujoch and Zugspitze are explained by the bias drop observed from 2014, which is discussed above.
As FTIR data also provide up to four independent pieces of information in the vertical ozone profile, we assess four IASI partial ozone columns characterized by a DOFS of ∼ 1 (surface-300, 300-150, 150-25 and 25-3 hPa), which should make such an assessment meaningful. The comparisons of the four partial ozone columns between IASI-A and FTIR performed for the period from 2008 to 2017 are presented in Fig. 13. The correlation coefficients between FTIR and IASI-A partial columns are good to excellent (from 0.72 to 0.98), with the highest correlations found in the UTLS and LMS.
For all stations except Kiruna, the IASI tropospheric column is negatively biased by 5 %-14 %. The comparison for the UTLS O3 columns shows that the IASI-A O3 product is positively biased at all stations (except Izaña), with the highest bias found at Wollongong (21.1 ± 19.9 %) and the lowest bias found at Jungfraujoch (3.7 ± 15.0 %). The standard deviation is highest in the UTLS at Izaña and Lauder, which is due to strong O3 variability and a large total retrieval error in this region as shown in . Indeed, Fig. 4b in  demonstrated that the estimated total retrieval error of vertical ozone profiles from IASI in tropical regions is larger than in middle latitudes, which suggests that this would also be the case for the ozone column. It should be noted that IASI is positively biased in the UTLS region, as reported in previous studies comparing IASI to ozonesonde data (e.g., Boynard et al., 2016; Dufour et al., 2012; Gazeaux et al., 2013). Although Dufour et al. (2012) attempted to give some explanations for this particular feature, the exact reason for this overestimation is still not clear. One reason may be the use of inadequate a priori information. Note that FORLI only uses one single a priori profile: the global mean profile of the McPeters/Labow/Logan climatology (McPeters et al., 2007). As shown by Bak et al. (2013), using tropopause-based ozone profile climatology can significantly improve the a priori profile. However, using dynamical a priori information makes the comparison on a global scale less straightforward as a different a priori profile is used at each IASI pixel. The best correlation coefficients and smallest standard deviations (in percent) between IASI-A and FTIR data are found for the LMS column. The small standard deviations in the LMS comparisons allow the detection of consistent IASI-A negative biases at all stations (5 %-9 %).
This consistent negative bias in the LMS, where the ozone partial column contributes the most to the total column, is reflected in the observed negative bias in TOC discussed above. These better correlation coefficients and standard deviations in the LMS are due to the better IASI sensitivity to this column (mean DOFS ∼ 1.2-1.5 as indicated in Fig. 13) compared to the other partial columns. The smallest biases between FTIR and IASI-A columns are found in the MS column (−0.2 % to +4.9 %), except at Kiruna where the bias reaches 13 %. This higher bias at Kiruna might be due to a poor collocation of the sounded air masses, which can be inside or outside the polar vortex for the two instruments. The FTIR instrument sounds the atmosphere along the sun-instrument line of sight; therefore, the sounded air masses in this higher partial column and for measurements at high solar zenith angles might be located a fair distance from the station itself (a few hundred kilometers). Thus, a collocation with the satellite that takes the FTIR line of sight into account would improve the comparisons. Similar results are found for the comparison of IASI-B with FTIR partial ozone columns over the period from May 2013 to 2017 (not shown).
The stability of the IASI-A partial ozone columns is also assessed based on the time series of monthly relative differences between IASI-A and FTIR data over the period from 2008 to July 2017. Table 3 gives the decadal drift values along with their 2σ standard deviations in percent per decade (% decade⁻¹) as well as the P value. As a reminder, the drift is considered significant if the drift value is higher than its 2σ standard deviation. For the TROPO column, we clearly see a significant negative drift at all stations ranging from −5.0 ± 4.8 % decade⁻¹ (Izaña) to −16.1 ± 8.1 % decade⁻¹ (Kiruna). Smaller or insignificant drifts are found in the UTLS and LMS. Regarding the MS, insignificant positive drifts are found, except at Izaña where a significant positive drift is found (3.7 ± 2.5 % decade⁻¹). As a consequence, the stability of the IASI-A partial O3 columns compared to the six FTIR GB measurements that cover the IASI measurement period and that are characterized by limited vertical sensitivity cannot be confirmed.
The stability of the IASI-A partial O3 columns was analyzed in detail by comparisons with ozonesonde measurements that provide numerous highly resolved vertical O3 profiles. This comparison is outlined in the following section.
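The drift significance test used in the stability assessments can be sketched as follows. The sketch assumes an ordinary least-squares fit of the monthly relative differences against time, which is one plausible reading of the text rather than a confirmed description of the method. The P value is approximated with a normal distribution, a large-sample simplification introduced here to avoid external dependencies, and all names are hypothetical.

```python
from statistics import NormalDist


def decadal_drift(months, diffs):
    """Least-squares drift of monthly relative differences.

    months: time of each value in months since the start of the record.
    diffs: monthly mean relative differences in percent.
    Returns (drift in % per decade, 2-sigma uncertainty, approximate P value).
    """
    n = len(months)
    if n < 3:
        raise ValueError("need at least three monthly values")
    t = [m / 120.0 for m in months]  # convert months to decades
    t_mean = sum(t) / n
    y_mean = sum(diffs) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("time values must not all be identical")
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, diffs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, diffs))
    sigma = (rss / (n - 2) / sxx) ** 0.5  # standard error of the slope
    if sigma == 0:
        p_value = 0.0 if slope != 0 else 1.0
    else:
        # Normal approximation to the two-sided P value (assumption).
        p_value = 2.0 * (1.0 - NormalDist().cdf(abs(slope) / sigma))
    return slope, 2.0 * sigma, p_value


def is_significant(drift, two_sigma, p_value, alpha=0.05):
    """Significant if P < alpha and |drift| exceeds its 2-sigma uncertainty."""
    return p_value < alpha and abs(drift) > two_sigma
```

With this criterion a drift such as 3.7 ± 2.5 % per decade would be flagged as significant, whereas 0.68 ± 0.69 % per decade would not, since the drift does not exceed its 2σ uncertainty.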

Comparison with ozonesonde partial ozone columns
A statistical comparison of IASI-A and IASI-B against sonde partial ozone columns at 56 stations (see Fig. 2) was performed, which gathered approximately 2000 ozonesonde profiles during a period extending from May 2013 to July 2017 and 11 600 ozonesonde profiles over the whole IASI measurement period (2008-2017). In order to assess the latitudinal variability of IASI O3 retrieval performance, the comparison was performed for six 30° latitude bands representative of the northern high latitudes (60-90° N), northern midlatitudes (30-60° N), northern tropics (0-30° N), southern tropics (0-30° S), southern midlatitudes (30-60° S) and southern high latitudes (60-90° S). Figure 14 shows the comparison of IASI-A against smoothed ozonesonde for four partial columns for each of the six latitude bands during the 2008-2017 period. For the TROPO O3 columns (Fig. 14, first column), the mean biases and standard deviation are within 20 %. IASI-A underestimates the O3 abundance in the tropics and midlatitudes (by ∼ 16 %-19 % and ∼ 6 %-11 %, respectively) and overestimates the O3 abundance at high latitudes (by 4 %-5 %), compared with ozonesonde data. The correlation coefficients range from 0.8 to 0.9 in the tropics, from 0.7 to 0.8 at middle latitudes and from 0.5 to 0.8 at high latitudes. The linear regression slopes are in the range of 0.6-0.8, with lower values found at high latitudes due to the reduced retrieval sensitivity in the lower troposphere. It is worth noting that a lower correlation coefficient is found for the southern midlatitudes, which is likely due to the lower amount of data in comparison with the other latitude bands. The comparison for the UTLS O3 columns (Fig. 14, second column) shows that IASI-A O3 products overestimate the O3 abundance irrespective of latitude, with the highest biases found at high latitudes (30 %-42 %) and the lowest biases found at midlatitudes (∼ 11 %-19 %). The standard deviation is highest in the UTLS in all latitude bands (compared to the other partial columns) due to the strong O3 variability and the large total retrieval error, as shown in . The linear regression slopes are close to 1 in the polar and midlatitude regions but are around 0.4 in the tropics, which is closely related to the small amount of O3 in the tropical UTLS. A positive bias from IASI-A O3 products is also found for the LMS (Fig. 14, third column) and MS (Fig. 14, fourth column) columns (except at high latitudes for the latter). The correlation coefficients range between 0.6 (tropics and high latitudes) and 0.8 (midlatitudes) for the LMS column while they are much lower for the MS column, which is explained by the low DOFS values ranging between 0.4 and 0.6 (as indicated on the scatterplots). Note that the DOFS for the MS columns are lower than those calculated in Fig. 13 because they do not correspond to the full MS column calculated from IASI (25-3 hPa, i.e., ∼ 25-40 km); instead the DOFS correspond to the MS columns truncated to match the maximum altitude

Table 3. IASI-A decadal drifts and their 2σ standard deviation (in percent) calculated from the monthly relative differences between IASI and the FTIR data over the period from 2008 to 2017 for the TOC and different partial ozone columns: surface-300 hPa (TROPO), 300-150 hPa (UTLS), 150-25 hPa (LMS) and 25-3 hPa (MS). The P value is indicated in parentheses. A P value lower than 0.05 indicates a significant drift; drifts indicated in bold are significant.
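The statistics quoted for the IASI versus ozonesonde scatterplots (mean bias, standard deviation, correlation coefficient and linear regression slope per latitude band) can be computed along the following lines. This is a sketch under stated assumptions: the regression is taken as IASI (y) against sonde (x), band edges are assigned to the poleward or northern band, and the function names and band labels are illustrative.

```python
from statistics import mean, stdev


def latitude_band(lat):
    """Assign a latitude (degrees north) to one of six 30-degree bands."""
    if lat >= 60:
        return "60-90N"
    if lat >= 30:
        return "30-60N"
    if lat >= 0:
        return "0-30N"
    if lat >= -30:
        return "0-30S"
    if lat >= -60:
        return "30-60S"
    return "60-90S"


def comparison_stats(iasi, sonde):
    """Bias, spread, correlation and slope for paired partial columns.

    iasi, sonde: equally long sequences of collocated columns (e.g. in DU).
    The sonde columns are assumed not to be all identical.
    """
    if len(iasi) != len(sonde) or len(iasi) < 2:
        raise ValueError("need at least two collocated pairs")
    # Relative differences in percent: 100 * (IASI - sonde) / sonde.
    rel = [100.0 * (a - b) / b for a, b in zip(iasi, sonde)]
    mx, my = mean(sonde), mean(iasi)
    sxx = sum((x - mx) ** 2 for x in sonde)
    syy = sum((y - my) ** 2 for y in iasi)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sonde, iasi))
    return {
        "bias": mean(rel),
        "std": stdev(rel),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": sxy / sxx,
    }
```

Grouping the collocations by `latitude_band` before calling `comparison_stats` gives one set of statistics per band; a slope below 1 combined with a negative bias is the pattern described above for the TROPO column.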
The long-term stability of the IASI-A partial O3 column vs. ozonesonde measurements is assessed in Fig. 16, which presents the monthly relative differences between IASI-A and ozonesonde for the TROPO, UTLS, LMS and MS O3 partial columns for a total of 18 ozonesonde stations in the NH that cover 8 years or longer (from 2008 to 2017). With more than 30 IASI-sonde pairs per month, the NH presents sufficient collocated data to carry out a good statistical drift analysis, unlike the SH (only eight ozonesonde stations). For each ozonesonde measurement, a daily relative difference is calculated. All the relative differences are then monthly averaged. A main feature that arises from this figure is the pronounced seasonality in the differences between IASI-A and sonde O3 for the UTLS and LMS columns, with the lowest differences found in summer and the highest differences found in winter. We also see a small but apparent seasonality in the differences for the TROPO O3 column: the IASI TROPO O3 column appears less biased with respect to the ozonesondes during winter. This reflects the low sensitivity of IASI associated with low brightness temperature in the troposphere, and in such situations the IASI retrieval mostly provides the a priori information (see Eq. 2). The differences in the TROPO O3 column are better than −10 % during the period from 2008 to 2010 and decrease to as low as −20 % from 2011. This feature is also visible for the MS column: the difference baseline is around the 0 % level between 2008 and 2010 but near the 4 % level from 2011.
The linear trends of the monthly mean ozone biases for each partial column are plotted in Fig. 16 for the period from 2008 to 2016 (blue line). Note that 2017 is not included in the drift calculation because of the lower number of collocated data for that year. Based on the drift value with the 2σ standard deviation and the P value (indicated on each plot), the derived drifts are insignificant for the UTLS and LMS but are statistically significant for the TROPO and MS columns (−8.6 ± 3.4 % decade⁻¹ and ∼ 5.4 ± 3.6 % decade⁻¹, respectively). This is in agreement with Keppens et al. (2018, this issue) who applied a different method based on the bootstrapping technique (Hubert et al., 2016). Note that for the TROPO column, the drift calculated for each individual station ranges between −16 % decade⁻¹ and −5 % decade⁻¹, which is the same order of magnitude as those found in the IASI-A-to-FTIR TROPO comparison. If we limit the time period to 2011-2016, a statistically significant drift is no longer found for the TROPO and MS (P value > 0.47), as expected from the excellent stability in the Level 1 product. However, since this difference in the drift values might only be due to the short time periods considered here associated with the high variability in the TROPO O3 differences, a few more years are needed to confirm the observed negative drifts and evaluate them over longer periods.

Summary
In this study, we assessed the quality of IASI-A and IASI-B O3 products (total and partial columns) retrieved with the FORLI v20151001 software for 9 years (2008-2017) through an extensive intercomparison and validation exercise using independent observations (satellite, ground-based and ozonesonde). Compared to the previous version of FORLI-O3 (v20140922), several improvements were introduced in FORLI-O3 v20151001, including absorbance look-up tables recalculated to cover a larger spectral range using the 2012 HITRAN spectroscopic database (Rothman et al., 2013), with additional numerical corrections. This leads to a change of ∼ 4 % in the total ozone column (TOC) product, which is mainly associated with a decrease in the retrieved O3 concentration in the middle stratosphere (above 30 hPa/25 km). The IASI O3 products processed with FORLI v20151001 are part of the ESA Ozone_cci and ECMWF C3S projects, which focus on building consolidated climate-relevant ozone datasets as ECVs. Therefore, validating the latest version of the IASI O3 products over a long time period and assessing their stability are necessary for decadal trend studies, model simulation evaluation and data assimilation applications. The main findings of this work can be summarized as follows:
1. ... depending on the focus of the study. The consistency between IASI-A and IASI-B O3 products becomes better after September 2015 (differences less than 0.1 % and 1.4 % for the TOC and TROPO O3 column products, respectively), which is due to the better quality of IASI-A and IASI-B Level 1 data. This improvement in data quality stems from the deactivation of IASI's cube corner compensation device, which was proved to generate micro-vibrations and random errors (Jacquette et al., 2016).
2. With respect to GOME-2A data, IASI-A and IASI-B TOCs are in excellent agreement: they are marginally lower in the Northern Hemisphere (by 0.2 %), while they are higher in the Southern Hemisphere (by 0.4 %). There is a pronounced seasonality in the differences in the SH, with the largest differences found during the austral summer (up to 4 %); these large differences are related to the larger differences observed in the southern high latitudes. With respect to Dobson and Brewer data, the IASI-A and IASI-B TOC products overestimate the total O3 abundance by 0.5 %-1.1 % with an obvious seasonal variability in the differences, which is caused by the ground-based measurements (see Sect. 5.2 for a more in-depth explanation). Compared to SAOZ, the IASI-A and IASI-B TOC products are biased by 0.6 %-2 % (∼ 1 % monthly mean averaged standard deviation) in the tropics and midlatitudes, and this value increases to about 2.5 %-3.8 % inside the polar circles. Finally, good agreement is found between IASI-A and IASI-B against the FTIR TOC product, with IASI underestimating the TOC by 1.1 %-6.2 %. The largest bias is found at Lauder, which is likely due to the FTIR data that might be biased high by 1.5 %-2 % at this station. It can be noted that the biases between FTIR and IASI-A, and SAOZ and IASI-A for stations that are close to one another regarding latitude are very consistent if one takes this spectroscopic bias into account (i.e., UV at Sodankyla is lower than IASI-A by 3.8 %, FTIR at Kiruna is higher by 1.1 %; UV at OHP is lower than IASI-A by 0.9 %, FTIR at Jungfraujoch is higher by 3.0 %; UV at Kerguelen is higher than IASI-A by 0.7 %, and FTIR at Lauder is higher by 5.6 %).
3. The time series of relative differences of IASI-A against UV-vis GB TOCs show insignificant negative drift in the NH (0.68 ± 0.69 % decade⁻¹ and P value = 0.05) and a small negative drift in the SH (1.48 ± 0.53 % decade⁻¹ and P value = 0.00), which satisfies the 1 %-3 % decade⁻¹ Ozone_cci requirements for the stability of ozone measurements. Similar results are found for the IASI-A/FTIR TOC comparison. This demonstrates the long-term stability of the current IASI-A TOC products.
4. ... deviation observed for the UTLS (20 %-40 %) due to strong ozone variability and larger total retrieval errors. Attempted explanations for the larger bias found in the UTLS are given in Dufour et al. (2012), although no clear reason for the larger bias was found. A possible explanation may be the use of inadequate a priori information in that layer. The current version of FORLI uses a single global profile that is the mean of the McPeters/Labow/Logan climatology (McPeters et al., 2007) as an a priori profile. As shown by Bak et al. (2013), using tropopause-based ozone profile climatology can significantly improve the a priori profile. However, using dynamical a priori information makes the comparison on a global scale less straightforward to analyze, because retrievals at each IASI pixel would be based on different a priori profiles. The IASI-A and IASI-B TROPO O3 products underestimate the O3 abundance in the midlatitudes and the tropics (by 11 %-13 % and 16 %-19 %, respectively) and overestimate the O3 abundance in the high latitudes (by 4 %-5 %).
5. The IASI-A-to-FTIR TROPO O3 column comparison exhibits significant negative drifts ranging between −8 % and −16 % decade⁻¹ over the period from 2008 to 2017 at all stations. A significant negative drift of −8.6 ± 3.4 % decade⁻¹ is also found in the IASI-A-to-ozonesonde TROPO O3 column comparison for the Northern Hemisphere. The observed negative drifts in the IASI-A TROPO columns might partly explain the apparent disagreement between the tropospheric ozone trends observed by IASI and GOME/OMI in the Tropospheric Ozone Assessment Report (TOAR) (Gaudel et al., 2018). However, further investigation should be carried out since the TROPO columns are not calculated in the same way in the two studies. When considering the period from 2011 to 2016, the drift values for the TROPO column decrease and become statistically insignificant. However, since this difference in the drift values might only be due to the short time periods considered in this study, combined with the high variability in the TROPO O3 differences, a few more years of data are needed to confirm the observed negative drifts and evaluate them over longer periods. The observed negative drifts of the IASI-A TROPO O3 product (8 %–16 % decade⁻¹) over the 2008–2017 period should also be taken into consideration when deriving trends from this product for this time period.
6. The IASI-A TOC relative differences against independent measurements showed small or insignificant negative decadal drifts for the period from 2008 to 2017, which indicates that the current IASI-A TOC products are homogeneous and reliable for trend studies. The IASI-A TROPO O3 relative differences against sonde and FTIR data showed significant negative drifts for the period from 2008 to 2017. Therefore, it is recommended for trend studies to wait for the new homogeneous IASI climate time series, which will be reprocessed using the ECMWF ERA5 temperature reanalysis (Hersbach and Dee, 2016).
The authors thank all those responsible for the WOUDC and GMD measurements and archives for making the ozonesonde data available. The ground-based total ozone column data used in this publication were obtained as part of WMO's Global Atmosphere Watch (GAW) and are publicly available via the WOUDC. We would like to acknowledge and warmly thank all of the investigators who provide data to these repositories on a timely basis, as well as the handlers of these databases for their upkeep and efforts to achieve the guaranteed quality of the product. The FTIR data used in this publication were obtained as part of the