Interactive comment on “Benefit of ozone observations from Sentinel-5P and future Sentinel-4 missions on tropospheric composition”

In this study, Quesada-Ruiz et al. conducted an Observing System Simulation Experiment (OSSE) to assess the benefit of future ozone data from the individual or combined use of GEO (Sentinel-4, S4) and LEO (Sentinel-5P, S5P) satellite observations for tropospheric ozone composition. This OSSE, which focused on Europe during summer 2003, consisted of two main steps: (1) assimilating S4 and S5P synthetic ozone profile data simulated by the DISAMAR inversion package using LOTOS-EUROS and TM5 3D-CTM fields as input, and (2) comparing the assimilation results to a reference run based on the assimilation of simulated ozone data at a selection of 1132 AirBase stations. Results showed that S4 and S5P satellite data in the UV range clearly bring direct added value to the tropospheric ozone composition


Introduction
The monitoring of tropospheric composition is of utmost importance for the evaluation of air quality and to improve the understanding of the intercontinental transport of air pollution (HTAP, 2010). Recently, satellite measurements have been widely used to improve the detection and the forecast of atmospheric pollutants through their assimilation into chemistry transport models (CTMs), including ground-based and/or airborne measurements (e.g., Elbern et al., 2010). The main advantage of satellite observations, when compared to local measurements, is the global and/or regional coverage. However, the temporal and spatial resolution required for air quality (AQ) is still a drawback that should be addressed by future missions in order to reach up to 10 km resolution and up to 1 hour of revisit time. To address these issues, studies have analysed the combined use of various geostationary Earth orbit (GEO) / low Earth orbit (LEO) satellites (e.g., Lahoz et al., 2012; Barré et al., 2015).
The S4 mission will be carried on the Meteosat Third Generation (MTG) geostationary platform, and includes an ultraviolet visible near-infrared (UVN) spectrometer. The S5P mission, launched on 13 October 2017, includes the Tropospheric Monitoring Instrument (TROPOMI) and was developed to reduce the gap between the Scanning Imaging Absorption Spectrometer for Atmospheric Cartography (SCIAMACHY) instrument on Envisat, the Ozone Monitoring Instrument (OMI) on the Aura mission and the Sentinel-5 (S5) mission. The work presented here was part of a project called Impact of Spaceborne Observations on Tropospheric Composition Analysis and Forecast (ISOTROP, http://projects.knmi.nl/isotrop/), financed by the European Space Agency (ESA) to study the impact of trace gas observations by the Sentinel missions on air quality analyses.
The analysis of the benefit of a trace gas observation (in our case ozone) is carried out by performing an observing system simulation experiment (OSSE). The main goal of the OSSE concept is to determine the potential added value of a new observing system (OS) with respect to the existing ones. We use a state-of-the-art model run (namely a CTM) to construct a representation of reality, hereafter called the nature run (NR). From the nature run, the satellite trace gas observations (level 2 data) and their corresponding errors are simulated using an instrument simulator, which combines a retrieval scheme and the instrument model. These simulated observations are preferably fed into a data assimilation system of a different model, obtaining an assimilation run (AR). This second model is also run either without assimilation or with the assimilation of the existing OS data, which gives a reference run (RR). The use of two different models avoids the identical twin problem, which is known to lead to overoptimistic results (Arnold Jr and Dey, 1986; Timmermans et al., 2015). Numerous OSSEs dealing with observations of chemical species were performed using different satellite instrument specifications to show the benefit of selected additional observations on the OS (e.g., Edwards et al., 2009; Claeyman et al., 2011; Zoogman et al., 2011; Abida et al., 2017).
In this study, we perform an OSSE to analyse the benefit of tropospheric ozone observations from the S4 and S5P missions, following the recommendations for an AQ OSSE reported by Timmermans et al. (2015). We used as level 2 observations the simulated nadir ozone measurements in the ultraviolet (UV) from the future S4 and the current S5P missions. It is worth pointing out that this work started before the launch of S5P, and its value includes the comparison with S4. To be consistent with S4 and the studied period, we simulated S5P ozone data and used the same UV spectral range for both. Note that the results presented in this study correspond to the instrumental characteristics of S4-like and S5P-like missions that are assumed to be consistent with the actual characteristics of the S4 and S5P missions; for the sake of simplicity, we call them S4 and S5P hereinafter. In addition, we also simulated ground-based station (GBS) ozone data to evaluate the added value of the satellite measurements within the lower troposphere in comparison to ground-based data. The simulated ozone observations are generated using the ozone data simulated from the nature run, which is formed by the combination of the CTMs Long Term Ozone Simulation-European Operational Smog (LOTOS-EUROS; Manders et al., 2017) and Transport Model version 5 (TM5). These simulated ozone observations were assimilated in the MOdèle de Chimie Atmosphérique à Grande Echelle-Projet d'Assimilation par Logiciel Multi-méthodes (MOCAGE-PALM) system (Peuch et al., 1999; Lagarde et al., 2001) to provide both the reference run and the assimilation runs that are compared to the nature run. In our case, the reference run is the assimilation of the GBS simulated data. The assimilation runs include the assimilation of GBS and satellite simulated data (S4, S5P, or both S4 and S5P, hereafter denoted S4+S5P).
For this OSSE, we selected the summer of 2003 (June, July and August). During this period, a heat wave affected Europe, especially during the first two weeks of August, leading to the hottest summer recorded since the 16th century (Stott et al., 2004; García-Herrera et al., 2010). Note that our study does not take into account the heat wave of 2018, whose impact has not been fully assessed yet. Various studies suggested that from 40,000 to 70,000 deaths during this heat wave were attributed to heat and pollution in Europe (Robine et al., 2008; García-Herrera et al., 2010). The heat wave caused elevated ozone concentrations, due to its close correlation with high temperatures, that were enhanced by the anticyclonic conditions. High temperatures and clear sky conditions during summer time are advantageous conditions for ozone precursor photochemical reactions, especially over populated areas, where anthropogenic ozone precursor emissions (nitrogen oxides, NOx, or carbon monoxide, CO) are predominant. Regarding the heat wave consequences related to AQ, surface ozone measurements over central Europe were the highest since the 1980s (Solberg et al., 2008). Additionally, unprecedented forest fires occurred in Portugal, emitting huge quantities of CO (Abida et al., 2017). High ozone and CO concentrations were also measured by MOZAIC (Measurements of OZone, water vapour, carbon monoxide and nitrogen oxides by Airbus In-service airCraft) instruments on board commercial aircraft as reported by Tressol et al. (2008).
The general aim of this paper is to assess the benefit of future ozone data from individual or combined use of GEO (S4) and LEO (S5P) satellite observations for the understanding of local to regional scale ozone tropospheric composition with a focus on Europe. Section 2 describes the MOCAGE-PALM assimilation system. We define the OSSE components, including the nature run, the reference run and the assimilation run, in Sect. 3. We present the simulated ozone measurements in Sect. 4 and the metrics used to evaluate our OSSE in Sect. 5. We show the results of the ozone OSSE at different altitudes from the upper to the lower troposphere for the summer 2003 period in Sect. 6 which are discussed in Sect.

The assimilation system used in this study, MOCAGE-PALM, was jointly developed by Météo-France and the Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique (CERFACS). The MOCAGE-PALM assimilation system has been used in several studies related to upper tropospheric and stratospheric ozone (El Amraoui et al., 2008a, b), in ozone OSSEs related to air quality (Claeyman et al., 2011) and to evaluate the quality of IASI (Infrared Atmospheric Sounding Interferometer) total column ozone measurements (Massart et al., 2009).
MOCAGE (Peuch et al., 1999) is a 3D CTM that reproduces the main chemical and physical processes present in the troposphere and the stratosphere (Bousserez et al., 2007). From the various configurations, domains, grid resolutions and chemical/physical parametrizations available in MOCAGE, the following were selected: (i) a 2° x 2° spatial resolution global grid using a 2-way nesting with a 0.2° x 0.2° spatial resolution regional grid (32-72°N, 16°W-36°E), (ii) 47 sigma-hybrid vertical levels from the surface up to 5 hPa, and (iii) the RACMOBUS chemical scheme. The RACMOBUS chemical scheme is the combination of the Regional Atmospheric Chemistry Mechanism tropospheric scheme (RACM; Stockwell et al., 1997) and the REactive Processes Ruling the Ozone BUdget in the Stratosphere stratospheric scheme (REPROBUS; Lefèvre et al., 1994). MOCAGE is used for diverse purposes: e.g., operational chemical weather forecasts (Dufour et al., 2005), Monitoring Atmospheric Composition and Climate (MACC) services (http://www.gmes-atmosphere.eu), and atmospheric composition climate trends studies (Teyssèdre et al., 2007). Moreover, during the heat wave of August 2003, MOZAIC aircraft measurements were used to validate MOCAGE ozone fields over Europe (Ordóñez et al., 2010).
The data assimilation suite is run via the PALM coupler (Lagarde et al., 2001) that connects the CTM to a set of operators, such as the observation operators, the error covariance approximations, the increment propagators, and the minimizer, implementing several variational assimilation algorithms. We used the 3D-Var variant, which has already been presented for an S5P CO OSSE (Abida et al., 2017), with a fixed assimilation window of one hour.
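To make the 3D-Var step concrete, the following minimal sketch computes the analysis for a linear observation operator. It is not the MOCAGE-PALM implementation: the function name `analysis_3dvar`, the dense matrices and the closed-form solve (used here in place of an iterative minimizer) are assumptions chosen for illustration only.

```python
import numpy as np

def analysis_3dvar(x_b, B, H, y, R):
    """Analysis state minimising the 3D-Var cost function
        J(x) = (x - x_b)^T B^-1 (x - x_b) + (y - H x)^T R^-1 (y - H x)
    for a linear observation operator H (illustrative closed form)."""
    innovation = y - H @ x_b          # observation minus forecast (OmF)
    S = H @ B @ H.T + R               # innovation covariance
    # Equivalent to x_b + B H^T S^-1 (y - H x_b), without forming S^-1.
    return x_b + B @ H.T @ np.linalg.solve(S, innovation)
```

With a single grid point, a background of 40 ppbv (error variance 100), and one observation of 50 ppbv (error variance 25), the analysis lands at 48 ppbv, closer to the more accurate observation.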

We present a regional chemical OSSE framework conducted to investigate the added value of S4 and S5P observations on tropospheric ozone. In addition to simulated satellite observations, the ground-based station (GBS) ozone data are assimilated into our system to reproduce the existing OS. We describe the OSSE scheme in Fig. 1 with the links between the different elements: the observations, the free run (FR), the nature run (NR), the reference run (RR) and the assimilation run (AR).

The nature run
The selection of the nature run OSSE component (Fig. 1) is of utmost importance. The nature run model characterizes the true state of the atmospheric composition. A CTM is often used to simulate the nature run (Masutani et al., 2010) and, in turn, the nature run is used to simulate the reference state through a data simulator that includes the retrieval method and the instrument model. In this study, the ozone nature run is made up of two different models. On the one hand, the global CTM TM5 is run over Europe with a spatial resolution of 1° x 1°, a temporal output resolution of 3 hours, and 34 vertical layers from the surface up to 0.1 hPa. On the other hand, the regional LOTOS-EUROS AQ model (Manders et al., 2017) provides a description of the lowermost tropospheric air pollution over Europe, with a 7-km spatial resolution, a 1-hour temporal output resolution, and 4 vertical layers from the surface up to 3.5 km. The European Centre for Medium-Range Weather Forecasts (ECMWF) meteorological data are used as input for both LOTOS-EUROS and TM5.
The MACC global fire assimilation system (GFAS v1, Kaiser et al., 2012) is used for fire emissions in both models, and the TNO-MACC-II emission database (Kuenen et al., 2014) for surface anthropogenic emissions in LOTOS-EUROS. The model runs include a spin-up period of three months. The nature run is built by merging the LOTOS-EUROS ozone profiles from the surface to 3.5 km with the TM5 results from 3.5 km to the top of the model atmosphere.
The ozone representation within TM5 and LOTOS-EUROS has been validated in Van Loon et al. (2007) for the year 2001.

The reference run and the assimilation run
The reference run (RR), another essential component in the design of an OSSE, includes the assimilation of the existing OS data. In this OSSE, the GBS represent the existing OS at the surface. Therefore, in order to account for the impact of the existing OS, we assimilated the simulated GBS ozone observations from the nature run using MOCAGE-PALM as is done operationally. In addition, as stated previously, a well-designed OSSE should use a different model to generate the reference run than the one used for the nature run. In our work, we generated the reference run from the MOCAGE model constrained by the meteorological data from the Action de Recherche Petite Echelle Grande Echelle (ARPEGE) model (Courtier et al., 1991), which is different from the two models used to construct the nature run.
Concerning the assimilation run (AR), we assimilated the simulated satellite (S4, S5P) ozone data and GBS measurements derived from the nature run using the assimilation system MOCAGE-PALM. Three assimilation runs were performed:
i) the S4 ozone assimilation run (hereafter called S4_AR), which is the simultaneous assimilation of simulated ozone from S4 and GBS, to evaluate the added value of S4 ozone with respect to the existing OS;
ii) the S5P ozone assimilation run (hereafter called S5P_AR), which is the simultaneous assimilation of simulated ozone from S5P and GBS, to evaluate the added value of S5P ozone with respect to the existing OS;
iii) the S4 and S5P ozone assimilation run (hereafter called S4+S5P_AR), which is the simultaneous assimilation of simulated ozone from S4, S5P and GBS, to evaluate the synergy of the combined use of S4 and S5P ozone.

In this section, we discuss how we simulated the in situ and the satellite data from the nature run.

The simulated GBS ozone data
We derived the ground-based simulated ozone data from the surface representation of the nature run (LOTOS-EUROS model). The locations of the stations are taken from the AirBase dataset. The GBS data at these locations are routinely used in the operational system to forecast AQ (e.g. MACC reanalysis). The GBS are filtered by keeping the stations representative of the background ozone (urban and rural). We used 1132 stations measuring ozone over Europe (Fig. 3) to preserve a homogeneous spatial representativeness. The observation error is taken to be 5 parts per billion by volume (ppbv) in our assimilation system, in agreement with Jaumouillé et al. (2012).

4.2
The simulated S4 and S5P ozone data
Synthetic satellite ozone profile observations were generated from the nature run based on the S5P instrument model and characteristics for both S4 and S5P. The equatorial overpass time of S5P (13:30 local time of ascending node crossing) was adopted for the LEO simulations while the GEO data were generated using the S4 measurement geometry and an hourly measurement revisit time.
We focus on tropospheric ozone, and in our study the retrievals were performed in the 300-320 nm spectral window.
The Determining Instrument Specifications and Analyzing Methods for Atmospheric Retrieval (DISAMAR) package (de Haan, is used for ozone profile retrievals involving forward model radiative transfer calculations to simulate the measured spectrum followed by the optimal estimation (OE) method to retrieve the profiles (Rodgers, 2000). The retrieval results are stored in a compact form following the approach outlined by Migliorini et al. (2008). The basic idea of Migliorini et al. is to represent the optimal estimation retrieval results in a way that greatly reduces the volume of the data product without losing information, and that leads to an efficient interface with data assimilation by reducing the number of observations (removing the noisy part of the retrieval solution). This is done by the following steps:
i) Transforming the retrieval state in such a way that the a-priori does not explicitly enter the observation operator.
ii) Expressing the retrieval solution in the space of the eigenvectors of the retrieval problem.
iii) Rescaling the final eigenvectors such that the noise becomes equal to 1.0.
iv) Removing noise-dominated observations with no information.
As a result the a-priori vector and the covariance matrix (unity matrix) do not have to be stored, and only a truncated generalised averaging kernel is written to file, together with the retrieved values for the dominant states.

The transformations of the Migliorini et al. approach have specific aspects depending on the way the optimal estimation is implemented. Therefore we provide details on how this was implemented for our ozone case. In DISAMAR, a basis transformation known as pre-whitening is applied, leading to a transformed Jacobian $\tilde{K}$:

$$\tilde{K} = S_\varepsilon^{-1/2} K S_a^{1/2} = U W V^T,$$

where $S_\varepsilon$ and $S_a$ are the measurement and a-priori covariance matrices, respectively, and $K$ is the Jacobian matrix. On the right side, the singular value decomposition (SVD) has been applied. The diagonal matrix $W$ contains the singular values $w_k$ while the matrix $V$ contains the singular vectors ($U$ is not used here). The retrieved profile $\hat{x}$, a-priori profile $x_a$ and the true profile $x_{true}$ are connected by the averaging kernel $A$:

$$\hat{x} - x_a = A (x_{true} - x_a) + G \varepsilon,$$

where $G$ is the gain matrix and $\varepsilon$ describes the measurement noise. The covariance of the noise $S_{noise}$ is given by

$$S_{noise} = G S_\varepsilon G^T = S_a^{1/2} V \tilde{G}^2 V^T S_a^{1/2},$$

where the transformed gain matrix $\tilde{G}$ is a diagonal matrix formed of the singular values $w_k$:

$$\tilde{G} = \mathrm{diag}\left( \frac{w_k}{1 + w_k^2} \right).$$

The retrieval solution $\hat{x}$ is transformed three times. The first transformation shifts the solution, removing the need to provide the a-priori profile:

$$x^{(a)} = \hat{x} - (I - A) x_a = A x_{true} + G \varepsilon,$$

where $x^{(a)}$ is the shifted solution having the same $S_{noise}$ as $\hat{x}$. The second transformation rotates the shifted solution to obtain a diagonal covariance $\Lambda$:

$$x^{(b)} = V^T S_a^{-1/2} x^{(a)}, \qquad \Lambda = V^T S_a^{-1/2} S_{noise} S_a^{-1/2} V = \tilde{G}^2.$$

The storage of the covariance matrix can be avoided by scaling the rotated solution $x^{(b)}$:

$$x^{(c)} = \Lambda^{-1/2} x^{(b)} = A^{(c)} x_{true} + \varepsilon^{(c)},$$

where the covariance of $\varepsilon^{(c)}$ is the identity matrix $I$. The transformed solution $x^{(c)}$ is obtained from the retrieval parameters,

$$x^{(c)} = \Lambda^{-1/2} V^T S_a^{-1/2} \left[ \hat{x} - (I - A) x_a \right],$$

while the transformed averaging kernel $A^{(c)}$ is obtained from

$$A^{(c)} = \Lambda^{-1/2} V^T S_a^{-1/2} A.$$

To further reduce storage space requirements, only the leading $q$ eigenvectors are provided by storing only the first $q$ elements of $x^{(c)}$ and the first $q$ rows of $A^{(c)}$.
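The chain of transformations (pre-whitening, shift, rotation, rescaling, truncation) can be checked numerically. The sketch below is not DISAMAR code: it assumes the standard optimal estimation gain matrix, a Jacobian with at least as many measurements as state elements and no vanishing singular values, and the helper names `sym_sqrt` and `transform_retrieval` are hypothetical.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.sqrt(vals)) @ vecs.T

def transform_retrieval(K, S_eps, S_a, q):
    """Return the averaging kernel A, the first q rows of the transformed
    kernel A_c, and the noise covariance of the first q transformed states."""
    S_eps_isqrt = np.linalg.inv(sym_sqrt(S_eps))
    S_a_sqrt = sym_sqrt(S_a)
    S_a_isqrt = np.linalg.inv(S_a_sqrt)

    # Pre-whitened Jacobian and its SVD (singular values sorted descending).
    K_tilde = S_eps_isqrt @ K @ S_a_sqrt
    U, w, Vt = np.linalg.svd(K_tilde, full_matrices=False)

    # Diagonal of the transformed gain matrix, w_k / (1 + w_k^2).
    g = w / (1.0 + w**2)

    # Gain matrix and averaging kernel in the original state space.
    G = S_a_sqrt @ Vt.T @ np.diag(g) @ U.T @ S_eps_isqrt
    A = G @ K

    # Rotation (V^T S_a^-1/2) followed by rescaling (1 / g).
    T = np.diag(1.0 / g) @ Vt @ S_a_isqrt
    A_c = T @ A
    noise_cov_c = T @ (G @ S_eps @ G.T) @ T.T

    # Keep only the q leading components.
    return A, A_c[:q], noise_cov_c[:q, :q]
```

For any well-conditioned test case, the returned noise covariance should be the identity matrix to numerical precision, which is the property that lets the data product omit the covariance entirely.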
In our case, we considered the first six leading eigenvectors (q = 6), hereafter labelled as v1 to v6. Figure  middle-stratosphere ozone from lower stratosphere ozone. The averaging kernels (AKs) of the other eigenvectors are much smaller but these include crucial tropospheric information. As can be seen from the right panel, they provide information above the altitude of 10 hPa (v3) and, more importantly, also below 100 hPa (v3 and v4). The AKs for v5 and v6 show small values for all the levels and the absolute value becomes small compared to the noise level (=1). Comparing the set of AKs, the tropospheric information is likely contained in v1, v3 and v4, while the highest sensitivity of the retrieval is in the stratosphere. However, we used the first six leading eigenvectors (v1-v6) to keep nearly all the tropospheric information contained in the eigenvectors.
Moreover, keeping v1 to v6 safely covers the degrees of freedom for signal (DFS), which is typically of the order of 4-5.
Performing the OE retrieval for each measurement requires excessively large computational resources. For our simulation study, we simplified the retrieval process by introducing look-up tables (LUTs) for $A^{(c)}$ and $x^{(c)}$. Using the U.S. standard atmosphere temperature and ozone profiles, $A^{(c)}$ was obtained for LUT nodes for solar zenith angle (sza), viewing zenith angle  (1050,970,890,801,701,601,501,401,301,201). Note that the eigenstates and kernel rows are determined up to a plus or minus sign. Jumps from one sign to the other from one LUT point to the next will give problematic interpolation errors. Therefore an extra post-processing step was applied to the LUT by checking the sign of neighbouring points in the LUT and by multiplying the kernel vectors by -1 when needed.
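The sign post-processing can be sketched along a single LUT axis as follows. The criterion used here (a negative dot product between kernel rows at neighbouring nodes) and the function name `align_signs` are assumptions for illustration; the text only states that the signs of neighbouring points are compared.

```python
import numpy as np

def align_signs(kernel_rows):
    """Make one kernel row sign-consistent along consecutive LUT nodes.

    kernel_rows has shape (n_nodes, n_levels): the same kernel row
    (eigenvector index) evaluated at neighbouring LUT nodes.
    """
    aligned = np.array(kernel_rows, dtype=float, copy=True)
    for i in range(1, aligned.shape[0]):
        # A negative projection on the previous node signals a sign jump.
        if np.dot(aligned[i], aligned[i - 1]) < 0.0:
            aligned[i] *= -1.0
    return aligned
```

In a full LUT the matching element of $x^{(c)}$ would need the same sign flip so that the observation equation still holds, and the sweep would be repeated along each LUT dimension.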
Using the LUT, the synthetic satellite ozone observations are generated by:
i) Generating the orbit coordinates and individual pixel coordinates with an orbit simulator, for both S4 and S5P.
ii) Interpolating ECMWF high-resolution (cloud, temperature) meteorological fields to these orbits, and obtaining radiative cloud fractions and cloud heights.
iii) Interpolating the nature run results to the observation locations to obtain $x_{true}$.
iv) Interpolating $A^{(c)}$ from the LUT using the measurement geometry, cloud/surface pressure and cloud/surface albedo, weighting the result by radiative cloud fraction.
v) Computing the observation from $x_{obs} = A^{(c)} x_{true} + \mathrm{noise}$. The noise realisation is drawn from a Gaussian distribution with unit width.
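Steps iv) and v) above can be sketched as follows. This is an illustration under stated assumptions rather than the processing chain of the study: the linear clear/cloudy blend in `blend_kernels` is one plausible reading of "weighting the result by radiative cloud fraction", and both function names are hypothetical.

```python
import numpy as np

def blend_kernels(A_clear, A_cloudy, cloud_fraction):
    """Weight clear-sky and cloudy kernels by the radiative cloud fraction
    (assumed linear blend)."""
    return (1.0 - cloud_fraction) * A_clear + cloud_fraction * A_cloudy

def simulate_observation(A_c, x_true, rng):
    """Synthetic observation x_obs = A_c @ x_true + noise, with one
    unit-variance Gaussian noise draw per retained eigenvector."""
    noise = rng.standard_normal(A_c.shape[0])
    return A_c @ x_true + noise
```

Passing a seeded generator such as `np.random.default_rng(42)` keeps the synthetic data set reproducible between runs.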

Satellite observations error covariance matrix
The total observation error results from a sum of the observation error, as provided in the synthetic observations data product, and a representativeness error (Migliorini et al., 2008), that will be explained later in this section.

The synthetic observations are provided with a transformed AK, $A^{(c)}$, and with an added observation error $\varepsilon^{(c)}$ drawn randomly from a normal distribution with a covariance matrix equal to the identity matrix. The ozone retrievals are presented in the space of eigenvectors, and the data product contains the first six leading eigenvectors. The AKs are unique, are computed for each observation separately, and depend on the satellite geometry, surface albedo and cloud properties. The absolute value of this retrieval (the eigenvectors can contain negative values) is roughly a measure of the signal-to-noise ratio (SNR), since the observation error is equal to 1 (in ln-concentration space) by construction. Therefore the observation error covariance matrix is the identity matrix.
The representativeness error is often difficult to estimate and could describe, for instance, the mismatch between the satellite footprint and the model grid box, and also the differences in the information content between the satellite and the model vertical layers. Furthermore, other assumptions and inaccuracies in the observation operator (which transforms the model state into the observation space) also contribute to the representativeness error. Ceccherini et al. (2018) showed the importance of interpolation and coincidence errors for retrievals on different vertical grids in data fusion. For example, the difference in the layers between the retrieval grid and the MOCAGE CTM may easily lead to regridding (interpolation) errors that may make it difficult to assimilate stratospheric (high concentration) and tropospheric ozone (low concentration) together.
We estimate the representativeness error by using the nature run provided in the retrieval product. The representativeness contribution to the covariance matrix is computed by taking into account the nature run profiles and by calculating the where $H$ is a linear spatial interpolator and $N$ is the total number of pixels over the European domain within each summertime month. The values of the corresponding monthly standard deviation obtained are presented in Table 1 for S4 and S5P. Because random errors were added to the synthetic observations, we average the data each month to have robust statistics and also to take into account the possible change from month to month (intraseasonal variability).
The positive impact of including the representativeness error in R on the analysed ozone data has been evaluated for the assimilation of S4 ozone for the month of June using the values from Table 1. Using these values reduces the ozone weight in the stratosphere, favouring the ozone assimilation in the troposphere. It also allows a stable combined assimilation of the GBS ozone observations together with the S4 and S5P satellite data, and gives a stable normalized χ² statistic for the assimilation, with values ranging between 0.6 and 0.7 (not shown). Note that the values for eigenvectors v4 to v6 are unchanged (equal to 1 in ln-concentration space). The first three leading eigenvectors (v1 to v3) have higher SNRs than the others (v4 to v6), leading to a larger absolute error. The higher the SNR, the larger the representativeness error. Conversely, the relative error (
Figure 5 shows the histograms of Observation minus Analysis (OmA) and Observation minus Forecast (OmF) for the first six leading eigenvectors using R in the assimilation process for S4 (S5P ones are similar but not shown) during the month of June 2003. One can see clearly that the OmA histograms are narrower than the OmF ones for the first four leading eigenvectors (v1-v4). This shows that these eigenvectors have more impact on our assimilation system than the two others, likely because their information represents greater ozone concentrations, in particular for v1.
In agreement with the conclusions of Migliorini et al. (2008), this sensitivity study shows the need to add a representativeness error to the observation covariance matrix in order to improve the assimilation. In our case, this is especially noticeable for the first three eigenvectors that contain most of the ozone information.

We calculate the mean bias error (MBE), the mean absolute error (MAE), the root mean square error (RMSE) and its reduction rate or skill score, and the correlation coefficient to quantify the bias, the error and the agreement between the nature run and the reference run, or between the nature run and the assimilation run. The statistical indicators MBE, MAE, RMSE, skill score and correlation coefficient with respect to the nature run are defined as follows:
where $X$ can be $X_{RR}$ or $X_{AR}$, representing the reference run or the assimilation run data, respectively; $X_{NR}$ represents the nature run data; $N$ is the total number of data samples; and the over-bar symbol represents the arithmetic mean operator. The data selection for $X$ will depend on the chosen comparison.
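As an illustration of these indicators, the sketch below uses the textbook definitions of MBE, MAE, RMSE and the Pearson correlation coefficient. The skill score is written as the relative RMSE reduction of an assimilation run with respect to the reference run, which is an assumption consistent with its description as the RMSE reduction rate; any normalisation used to express the errors in percent is not reproduced. The function names are hypothetical.

```python
import numpy as np

def osse_metrics(x, x_nr):
    """Bias, error and agreement of a run x with respect to the nature run."""
    x = np.asarray(x, dtype=float)
    x_nr = np.asarray(x_nr, dtype=float)
    diff = x - x_nr
    return {
        "mbe": diff.mean(),                    # mean bias error
        "mae": np.abs(diff).mean(),            # mean absolute error
        "rmse": np.sqrt((diff**2).mean()),     # root mean square error
        "corr": np.corrcoef(x, x_nr)[0, 1],    # Pearson correlation
    }

def skill_score(rmse_ar, rmse_rr):
    """Assumed form: relative RMSE reduction of the assimilation run with
    respect to the reference run (positive means improvement)."""
    return 1.0 - rmse_ar / rmse_rr
```

Under this assumed form, a skill score of 0.3 corresponds to an assimilation-run RMSE that is 30% lower than that of the reference run.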
This overestimation can also be seen by studying the MBE profile over the period (Fig. 7a). The bias for the three assimilation runs is about 20% smaller than that of the reference run in the upper troposphere, similar but with opposite sign in the mid troposphere and about 10% greater than that of the reference run in the lower troposphere. The RMSE (Fig. 7b) is up to 20% lower for the assimilation runs than for the reference run in the mid-to-upper troposphere (200 to 600 hPa) but slightly greater below this level (up to 5%). The mean skill score profile over the period (Fig. 7c) shows a reduction of the RMSE in the mid-to-upper troposphere (above 600 hPa) reaching more than 30% above 450 hPa.
According to the tropospheric profiles analysis presented above, we selected three levels that will be more extensively validated in the upper (200 hPa), mid (500 hPa) and upper part of the lower troposphere (700 hPa). Other intermediate levels have also been studied (not shown) but these three levels are the most helpful to explain the results obtained in this OSSE.
Figure 10 (left column) shows the MAE fields averaged over the studied period at 200 hPa for S4_AR, S5P_AR, S4+S5P_AR and the reference run. The MAE fields for the three assimilation runs are similar but much smaller than that of the reference run. In general, the MAE is smaller in the northern part than in the southern part of the European domain. This is especially marked for the S4_AR and S4+S5P_AR compared to S5P_AR, showing a slightly higher added value of S4 data at this level. The spatially averaged MAE time series of the three assimilation runs (Fig. 10, left column bottom) are lower than that of the reference run throughout the studied period. The MAE of the reference run goes from 35% up to 65%. Conversely, the MAE of the three assimilation runs is much smaller, ranging between 20% and 45%, with an average of about 30%. The simultaneous assimilation of both S4 and S5P data provides a slightly smaller MAE than S4_AR, which in turn is smaller than S5P_AR. This demonstrates the benefit of the assimilation of the satellite data at 200 hPa, in particular the synergy of both S4 and S5P data.
Figure 11 (left column) shows the mean skill score fields and time series for the three assimilation runs over the studied period at 200 hPa. There is a net improvement in the full domain in terms of skill score for the three assimilation runs, with S4+S5P_AR presenting slightly greater skill score values than S4_AR, and in turn S4_AR performs better than S5P_AR.

At 500 hPa, the assimilated ozone fields for S4_AR, S5P_AR and S4+S5P_AR present similar patterns and are closer to the nature run than the reference run (Fig. 9, middle column). However, there is an overestimation of ozone in the South-East part coming from the assimilation of S4 data as one can see for S4_AR and S4+S5P_AR. We discuss this overestimation in Sect. 7.
As shown in Fig. 10 (middle column), the MAE fields for the three assimilation runs present small values over the whole European domain (around 10%), which are similar to each other but much smaller than those of the reference run. Greater MAE values are located in the North-West part of the European domain, reaching up to 20% in particular for S5P_AR, and in the South-West part of the European domain, with values up to 22% for the three assimilation runs, but still smaller than the reference run MAE values. In addition, the S4_AR and S4+S5P_AR exhibit MAE values reaching up to 22% in the South-East part of the European domain, consistent with the overestimation found in the ozone fields. Regarding the temporal evolution (Fig. 10, middle column bottom), the MAE for the three assimilation runs is stable during the whole studied period, ranging from 10% to 15%, while the MAE for the reference run increases during July and August, reaching more than 25%. The MAE for S4+S5P_AR is similar to that of S4_AR, but slightly smaller than that of S5P_AR. These results show the benefit of the assimilation of either S4 or S5P satellite data at 500 hPa, but the synergy between the two instrument observations does not improve the analysis.
In the northern part of the European domain, the skill score for the three assimilation runs shows greater values than in the southern part of the European domain (Fig. 11, middle column). For the particular case of S4+S5P_AR and S4_AR, a negative skill score is found in the South-East, associated with the ozone overestimation mentioned above. In June, the skill score shows a significant variability while, in July and August, the mean skill score value is positive and increases, reaching a stable value of 0.4 from mid-July to the end of August as shown in Fig. 11 (middle column bottom).
The skill score mean fields (Fig. 11, right column) for the three assimilation runs are clearly separated into two parts: one with positive values (North-West part of the European domain, coloured in red) and the other with negative values (southern and eastern parts of the European domain, coloured in blue). The positive skill score region indicates an RMSE reduction of the assimilation runs with respect to the RMSE of the reference run reaching up to 30%. The negative skill score pattern found in the southern and eastern parts of the domain is consistent with the ozone overestimation and the greater MAE. Note that for August (Fig. 11, right column bottom), the skill score of the assimilation runs becomes positive, following the behaviour of the MAE time series.
As at the higher levels (200 and 500 hPa), the three assimilation runs improve the correlation coefficient field at 700 hPa (Fig. 12, right column) compared to the reference run values. This shows an improvement in terms of patterns over almost the full European domain. The correlation coefficient values for the assimilation runs range between 0.4 and 0.8, whereas those of the reference run range from 0.2 to 0.75. The improvement is also highlighted by the location of the histogram peak for the full set of data (Fig. 12, right column bottom), which has increased from 0.55 (reference run) [...]
In Sect. 6, we presented the distribution of the mean ozone fields, the mean absolute error, the skill score (equivalent to RMSE reduction) and the correlation coefficient for 200, 500 and 700 hPa. The metrics were chosen to evaluate the added value of the assimilation of the S4 and S5P data in terms of absolute error, improvement of the error, and agreement with the nature run.
The added value of the S4 and S5P data is well characterized when all three metrics consistently show an improvement.
However, the improvement in terms of MAE and skill score depends on the assimilation run time and/or the region studied for each level.
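The three metrics discussed above can be sketched in a few lines of Python. This is only an illustrative sketch: the normalisation of the MAE by the mean nature-run value (to express it in percent) and the skill score form 1 - RMSE(assimilation run) / RMSE(reference run) are assumptions consistent with the description "equivalent to RMSE reduction", not definitions taken from this section. The function names and the toy values are hypothetical.

```python
import numpy as np

def mae_percent(run, nature):
    """MAE relative to the mean nature-run value, in percent (assumed normalisation)."""
    run, nature = np.asarray(run, float), np.asarray(nature, float)
    return 100.0 * np.mean(np.abs(run - nature)) / np.mean(nature)

def rmse(run, nature):
    """Root mean square error of a run with respect to the nature run."""
    run, nature = np.asarray(run, float), np.asarray(nature, float)
    return np.sqrt(np.mean((run - nature) ** 2))

def skill_score(assim, reference, nature):
    """Assumed form: 1 - RMSE(assimilation run) / RMSE(reference run).
    Positive when the assimilation run is closer to the nature run."""
    return 1.0 - rmse(assim, nature) / rmse(reference, nature)

def correlation(run, nature):
    """Pearson correlation coefficient between a run and the nature run."""
    return np.corrcoef(run, nature)[0, 1]

# Toy ozone time series at one grid point (arbitrary units, made up for illustration)
nature = np.array([50.0, 60.0, 70.0, 80.0])
reference = np.array([40.0, 45.0, 50.0, 55.0])
assim = np.array([48.0, 57.0, 66.0, 75.0])

print(mae_percent(reference, nature), mae_percent(assim, nature))
print(skill_score(assim, reference, nature))
print(correlation(assim, nature))
```

With these definitions, a skill score of 0 means no change with respect to the reference run, and a negative value means that the assimilation run has a larger RMSE than the reference run, as reported for the South-East corner.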
At 200 hPa, the results obtained from these metrics for the three assimilation runs are consistent throughout the studied period (JJA) and over the whole domain. Compared to the reference run, we find a reduction of around 30% and up to 50% for the MAE and for the RMSE, respectively. Moreover, there is an increase of the correlation coefficient of 0.1 with a narrower histogram. Clearly, the assimilation of satellite data brings ozone information at this level, which is in line with the vertical sensitivity of the satellite data used in this work.
Conversely, for the levels 500 and 700 hPa, the results show on average an added value, but not for the whole studied period and/or the whole domain. Regarding the studied period, an added value is shown for July and August at 500 hPa, and for August at 700 hPa. This delay is likely due to the information from the levels above, which impacts these lower levels. For July and August, we find a reduction of the MAE for the assimilation runs of more than 10% with respect to the reference run and a skill score value reaching more than 0.4 at 500 hPa. At this level, we obtain an increase of the correlation coefficient from 0.25 to 0.35 when compared to the reference run for the whole period. At 700 hPa and for August, the reduction of the MAE for the assimilation runs compared to the reference run is around 5% and the skill score is around 0.2. At this level and for the whole period, the correlation coefficient of the assimilation runs increases by between 0.05 and 0.15 compared to that of the reference run.
A detailed analysis of these results at 500 hPa shows that the improvement of the skill score during July and August is due to the fact that the RMSE of the reference run increases during this period (as seen in the MAE) while the RMSE of the assimilation runs is stabilized by the impact of the assimilation of both S4 and S5P data (not shown). A similar behaviour is seen at 700 hPa but for August. This indicates that there is not a clear direct impact of S4 and S5P data at these two levels, likely due to the low sensitivity of these two instruments in the lower troposphere.
An ozone overestimation occurs in the South-East corner of the European domain. This is clearly seen at 500 hPa for the assimilation runs containing S4 data and is more pronounced at 700 hPa for the three assimilation runs. To better understand this, we calculate the zonal mean ozone profiles during the summer 2003 for four different latitudinal bands and the South-East corner, which are presented in Fig. 13. One can clearly see that the shape of the reference run ozone profile is similar to that of the nature run for high latitudes, but significantly different for lower latitudes, especially for the South-East corner. The gradient of the assimilation run profiles appears to be in line with that of the reference run. In our assimilation system, we used a B-matrix which does not evolve in time, with the variance proportional to the model profile, as described in Sect. 2. The assimilation of eigenvectors can be understood as the assimilation of several partial columns (6 in our case) with associated vertical sensitivity represented by the transformed AKs. From Fig. 4, one can conclude that the shape of the AKs in the troposphere is very similar, meaning that there is at most a tropospheric column information, with higher sensitivity in the upper troposphere. If we consider a single model grid point, the assimilation process spreads the eigenvector information in the vertical (from the stratosphere, with very high values, to the troposphere, with very low values) by calculating an increment profile that minimizes the distance to the assimilated data. The assimilated profile is then a shifted profile resulting from the sum of the background profile and the calculated increment profile, and is highly dependent on the background profile shape and the B-matrix.
In particular, when the background profile shape does not follow the nature run shape, this can sometimes lead to a significant over- or underestimation in the levels where the assimilated data has a low sensitivity. The S4 and S5P satellite sensors have a higher sensitivity in the upper troposphere and a lower sensitivity in the lower troposphere for ozone. In our case, the nature run ozone concentration is greater than the background in the upper levels, resulting in an overestimation of ozone in the lower levels, because the distribution of the ozone information is governed by the background profile shape and the AKs, which have much higher sensitivity in the upper troposphere compared to the lower troposphere. Notice that the vertical extent of the impact of GBS data assimilation is limited by the background error vertical correlation, whose length scale has been set to one model level, and therefore the use of this data does not compensate for this effect at 700 hPa and the levels above.
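The mechanism described above can be illustrated with a minimal single-column sketch in Python. It is not the assimilation scheme used in this study: it assumes a textbook linear analysis update, dx = B H^T (H B H^T + R)^-1 (y - H xb), a purely diagonal static B whose standard deviation is proportional to the background profile, and a single hypothetical averaging-kernel row that is more sensitive aloft. All numerical values are made up for illustration.

```python
import numpy as np

def analysis_increment(xb, y, H, B, R):
    """Linear analysis increment dx = B H^T (H B H^T + R)^-1 (y - H xb)."""
    innovation = y - H @ xb
    S = H @ B @ H.T + R                      # innovation covariance
    return B @ H.T @ np.linalg.solve(S, innovation)

# Hypothetical 4-level column, ordered from top (index 0) to bottom (index 3)
xb = np.array([100.0, 60.0, 50.0, 40.0])      # background profile
x_true = np.array([130.0, 60.0, 50.0, 40.0])  # "truth": larger only at the top level

# One partial-column observation, more sensitive in the upper levels (assumed values)
H = np.array([[0.6, 0.3, 0.1, 0.05]])

# Static diagonal B: standard deviation assumed to be 20% of the background profile
B = np.diag((0.2 * xb) ** 2)
R = np.array([[1.0]])                         # observation error variance (assumed)

y = H @ x_true                                # noise-free synthetic observation
dx = analysis_increment(xb, y, H, B, R)
xa = xb + dx                                  # analysis profile

print("increment:", dx)
print("analysis minus truth:", xa - x_true)
```

In this toy setting the background is too low only at the top level, yet the increment is positive at every level, so the analysis ends up above the truth in the lower levels. How the increment is distributed depends only on B and on the averaging kernel, which is the behaviour invoked above to explain the overestimation.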

Conclusions
We performed assimilation runs with synthetic data that mimic S4 and S5P satellite observations over Europe and during the period of the summer 2003. The reference run was performed with the assimilation of the simulated GBS ozone data, using the same approach that is commonly used in an operational AQ forecast system. We analysed the troposphere, with a focus on the levels at 200, 500 and 700 hPa.
For the development of the ozone OSSE, an efficient interface to ozone observations has been used. More specifically, one of the innovations of this work is the generation of the ozone profile information in the form of leading eigenvectors of the radiative transfer code. This represents a very efficient and convenient interface between the retrievals and the data assimilation system. The use of this approach has been validated in this work. In addition, we have shown the importance of correctly adding the representativeness uncertainties into the observation error covariance matrix.
The OSSE that we have set up is as little overoptimistic as possible to ensure the robustness of the results. The retrieved ozone profiles of S4 and S5P were obtained using the same spectral range (300-320 nm), and stored in the form of synthetic observations (six leading eigenvectors). Note that the instrumental characteristics chosen for the retrievals do not use all the capacities of S5P and S4 but are assumed to be consistent with the actual characteristics of S5P (which is already flying) and the future S4 missions. The nature run is composed of two different CTMs, LOTOS-EUROS and TM5, and is built by merging the ozone profiles from the former for the boundary layer with the ones from the latter from the free troposphere to the stratosphere. A different model (MOCAGE) was used to perform the reference and assimilation runs, in order to avoid the identical twin problem. The diagonal of the background error covariance matrix is proportional to the model ozone profiles and does not evolve in time.
Under these conditions, we show that both S4 and S5P bring information from the upper troposphere to the middle troposphere. The maximum added value is above 500 hPa. As expected, the assimilation of both S4 and S5P ozone shows better results than the reference run and is closer to the nature run up to these altitudes (in terms of mean absolute error, skill score and correlation coefficient). At 200 hPa there is a reduction of MAE from more than 60% to a more stable MAE of about 30% (for S4+S5P_AR). There is also a reduction in the RMSE (skill score) of the assimilation runs of up to 50% compared to the reference run and a better correlation with the nature run.
The behaviour of the assimilation runs S4+S5P_AR and S4_AR is quite similar in terms of MAE, reduction of RMSE (skill score) and correlation coefficient, and slightly better than the S5P ozone assimilation (S5P_AR) between 200 and 700 hPa.
However, there is no significant difference between the added value given by the GEO S4 and the LEO S5P. This is likely due to the fact that there is no diurnal cycle of ozone above the boundary layer, so the information provided by a LEO is still adequate to constrain the model.
The outcome of our study is a result of the OSSE design and the choices made for the components of the entire system: the synthetic observation characteristics and uncertainty estimates, the assimilation approach, the treatment of the observations in the assimilation, and the modelling characteristics. Under these conditions, we show that a significant benefit from the S4 and S5P observations is found in the middle troposphere (200-500 hPa). Moreover, at 200 hPa, the S4 and S5P increment values obtained are larger than in the lower troposphere, showing the added value obtained at this level from S4 and S5P ozone.
However, we did not find any significant impact in the lower troposphere (nor at the surface, not shown in the present study) from any of the experiments based only on these UV ozone profile observations. From these observations, we obtain about one piece of information in the troposphere, with a larger sensitivity in the free troposphere compared to the boundary layer. These results confirm that the use of observations derived from the UV is of limited use to obtain the ozone distribution within the boundary layer, required for air quality. The assimilation of retrievals of total column ozone from S5P real data is currently being tested and appears to have a small impact in the CAMS analysis (Inness et al., in review, 2018). A way to overcome this issue is to combine observations from various wavelength ranges, such as UV and Infrared or UV and Visible.