Aerosol data assimilation in the chemical transport model MOCAGE during the TRAQA/ChArMEx campaign: Lidar observations

This paper presents the first results of the assimilation of extinction coefficient measurements from CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization), on board the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) satellite, into the chemistry transport model MOCAGE (Modèle de Chimie Atmosphérique à Grande Echelle) of Météo-France. This assimilation module is an extension of the Aerosol Optical Depth (AOD) assimilation system already presented by Sič et al. (2016). We focus on the period of the TRAQA (TRAnsport à longue distance et Qualité de l’Air dans le bassin méditerranéen) field campaign, which took place during the summer of 2012. This period offers access to a large set of aerosol observations from instrumented aircraft, balloons, satellites and ground-based stations. We evaluate the added value of CALIOP assimilation with respect to the model free run by comparing both fields to independent observations from the TRAQA field campaign. In this study we focus on the desert dust outbreak which occurred in late June 2012 over the Mediterranean Basin (MB) during the TRAQA campaign. The comparison with the AERONET (Aerosol Robotic Network) AOD measurements shows that the assimilation of CALIOP lidar observations improves the statistics compared to the model free run. The correlation between AERONET and the model (assimilation) is 0.682 (0.753); the bias and the RMSE, due to CALIOP assimilation, are reduced from -0.063 to 0.048 and from 0.183 to 0.148, respectively. Compared to MODIS (Moderate-resolution Imaging Spectroradiometer) AOD observations, the model free run underestimates the AOD values, whereas the CALIOP assimilation corrects this underestimation and yields a clear quantitative improvement of the AOD maps over the MB. The correlation between MODIS and the model (assimilation) during the dust outbreak is 0.47 (0.52), whereas the bias is -0.18 (-0.02) and the RMSE is 0.36 (0.30).
The comparison of in-situ aircraft and balloon measurements to both modelled and assimilated outputs shows that the CALIOP lidar assimilation substantially improves the model aerosol field. The evaluation with the LOAC (Light Optical Particle Counter) measurements indicates that the aerosol vertical profiles are well simulated by the direct model but with a general underestimation
https://doi.org/10.5194/amt-2019-482 Preprint. Discussion started: 28 January 2020 © Author(s) 2020. CC BY 4.0 License.

resolution, but with a very low spatial coverage. On the other hand, modelling has the advantage of providing a 3D detailed spatio-temporal representation of different types of aerosols. Nevertheless, models generally face problems related mainly to the initial conditions, spatial resolution and emissions. Data assimilation is often used to overcome these difficulties and thus to improve the representation of different types of aerosols within models.
Chemical data assimilation consists of optimally combining observations provided by instruments with a priori knowledge about a physical system, such as model output. The observations act as constraints on the models, and can thus be used to overcome model deficiencies (e.g., El Amraoui et al., 2014). Typically, observation-minus-forecast (OMF) statistics are used for monitoring biases between the observations and the models (e.g., El . Data assimilation systems produce a self-consistent four-dimensional (time and space) description of the dynamical and chemical state of the atmosphere, taking into account both the available chemical observations and our theoretical understanding of the atmospheric system.
The assimilation of different aerosol components has been conducted in many studies, including AOD (e.g., Rasch et al., 2001; Zhang and Reid, 2006; Niu et al., 2008; Benedetti et al., 2009; Liu et al., 2011; Shi et al., 2011), particulate matter (PM) (e.g., Tombette et al., 2009; Lee et al., 2013) and lidar profiles (e.g., Sekiyama et al., 2010; Zhang et al., 2011; Wang et al., 2014). Most of these studies have shown that data assimilation of different aerosol-related quantities has a positive impact on aerosol forecasting, especially during the first forecast hours. Moreover, the assimilation of lidar profiles has the advantage of constraining the vertical structure of the model in a much more realistic and direct way. Consequently, the assimilation of lidar information would serve to reduce the influence of diffusion and better constrain the vertical structure (Campbell et al., 2010).
The assimilation module presented in this work is an extension of the AOD assimilation system already described by Sič et al. (2016).
We consider the extinction coefficient measurements from the CALIOP lidar on board the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) satellite. We focus on the African dust event that occurred in late June-early July 2012. The objectives of this study are to:
1. Present the lidar assimilation module as well as the first results dealing with the assimilation of CALIOP observations in terms of extinction coefficient into the MOCAGE CTM.
2. Evaluate the impact of lidar assimilation on the 3D tropospheric aerosol distribution at regional scale during this large-scale event.
The lidar measurements from the CALIOP instrument are assimilated into the MOCAGE CTM using the variational 3D-FGAT (First Guess at Appropriate Time) method. The impact of the CALIOP extinction coefficient assimilation on the aerosol distribution has been evaluated using a set of independent data including AERONET (AErosol RObotic NETwork), MODIS, aircraft and balloon measurements.
The paper is organised as follows. In Section 2 we describe the CALIOP lidar measurements, which are assimilated in terms of extinction coefficient, as well as the model and the assimilation system used in this study. Section 3 presents the independent observations used for the evaluation of CALIOP assimilation: AOD observations from MODIS and AERONET as well as the in-situ measurements collected during the TRAQA field campaign. Results concerning the assimilation of CALIOP lidar measurements during the TRAQA field campaign are presented in Section 4. Summary and conclusions are presented in Section 5.
2 Data and Analysis

2.1 Assimilated observations: CALIPSO/CALIOP measurements
The CALIPSO satellite is a partnership between NASA (National Aeronautics and Space Administration) and the French Space Agency, CNES (Centre National d'Etudes Spatiales). It was launched on April 28, 2006 with the cloud profiling radar system on the CloudSat satellite. It flew in the international "A-Train" constellation for coincident Earth observations until September 13, 2018, when CALIPSO began lowering its orbit from 705 km to 688 km above the Earth to resume formation flying with CloudSat as part of the "C-Train" (see: https://atrain.nasa.gov/). The CALIPSO satellite comprises three instruments: the CALIOP lidar, the IIR (Imaging Infrared Radiometer), and the WFC (Wide Field Camera). For more information on the CALIPSO measurements, the reader is referred to the website: https://www-calipso.larc.nasa.gov/.
The CALIPSO satellite provides new insight into the role that clouds and atmospheric aerosols (airborne particles) play in regulating Earth's weather, climate, and air quality (e.g., Huang et al., 2015). CALIOP is a two-wavelength lidar that has the ability to differentiate between types of aerosols. It provides the processed backscatter signal, and the retrieved backscattering and extinction coefficients. CALIOP is an elastic backscatter lidar operating at 532 nm and 1064 nm. It generally has two ascending and two descending orbits per day, with a frequency of measurements that varies from day to day.
CALIOP provides continuous measurements both in space and time but with limited horizontal coverage, since CALIOP has a very narrow swath. Nevertheless, it provides aerosol and cloud profiles with a vertical resolution of 30 to 60 m (Winker et al., 2012). In this study, the CALIOP aerosol profiles selected for assimilation must satisfy the following quality-control criteria (see e.g., Cheng et al., 2019): the measurements in each level of the vertical profile are not contaminated by clouds, the extinction coefficient must be greater than 0, and the extinction quality control flag must be equal to 0 or 1.
It should be noted that the vertical resolution of the CALIOP observations is much higher than that of the model. Before the assimilation, each CALIOP profile is adjusted to the model resolution. We first choose a vertical grid that best fits the model. This vertical grid is projected onto each profile of the CALIOP data in such a way that each level corresponds to the middle of the layer. The intermediate levels are then averaged inside each layer. Thus, the profile matches the model resolution as closely as possible while keeping as much of the vertical information as possible.
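The screening and vertical averaging described above can be illustrated with a minimal sketch. It assumes a single profile held in NumPy arrays; the function names, the array names and the layer edges are hypothetical and do not correspond to CALIPSO product fields or to the MOCAGE grid.

```python
import numpy as np

def screen_profile(extinction, cloud_free, qc_flag):
    """Set to NaN every level that fails one of the three criteria."""
    extinction = np.asarray(extinction, dtype=float)
    keep = (
        np.asarray(cloud_free, dtype=bool)   # level not contaminated by clouds
        & (extinction > 0.0)                 # strictly positive extinction
        & np.isin(qc_flag, (0, 1))           # extinction QC flag equal to 0 or 1
    )
    return np.where(keep, extinction, np.nan)

def average_to_layers(z_obs, values, layer_edges):
    """Average the valid fine-resolution values that fall inside each coarse layer."""
    z_obs = np.asarray(z_obs, dtype=float)
    values = np.asarray(values, dtype=float)
    n_layers = len(layer_edges) - 1
    out = np.full(n_layers, np.nan)
    for k in range(n_layers):
        inside = (
            (z_obs >= layer_edges[k])
            & (z_obs < layer_edges[k + 1])
            & ~np.isnan(values)
        )
        if inside.any():
            out[k] = values[inside].mean()
    return out

# Hypothetical profile: altitude (m), extinction (km^-1), cloud mask, QC flag
z = np.array([50.0, 150.0, 250.0, 350.0, 450.0, 550.0])
ext = np.array([0.10, 0.20, -0.01, 0.30, 0.40, 0.50])
cloud_free = np.array([True, True, True, True, False, True])
qc = np.array([0, 1, 0, 0, 0, 2])
edges = np.array([0.0, 200.0, 400.0, 600.0])   # assumed coarse layer edges (m)

screened = screen_profile(ext, cloud_free, qc)
coarse = average_to_layers(z, screened, edges)
```

Layers that contain no valid fine-resolution level stay undefined (NaN) in this sketch, so that no observation is invented where the screening rejected all the data.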
We will thus see the ability of the CALIOP aerosol observations to constrain the MOCAGE model and to provide added value when assessed against independent observations.
2.2 The model and assimilation system
MOCAGE (e.g., Josse et al., 2004; Teyssèdre et al., 2007) is a global 3D-CTM which covers the planetary boundary layer, the free troposphere and the stratosphere. It provides a number of optional configurations with varying domain geometries and resolutions, as well as chemical and physical parametrization packages. In this study, MOCAGE is forced dynamically by wind and temperature fields from the analyses of the ARPEGE (Action de Recherche Petite Echelle Grande Echelle) model, the global operational weather prediction model of Météo-France (Courtier et al., 1991). It is run in a two-domain configuration with a global grid of 2° × 2° and a smaller nested domain with a grid of 0.2° × 0.2° (called MEDI02) including the MB and the Sahara. The assimilation is done only on the nested domain MEDI02. The model uses a semi-Lagrangian transport scheme and includes 47 hybrid ($\sigma$, $P$) levels from the surface up to 5 hPa, where $\sigma = P/P_s$; $P$ and $P_s$ are the pressure and the surface pressure, respectively. In the boundary layer, MOCAGE has seven levels with a vertical resolution between 40 and 400 m. In the free troposphere, the vertical resolution of MOCAGE varies from 400 to 800 m. Modelled aerosol species for this study are black carbon (BC), primary organic carbon (OC), desert dust and sea salt (Martet et al., 2009; Sič et al., 2015).
Biomass burning sources of BC and OC aerosols used in this study are the same as those used in Sič et al. (2016). They are taken at daily frequency from the Global Fire Assimilation System (GFAS) version 1.1 (Kaiser et al., 2012). The set of represented aerosol species is far from complete: secondary aerosols, which can make up the major part of the fine fraction, are lacking. This partly explains the generally negative biases observed in their study. Assimilation directly corrects this bias, and possibly also improves the RMSE and the correlation. The particle size distribution for each aerosol type is divided into six bins. The diameter range of the different primary aerosol bins considered within the MOCAGE model is presented in Table 1. In total, we have 24 aerosol bins. Each aerosol bin is considered as a passive tracer during the model integration (including emission, transport and removal processes from the atmosphere). However, there are no physical transformations or chemical reactions between different types of aerosols/bins with gases. More details about the different parametrizations used within the MOCAGE CTM as well as the primary aerosols can be found in Sič et al. (2015).
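The assimilation system used in this study is the same as described in Sič et al. (2016). It uses the 3D-FGAT method, which compares the observation and background fields at the correct time and assumes that the increment to be added to the background state is constant over the entire assimilation window. This technique has already produced good-quality results compared to independent data, especially for O3 (e.g., Semane et al., 2007; El Amraoui et al., 2008a, b; Rabier et al., 2010; Bencherif et al., 2011), CO (e.g., El Amraoui et al., 2010; Claeyman et al., 2011b), H2O (e.g., Payra et al., 2016) and AOD (e.g., Sič et al., 2016). This variant has the advantage that the linearised operator of the model evolution and its adjoint are replaced by the identity. The cost function of the 3D-FGAT incremental form is:

$$J(\delta x) = \frac{1}{2}\,\delta x^{T}\,\mathbf{B}^{-1}\,\delta x + \frac{1}{2}\sum_{i=0}^{N}\left(d_i - \mathbf{H}_i\,\delta x\right)^{T}\mathbf{R}_i^{-1}\left(d_i - \mathbf{H}_i\,\delta x\right)$$

where $\delta x = x - x^{b}$ is the increment with respect to the background state $x^{b}$, and $\mathbf{B}$ and $\mathbf{R}$ are the background and the observation error covariance matrices, respectively ($\mathbf{R}_i$ denotes $\mathbf{R}$ for the observations at time $t_i$). In order to minimize the cost function more efficiently and to improve the convergence, the increment $\delta x$ is transformed to:

$$v = \mathbf{B}^{-1/2}\,\delta x$$

In this way the cost function becomes:

$$J(v) = \frac{1}{2}\,v^{T}v + \frac{1}{2}\sum_{i=0}^{N}\left(d_i - \mathbf{H}_i\,\mathbf{B}^{1/2}v\right)^{T}\mathbf{R}_i^{-1}\left(d_i - \mathbf{H}_i\,\mathbf{B}^{1/2}v\right)$$

where $d_i = y_i - \mathcal{H}_i\left(x^{b}(t_i)\right)$ is the departure, at time $t_i$, between the observation vector $y_i$ and its model equivalent in the observation space. The operator $\mathbf{H}_i$ is the tangent linear of the observation operator $\mathcal{H}_i$.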

In this formulation, there is no need for the explicit specification of the inverse matrix $\mathbf{B}^{-1}$. Other advantages of such an approach are presented by Courtier et al. (1994).
The minimization of the cost function in its preconditioned form gives, as a result, an analysis increment $v^{a}$ in the space of the variable $v$. After the minimization, it is necessary to return to the model space, and the increment is calculated as:

$$\delta x^{a} = \mathbf{B}^{1/2}\,v^{a}$$

More details on the assimilation algorithm are given by Pannekoucke and Massart (2008) and Massart et al. (2012).
The background error covariance matrix $\mathbf{B}$ can be represented as:

$$\mathbf{B} = \boldsymbol{\Sigma}\,\mathbf{C}\,\boldsymbol{\Sigma}$$

where $\boldsymbol{\Sigma}$ is the diagonal matrix of the square root of the variances, and $\mathbf{C}$ is the positive definite symmetric matrix of horizontal and vertical correlations.
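As an illustration of the preconditioned minimization and of the return to model space, the following sketch solves a toy problem in closed form. It is not the MOCAGE implementation: it assumes a single time slot (the time index is dropped), a linear observation operator, a three-level state, and a Cholesky factor as one valid square root of B. All numerical values and names are hypothetical.

```python
import numpy as np

def analysis_increment(d, H, sigma_b, C, r_var):
    """Minimise the preconditioned quadratic cost function in closed form."""
    Sigma = np.diag(sigma_b)
    B = Sigma @ C @ Sigma                  # B = Sigma C Sigma
    L = np.linalg.cholesky(B)              # assumed square root: L @ L.T == B
    R_inv = np.diag(1.0 / r_var)           # uncorrelated observation errors
    G = H @ L                              # observation operator seen from v-space
    # Setting the gradient of J(v) to zero gives:
    # (I + G^T R^-1 G) v = G^T R^-1 d
    A = np.eye(L.shape[0]) + G.T @ R_inv @ G
    v = np.linalg.solve(A, G.T @ R_inv @ d)
    return L @ v, B                        # increment back in model space

# Hypothetical three-level state with one observation of the middle level
sigma_b = np.array([0.3, 0.3, 0.3])
C = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.5],
              [0.1, 0.5, 1.0]])
H = np.array([[0.0, 1.0, 0.0]])
r_var = np.array([0.15 ** 2])
d = np.array([0.2])                        # departure y - H(x_b)

dx, B = analysis_increment(d, H, sigma_b, C, r_var)
```

In this toy case the increment is largest at the observed level and spreads to the neighbouring levels through the correlations in C. An operational system would minimise the cost function iteratively rather than build and solve the full linear system.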
The correlation matrix C contains both horizontal and vertical operators. The horizontal correlation is modelled using a two-dimensional Gaussian function (Weaver and Courtier, 2001; Weaver and Ricci, 2003; Pannekoucke and Massart, 2008) with a homogeneous length scale both in latitude and longitude.
The horizontal correlation ($C^{h}_{m,n}$) between two points ($m$ and $n$) separated by a distance ($\delta_{m,n}$) is a Gaussian function of $\delta_{m,n}$ controlled by $L_x$ and $L_y$, the longitude and latitude length scales in kilometres, respectively:

$$L_x = 2R_e \cdot \sin\left(\frac{\alpha_x \pi}{360}\right) \quad \text{and} \quad L_y = 2R_e \cdot \sin\left(\frac{\alpha_y \pi}{360}\right)$$

$R_e$ is the Earth's radius (6371.22 km) and $\alpha_x$ and $\alpha_y$ are the longitude and latitude length scales, respectively, in degrees.
In this study, both $\alpha_x$ and $\alpha_y$ are constant and fixed to 0.2° (the same as the grid resolution of the assimilated domain), which corresponds to a length scale of about 20-22 km.
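The conversion from a length scale in degrees to kilometres can be checked numerically with the short sketch below. The length-scale formula and the Earth radius are those given in the text. The Gaussian correlation function is an assumption: it uses a single isotropic length scale (reasonable here because both length scales are equal), and its exact normalisation may differ from the one used in the assimilation system.

```python
import math

EARTH_RADIUS_KM = 6371.22   # Earth radius quoted in the text

def length_scale_km(alpha_deg):
    """Chord length in km subtended by an angle of alpha_deg degrees."""
    return 2.0 * EARTH_RADIUS_KM * math.sin(alpha_deg * math.pi / 360.0)

def gaussian_correlation(distance_km, length_km):
    """Assumed isotropic Gaussian correlation, equal to 1 at zero separation."""
    return math.exp(-0.5 * (distance_km / length_km) ** 2)

L = length_scale_km(0.2)                     # length scale for 0.2 degrees
corr_one_scale = gaussian_correlation(L, L)  # correlation at one length scale
```

With a length scale of 0.2°, the function returns a value slightly above 22 km, consistent with the range quoted above.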
The vertical correlation is modelled using a Gaussian function in terms of the logarithm of the pressure. Thus the vertical correlation ($C^{v}_{i,j}$) between two pressure levels ($p_i$ and $p_j$) depends on the difference between $\ln p_i$ and $\ln p_j$ and on a dimensionless parameter, $k$. The estimation of $k$ is based on the propagation shape of the increment around the observation location (e.g., Massart et al., 2010). In the troposphere, it is found that $k = 100$ better characterizes the vertical correlation of the B matrix (e.g., El Amraoui et al., 2014).
In data assimilation, the covariance matrices B and R should be consistent (see e.g., Talagrand, 2003). This consistency can be checked with the χ² test (Lahoz et al., 2007; Ménard and Chang, 2000). A χ² value close to 1 is a good indication of the consistency of the assimilation algorithm (e.g., Talagrand, 2003). The forecast errors should be larger than the observation errors; the information brought by the observations, relative to what was already known, is then treated as a 'signal' and the analysis should be better (see e.g., Zupanski et al., 2007). The background and observation error variances, located along the diagonal of B and R, influence the weight of the model and observations in the cost function. In this study, following the same approach as in Sič et al. (2016), the background and the observation error variances are specified as a percentage of the first guess field and the CALIOP lidar measurements, respectively.
Different validation exercises indicate that the CALIOP observations agree with different independent data to within 10% to 25% (e.g., Liu et al., 2008; Mamouri et al., 2009; Sekiyama et al., 2010; Kacenelenbogen et al., 2011; Rogers et al., 2014; Zhou et al., 2017). Based on these studies, we fixed the observation errors in the R matrix to 15% of the measured values. Errors of observations are considered to be uncorrelated, which means that all non-diagonal members (covariances) in the R matrix are zero. Further, based on this estimation of the R matrix, we have estimated the B matrix using the χ² test in order to check the consistency of the assimilation algorithm, in the same way as in our previous studies (see e.g., El Amraoui et al., 2014; Sič et al., 2016). The background error variances, which are located on the diagonal of the B matrix, are found to be 30% of the background state.
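A minimal sketch of this error specification and of a χ² consistency check is given below. Several assumptions are made for brevity: the percentages are interpreted as standard deviations, B is taken diagonal (the correlations described above are ignored), the state is defined at the observation points so that the observation operator is the identity, and the χ² diagnostic is written as the normalised quadratic form of the departures. The numbers are hypothetical, and the exact diagnostic implemented in the assimilation system is not given in the text.

```python
import numpy as np

def chi2_per_observation(d, H, B, R):
    """d^T (H B H^T + R)^-1 d divided by the number of observations.

    Its expected value is 1 when B and R describe the true error statistics.
    """
    S = H @ B @ H.T + R                    # covariance of the departures
    return float(d @ np.linalg.solve(S, d)) / d.size

# Hypothetical background and observed extinction values (km^-1)
x_b = np.array([0.05, 0.10, 0.20])
y = np.array([0.08, 0.12, 0.15])
H = np.eye(3)                              # assumed: state defined at obs points

B = np.diag((0.30 * x_b) ** 2)             # assumed: 30% read as a std. deviation
R = np.diag((0.15 * y) ** 2)               # assumed: 15% read as a std. deviation

chi2 = chi2_per_observation(y - H @ x_b, H, B, R)
```

In practice the diagnostic is only meaningful when accumulated over many observations. A value well above 1 suggests that the specified errors are too small, and a value well below 1 that they are too large.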
Note that for a better model-observations comparison and memory optimisation, the assimilation cycle (assimilation window) is generally divided into time slots of 1 hour. During each slot, observations are read, the observation operator is run, its output field is interpolated to locations and times of the observations and compared with the observations, and the innovation vector is calculated and stored. In this study, the length of the assimilation window is the same as the time slot, consequently the cost function is minimized every hour.

Lidar assimilation
For aerosols, the modelled prognostic variable and the observations are usually not the same physical quantity. In MOCAGE, the prognostic variable is the aerosol mass concentration of each bin, and the quantities that can be assimilated within this assimilation system are the aerosol optical depth and the lidar backscatter/extinction profiles. For assimilation, it is necessary to choose the control variable x (Eq. 3) so that it is best adapted to our system and its purpose. The observation operator should be as simple as possible and easy to linearise.
Different choices of control variable for the assimilation of different aerosol parameters can be found in the literature. For more information about the different approaches to the choice of the control variable for aerosol assimilation, the reader is referred to Sič (2014) and Sič et al. (2016).
For our assimilation system, we chose to use the 3D total aerosol concentration as the control variable, as in Benedetti et al. (2009). With this choice, the assimilation system is able to assimilate either AOD measurements or lidar profiles (separately or jointly). Moreover, the problem of minimization of the cost function is better determined than in the first approach, where one observation would be used to constrain 30 unknowns (bins). It is also better in terms of memory usage and computing performance. Still, in order to linearise the observation operator, it is necessary to make an assumption on how the analysis increment $\delta x^{a}$ will influence each bin.

In MOCAGE-Valentina, we keep the relative contribution of each bin constant in terms of mass during the assimilation cycle, since bulk aerosol observations carry no information on the contribution of the different aerosol types. This approach was validated in Sič (2014) and successfully applied to AOD assimilation (Sič et al., 2016).
The information on the aerosol vertical profile can be obtained from lidar observations. Incorporating this information in MOCAGE-Valentina is an important improvement in the model. For the assimilation of lidar profiles, it is necessary to develop an observation operator which links the total concentration in the model space with observed lidar quantities in the observation space.
The observation operator transforms the control variable, the total aerosol concentration, into the observed quantity, the lidar extinction coefficient. First, the lidar profile observation operator within the MOCAGE-PALM assimilation system sums all individual species in order to calculate the total concentration. Second, it solves the lidar equation by taking into account the contributions of aerosols, gases and Rayleigh scattering. In order to connect the total aerosol mass with the lidar observed quantities, the relative mass contributions among aerosol species and sizes (bins) are considered constant in the tangent linear and adjoint operators (during an assimilation cycle).
To calculate the increment at the end of the cycle, the same relative mass contribution determined before the assimilation is used to convert the total concentration back into all the aerosol bins. The observation operator simulates measurements of an elastic backscatter lidar.
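The constant relative mass contribution assumption can be sketched as follows. The array layout (bins along the first axis, vertical levels along the second), the function names and the numerical values are hypothetical; the sketch only illustrates how a total-concentration analysis is mapped back onto the bins, not the lidar equation itself.

```python
import numpy as np

def total_and_fractions(bin_mass):
    """Total concentration per level and relative mass contribution of each bin."""
    total = bin_mass.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)    # avoid division by zero
    fractions = np.where(total > 0, bin_mass / safe_total, 0.0)
    return total, fractions

def redistribute(total_analysis, fractions):
    """Map the analysed total back onto the bins with the pre-assimilation fractions."""
    return fractions * total_analysis

# Hypothetical background: 3 bins x 2 levels (mass concentration, arbitrary units)
bin_mass_b = np.array([[1.0, 2.0],
                       [3.0, 2.0],
                       [0.0, 4.0]])
total_b, frac = total_and_fractions(bin_mass_b)

# Hypothetical analysis increment on the total concentration (the control variable)
total_a = total_b + np.array([2.0, -2.0])
bin_mass_a = redistribute(total_a, frac)
```

Because the fractions are fixed during the cycle, the analysis changes the amount of aerosol at each level but not its partition between species and sizes.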
By using the 3D total concentration as the control variable, we develop a system which is able to efficiently assimilate AOD and lidar profiles. The lidar quantities that can be assimilated within the MOCAGE system are: the attenuated backscatter signal, the aerosol extinction coefficient and the aerosol backscatter coefficient. The theoretical concepts of the observation operator as well as the tangent-linear and adjoint tests concerning the lidar assimilation are presented in detail in Sič (2014).
We extend our study to the lidar measurements derived from the CALIOP instrument on board the CALIPSO satellite and we focus on the TRAQA campaign, for which we have access to a wide range of data sets comprising AOD, in-situ measurements from the aircraft and the LOAC balloon observations. We study the case of a desert dust transport from Africa to the MB. The added value of the assimilation of CALIOP measurements will be assessed in terms of the improvement of the representation of the desert aerosol within the MOCAGE model during this event.
3 Independent observations used for the evaluation of assimilated fields

MODIS
The MODIS instruments on board the two EOS (Earth Observing System) satellites Terra (since 2000) and Aqua (since 2002) observe atmospheric aerosols and provide information about the aerosol distribution with global coverage at horizontal resolutions of 10 and 3 km. The evaluation of CALIPSO analyses in terms of MODIS AOD is done by using Collection C61 retrievals at 550 nm from both Terra and Aqua. The MODIS data concern both the deep blue and the dark target products. For more information about the improvements of the C61 collection in comparison to the C6 collection, the reader is referred to "https://modis-atmosphere.gsfc.nasa.gov/sites/default/files/ModAtmo/C061_Aerosol_Dark_Target_v2.pdf" for the deep-blue product, and to Gupta et al. (2016) for the dark target product.
The MODIS C61 version used in this comparison has a resolution of 10 km × 10 km. To fit the model resolution of 0.2° × 0.2° over the MB, on which the CALIOP assimilation has been performed, we calculate the so-called super-observations (Daley, 1993), obtained by averaging all MODIS observations within each model grid cell.
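The construction of super-observations can be sketched as a simple binning of the retrievals onto a regular grid. The function name, the grid origin and the sample values below are hypothetical, and an unweighted mean per cell is assumed.

```python
import numpy as np

def super_observations(lon, lat, aod, lon0, lat0, step, n_lon, n_lat):
    """Average all retrievals falling in each cell of a regular lon/lat grid."""
    lon, lat, aod = (np.asarray(a, dtype=float) for a in (lon, lat, aod))
    ix = np.floor((lon - lon0) / step).astype(int)   # column index of each retrieval
    iy = np.floor((lat - lat0) / step).astype(int)   # row index of each retrieval
    ok = (ix >= 0) & (ix < n_lon) & (iy >= 0) & (iy < n_lat) & np.isfinite(aod)
    total = np.zeros((n_lat, n_lon))
    count = np.zeros((n_lat, n_lon))
    np.add.at(total, (iy[ok], ix[ok]), aod[ok])      # unbuffered accumulation
    np.add.at(count, (iy[ok], ix[ok]), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

# Hypothetical retrievals on a grid of two 0.2-degree cells
lon = np.array([0.05, 0.15, 0.25])
lat = np.array([0.05, 0.10, 0.05])
aod = np.array([0.2, 0.4, 0.6])
grid = super_observations(lon, lat, aod, lon0=0.0, lat0=0.0,
                          step=0.2, n_lon=2, n_lat=1)
```

Cells without any valid retrieval are left undefined (NaN), so that they can be excluded from the statistics computed against the model fields.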

The AERONET project is a federation of ground-based remote sensing aerosol networks. It uses CIMEL sun/sky radiometers that make measurements within the 340-1020 nm range for the direct sun radiation (Holben et al., 1998). For more than 25 years, the project has provided a long-term, continuous and readily accessible public domain database of aerosol optical, microphysical and radiative properties for aerosol research and characterization, validation of satellite retrievals, and synergism with other databases. AERONET measurements are available at three levels: Level 1 (unscreened), Level 1.5 (cloud screened), and Level 2 (cloud screened and quality assured). The network imposes standardization of instruments, calibration, processing and distribution. For more information about the AERONET project, the reader is referred to "https://aeronet.gsfc.nasa.gov/". In this study, we used AERONET Level 2 (L2) data for the evaluation of the model free run and the AOD assimilated product.

In-situ measurements during the TRAQA field campaign
TRAQA is a scientific experiment within the MISTRALS (Mediterranean Integrated STudies at Regional And Local Scales) programme (http://www.mistrals-home.org). It was part of the preparation of the observation campaigns for the ChArMEx (Chemistry-Aerosol Mediterranean Experiment) project. The objectives of the TRAQA field campaign were to study transport, ageing and mixing of the pollution occurring in the MB (see, e.g., Basart et al., 2016). The aircraft flight domain was located over the North-Western MB during the summer 2012. (see, e.g. Sič et al., 2016).
In this study, we will focus on this desert dust outbreak to evaluate the added value of the CALIOP observations within the assimilation system compared to the free model run.

Aircraft measurement: PCASP
During TRAQA, the ATR-42 aircraft was equipped with the PCASP instrument. It is an aerosol spectrometer that measures the concentration and the particle size distribution of aerosols at high frequency in 30 channels distributed over the diameter range 0.1-3 µm (Strapp et al., 1992). Additional information on the instrument, the calibration methods and the measurement errors is reported by Cai et al. (2013). In this study we use data averaged over a one-minute interval with a spatial resolution of about 8 km.

LOAC (Light Optical Particle Counter; Renard et al., 2016) is an optical counter that measures the number concentration of aerosols. It uses the two-angle diffusion aerosol measurement technique (Lurton et al., 2014; Renard et al., 2016). The LOAC used during the TRAQA campaign has 20 size classes in the diameter range between 2 and 100 µm and was installed on board meteorological balloons. The number uncertainties for LOAC are of the order of 20% for concentrations above 1 cm⁻³ and 60% for concentrations below 0.01 cm⁻³. The vertical resolution of the LOAC measurements is the product of the LOAC time resolution, including averaging, and the balloon ascent speed. In the troposphere it ranges between 300 and 400 m, which is in the same range as the resolution of the MOCAGE model in the free troposphere.

Assimilation of CALIOP Lidar measurements during the TRAQA field campaign
We assimilate the extinction coefficient measurements from the CALIOP instrument into MOCAGE during the TRAQA campaign period. The objective is to assess the added value of CALIOP analyses compared to the model free run. Both fields are

Performance of the assimilation
To evaluate the impact of CALIOP lidar measurements on the modelled field, we analyse the behaviour of the assimilation diagnostics in terms of observation minus analysis (OMA) and observation minus forecast (OMF). Figure 1 shows the OMF and OMA histograms for all CALIOP lidar measurements in terms of extinction coefficient during the whole assimilation period (June 20-July 11, 2012). The OMA histogram is narrower than the OMF histogram and its mean is closer to zero, which means that the bias between the observations and the model field is reduced by the assimilation. The mean value of OMF (OMA) is 0.012 km⁻¹ (0.0095 km⁻¹) with a respective standard deviation of 0.15 km⁻¹ (0.14 km⁻¹). This indicates that the CALIOP lidar assimilated field is closer to the observations than the forecasts in terms of extinction coefficient. Note that the assimilation system is more efficient when OMF is negative. This likely corresponds to the lower observation values and lower observation uncertainties. In this case, the assimilation system gives more confidence to the observations. The results from these a posteriori diagnostics show that the CALIOP assimilation has improved the model field, since the assimilated field is globally closer to the observations than the free model field.
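The OMF and OMA diagnostics reduce to the mean and standard deviation of two departure vectors. The sketch below uses a handful of hypothetical extinction values to show the computation; it does not reproduce the statistics of Figure 1.

```python
import numpy as np

def departure_stats(obs, model_equivalent):
    """Mean and standard deviation of the observation-minus-model departures."""
    dep = np.asarray(obs, dtype=float) - np.asarray(model_equivalent, dtype=float)
    return dep.mean(), dep.std()

# Hypothetical extinction coefficients (km^-1) at four observation points
obs = np.array([0.10, 0.20, 0.30, 0.40])
forecast = np.array([0.05, 0.15, 0.35, 0.30])   # model equivalent before analysis
analysis = np.array([0.08, 0.18, 0.32, 0.36])   # model equivalent after analysis

omf_mean, omf_std = departure_stats(obs, forecast)
oma_mean, oma_std = departure_stats(obs, analysis)
```

A working assimilation is expected to give an OMA distribution that is narrower and better centred on zero than the OMF distribution, as in this toy example.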

In this Section, we present a first evaluation of the extinction coefficient assimilated product with respect to the MOCAGE free run by comparing both fields with the independent MODIS observations in terms of AOD. As for the free run and the assimilated fields, MODIS observations from Aqua and Terra corresponding to the whole day of comparison are averaged on the grid of the model. In this figure, we also show the tracks of all CALIOP orbits performed during each day of comparison. It shows that the MB region is sounded every day by 2 to 3 descending and ascending orbits. Table 2 shows the statistics of this comparison for all the days of the study: the correlation, the bias and the RMSE between MODIS and the model free run on the one hand, and between MODIS and the assimilated product on the other hand.
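The three scores can be computed as in the sketch below. The bias is taken here as model minus observation, an assumption consistent with the negative bias reported for the underestimating free run; grid cells where either field is undefined are ignored. The sample values are hypothetical.

```python
import numpy as np

def scores(model, obs):
    """Correlation, bias (model minus observation) and RMSE over valid pairs."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = np.isfinite(model) & np.isfinite(obs)      # keep co-located valid values
    diff = model[ok] - obs[ok]
    return {
        "correlation": float(np.corrcoef(model[ok], obs[ok])[0, 1]),
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
    }

# Hypothetical AOD values; the last observation is missing
obs_aod = np.array([0.2, 0.4, 0.6, np.nan])
model_aod = np.array([0.1, 0.3, 0.5, 0.7])
result = scores(model_aod, obs_aod)
```

In this toy case the model reproduces the spatial pattern perfectly but is offset by a constant, so the correlation is 1 while the bias is negative and equal in magnitude to the RMSE.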

For all the comparison days, the statistics of the assimilated product are significantly improved compared to the free run field. Many previous studies have highlighted the existence of biases between the CALIOP and MODIS observations (e.g., Kittaka et al., 2011; Redemann et al., 2012; Shikwambana and Sivakumar, 2018). The comparison between both datasets shows that MODIS AOD is generally higher than CALIOP-derived AOD (Oo and Holz, 2011). Ma et al. (2013) reported that the largest differences between CALIPSO and MODIS occur during the active dust seasons over the major dust regions. Nevertheless, in our study the agreement between the AOD assimilated outputs and the independent MODIS observations is relatively good, both qualitatively and quantitatively. This shows that the assimilation of lidar observations has reduced the bias between MOCAGE and MODIS data.

Comparison to AERONET observations
In this Section, we exploit the AOD in-situ observations from AERONET to quantify the added value of the CALIOP assimilated field in comparison to the model free run. We therefore use all available AERONET AOD L2 data collected during the period of study from different stations located within the assimilation domain. Figure 3 shows the location as well as the measurements. The assimilated field corrects this underestimation regarding the AOD amplitude, since the agreement between CALIOP analyses and AERONET data is better than that of the free run model. Table 3 presents the correlation coefficient, bias and root mean square error (RMSE) between AERONET data and the model free run on the one hand, and between AERONET and the assimilated field on the other hand, over all the stations presented in Figure 3. Generally, the AOD derived from CALIOP analyses presents better statistics than the model free run compared to the AERONET data over all the stations. The comparison between AERONET data and the model output and between the AERONET data and the assimilated product is presented in [...] Nevertheless, we note that the AODs from both the free model run and the assimilated field are overestimated for low AOD values (lower than 0.1). This is likely due to the observations from the stations located at high altitude, in agreement with previous studies reporting that AOD values at high-elevation locations tend to be smaller compared to low-elevation locations (e.g., Toledano et al., 2018; Wang et al., 2019).

Comparison with the aircraft in-situ measurements
In this Section, we evaluate in detail the performance of CALIOP lidar assimilated field by comparing the results of assimilation and the MOCAGE model with the aerosol concentrations from in-situ measurements on-board the instrumented aircraft.
We therefore use measurements of the PCASP instrument that was carried on board the ATR-42 aircraft to measure the total concentration for particle diameters above 100 nm. Figure 6 shows the results of the total aerosol number concentration corresponding to the most representative flights which highlight the desert dust outbreak event over the MB already presented in Figure 2: Flight A on June 29, 2012 from Toulouse to Corsica (Fig. 6-a), and flight B on the same day from Corsica to Toulouse (Fig. 6-b). Figure 6-1 shows the time evolution of the aerosol number concentration over the flight period. Figure 6-2 presents the aircraft altitude over the flight time from the departure to the arrival airports. In Figure 6-3, we present the map of the total AOD averaged over the flight period, overlaid with the aircraft track and the departure (D) and arrival (A) points for each flight. Figure 6-4 is the same as Figure 6-3 but for the mean value of the desert dust AOD over the flight period instead of the total AOD (obtained from all types of aerosols within the MOCAGE model).
Flight A was performed on June 29 from 05 UTC to 09 UTC from Toulouse to Corsica. This flight coincides with the beginning of the establishment of the desert dust event over southern France, with incursions into eastern and north-eastern Spain and the western part of Italy (see Fig. 6-a-3 and Fig. 6-a-4). During this flight, three peaks of aerosol number concentration were well captured by the aircraft, with fairly high values throughout the flight from Toulouse to Corsica (maximum values varying between 8 cm⁻³ and 14 cm⁻³). These peaks were measured at altitudes between 4000 m and 5000 m. The contribution of the desert dust AOD to the total AOD exceeds 60% (Fig. 6-a-3 and Fig. 6-a-4). The MOCAGE free run clearly underestimates the maximum values of these three peaks. However, the CALIOP assimilated field better represents the aerosol concentration peaks

Comparison with LOAC in-situ measurements
During the TRAQA campaign, the LOAC flew on-board three balloons, all launched from Martigues (5.05° E, 43.40° N; near Marseille, France). We focus on the two flights performed on June 29, 2012 within the desert dust plume. The total horizontal extent of the LOAC flights is quite small (∼ 15 km). This horizontal distance is smaller than the grid size of our domain of study (∼ 20 km). Therefore, we assume that the LOAC measurements represent the vertical profile of the aerosol above the launch point. The two LOAC flights were launched at two different hours of the same day, in the morning and at noon, but they flew within the same plume of desert dust (see the AOD maps in Figure 6). Figure 7 represents the vertical profile of the aerosol number concentration as deduced from the MOCAGE free run and the CALIOP assimilation, both compared to the in-situ LOAC measurements of the two flights performed on June 29, 2012. The model free run simulates the shape of the vertical profile well but, in both cases, it underestimates the aerosol number concentration by a factor of 2.5 to 5 in the altitude range of 2-5 km. The assimilation of CALIOP lidar data reduces this underestimation and shows generally good agreement with the LOAC measurements. The CALIOP lidar data assimilation product is closer to the LOAC measurements than the model free run, especially in the altitude range of 1.5-5 km. This altitude range corresponds to the altitude within which the desert dust plume is transported (see section 4.6). The CALIOP assimilation simulates both the shape of the profile and the aerosol number concentrations better than the model free run.
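The underestimation factor quoted above can be illustrated with a minimal sketch that averages the ratio of observed to modelled number concentration within an altitude range. It assumes that both profiles have already been interpolated onto the same altitude grid. The function name `underestimation_factor` and the profile values are hypothetical and are not LOAC or MOCAGE data.

```python
import numpy as np

def underestimation_factor(altitude_km, obs_conc, model_conc,
                           z_min=2.0, z_max=5.0):
    """Mean ratio observed/model concentration for z_min <= z <= z_max.

    Assumption: both profiles share the same altitude grid.
    A value above 1 means that the model underestimates the observations.
    """
    z = np.asarray(altitude_km, dtype=float)
    obs = np.asarray(obs_conc, dtype=float)
    mod = np.asarray(model_conc, dtype=float)
    # Select the layer of interest and avoid division by zero.
    layer = (z >= z_min) & (z <= z_max) & (mod > 0)
    return float(np.mean(obs[layer] / mod[layer]))

# Made-up profiles for illustration only.
altitude_km = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [10.0, 20.0, 30.0, 20.0, 10.0, 2.0]
free_run = [8.0, 8.0, 10.0, 5.0, 4.0, 2.0]
factor = underestimation_factor(altitude_km, observed, free_run)
print(f"mean underestimation factor in 2-5 km: {factor:.2f}")
```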
The comparison of the LOAC profiles to those resulting from the assimilation of AOD and of CALIOP lidar extinction coefficient observations (Figure 9 of Sič et al. (2016) for AOD assimilation, and Figure 7 of this study for CALIOP assimilation) seems to show an underestimation of the field resulting from the assimilation of the CALIOP lidar extinction coefficient. A possible explanation is that both MODIS AOD and LOAC measurements generally show an overestimation of aerosol concentrations compared to independent observations. Indeed, the study conducted by Shikwambana and Sivakumar (2018) highlights the overestimation of MODIS AOD compared to several datasets (e.g., CALIPSO, MERRA-2 and MISR). On the other hand, the validation of LOAC measurements conducted by Renard et al. (2016) shows that the retrieved concentrations of the largest particles could be overestimated by up to 50% for particles above about 2 µm. Consequently, almost the total concentration of desert aerosols is affected by this overestimation, since the majority of the desert dust bins cover diameters greater than 2 µm (see Table 1).

In this section, we evaluate the impact of assimilating the observations from the CALIOP instrument on the desert aerosol vertical distribution.
First, we evaluate the capability of both the model free run and the assimilated field to reproduce the CALIOP observations. [...] low values compared to the CALIOP observations, especially for the measurements over the MB. However, the extinction coefficient deduced from the assimilated product shows very good agreement with the CALIOP measurements over all levels of the vertical profiles. This comparison reveals that the assimilated product is closer to what CALIOP observes than the product deduced from the model.
In a second step, we evaluate the added value of the assimilation of CALIOP observations in representing the desert dust plume. Figure 9 illustrates the impact of the assimilation of CALIOP observations on the vertical distribution of desert aerosol during the desert dust outbreak over the MB on June 29, 2012. Figure 9-a presents the same measurement orbit from the CALIOP instrument as Figure 8-a (black and red colors).
Figure 9-b shows the total attenuated backscatter (km⁻¹ sr⁻¹) at the wavelength of 532 nm from the CALIOP instrument corresponding to the measurements presented in Figure 9-a. The white rectangle shows the part of the orbit indicated in red in Figure 9-a.
This part of the orbit highlights an airmass of desert dust above the MB in the altitude range between 1 and 5 km. This layer of desert dust is illustrated by relatively high values of the total attenuated backscatter from the CALIOP measurements. [...] Figure 2). The results presented in Figure 9 illustrate once again the ability of the assimilation of the CALIOP product to improve the vertical distribution of the desert aerosol.

Summary and conclusion
The aim of this paper is to present and describe the assimilation of lidar observations from the CALIOP instrument in the chemistry-transport model ( [...] This approach has the advantage of making the problem of minimizing the cost function better determined than with other commonly used approaches (see, e.g., Benedetti et al., 2009). Moreover, this approach is better suited to the assimilation of various aerosol products, such as lidar and AOD observations, either independently or in synergy.
In this study, we have evaluated the added value of the assimilation of the CALIOP extinction coefficient observations to better document a desert dust transport event compared to the model free run. The CALIOP assimilation product has been evaluated against different independent datasets: AOD from MODIS (Moderate-resolution Imaging Spectroradiometer) and AERONET ( [...] Note also that the aerosol species represented in this study do not include the secondary aerosols, which can be the main component of the fine fraction. The lack of secondary aerosols may partly explain the negative biases generally observed in this study. Space-borne aerosol lidar observations have proved useful for better understanding the aerosol properties in the atmosphere (e.g., Yu et al., 2010). In particular, the CALIOP instrument offers many opportunities to better estimate the vertical distribution of aerosols (e.g., [...]). In this study we show that the assimilation of CALIOP lidar observations within the MOCAGE CTM allows a significant improvement in the model. We therefore obtain a better three-dimensional (3D) distribution of aerosols in comparison to different independent observations. Despite the fact that satellite nadir-view active sensors such as CALIOP have limited spatial coverage compared to passive sensors, the global observations of aerosol vertical distribution from lidars have contributed to improving the quality of atmospheric aerosol observations (IPCC, 2013). In addition, the assimilation of the lidar aerosol products in the MOCAGE CTM has some advantages. Compared to the assimilation of AOD observations, the assimilation of lidar profiles is more straightforward and allows the introduction of direct information about the vertical distribution of aerosols into the model. This could give more realistic vertical aerosol distributions.
Indeed, during the lidar assimilation, the minimization is done at each level where an observation is available, independently of the other levels. Even though the correlation between adjacent levels is accounted for via the B matrix, the lidar assimilation brings modifications according to the intensity and the quality of the observations at each level.
This has the advantage of better representing the different aerosol layers within the model and therefore better describing their transport processes (e.g., desert dust, biomass burning, volcanic ash, ...). On the contrary, the assimilation of the AOD will tend to uniformly modify the vertical profile of the model. This can induce biases, especially during extreme events. The assimilation of AOD and the assimilation of lidar profiles have been validated using the same versions of the model and the assimilation system. The next step will consist of a complete comparison and discussion of the results of both the MODIS AOD and the CALIOP lidar assimilations. We will particularly focus on the advantages and the limitations of each approach during a desert dust outbreak event.
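The contrast between a level-resolved observation and a column-integrated observation can be illustrated with a minimal sketch on a single model column. It uses the generic linear analysis update δx = B Hᵀ (H B Hᵀ + R)⁻¹ (y − H x_b), which is a textbook formulation and not the MOCAGE assimilation scheme itself. The five-level column, the Gaussian vertical correlation in B, the observation operators and the error variance are all hypothetical choices made for illustration.

```python
import numpy as np

def analysis_increment(B, H, R, innovation):
    """Generic linear update: B H^T (H B H^T + R)^-1 (y - H x_b)."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return gain @ innovation

n_levels = 5
levels = np.arange(n_levels)
# Hypothetical background-error covariance: unit variance and a
# Gaussian vertical correlation with a length scale of one level.
B = np.exp(-0.5 * (levels[:, None] - levels[None, :]) ** 2.0)

# Lidar-like operator: a single observation located on level 2.
H_lidar = np.zeros((1, n_levels))
H_lidar[0, 2] = 1.0

# AOD-like operator: a single observation of the column sum
# (equal layer thickness assumed for simplicity).
H_aod = np.ones((1, n_levels))

R = np.array([[0.01]])         # hypothetical observation-error variance
innovation = np.array([1.0])   # observation minus background equivalent

dx_lidar = analysis_increment(B, H_lidar, R, innovation)
dx_aod = analysis_increment(B, H_aod, R, innovation)
print("lidar-like increment:", np.round(dx_lidar, 3))
print("AOD-like increment:  ", np.round(dx_aod, 3))
```

In this toy setting, the lidar-like increment is concentrated around the observed level and decays with the vertical correlation in B, whereas the AOD-like increment is spread over the whole column because the observation carries no information on where in the vertical the correction belongs.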

We also plan to study the added value of measurements from passive and active probes during volcanic eruption events. This is a very important theme for Météo-France since it hosts one of the VAACs (Volcanic Ash Advisory Centers), whose responsibility extends over a large part of Europe, Asia and Africa.
As a perspective of this work, we will consider simultaneously assimilating the observations from passive and active sensors by carrying out an initial de-biasing of both observation datasets. A much more ambitious solution will consist in assimilating satellite radiances directly in a global model using an integrated approach. Assimilation of satellite radiances in numerical weather prediction assimilation systems has proved to be an essential component for improving forecast skill, particularly for global models (e.g., Derber and Wu, 1998; McNally et al., 2000). This technique may be able to surpass some retrieval algorithms, and should provide improved results compared to the data assimilation of retrieval products (e.g., Dong et al., 2007).

[...] June 20 to July 11, 2012 for specific AERONET stations. The name as well as the coordinates (longitude and latitude) of the specified station are marked at the top of each panel. Correlation, bias and root mean square error for both the direct model and the assimilation run as compared to the AERONET data are given in Table 3.