Optimization of Aeolus Optical Properties Products by Maximum-Likelihood Estimation

The European Space Agency (ESA) Earth Explorer Mission, Aeolus, was launched in August 2018 and embarks the first Doppler Wind Lidar in space. Its primary payload, the Aeolus LAser Doppler INstrument (Aladin) is a Ultra Violet (UV) High Spectral Resolution Lidar (HSRL) measuring atmospheric backscatter from air molecules and particles in two separate channels. The primary mission product is globally distributed line-of-sight wind profile observations in the troposphere and lower stratosphere. Atmospheric optical properties are provided as a spin-off product. Being and HSRL, Aeolus is able to 5 independently measure the particle extinction coefficients, co-polarized particle backscatter coefficients and the co-polarized lidar ratio. This way, the retrieval is independent of a-priori information. The optical properties are retrieved using the Standard Correct Algorithm (SCA), which is an algebraic inversion scheme to a (partly) ill-posed problem and therefore sensitive to measurement noise. In this work, we rephrase the SCA into a physically constrained Maximum Likelihood Estimation (MLE) problem and demonstrate predominantly positive impact and considerable noise suppression capabilities. These improvements 10 originate from the use of all available information within the SCA in conjunction with the expected physical bounds concerning the expected range of the lidar ratio. The new MLE algorithm is equally evaluated against the SCA on end-to-end simulations of two homogeneous scenes and for real Aelous data collocated with measurements by a ground-based lidar and the CALIPSO satellite to consolidate and to illustrate the improvements. The largest improvements were seen in the retrieval of the extinction coefficients and lidar ratio ranging up to one order of magnitude or more in some cases due to an effective noise dampening. 15


Introduction
Aeolus is an ESA (European Space Agency) Earth Explorer Core Mission launched on 22 August 2018 (Stoffelen et al., 2005;ESA, 2008). Aeolus' payload consists of the Atmospheric LAser Doppler INstrument (ALADIN), which is a UV high spectral resolution (HSRL) Doppler Wind Lidar operating at 355 nm wavelength (Chanin et al., 1989;Garnier and Chanin, 1992;Korb et al., 1992;Souprayen et al., 1999b,a;Gentry et al., 2000) and the first Doppler Wind Lidar in space. The primary mission 20 goal is to provide accurate global measurements of vertical wind profiles in the troposphere and lower stratosphere with global coverage each week for use in operational Numerical Weather Prediction (NWP) and scientific research. Additionally, Aeolus can contribute to the global monitoring of cloud and aerosol optical properties due to the applied aerosol HSRL method earth with off-nadir slant angle of 35°measured from the spacecraft, which accounts for approximately (37 ± 0.2)°off-nadir angle at the Earth surface due to earth's curvature. The diameter of the laser footprint is about 12-15m and the instrument field of view 19µrad. The laser light is backscattered by means of particles (aerosol and hydrometeors resulting in spectrally narrow Mie scattering) and air molecules (resulting in thermal and pressure broadened Rayleigh Brillouin scattering). The 95 frequency of the atmospheric backscattered laser light is Doppler-shifted relative to the emit frequency owing to the relative velocity of the scattering media (wind speed) along the instrument line-of-sight (LOS). The contributions of Earth's rotation and satellite movement are compensated by Aeolus' attitude control and on-ground data processing, such that the Doppler shift is entirely dominated by the LOS wind speeds. In order to measure the LOS wind speeds, Aeolus utilizes two different receiver spectrometers, namely a fringe imaging Fizeau interferometer for the spectrally narrow Mie back-scattered laser light, and two 100 sequential, double edge Fabry-Pérot interferometers for the spectrally broad Rayleigh back-scatter. Hence, continuous LOS wind speed profiles up to 30 km altitude can be measured regardless of the presence and absence of aerosol, unless optically thick features such as dense liquid water clouds block the beam. Additionally, the measured signal intensities allow a retrieval of particle optical properties.
The core of ALADIN is its diode-pumped frequency tripled Nd:YAG laser with 80mJ nominal pulse emit energy and 50Hz 105 pulse repetition frequency. The emit beam is circularly polarized, but the cross-polarized part of the backscattered light is discarded in the receive path optics due to the instrument design. Hence, in the case of strongly depolarizing targets, the signal measured at the detectors is strongly reduced with respect to non-depolarizing targets (ESA, 2008;Flamant et al., 2017).
Additionally, the expected atmospheric return signal in orbit is a factor of 2.5 to 3 lower than expected before launch, due to lower laser output energies than originally intended (45 -72 mJ) and decreased instrument transmission by about 30%, which 110 has caused a lower SNR since mission start (Reitebuch et al., 2020).
The light backscattered from the atmosphere is an attenuation of Mie and Rayleigh backscatter, which is then separated by the two instrument receivers. Figure 2 in Ansmann et al. (2007) illustrates how the spectral characteristics of the Mie and Rayleigh backscatter are exploited to seperate them with the spectrometers (channels) (Ansmann et al., 2007;Reitebuch et al., 2018a). If the LOS wind speed is non-zero, the whole spectrum will be shifted relative to the channel transmission curves due 115 to the Doppler shift. The light first travels to the Mie receiver, where the frequency narrow Mie peak (fringe) is measured to determine this Doppler shift directly. The frequency broadened Rayleigh scattered light gets reflected on the Mie channel and is then directed to the two Rayleigh channel filters. These capture different signal intensities in the two filters centred on each side of the centre emit frequency. By comparing and normalizing the responses in the two filters, the LOS wind Doppler shifts can be calculated. However, the return signals from particles (Mie) and molecules (Rayleigh) are not entirely seperated due 120 to the overlap of the transmission functions of the two channels and the sequential design of the receivers. Therefore, channel crosstalk is present, which needs to be corrected for to obtain un-biased LOS wind and optical properties.
The atmospheric echoes from the single pulses passing through the instrument receivers are collected with time gated accumulation charge coupled devices (ACCDs). The time it takes for the light to travel from the instrument, through the atmosphere and back to the receivers, is used to accumulate the light from the individual pulses over time equivalent to atmospheric ver-125 tical bins of 250 m. The same ACCDs, with a quantum efficiency of 85%, are used for both the particle (Mie) and molecular 4 https://doi.org/10.5194/amt-2021-212 Preprint. Discussion started: 6 August 2021 c Author(s) 2021. CC BY 4.0 License.
(Rayleigh) channels. In order to achieve sufficient charge build-up before read-out and digitalization, 19 laser pulses are accumulated directly on the storage columns of the ACCD. The accumulation of 19 laser pulses corresponds to an on-ground distance of about 2.9km along track and is the smallest horizontal measurement which is down-linked to Earth. The vertical resolution of the detectors can be independently varied between 250m and 2000m in steps of 250m with a total number of 130 ACCD rows and hence vertical range bins of 25. Of these range bins, one is used for solar background measurements, and one is used to sample the ground. In order to achieve the mission requirements for wind random errors, a number of measurements is added up onto coarser scale during the on-ground processing. For Rayleigh winds, a total of 30 measurements are integrated to one so-called Basic Repeat Cycle (BRC) or Observation, equivalent to approximately 87 km along track distance on ground.
Mie winds can be provided at smaller scales dependend on SNR of the aerosol feature. High SNR is also critical for optical 135 properties retrieval, particularly particle extinction coefficients. Hence, optical properties are primarily evaluated on observation scale as well and only refined afterwards in the so called group product. An example of raw signals on measurement scale (2.9 km) and accumulated onto observation scale (87 km) is given in Figure 1, which shows the test case discussed in Section 4.3.
The geolocated measurement data from the satellite, including detected Doppler shifts and the so called useful signals from 140 the two Rayleigh and the Mie channel, are provided in the level 1B (L1B) data product (Reitebuch et al., 2018a). These are the signals that have been corrected for the detection chain offset (DCO, measured for all range bins in seperate, nonilluminated pixels), dark current charge offset in memory zone (DC or DCMZ, from on-ground characterization), tripod obscuration (TOBS, in Mie channel only) and the solar background contribution (measured in range bin 25 for all channels) (Reitebuch et al., 2018a). The L1B product in combination with additional calibration data from the so-called CAL-Suite 145 processing step and meteorological information from a global weather forecast from the European Centre of Medium-Range Weather Forecasts (ECMWF), is used as input to the optical properties processor to generate the Level 2A (L2A) data product (Flamant et al., 2017). The calibration data is obtained from designated instrument modes and internal reference measurements (Reitebuch et al., 2018a). Within the L2A product, three different algorithms are implemented: The first is the so-called Standard Correct Algorithm (SCA), which makes use of the full HSRL potential and uses both channels to retrieve lidar ratios 150 directly from the data. The second is the Mie channel algorithm (MCA), which applies a Klett-like retrieval (Klett, 1981;Fernald, 1984) with an a-priori lidar ratio value (extinction-to-backscatter ratio). The third, the Iterative Correct Algorithm (ICA), intends to refine the SCA results on finer vertical scales, but fails to generate reasonable results in the nominal operation due to pronounced sensitivity to noise. Hence, given its more accurate approach and better performance, only the SCA algorithm is considered in the remainder of this paper.

Methods
The atmospheric forward model in lidar applications is based on the lidar equation (Weitkamp, 2006). In its simplest form, the lidar equation reads s(r) = KG(r)β(r)T (r) 2 .
(1) The signal power s received from distance r is made up of four factors. The constant K summarizes the signal transmission 160 through the lidar instrument, and G(r) = O(r)r −2 contains all range dependent terms regarding the measurement geometry.
The two unknown terms that contain information on the atmospheric state are the total backscatter coefficient β(r) at distance r and the atmospheric transmission 0 < T (r) < 1 that describes how much light gets lost on the way from the lidar to the target at a distance r. In the following we consider the case of the single scattering approximation, i.e., multiple scattering effects are neglected. As discussed in Flamant et al. (2008) and Ansmann et al. (2007), this is a valid assumption due to the small 165 divergence and narrow FOV of the ALADIN instrument. In case of ALADIN, the lidar equations for the two channels read with the Mie and Rayleigh channel signals s, laser pulse energy E 0 , number of accumulated pulses N p , atmospheric temperature t, atmospheric pressure p, Doppler shift f , instrumental calibration constants K ray and K mie and crosstalk coefficients 170 C 1...4 accounting for the fractions of molecular and particulate signal in the Rayleigh and Mie channel, respectively. The total backscatter coefficient has been split into molecular contribution and particle contribution β = β m + β p . Here, β p is explicitly split into the cross-polarized and co-polarized fraction β p = β ⊥,p + β ||,p , of which only the co-polarized particle backscatter coefficient is measured due to instrument design. T 2 label denotes the two way transmission with range dependent extinction coefficient α and label m for molecules and p for particles, respectively. The two unknown parameters of interest are co-polarized particle backscatter coefficient β ||,p and particle extinction coefficient α p , which can in principle be solved with the two equations because C 1 > C 2 > 0 and C 3 > C 4 > 0 holds true by instrument design. In the following, we are also making excessive use of the co-polarized lidar ratio (extinction-to-backscatter ratio) γ ||,p,i = α p,i /β ||,p,i .
Since the co-polarized particulate backscatter β ||,p is lower than the total particulate backscatter β p , the co-polarized lidar ratio 180 overestimates the true lidar ratio (γ ||,p > γ p = α p /β p ). The lidar equations can be lightened by introduction of the range resolved atmospheric signals at telescope entry with X(r) for molecular backscatter and Y (r) for particulate backscatter in units m −3 sr −1 . These resemble normalized signals 185 that would be obtained in the absence of channel crosstalk. So the lidar equations read when variable dependences are dropped for the sake of readability.

190
For a more detailed description of the SCA and a discussion of its short comings, we refer to Appendix A and Flamant et al. (2017). In the following, the index i ≤ n = 24 as subscript to the properties above denotes the range bin index. This implies for the signals s, X and Y that the property has been integrated over a discrete range [R i−1 , R i ], i.e., s ray,i =

Ri
Ri−1 s ray (r)dr. For all other variables like backscatter coefficients β, extinction coeffcients α and range R this subscript denotes the average in range bin i, i.e., β ||,p,i = 1 ∆Ri Ri Ri−1 β ||,p (r)dr with ∆R i = R i − R i−1 and equivalently for subscript m. As a consequence, 195 particle optical depth of a bin is denoted L p,i = α p,i ∆R i . The following approximations for the range corrected signals are made by using the mean bin properties from above (Flamant et al., 2017): with unknown optical depth L p,sat in between telescope and first range bin. With this and equations (7) and (8), SCA solves algebraically for the two unknowns, co-polarized lidar ratios γ ||,p,i and optical depths L p,i (and L p,sat ), which can be rephrased into / are equivalent to extinction coefficients α p,i and backscatter coefficients β ||,p,i (and L p,sat ). Backscatter coefficients are midbin backscatter and extinction coefficient profile products and lidar ratios. The SCA midbin product is averaging the SCA neighbouring bins onto a coarser resolution in order to dampen oscillations in the retrieved profiles in scenes with low signals in order to obtain a more stable product. Further details of the products and their performances is provided in section 4 and annex A.

210
The basis of MLE and OEM methods is formed by the forward model y = F (x), that maps physical properties from state space on measurement space (Rodgers, 2000). We use equations (7) to (10) in order to calculate the measurement space variables (signals) from the state space variables (optical properties) with The deviations between the actual measurement and the forward modelled state are comprised in a cost function that needs 215 to be minimized to obtain a good estimate of the underlying true state. Generally, the cost function in non-linear regression problems is composed of two terms of which the first describes deviation from the measurement and the second the deviation from some a-priori state (applied in OEM) or from another a-priori constraint (applied in PMLE), e.g., smoothness. It is also referred to as the penalty term. In 220 MLE, J is the log-likelihood function, which in the case of normally distributed measurement errors becomes the weighted least squares term with measurement error covariance matrix S y . Often the choice of Poisson noise in conjunction with the Kullback-Leibler Divergence is preferred in Lidar applications (Denevi, 2015;Marais et al., 2016;Garbarino et al., 2016;Weitkamp, 2006), 225 because photon shot-noise is the dominant noise source for the signals on the detectors, which is fairly Poisson distributed.
Here, we do not restrict noise amplitudes to the Poisson case to account for additional noise contributions, such as laser pulse frequency jitter, ACCD readout noise, dark electron/thermal noise contribution and potentially unknown 'noise' sources such as atmospheric variability. The description of J obs in terms of normal distributions is not a critical aspect of the method, because the Poisson noise distribution becomes indistinguishable from a Gaussian given sufficient signal accumulation (central limit theorem). For the time being, no a-priori term/constraint contributes to the cost function used throughout this work, although limits will be imposed on the state space variables by solving specifically the box-constrained MLE problem with box-constraints on lidar ratio and optical depth. The state that solves this minimization problem (14) is denoted x * .
In practice, the minimization problem (14) over all N lidar profiles with index k. This is equivalent to the ensemble of the seperate minimization problems because the k-th cost function is strictly positive and only sensitive to changes of the k-th state vector. So the collection of state vectors that 240 minimizes the sum of the cost functions has to minimize each summand independently. This way, we gain the freedom to use the inherent 2D information of the lidar signal in future developments, e.g., to couple neighboring profiles by the introduction of a regularization term to (15) that acts on the horizontal direction along the satellite orbit (Marais et al., 2016;Xiao et al., 2020).

245
Principally, extinction and backscatter coefficients (α p , β ||,p ) can be chosen as state space description, as well as lidar ratio and backscatter (γ ||,p , β ||,p ) . Though, the reason to favor a state space description containing the lidar ratio is that such states can be easily constrained to physical bounds by forcing upper and lower ranges within the retrieval. Within the (α p , β ||,p )description, the lidar ratio constraint would become non-linear and harder to handle by off-the-shelf tools for numerical optimization. An equally valid choice of the measurement space variables y are the crosstalk-corrected signals X i and Y i instead of 250 the Rayleigh and Mie channel signals s ray,i and s mie,i , but then the measurement covariance matrix S y would show off diagonal entries due to the linear transform in (A4-A5). The above choice has been made for the sake of simplicity and to lighten the cost function and its gradients, since the signals in seperated vertical range bins are expected to be uncorrelated except for a small range bin overlap (Weiler, 2015). This accounts for ±120m altitude for Mie ACCD and ±30m for Rayleigh ACCD in nominal operation and regardless of range bin settings. As in SCA, this overlap will not be considered in the following.

255
As pointed out by Povey et al. (2014), unbiased uncertainty estimates are a prerequisite to obtain good results. This is an issue for low signal intensities when estimating the Poisson uncertainty from the uncertain signal itself, i.e.,σ s = √ s + ε s with proxy for standard deviationσ s , true signal s and actual Poisson noise value ε s : The uncertainty will be biased by the exact noise value in the signal. In Aeolus L1B data products, the Poisson noise assumption is applied to calculate signal-to-noise 260 ratio (SNR) in both signal channels, including the solar background contribution. But here, we use instead the scaled variance of the signals on measurement scale, at 2.9 km, to approximate the noise level in the bins on observation scale, at 87 km, under the assumption of a homogeneous scene, see Appendix B.
The choice of the upper and lower lidar ratio bounds takes the following points into consideration: On one hand, the true 265 lidar ratio at 355nm is expected to exceed values of 100 sr only in rare cases and a physically lower bound might be presented by approximately 10 sr, see Fig. 8 in Illingworth et al. (2015) or (Wandinger et al., 2015). On the other hand, the coarse vertical resolution and the effects of depolarization need to be accounted for. Therefore, highly depolarizing aerosol, such as desert dust and ice particles in cirrus clouds, will appear highly attenuating, since the co-polarized lidar ratio can be artificially increased by a factor up to 1.85 for desert dust and 3 for cirrus clouds compared to the true lidar ratio. Taking into account typical lidar 270 ratio to depolarization distributions as in Wandinger et al. (2015); Illingworth et al. (2015), values as high as 130 sr will be well within physical limits for Aeolus. Additionally, as shown in Flamant et al. (2017), different hypothesis on the distribution of aerosol layers within a range bin can easily alter the true particle optical depth retrieval results by a factor of 16. As the same systematic errors apply to co-polarized particle backscatter coefficients, less variability is expected for the co-polarized particle lidar ratio estimates. However, the applied bounds need to account for the forward model errors by an extra margin. All things 275 considered, the limits of 2 sr to 200 sr resemble a reasonable trade-off.
The minimization problem (15) is solved approximately by the L-BFGS-B algorithm, a limited-memory quasi-Newton code for bound-constrained optimization. More precisely, the implementation described in Zhu et al. (1997), version 3.0, is used.
As an addition, the cost function gradient is evaluated efficiently via automatic differentiation. The initial conditions, or the 280 first guess, consist of aerosol free atmosphere with L p =0 and a lidar ratio of γ ||,p = 60 sr. Since the L-BFGS-B algorithm is not invariant under variable transforms, it was necessary to introduce an additional scale parameter in the state vector for good convergence rates of the cost function. The applied transformation maps L p → 200L p in the state vector to ensure that all entries are about the same order of magnitude. More advanced variable transforms such as pre-whitening of the variables (Rodgers, 2000) may be appropriate to optimise performance, but the proposed rescaling is found to be sufficient.

285
MLE estimates usually suffer from overfitting and noise amplification. Therefore, an implicit regularization is often achieved by optimal choice of the number of iterations (Denevi, 2015;Garbarino et al., 2016). But we supress noise amplification by the lidar ratio bounds. So, the L-BFGS-B iteration is stopped after a predefined number of iterations of 40.000, after which the average cost function value per bin is required to be smaller than 1. Usually, this holds true after much less iterations. But in the spirit of the SCA algorithm, the estimate should fit as close as possible to the signal data and only solve the physical contra-290 dictions. Hence, a fair comparison to the standard algorithms is achieved only without an implicit regularization. Furthermore, if the MLE can demonstrate its potential even in unfavorable conditions, this will be a strong argument for its usefulness.
In the following, two methods are used for error quantification of the estimated state vector x * .
A Monte-Carlo approach is applied to classify the uncertainties in simulation results (Xiao et al., 2020). For this, a sufficient 295 number of realizations of the measurement vector y obs is generated from the simulation scene. The variation of the SCA and the best possible results for simulation cases, but are not feasible on observation data.
The errors on observation data are estimated from a sensitivity analysis around the solution, similar to standard error propagation in spirit. Therefore, the forward model F (x) is linearized about the solution point to obtain the matrix equation with Jacobian K. This relation is reversed by the Moore-Pennrose pseudoinverse to obtain a generalized inverse K −1 and hence the sensitivity of the state estimate under changes in the measurement, see Appendix C. With this, the retrieval error covariance matrix S x * is calculated and rescaled in accordance with the lidar ratio constraint, see Appendix C.

305
The algorithms are validated in simulations, which are processed with the end-to-end simulator described in Reitebuch et al. (2018b), which allows realistic simulations of ALADIN data downlinked from Aeolus. Its output data are provided in the same format and temporal and spatial resolution as nominally downlinked from the satellite in order to test the whole processing chain up to the optical properties delivered in the L2A product. The simulation covers the charge transfer and detection on the accumulation charge coupled device (ACCD) including offsets, non-linearity and noise sources such as dark current noise, read-310 out noise, Poisson detection noise (shot noise) and the analog-to-digital conversion with 16 bit. Although perfect agreement with the instrument on board cannot be achieved, the simulated noise level approximately resembles nominal operations.

Results and Discussion
In this section, the two versions of the SCA and the MLE algorithms are tested on synthetic and real Aeolus observation test cases, and their performances are discussed. The simulated data are produced with the Aeolus end-to-end simulator described 315 in (Reitebuch et al., 2018b), which allows realistic simulations of ALADIN measurements from defined atmospheric scenes as input to the L1B algorithm. Its output data are provided in the same format and temporal and spatial resolution as nominally downlinked from the satellite in order to test the whole processing chain up to the optical properties delivered in the L2A product. The simulation covers the charge transfer and detection on the accumulation charge coupled device (ACCD) including offsets, non-linearity and noise sources, such as dark current noise, read-out noise, Poisson detection noise (shot noise) and In the following sections, the results from each test case are presented.

Atmospheric Simulation Case I
The simulated Aeolus observations from this test case is obtained from the End-to-End simulator and contains horizontally 330 homogeneous aerosol with a constant lidar ratio of 25 sr and the calibration data is known from seperate simulations of Aeolus' calibration modes. A standard atmosphere temperature and pressure distribution is used for the simulation of the molecular backscatter with altitude and the wind speed is zero for simplicity. The simulated atmospheric scene contains aerosol up to 40 km altitude, paticularly above Aeolus' range of about 0-20 km. This aerosol attenuates the useful signals by additional 0.9%, but no impact is expected on the optical properties since all algorithms allow for a constant attenuation factor.

335
The retrieval results for the simulation is shown for the SCA, SCA midbin and MLE algorithms in Fig. 2. Please note the logarithmic scales of the colorbar. The medium aerosol load in the input atmosphere with backscatter coefficients on the order of 1 to 10 Mm −1 sr −1 below 2 km is captured well by all algorithms (rows 1 and 2 of Fig. 2). However, clear background noise patterns are visible in the thin aerosol regime with particle backscatter coefficients of order 0.1 Mm −1 sr −1 above 2.5 km for all retrievals, as expected due to low Mie channel SNR (rows 1 and 2 of Fig. 2). Hence, there are many noise induced negative 340 values in SCA and SCA midbin retrievals. Additionally, SCA and SCA midbin backscatter profiles are shown to be biased in this regime. The MLE mean profile resembles the true backscatter profile, mitigates negative values and consistently shows smaller standard deviation as SCA and SCA midbin.
The coupled retrieval in the MLE unfolds its whole potential in the extinction retrieval (rows 3 and 4): The curtain plots for SCA and SCA midbin do not only suffer from intense background noise, but also from negative values within the dense 345 aerosol close to the ground, whereas the MLE achieves a much more robust retrieval when compared to the simulation input.
The mean profiles unveil high biases in the SCA case: Especially changes in vertical range bin thickness impose a challenge and are followed by extinction overestimation, due to the feedback between zero-flooring of extinction coefficients within the processing and decreased SNR in thin bins. A detailed explanation of the noise influence on the SCA extinction retrieval can be found in Flament et al. (2021). SCA midbin is less biased, despite a spurious oscillation at about 10 km altitude, but to 350 the price of high noise and lowered resolution. The MLE retrieves only slightly biased extinction coefficients over the whole profile with standard errors up to a magnitude smaller than SCA midbin product. Only the extinction in the bin closest to the ground appears overestimated, likely due to the diminishing influence of lowermost optical depth on the cost function: Here, the generalized averaging kernel K −1 K deviates from the identity matrix and suggests that this value cannot be well retrieved (see Appendix C).

355
The Lidar ratios in row 6 (lowest panel of Fig. 2) are calculated from mean(α p )/mean(β ||,p ) and median(α p )/median(β ||,p ) of rows 2 and 4 and their respective standard deviation. Otherwise, the first guess of γ ||,p = 60 sr would contaminate the statistics for MLE, e.g., mean(α p /β ||,p ) would be biased towards the first guess. The lidar ratio results (rows 5 and 6) indicate that the MLE is most robust, as it is the only algorithm for which the lidar ratio statistics converge to the true value of 25 sr everywhere. close to 60 sr in scenes with very low signal where the algorithm is not converging. The SCA and SCA midbin produces wither very high (yellow to red values) or zero or negative (black) values in these cases.

Atmospheric Simulation Case II
In simulation case II, the aerosol profile is the same as for case I, but in addition a cloud is placed between 8.5 and 10.5 km with an optical depth of 0.4, i.e., the return signals from below the cloud are more than halved by a slant two way transmission 365 of T 2 cloud ≈ 0.38, see Fig. 3. The discontinuity of the cloud (abrupt changes in optical properties) introduces some artifacts into the retrieval results as can be seen in Fig. 3: Most prominently, the vertical extent of the cloud is overestimated by SCA midbin due to the averaging of neighbouring bins onto a coarser resolution (rows 1 to 4). Hence, the cloud thickness appears to be 4 km compared to 3 km in SCA and MLE and 2 km in simulation input parameters. Furthermore, the SCA extinction coefficient reacts delayed compared to the backscatter coefficient, leading to that the attenuation by the first 500m of the cloud is not 370 captured correctly with consequences for lidar ratio estimation (row 3 and 4 in Fig. 3). The SCA extinction coefficient values below the clouds are biased high due to the abrupt change in signal error. Hence, the same feedback loop as described in the previous simulation case I for changes of range bin thickness is triggered.
The curtain plots of the SCA and SCA midbin optical properties (rows 1,3 and 5) reveal more noise induced negative values (black) below the cloud and MLE provides the most reliable backscatter estimates below 2 km altitude (row 1 and 2). The 375 statistics below the cloud are heavily impacted by the signal loss: Extinction coefficient estimates below 2 km are found to be biased high in SCA and SCA midbin results and so MLE returns the best fit to the simulation input parameters (row 4). The lidar ratio results (rows 5 and 6) underline that, although the dense aerosol below 2 km is in parts captured by all algorithms, MLE returns the most accurate estimates with highest precision. Hence, the potential of MLE to cope with highly noisy data compared to the standard approaches is well demonstrated. Again it should be noted that the MLE lidar ratio remains at 60 sr 380 in scenes with very low signal where the algorithm is not converging. The SCA and SCA midbin produces wither very high (yellow to red values) or zero or negative (black) values in these cases.
In summary, the coupled MLE locks extinction and backscatter to appear colocated and so consistently outperforms the standard approaches in terms of both, accuracy and precision. These are graphically summarized for the interested reader in the Appendix D in terms of bias and relative error, respectively.

Real data case I: Classifying a Saharan Air Layer with Aeolus
In this section, the algorithms are tested and compared using real Aeolus observations from June 30 2020, during the Aeolus satellite overpass close to Cape Verde between 7:30 and 7:40 UTC. The satellite passed above an extended Saharan Air Layer (SAL, Prospero and Carlson (1980)) containing significant amounts of desert dust. The spatial extent of this dust layer is visible in observations of the UV Aerosol Index reported by the Copernicus Sentinel-5p TROPOMI instrument (ESA, 2018) between    Calipso FM results show a partly lofted aerosol plume that is classified as desert dust in the latitude band 10°N to 20°N, which also compares well with the TROPOMI observations in Fig. 4a. The area towards the equator is partly attenuated by a high ice 400 cloud between 10 km to 16 km altitude, which was possibly on-top of a convective cloud tower. The dust plume likely extended below as more dust is reported by the CALIPSO FM close to the equator. In the planetary boundary layer (PBL) low, broken clouds are visible in the FM. Additionally, patches of clouds are visible, e.g., at about 5 km to 6 km height and 5°N. The optical properties processing results of the Aeolus signals show similar features, except for the high cloud, which was not present during the Aeolus morning overpass. It should also be noted that convective activity is low in the morning and at its maximum in 405 the early afternoon. Pronounced background noise patterns are present in SCA and SCA midbin backscatter coefficients (noise magnitude about 1 Mm −1 sr −1 ) and extinction coefficients (noise magnitude about 30Mm −1 ), very similarly to the simulation cases I and II. Therefore, SCA and SCA midbin show the dust plume only with little contrast and it might be horizontally and vertically disrupted by noise. What is likely clouds is visible by high return signal (yellow spots in backscatter) and low lidar ratios (dark blue color). In these bins with high signal-to-noise ratio, SCA and MLE give very similar results, as expected. SCA 410 midbin suffers from the lower vertical resolution and significantly enhances the cloud and aerosol plume layer thickness, both in the PBL and at about 5 km height, but reconstructs fairly homogeneous extinction coefficients where the plume is located.
Much higher contrast to the background is achieved with MLE, so the plume top and plume bottom heights can be inferred along the latitude. The plume detected by Aeolus algorithms agrees well with the CALIPSO FM results despite of Aeolus' coarse resolution and possible representativeness errors due to the difference in time and space between the observations by 415 the two satellites. Only MLE captures the partly lofted nature of the plume north of the co-location point (white line). Furthermore, MLE achieves more homogeneous results along-track for all properties, which demonstrates robustness and consistency, because no such smoothness constraint has yet contributed to the reconstruction.
Care has been taken in determination of the co-polar lidar ratio of the dust in Fig. 5. All bins that lie within the white rectangle in Fig. 4b have been considered for the estimates, due to the fairly homogeneous appearance in MLE and CALIPSO 420 FM. It is not reasonable to directly average the lidar ratio over all bins, as our convention would bias the result: Often, the backscatter-to-extinction ratio (BER) is reported instead, but different conventions produce different results, because α p /β ||,p = β ||,p /α p −1 with mean . . . . Hence, we first average backscatter and extinction coefficients over the box and report an unambiguous value for lidar ratio, namely  Fig. 4, this demonstrates that the presented MLE approach allows for more robust aerosol classification based on the lidar ratio estimates.

Real data case II: Ground-based Validation
On 9th November 2019, Aeolus passed close to a ground based, remote-controlled multiwavelength-polarization-Raman lidar 435 (Polly) in Tel Aviv at 3:50 UTC which is part of PollyNET Engelmann et al., 2016). This network consists of several such automated Polly lidars for automated and continuous 24/7 observations of clouds and aerosols around the world.
As such, they can also provide vital input for calibration and validation activities of Aeolus' optical properties observations.
The presented ground based lidar data in Fig. 6 has been accumulated over the time 2:41 to 3:40 UTC. The distance to the centers of the two closest Aeolus observations (which are 87 km averages along the satellite orbit) are 68 km and 84 km. A 440 special range bin setting is operational in this area: The altitude range between 2 km and 4 km is divided into 8 range bins of 250 m width to detect lofted dust at highest possible instrument resolution. Thus, the SNR in this layer is smaller than usual, which provides a challenging test case for validation of MLE against the standard approaches. The atmospheric scene was characterized by a temporally stable aerosol layering. For the entire night over Tel Aviv, a PBL with high backscatter values (3-6 Mm −1 ) up to 1 km was observed while above moderate backscattering ( about 2 Mm −1 ) up to 3.5 km was present.

445
According to the PollyNET target categorization (Baars et al., 2017), both layers were identified as a mix of pollution and dust with particle linear depolarization ratio values of about 10% and lidar ratio values between 40 and 50 sr (at 355 nm).
The optical properties from SCA, SCA midbin and MLE processing are overlaid in Fig. 6. The results of backscatter coefficients appear disrupted by noise for SCA and SCA midbin: Hence, negative or missing values are present in all profiles within the aerosol layer below 3.5 km but for MLE processing. The MLE is also able to retrieve the vertically coherent Aerosol layer in 450 good agreement with the ground truth (Raman method at 355nm, Ansmann et al. (1992)). Additionally, MLE results for both independent Aeolus observations agree well with each other and show much lower variability than the SCA and SCA midbin results. All aspects considered, the good agreement with the ground case and the consistent retrieval in both neighboring columns are strong indicators that the MLE successfully supressed noise.
The extinction retrieval is challenged by the lowered SNR in the fine vertical range bin setting, but MLE achieves the most 455 reasonable results closest to the ground truth and supresses outliers above 3.5 km. After all, the coherent aerosol layer cannot be well located by using extinction coefficients alone due to too fine range bin settings between 2 km and 4 km that cause high noise amplitudes. Still, the averaged lidar ratio is calculated from the altitude range 1.5 km to 3.5 km and both observations, as suggested in equation (17). The results read SCA : (75 ± 98) sr, SCA midbin : (73 ± 31) sr and MLE : (51 ± 11) sr, so MLE retrieves the most accurate and precise result compared to the ground truth value between 40 and 50 sr. It should be stressed 460 that these errors represent lower error bounds calculated from the Poisson assumption on measurement noise.

Conclusions
The optical properties retrieval within the Aeolus Level 2A aerosol optical properties data product has been rephrased into a MLE problem and was successfully implemented and tested. The evaluation of the new MLE retrieval revealed predominantly positive impact: It is demonstrated that the precision of the MLE outperforms SCA and SCA midbin algorithms in synthetic 465 homogeneous scenes and, additionally, reduces biases in the mean statistics. Furthermore, these trends have been confirmed on two cases of real Aeolus data. All cases consistently indicate that MLE is particularly advantageous for estimation of extinction coefficients and co-polarized lidar ratios. However, also the precision of co-polarized backscatter coefficients increased overall as well by application of MLE. This is mainly due to the introduction of lidar ratio constraints that force particle extinction to appear only where there is particle backscatter along the atmospheric column and vice versa. With this, anti-correlated noise in appear inhomogeneous (in the signals) when parts of the profiles have been attenuated above. Therefore, in order to estimate aerosol optical properties on finer scales, it is more elegant to apply a suited regularization within the retrieval problem itself, i.e., to add an additonal regularization term in the cost function (see equation 15). Shcherbakov (2007) applied a Thikonov smoothness constraint in the vertical direction, but stressed the need for a regularization that favours piecewise constant func-485 tions. Marais et al. (2016) and Xiao et al. (2020) have therefore made use of the Total Variation regularization in HSRL lidar ratio retrieval along both, horizontal and vertical directions. This is particularly useful to retrieve piecewise constant functions and, hence, the optimal choice for the lidar ratio variable, when the atmosphere is assumed to consist of patches of the same aerosol species. In summary, such regularization is principally able to merge (i) the problem of defining a feature mask from raw lidar data and (ii) the optical properties retrieval in low SNR conditions. Eventually, we propose to extend the current MLE 490 approach by a Total Variation regularization term in order to enable robust optical properties retrievals with Aeolus on finer scales. This could be attempted in future versions of the MLE algorithm.
like backscatter coefficients β, extinction coeffcients α and range R this subscript denotes the average in range bin i, i.e., β ||,p,i = 1 ∆Ri

Ri
Ri−1 β ||,p (r)dr with ∆R i = R i − R i−1 and equivalently for subscript m. As a consequence, particle optical depth of a bin is denoted L p,i = α p,i ∆R i and the particle one way transmission becomes with unknown optical depth L p,sat in between telescope and first range bin. The following approximations for the range 505 corrected signals are made by using the mean bin properties from above (Flamant et al., 2017): With this, equations 7 and 8 can be rephrased to This way, the signals in Rayleigh and Mie channel are expressed as functions of aerosol optical properties proxies γ ||,p,i and L p,i (and L p,sat ) solely, which can be rephrased into / are equivalent to α p,i and β ||,p,i (and L p,sat ). Temperature, pressure 515 and Doppler shift (HLOS wind) are assumed known a priori by means of ECMWF medium range weather forecast. Hence, the molecular optical properties, i.e., all variables with subscript m can be inferred with sufficient accuracy as well. The standard SCA solves equations (A2) to (A5) exactly in a recursive scheme beginning with default initial conditions in the first range bin to correct for non-zero optical depth above the measurement volume. More precisely, backscatter is calculated directly from And optical depth / extinction is retrieved independently from the profile of X i . Afterwards, the slant optical depth is transformed into the nadir optical depth by a trigonometric correction factor from which extinction is directly calculated.
But the SCA algorithm suffers from several shortcomings. Any input data is assumed noise-free and an exact solution of the ill-posed extinction retrieval problem is calculated. Among the former strategies to dampen the noise within SCA are averaged extinction estimates over vertically neighbouring range gates, so-called mid-bin properties, and zero flooring, whenever 525 negative extinction values would be obtained in the regular case. These SCA midbin properties are also denoted by SCA MB in this work. Both strategies are ad-hoc but averaging is believed to introduce less bias than flooring, because noise is cut off only in one direction by the latter. Furthermore, the signal noise properties can change abruptly from one range bin to another, due to the varying vertical range bin heights (250 m to 2 km). Hence, if this varying reliability of the signals is not taken into account, biases or oscillations can potentially be triggered whenever range bin heights change. Additionally, extinction 530 is retrieved independently from backscatter, after the linear equation system in (A4) and (A5) has been solved via standard methods. But due to the linear transform, the errors in properties X i and Y i are found to be highly anti-correlated. To illustrate this, one can solve equations (A4) and (A5) to obtain from error propagation that ε X,i =C 3 ε ray,i −C 2 ε mie,i (A7) ε Y,i = −C 4 ε ray,i +C 1 ε mie,i (A8) 535 with noise terms ε and rescaled coefficientsC 1...4 > 0. Accordingly, the noise from the Mie and Rayleigh channels contribute to the noise in X i and Y i with alternating signs so that the negative cross-correlation ε X,i ε Y,i < 0 is obtained. Thus, whenever a negative extinction coefficient is found in bin i, this indicates a spurious estimate of the backscatter coefficient as well (although it might appear to be well in physical limits) and vice versa. This additional knowledge is not accounted for within the processing and, hence, low SNR in one of the signal channels has the potential to deteriorate both backscatter and extinction 540 estimates. E.g., the contribution from the Mie channel in particle free atmosphere is pure noise, which affects the cross-talk corrected molecular signal X and consequently the extinction retrieval as well.

Appendix B: Uncertainty Estimates from Measurement Variability
The lidar data on 2.9 km measurement scale (indexed m) within an observation is horizontally accumulated onto observation scale of 87 km before the analysis, i.e., This estimate will cover all additionally known and unknown noise sources and coincides with the Poisson noise hypothesis in case that the measurement scale signals are truly Poisson distributed. The measurement error covariance S y comprises the terms σ 2 s for both channels, Mie and Rayleigh, and all range bins on its diagonal.

555
The errors on observation data are estimated from a sensitivity analysis around the solution, similar to standard error propagation in spirit. Therefore, the forward model F (x) is linearized about the solution point to obtain the matrix equation with Jacobian K. In an optimal estimation method (OEM) framework, this relation is inversed by the gain matrix G, which contains both the inverse measurement and inverse a-priori covariance matrices (Rodgers, 2000). The influence of the latter 560 diminishes to zero in the MLE limit of infinite a-priori variances. I.e., no a-priori knowledge is imposed, the cost function reduces to the MLE case and G = K −1 . In general, K will have some zero singular values so that some state vectors cannot be inferred from the measurement. Disregarding these states, this relation can be reversed by the Moore-Pennrose pseudoinverse to obtain a generalized inverse K −1 and hence the sensitivity of the state estimate under changes in the measurement. Thus, the equivalent to the OEM averaging kernel A = GK for the presented MLE reads A = K −1 K = I with identity matrix I.

565
This generalized averaging kernel can indicate spurious components of the state vector: Wherever A deviates clearly from the identity matrix, values will be flagged. But no information on spatial resolution can be inferred from A as opposed to the OEM.
With this, the uncertainty of MLEs retrieved state is phrased in terms of the covariance matrix assuming that the solution x * is the true state in the last equation. Hereafter, the bounds can be accounted for as follows:

570
The forward model is rephrased into the (α p , β ||,p )-formulation (see earlier) and the diagonal elements of S x * are denoted as variances σ 2 , which form a vector of standard deviations σ. Now the inequality σ α,i < γ max σ β,i is used to constrain the usually overestimated particle extinction error in accordance with the box constraints. Similarly, σ β,i < σ α,i /γ min constrains the particle backscatter error but is usually met by the results already. After this re-scaling of the diagonal elements, the offdiagonal elements of S x * can be re-normalized accordingly under the premise that the corresponding correlation matrix R x * 575 remains unchanged. These operations are summarized by

Appendix D: Biases and Errors in the Retrieval of the Simulation cases
The following figures provide dedicated information about biases and relative errors in the optical properties retrieval of the two simulation test cases.
Author contributions. Frithjof Ehlers is the main author of this paper. He has developed the MLE implementation and has tested it together with the SCA algorithms on the test datasets presented here. He has also performed the scientific analysis. The colleagues at Météo France