Comparing scattering ratio products retrieved from ALADIN/Aeolus and CALIOP/CALIPSO observations: sensitivity, comparability, and temporal evolution

Spaceborne active sounders have been contributing invaluable vertically resolved information on atmospheric optical properties since the launch of CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) in 2006. To ensure the continuity of climate studies and of the monitoring of global changes, one has to understand the differences between lidars operating at different wavelengths, flying in different orbits, and utilizing different observation geometries, receiving paths, and detectors. In this article, we show the results of an intercomparison study of the ALADIN (Atmospheric Laser Doppler INstrument) and CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) lidars using their scattering ratio (SR) products for the period of 28/06/2019−31/12/2019. We suggest an optimal set of collocation criteria (Δdist < 1º; Δtime < 6h), which gives a representative set of collocated profiles, and we show that for such a pair of instruments the theoretically achievable cloud detection agreement for the data collocated with these criteria is 0.77±0.17. The analysis of a collocated database consisting of ~78000 pairs of collocated nighttime SR profiles revealed the following: (a) in the cloud-free area, the agreement is good, indicating a low frequency of false positive cloud detections by both instruments; (b) the cloud detection agreement is better for the lower layers. Above ~7 km, the ALADIN product demonstrates lower sensitivity because of lower backscatter at 355 nm and because of a lower signal-to-noise ratio; (c) in 50% of the analyzed cases when ALADIN reported a low cloud not detected by CALIOP, a middle-level cloud hindered the observations and perturbed ALADIN's retrieval, indicating the need for quality flag refinement for such scenarios; (d) a large sensitivity to lower clouds skews ALADIN's cloud peaks down by ~0.5±0.4 km, but this effect does not alter the polar stratospheric cloud peak heights; (e) the temporal evolution of cloud agreement quality does not reveal any anomaly for the considered period, indicating that hot pixels and laser degradation effects in ALADIN have been mitigated at least down to the uncertainties in the following cloud detection agreement values: 61±16%, 34±18%, 24±10%, 26±10%, and 22±12% at 0.75 km, 2.25 km, 6.75 km, 8.75 km, and 10.25 km, respectively.


Introduction
Clouds play an important role in the energy budget of our planet: optically thick clouds reflect the incoming solar radiation, leading to cooling of the Earth, while thinner clouds act as "greenhouse films", preventing the escape of the Earth's long-wave radiation to space. Climate feedback analyses reveal that clouds are a large source of uncertainty for the climate sensitivity of climate models and, therefore, for the predicted climate development scenarios (e.g. Nam et al., 2012; Chepfer et al., 2014; Vaillant de Guélis et al., 2018). Understanding the Earth's radiative energy budget requires knowing the cloud cover, the geographical and altitudinal distribution of clouds, their temperature and composition, as well as the optical properties of cloud particles and their concentration.
Satellite observations have been providing a continuous survey of clouds over the whole globe. IR sounders have been observing our planet since 1979: from the TOVS (TIROS Operational Vertical Sounder) instruments (Smith et al., 1979) onboard the NOAA polar satellites to the AIRS (Atmospheric InfraRed Sounder) spectrometer (Chahine et al., 2006) onboard Aqua (since 2002) and to the IASI (Infrared Atmospheric Sounder Interferometer) instrument (Chalon et al., 2001; Hilton et al., 2012) onboard MetOp (since 2006), with increasing spectral resolution. Despite an excellent daily coverage and daytime/nighttime observation capability (Menzel et al., 2016; Stubenrauch et al., 2017), the height uncertainty of the cloud products retrieved from the observations performed by these spaceborne instruments is limited by the width of their channels' contribution functions, which is on the order of hundreds of meters, and the vertical profile of the cloud cannot be retrieved with the accuracy needed for climate feedback analysis. This drawback is eliminated by active sounders, the very nature of which is based on altitude-resolved detection of backscattered radiation. Vertical profiles of the cloud parameters have been available from the CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) lidar (Winker et al., 2003) and the CloudSat radar (Stephens et al., 2002) since 2006, and the CATS (Cloud-Aerosol Transport System) lidar on board the ISS provided measurements for over 33 months starting from the beginning of 2015 (McGill et al., 2015). The ALADIN (Atmospheric Laser Doppler INstrument) lidar on board Aeolus (Krawczyk et al., 1995; Stoffelen et al., 2005; ADM-Aeolus Science report, 2008) has been measuring horizontal winds and aerosols/clouds since September 2018. More lidars are planned: in 2023, the ATLID (ATmospheric LIDar)/EarthCare instrument (Héliere et al., 2012) will be launched, and other space-borne lidars are in the development phase.
Even though all active instruments share the same measuring principle (a short pulse of laser or radar electromagnetic radiation is sent to the atmosphere, and the time-resolved backscatter signal is collected by the telescope and registered in one or several receiver channels), the wavelength, pulse energy, pulse repetition frequency (PRF), telescope diameter, orbit, detector, and many other parameters are not the same for any given pair of current or future instruments. These differences determine the active instruments' capability of detecting atmospheric aerosols and/or hydrometeors for a given atmospheric scenario and observation conditions (day, night, averaging distance). At the same time, there is an obvious need to ensure the continuity of global spaceborne measurements and to obtain a seamless transition between the satellite missions (Chepfer et al., 2018).
This work seeks to address this issue using the ALADIN/Aeolus spaceborne wind lidar operating at 355 nm and the CALIOP/CALIPSO atmospheric lidar operating at 532 nm. Even though the main goal of ALADIN is wind detection (Reitebuch et al., 2020; Straume et al., 2020), the calibration of which does not rely on absolute calibration of the detected radiation, its products include atmospheric optical properties, and such a comparison serves intercalibration purposes. In addition, the methods developed in the course of this study and the interpretation of the results will set the stage for the future validation of the ATLID/EarthCare instrument and other spaceborne lidars.
The structure of the article is as follows. In Section 2, we describe the datasets used in this study, explain the collocation criteria, and provide an estimate of the best theoretically achievable agreement for two instruments in a given configuration. In Section 3, we strive to provide a multifaceted view of the collocated dataset and discuss the observed differences. Section 4 concludes the article.

Datasets and methods
We start this section with a description of the ALADIN/Aeolus optical properties dataset, followed by a description of the CALIOP/CALIPSO product and its modification aimed at matching the sampling and averaging of the Aeolus product. In the next steps, we define the procedures and criteria for the comparison of these two products.

AEOLUS
A detailed description of the Aeolus mission and its instrument can be found in (Krawczyk et al., 1995; Stoffelen et al., 2005; ADM-Aeolus Science report, 2008; Flamant et al., 2017); here we provide only a brief description of the lidar and the details necessary for understanding the key differences between the compared instruments. The Aeolus satellite carries a Doppler wind lidar called ALADIN, which operates at 355 nm wavelength and is composed of a transmitter, a Cassegrain telescope, and a receiver capable of separating the molecular (Rayleigh) and particulate (Mie) backscattered photons (HSRL, high spectral resolution lidar). The lidar is aimed 35° from nadir and 90° to the satellite track, its orbit is inclined at 96.97º, and the instrument overpasses the equator at 6h and 18h of local solar time (LST); see also Table 1 to compare with CALIOP.
The laser emitter sends 15 ns long pulses of 355 nm radiation down into the atmosphere 50 times per second. The telescope collects the light that is backscattered from air molecules, aerosols, and hydrometeors. The received backscatter signal in the Mie receiver passes through a Fizeau interferometer, which produces a linear fringe whose position on the ACCD (Accumulation Charge Coupled Device) detector of this channel is linked to the wind velocity. The Rayleigh receiver uses a dual-filter Fabry-Pérot interferometer, which throws two images on the ACCD detector of this channel, and the wind speed is defined from the ratio of the intensities of these two images (Chanin et al., 1989). Besides the winds, the Aeolus processing algorithms retrieve the optical properties of the observed atmospheric layers (Ansmann et al., 2007; Flamant et al., 2017). The vertical resolution of the instrument is adjustable, but the total number of points in a vertical profile is defined by the number of rows of the detector dedicated to this purpose (24). The observation priorities changed throughout the period of the mission (Bley et al., 2021), and for the majority of the period considered in this work (see below), the vertical sampling of both Mie and Rayleigh channels between 2 km and 22 km was equal to 1 km, whereas the sampling below 2 km varied from 0.25 to 1 km. The native horizontal resolution of 140 m of the instrument is sacrificed to achieve a higher signal-to-noise ratio, both onboard by accumulating the detected profiles and on the ground by averaging the downloaded profiles at different steps of the processing chain (Flamant et al., 2017).
The present study has been done using the pilot L2A dataset from Aeolus, Prototype_v3.10, which is available for a limited period of ALADIN's observations, from 28/06/2019 through 31/12/2019. According to (Flamant et al., 2017), the L2A data is produced from the L1B product of this instrument and it contains height profiles of Mie and Rayleigh co-polarized backscatter and extinction coefficients, scattering ratios, and lidar ratios (Flamant et al., 2008; Lolli et al., 2013) along the lidar line-of-sight. For the end user, the profiles are provided both on the observation scale (87 km averages) and on smaller scales after applying scene classification, but for the purposes of the present work the scattering ratio on the scale of 87 km is an optimal choice.
In Fig. 1(a-c), we show the observation geometry and sampling of ALADIN's L2A product as well as three variables retrieved from its observations, namely, the APB (Attenuated Particular Backscatter), the AMB (Attenuated Molecular Backscatter), and the ATB (Attenuated Total Backscatter). The white dashed lines in Fig. 1 represent the lines of sight of the instrument.
One has to note, however, that in reality ALADIN's line of sight is pointed perpendicular to the flight direction; at the same time, the horizontal variability of the observed scene is nearly the same in the latitudinal and longitudinal directions at 100 km distance, so the sketch gives an idea of the comparability of the physical parameters observed by ALADIN (Fig. 1a-c) and CALIOP (Fig. 1d). The atmospheric scene used in Fig. 1 has been calculated for demonstration purposes for two wavelengths, 355 nm (Fig. 1a,b,c) and 532 nm (Fig. 1d), from the output of the EAMv1 (Energy Exascale Earth System Model (E3SM) atmosphere model version 1) atmospheric model (Rasch et al., 2019) for the conditions of the autumn equinox in the Northern hemisphere. This data has been obtained with the help of the COSP2 (Cloud Feedback Model Intercomparison Project Observational Simulator Package, v2) package, which is capable of simulating the atmospheric observables for spaceborne instruments (Swales et al., 2018). CALIOP is built into COSP2 (Chepfer et al., 2008) whereas ALADIN is not yet a part of this package, so we used the 355 nm calculations by COSP2 (Reverdy et al., 2015) on a fine grid corresponding to ALADIN's original laser pulse repetition frequency and modified them in accordance with ALADIN's vertical and horizontal averaging. The cloud variability along the satellite's track has been estimated from the gridded EAMv1 data using the parameterization of (Boutle et al., 2014). Figure 1 also serves as an illustration of the theoretically achievable cloud detection agreement discussed below.
For each profile corresponding to an inclined dashed line in Fig. 1, we extracted the corresponding scattering ratio (SR) column of the sca_optical_properties group of variables, where SCA stands for standard correct algorithm (Flamant et al., 2017). An important companion of such a column is the corresponding quality flag column, which we scanned looking for the points characterized either by high Mie signal-to-noise ratio (SNR) or by high Rayleigh SNR, and by a flag that indicates an absence of signal attenuation. Presumably, these flags are necessary and sufficient for a valid SR profile, which can then be compared with that of CALIOP.
https://doi.org/10.5194/amt-2021-96 Preprint. Discussion started: 19 April 2021 c Author(s) 2021. CC BY 4.0 License.

CALIPSO-GOCCP
CALIOP, a two-wavelength polarization-sensitive nadir viewing lidar, provides high-resolution vertical profiles of aerosols and clouds. Its 705 km orbit is inclined at 98.05º and it overpasses the equator at 1h30 and 13h30 LST, see also Table 1. It uses three receiver channels: one measuring the 1064 nm backscatter intensity and two channels measuring orthogonally polarized components of the 532 nm backscattered signal. Cloud and aerosol layers are detected by comparing the measured 532 nm signal return with the return expected from a molecular atmosphere.
The CALIPSO-GOCCP (GCM Oriented Cloud Calipso Product) was initially designed to evaluate GCM cloudiness (Chepfer et al., 2010). It is derived from CALIPSO L1/NASA products at LMD/IPSL with the support of NASA/CNES, ICARE, and ClimServ and it contains observational cloud diagnostics including the instantaneous scattering ratio (profiles) at the native horizontal resolution of CALIOP (333 m) and at ~0.5 km vertical resolution. This makes it a good reference dataset for ALADIN retrievals because it can be easily recalculated to the latter's horizontal and vertical grids considering the corresponding horizontal averaging. Since CALIOP is not an HSRL, the detailed information on AMB and APB is not available, and one has to compare the SR products. Correspondingly, we convert ALADIN's SR retrieved at 355 nm to SR at 532 nm using the following equation: (1) which is derived from (Collis and Russell, 1976) under the assumption that their fitting parameter Λ (see their Section 4.3.1) is equal to 3. The choice of the fitting parameter is not crucial for the purposes of the present work because the conversion described by Eq. 1 is linear and it does not change the altitude distribution of the SR. On the other hand, using the same physical parameter is highly advisable for the comparisons we intend to perform. Theoretically, one could have validated the parameters of Eq. 1 using the collocated data under consideration, but, looking ahead, the spread of the values is too large to do it with reasonable uncertainty, so we stay with Eq. 1 in the framework of this paper, and in Appendix A we justify our choice of conversion coefficients using the collocated data.
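Since the text states only that the conversion is linear, its general shape can be illustrated with a minimal Python sketch. It relies on the definition SR = 1 + β_part/β_mol and on the λ⁻⁴ dependence of molecular (Rayleigh) backscatter. The default slope below additionally assumes wavelength-independent particulate backscatter; this is an illustrative assumption, and the function name `sr355_to_sr532` and the parameter `k` are hypothetical and need not coincide with the coefficients of Eq. 1.

```python
def sr355_to_sr532(sr355, k=(532.0 / 355.0) ** 4):
    """Linear SR conversion sketch: SR532 = 1 + k * (SR355 - 1).

    k is the ratio of (beta_part / beta_mol) at 532 nm to that at 355 nm.
    The default (about 5) assumes lambda^-4 molecular backscatter and
    wavelength-independent particulate backscatter (an assumption of this
    sketch, not necessarily the coefficient used in the paper).
    """
    return 1.0 + k * (sr355 - 1.0)


# A purely molecular atmosphere (SR = 1) stays molecular after conversion.
clear_sky = sr355_to_sr532(1.0)
# A cloudy bin is rescaled linearly, so the altitude of an SR peak is unchanged.
cloudy = sr355_to_sr532(2.0)
```

Because the mapping is linear and monotonic, it rescales SR values without moving the altitude at which a peak occurs, in line with the remark above that the conversion does not change the altitude distribution of the SR.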

Collocation criteria
As for any collocation, there is a trade-off between the quality of collocation and the number of collocated pairs of profiles.
As we show below, in the case of AEOLUS and CALIPSO, this trade-off is supplemented with a requirement of a representative geographical coverage, because imposing a strict temporal overlap criterion dramatically changes the latitudinal distribution of the collocated points. Since the horizontal averaging and resolution of the Aeolus Prototype_v3.10 product is 87 km, there is little sense in collocating the data with an accuracy better than this value. On the other hand, the fractional standard deviation fc of cloud water content at 1º (~111 km) distance is about 0.5 for a cloud cover of 1 (Boutle et al., 2014), and there is a risk of comparing incoherent quantities, so we took Δdist = 1º as a limit for the collocations and created several subsets based on Δtime, the absolute value of the time difference between two collocated measurements. In Fig. 2, we show three such subsets, and Table 2 provides the information about the other cases we considered. On the one hand, one can see that a strict collocation criterion of Δtime < 1h (Fig. 2a) provides information only about two narrow zones in the Southern and Northern polar regions. On the other hand, the excellent geographical coverage shown in Fig. 2c comes at the cost of mixing up cases that differ by almost one day, which is unacceptable from the point of view of temporal variation. In addition, this case is characterized by an unequal distribution of Δtime throughout the globe. Finally, the subset corresponding to Δtime < 6h (Fig. 2b) has been chosen for the analysis. Over the oceans, the diurnal effects in cloud distribution associated with this difference are small (e.g. Noel et al., 2018; Chepfer et al., 2019; Feofilov and Stubenrauch, 2019), and the land represents one third of the analyzed cases. To avoid the risks associated with solar contamination, we picked only the night-time cases, which yield about 7.8E4 pairs of SR profiles. In supplementary materials, we provide the complete collocated database, which corresponds to the last line, 4th column of Table 2 (3.7E5 collocations), for further analysis by the interested teams.
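The selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing code used for the study: each profile is reduced to a hypothetical `(lat, lon, time_in_hours)` tuple, the angular separation is computed with the haversine formula, and when several CALIOP profiles satisfy both criteria the nearest one is kept (the text does not specify the tie-breaking rule).

```python
import math


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def collocate(aladin, caliop, max_dist_deg=1.0, max_dt_hours=6.0):
    """Return (i, j) index pairs with dist < max_dist_deg and |dt| < max_dt_hours.

    Each record is a (lat, lon, time_in_hours) tuple (hypothetical layout).
    For every ALADIN record the nearest qualifying CALIOP record is kept,
    which is an assumption of this sketch.
    """
    pairs = []
    for i, (lat_a, lon_a, t_a) in enumerate(aladin):
        best = None
        for j, (lat_c, lon_c, t_c) in enumerate(caliop):
            if abs(t_a - t_c) >= max_dt_hours:
                continue
            d = angular_distance_deg(lat_a, lon_a, lat_c, lon_c)
            if d < max_dist_deg and (best is None or d < best[1]):
                best = (j, d)
        if best is not None:
            pairs.append((i, best[0]))
    return pairs


aladin_demo = [(10.0, 20.0, 0.0), (50.0, 0.0, 0.0)]
caliop_demo = [(10.5, 20.0, 5.0), (10.2, 20.0, 7.0), (50.0, 0.5, 1.0)]
demo_pairs = collocate(aladin_demo, caliop_demo)
```

The brute-force double loop is quadratic in the number of profiles; for a full mission dataset one would likely pre-sort by time or use a spatial index, but the selection logic stays the same.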

Estimating the theoretically achievable agreement between two collocated datasets
To justify the collocation criteria and to estimate the theoretically possible agreement for the clouds detected by two instruments in a given setup and for the selected Δtime and Δdist values, we have performed a numerical experiment using the same calculated data as we used in Fig. 1. This time, we picked the "lidar curtain" at 532 nm calculated at the resolution of CALIOP (333 m) and created artificial pairs of "collocated" data with the Δdist distribution modulated by that of a real collocated dataset. The "reference" CALIOP profile has been composed using 200 individual SR profiles covering a 67 km region, which is somewhat less than the 87 km covered by ALADIN. This averaging is supposed to capture the mean atmospheric properties without going too far from the ALADIN footprint location. The "test" SR profile was created from the SR averages, considering both ALADIN's off-nadir pointing and its 87 km averaging. To imitate the diurnal variation, we modulated the SRs using the 6-hour diurnal cycle amplitudes for land and ocean retrieved from active and passive observations (Chepfer et al., 2019; Feofilov and Stubenrauch, 2019) and added them to the comparison. Besides testing a noise-free simulation, we also checked the effects introduced by instrumental noise for CALIOP.
Since ALADIN is not yet part of COSP2, we used the estimates from (Ansmann et al., 2007). Overall, we considered about 1E5 pairs of pseudo-collocated data, and we present the results of cloud detection in Fig. 3. We define the cloud detection agreement as follows: for each altitude bin, the cloud detection agreement is the ratio of the number of cases when both instruments have detected a cloud (SR>5) to the total number of joint observations. For a given altitude bin, the cloud amount is the ratio of the number of cases with SR>5 to the total number of profiles for a single instrument, and the normalized cloud detection agreement is the ratio of the former to the latter. As one can see, the normalized cloud detection quality is mostly defined by the horizontal variability of aerosols/hydrometeors and by differences in the viewing geometries of the two instruments. Observation noise and diurnal variation play a secondary role, and according to our estimates the saturation effects in the 355 nm and 532 nm channels associated with opaque clouds (Guzman et al., 2017) do not add more than 2% to the cloud detection mismatch (not shown in Fig. 3 for the sake of clarity). Overall, the theoretically achievable agreement for the collocated data in a given setup can be estimated as 0.77±0.17 for cloud detection.
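The three quantities defined above can be written down compactly. The following NumPy sketch assumes that the collocated SR profiles of both instruments are already on a common altitude grid and at the same wavelength, stored as arrays of shape (n_profiles, n_levels) without missing values; the function name `detection_agreement` and the choice of the second instrument as the one whose cloud amount is used for normalization are assumptions of this illustration.

```python
import numpy as np

SR_THRESHOLD = 5.0  # cloud detection threshold on SR at 532 nm, as in the text


def detection_agreement(sr_a, sr_b, threshold=SR_THRESHOLD):
    """Per-altitude-bin agreement statistics for collocated SR profiles.

    sr_a, sr_b: arrays of shape (n_profiles, n_levels), no missing values.
    Returns (agreement, cloud_amount_b, normalized_agreement).
    """
    cloud_a = np.asarray(sr_a) > threshold
    cloud_b = np.asarray(sr_b) > threshold
    n_joint = cloud_a.shape[0]
    # Fraction of joint observations in which both instruments see a cloud.
    agreement = (cloud_a & cloud_b).sum(axis=0) / n_joint
    # Cloud amount of a single instrument (here: instrument b).
    amount_b = cloud_b.sum(axis=0) / n_joint
    # Normalized agreement; undefined (NaN) where instrument b sees no cloud.
    with np.errstate(divide="ignore", invalid="ignore"):
        normalized = np.where(amount_b > 0, agreement / amount_b, np.nan)
    return agreement, amount_b, normalized


sr_a_demo = np.array([[6.0, 1.0], [6.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
sr_b_demo = np.array([[6.0, 1.0], [1.0, 1.0], [6.0, 1.0], [1.0, 1.0]])
agreement_demo, amount_demo, normalized_demo = detection_agreement(sr_a_demo, sr_b_demo)
```

In the toy input, both instruments report a cloud in one of four joint observations of the first level, while instrument b alone reports a cloud in two of four, so the normalized agreement for that level is one half.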

Zonal averages
To give a general overview of the agreement between the two products, we have split the database into five latitudinal zones: 90S−60S, […] (Fig. 4). As stated above, we rescale the SR355 values retrieved from ALADIN observations to SR532 using Eq. 1. Even though zonal mean statistics do not require collocated data, we use collocated data to avoid any incoherence in sampling different geographic areas. By using exactly the same number of profiles collocated within 1°, we ensure the same coverage and sampling by both lidars. If the detection efficiency of different cloud types were the same for the two instruments, the plots would be close to each other because the horizontal variability of clouds would cancel out due to averaging over a large number of profiles within the zone, and the diurnal variation is small over oceans, which constitute two thirds of the cases used to build Fig. 4 (Chepfer et al., 2019; Feofilov and Stubenrauch, 2019).
Analyzing Fig. 4, one can note the following: (1) the SR/altitude histograms of CALIOP (Fig. 4a-e) are characterized by two distinct peaks corresponding to low-level and high-level clouds; this feature is coherent with other observations, e.g. with the GEWEX (Global Energy and Water cycle Experiment) cloud assessment (Stubenrauch et al., 2013); (2) the SR/altitude histograms built for SRs retrieved from ALADIN's observations (Fig. 4f-j) are characterized by a smoother occurrence frequency plot where the two-peak structure is less pronounced than for CALIOP; (3) even though ALADIN detects some clouds in the polar stratosphere (PSCs), its overall sensitivity to high clouds (>7 km) is lower than that of CALIOP; (4) both rows show a certain consistency of zone-to-zone change up to ~3 km altitude, while the behavior above requires a more detailed view.
We would like to stress here that no linear scaling applied uniformly to SRs at all heights could change the ratio of high cloud detection frequency to low cloud detection frequency of ALADIN. The same is true for CALIOP. In the next step, we compare the "instantaneous" profiles provided by CALIOP and ALADIN, keeping in mind the differences in cloud detection sensitivity observed in Fig. 4.

Comparing pseudo-individual profiles at ALADIN's L2A product resolution
To address the high cloud detection sensitivity, we have inspected the 6h nighttime subset of collocated data, looking for the cases that satisfy the following criteria: (1) both instruments should have at least one strong SR peak; (2) the vertical position of this peak detected by one instrument should match that of the peak detected by the second instrument within 1 km; (3) the CALIOP SR profile should have a secondary peak at or above 9 km (Fig. 5a-j). For comparison purposes, the panels in Fig. 5 represent the individual profiles belonging to the same 5 zones as the panels of Fig. 4. For the sake of simplicity, we compare the SR355(z) profiles recalculated to SR532(z), but we also show the source SR355(z) profiles for reference purposes. Regarding the conversion using Eq. 1, the strong peaks selected this way demonstrate a qualitative agreement between the peak values calculated from SR355 and the retrieved peak SR532 values. In Appendix A, we demonstrate the correlation between individual pairs of CALIOP and ALADIN SR profiles; the conclusion of this exercise is that it justifies using Eq. 1, but the uncertainties of the analysis do not allow the conversion coefficients to be refined. As for the potential capability of ALADIN to detect high clouds, the subset in Fig. 5a-e represents the cases for which the instrument was capable of retrieving a peak of the same magnitude and height as the peak detected by CALIOP. Even though these cases exist, they are far less frequent than those shown in Fig. 5f-j. We did not detect any correlation between the collocation criteria (Δdist; Δtime) and the frequency of occurrence of these cases; it is a statistical observation that both types of cases exist and the former are less frequent than the latter. This observation gives a hint that the instrumental part provides backscatter information sufficient for some cloud detection up to 20 km, but the detection algorithm suppresses noisy solutions. The PSC detection discussed below (see also Fig. 4f) confirms this assumption because the vertical extent and the composition of these clouds yield a strong signal.
Further speculations on this subject are beyond the scope of the present article, but we believe that the high cloud detection agreement might be improved by studying the collocated cases provided in the supplementary materials and by applying different noise filtering techniques in the L0→L1→L2 elements of the ALADIN retrieval chain. Figures 5k-o will be discussed below in the context of low-level cloud observations.

Cloud detection agreement
To illustrate the peculiarities of the zonal and altitudinal behavior of cloud detection agreement between the two considered instruments, we have split the collocated data into four groups (Fig. 6). For each altitude/latitude grid point, we have estimated the number of cases when both instruments have detected a cloud (SR532(z)>5), when neither instrument has detected a cloud, when only CALIOP has detected a cloud, and when only ALADIN has detected a cloud. For the sake of simplicity, we will call them YES_YES, NO_NO, YES_NO, and NO_YES cases. It is clear that in the ideal experiment the number of mismatched cases (YES_NO and NO_YES) should tend to zero. From the study presented in Section 2.4, we expect that the ratio (YES_YES+NO_NO)/(YES_YES+NO_NO+YES_NO+NO_YES) should be about 0.77±0.17 if both instruments detect the clouds with the same efficiency. In Fig. 6a we show the ratio of YES_YES cases to the total number of collocated profiles per altitude/latitude bin. This panel resembles a typical cloud amount plot, and this is expected because in the case of an ideal agreement the aforementioned ratio is equivalent to the cloud amount definition. Below, we will also discuss the YES_YES statistics normalized to cloud amount, but at this point we also want to study the other cases, which cannot be normalized this way. Even though the distribution in Fig. 6a looks physical, the absolute numbers are somewhat low, and this is explained by the YES_NO and NO_YES distributions (Fig. 6c and d, respectively). As for the NO_NO agreement (Fig. 6b), it is close to 100% in the high-altitude area where there are no clouds. This indicates that the noise-induced false detection rate of both instruments is low, which is a good sign.
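The four-way split can be sketched as follows for a single latitude zone. The sketch assumes SR profiles already converted to 532 nm, stored as arrays of shape (n_profiles, n_levels) without missing values; the function name `contingency_fractions` is hypothetical, and the category labels follow the text, with the first word referring to CALIOP and the second to ALADIN.

```python
import numpy as np


def contingency_fractions(sr_caliop, sr_aladin, threshold=5.0):
    """Fractions of YES_YES, NO_NO, YES_NO, NO_YES cases per altitude bin.

    First word: CALIOP detection; second word: ALADIN detection.
    Inputs have shape (n_profiles, n_levels) and contain no missing values.
    """
    c = np.asarray(sr_caliop) > threshold
    a = np.asarray(sr_aladin) > threshold
    n = c.shape[0]
    return {
        "YES_YES": (c & a).sum(axis=0) / n,
        "NO_NO": (~c & ~a).sum(axis=0) / n,
        "YES_NO": (c & ~a).sum(axis=0) / n,
        "NO_YES": (~c & a).sum(axis=0) / n,
    }


caliop_demo_sr = np.array([[6.0, 1.0], [6.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
aladin_demo_sr = np.array([[6.0, 1.0], [1.0, 1.0], [6.0, 1.0], [1.0, 1.0]])
fractions_demo = contingency_fractions(caliop_demo_sr, aladin_demo_sr)
# Overall agreement per bin: matched cases over all cases.
total_agreement_demo = fractions_demo["YES_YES"] + fractions_demo["NO_NO"]
```

The four fractions sum to one in every bin, so the overall agreement ratio quoted above is simply the sum of the YES_YES and NO_NO fractions.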
If we consider the mismatch of the YES_NO type (Fig. 6c), we see that the altitudinal/zonal distribution of the mismatch occurrence frequency resembles that of the YES_YES type. Part of the mismatch can be explained by the theoretically allowed cloud detection disagreement discussed in Section 2.4. However, the occurrence frequency of YES_NO cases above 3 km is […] whereas it does not report anything at 9 km height where CALIOP sees a thick cloud. These cases do need our special attention.
On the one hand, many cases of this type are over the ocean, so one can rule out the surface echo mixed with atmospheric backscatter and treated like an atmospheric signal. On the other hand, the NO_YES cases are often accompanied by structures similar to those presented in Fig. 5k,l,n, which are probably provoked by the presence of a cloud at these heights. The perturbations to the extinction and backscatter profile caused by these structures might propagate downwards, thus causing the appearance of false peaks in the lower layers of ALADIN's data. This indicates a need for a quality flag refinement in the lower layers in the presence of a thick cloud above and for the improvement of thick cloud detection itself. Apparently, the CALIOP cloud retrievals beneath thick clouds do not suffer from these effects.
To test whether the aforementioned disagreements are at least partially caused by the cloud definition and the SR recalculation to another wavelength and whether the agreement could be improved, we varied the SR threshold for ALADIN, assuming a ±50% uncertainty on the parameters forming the coefficients of Eq. 1. However, this exercise yielded no optimum value for the SR threshold: lowering it for ALADIN increased the number of YES_YES cases and reduced the number of YES_NO cases, but at the same time it increased the frequency of NO_YES cases. Correspondingly, increasing the threshold reduced the number of NO_YES cases, but it adversely affected the YES_YES agreement. Summarizing this comparison, one can conclude that (a) a cloud detected by CALIOP is detected by ALADIN in ~50% of cases for clouds below ~3 km and in ~30% of cases for higher clouds; (b) in the cloud-free area, the agreement between the datasets is good, which indicates a low frequency of false positive detections by both instruments; (c) one half of the cases when ALADIN detects a cloud missed by CALIOP should be attributed to false positive detection of a low cloud in the presence of a higher opaque cloud, which perturbs the retrieval in the lower layers.

Cloud altitude detection sensitivity
Besides marking the profile elements as "cloudy" and "not cloudy" and comparing the cloud detection statistics as we did in the previous section, it is interesting to obtain cloud peak detection statistics for pairs of collocated profiles like those shown in Fig. 5. This exercise is not aimed at revealing any altitude offset in backscatter signal registration, because this part of the experimental setup is robust in both instruments. But, as we saw in Fig. 4 and Fig. 6, the sensitivity of ALADIN to high clouds is lower than to low clouds, and a convolution of the sensitivity curve with the backscatter profile can skew the cloud peak position and the average cloud height. To illustrate this effect, we have carried out the following analysis. For each pair of collocated profiles selected for the YES_YES plot (Fig. 6a), we scanned through the ALADIN profile step by step, looking for a local maximum, which we define as a set of the following conditions: where SRthreshold is the cloud detection threshold at 532 nm, which is equal to 5. For each local peak found, we have searched for a peak or for a maximal value of CALIOP's SR profile in the vicinity of ±3 km from the peak height determined from ALADIN. The choice of a "reference" dataset in this case depends on the detection probability: if one chooses CALIOP as a reference, the distance to the nearest ALADIN peak might be spoiled by the lower probability of cloud detection by ALADIN and the distribution will be skewed. The search limits are arbitrary; they have been chosen from inspecting the collocated profiles, taking into account the natural variability of cloud heights at distances of about 100 km, estimated from the analysis of the CALIOP data used in this study (~75% of clouds move vertically by less than 1 km, ~8% by 1−2 km, ~5% by 2−3 km, ~4% by 3−4 km, ~3% by 4−5 km, and ~5% by more than 5 km).
The differences between the ALADIN and CALIOP cloud peak heights have been stored and then averaged in the corresponding latitude/altitude bins (Fig. 7). As one can see, the cloud height detection agreement is better than 0.2 km below ~3 km and, surprisingly, for some of the high-altitude zones. For the tropical zone, this is probably linked with thick Ci clouds, which should be reliably detected by both instruments. For the Southern polar zone, this figure reveals the PSCs, which are barely visible in Fig. 6a, but which can be seen in Fig. 4f for ALADIN.
These clouds form at very low temperatures and are composed of ice particles yielding a reflection that is reliably detected at both wavelengths if the layer is thick (e.g. Adriani et al., 2004; Snels et al., 2021). As for the clouds between ~3 km and ~10 km height, the height sensitivity effects skew the effective cloud height detected by ALADIN downwards by 0.5−1.0 km. This is consistent with Fig. 4, which shows a lower frequency of occurrence of high clouds detected by ALADIN. At least a part of the cloud peak shifts in the 3−5 km layer should be attributed to the reasons discussed for the NO_YES statistics, and these differences should decrease once the aforementioned quality flags for cloud-perturbed retrievals are fixed.
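The averaging of peak height differences in latitude/altitude bins can be illustrated with a short sketch. The bin sizes below (10° in latitude, 1 km in altitude) and the function name are hypothetical placeholders; the text does not state the binning used for Fig. 7.

```python
import math
from collections import defaultdict


def bin_mean_differences(samples, lat_step_deg=10.0, alt_step_km=1.0):
    """Average peak height differences in latitude/altitude bins.

    samples: iterable of (latitude_deg, peak_height_km, height_difference_km).
    The bin steps are ASSUMED values chosen for illustration only.
    Returns {(lat_bin_index, alt_bin_index): mean difference in km}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin -> [sum of differences, count]
    for lat, z, dz in samples:
        key = (math.floor(lat / lat_step_deg), math.floor(z / alt_step_km))
        sums[key][0] += dz
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}
```

With these steps, bin index (1, 4) covers latitudes from 10° to 20° and heights from 4 km to 5 km. Bins that receive no samples are simply absent from the result.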

Temporal evolution of cloud detection agreement
ALADIN is a relatively young instrument, and its calibration/validation activity is still under way (Baars et al., 2020; Donovan et al., 2020; Kanitz et al., 2020; Reitebuch et al., 2020; Straume et al., 2020). This includes, but is not limited to, internal calibration and comparisons with other observations. The Aeolus mission faced a number of technical issues, which prevented it from reaching the planned specifications. These issues are related to several factors: (a) laser power degradation (60 mJ/pulse instead of 80 mJ/pulse) and signal losses in the emission and reception paths (33%), which result in a lower signal-to-noise ratio (SNR) than planned; (b) telescope mirror temperature effects biasing the wind detection and the calibration of the Mie and Rayleigh channels of ALADIN; (c) a constantly increasing number of hot pixels in both ACCD detectors (Weiler et al., 2021), leading to errors both in wind speed and in the retrieved optical parameters of the atmosphere (the number of hot pixels increased by a factor of 1.4 during the period considered in this work). The Aeolus teams managed to mitigate some of these adverse effects (e.g. Baars et al., 2020; Weiler et al., 2021), and it is of interest to see whether the pilot L2A dataset, Prototype_v3.10, is free of cloud detection quality trends. If so, this would indicate a good calibration and consistent processing from Level 0 through Level 1 to Level 2A.
In Fig. 8 and 9 we show the temporal evolution of cloud detection agreement per height bin. The panels of Fig. 8 are consistent with those of Fig. 6, whereas Fig. 9 considers only the evolution of the YES_YES statistics, which corresponds to Fig. 6a, normalized to remove seasonal variations. The signatures one should be looking for are experimental artefacts linked with laser power degradation, the appearance of hot pixels, and bias corrections.
If these issues are not properly compensated, the "agreement panels" (Fig. 8a,b) should demonstrate a decrease in occurrence frequency with time and the occurrence frequency in the "disagreement panels" (Fig. 8c,d) should increase with time. As one can see, this is not the case: visually, none of the four panels of Fig. 8 shows an anomaly exceeding its noise level (a special region corresponding to a forced bin size reduction in the period of 28/10/2019−10/11/2019 is marked by white dashed lines in Fig. 8 and should not be considered at heights below 2250 m). To quantify the tendencies and to compare them with noise levels, we have normalized Fig. 8a (YES_YES cases) by cloud amount per altitude/time bin. This procedure removes the seasonal variation of clouds. The results presented in Fig. 9 confirm the previous conclusions regarding the altitude distribution of cloud detection agreement: for the clouds below 3 km it is better than for higher ones (61±16% and 34±18% for 0.75 km and 2.25 km, respectively, versus 24±10%, 26±10%, and 22±12% for 6.75 km, 8.75 km, and 10.25 km, respectively). As for the tendencies, the low-level clouds demonstrate an improvement towards the end of the year, whereas the agreement for 6.75 km and 10.25 km becomes slightly worse by the end of the considered period. If we compare the hot pixel distributions for the Mie and Rayleigh channel ACCD detectors at the beginning and at the end of the time scale of Fig. 8 and 9 (Table 2 of Weiler et al., 2021), we see 3 and 5 new hot pixels for the Mie and Rayleigh matrices, respectively. Even though the Rayleigh matrix pixels are not directly linked to cloud detection, their information is used for the ALADIN SR calculations. For the Mie matrix, the lowermost hot pixel that appeared during the considered period corresponds to ~15 km height, and this cannot affect the tendencies shown in Fig. 9.
As for the new Rayleigh hot pixels, the lowermost two correspond to 1 km height, the next two to 5 km height, and the last one to 18 km. This information does not explain the observed behavior, either. Overall, considering the relatively large error bars for all five altitudinal sections presented in Fig. 9b and the variety of the observed slopes, one cannot draw a sound conclusion either regarding the deterioration (or the improvement) of cloud detection agreement or regarding the link between the appearance of hot pixels and a change in cloud detection quality. A proper conclusion is that one does not detect tendencies beyond the variability limits of the analyzed parameter and that the appearance of hot pixels cannot be tracked from the cloud agreement plots, indicating that the compensation for hot pixel effects (Weiler et al., 2021) is effective at this level of uncertainty. In the same way, no signature of laser power degradation is observed in Fig. 8 and 9.
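The normalization and the comparison of tendencies with noise levels discussed in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for Fig. 9: "cloud amount" is taken here to be a count of cloudy detections in the same altitude/time bin, the tendency is estimated with an ordinary least-squares line, and a change over the period is called detectable only if it exceeds the standard deviation of the series. The function names and this crude criterion are hypothetical.

```python
import statistics


def normalized_agreement(yes_yes_counts, cloud_counts):
    """Divide YES_YES counts by the cloud amount of the same altitude/time bin.

    Bins without clouds yield None so that they can be skipped later.
    """
    return [yy / c if c > 0 else None
            for yy, c in zip(yes_yes_counts, cloud_counts)]


def trend_vs_noise(times, values):
    """Compare a least-squares trend with the scatter of the series.

    Returns (slope, change over the period, standard deviation, flag),
    where flag is True if |change| exceeds the standard deviation
    (an ASSUMED, deliberately simple detectability criterion).
    Requires at least two valid points with distinct times.
    """
    pairs = [(t, v) for t, v in zip(times, values) if v is not None]
    ts = [t for t, _ in pairs]
    vs = [v for _, v in pairs]
    mean_t = statistics.fmean(ts)
    mean_v = statistics.fmean(vs)
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in pairs)
    slope = sxy / sxx
    change = slope * (max(ts) - min(ts))
    noise = statistics.stdev(vs)
    return slope, change, noise, abs(change) > noise
```

Applied to one height level at a time, a False flag corresponds to the situation described above, in which the tendency stays within the variability of the analyzed parameter.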

Conclusions
The active sounders are advantageous for atmospheric and climate studies because they provide vertically resolved atmospheric parameters with high accuracy. For continuity of climate studies and monitoring of global changes, it is essential to understand the differences between spaceborne lidars operating at different wavelengths, flying in different orbits, and utilizing different observation geometries, receiving paths, and detectors. In this article, we addressed an intercomparison of the ALADIN and CALIOP lidars using their scattering ratio products (CALIPSO-GOCCP and Aeolus L2A, Prototype_v3.10) for the period of 28/06/2019−31/12/2019. Using the COSP2 lidar simulator coupled with output from the EAMv1 model and a horizontal cloud variability parameterization, we estimated a theoretically achievable agreement in cloud detection of 0.77±0.17 for these two instruments with their orbits, averaging, and observation geometry.
The spatial collocation criterion of 1° chosen in this work is based on the averaging distance of the Aeolus L2A Prototype_v3.10 data. The temporal collocation criterion of Δtime < 6h, in turn, is a tradeoff between the geographical coverage of the collocated profiles, their number, and the uniformity of the Δtime distribution throughout the globe.
With the named criteria, we found ~7.8E4 collocated nighttime profiles, which underwent a series of analyses summarized here. To simplify the comparison with CALIOP, we converted SR355 of ALADIN to SR532, and we discussed the sensitivity of the results to the conversion parameters.
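The collocation criteria named above can be expressed as a simple test on a pair of profiles. The sketch below assumes that Δdist is the great-circle angular separation in degrees and that profile times are given in hours on a common time axis; both choices, as well as the function names, are illustrative assumptions rather than the definition used to build the collocated database.

```python
import math

MAX_DIST_DEG = 1.0   # spatial criterion (from the text)
MAX_DT_HOURS = 6.0   # temporal criterion (from the text)


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle angular separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def is_collocated(profile_a, profile_b):
    """Check both criteria for two profiles given as (lat, lon, time_hours)."""
    lat_a, lon_a, t_a = profile_a
    lat_b, lon_b, t_b = profile_b
    close_in_space = angular_distance_deg(lat_a, lon_a, lat_b, lon_b) < MAX_DIST_DEG
    close_in_time = abs(t_a - t_b) < MAX_DT_HOURS
    return close_in_space and close_in_time
```

Relaxing either constant increases the number of matched pairs at the cost of comparing less similar scenes, which is the tradeoff described in the text.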
Overall, the SR product of ALADIN is characterized by a lower sensitivity to high clouds above ~7 km than that of CALIOP, which we explain by the lower SNR of ALADIN at these heights, due both to physical reasons (smaller backscatter at 355 nm) and to technical reasons (hot pixels, lower emission, and lower transmissivity of the receive path than planned). The higher sensitivity to lower clouds leads to prioritizing the lower cloud solutions over the higher ones in the case of a continuous cloud or a double layer. This skews the ALADIN cloud peak height in pairs of ALADIN/CALIOP profiles by ~0.5±0.4 km downwards. Interestingly, the agreement of PSC peak heights does not suffer from these effects. We explain this by the large vertical extent and the composition of PSCs, which make them a better target for ALADIN than the tropospheric clouds. In the cloud-free area, the agreement between the two instruments is good, indicating a low rate of noise-induced false detections for both instruments. Last, but not least, the temporal evolution of cloud agreement does not reveal any statistically significant change during the considered period.
This indicates that the effects of hot pixels, laser energy degradation, and receiving path degradation in ALADIN have been mitigated at least down to the uncertainties of the following cloud detection agreement values: 61±16%, 34±18%, 24±10%, 26±10%, and 22±12%, estimated at 0.75 km, 2.25 km, 6.75 km, 8.75 km, and 10.25 km, respectively. We believe that the provided collocated dataset will facilitate further analysis and improvement of the ALADIN L2A data.