Validating TROPOMI aerosol layer height retrievals with CALIOP data

The Tropospheric Monitoring Instrument’s (TROPOMI) level-2 aerosol layer height (ALH) product has now been released to the general public. This product is retrieved using TROPOMI’s measurements of the oxygen A-band, radiative transfer model (RTM) calculations augmented by neural networks, and an iterative optimal estimation technique. The TROPOMI ALH product will deliver aerosol layer height estimates for cloud-free scenes over the ocean and land that contain aerosols above a certain threshold of the measured UV absorbing index (UVAI). This paper provides background for the ALH product and explores its quality by comparing ALH estimates to similar quantities derived from a spaceborne lidar observing the same scene. The spaceborne lidar chosen for this study is the Cloud-Aerosol Lidar with Orthogonal Polarisation (CALIOP) on board the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) mission, which has flown in formation with NASA’s A-train constellation since 2006 and is a proven source of data for studying aerosol layer heights. The influence of the surface and clouds is discussed, and the aspects of the TROPOMI ALH algorithm that will require future development efforts are highlighted.

[...] CALIOP profiles over an extended period of several months, from May 2018 to March 2019, in order to draw conclusions on the accuracy of the TROPOMI aerosol layer height retrievals. The paper also discusses four selected cases in and around West Africa for a deeper analysis of the comparison with CALIOP data; West Africa was chosen as the study area because a significant majority of colocations between TROPOMI and CALIOP are concentrated around this region.

In Section 2 of this paper, we discuss the data and methods used; Section 2.1 describes the retrieval algorithm and highlights the diagnostic parameters available for assessing the product's quality. Following this, the comparison between CALIOP and TROPOMI estimates of aerosol height is presented in Section 3: Section 3.1 presents an overall analysis of a large number of TROPOMI-CALIOP colocations, and Section 3.2 discusses selected cases for a deeper dive into the TROPOMI product. The paper concludes with Section 4, which highlights important areas of potential improvement in the current TROPOMI aerosol layer height product.

TROPOMI aerosol layer height
The TROPOMI aerosol layer height product is derived from measurements of the oxygen A-band in the near infrared region between 758 nm and 770 nm. Within this spectral range, TROPOMI measures top of atmosphere radiances and solar irradiances with a spectral resolution between 0.34 nm and 0.35 nm and a spectral sampling of 0.126 nm. The retrieval algorithm exploits the absorption characteristics of molecular oxygen, which vary with the photon path length: the photon path length for an aerosol layer closer to the surface is longer, which appears as deeper oxygen absorption lines in the measured spectrum (see Figure 1 of Nanda et al. (2018a)).
The reported aerosol layer height is the height of a single aerosol layer for the entire atmospheric column within the scene measured by TROPOMI; in reality, however, there can be several cases where distinctly separated elevated and boundary layer aerosols are present in the same scene. In such cases, the retrieval algorithm is expected to retrieve an optical centroid pressure or height of the two (or more) aerosol layers, depending on the atmospheric level of the aerosol layer from which most of the photons are scattered back. For instance, if the elevated aerosol layer contributes significantly more than the boundary layer aerosols to the top of atmosphere measured spectra, the aerosol layer height retrieval algorithm is expected to retrieve values closer to the elevated layer.
The technique for retrieving aerosol layer height is based on optimal estimation (Rodgers, 2000), where an RTM that calculates the top of atmosphere oxygen A-band spectra is fitted to TROPOMI measured oxygen A-band spectra. The cost function that is minimised in this estimation step, χ², is defined as
$$\chi^2 = [y - F(x, b)]^T S_\epsilon^{-1} [y - F(x, b)] + (x - x_a)^T S_a^{-1} (x - x_a), \qquad (1)$$
where $y$ is the reflectance spectrum calculated from measured radiances and irradiances for the oxygen A-band, $F(x, b)$ is the modelled reflectance for input parameters $b$, of which the state vector $x$ containing aerosol layer height $z_{aer}$ and aerosol optical thickness $\tau$ is a part, $x_a$ is the a priori state vector, and $S_\epsilon^{-1}$ and $S_a^{-1}$ are the inverses of the measurement error covariance and the a priori error covariance matrices, respectively. Optimal estimation is an iterative process, requiring several iterations to minimise the cost function described in Equation 1. The approach is Gauss-Newton, with a maximum number of iterations set at 10. If the optimal estimation does not converge within these iterations, the aerosol layer height field in the final level-2 product is filled with a fill value. For a given measurement, optimal estimation is said to have converged to a final solution if the update to the state vector for the next iteration is less than the expected precision.
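The iteration described above can be illustrated with a minimal Python sketch. It is not the operational algorithm: `toy_forward` is a hypothetical three-point stand-in for the neural network RTM, the Jacobian is taken by finite differences, the covariance values are invented for the example, and "expected precision" is assumed here to mean the square root of the diagonal of the posterior covariance.

```python
import numpy as np

def toy_forward(x):
    """Hypothetical stand-in for F(x, b): maps (z_aer, tau) to three reflectances."""
    z_aer, tau = x
    return np.array([tau * np.exp(-0.3 * z_aer), tau * np.exp(-0.1 * z_aer), tau])

def jacobian(forward, x, eps=1e-6):
    """Forward finite-difference Jacobian K = dF/dx."""
    f0 = forward(x)
    K = np.zeros((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        K[:, j] = (forward(x + step) - f0) / eps
    return K

def chi_squared(y, forward, x, x_a, Se_inv, Sa_inv):
    """Cost function: measurement misfit term plus a priori term."""
    r = y - forward(x)
    d = x - x_a
    return float(r @ Se_inv @ r + d @ Sa_inv @ d)

def retrieve(y, forward, x_a, S_e, S_a, max_iter=10):
    """Gauss-Newton optimal estimation; returns (state, chi2, converged)."""
    Se_inv, Sa_inv = np.linalg.inv(S_e), np.linalg.inv(S_a)
    x = x_a.astype(float).copy()
    for _ in range(max_iter):
        K = jacobian(forward, x)
        S_hat = np.linalg.inv(K.T @ Se_inv @ K + Sa_inv)  # posterior covariance
        x_next = x_a + S_hat @ K.T @ Se_inv @ (y - forward(x) + K @ (x - x_a))
        update = np.abs(x_next - x)
        x = x_next
        # Assumed convergence test: update smaller than the expected precision.
        if np.all(update < np.sqrt(np.diag(S_hat))):
            return x, chi_squared(y, forward, x, x_a, Se_inv, Sa_inv), True
    # No convergence within max_iter: report fill values.
    return np.full_like(x, np.nan), np.nan, False

# Synthetic, noise-free example with invented numbers.
x_true = np.array([2.0, 1.0])   # (z_aer in km, tau)
x_a = np.array([1.0, 0.8])      # a priori state
S_e = 1e-6 * np.eye(3)          # measurement error covariance
S_a = np.diag([4.0, 1.0])       # a priori error covariance
x_hat, chi2, converged = retrieve(toy_forward(x_true), toy_forward, x_a, S_e, S_a)
```

With a loose a priori constraint and a small measurement error, as chosen here, the retrieved state is driven almost entirely by the measurement term of the cost function.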
The χ² is a measure of how close the modelled sun-normalised radiances are to the observations, with smaller values representing a better fit. The many assumptions in the model (described in Section 2.2 of Nanda et al. (2019)) result in a large χ² (of the order of 1E4 to 1E7), with larger χ² representing a larger departure between the model and the observation. There are several reasons for these departures, the more important ones being the presence of undetected clouds in the scene, incorrect surface reflectance information, and multiple aerosol layers. These attributes are not parameterised in the RTM and can be a source of discrepancies between the measured and the modelled reflectances. The RTM in this case is a neural network model that has learned parts of a full-physics RTM derived from de Haan et al. (1987), described in Nanda [...]
In contrast to TROPOMI's ALH product, which is reported at 7.2 km × 3.6 km until 6 August 2019 and at 5.6 km × 3.6 km thereafter, the LER database is much coarser spatially.
This can lead to several artefacts in the final product, discussed further in Section 3.2. Another issue to note is the influence of bright surfaces on the retrieval. The oxygen A-band lies beyond the red edge, a wavelength region in which vegetation has high reflectance values. This poses several challenges; a significant portion of the measured signal over land might be contributed by the surface reflectance (see Figure 3 from Nanda et al. (2018a)). If the aerosol optical thickness of the measured scene is low, the contribution of the surface to the top of atmosphere radiance dominates over the contribution from scattering by aerosols; that is, more photons are scattered back from the surface than from the aerosol layer. In such cases, the retrieval algorithm will tend to retrieve an aerosol layer closer to the surface. Generally we find that, if the contribution to the top of atmosphere reflectance from aerosols is significantly larger than that from the surface (i.e. the aerosol layer appears brighter than the surface), the retrieval algorithm will tend to retrieve a height closer to the aerosol layer (Section 5.2 and Figure 10 from Nanda et al. (2018b) discuss this observation explicitly).
The forward model parameterises aerosols with a Henyey-Greenstein scattering phase function (Henyey and Greenstein, 1941) with an asymmetry factor of 0.7, a single scattering albedo of 0.95, and a fixed aerosol optical thickness for an aerosol layer parameterised by a single atmospheric layer with a 50 hPa thickness. The algorithm assumes a single aerosol layer for the entire atmosphere, which is an important simplification to note when comparing with CALIOP profiles, which have the capability to detect multiple aerosol layers.
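For readers unfamiliar with the Henyey-Greenstein parameterisation, the following Python sketch evaluates the phase function for the asymmetry factor of 0.7 quoted above and checks numerically that it is normalised and that its first moment returns the asymmetry factor. The normalisation convention (unit average over the sphere) and the midpoint quadrature are choices made for this illustration only.

```python
import numpy as np

def henyey_greenstein(cos_theta, g=0.7):
    """Henyey-Greenstein phase function, normalised so that
    0.5 * integral of P over cos(theta) in [-1, 1] equals 1."""
    cos_theta = np.asarray(cos_theta, dtype=float)
    return (1.0 - g**2) / (1.0 + g**2 - 2.0 * g * cos_theta) ** 1.5

# Midpoint quadrature over cos(theta) in [-1, 1].
n = 200_000
width = 2.0 / n
mu = -1.0 + (np.arange(n) + 0.5) * width
p = henyey_greenstein(mu)

norm = 0.5 * np.sum(p) * width        # should be close to 1
asym = 0.5 * np.sum(mu * p) * width   # should be close to g = 0.7
```

The strong forward peak (large values near cos(theta) = 1) is what an asymmetry factor of 0.7 implies for the assumed aerosol layer.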
Finally, the ALH retrieval algorithm implements a pixel selection scheme before committing to retrieving ALH estimates.
This pixel selection scheme involves auxiliary data products from TROPOMI such as the UVAI (www.tropomi.eu/document/atbd-uv-aerosol-index) and cloud fraction estimates from the TROPOMI Fast Retrieval Scheme for Clouds from Oxygen absorption bands (FRESCO) algorithm (Wang et al., 2008), and the cirrus reflectances derived from the Visible Infrared Imaging Radiometer Suite (VIIRS) on the Suomi National Polar-Orbiting Partnership (Suomi NPP) satellite.
1. The maximum solar zenith angle allowed is 75°. If the pixel does not meet this criterion, it is removed from the processing and a flag is raised.
2. If the pixel over water lies in the sun-glint region (a maximum sun-glint angle of 18°), it is processed but a sun glint warning flag is recorded in the level-2 product.
3. If the standard deviation of the surface elevation within the pixel is beyond 1000 m, the pixel is not processed and a flag is raised. If it is beyond 300 m, a warning flag is raised and the pixel is processed.
4. If the surface covered by the pixel comprises both land and water, a warning indicating mixed surface type is raised and the pixel is processed regardless.

5. If the pixel contains snow or ice, the pixel is not processed and a flag is raised.
6. If the TROPOMI level-2 UV Absorbing Index product reports a value below 0.0, the pixel is not processed and a flag is raised. If the value is less than 1.0, a low UVAI flag is raised.
7. If the reported cloud fraction value from the TROPOMI FRESCO product for the pixel is beyond 0.6, the pixel is not processed and a flag is raised.
8. If the VIIRS average cirrus reflectance for the pixel is beyond 0.4, the pixel is not processed and a flag is raised. If it is beyond 0.01, a warning for possible cirrus clouds is indicated.
9. If the difference between the scene albedo (calculated using a look-up table) from the level-2 UVAI product and the surface albedo from the Tilstra et al. (2017) database at 380 nm is beyond 0.4, the pixel is removed from the processing pool and a flag is raised for possible cloud contamination. If this value is beyond 0.2, a warning flag is raised.

10. The nominal TROPOMI pixels also contain radiances at a sub-pixel level, which are called small pixel radiances. If the standard deviation of the small pixel radiances is larger than 1E-7, the scene is deemed to be non-homogeneous (possibly containing clouds) and it is removed from the processing pool.
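The ten criteria listed above can be summarised in a short Python sketch. The thresholds are taken from the list; the `Pixel` field names, the flag names, and the treatment of the albedo difference as a single pre-computed number are hypothetical choices for this illustration and do not reflect the operational processor.

```python
from dataclasses import dataclass, replace

@dataclass
class Pixel:
    """Hypothetical container for the auxiliary inputs of one ground pixel."""
    solar_zenith_deg: float
    over_water: bool
    sun_glint_angle_deg: float
    elevation_std_m: float
    mixed_surface: bool
    snow_or_ice: bool
    uvai: float
    cloud_fraction: float
    cirrus_reflectance: float
    albedo_difference: float   # scene albedo minus surface albedo at 380 nm
    small_pixel_std: float

def screen_pixel(p):
    """Return (process, errors, warnings) following criteria 1 to 10."""
    errors, warnings = [], []
    if p.solar_zenith_deg > 75.0:
        errors.append("solar_zenith_angle")
    if p.over_water and p.sun_glint_angle_deg <= 18.0:
        warnings.append("sun_glint")
    if p.elevation_std_m > 1000.0:
        errors.append("surface_roughness")
    elif p.elevation_std_m > 300.0:
        warnings.append("surface_roughness")
    if p.mixed_surface:
        warnings.append("mixed_surface")
    if p.snow_or_ice:
        errors.append("snow_ice")
    if p.uvai < 0.0:
        errors.append("uvai")
    elif p.uvai < 1.0:
        warnings.append("low_uvai")
    if p.cloud_fraction > 0.6:
        errors.append("cloud_fraction")
    if p.cirrus_reflectance > 0.4:
        errors.append("cirrus")
    elif p.cirrus_reflectance > 0.01:
        warnings.append("cirrus")
    if p.albedo_difference > 0.4:
        errors.append("albedo_difference")
    elif p.albedo_difference > 0.2:
        warnings.append("albedo_difference")
    if p.small_pixel_std > 1e-7:
        errors.append("inhomogeneous_scene")
    return (not errors), errors, warnings

# Two invented pixels: a clear dust scene and a cloudy, low-UVAI scene.
clean = Pixel(40.0, True, 40.0, 0.0, False, False, 2.5, 0.05, 0.0, 0.05, 1e-9)
cloudy = replace(clean, cloud_fraction=0.8, uvai=0.5)
```

Separating hard rejections from warnings mirrors the list: a pixel is processed as long as no rejection criterion is met, and warnings are carried along for later filtering.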
These relevant flags are reported in Table 1 and are available in the level-2 data products; the value of each flag can be extracted for each pixel with a bitwise-and operation between the pixel's processing quality flags and the value of that flag. For cloud filtering, the cloud_warning flag is the preferred flag for removing possibly cloudy pixels. This flag is a combination of FRESCO cloud fraction retrievals, VIIRS cirrus reflectance retrievals and the difference between the surface albedo and the scene albedo at 380 nm. An example of applying the cloud_warning flag to filter out possibly cloudy pixels is provided in Figure 1.
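A bitwise-and test of this kind might look as follows in Python. The bit values below are placeholders: the actual values of the cloud_warning and sun glint flags must be taken from Table 1, and the array names are hypothetical.

```python
import numpy as np

# Placeholder bit masks; the real values are listed in Table 1.
SUN_GLINT_WARNING = 0x01
CLOUD_WARNING = 0x02

def flag_is_set(processing_quality_flags, mask):
    """True where the given flag bit is set in the per-pixel flag word."""
    return (np.asarray(processing_quality_flags) & mask) != 0

# Invented flag words and ALH values for four pixels.
pqf = np.array([0, 1, 2, 3], dtype=np.uint32)
alh = np.array([1.2, 0.8, 3.5, 2.1])

keep = ~flag_is_set(pqf, CLOUD_WARNING)   # drop possibly cloudy pixels
alh_cloud_screened = alh[keep]
```

Because each flag occupies its own bit, several conditions can be combined by or-ing the individual boolean masks before indexing.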

CALIOP weighted extinction height
The Cloud-Aerosol Lidar with Orthogonal Polarisation (CALIOP) instrument is a part of the payload for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) mission (Winker et al., 2009), which orbits the Earth in a sun-synchronous orbit. The CALIOP instrument has three backscatter receiver channels: two channels for the orthogonal measurement of the received backscatter signal at 532 nm and one channel for backscatter at 1064 nm. Lidar profiles from the CALIPSO mission are a good source of data for validating retrieved aerosol layer heights from TROPOMI because of their ability to map the vertical structure of the atmosphere. The data from the CALIOP instrument relevant for validating TROPOMI ALH are the level-1 backscatter profiles and the level-2 aerosol extinction profiles.
In this paper, the level-1 total backscatter profiles from the 532 nm channel are used as curtain plots to visualise the vertical structure of the atmosphere. Level-2 aerosol extinction profiles from the 532 nm channel are then used to compute an aerosol weighted extinction height ALH_ext, following the definition provided by Equation 1 in Koffi et al. (2012),
$$\mathrm{ALH}_{\mathrm{ext}} = \frac{\sum_{i} \beta_{\mathrm{ext},i} \, Z_i}{\sum_{i} \beta_{\mathrm{ext},i}}, \qquad (2)$$
where $Z_i$ is the height above sea level of the $i$-th lidar vertical level (in km), and $\beta_{\mathrm{ext},i}$ is the aerosol extinction coefficient (in km$^{-1}$) at the same level. The level-2 aerosol extinction profile product from CALIOP only includes atmospheric levels where aerosols are detected. When aerosols are present over clouds, ALH_ext will be situated at the centre of the aerosol layer, and any undetected aerosol layers below the cloud layer are not included in the calculation because the signal is attenuated beyond the cloud layer. This is an important detail, as the TROPOMI ALH algorithm cannot separate cloud and aerosol signals in the measured radiances, and cloud contamination will affect the retrieved product.
In this paper, the [...] (Figure 2b). The contrast between retrievals over land and ocean is apparent in Figure 3 (cloudy scenes filtered out using the cloud_warning flag), with a majority of the negative differences with values lower than -2 km occurring over land.
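As an illustration of the weighted extinction height of Equation 2 (the extinction-weighted mean of the level heights), a minimal Python sketch is given below. Treating non-finite or non-positive extinction values as levels without detected aerosol is an assumption made for this example, as are the function and variable names.

```python
import numpy as np

def weighted_extinction_height(z_km, beta_ext):
    """Extinction-weighted mean height (km) of a single lidar profile.

    z_km     : level heights above sea level (km)
    beta_ext : aerosol extinction coefficients at those levels (km^-1)
    Levels with non-finite or non-positive extinction are skipped
    (assumed to mark levels where no aerosol was detected).
    """
    z = np.asarray(z_km, dtype=float)
    beta = np.asarray(beta_ext, dtype=float)
    valid = np.isfinite(beta) & (beta > 0.0)
    if not valid.any():
        return np.nan
    return float(np.sum(beta[valid] * z[valid]) / np.sum(beta[valid]))

# Invented profile: aerosol detected only at 2 km and 3 km.
alh_ext = weighted_extinction_height([1.0, 2.0, 3.0, 4.0], [0.0, 0.1, 0.1, np.nan])
```

For a single homogeneous layer this height sits at the layer centre; for two separated layers it falls between them, weighted towards the optically thicker one.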
From Figure 2a, it is immediately clear that the CALIOP ALH_ext values are generally higher than the TROPOMI ALH. With an average difference of -2.25 km, a median difference of -1.62 km and a standard deviation of 3.83 km, the retrieved ALH from TROPOMI over land is systematically closer to the surface relative to CALIOP ALH_ext than retrievals over the ocean, which have a mean difference of -0.41 km, a median difference of -0.29 km and a very high standard deviation of 6.86 km. There are several cases over the ocean where TROPOMI ALH is significantly higher than the CALIOP ALH_ext, which could be due to cloud contamination. The comparison of the cloud-screened retrievals (Figure 2b) reveals that the retrieved ALH from TROPOMI over the ocean differs from CALIOP ALH_ext by -1.03 km on average, with a median difference of -0.76 km and a standard deviation of 1.97 km. More than 50% of the TROPOMI ALH retrievals over the ocean have an absolute difference with ALH_ext of less than 1.0 km. Retrievals over land have a larger difference, with -2.41 km on average and a median of -1.75 km. The results are very skewed over land, with very large negative values dictating the average; this is indicated by the large standard deviation of 3.56 km. 50% of the selected colocations over land have an absolute difference with ALH_ext of less than approximately 1.8 km.
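The summary statistics quoted above (mean, median, standard deviation, and the fraction of colocations within a given absolute difference) can be computed as in the following Python sketch. The sign convention, TROPOMI minus CALIOP, is inferred from the negative values reported when CALIOP is higher; the function name and the input numbers are invented for illustration.

```python
import numpy as np

def difference_statistics(tropomi_alh, caliop_alh_ext, threshold_km=1.0):
    """Statistics of (TROPOMI ALH - CALIOP ALH_ext) for colocated pairs."""
    diff = np.asarray(tropomi_alh, dtype=float) - np.asarray(caliop_alh_ext, dtype=float)
    return {
        "mean": float(np.mean(diff)),
        "median": float(np.median(diff)),
        "std": float(np.std(diff)),
        # Share of pairs whose absolute difference is below the threshold.
        "fraction_within": float(np.mean(np.abs(diff) < threshold_km)),
    }

# Invented colocated heights in km.
stats = difference_statistics([1.0, 1.0, 2.5, 2.5], [2.0, 3.0, 2.0, 3.0])
```

Reporting the median alongside the mean is useful for skewed distributions such as the land cases, where a few very large negative differences dominate the average.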
The distribution of the differences between TROPOMI ALH and CALIOP ALH_ext as a function of the retrieved UVAI (Figure 4a) shows that for most cases, the UVAI is below 2.0. The spread of the differences in this UVAI regime is large and reduces as the UVAI increases. The differences seem to be less often positive as the UVAI increases; compared with the behaviour observed between Figure 2a and Figure 2b, where a majority of the positive differences vanish once the data are cloud screened, such behaviour could be related to clouds. The distribution of the differences as a function of retrieved AOT in Figure 4b shows that the majority of the colocations have AOT values between 0 and 2. Finally, the distribution of these differences as a function of the GOME-2 LER values used for the retrievals over land shows that the retrievals tend to have a lower difference as the LER value increases. This could be a consequence of the fact that so few retrievals converge in high LER regimes: unless the aerosol layer has a significant contribution to the measured top of atmosphere radiance in comparison to the surface, the retrievals tend to fail.
Retrieved ALH over land (if successful) can be closer to the surface than where the aerosol layer actually is situated vertically.
The TROPOMI ALH product, unlike the CALIOP ALH_ext, which only considers aerosol signatures in the recorded backscatter profile, is also influenced by the presence of undetected clouds. These are some of the several possible sources of departures between the observations of CALIOP and TROPOMI over the same scene, which are further explored in the following sections.

Selected cases
The analysis presented in the previous section alone is insufficient to fully quantify the quality of the retrieved TROPOMI [...]
It is important to note that spaceborne lidars, while having the advantage of being able to map more than one vertical layer in the atmosphere, suffer from attenuation of the signal in the presence of strongly backscattering species such as clouds or aerosols with a large optical depth. In the presence of a primary strongly backscattering aerosol layer, the attenuation of the signal may lead to undetected secondary aerosol layers beneath the primary layer. These layers, not apparent in the CALIOP curtain plots of the measured attenuated backscatter profiles, may be detected by the level-2 aerosol extinction profile product from the CALIOP mission, using the formula described in Equation 2. Some of these situations are observed in the CALIOP curtain plots of the selected cases in Figure 6, especially for cases a and b, where the attenuated signal does not detect possibly lower aerosol or cloud layers, and in case d, where the attenuation of the signal due to a thick aerosol plume can hide the surface from the received backscatter signal. TROPOMI, on the other hand, will tend to report an aerosol layer height between these two layers, as it will be influenced by photons scattered back from both layers.

The retrieved TROPOMI ALH in Figure 5 (4th column) represents successful retrievals for each of the selected cases. Beyond the sun glint warning, the cloud_warning flag in Table 1 is applied to remove possibly cloud-contaminated data. The retrieved aerosol optical thickness (AOT), which is a part of the state vector, is plotted for each of the scenes over the VIIRS image of the scene in Figure 5 (3rd column). The retrieved AOT (τ_aer) can act as a diagnostic tool to indicate the influence of the surface (over bright surfaces) or the presence of undetected clouds (both over bright and dark surfaces); in these cases, the retrieved AOT of the scene can be uncharacteristically high, with values much greater than 3.0. All retrieved TROPOMI AOT values beyond 5.0 are discarded, as the neural network forward models are trained with AOT values less than or equal to 5.0.
A visual inspection of the panels in Figure 5 shows that the retrieved UVAI, AOT and ALH need not be spatially correlated, as they are separate properties of the observed aerosol plumes; for instance, if the retrieved UVAI and AOT are low (case c), the retrieved ALH need not necessarily be low. An inspection of the plots of the retrieved AOT for cases c (between latitudes 10° and 15° and longitude -20°) and d reveals square structures, both over the ocean and land. These square-shaped spatial artefacts are the surface albedo grids derived from the database provided by Tilstra et al. (2017), which is the current source for surface reflectance in the ALH retrieval algorithm. In cases such as case c, the retrieved AOT contains surface information influenced by the assumed albedo in the database. These spatial features are not as apparent in cases a and b (Figure 5, 1st and 2nd rows), as a majority of the signal in the measured top of atmosphere radiances comes from aerosols and the minority from the surface. Another major observation is the lack of retrievals over the desert. This is within expectation, as measurements of the top of atmosphere radiances over a cloud-free desert scene tend to contain more photons scattered back from the surface than from the aerosol layer. As a result, retrievals over bright scenes are sensitive to the assumed errors in the surface albedo, thereby reducing sensitivity to the assumed aerosol layer height (Sanders et al. (2015), Section 2, Figure 2).
While scenes not contaminated with clouds show a smooth spatial distribution of the retrieved ALH, the presence of clouds may or may not add spatial variability in the ALH product. For instance, the presence of low clouds is clear in case b (Figure 5b) beyond latitude 21.0°, but the retrieved ALH is spatially homogeneous with values less than 1.0 km. For each of the selected cases, colocated CALIOP profiles in Figure 6 give additional information about the scene. These TROPOMI-CALIOP colocations are done via the method discussed in Appendix A. The CALIOP curtain plot for case b reveals the influence of low clouds as well as high clouds on the cloud-screened ALH. An example of a cloud-contaminated, heterogeneous vertical distribution of TROPOMI ALH in Figure 6a can be observed between latitudes 9.5° and 11.0°. The cloud filtering following the cloud_warning flag in Table 1 does not detect these low clouds (for instance above latitude 21.50°, see Figure 6a, b). These are manually removed for the comparison further on.
From Figure 2b, TROPOMI retrievals of ALH over bright surfaces are expected to differ from CALIOP ALH_ext, with the TROPOMI ALH product reporting ALH estimates closer to the surface than CALIOP. This is observed in case d (Figure 5, bottom row), where the CALIOP curtain plot (Figure 6d) indicates that the plume is close to the surface, with a maximum height less than 3 km; TROPOMI ALH for the biomass burning aerosol plume that extends from land to the ocean is slightly closer to the surface over land when compared to CALIOP ALH_ext, whereas over the ocean both height estimates are more or less in agreement.

For cases a and b, retrieved TROPOMI ALH does not seem to coincide with large values of the received backscatter signal in the level-1 data, whereas it does for case c, and to a certain extent for case d (over land it tends to be closer to the surface).
Parts of the CALIOP curtain plots for cases a, b and c suggest either a possible second layer beneath the layer that is visually obvious, or that the desert dust layer extends deeper towards the surface and the CALIOP signal is simply too attenuated to detect it. These features are to be kept in mind before proceeding to a direct comparison of the CALIOP ALH_ext and TROPOMI ALH of these selected cases (Figure 7). For this comparison, all cloud-filtered and sun-glint-filtered TROPOMI pixels with ALH information colocated to a specific CALIOP level-2 aerosol extinction profile are averaged, and a standard deviation is also computed. These averaged TROPOMI ALH values are then compared to the CALIOP ALH_ext. What is immediately apparent is that, while there seems to be an agreement between the two heights (indicated by a Pearson correlation coefficient of 0.64 and a slope of fit of 1.0), CALIOP ALH_ext is systematically higher than TROPOMI ALH (indicated by the y-intercept of the fit at 0.53 km), and this holds for most of the colocations.
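The averaging and fitting steps described above might be sketched in Python as follows. The sketch assumes that each filtered TROPOMI pixel carries the index of the CALIOP profile it is colocated with, and that the fit regresses CALIOP ALH_ext (y) on the averaged TROPOMI ALH (x), which is consistent with a positive y-intercept when CALIOP is higher; the function names and sample values are invented.

```python
import numpy as np

def average_per_profile(profile_index, tropomi_alh):
    """Mean and standard deviation of TROPOMI ALH per colocated CALIOP profile."""
    idx = np.asarray(profile_index)
    alh = np.asarray(tropomi_alh, dtype=float)
    profiles = np.unique(idx)
    mean = np.array([alh[idx == p].mean() for p in profiles])
    std = np.array([alh[idx == p].std() for p in profiles])
    return profiles, mean, std

def linear_comparison(tropomi_mean, caliop_alh_ext):
    """Least-squares line CALIOP = slope * TROPOMI + intercept, plus Pearson r."""
    slope, intercept = np.polyfit(tropomi_mean, caliop_alh_ext, 1)
    r = np.corrcoef(tropomi_mean, caliop_alh_ext)[0, 1]
    return float(slope), float(intercept), float(r)

# Invented example: six TROPOMI pixels colocated to three CALIOP profiles.
profile_index = [0, 0, 1, 1, 2, 2]
tropomi_alh = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
caliop_alh_ext = [2.0, 3.0, 4.0]

profiles, alh_mean, alh_std = average_per_profile(profile_index, tropomi_alh)
slope, intercept, r = linear_comparison(alh_mean, caliop_alh_ext)
```

In this synthetic case CALIOP is uniformly 0.5 km higher than the averaged TROPOMI heights, so the offset appears entirely in the intercept while the slope stays at one.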

Discussion and conclusion
This paper discusses the quality of the soon-to-be-released TROPOMI ALH product by comparing it with colocated CALIOP measurements of scenes containing aerosols. To do so, CALIOP weighted extinction heights from the 532 nm channel were calculated following Equation 2 and then directly compared to TROPOMI ALH. In addition, four individual cases of Saharan desert dust and biomass burning aerosol events in 2018 were selected for a deeper analysis of the product's quality.
From the analysis presented in this paper, TROPOMI's neural network ALH retrieval algorithm retrieves ALH values that compare well with CALIOP weighted extinction heights in cases that are cloud screened using the TROPOMI ALH level-2 processing quality flags listed in Table 1. For more than 1 million colocations between CALIOP and TROPOMI over the ocean, the TROPOMI ALH differs from CALIOP ALH_ext by approximately -1 km on average, with a median of -0.76 km, the TROPOMI ALH values being lower than the CALIOP ALH_ext. Over land, the same values are -2.41 km on average and -1.75 km as the median. For the selected cases, largely over the ocean with a portion of the data over land, the averaged retrieved ALH from TROPOMI differed from CALIOP ALH_ext by 0.53 km, with CALIOP ALH_ext being higher than TROPOMI ALH. These numbers indicate that TROPOMI ALH performs well, especially considering the many simplifications made by the retrieval algorithm in order to optimise computational speed; future improvements to the forward model can only improve the product further.

There is a clear distinction between TROPOMI ALH retrievals over land and over the ocean, as photons scattered back from bright surfaces tend to pull ALH estimates towards the surface and away from an elevated aerosol layer. Retrieved ALH over land, if successful, can be closer to the surface if the measured signal at the top of atmosphere contains more photons scattered back from the deepest atmospheric layer, which is the surface, than from elevated aerosol layers higher up in the atmosphere. This, however, can change depending on the amount of aerosol information available in the spectrum compared to that from the surface. Attempts at retrieving ALH over the desert generally fail, with very few exceptions. Several challenges remain that will need further development.
The TROPOMI level-2 UVAI product is currently an ingredient in selecting pixels containing aerosols for retrieving ALH.
While this choice works quite well for cloud-free scenarios, it does not perform well when a scene contains both aerosols and clouds. These cloudy scenes seem not to be detected by the current cloud filtering scheme in the level-2 algorithm, which will require a significant update in deciding whether a pixel is cloudy or not. For scenes with a low aerosol load, square-shaped artefacts appear, resulting from a surface albedo database with a resolution significantly lower than that of TROPOMI.
Currently, the GOME-2 surface LER product derived from Tilstra et al. (2017) is used operationally, and will eventually need to be updated with a higher resolution version possibly derived from TROPOMI itself.
Finally, space-based lidars such as the CALIOP instrument on board the CALIPSO mission are a very good source of aerosol vertical information to validate the TROPOMI ALH product. While the CALIOP level-1 backscatter profiles may be attenuated in cases of very strong signals from the top of the aerosol layer, the weighted extinction heights in conjunction with the backscatter profiles are sufficient for validation activities. These CALIOP profiles will be very important in assessing the impact of future development activities of the TROPOMI ALH product.
Appendix A: Colocation

The colocation between TROPOMI and CALIOP ground pixels is done in the following manner. First, the geographic coordinates of CALIOP level 1 backscatter profiles and level 2 aerosol extinction profiles are converted into the Cartesian coordinate system. These CALIOP coordinates are fed into a k-dimensional tree, which is a fast algorithm developed by Maneewong-