A Bayesian parametric approach to the retrieval of the atmospheric number size distribution from lidar data
- 1Dipartimento di Matematica, Università di Genova, Genova, Italy
- 2Dipartimento di Fisica, Università di Napoli Federico II, Napoli, Italy
- 3CNR-IMAA, Potenza, Italy
- 4CNR-SPIN, Napoli, Italy
- 5ALA Srl Advanced Lidar Applications, Napoli, Italy
Correspondence: Alberto Sorrentino (firstname.lastname@example.org)
We consider the problem of reconstructing the number size distribution (or particle size distribution) in the atmosphere from lidar measurements of the extinction and backscattering coefficients. We assume that the number size distribution can be modeled as a superposition of log-normal distributions, each one defined by three parameters: mode, width and height. We use a Bayesian model and a Monte Carlo algorithm to estimate these parameters. We test the developed method on synthetic data generated by distributions containing one or two modes and perturbed by Gaussian noise as well as on three datasets obtained from AERONET. We show that the proposed algorithm provides good results when the right number of modes is selected. In general, an overestimate of the number of modes provides better results than an underestimate. In all cases, the PM1, PM2.5 and PM10 concentrations are reconstructed with tolerable deviations.
Lidar (light detection and ranging) is a remote sensing technique similar to radar (radio detecting and ranging) which uses light in the form of short laser pulses to invest a target and obtain, through elastic and inelastic scattering processes, information on the target properties as a function of the distance from the lidar system.
In the atmospheric application, lidar systems can be used to obtain spatially resolved information about the optical properties of the atmospheric aerosols (desert dust, volcanic ash, smog and many other types of substances) (Giannakaki et al., 2010; Ritter et al., 2018; Lee and Wong, 2018; Stelitano et al., 2019; Chazette, 2020) over a distance of several kilometers and with high spatial and temporal resolutions. Advanced, multiwavelength lidar systems are capable of giving information on the spectral dependence of the aerosol optical properties and can therefore be used to obtain information on the aerosol microphysical properties (Weitkamp, 2006; Pérez-Ramírez et al., 2013; Granados-Muñoz et al., 2014; Giannakaki et al., 2016; Müller et al., 2016; Chemyakin et al., 2016; Ortiz-Amezcua et al., 2017; Molero et al., 2020).
However, information on the microphysical properties of the atmospheric aerosols is seldom obtained using the lidar signal alone. This information, which is essential for a complete aerosol characterization useful to understand their effect on climate, is instead frequently obtained through the synergistic use of in situ instruments; incidentally these measurements also allow a validation of the lidar retrievals, but only for those values that are closest to the ground and for a particular aerosol typology (Saharan dust, biomass burning aerosol, etc.); alternatively, validation can be done using synthetic data (Alados-Arboledas et al., 2011; Di Girolamo et al., 2012; Veselovskii et al., 2013; Osterloh et al., 2013; Samaras et al., 2015; Whiteman et al., 2018).
In order to retrieve the microphysical properties of the aerosol from lidar measurements, two inverse problems must be solved in sequence: in the first inverse problem, one uses the measured backscattered power to obtain an estimate of the aerosol optical parameters; in the second problem, one uses the estimated optical parameters (at different wavelengths), derived from lidar observations, to obtain an estimate of the number size distribution, i.e., the density of particles as a function of the particle size. This latter problem is particularly challenging because of the limited number of data due to the practical problem of measuring at many different wavelengths. Moreover, from a mathematical point of view, the microphysical parameters are generally derived from the optical ones through integral equations that cannot be solved analytically and whose numerical solution leads to a so-called ill-posed problem. This last is characterized by a strong sensitivity of the solution from the input data uncertainties and by the non-uniqueness of the solution. Remarkably, from a mathematical viewpoint there can be several ways to overcome ill-posedness; however, not all of them actually reflect realistic physical conditions. In addition, numerical studies showed that a poor selection of the constraints can affect the quality of the solution and compromise the microphysical retrieval in spite of the strength of the regularization algorithm. Therefore, in order to obtain stable and physically acceptable solutions, mathematical and physical constraints variously combined with regularization methods are applied (Müller et al., 1999; Böckmann, 2001; Veselovskii et al., 2002; Böckmann et al., 2005; Kolgotin and Müller, 2008).
In the past decade, a number of studies have focused on the retrieval of the microphysical aerosol parameters from multiwavelength lidar measurements using the standard “3β+2α” configuration (i.e., the measurement of the backscattering coefficient at three wavelengths and the extinction coefficient at two wavelengths) (Müller et al., 1998, 2001; Wandinger et al., 2002; Müller et al., 2003; Murayama et al., 2004; Müller et al., 2006; Tesche et al., 2008; Noh et al., 2009; Balis et al., 2010; Alados-Arboledas et al., 2011; Navas-Guzman et al., 2013; Nicolae et al., 2013; Sawamura et al., 2014; Chemyakin et al., 2014; Burton et al., 2016; Tesche et al., 2019; Pérez-Ramírez et al., 2019, 2020; McLean et al., 2021). Most of these studies refer mainly to data from ground-based elastic and Raman lidars working at 355, 532 and 1064 nm. However, the validation of such retrievals is a challenging task due to the unavailability of direct collocated measurements.
Other studies have compared the “3β+2α” lidar retrievals to the AERONET (Aerosol Robotic Network) retrievals (Holben et al., 1998; Veselovskii et al., 2009; Sawamura et al., 2014). Nevertheless, AERONET retrieval of columnar volume concentrations, assuming the aerosol is uniformly mixed throughout the boundary layer, is not directly comparable to vertically resolved lidar results. Intercomparison studies between “3β+2α” lidar retrievals and AERONET retrievals or ground-based in situ measurements are not suitable to properly evaluate the performance of the lidar microphysical retrievals obtained for different altitudes.
Despite these difficulties, the possibility of characterizing the atmospheric particulate using only the lidar instrument would be very advantageous, and for these reasons it is currently a much studied topic (Di et al., 2018a; Kolgotin et al., 2016; De Rosa et al., 2020).
Following the state of the art, we retrieve the particle size distribution from “3β+2α” lidar system parameters. In order to mitigate the ill-posedness, we adopt a parametric model for the number size distribution. Several authors referred to this shape as triangular, such as in (Veselovskii et al., 2004), defined on equidistant or a logarithmic-equidistant grid for the effective radius of the particles. In agreement with Di et al. (2018b) and Sun et al. (2013) and with the standard AERONET inversion procedure (Dubovik et al., 2006), we find it more viable to work with log-normal distributions. In particular we reconstruct the particle size distribution as a superposition of a small number of log-normal distributions on a logarithmic interval of particle radius. In practice, this entails that the number of unknowns to be estimated is either three, in the unimodal case, or six, in the bimodal case; the problem is therefore overdetermined in the unimodal case and underdetermined in the bimodal case. In both cases, it is very useful to have the possibility of exploiting any information one might have a priori, for instance on the plausible interval range for the values of these unknowns; to this aim, we set up a Bayesian model where such a priori information can be coded in proper prior distributions, and we then use uniform priors in selected intervals. Then, because of the non-linearity of the problem, we adopt a Markov Chain Monte Carlo algorithm (Metropolis–Hastings) to approximate the posterior distribution of the parameters of interest. Monte Carlo methods have long been considered to be of little practical use due to their high computational cost; however, the steady growth of available computational power that characterized the last decades has made them largely usable in the applications. We reckon the proposed method features three main advantages with respect to the state of the art: (i) because it is based on a Bayesian model, it naturally provides uncertainty quantification on the estimated parameters, which is not always the case for competitors; (ii) because it makes use of a Monte Carlo algorithm, it does not get stuck in local minima like deterministic optimization algorithms often do; and (iii) for the same reason it can be easily generalized to include, for instance, a non-Gaussian distribution of the noise term.
This article is organized as follows: in the “Methods” section, we provide the mathematical formulation of the problem and a description of the Monte Carlo algorithm; in the following section, we analyze the results obtained for synthetic data using five exemplar cases; in the final section we briefly summarize our conclusions.
Lidar instruments measure the backscattered light power at wavelength λ, coming from distance z, given by the following equation:
where α(λ,z) and β(λ,z) are the extinction and backscattering coefficients, respectively, and C is a constant that depends on the instrument characteristics. Equation (1) can be looked at as an inverse problem, where one aims at recovering the extinction and backscattering coefficients from lidar data. By solving the inverse problem, (see Ansmann et al., 1990, 1992; Shcherbakov, 2007; Pornsawad et al., 2008; Garbarino et al., 2016; Denevi et al., 2017), one obtains extinction and backscattering coefficients at every altitude z for a usually small set of λ. The present article is concerned with the subsequent problem of estimating the number size distribution from these indirect measures of α(λ,z) and β(λ,z). In the following, we omit the dependence on z because the problem can be solved independently at different altitudes.
2.1 Definition of the problem
The extinction and backscattering coefficients carry information on the number size distribution through the Mie scattering theory. Specifically, let n(r) be the number size distribution; then, under the approximation of spherical homogeneous particles, the extinction and backscattering coefficients are given by
where ra and rb are the lower and upper bounds for the particles' size, m is the complex refractive index (CRI) of the target atmosphere, and are integral kernels defined as follows.
where an and bn are defined as follows:
ψn and ξn being the Riccati–Bessel functions.
The problem we want to solve consists of retrieving the number size distribution n(r) from a set of measurements and . By defining a data array and discretizing the possible values of r, we get an inverse problem of the form
where n is a vector such that ni=n(ri), K is the discretization of the integral kernels, and ε is the noise affecting the data. Typical experimental values are such that α is measured at two wavelengths ( nm), and β is measured at three wavelengths ( nm).
2.2 Parametric model
Solving the linear inverse problem defined by Eq. (8) with a reasonable discretization of the r variable (say, at least 200 points) entails recovering a large number of unknowns from a very small dataset. One viable option is to reduce the number of unknowns by using a variable support function, such as in Osterloh et al. (2011). Here we take a different approach and assume that the number size distribution has a pre-defined shape that depends on a small number of parameters (Osterloh et al., 2013). Specifically, we assume that the number size distribution is the superposition of a small number N of log-normal distributions.
Each log-normal distribution is completely defined by three parameters: its mean μ, its standard deviation σ and its height h. The total number of parameters to be estimated is therefore 3N, which is substantially smaller than the number of parameters for solving the unconstrained, linear case. However, the problem is now non-linear and cannot be solved with standard Tikhonov regularization. Indeed, we can rewrite Eq. (8) as
where represents non-linear functions of the mode and width of each component and can be derived by inserting Eq. (9) into Eqs. (2)–(3). We observe that the dependence on the CRI m is indicated here by the subscript because we consider m to be fixed and known throughout this study.
In this work, we set up a Bayesian model and apply a Monte Carlo technique to find the best parameters of unimodal (N=1) and bimodal (N=2) distributions, given a priori information and the data.
2.3 Bayesian approach
In the Bayesian framework, probability distributions are used to code our degree of knowledge on the values of unknown or unobservable quantities: perfect knowledge is represented by a probability distribution which is non-zero only in the correct value, and partial knowledge is represented by a probability distribution which assigns high probability to likely values and low probability to unlikely values. The Bayesian framework is useful to combine a priori information, i.e., information available before the data are collected, with the information content of the data: a priori information is coded in the so-called prior distribution, while the information content of the data is conveyed by the likelihood function.
Let us define the vector containing all the unknown parameters. We denote its prior density as p(x); in this work, we assume that individual parameters are independent of each other so that their joint density is the product of single densities. We also assume that no additional prior information is available and use uniform prior distribution for each of them in a suitable interval. This leads us to define the prior as
where 𝒰() denotes the uniform distribution.
As far as the likelihood is concerned, we assume that the data are affected by Gaussian noise, which leads us to define the likelihood as
where denotes the Gaussian distribution for the y variable, with mean a and standard deviation b. We can finally derive the posterior distribution as given by the Bayes theorem:
where p(y) is a normalization constant whose value is unknown but can be neglected as it does not depend on x. The posterior distribution represents all the available information of the values of the unknown set of parameters x, given the data y. As such, it represents the solution to the inverse problem. However, it is a probability distribution on a relatively high-dimensional space ℛ3N and is therefore difficult to visualize. In the next section we illustrate a computational algorithm that allows samples from this distribution to be obtained and, in particular, the set of parameters that maximizes the posterior itself to be found.
Finally, let us observe that in this work we assume that N is known; while this is, in general, not true, it makes the problem more tractable. In practice, N can be selected by the user, and solutions with different values can be calculated and compared. Future work will be devoted to considering the general case of unknown N.
2.4 Monte Carlo algorithm
The posterior distribution defined in Eq. (11) is a complicated function on a relatively high-dimensional space: characterization of such distribution requires a computational tool that is able to deal with narrow peaks and local modes. In this work we produce a numerical approximation of the posterior distribution using a Markov chain Monte Carlo (MCMC) approach.
The general idea of Monte Carlo methods is to sample the target distribution p(x|y), i.e., to obtain a set of realizations of random variables distributed according to the target distribution p(x|y); indeed, the law of large numbers then guarantees that, for any suitable function f(X), the following approximation holds:
where Ω is a given domain. The possibility of approximating the integral of any function f() implies that one has a full characterization of the target distribution.
MCMC methods work by constructing a Markov Chain whose invariant distribution is the target posterior distribution, i.e.the transition kernel of the Markov Chain satisfies
The implications of this choice are obvious: if X(k) is distributed according to π(x|y), then each X(n) has the same distribution for n≥k. Importantly, if a set of technical conditions are met, after an initial “wandering” time, referred to as burn-in, the Markov Chain will hit the target distribution independently of the initialization. Therefore, once a transition kernel satisfying Eq. (12) has been constructed, it is enough to simulate from it, starting with a random initialization.
In this work we use the Metropolis–Hastings construction of the kernel, which has the form
and has to be interpreted as follows: given the current state x, the Markov Chain will evolve with probability according to a proposal distribution , and with probability it will remain in x; the delta function here is the Dirac delta that gives all the probability mass to the point . A sufficient condition that guarantees that the transition kernel defined in Eq. (13) is invariant with respect to π(x|y) is that the acceptance probability is set as follows:
In the numerical simulations below we use Gaussian proposal distributions; Gaussian distributions have the pleasant property so that the acceptance ratio simplifies, as detailed below.
Eventually, the MCMC algorithm works as follows. We start from an initializing value x(1), drawn from the prior distribution. Then for k>1,
for every and each parameter within x(k), a new value is proposed by drawing a Gaussian perturbation around the current value; this way, a new state is proposed x* which is identical to x(k) for all but one value (e.g., assuming N=1, , where μ* is drawn from a Gaussian distribution of mean μ(k)).
compute the acceptance ratio α; given our previous choices of uniform distribution and symmetric proposal distribution,
accept the proposed value with probability : if α≥1, then set ; otherwise, draw a uniform number , and accept the proposed value if r<α; otherwise, set (i.e., do not move).
The algorithm proceeds for a fairly large number of samples; in the simulations below, we use 5000 samples.
We show several examples of applications of the Monte Carlo method to completely synthetic data as well as to data derived from experimental recordings, in the following denoted as quasi-real data. We first consider unimodal distributions with variable modal radius from “fine” to “coarse” and then with bimodal distributions with components of variable amplitude. Synthetic distributions are generated so as to represent typical atmospheric aerosol distributions; for this reason, an analysis of the distributions deriving from the observations of the AERONET network has been made. We then analyze three quasi-real datasets derived from the AERONET network. As already mentioned, throughout this study the complex refractive index m is considered to be known; the possibility of also estimating the CRI will be the subject of future developments.
3.1 Performance evaluation
In order to perform a quantitative analysis of reconstruction accuracy, we use two different methods.
The first method is based on the deviation between the size distribution (SD) reconstructed by the inversion algorithm and the simulated exact SD. Indeed, the algorithm determines the SD that reproduces the set of measured parameters with some tolerance to account for the presence of noise; it is first necessary to define a method for quantitatively measuring the distance between the synthetic SD and the reconstructed SD. This can be done using the deviation defined as
where SDT is the true size distribution (simulated); SDR is the size distribution reconstructed by the algorithm; and rmin and rmax are minimum and maximum values of the radius, determined on the basis of the condition that within the interval the SDT values are higher than maximum of the SDT. This last condition allows only the significant parts of the distribution to be taken into account.
The second method of evaluating the accuracy of the solutions is based on the calculation of integral properties of the size distributions. Since our algorithm allows us to determine the dimensional distribution expressed as
it is possible to derive the concentration in volume
and therefore the particulate matter parameters PM1, PM2.5 and PM10, defined as the amount of particulate matter with sizes smaller than 1, 2.5 and 10 µm, respectively, can be computed as
where ρ represents the particulate matter density, which is here assumed constant and equal to 1, and R assumes the values 1, 2.5 or 10 µm for PM1, PM2.5 and PM10, respectively.
In addition, we also take into consideration the effective radius reff and the mean volume radius rv, defined as
where rmin and rmax assume in our case the values 0.01 and 20 µm, respectively.
3.2 Numerical validation
We first proceed to show that the deviation measure is a good indicator of the performances in the sense that its value gives a quantitative evaluation of the distance between true and estimated size distributions. To this aim, we simulated in Simulation 1 a unimodal distribution with the parameters given in Table 1 (middle column). We then run the reconstruction algorithm multiple times, from different random initializations and with different numbers of iterations ranging between 100 and 5000, so as to reach a diverse population of final estimates. In the inversion we assumed that the optical input parameters (2α+3β) are determined with the 5 % error and imposed the constraints on the parameters given in Table 1. We observe that a 5 % error in the retrieved optical parameters might seem unrealistically small; however, as the inverse method is applied to a single set of optical parameters, these can be obtained by averaging across different altitudes and/or times, thus effectively reducing the impact of noise and making our assumption plausible. Future studies will be devoted to investigate the impact of the noise level more in detail.
Discretization of the r variable used 500 values logarithmically spaced between 0.01 and 20 µm. Notably, these values correspond to ra and rb in Eqs. (2) and (3); i.e., they define the discretization range in the integration and shall not be interpreted as extrema of a range for which the method can obtain reliable reconstructions. Indeed, it is typically hard to retrieve modes corresponding to particles larger than 7 µm.
Figure 1 shows the comparison between the simulated size distribution and the reconstructed size distribution in cases where the deviation is <0.0001 (left), 0.001 (center) and 0.01 (right), respectively. This figure suggests that the deviation is a good indicator of the quality of the reconstruction.
Having ascertained that the deviation is a good measure of the “closeness” of the reconstruction to the real distribution, it is obvious to assume that a reconstruction with a small value of deviation must correspond to a low value of the discrepancy between measured and predicted optical coefficients, but the inverse statement is not true. In fact, since the discrepancy only measures the distance of the reconstructed optical parameters with respect to the input ones, small differences in the optical parameters (low values of the discrepancy) can correspond to very large differences in the size distribution parameters.
The statistical nature of the Monte Carlo method includes an intrinsic instability in the sense that a repetition of the calculation with the same set of input optical parameters, even if without error, leads to different reconstructions. The dispersion of the reconstructions with the same initial conditions is a measure of the stability of the method.
Another issue is the effect of noise on the optical input parameters. This random perturbation causes a further increase in the instability of the reconstruction, which may also prevail over the intrinsic instability of the method.
In order to have a quantitative evaluation of the influence of the instability of the method with respect to the noise in the input parameters, we have made a statistical analysis of the discrepancy. A bimodal distribution was simulated with the parameter values given in Table 1
The values of the extinction coefficients at the wavelengths of 355 and 532 nm and of the backscattering coefficients at the wavelengths of 355, 532 and 1064 nm were then determined with the Mie theory, considering homogeneous spherical particles. The values of the optical coefficients were then used as input data for the reconstruction. The reconstruction was repeated 30 times, each time perturbing the set of input parameters in order to simulate a 5 % error in each of them. We then calculated the discrepancy of each of the 30 sets of optical parameters perturbed with respect to the unperturbed set (input discrepancy). The distribution of the input discrepancy is shown in Fig. 2, left plot, and the distribution of the values of the discrepancy of the corresponding 30 reconstructions (output discrepancy) is shown in the right plot in the same figure.
It should be kept in mind that the discrepancy input represents the “distance” between each set of perturbed parameters and the theoretical set, while the discrepancy output represents the “distance” between the set of parameters corresponding to each reconstruction with respect to the set of input parameters of the same reconstruction. From Fig. 2 we see that (1) the uncertainty of 5 % in the optical input parameters corresponds to a much greater fluctuation in the solution than the one produced by the instability of the algorithm, and (2) the algorithm always finds the solutions corresponding to a set of optical parameters very close to the input one. To take this into account, our method is based on repetition of the calculation; at each repetition, the set of optical parameters is disrupted, and each of the parameters is subjected to a variation that takes into account the experimental error. In practice, each parameter is assigned a value extracted from a Gaussian distribution around the experimental value and width equal to the error itself. With each set of optical input parameters thus determined, the Monte Carlo is executed with a fixed number of iterations. The standard deviation of the coefficients of the size distributions thus obtained is considered a good evaluation of the uncertainty in the final solution.
The uncertainty obviously takes into account both the instability of the method and the errors in the input parameters. In the tests conducted so far the distribution of the optical parameters has been considered Gaussian, but in view of the simple logical structure of the algorithm, it is in principle possible to introduce arbitrary distributions to take into account, for example, the contribution of systematic errors in the input optical parameters (consider for example that the error in the backscattering coefficient at 1064 nm can be dominated by the uncertainty in the lidar ratio, whose value should be fixed a priori in a more or less arbitrary manner).
3.3 Results with synthetic data
In the following we show the results of different tests for simulated SDs representing realistic cases. The reconstruction has been obtained by setting the number of iterations to 5000 and by running the algorithm 30 times, with noise equal to 5 %. For each run, we collect the best solution, and we provide uncertainty quantification, shown as a shaded area in the pictures below, using the standard deviation of the best solution across these runs.
The (a) and (b) cases simulate a unimodal SD centered on the coarse mode of a realistic bimodal distribution; the (c) and (d) cases simulate a unimodal SD which approximates a fine mode of a realistic bimodal distribution.
Figure 4 shows the reconstructions with minimum discrepancy of a simulated bimodal distribution with two modes with similar width. The goal of this test is to evaluate the accuracy of the reconstruction of a realistic bimodal distribution obtained by a unimodal distribution whose parameters have the same constraint as those in Fig. 3.
Tables 3, 4, 5 and 6 report the deviations of each reconstruction of the Figs. 3, 4, 5 and 6 and the percentage deviations of the parameters PM1, PM2.5, PM10, PM, reff and rv with respect to the simulated SD.
3.4 Results with quasi-real data
We finally validate our proposed method on three datasets that have been obtained from experimental data recorded by AERONET by applying the direct calculation of Mie functions to the size distribution reported in the AERONET database. For all three datasets, we attempt reconstruction with a unimodal and a bimodal distribution: constraints for the parameter values are reported in Table 7. In order to exemplify how the quality of the retrieval may depend on the actual value of the CRI, in addition to using the CRI value estimated by AERONET, we also use a very different value, i.e., one with a larger imaginary part. Specifically, the used CRI (1.57+i0.43) corresponds to the maximum value of both the real and the imaginary part as measured for the Saharan dust in several measurement campaigns (Wagner et al., 2012).
All reconstructions were obtained with 5000 iterations, 20 repetitions and 5 % noise in the optical parameters. In Table 8 we report details of the datasets, including registration site and date as well as performance evaluation of our reconstructions.
As shown in Fig. 3 and Table 3, the reconstructions of simulated unimodal distributions with unimodal distributions give excellent results. The distributions with modal radius between 0.2 and 2.5 µm are reconstructed with high accuracy both for the shape of the distribution and for the integral properties (PM1, PM2.5, PM10, PM). More critical is the reconstruction of the effective radius and the mean volume radius; however the worst case happens when the size distribution is simulated with modal radius equal to 0.2 mm, in which case the difference with the values of the simulated distribution is equal to 11 % and 25 % for reff and rv, respectively. Note that the above results are obtained with very large, not realistic, constraints to the parameters of the reconstructed distribution.
The reconstruction of bimodal distributions, with realistic values of the parameters, by using unimodal distributions (see Fig. 4 and Table 4) gives solutions which show very important deviations with respect to the simulated distributions; nevertheless, the values of the integral parameters have deviations which are less than 35 %. In these cases, reff and rv are always underestimated and show percentage deviations with respect to the real value of 50 % and 80 %, respectively. These values are comparable to those routinely obtained in the inversion of lidar data, which typically provide errors around 50 % (Di et al., 2018a).
Figures 5 and 6 and the respective Tables 5 and 6 refer to reconstructions with bimodal SDs of a simulated unimodal distribution and a simulated bimodal distribution, respectively. In this case, the reconstructions are done by imposing more strict constraints than in the cases of unimodal distributions. The constraints are the product of a statistical analysis of the parameters of the size distributions from the AERONET network. In these cases the reconstruction of unimodal distributions gives solutions which have deviation less than 5 % except for the case of distributions with modal radius equal to 0.2 µm, in which case PM10 and PM have deviations around 8 % and 21 %, respectively; such deviations (see Fig. 5d) are due to the fact that often the algorithm introduces some contribution of particles with a radius around 5 µm. In all the cases, reff is reconstructed with an accuracy of around 5 %, while the accuracy of rv is less than 9 %, but in the case of distribution simulated with modal radius equal to 0.2 mm, rv is underestimated by 22 %.
In the bimodal SD reconstruction (Fig. 6 and Table 6), the “fine” mode is always reconstructed better than the “coarse” mode. This effect is connected to the wavelengths used for determining the optical parameters of the lidar measures. Overall the integral properties are reconstructed with an accuracy of less than 6 % for PM1 and PM2.5 and an accuracy between 6 % and 44 % for PM10 and PM; reff and rv are recovered with an accuracy between 7 % and 37 % and between 2 % and 57 %, respectively.
We observe that reconstructions obtained with bimodal distributions provide consistently better results than those obtained with unimodal distributions and that the error in assessing the PM concentrations remains at more-than-acceptable levels, particularly with bimodal distributions. This result is particularly reasonable because all the datasets were indeed generated by bimodal distributions, including the one recorded on Etna (Fig. 8), which may be erroneously thought to correspond to a unimodal distribution.
Our analysis also highlights a few limitations of the proposed technique. First, the technique presents an inherent subjectivity as regards the choice of unimodal versus bimodal distributions; while, on average, the bimodal settings performs better, it can also produce some spurious modes such as those in Fig. 5d. Our recommendation is to use a unimodal distribution when there is strong a priori indication in favor of it and a bimodal distribution elsewhere. Future studies will investigate the possibility of including the number of modes among the unknowns and provide a posterior probability for different numbers of modes in the same fashion as was done in Sorrentino et al. (2014) for a neuroimaging application.
A second limitation concerns the subjectivity in the choice of the CRI, which was assumed to be known in the present study. Our analysis of quasi-real data showed how the quality of the retrieval may depend on the value of the CRI and partly deteriorates when the imaginary part grows, particularly for larger modes. This is a known issue with lidar data that can possibly be solved in a Bayesian framework by devising better priors. In addition, a full Bayesian model including the CRI among the unknowns can be devised; however, with an increased number of unknowns it will be necessary to exploit more prior information to reduce the degree of ill-posedness.
To conclude, we observe that the uncertainty quantification currently implemented seems to provide, at times, optimistic results, to the extent that the true distribution is not always included in the confidence bands. We reckon that these limitations can be overcome by using more complex but more powerful Monte Carlo sampling techniques, such as those described in Sorrentino et al. (2014), Luria et al. (2019), Sciacchitano et al. (2019) and Viani et al. (2021); this will also be the topic of future studies.
The preliminary results presented in this paper indicate that the proposed method can retrieve uni- and bimodal distributions from extinction coefficients measured at two wavelengths and backscattering coefficients measured at three wavelengths when the correct number of modes is selected. The reconstruction of three-modal distributions is more challenging, and more constraints might be necessary to obtain reliable and stable solutions. The extension of the method to three-modal distributions and variable refractive index, together with better uncertainty quantification and automatic model selection, will be the subject of future studies.
The code used in this article is not publicly accessible as the research has been partially funded by a private company. However, the article contains information that shall enable the reproduction of the code.
ASo conceived and implemented the algorithm. VT and PC contributed to an efficient implementation of the algorithm. NS, ASa and AB performed the statistical analysis used to determine inversion constraints. NS ran the simulations and performed the analysis of experimental data. All authors contributed to writing the manuscript.
The contact author has declared that neither they nor their co-authors have any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Alberto Sorrentino was partially supported by Gruppo Nazionale per il Calcolo Scientifico. This project was partially supported by the Beijing Research Institute on Telemetry (BRIT), Beijing, China.
This paper was edited by Daniel Perez-Ramirez and reviewed by three anonymous referees.
Alados-Arboledas, L., Müller, D., Guerrero-Rascado, J., Navas-Guzmán, F., Pérez-Ramírez, D., and Olmo, F.: Optical and microphysical properties of fresh biomass burning aerosol retrieved by Raman lidar, and star-and sun-photometry, Geophys. Res. Lett., 38, L01807, https://doi.org/10.1029/2010GL045999, 2011. a, b
Ansmann, A., Riebesell, M., and Weitkamp, C.: Measurement of atmospheric aerosol extinction profiles with a Raman lidar, Opt. Lett., 15, 746–748, 1990. a
Ansmann, A., Riebesell, M., Wandinger, U., Weitkamp, C., Voss, E., Lahmann, W., and Michaelis, W.: Combined Raman elastic-backscatter lidar for vertical profiling of moisture, aerosol extinction, backscatter, and lidar ratio, Appl. Phys. B-Lasers O., 55, 18–28, 1992. a
Balis, D., Giannakaki, E., Müller, D., Amiridis, V., Kelektsoglou, K., Rapsomanikis, S., and Bais, A.: Estimation of the microphysical aerosol properties over Thessaloniki, Greece, during the SCOUT-O3 campaign with the synergy of Raman lidar and Sun photometer data, J. Geophys. Res.-Atmos., 115, D08202, https://doi.org/10.1029/2009JD013088, 2010. a
Böckmann, C.: Hybrid regularization method for the ill-posed inversion of multiwavelength lidar data in the retrieval of aerosol size distributions, Appl. Optics, 40, 1329–1342, 2001. a
Böckmann, C., Mironova, I., Müller, D., Schneidenbach, L., and Nessler, R.: Microphysical aerosol parameters from multiwavelength lidar, J. Opt. Soc. Am. A, 22, 518–528, 2005. a
Burton, S. P., Chemyakin, E., Liu, X., Knobelspiesse, K., Stamnes, S., Sawamura, P., Moore, R. H., Hostetler, C. A., and Ferrare, R. A.: Information content and sensitivity of the 3β + 2α lidar measurement system for aerosol microphysical retrievals, Atmos. Meas. Tech., 9, 5555–5574, https://doi.org/10.5194/amt-9-5555-2016, 2016. a
Chemyakin, E., Müller, D., Burton, S., Kolgotin, A., Hostetler, C., and Ferrare, R.: Arrange and average algorithm for the retrieval of aerosol parameters from multiwavelength high-spectral-resolution lidar/Raman lidar data, Appl. Optics, 53, 7252–7266, 2014. a
Chemyakin, E., Burton, S., Kolgotin, A., Müller, D., Hostetler, C., and Ferrare, R.: Retrieval of aerosol parameters from multiwavelength lidar: investigation of the underlying inverse mathematical problem, Appl. Optics, 55, 2188–2202, 2016. a
De Rosa, B., Di Girolamo, P., and Summa, D.: Determination of aerosol size and microphiysical proprierties based on multi-wavelength raman lidar measurements in the framework of HyMeX-SOP1, in: EGU General Assembly Conference Abstracts, May 2020, p. 17218, 2020. a
Denevi, G., Garbarino, S., and Sorrentino, A.: Iterative algorithms for a non-linear inverse problem in atmospheric lidar, Inverse Probl., 33, 085010, https://doi.org/10.1088/1361-6420/aa7904, 2017. a
Di, H., Wang, Q., Hua, H., Li, S., Yan, Q., Liu, J., Song, Y., and Hua, D.: Aerosol microphysical particle parameter inversion and error analysis based on remote sensing data, Remote Sensing, 10, 1753, https://doi.org/10.3390/rs10111753, 2018a. a, b
Di, H., Zhao, J., Zhao, X., Zhang, Y., Wang, Z., Wang, X., Wang, Y., Zhao, H., and Hua, D.: Parameterization of aerosol number concentration distributions from aircraft measurements in the lower troposphere over Northern China, J. Quant. Spectrosc. Ra., 218, 46–53, 2018b. a
Di Girolamo, P., Summa, D., Bhawar, R., Di Iorio, T., Cacciani, M., Veselovskii, I., Dubovik, O., and Kolgotin, A.: Raman lidar observations of a Saharan dust outbreak event: characterization of the dust optical properties and determination of particle size and microphysical parameters, Atmos. Environ., 50, 66–78, 2012. a
Dubovik, O., Sinyuk, A., Lapyonok, T., Holben, B. N., Mishchenko, M., Yang, P., Eck, T. F., Volten, H., Munoz, O., Veihelmann, B., van der Zande, W. J., Leon, J.-F., Sorokin, M., and Slutsker, I.: Application of spheroid models to account for aerosol particle nonsphericity in remote sensing of desert dust, J. Geophys. Res.-Atmos., 111, D11208, https://doi.org/10.1029/2005JD006619, 2006. a
Garbarino, S., Sorrentino, A., Massone, A. M., Sannino, A., Boselli, A., Wang, X., Spinelli, N., and Piana, M.: Expectation maximization and the retrieval of the atmospheric extinction coefficients by inversion of Raman lidar data, Opt. Express, 24, 21497–21511, 2016. a
Giannakaki, E., Balis, D. S., Amiridis, V., and Zerefos, C.: Optical properties of different aerosol types: seven years of combined Raman-elastic backscatter lidar measurements in Thessaloniki, Greece, Atmos. Meas. Tech., 3, 569–578, https://doi.org/10.5194/amt-3-569-2010, 2010. a
Giannakaki, E., van Zyl, P. G., Müller, D., Balis, D., and Komppula, M.: Optical and microphysical characterization of aerosol layers over South Africa by means of multi-wavelength depolarization and Raman lidar measurements, Atmos. Chem. Phys., 16, 8109–8123, https://doi.org/10.5194/acp-16-8109-2016, 2016. a
Granados-Muñoz, M., Guerrero-Rascado, J., Bravo-Aranda, J., Navas-Guzmán, F., Valenzuela, A., Lyamani, H., Chaikovsky, A., Wandinger, U., Ansmann, A., Dubovik, O., Grudo, J. O., and Alados-Arboledas, L.: Retrieving aerosol microphysical properties by Lidar-Radiometer Inversion Code (LIRIC) for different aerosol types, J. Geophys. Res.-Atmos., 119, 4836–4858, 2014. a
Holben, B. N., Eck, T. F., Slutsker, I. a., Tanre, D., Buis, J., Setzer, A., Vermote, E., Reagan, J. A., Kaufman, Y., Nakajima, T., Lavenu, F., Jankowiak, I., and Smirnov, A.: AERONET – A federated instrument network and data archive for aerosol characterization, Remote Sens. Environ., 66, 1–16, https://doi.org/10.1016/S0034-4257(98)00031-5, 1998. a
Kolgotin, A. and Müller, D.: Theory of inversion with two-dimensional regularization: profiles of microphysical particle properties derived from multiwavelength lidar measurements, Appl. Optics, 47, 4472–4490, 2008. a
Kolgotin, A., Müller, D., Chemyakin, E., and Romanov, A.: Improved identification of the solution space of aerosol microphysical properties derived from the inversion of profiles of lidar optical data, part 1: theory, Appl. Optics, 55, 9839–9849, 2016. a
Lee, K. H. and Wong, M. S.: Vertical Profiling of Aerosol Optical Properties From LIDAR Remote Sensing, Surface Visibility, and Columnar Extinction Measurements, in: Remote Sensing of Aerosols, Clouds, and Precipitation, pp. 23–43, Elsevier, https://doi.org/10.1016/B978-0-12-810437-8.00002-5, 2018. a
Luria, G., Duran, D., Visani, E., Sommariva, S., Rotondi, F., Sebastiano, D. R., Panzica, F., Piana, M., and Sorrentino, A.: Bayesian multi-dipole modelling in the frequency domain, J. Neurosci. Meth., 312, 27–36, 2019. a
McLean, W. G. K., Fu, G., Burton, S. P., and Hasekamp, O. P.: Retrieval of aerosol microphysical properties from atmospheric lidar sounding: an investigation using synthetic measurements and data from the ACEPOL campaign, Atmos. Meas. Tech., 14, 4755–4771, https://doi.org/10.5194/amt-14-4755-2021, 2021. a
Molero, F., Pujadas, M., and Artíñano, B.: Study of the Effect of Aerosol Vertical Profile on Microphysical Properties Using GRASP Code with Sun/Sky Photometer and Multiwavelength Lidar Measurements, Remote Sensing, 12, 4072, https://doi.org/10.3390/rs12244072, 2020. a
Müller, D., Wandinger, U., Althausen, D., Mattis, I., and Ansmann, A.: Retrieval of physical particle properties from lidar observations of extinction and backscatter at multiple wavelengths, Appl. Optics, 37, 2260–2263, 1998. a
Müller, D., Wandinger, U., and Ansmann, A.: Microphysical particle parameters from extinction and backscatter lidar data by inversion with regularization: theory, Appl. Optics, 38, 2346–2357, 1999. a
Müller, D., Wandinger, U., Althausen, D., and Fiebig, M.: Comprehensive particle characterization from three-wavelength Raman-lidar observations: case study, Appl. Optics, 40, 4863–4869, 2001. a
Müller, D., Franke, K., Ansmann, A., Althausen, D., and Wagner, F.: Indo-Asian pollution during INDOEX: Microphysical particle properties and single-scattering albedo inferred from multiwavelength lidar observations, J. Geophys. Res.-Atmos., 108, 4600, https://doi.org/10.1029/2003JD003538, 2003. a
Müller, D., Tesche, M., Eichler, H., Engelmann, R., Althausen, D., Ansmann, A., Cheng, Y., Zhang, Y., and Hu, M.: Strong particle light absorption over the Pearl River Delta (south China) and Beijing (north China) determined from combined Raman lidar and Sun photometer observations, Geophys. Res. Lett., 33, L20811, https://doi.org/10.1029/2006GL027196, 2006. a
Müller, D., Böckmann, C., Kolgotin, A., Schneidenbach, L., Chemyakin, E., Rosemann, J., Znak, P., and Romanov, A.: Microphysical particle properties derived from inversion algorithms developed in the framework of EARLINET, Atmos. Meas. Tech., 9, 5007–5035, https://doi.org/10.5194/amt-9-5007-2016, 2016. a
Murayama, T., Müller, D., Wada, K., Shimizu, A., Sekiguchi, M., and Tsukamoto, T.: Characterization of Asian dust and Siberian smoke with multi-wavelength Raman lidar over Tokyo, Japan in spring 2003, Geophys. Res. Lett., 31, L23103, https://doi.org/10.1029/2004GL021105, 2004. a
Navas-Guzman, F., Bravo-Aranda, J. A., Guerrero-Rascado, J. L., Granados-Munoz, M. J., and Alados-Arboledas, L.: Statistical analysis of aerosol optical properties retrieved by Raman lidar over Southeastern Spain, Tellus B, 65, 21234, https://doi.org/10.3402/tellusb.v65i0.21234, 2013. a
Nicolae, D., Nemuc, A., Müller, D., Talianu, C., Vasilescu, J., Belegante, L., and Kolgotin, A.: Characterization of fresh and aged biomass burning events using multiwavelength Raman lidar and mass spectrometry, J. Geophys. Res.-Atmos., 118, 2956–2965, 2013. a
Noh, Y. M., Müller, D., Shin, D. H., Lee, H., Jung, J. S., Lee, K. H., Cribb, M., Li, Z., and Kim, Y. J.: Optical and microphysical properties of severe haze and smoke aerosol measured by integrated remote sensing techniques in Gwangju, Korea, Atmos. Environ., 43, 879–888, 2009. a
Ortiz-Amezcua, P., Guerrero-Rascado, J. L., Granados-Muñoz, M. J., Benavent-Oltra, J. A., Böckmann, C., Samaras, S., Stachlewska, I. S., Janicka, Ł., Baars, H., Bohlmann, S., and Alados-Arboledas, L.: Microphysical characterization of long-range transported biomass burning particles from North America at three EARLINET stations, Atmos. Chem. Phys., 17, 5931–5946, https://doi.org/10.5194/acp-17-5931-2017, 2017. a
Osterloh, L., Böckmann, C., Mamouri, R., and Papayannis, A.: An adaptive base point algorithm for the retrieval of aerosol microphysical properties, Open Atmospheric Science Journal, 5, 61–73, 2011. a
Pérez-Ramírez, D., Whiteman, D. N., Veselovskii, I., Kolgotin, A., Korenskiy, M., and Alados-Arboledas, L.: Effects of systematic and random errors on the retrieval of particle microphysical properties from multiwavelength lidar measurements using inversion with regularization, Atmos. Meas. Tech., 6, 3039–3054, https://doi.org/10.5194/amt-6-3039-2013, 2013. a
Pérez-Ramírez, D., Whiteman, D. N., Veselovskii, I., Colarco, P., Korenski, M., and da Silva, A.: Retrievals of aerosol single scattering albedo by multiwavelength lidar measurements: Evaluations with NASA Langley HSRL-2 during discover-AQ field campaigns, Remote Sens. Environ., 222, 144–164, 2019. a
Pérez-Ramírez, D., Whiteman, D. N., Veselovskii, I., Korenski, M., Colarco, P. R., and da Silva, A. M.: Optimized profile retrievals of aerosol microphysical properties from simulated spaceborne multiwavelength lidar, J. Quant. Spectrosc. Ra., 246, 106932, https://doi.org/10.1016/j.jqsrt.2020.106932, 2020. a
Pornsawad, P., Böckmann, C., Ritter, C., and Rafler, M.: Ill-posed retrieval of aerosol extinction coefficient profiles from Raman lidar data by regularization, Appl. Optics, 47, 1649–1661, 2008. a
Ritter, C., Angeles Burgos, M., Böckmann, C., Mateos, D., Lisok, J., Markowicz, K., Moroni, B., Cappelletti, D., Udisti, R., Maturilli, M., et al.: Microphysical properties and radiative impact of an intense biomass burning aerosol event measured over Ny-Ålesund, Spitsbergen in July 2015, Tellus B, 70, 1–23, https://doi.org/10.1080/16000889.2018.1539618, 2018. a
Samaras, S., Nicolae, D., Böckmann, C., Vasilescu, J., Binietoglou, I., Labzovskii, L., Toanca, F., and Papayannis, A.: Using Raman-lidar-based regularized microphysical retrievals and Aerosol Mass Spectrometer measurements for the characterization of biomass burning aerosols, J. Comput. Phys., 299, 156–174, 2015. a
Sawamura, P., Müller, D., Hoff, R. M., Hostetler, C. A., Ferrare, R. A., Hair, J. W., Rogers, R. R., Anderson, B. E., Ziemba, L. D., Beyersdorf, A. J., Thornhill, K. L., Winstead, E. L., and Holben, B. N.: Aerosol optical and microphysical retrievals from a hybrid multiwavelength lidar data set – DISCOVER-AQ 2011, Atmos. Meas. Tech., 7, 3095–3112, https://doi.org/10.5194/amt-7-3095-2014, 2014. a, b
Sciacchitano, F., Lugaro, S., and Sorrentino, A.: Sparse bayesian imaging of solar flares, SIAM J. Imaging Sci., 12, 319–343, 2019. a
Shcherbakov, V.: Regularized algorithm for Raman lidar data processing, Appl. Optics, 46, 4879–4889, 2007. a
Sorrentino, A., Luria, G., and Aramini, R.: Bayesian multi-dipole modelling of a single topography in MEG by adaptive sequential Monte Carlo samplers, Inverse Probl., 30, 045010, https://doi.org/10.1088/0266-5611/30/4/045010, 2014. a, b
Stelitano, D., Di Girolamo, P., Scoccione, A., Summa, D., and Cacciani, M.: Characterization of atmospheric aerosol optical properties based on the combined use of a ground-based Raman lidar and an airborne optical particle counter in the framework of the Hydrological Cycle in the Mediterranean Experiment – Special Observation Period 1, Atmos. Meas. Tech., 12, 2183–2199, https://doi.org/10.5194/amt-12-2183-2019, 2019. a
Sun, X., Yin, Y., Sun, Y., Sun, Y., Liu, W., and Han, Y.: Seasonal and vertical variations in aerosol distribution over Shijiazhuang, China, Atmos. Environ., 81, 245–252, 2013. a
Tesche, M., Müller, D., Ansmann, A., Hu, M., and Zhang, Y.: Retrieval of microphysical properties of aerosol particles from one-wavelength Raman lidar and multiwavelength Sun photometer observations, Atmos. Environ., 42, 6398–6404, 2008. a
Tesche, M., Kolgotin, A., Haarig, M., Burton, S. P., Ferrare, R. A., Hostetler, C. A., and Müller, D.: : what is the most useful depolarization input for retrieving microphysical properties of non-spherical particles from lidar measurements using the spheroid model of Dubovik et al. (2006)?, Atmos. Meas. Tech., 12, 4421–4437, https://doi.org/10.5194/amt-12-4421-2019, 2019. a
Veselovskii, I., Kolgotin, A., Griaznov, V., Müller, D., Wandinger, U., and Whiteman, D. N.: Inversion with regularization for the retrieval of tropospheric aerosol parameters from multiwavelength lidar sounding, Appl. Optics, 41, 3685–3699, 2002. a
Veselovskii, I., Kolgotin, A., Griaznov, V., Müller, D., Franke, K., and Whiteman, D. N.: Inversion of multiwavelength Raman lidar data for retrieval of bimodal aerosol size distribution, Appl. Optics, 43, 1180–1195, 2004. a
Veselovskii, I., Whiteman, D., Kolgotin, A., Andrews, E., and Korenskii, M.: Demonstration of aerosol property profiling by multiwavelength lidar under varying relative humidity conditions, J. Atmos. Ocean. Tech., 26, 1543–1557, 2009. a
Veselovskii, I., Whiteman, D. N., Korenskiy, M., Kolgotin, A., Dubovik, O., Perez-Ramirez, D., and Suvorina, A.: Retrieval of spatio-temporal distributions of particle parameters from multiwavelength lidar measurements using the linear estimation technique and comparison with AERONET, Atmos. Meas. Tech., 6, 2671–2682, https://doi.org/10.5194/amt-6-2671-2013, 2013. a
Viani, A., Luria, G., Bornfleth, H., and Sorrentino, A.: Where Bayes tweaks Gauss: Conditionally Gaussian priors for stable multi-dipole estimation, Inverse Probl. Imag., 15, 1099–1119, https://doi.org/10.3934/ipi.2021030, 2021. a
Wagner, R., Ajtai, T., Kandler, K., Lieke, K., Linke, C., Müller, T., Schnaiter, M., and Vragel, M.: Complex refractive indices of Saharan dust samples at visible and near UV wavelengths: a laboratory study, Atmos. Chem. Phys., 12, 2491–2512, https://doi.org/10.5194/acp-12-2491-2012, 2012. a
Wandinger, U., Müller, D., Böckmann, C., Althausen, D., Matthias, V., Bösenberg, J., Weiß, V., Fiebig, M., Wendisch, M., Stohl, A., and Ansmann, A.: Optical and microphysical characterization of biomass-burning and industrial-pollution aerosols from-multiwavelength lidar and aircraft measurements, J. Geophys. Res.-Atmos., 107, 8125, https://doi.org/10.1029/2000JD000202, 2002. a
Weitkamp, C.: Lidar: Range-resolved optical remote sensing of the atmosphere, vol. 102, Springer Science & Business, ISBN 0387400753, 2006. a
Whiteman, D. N., Pérez-Ramírez, D., Veselovskii, I., Colarco, P., and Buchard, V.: Retrievals of aerosol microphysics from simulations of spaceborne multiwavelength lidar measurements, J. Quant. Spectrosc. Ra., 205, 27–39, 2018. a