Regularized inversion of aerosol hygroscopic growth factor probability density function: application to humidity-controlled fast integrated mobility spectrometer measurements

Abstract. Aerosol hygroscopic growth plays an important role in atmospheric particle chemistry and the effects of aerosol on radiation and hence climate. The hygroscopic growth is often characterized by a growth factor probability density function (GF-PDF), where the growth factor is defined as the ratio of the particle size at a specified relative humidity to its dry size. Parametric, least-squares methods are the most widely used algorithms for inverting the GF-PDF from measurements of the humidified tandem differential mobility analyzer (HTDMA) and have recently been applied to the GF-PDF inversion from measurements of the humidity-controlled fast integrated mobility spectrometer (HFIMS). However, these least-squares methods suffer from noise amplification due to the lack of regularization in solving the ill-posed problem, resulting in significant fluctuations in the retrieved GF-PDF and even occasional failures of convergence. In this study, we introduce nonparametric, regularized methods to invert the aerosol GF-PDF and apply them to HFIMS measurements. Based on the HFIMS kernel function, the forward convolution is transformed into a matrix-based form, which facilitates the application of nonparametric inversion methods with regularization, including Tikhonov regularization and Twomey's iterative regularization. Inversions of the GF-PDF using the nonparametric methods with regularization are demonstrated using HFIMS measurements simulated from representative GF-PDFs of ambient aerosols. The characteristics of reconstructed GF-PDFs resulting from different inversion methods, including previously developed least-squares methods, are quantitatively compared. The results show that Twomey's method generally outperforms the other inversion methods. The capabilities of Twomey's method in reconstructing the predefined GF-PDFs and recovering the mode parameters are validated.

Introduction
The hygroscopic growth of aerosol particles influences heterogeneous reactions, light extinction, and visibility, whereby aerosol water is most relevant for the direct radiative forcing of Earth's climate (Tang and Munkelwitz, 1994; Pilinis et al., 1995; Swietlicki et al., 2008). The ability of aerosols to absorb water depends mainly on their compositions; hence the hygroscopic properties reflect the variability in the key chemical components (Gysel et al., 2007; Zheng et al., 2020). Therefore, the variation in aerosol hygroscopic growth can be used to infer the potential chemical composition, especially for small aerosols that are beyond the size range of the aerosol mass spectrometer. Aerosol hygroscopic growth under atmospheric relative humidity (RH) is commonly measured by a humidified tandem differential mobility analyzer (HTDMA) system (Liu et al., 1978; Rader and McMurry, 1986; Swietlicki et al., 2008). In an HTDMA system, monodisperse particles classified by the first differential mobility analyzer (DMA) are exposed to an elevated RH in a humidity conditioner, and the size distribution of humidified particles is then measured by a second DMA and a particle detector using the scanning mobility technique. The particle hygroscopic growth is then derived from the size distribution of the humidified particles. Recently, a humidity-controlled fast integrated mobility spectrometer (HFIMS) was developed. The HFIMS replaces the second DMA and particle detector within the HTDMA system with a water-based fast integrated mobility spectrometer (WFIMS), which captures the size distribution of humidified particles instantly (Pinterich et al., 2017a). As a result, the HFIMS drastically accelerates aerosol hygroscopic growth measurements (Pinterich et al., 2017b; Wang et al., 2019; Zhang et al., 2021), making it feasible to characterize ambient aerosol hygroscopic growth at a wide range of sizes and RH levels in under ∼25 min.
The HTDMA measurement, i.e., the mobility-concentration distribution of humidified particles, is a convolution of the aerosol hygroscopic growth factor probability density function (GF-PDF) and the transfer functions of both DMAs. Similarly, the HFIMS measurement represents a convolution of the aerosol GF-PDF with the transfer functions of the DMA and the WFIMS (Wang et al., 2019). Two inversion algorithms, TDMAfit (Stolzenburg and McMurry, 1988) and TDMAinv (Gysel et al., 2009), were developed and are widely used to retrieve the GF-PDF from HTDMA measurements. In both algorithms, the GF-PDF is represented with a specific functional form, and the function parameters are derived by least-squares fitting. For example, the TDMAfit algorithm assumes the GF-PDF to be a superposition of multiple Gaussian distribution functions (Stolzenburg and McMurry, 1988) or a summation of multiple lognormal (ML) distribution functions (Stolzenburg and McMurry, 2008). Likewise, TDMAinv describes the GF-PDF as a piecewise linear (PL) function at predefined growth factor values (Gysel et al., 2009). The function parameters are derived using least-squares fitting that minimizes the residual between the measured and reconstructed size distributions of humidified particles. Similar methods have been applied to invert GF-PDFs from HFIMS measurements by Wang et al. (2019).
Inversion of the GF-PDF from HTDMA or HFIMS measurements is an ill-posed problem (Gysel et al., 2009). Least-squares methods such as TDMAfit and TDMAinv provide simple and effective ways to solve this ill-posed problem by representing the GF-PDF in a specific functional form (Kandlikar and Ramachandran, 1999). However, the GF-PDF inverted by the TDMAfit algorithm often depends on the initial guess of the parameters, resulting in occasional failures of convergence (Gysel et al., 2009). For example, it was reported that the TDMAfit algorithm may not be robust when multiple modes closely overlap, and successful convergence depends on the initial guess (Swietlicki et al., 2008). Moreover, it is well known that the unregularized least-squares method amplifies the measurement noise (Kandlikar and Ramachandran, 1999; Sipkens et al., 2020), resulting in significant fluctuations in the retrieved GF-PDF. It has been shown that the GF-PDF derived using the TDMAinv algorithm may oscillate strongly when a higher bin resolution is chosen, while too low a resolution may not be adequate to reproduce complex shapes of the true GF-PDF (Gysel et al., 2009). This may lead to incorrect interpretation of the aerosol mixing state (Wang et al., 2019). A common approach to overcoming noise amplification is to regularize the problem by including additional information, such as smoothness (Kandlikar and Ramachandran, 1999). Tikhonov regularization is among the most common regularization methods and has been applied to inversions of the aerosol size distribution (Talukdar and Swihart, 2003) and mass-mobility distribution (Sipkens et al., 2020). Recently, a software package was developed to invert HTDMA data using Tikhonov regularization (Petters, 2021).
Twomey's method (Twomey, 1975), one of the most common iterative regularization methods, has been widely used to invert aerosol size distributions (Collins et al., 2002; Olfert et al., 2008; Wang et al., 2018) and two-dimensional mass-mobility distributions (Rawat et al., 2016; Sipkens et al., 2020). However, to the best of our knowledge, Twomey's method has not been applied to invert the GF-PDF from HTDMA or HFIMS measurements.
In this study, we present nonparametric, regularized inversions of the GF-PDF from HFIMS measurements. These inversion methods can be adapted to HTDMA measurements straightforwardly. The forward model (i.e., the convolution of the GF-PDF, the transfer function of DMA, and the transfer function of WFIMS) is derived analytically and cast into a matrix form such that nonparametric inversion methods can be conveniently applied. The nonparametric inversions are demonstrated by retrieving the GF-PDF from HFIMS measurements of ambient aerosols. The dependence of the retrieved GF-PDF on GF bin resolutions is investigated, and an optimal GF bin resolution is identified. Synthetic data are generated using representative GF-PDFs of ambient aerosols and are applied to evaluate different inversion methods, including (1) parametric, least-squares fittings; (2) nonparametric, unregularized least squares; (3) Twomey's method; and (4) Tikhonov regularization. The performances of the different inversion methods including reconstruction accuracy, GF-PDF fidelity, smoothness, and computation time are presented and discussed.

Methods
This section presents the GF-PDF inversion routine from the HFIMS measurement, which includes the mathematical derivation of the matrix-based inverse problem, the description of different inversion algorithms, and the generation of synthetic data for evaluating the inversion algorithms.

A matrix form for the forward model
The integrated response of the HFIMS is determined by the aerosol size distribution, the DMA transfer function, the GF-PDF, and the WFIMS transfer function (Wang et al., 2019). The number concentration of particles with diameters between $D_{p1}$ and $D_{p1} + dD_{p1}$ downstream of the DMA inside the HFIMS is given by

$$dN_1(D_{p1}) = \frac{Q_{a,\mathrm{DMA}}}{Q_{s,\mathrm{DMA}}}\, \eta_{\mathrm{chg}}(D_{p1})\, \eta_{p,\mathrm{DMA}}(D_{p1})\, \Omega(V_{\mathrm{DMA}}, \tilde{Z}_{p1})\, dN, \tag{1}$$

where $Q_{a,\mathrm{DMA}}$ and $Q_{s,\mathrm{DMA}}$ are the DMA aerosol and sample (i.e., monodispersed) flow rates, respectively; $\eta_{\mathrm{chg}}(D_{p1})$ is the aerosol charging efficiency; $\eta_{p,\mathrm{DMA}}(D_{p1})$ is the particle penetration efficiency through the DMA; $\Omega(V_{\mathrm{DMA}}, \tilde{Z}_{p1})$ is the transfer function of the DMA operated with the classifying voltage of $V_{\mathrm{DMA}}$; $\tilde{Z}_{p1}$ is the particle mobility ($Z_{p1}$) normalized by the DMA centroid mobility corresponding to $V_{\mathrm{DMA}}$; and $dN = n(D_{p1})\, dD_{p1}$ represents the number concentration of particles with diameters between $D_{p1}$ and $D_{p1} + dD_{p1}$. The number concentration of particles with diameters between $D_{p2}$ and $D_{p2} + dD_{p2}$ at the outlet of the conditioner is

$$dN_2(D_{p2}) = \eta_{p,\mathrm{cond}}(D_{p2}) \left[ \int c_{\mathrm{cond}}(D_{p2}, D_{p1})\, dN_1(D_{p1}) \right] dD_{p2}, \tag{2}$$

where the integration considers all possible values of $D_{p1}$; $\eta_{p,\mathrm{cond}}(D_{p2})$ is the penetration efficiency of the conditioned particles, assuming the particle growth from $D_{p1}$ to $D_{p2}$ is instantaneous; and $c_{\mathrm{cond}}(D_{p2}, D_{p1})$ is the growth factor probability density function (GF-PDF) for particles with a dry diameter of $D_{p1}$ growing to a diameter of $D_{p2}$ during the humidity conditioning process. The GF-PDF satisfies

$$\int_0^{\infty} c_{\mathrm{cond}}(D_{p2}, D_{p1})\, dD_{p2} = 1.$$

The WFIMS response to particles with diameters between $D_{p2}$ and $D_{p2} + dD_{p2}$ in the $i$th $D_p^*$ bin during any time interval ($t$) is calculated by

$$dR_i = Q_{a,\mathrm{WFIMS}}\, \frac{N_F}{\dot{N}_F}\, \eta_{p,\mathrm{WFIMS}}\, \eta_{\mathrm{det}}(D_{p2})\, \Omega_{\mathrm{WFIMS},i}(Z_{p2})\, dN_2(D_{p2}), \tag{3}$$

where $Q_{a,\mathrm{WFIMS}}$ is the inlet flow rate through the WFIMS, $N_F$ is the number of frames used to count $dR_i$, and $\dot{N}_F$ is the frame rate, so that $N_F/\dot{N}_F$ represents the time interval ($t$) of counting; $\eta_{p,\mathrm{WFIMS}}$ is the penetration efficiency of particles going through the WFIMS separator; and $\Omega_{\mathrm{WFIMS},i}(Z_{p2})$ is the transfer function of the $i$th bin of the instrument response diameter ($D_p^*$) of the WFIMS. Note that the detection efficiency for particles above 8 nm has been shown to be 1 (i.e., $\eta_{\mathrm{det}}(D_{p2}) = 1$; Pinterich et al., 2017a).
The theoretical response of the $i$th $D_p^*$ bin of the HFIMS, $R_i$, is derived by combining the above equations, as detailed in Wang et al. (2019), to give Eq. (4). In Eq. (4), $R_{\mathrm{tot}}$ is the total count of particles detected within the WFIMS viewing window, i.e., $R_{\mathrm{tot}} = \sum_i R_i$, where $R_i$ is the response of the $i$th $D_p^*$ bin of the WFIMS; $b_{\mathrm{view}}$ and $b$ are the length of the viewing area of the charge-coupled device (CCD)-captured image and the length of the WFIMS mobility separator, respectively; and $\varepsilon_i$ is the error in the measured response. In Eq. (4), the GF-PDF is written as a function of the growth factor $g$ (i.e., $D_{p2}/D_{p1}$), and it satisfies $c_{\mathrm{cond},n}(g, D_{p1})\, dg = c_{\mathrm{cond}}(D_{p2}, D_{p1})\, dD_{p2}$. Given the narrow particle size range classified by the DMA, we assume the GF-PDF is the same for all particles classified by the DMA at a given voltage; i.e., $c_{\mathrm{cond}}(g, D_{p1})$ is independent of $D_{p1}$ for the integration in Eq. (4). Rewriting the GF-PDF as $c_{\mathrm{cond}}(g)$ and replacing $D_{p2}$ with $gD_{p1}$ in Eq. (4) gives Eq. (5). The integration over $g$ can be approximated by a sum over $J$ GF bins, with the assumption that $c_{\mathrm{cond}}(g)$ is constant within each GF bin, which gives Eq. (6), where $g_{j-1/2}$ and $g_{j+1/2}$ ($j = 1, 2, 3, \ldots, J$) are the lower and upper bounds of the $j$th GF bin. Equation (6) can be further arranged into a matrix form (neglecting the error term) as

$$\mathbf{R} = \mathbf{M}\mathbf{c}, \tag{7}$$

where the HFIMS response $\mathbf{R}$ is an $I \times 1$ array composed of $R_i$ ($i = 1, 2, 3, \ldots, I$), and $I$ is the number of selected WFIMS size bins that cover the size range of ($0.8D_{p1}^*$, $2.0D_{p1}^*$) according to the setting of the DMA centroid diameter $D_{p1}^*$. The unknown GF-PDF $\mathbf{c}$, a $J \times 1$ array composed of $c_j$ ($j = 1, 2, 3, \ldots, J$), can be found by solving the discretized Fredholm integral equation, Eq. (7).

J. Zhang et al.: Regularized inversion of aerosol hygroscopic GF-PDF
The elements of the HFIMS kernel matrix, $\mathbf{M}$, are calculated by Eq. (8). The HFIMS kernel describes the probability that particles with a GF between $g_{j-1/2}$ and $g_{j+1/2}$ are measured between the channel limits $Z^*_{p,i-1/2}$ and $Z^*_{p,i+1/2}$. As described above, the inversion of the GF-PDF ($\mathbf{c}$) is an ill-posed problem due to the overlap of the HFIMS kernel functions, like that of aerosol size spectrometers (Kandlikar and Ramachandran, 1999; Collins et al., 2002; Talukdar and Swihart, 2003). It is worth noting that the derivation of the HFIMS kernel function can be readily applied to HTDMA measurements by replacing the WFIMS transfer function with the transfer function of the second DMA in Eq. (8), as detailed in the Supplement.
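The matrix form of the forward model can be illustrated with a minimal sketch. The kernel below is a toy stand-in, not the HFIMS kernel of Eq. (8): the Gaussian-in-log response shape, the width `log_width`, and the function names `toy_kernel` and `forward` are all assumptions made for illustration. Only the bin counts and the 0.8-2.0 range follow the text.

```python
import numpy as np

def toy_kernel(n_size_bins=30, n_gf_bins=20, g_lo=0.8, g_hi=2.0, log_width=0.08):
    """Toy stand-in for the I x J kernel matrix M (not the real HFIMS kernel)."""
    g_edges = np.linspace(g_lo, g_hi, n_gf_bins + 1)        # GF bin bounds
    g_mid = 0.5 * (g_edges[:-1] + g_edges[1:])              # GF bin centers
    d_edges = np.geomspace(g_lo, g_hi, n_size_bins + 1)     # size bins, log-spaced
    d_mid = np.sqrt(d_edges[:-1] * d_edges[1:])             # geometric bin centers
    # Assumed response: Gaussian in log(size ratio) around each growth factor.
    z = (np.log(d_mid)[:, None] - np.log(g_mid)[None, :]) / log_width
    M = np.exp(-0.5 * z**2)
    M /= M.sum(axis=0, keepdims=True)   # each GF bin distributes unit probability
    M *= np.diff(g_edges)[None, :]      # fold the GF bin width dg_j into M
    return M, g_mid

def forward(M, c):
    """Discretized forward model R = M c."""
    return M @ np.asarray(c, dtype=float)
```

Because neighbouring columns of such a kernel overlap strongly, small perturbations in R can map to large changes in c, which is the ill-posedness referred to above.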

Inversion methods
A number of techniques have been developed to solve the Fredholm integral equation (Kandlikar and Ramachandran, 1999). With Eqs. (7) and (8), nonparametric algorithms can be straightforwardly applied to invert the GF-PDF; hence no prior knowledge of the functional form of the GF-PDF is needed.

Unregularized least squares
The simplest route is ordinary least squares (LSQ), which seeks to minimize the square of the residual:

$$\mathbf{c}_{\mathrm{LSQ}} = \arg\min_{\mathbf{c}} \left\| \mathbf{M}\mathbf{c} - \mathbf{R} \right\|_2^2,$$

where $\|\cdot\|_2$ denotes the Euclidean norm. Here, the least-squares solution is obtained using the lsqnonneg function from MATLAB. As the uncertainty in measurements can vary substantially for different $D_p^*$ bins, the residual is often weighted by the measurement uncertainty. A weighted LSQ (WLSQ) seeks to minimize the weighted sum of squares (Sipkens et al., 2020):

$$\mathbf{c}_{\mathrm{WLSQ}} = \arg\min_{\mathbf{c}} \left\| \mathbf{W}\left( \mathbf{M}\mathbf{c} - \mathbf{R} \right) \right\|_2^2,$$

where $\mathbf{W}$ denotes a diagonal weight matrix whose $i$th diagonal element is the reciprocal of the standard deviation for data point $i$.
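The weighted and unweighted least-squares problems can be sketched in a few lines. This is an illustrative NumPy version, not the MATLAB routine used in this study: it does not enforce the non-negativity constraint that lsqnonneg applies, and the function name `lsq` and its `sigma` argument are assumptions.

```python
import numpy as np

def lsq(M, R, sigma=None):
    """Unconstrained (weighted) least squares: argmin ||W (M c - R)||_2^2.

    sigma holds the standard deviation of each data point; W = diag(1/sigma).
    Passing sigma=None gives the unweighted problem (W = identity).
    """
    M = np.asarray(M, dtype=float)
    R = np.asarray(R, dtype=float)
    if sigma is None:
        w = np.ones_like(R)
    else:
        w = 1.0 / np.asarray(sigma, dtype=float)
    # Scaling row i of M and element i of R by w_i applies the diagonal W.
    c, *_ = np.linalg.lstsq(M * w[:, None], R * w, rcond=None)
    return c
```

For noise-free, consistent data the weighted and unweighted solutions coincide; they differ only once noise makes the system inconsistent.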

Tikhonov regularization
Tikhonov regularization is a common regularization method that overcomes noise amplification, and it has been used to invert aerosol size distributions and 2-D aerosol mass-mobility distributions (Talukdar and Swihart, 2003; Petters, 2021; Stolzenburg et al., 2022). In Tikhonov regularization, an additional regularization term is included in the least-squares approach:

$$\mathbf{c}_{\lambda} = \arg\min_{\mathbf{c}} \left\{ \left\| \mathbf{M}\mathbf{c} - \mathbf{R} \right\|_2^2 + \lambda^2 \left\| \mathbf{L}\mathbf{c} \right\|_2^2 \right\},$$

where $\lambda^2 \|\mathbf{L}\mathbf{c}\|_2^2$ represents the regularization term designed to minimize the derivative of a specific order, and $\lambda$ is the regularization parameter that controls the degree of regularization. The penalization matrix $\mathbf{L}$ is often set as the identity matrix $\mathbf{I}$; the bidiagonal (−1, 1) matrix; and the upper tridiagonal (1, −2, 1) matrix for the zeroth-, first-, and second-order regularization, respectively (Hansen and O'Leary, 1993; Hansen, 1994). The parametric L-curve of $\|\mathbf{M}\mathbf{c}_{\lambda} - \mathbf{R}\|_2$ vs. $\|\mathbf{L}\mathbf{c}_{\lambda}\|_2$ is plotted, and the corner of the L-curve with the maximum curvature is identified using the "Lcurve" routine from the regularization tools package developed by Hansen (1994). This optimal regularization parameter $\lambda$ corresponds to a good balance between minimization of the residual and reduction in the noise in the inverted $\mathbf{c}$ (Hansen, 1992; Hansen and O'Leary, 1993).
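Similarly, a weighted Tikhonov regularization (WTik) can be applied by (Sipkens et al., 2020)

$$\mathbf{c}_{\lambda} = \arg\min_{\mathbf{c}} \left\{ \left\| \mathbf{W}\left( \mathbf{M}\mathbf{c} - \mathbf{R} \right) \right\|_2^2 + \lambda^2 \left\| \mathbf{L}\mathbf{c} \right\|_2^2 \right\}.$$

The effect of introducing the weight in the LSQ inversion and Tikhonov regularization is examined in Sect. 3.2.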
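As an illustration of the Tikhonov approach, the penalized problem can be solved as an ordinary least-squares problem on a stacked system. The sketch below is a simplified NumPy version under stated assumptions: it is unconstrained, takes λ as given rather than selecting it from the L-curve, and the names `penalty_matrix` and `tikhonov` are hypothetical.

```python
import numpy as np

def penalty_matrix(n, order):
    """L for zeroth (identity), first (-1, 1) or second (1, -2, 1) order."""
    L = np.eye(n)
    for _ in range(order):
        L = np.diff(L, axis=0)   # each pass takes differences of adjacent rows
    return L

def tikhonov(M, R, lam, order=2):
    """Minimize ||M c - R||_2^2 + lam^2 ||L c||_2^2 via an augmented system."""
    M = np.asarray(M, dtype=float)
    R = np.asarray(R, dtype=float)
    L = penalty_matrix(M.shape[1], order)
    A = np.vstack([M, lam * L])                    # stack data and penalty rows
    b = np.concatenate([R, np.zeros(L.shape[0])])  # penalty rows target zero
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c
```

Increasing `lam` trades a larger residual for a smaller ‖Lc‖, which is the trade-off traced out by the L-curve.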

Twomey's method
Twomey's method is commonly used to find solutions for ill-posed problems and has been proven to be effective in inversions of the aerosol size distribution (Collins et al., 2002; Olfert et al., 2008) and aerosol mass-mobility distribution (Rawat et al., 2016; Sipkens et al., 2020). It is a nonlinear optimization method and provides iterative regularization. An initial guess solution is iteratively multiplied by small multiples of the HFIMS kernel function, which are proportional to the ratio of the measured to calculated measurements, as follows:

$$c_j^{k+1} = \left[ 1 + \left( \frac{R_i}{\mathbf{m}_i \mathbf{c}^k} - 1 \right) M_{ij} \right] c_j^k,$$

where $\mathbf{m}_i$ is the $i$th row of the HFIMS kernel matrix $\mathbf{M}$, $M_{ij}$ is its $j$th element, and $R_i / (\mathbf{m}_i \mathbf{c}^k)$ denotes the relative divergence between actual and reconstructed HFIMS measurements. The positively constrained, least-squares solution is set as the initial guess (Olfert et al., 2008). Then, the initial guess is smoothed using a three-term moving average (Markowski, 1987) and input into the iterative Twomey's routine, which is then repeated until a chi-squared criterion is satisfied. It is worth noting that Twomey's method may require sufficient counting statistics to ensure converged solutions.
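The multiplicative update of Twomey's method can be sketched as follows. This is a bare-bones illustration rather than the routine used in this study: it runs a fixed number of passes instead of testing a chi-squared criterion, omits the moving-average smoothing of the initial guess, and assumes the kernel is scaled so that all elements lie between 0 and 1. The function name `twomey` and the argument `n_iter` are assumptions.

```python
import numpy as np

def twomey(M, R, c0, n_iter=50):
    """Twomey's multiplicative update, cycling over the measurement channels.

    Assumes 0 <= M_ij <= 1 and R_i >= 0, so each factor is non-negative and
    a non-negative initial guess stays non-negative.
    """
    M = np.asarray(M, dtype=float)
    R = np.asarray(R, dtype=float)
    c = np.array(c0, dtype=float)          # copy so the caller's guess is kept
    for _ in range(n_iter):
        for i in range(M.shape[0]):
            pred = M[i] @ c                # reconstructed response of channel i
            if pred <= 0.0:
                continue                   # avoid dividing by zero
            # Scale each c_j by 1 + (R_i / pred - 1) * M_ij.
            c *= 1.0 + (R[i] / pred - 1.0) * M[i]
    return c
```

Each factor departs from unity only where channel i is sensitive to GF bin j, so corrections remain local to the kernel's support.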

Parametric LSQ fittings
The parametric fitting methods assume a known functional form for the GF-PDF and evaluate the forward model (Eq. 4) to reconstruct the HFIMS measurements. A nonlinear least-squares fitting with boundary constraints is performed to search for the least-squares solution within the bounds. The ML and PL fitting routines for the GF-PDF inversion from HFIMS measurements were developed by Wang et al. (2019), who also statistically studied the influence of counting statistics and GF-PDF parameters (i.e., the number of modes of the ML GF-PDF and the number of sections of the PL GF-PDF). In this work, the GF-PDFs inverted using ML and PL fitting routines with the optimized parameters are compared with those retrieved using the nonparametric inversion methods described above.
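For reference, a multiple-lognormal (ML) GF-PDF of the kind assumed by the parametric fit can be written out explicitly. The sketch below treats each mode's growth factor G as the geometric mean of that mode, which is an assumption; the function name `ml_gf_pdf` is hypothetical, and any mode values supplied to it for illustration are not those of Table 1.

```python
import numpy as np

def ml_gf_pdf(g, f, G, sigma):
    """Sum of lognormal modes evaluated at growth factors g.

    f: fractional weights, G: mode growth factors (taken as geometric
    means, an assumption), sigma: geometric standard deviations.
    """
    g = np.asarray(g, dtype=float)[:, None]                 # shape (n_g, 1)
    f, G, sigma = (np.asarray(x, dtype=float)[None, :] for x in (f, G, sigma))
    ln_s = np.log(sigma)
    modes = f / (np.sqrt(2.0 * np.pi) * g * ln_s) * np.exp(
        -0.5 * ((np.log(g) - np.log(G)) / ln_s) ** 2
    )
    return modes.sum(axis=1)                                # sum over modes
```

When the fractional weights sum to one, the resulting density integrates to one over g, as required of a GF-PDF.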

Generation of synthetic data to evaluate inversion algorithms
HFIMS measurements are synthesized to evaluate the performance of different inversion methods. The synthetic data are based on three representative GF-PDFs that consist of one, two, and three lognormal modes, respectively. The mode parameters of the pre-defined GF-PDFs are listed in Table 1, similar to those listed in Wang et al. (2019). The parameters $f$, $G$, and $\sigma$ are the fractional weight, mean diameter growth factor, and geometric standard deviation of each mode, respectively. The theoretical HFIMS response (i.e., $R_i$) is derived using Eq. (4) based on each of the three GF-PDFs, and Gaussian and Poisson noise is then added to the response using the following approach. First, a zero-mean Gaussian noise component is added to the theoretical HFIMS response to simulate system noise such as fluctuation in the sample flow rate:

$$R_i^{G} = R_i \left( 1 + \alpha\, n_i^{G} \right), \tag{13}$$

where $R_i^{G}$ is the response with Gaussian noise added, $R_i$ is the derived theoretical response of the $i$th $D_p^*$ bin, and $n_i^{G}$ is the $i$th element of a standard normally distributed random vector, $\mathbf{n}^{G}$, with zero mean and variance of 1. The magnitude of the Gaussian noise is controlled by a factor, $\alpha$. The HFIMS measurement is then simulated using the following Poisson distribution to reflect the discrete nature of the particle counting process:

$$P(x) = \frac{\left( R_i^{G} \right)^{x} e^{-R_i^{G}}}{x!}, \tag{14}$$

where $P(x)$ is the probability that $x$ particles are detected by the HFIMS in the $i$th $D_p^*$ bin (i.e., actual measurements). The impact of the Gaussian noise on the performance of the inversion methods is examined for different noise levels in Sect. 3.2. Five hundred sets of HFIMS measurements are generated using Monte Carlo methods with constant counting statistics (i.e., $R_{\mathrm{tot}}$ of 100). These synthetic HFIMS measurements are then used to evaluate the inversion methods described above. Note that in the forward model for deriving the theoretical HFIMS response (i.e., Eq. 4), a higher resolution of $g$ (i.e., 120 bins over 0.8-2.0) is used than that of the HFIMS kernel matrix (i.e., 20 bins of $g$; Eq. 8). The difference between the forward and inverse models, together with the inclusion of Gaussian and Poisson noise, minimizes the effect of the inverse crime (Colton and Kress, 1998).
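The two-step noise model described above (relative Gaussian noise followed by Poisson counting) can be sketched as follows. The multiplicative form of the Gaussian term, the clipping of negative means before the Poisson draw, and the function name `synthesize_counts` are assumptions of this illustration.

```python
import numpy as np

def synthesize_counts(R_theory, alpha=0.05, rng=None):
    """Draw one synthetic set of integer counts from a theoretical response."""
    rng = np.random.default_rng() if rng is None else rng
    R_theory = np.asarray(R_theory, dtype=float)
    # Step 1: zero-mean Gaussian noise with relative magnitude alpha.
    noisy = R_theory * (1.0 + alpha * rng.standard_normal(R_theory.shape))
    # A Poisson mean cannot be negative; this guard is a sketch-level choice.
    noisy = np.clip(noisy, 0.0, None)
    # Step 2: Poisson draw reflecting the discrete counting process.
    return rng.poisson(noisy)
```

Calling the function repeatedly with the same generator yields an ensemble of synthetic measurements in the manner of a Monte Carlo experiment.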

Results and discussion

Optimal numbers of growth factor bins and HFIMS size bins ($D_p^*$)
The numbers of GF bins ($J$) and $D_p^*$ bins ($I$) determine the dimensions of the HFIMS kernel matrix, which affects the inversion of the GF-PDF. The optimal number of $D_p^*$ bins is a trade-off between sizing resolution and counting statistics. Wang et al. (2019) examined the influence of the WFIMS $D_p^*$ bin number ($I$) on the inverted GF-PDF and found an optimal range of 23-32 for total particle counts of 100. For representative remote continental and urban aerosols, the number of particles measured by the HFIMS often exceeds 100 in 20 s (Pinterich et al., 2017b; Zhang et al., 2021), ensuring sufficient counting statistics for ambient measurements. The dynamic range of the WFIMS is roughly a factor of 10 in mobility, corresponding to a factor of ∼3 in the size range (Zhang et al., 2021). In this study, 30 size bins (i.e., $I = 30$) that are evenly spaced on a logarithmic scale over the WFIMS size range are used in the inversions.
The influence of the growth factor bin number ($J$) on the inverted GF-PDF is examined using the synthetic HFIMS measurements described above. The GF-PDF was inverted from each set of the simulated HFIMS measurements using different GF bin numbers ranging from 10 to 50 (i.e., corresponding to a GF resolution range of 0.024-0.12). To facilitate the comparison of GF-PDFs inverted with different GF bin numbers, we interpolate the inverted GF-PDFs to 120 fixed growth factors that are evenly distributed from 0.8 to 2.0. The average error in the inverted GF-PDF, $\gamma^2$, is computed from $c_{i,\mathrm{inv}}$ and $c_{i,\mathrm{sim}}$, the interpolated GF-PDF and the predefined GF-PDF (i.e., true values) at the 120 fixed growth factors, respectively, where $N$ is the number of fixed growth factors (i.e., 120). The smoothness of the inverted GF-PDF, $\xi$, is evaluated using the absolute second-order derivative. To evaluate how well the inverted GF-PDF reproduces the HFIMS measurement, we define the residual of the reconstructed HFIMS measurement (i.e., reconstruction error), $\chi^2$, from $\tilde{R}_{i,\mathrm{inv}}$, the normalized HFIMS measurement that is reconstructed using Eq. (7) (i.e., forward calculation), and $\tilde{R}_i$, the normalized synthetic HFIMS measurement (i.e., true values). Figure 1 shows the smoothness of the inverted GF-PDF ($\xi$) vs. the residual of the reconstructed HFIMS measurement ($\chi^2$) for different GF bin numbers ($J$). The variation in $\xi$ with $\chi^2$ exhibits an L-shaped curve for all three representative GF-PDFs. The initial increase in $J$ from 10 to 20 substantially improves the agreement between the reconstructed and simulated HFIMS measurements, as indicated by a much reduced $\chi^2$ value. At the same time, $\xi$ remains relatively small, indicating a high smoothness of the inverted GF-PDF. In contrast, an increase in $J$ above 20 leads to a minor reduction in the $\chi^2$ value but a drastic increase in $\xi$, suggesting strong noise in the inverted GF-PDF.
The optimal solution lies near the corner of the L-curve (Hansen and O'Leary, 1993), which strikes a balance between the smoothness and the fidelity to the HFIMS measurements. For all three pre-defined GF-PDFs, the corner of the L-curve corresponds to a $J$ value of 20. The GF-PDF inverted with 20 growth factor bins generally shows the smallest error ($\gamma^2$), indicating the best agreement between the inverted and the true GF-PDFs. Note that the above results are based on inversions using Twomey's method. The same type of L-curves for GF-PDFs inverted using unregularized LSQ and Tikhonov regularizations are shown in the Supplement (Sect. S2), and they also reveal a corner that corresponds to a $J$ value of 20. These results suggest an optimal $J$ value of 20 for a range of representative GF-PDFs and different inversion methods.
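The quantities used in this comparison can be sketched in code. The GF resolution follows directly from the 0.8-2.0 range and the bin number; the specific forms chosen below for the GF-PDF error (mean squared difference), the smoothness (sum of absolute second differences), and the reconstruction residual (sum of squared differences of normalized responses) are assumptions standing in for the definitions of γ², ξ, and χ², and all function names are hypothetical.

```python
import numpy as np

def gf_resolution(n_bins, g_lo=0.8, g_hi=2.0):
    """Width of one GF bin for n_bins evenly spaced bins."""
    return (g_hi - g_lo) / n_bins

def to_common_grid(g_mid, c_inv, n_points=120, g_lo=0.8, g_hi=2.0):
    """Interpolate an inverted GF-PDF to fixed, evenly spaced growth factors.

    np.interp holds the end values constant outside the range of g_mid.
    """
    g_fixed = np.linspace(g_lo, g_hi, n_points)
    return g_fixed, np.interp(g_fixed, g_mid, c_inv)

def gf_pdf_error(c_inv, c_sim):
    """Assumed form of gamma^2: mean squared difference on the common grid."""
    return float(np.mean((np.asarray(c_inv) - np.asarray(c_sim)) ** 2))

def roughness(c):
    """Assumed form of xi: sum of absolute second differences."""
    return float(np.abs(np.diff(c, n=2)).sum())

def reconstruction_residual(M, c, R):
    """Assumed form of chi^2, using responses normalized by their totals."""
    R_inv = np.asarray(M) @ np.asarray(c)
    R = np.asarray(R, dtype=float)
    return float(np.sum((R_inv / R_inv.sum() - R / R.sum()) ** 2))
```

With these helpers, 10 and 50 GF bins correspond to resolutions of 0.12 and 0.024, matching the range quoted above.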

Effect of measurement uncertainties
The uncertainty in HFIMS measurements consists mainly of normally distributed random instrumental noise (e.g., sample flow fluctuation) and Poisson noise due to counting statistics. As the uncertainty varies among different HFIMS $D_p^*$ bins, we first compare the performance of weighted and unweighted inversion methods, including LSQ and Tikhonov regularizations. For this comparison, inversion methods are applied to HFIMS data synthesized with $\alpha = 0.05$, a typical value used in previous studies (Gysel et al., 2009). A total of 500 sets of synthetic data are generated for each of the three pre-defined GF-PDFs. The values of the synthesized HFIMS response ($R_{i,s}$) are integers, which reflect the discrete nature of particle counting. For weighted LSQ and Tikhonov regularizations, the weight for each $D_p^*$ bin (i.e., the corresponding diagonal element of $\mathbf{W}$) is derived as $1/\sqrt{R_{i,s}}$. However, this approach leads to a weight of infinity when $R_{i,s}$ has a value of zero (i.e., no particle detected within the $D_p^*$ bin). To overcome this issue, we replace zero $R_{i,s}$ values with a fixed number $R_{i,\min}$ when deriving the weight. Figure 2 compares the reconstruction residual, the GF-PDF error, and the smoothness of the GF-PDF inverted using unweighted LSQ and weighted LSQ with $R_{i,\min}$ values of 1, 0.1, and 0.01, respectively. Whereas statistically no substantial difference is found in the smoothness of GF-PDFs inverted using unweighted and weighted LSQ, unweighted LSQ leads to a lower reconstruction residual and a lower error in the inverted GF-PDF compared to the weighted LSQ. For the weighted LSQ inversions, both the reconstruction residual and the error in the inverted GF-PDF increase with increasing weight for zero-valued $R_{i,s}$ (i.e., $1/\sqrt{R_{i,\min}}$). The measurement uncertainty is larger, and therefore the weight is lower, for channels with higher $R_{i,s}$, which correspond to higher probability densities (i.e., higher $c(g)$ values). As a result, the GF-PDF inverted using weighted LSQ may have relatively larger errors at high $c(g)$ values and consequently a larger average GF-PDF error ($\gamma^2$). The same comparisons are also carried out for weighted and unweighted Tikhonov algorithms, and again the weighted algorithms do not provide better performance (i.e., lower error in inverted GF-PDFs) than the unweighted ones. Therefore, subsequent analyses of this study are focused on unweighted algorithms for LSQ and Tikhonov regularizations. It is worth noting that derivation of the weight as $1/\sqrt{R_{i,s}}$ implicitly assumes that the noise in HFIMS measurements is due to counting statistics only, whereas the synthetic HFIMS data are generated with 5 % Gaussian noise. As shown next, the noise in the synthetic HFIMS data is dominated by the counting statistics. In addition, for real measurements, the level of Gaussian noise (i.e., $\alpha$) is often not accurately known. We also repeated the above comparisons by deriving the weight as $1/\sqrt{R_{i,s} + \alpha^2 R_{i,s}^2}$, which accounts for both Poisson and Gaussian noise. The results are essentially the same.
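The weighting scheme described above, including the replacement of zero counts, can be sketched as follows. The function name `counting_weights` and its argument names are assumptions; setting `alpha` to zero gives the Poisson-only weight, and a non-zero `alpha` gives the combined Poisson and Gaussian weight.

```python
import numpy as np

def counting_weights(R_s, r_min=1.0, alpha=0.0):
    """Diagonal elements of W: 1 / sqrt(R + alpha^2 R^2).

    Zero counts are replaced by r_min first, so the weight stays finite.
    """
    R = np.asarray(R_s, dtype=float)
    R = np.where(R == 0.0, r_min, R)
    return 1.0 / np.sqrt(R + (alpha * R) ** 2)
```

A smaller `r_min` assigns a larger weight to empty bins, so the choice among 1, 0.1, and 0.01 controls how strongly zero-count bins constrain the fit.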
The effect of the level of Gaussian noise on the inverted GF-PDF is examined. Synthetic HFIMS measurements are generated following the approach described above (Eqs. 13 and 14) at four Gaussian noise levels (i.e., α = 0 %, 1 %, 5 %, and 10 %). At each α level, 500 sets of synthetic data are generated and inverted using Twomey's method for each of the three pre-defined GF-PDFs. All retrieved inversion parameters, including the reconstruction residual, the GF-PDF error, and the smoothness, are statistically the same for all four Gaussian noise levels (Fig. 3), indicating that HFIMS measurement noise is dominated by counting statistics, and the inclusion of the Gaussian noise has a negligible impact on the GF-PDF inverted by Twomey's method. Similarly, the impact of Gaussian noise is also negligible for the GF-PDF inverted using unweighted LSQ and zeroth-, first-, and second-order Tikhonov regularizations (not shown).
We also challenged the inversion algorithms with different forward and inverse models to simulate the scenarios when DMA or WFIMS is not perfectly calibrated. A different DMA or WFIMS transfer function width is used to generate the synthetic HFIMS measurements than that used to calculate the inversion matrix. We found that up to ±20 % variation in the DMA or WFIMS transfer function width has negligible impacts on the inverted GF-PDF. The results and discussion are detailed in Sect. S3.

Comparisons of different inversion methods
The performances of the different inversion methods described in Sect. 2.2 are systematically compared. A total of 500 sets of synthetic HFIMS data are generated and inverted for each of the three pre-defined GF-PDFs. For all nonparametric methods, the inversions were carried out using the optimal numbers of GF bins ($J$) and $D_p^*$ bins ($I$): 20 and 30, respectively. Figure 4 shows the residual of reconstructed HFIMS measurements ($\chi^2$), the smoothness ($\xi$), the error in the inverted GF-PDF ($\gamma^2$), and the computing time for different inversion methods. Compared with their parametric counterparts (i.e., ML and PL least-squares fitting), the nonparametric methods generally retrieve more accurate GF-PDFs. Note that the ML least-squares fitting occasionally fails to converge to a valid solution, resulting in abnormally large errors in the inverted GF-PDFs, particularly for the pre-defined GF-PDFs with two and three modes. This may be because the assumed spectral shape of the GF-PDFs or the finite range of the boundary constraints causes the search for a least-squares solution to fail in the presence of random noise. Among all nonparametric inversion methods, the unregularized LSQ provides the solution with the lowest reconstruction residual but the largest noise and error in the inverted GF-PDFs, consistent with the noise amplification in unregularized methods. In comparison, regularized inversion methods generally produce smoother solutions at the expense of increased reconstruction residuals. Among the different Tikhonov regularization methods, higher-order regularizations (i.e., first- and second-order) tend to produce smoother solutions, although the errors in the inverted GF-PDF are very similar statistically. The $\xi$ value of the GF-PDF inverted using first- and second-order Tikhonov regularizations increases with the mode number of the GF-PDF, consistent with the increasingly complex spectral shape of the GF-PDF.
Overall, Twomey's method outperforms the Tikhonov regularization methods regardless of the shapes of the pre-defined GF-PDFs. On average, the GF-PDF inverted using Twomey's method has the smallest error (γ²) and lowest ξ value, indicative of the best performance. Note that the results are based on synthetic data generated with relatively low counting statistics (i.e., $R_\mathrm{tot}$ of 100). We also synthesized HFIMS data with $R_\mathrm{tot}$ of 500 and compared the performance of different inversion methods for measurements with the improved counting statistics, and the results are consistent with those shown in Fig. 4 (Fig. S7). We therefore expect the results to reflect the general performances of different inversion methods for a typical range of counting statistics of HFIMS measurements.
Figure 2. Comparison of reconstruction residual (χ²) (a), the GF-PDF error (γ²) (b), and the smoothness (ξ) (c) of GF-PDFs inverted using the unweighted and weighted LSQ methods with different weighting schemes for zero-value $D_p^*$ bins (i.e., replacing zero values by 1, 0.1, and 0.01, respectively). Colors correspond to the pre-defined GF-PDFs with one mode (blue), two modes (orange), and three modes (yellow). The results are averages based on inversions of 500 sets of synthetic HFIMS data for each of the three pre-defined GF-PDFs.
Figure 3. Comparison of reconstruction residual, χ² (a), the GF-PDF error, γ² (b), and the degree of smoothing, ξ (c) of GF-PDFs inverted using Twomey's method from synthetic HFIMS data with additional Gaussian noise of different levels (i.e., none, 1 %, 5 %, and 10 %). Colors correspond to the pre-defined GF-PDFs with one mode (blue), two modes (orange), and three modes (yellow). The results are averages based on inversions of 500 sets of synthetic HFIMS data for each of the three pre-defined GF-PDFs.
Figure 4d shows that once the matrix is generated, the nonparametric methods require a much shorter computing time than the parametric fitting methods. Here, the computing time was recorded on a desktop with an eighth-generation Intel Core i7-8700 processor. On average, a single run of the unregularized LSQ (i.e., the lsqnonneg function in MATLAB) requires ∼ 1 s for all three pre-defined GF-PDFs, and the computing times of all other nonparametric methods are similar (differing by at most ∼ 4 %), indicating comparable computing efficiency. In contrast, the computing time required by either the ML or PL least-squares fitting routine is more than one order of magnitude longer.
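The trade-off between reconstruction residual and smoothness among the nonparametric methods compared in this section can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration only: the Gaussian kernel stands in for the actual HFIMS kernel matrix, the single-mode GF-PDF, the value λ = 1, and the helper names (diff_matrix, tikhonov, residual, roughness) are hypothetical, and the non-negativity constraint applied by lsqnonneg is omitted to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical grids: J = 20 GF bins and I = 30 measurement bins.
J, I = 20, 30
gf = np.linspace(0.8, 2.0, J)
x = np.linspace(0.8, 2.0, I)

# Assumed Gaussian kernel standing in for the HFIMS kernel matrix M (I x J).
M = np.exp(-0.5 * ((x[:, None] - gf[None, :]) / 0.08) ** 2)

# Hypothetical single-mode GF-PDF, scaled so the expected total count is 100.
c_true = np.exp(-0.5 * ((gf - 1.4) / 0.06) ** 2)
c_true *= 100.0 / (M @ c_true).sum()
R = rng.poisson(M @ c_true).astype(float)  # Poisson counting noise


def diff_matrix(n, order):
    """Finite-difference operator: identity (0), first (1), or second (2) order."""
    L = np.eye(n)
    for _ in range(order):
        L = np.diff(L, axis=0)
    return L


def tikhonov(M, R, lam, order):
    """Minimize ||M c - R||^2 + lam^2 ||L c||^2 via a stacked least-squares system."""
    L = diff_matrix(M.shape[1], order)
    A = np.vstack([M, lam * L])
    b = np.concatenate([R, np.zeros(L.shape[0])])
    return np.linalg.lstsq(A, b, rcond=None)[0]


def residual(c):
    """Sum of squared differences between reconstructed and measured counts."""
    return float(np.sum((M @ c - R) ** 2))


def roughness(c):
    """Sum of squared first differences of the solution."""
    return float(np.sum(np.diff(c) ** 2))


# Unregularized least squares (no non-negativity constraint in this sketch).
c_lsq = np.linalg.lstsq(M, R, rcond=None)[0]
# Zeroth-, first-, and second-order Tikhonov with an arbitrary lambda of 1.
c_tik = {order: tikhonov(M, R, 1.0, order) for order in (0, 1, 2)}

print("LSQ      ", residual(c_lsq), roughness(c_lsq), np.sum((c_lsq - c_true) ** 2))
for order, c in c_tik.items():
    print(f"Tikhonov{order}", residual(c), roughness(c), np.sum((c - c_true) ** 2))
```

By construction, the unregularized solution has the smallest reconstruction residual, and the first-order Tikhonov solution cannot be rougher than it in the sense of the first-difference penalty. How the errors relative to the true GF-PDF compare depends on the kernel, the noise, and λ.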

Comparison of Tikhonov regularization and Twomey's method
In this section, we investigate why Twomey's method performs better than Tikhonov regularizations. The Tikhonov-regularized solutions depend on the regularization parameter, λ. The value of λ is often determined by heuristic methods, including the L-curve approach (Hansen and O'Leary, 1993) and the Hanke-Raus rule (Hanke and Raus, 1996). The L-curve approach determines λ by seeking a trade-off between minimizing the residual term and minimizing the regularization term (i.e., roughness of the solution), and the Hanke-Raus rule selects a computable λ that minimizes the λ-dependent residual term $\frac{1}{\lambda}\lVert \mathbf{M}\mathbf{c}_{\mathrm{Tik}}(\lambda) - \mathbf{R}\rVert_2$ (Hanke and Raus, 1996; Sipkens et al., 2020). As the pre-defined GF-PDFs are known for the synthetic HFIMS data, the value of λ can also be optimized by comparing the inverted GF-PDF with the true solution, i.e., minimizing the error in the inverted GF-PDF (γ²). Figure 5 shows the comparison of the statistics of inversions using LSQ, Twomey's method, and first-order Tikhonov. The results are averages based on inversions of 500 sets of synthetic HFIMS data for each of the three pre-defined GF-PDFs. Here, the first-order Tikhonov regularization is chosen as it shows better performance (i.e., lower GF-PDF error) than zeroth- and second-order Tikhonov regularizations (Fig. 4). The Tikhonov regularization parameter is identified by each of three methods: (1) the L-curve, (2) the Hanke-Raus rule, and (3) optimization through minimizing the error in inverted GF-PDFs. It is worth noting that the third method is not feasible for real measurements as the true GF-PDF is unknown. Figure 5b shows that the Tikhonov regularization with the optimized λ (i.e., the third method) provides the most accurate solution (i.e., lowest GF-PDF error) and outperforms Twomey's method.
However, when the λ derived using the L-curve approach or the Hanke-Raus rule is used, the GF-PDF inverted using first-order Tikhonov regularization generally has a larger error (γ²) than that inverted using Twomey's method. The above comparisons indicate that while the Tikhonov regularization can outperform Twomey's method in theory, the optimal regularization parameter λ cannot be obtained reliably using existing methods in practice, leading to inferior performance compared to Twomey's method. For example, the L-curve approach does not work well if the curvature of the L-curve is negative everywhere, and in such a scenario, the leftmost point (i.e., with smaller λ) on the L-curve is taken as the corner (Hansen, 1994), leading to insufficient regularization of the solution (Naseri et al., 2021). On the other hand, the Hanke-Raus rule often chooses a much larger λ than the optimal value, which results in over-smoothed solutions with even larger errors. We also carried out similar comparisons of Twomey's method with zeroth- and second-order Tikhonov regularizations using λ values derived from the three different methods, and the results are consistent with those shown in Fig. 5.
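The difference between a heuristic choice of λ and the error-optimized choice can be sketched as follows. This is an illustrative example under stated assumptions, not the implementation used in this study: the Gaussian kernel, the two-mode GF-PDF, the λ search grid, and all variable names are hypothetical, the non-negativity constraint is omitted, and the Hanke-Raus criterion is taken in the simple form given in the text (residual norm divided by λ).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical grids and Gaussian stand-in for the HFIMS kernel matrix.
J, I = 20, 30
gf = np.linspace(0.8, 2.0, J)
x = np.linspace(0.8, 2.0, I)
M = np.exp(-0.5 * ((x[:, None] - gf[None, :]) / 0.08) ** 2)

# Hypothetical two-mode GF-PDF, scaled so the expected total count is 100.
c_true = (0.5 * np.exp(-0.5 * ((gf - 1.1) / 0.05) ** 2)
          + 0.5 * np.exp(-0.5 * ((gf - 1.5) / 0.05) ** 2))
c_true *= 100.0 / (M @ c_true).sum()
R = rng.poisson(M @ c_true).astype(float)

# First-order difference operator used as the regularization matrix.
L1 = np.diff(np.eye(J), axis=0)


def tikhonov_first_order(lam):
    """Minimize ||M c - R||^2 + lam^2 ||L1 c||^2 (unconstrained sketch)."""
    A = np.vstack([M, lam * L1])
    b = np.concatenate([R, np.zeros(J - 1)])
    return np.linalg.lstsq(A, b, rcond=None)[0]


# Assumed search grid for lambda; the selected values depend on this range.
lams = np.logspace(-3, 2, 60)
solutions = [tikhonov_first_order(lam) for lam in lams]

# Hanke-Raus criterion as stated in the text: (1 / lambda) * ||M c - R||_2.
hanke_raus = np.array(
    [np.linalg.norm(M @ c - R) / lam for c, lam in zip(solutions, lams)]
)
# Error relative to the known true GF-PDF (only available for synthetic data).
gf_error = np.array([np.sum((c - c_true) ** 2) for c in solutions])

lam_hr = lams[np.argmin(hanke_raus)]
lam_opt = lams[np.argmin(gf_error)]

print("Hanke-Raus lambda:", lam_hr, "error:", gf_error[np.argmin(hanke_raus)])
print("optimized lambda: ", lam_opt, "error:", gf_error.min())
```

The optimized λ gives the smallest error on the grid by definition, so any heuristic can at best match it. Note that on a finite grid the simple criterion above may keep decreasing toward the largest λ, so the search range chosen here is itself an assumption that affects the heuristic result.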
The nonparametric inversion methods are also applied to HFIMS measurements of ambient particles with a dry diameter of 35 nm (Zhang et al., 2021), as detailed in the Supplement (Sect. S5). As the true GF-PDF of ambient aerosols is unavailable, the accuracy of the inversion methods cannot be directly compared. Nevertheless, the comparison of the reconstruction residual and the smoothness of the inverted GF-PDF paints a similar picture: Twomey's method strikes a good balance between the smoothness of the inverted GF-PDF and the fidelity in reproducing the HFIMS measurements, and it likely outperforms Tikhonov regularizations in practice.

Inversion by Twomey's method
As Twomey's method is shown to be the best among all inversion methods examined, we characterize the accuracy of the GF-PDFs inverted using Twomey's method and the recovered mode parameters. Figure 6 compares the GF-PDFs inverted using the optimized GF and $D_p^*$ bin numbers with the pre-defined GF-PDFs. The reconstructed and the simulated HFIMS measurements are also presented in the top panel. Both the inverted GF-PDF and reconstructed HFIMS measurements are averaged over the inversions of 500 sets of synthetic data. The results demonstrate excellent agreement of the reconstructed HFIMS measurements with the synthetic data (i.e., simulated HFIMS measurements) for all three pre-defined GF-PDFs. Both the spectral shapes and peak locations of the inverted GF-PDFs agree well with those of the pre-defined GF-PDFs. In addition, the GF-PDFs inverted using Twomey's method agree better with the pre-defined GF-PDFs than those inverted using the parametric least-squares approaches (i.e., ML and PL GF-PDFs; Wang et al., 2019).
Figure 5. The reconstruction residual (χ²) (a), the GF-PDF error (γ²) (b), and the smoothness (ξ) (c) of the GF-PDF inverted using LSQ, first-order Tikhonov regularization with the regularization parameter derived from three different approaches (L-curve, Hanke-Raus rule, and optimized λ), and Twomey's method. The colors correspond to the pre-defined GF-PDFs with one mode (blue), two modes (orange), and three modes (yellow).
Figure 6. (a-c) Comparisons between the averaged reconstructed HFIMS measurements and the simulated HFIMS measurements corrupted with Poisson noise for pre-defined GF-PDFs of one mode (a), two modes (b), and three modes (c), respectively. (d-f) Comparisons between the pre-defined GF-PDFs and the GF-PDFs inverted using Twomey's method with the optimized number of GF bins. The shaded area represents GF-PDF solution spaces within 1 standard deviation.
To quantify the accuracy of the inverted GF-PDFs, we fitted the inverted GF-PDFs to recover the mode parameters as shown in Table 2. The pre-set mode parameters of the predefined GF-PDFs are shown in Table 1. The results show that both the mode geometric means and the multimodal number fractions can be recovered accurately with minor uncertainties.
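For readers unfamiliar with the iterative scheme evaluated in this section, the classic multiplicative Twomey update can be sketched as below. This is a minimal illustration and not the code used in this study: the Gaussian kernel, the two-mode GF-PDF, the flat first guess, the fixed iteration count, and the function name twomey are all assumptions, and the variant applied to HFIMS data may differ in its initialization, smoothing, and stopping criterion.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical grids and Gaussian stand-in for the HFIMS kernel matrix.
J, I = 20, 30
gf = np.linspace(0.8, 2.0, J)
x = np.linspace(0.8, 2.0, I)
M = np.exp(-0.5 * ((x[:, None] - gf[None, :]) / 0.08) ** 2)

# Hypothetical two-mode GF-PDF, scaled so the expected total count is 100.
c_true = (0.4 * np.exp(-0.5 * ((gf - 1.1) / 0.05) ** 2)
          + 0.6 * np.exp(-0.5 * ((gf - 1.5) / 0.05) ** 2))
c_true *= 100.0 / (M @ c_true).sum()
R = rng.poisson(M @ c_true).astype(float)


def twomey(M, R, n_iter=50, tol=None):
    """Multiplicative Twomey iteration (illustrative sketch).

    Each measurement channel i rescales the solution by
    1 + (R_i / (M c)_i - 1) * K_ij, where K is the kernel with every row
    scaled to a maximum of 1. Starting from a positive guess, the solution
    stays non-negative; stopping after a finite number of sweeps limits
    noise amplification.
    """
    K = M / M.max(axis=1, keepdims=True)
    # Flat, positive first guess that reproduces the total measured counts.
    c = np.full(M.shape[1], R.sum() / M.sum())
    for _ in range(n_iter):
        for i in range(M.shape[0]):
            predicted = max(M[i] @ c, 1e-12)  # guard against division by zero
            ratio = R[i] / predicted
            c = c * (1.0 + (ratio - 1.0) * K[i])
        # Optional early stop once the squared residual drops below tol.
        if tol is not None and np.sum((M @ c - R) ** 2) < tol:
            break
    return c


c_twomey = twomey(M, R)
print("residual:", np.sum((M @ c_twomey - R) ** 2))
print("GF-PDF error:", np.sum((c_twomey - c_true) ** 2))
```

Because every update factor is non-negative, no explicit non-negativity constraint is needed, which is one practical difference from the Tikhonov formulations. The number of sweeps plays the role of the regularization parameter in this sketch.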

Conclusion
In this study, we develop and evaluate nonparametric regularized methods for inverting the GF-PDF from HFIMS measurements. The integrated response of HFIMS, which is a convolution of the aerosol hygroscopic GF-PDF, the transfer function of the DMA, and the transfer function of the WFIMS, is first cast into a matrix form. With the matrix form, nonparametric regularized methods can be applied straightforwardly to invert the GF-PDF. Synthetic HFIMS measurements are generated using Monte Carlo simulations for representative aerosol GF-PDFs, and the synthetic data are used to investigate the dependence of the inverted GF-PDF on the number of GF bins (i.e., GF resolutions) and the performances of different inversion methods. We find an optimal GF bin number of 20 for all nonparametric methods and representative GF-PDFs. The performances of unregularized least squares, Twomey's algorithm, Tikhonov regularizations, and commonly used parametric inversion methods (i.e., ML and PL least-squares fitting) are compared. Nonparametric methods based on the matrix form have substantial advantages in the inversion of the GF-PDF over the parametric fitting methods as (1) no prior assumption of GF-PDF distributions is required; (2) the matrix-based form facilitates the application of different regularizations (e.g., Tikhonov regularization and Twomey's iterative regularization), which reduce the error in the inverted GF-PDF by suppressing noise amplification; and (3) they are much more computationally efficient once the matrix is generated. The Tikhonov-regularized solutions depend on the regularization parameter, λ. While the Tikhonov regularization can outperform Twomey's method in theory, the optimal λ value cannot be obtained reliably using existing methods in practice, leading to inferior performance compared to Twomey's method. On average, the GF-PDF inverted using Twomey's method has the smallest error compared to solutions using the other inversion methods regardless of the shapes of the pre-defined GF-PDFs, and it accurately reproduces the true GF-PDF, including the mode parameters and other key statistics.
Table 2. Mode parameters recovered from the inverted GF-PDFs (three values per mode, in the order given in the source).
Number of modes    Mode 1                                  Mode 2                                  Mode 3
1                  1.00 ± 0, 1.39 ± 0.03, 1.09 ± 0.01      n/a                                     n/a
2                  0.46 ± 0.10, 1.10 ± 0.02, 1.02 ± 0.01   0.54 ± 0.10, 1.30 ± 0.02, 1.03 ± 0.01   n/a
3                  0.37 ± 0.09, 1.05 ± 0.03, 1.05 ± 0.02   0.34 ± 0.13, 1.40 ± 0.03, 1.03 ± 0.02   0.28 ± 0.12, 1.69 ± 0.08, 1.06 ± 0.03
n/a: not applicable.
Author contributions. JZ and JW designed the study. JZ developed the code. JZ and JW prepared the manuscript with contributions from all co-authors.
Competing interests. The contact author has declared that neither they nor their co-authors have any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.