Tikhonov regularization is a tool for reducing noise amplification
during data inversion. This work introduces RegularizationTools.jl,
a general-purpose software package for applying Tikhonov regularization
to data. The package implements well-established numerical algorithms
and is suitable for systems of up to

Atmospheric aerosol plays an important role in shaping the microphysics
of clouds and the Earth's climate

Differential mobility analyzers (DMAs) select particles as a function
of their size, charge, and an applied voltage. DMAs and tandem DMAs
are widely used to measure aerosol size distributions and the distributions
of aerosol physicochemical properties

Humidified tandem DMAs select a mobility diameter,
pass this quasi-monodisperse aerosol through a humidification system,
and then measure the humidified mobility response function using a
second DMA operated in stepping or scanning mode

The inverse solution of ill-posed problems is characterized by strong
sensitivity to noise superimposed on the data. Regularization methods
are needed to relate an observed instrument response to the underlying
physical property of the system under investigation. A common inverse
method is

To date,

This work revisits the challenge of performing an SMPS-style inversion
of the humidified mobility distribution to retrieve the growth factor
frequency distribution while also accounting for multiply charged
particles.

Sections 2.1 and 2.2 use the following linear algebra notation. Capital
bold roman letters denote matrices (

The formalisms closely follow the description in

The analytical solution for Eq. (2) is the regularized normal equation

The L-curve method involves a plot of

The generalized cross-validation estimator presents a mathematical
shortcut to compute the leave-one-out cross-validation estimate. Leave-one-out
cross-validation removes one point from the data, fits a model to the
remaining points, computes the error between the model prediction and
the omitted data point, and then averages the result over all omitted
points. The estimator is given by
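The leave-one-out procedure that the estimator shortcuts can be written out by brute force. The following is a minimal sketch in pure Python, not the package's implementation. It assumes the simplest regularized normal equation, with an identity filter matrix and a zero a priori estimate, and the names `solve_linear`, `tikhonov`, and `loo_cv_error`, the kernel `A`, the data `b`, and the grid of regularization parameters are hypothetical and chosen for illustration only.

```python
def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def tikhonov(A, b, lam):
    """Solve (A^T A + lam^2 I) x = A^T b.

    Assumption: identity filter matrix and zero a priori estimate.
    """
    m, n = len(A), len(A[0])
    lhs = [[sum(A[k][i] * A[k][j] for k in range(m))
            + (lam ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    rhs = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve_linear(lhs, rhs)


def loo_cv_error(A, b, lam):
    """Brute-force leave-one-out error: drop row k, refit, predict b[k]."""
    m = len(b)
    total = 0.0
    for k in range(m):
        A_k = A[:k] + A[k + 1:]
        b_k = b[:k] + b[k + 1:]
        x = tikhonov(A_k, b_k, lam)
        predicted = sum(a * xi for a, xi in zip(A[k], x))
        total += (predicted - b[k]) ** 2
    return total / m


# Hypothetical smoothing kernel and noisy response, for illustration only.
A = [[0.50, 0.25, 0.00, 0.00],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.00, 0.00, 0.25, 0.50]]
b = [0.26, 0.74, 0.77, 0.23]
grid = [0.01, 0.03, 0.1, 0.3, 1.0]

# Pick the regularization parameter with the smallest leave-one-out error.
best_lam = min(grid, key=lambda lam: loo_cv_error(A, b, lam))
```

Each evaluation of `loo_cv_error` refits the model once per data point, which is why a closed-form shortcut is attractive for anything larger than a toy problem.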

Equation (3) can be solved straightforwardly using any software
that supports linear algebra operations. This brute-force approach,
however, is slow. Efficient algorithms to solve Eqs. (3) and (4) have
been developed. The algorithms used here are briefly described. If

The inverse problem can be solved using specific methods. Here, method
refers to the content of the filter matrix

https://github.com/matthieugomez/LeastSquaresOptim.jl (last access: 10 December 2021).

library. The net result is an optimized solution that is within the specified upper and lower bounds. The upper and lower bounds are vectors of the same size as

The design matrix can be obtained from a forward model
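One common way to obtain a design matrix from a linear forward model is to apply the model to each unit vector in turn and store the responses as columns. The sketch below illustrates that idea in pure Python. It is not the package's own routine; `design_matrix` and `smoothing_model` are hypothetical names, and the three-point smoothing kernel stands in for a real instrument model.

```python
def design_matrix(forward, n):
    """Assemble the matrix column by column: column j is forward(e_j).

    Assumption: `forward` is linear and maps a list of length n to a
    list of responses.
    """
    columns = []
    for j in range(n):
        unit = [0.0] * n
        unit[j] = 1.0
        columns.append(forward(unit))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def smoothing_model(x):
    """Hypothetical linear instrument: three-point weighted average."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(0.25 * left + 0.5 * x[i] + 0.25 * right)
    return out


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = design_matrix(smoothing_model, 3)
```

For a linear model, `matvec(A, x)` reproduces `smoothing_model(x)` for any input, which is a convenient check that the matrix was assembled correctly.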

Differential mobility analyzers consist of two electrodes held at
a constant or time-varying electric potential. Cylindrical

The DMA selects particles by electrical mobility. The relationship
between mobility and mobility diameter is well known and well defined.
The relationship is given, for example, in Eq. (2) in

The traditional mathematical formulation of transfer through the DMA
is summarized in

The integrated response downstream of a tandem DMA that is operated
at voltages

A disadvantage of the computational approach compared to the traditional
mathematical approach is that computation lacks standardization of
notation. This can blur the line between general pseudo-code and language-specific syntax. Some of the applied computing concepts may be less
widely known when compared to standard mathematical approaches. Nevertheless,
the author believes that the advantages of the computational approach
outweigh the drawbacks. Therefore, this work builds upon the expressions
reported in

The computational language includes a standardized representation
of aerosol size distributions, operators to construct expressions,
and functions to evaluate the expressions. Size distributions are
represented as a histogram and internally stored in the form of the

Generic functions are used to evaluate expressions. The function

DMA geometry, dimensions, and configuration are abstracted into composite
types

To help with parsing the expression,

The mobility distribution exiting the humidity conditioner and before
entering DMA 2 in the humidified tandem DMA is evaluated using the
expression

The total humidified apparent

If the aerosol is externally mixed, the humidified apparent growth
factor distribution function exiting DMA 2 is given by

For purposes of the forward model, the mobility grid for DMA 1 is
discretized at a resolution of

Figure 1 shows an example application of Eq. (18) for an input growth
factor frequency distribution where all particles are assumed to have
the same growth factor of

Figure 2 shows the relationship between four illustrative growth factor
frequency distributions and the modeled apparent mobility distribution
functions. The apparent mobility distribution function represents
the raw particle concentration that would be measured by a detector
as a function of the apparent

Simulated examples are used to test if Eq. (18) is invertible. Figure 3 shows an example simulation for the Bimodal growth factor
distribution test case. The humidified apparent growth factor distributions
are calculated using Eq. (18). The noise-free example corresponds
to

Figure 4 is similar to Fig. 3, showing an example simulation for aerosol with uniform composition; i.e., all particles have the
same growth factor. Although the

The total number of composable regularization methods according to
Eq. (5) is 24. Half of these methods do not include lower and upper
bounds, and these are not suitable for tandem DMA inversion due to
the negative and oscillatory solutions for narrow inputs. The remaining
12 methods have been systematically tested using Monte Carlo analysis
described in detail in the supporting information. Briefly, 60 000
inversions were performed on synthetic data similarly to the examples
shown in Figs. 3 and 4. The total number concentration, dry diameter,
number of bins, and random seeds were varied, and the root mean square
error was evaluated for each simulation. Results compiled in Fig. S1 show that all of the methods perform equally well for the Bimodal,
Uniform, and Truncated examples shown in Fig. 3. Method

An alternative approach to fit single-component data is to perform
a nonlinear least-squares fit to match the apparent growth factor
distribution using the forward model while restricting the number
of compositions to either one or two. This corresponds to a two- or
four-parameter fit. Results from this procedure are either one or
two growth factors and one or two fractions. The corresponding methods
are denoted as LSQ

Which method, however, should be selected when inverting real-world
data and when the number of components is unknown? Since the true solution
is also unknown, the root mean square error between the truth and
reconstruction is unavailable. It is, however, possible to compute
the residual between the measured apparent growth factor distribution
and the predicted apparent growth factor distribution from different
reconstructions. A large residual can be used to flag truncated oscillatory
solutions such as

Note, however, that low residuals between the apparent growth factor distribution and the model do not automatically ensure that the algorithm has found a good or adequate solution. Additional tests should be performed to validate the physical plausibility of the solution. For example, the retrieved growth factors should be physically plausible at the applied relative humidity. The mode of the apparent growth factor distribution and the mode of the inverted growth factor distribution should be similar. A histogram of the root mean square error between the measured and predicted apparent growth factor distributions can be plotted for a large dataset. Visual inspection of fits with a large root mean square error can be used to derive a threshold above which reconstructions are automatically rejected. The integrated probability density function of the reconstructions should be near unity. Deviations from unity may occur due to concentration errors between the size distribution measurement and the growth factor distribution measurement, unaccounted transmission losses, and errors from the inversion. Reconstructions deviating significantly from unity should be flagged and rejected.
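Two of these checks, the residual threshold and the integral of the reconstruction, are simple to automate. The following is a minimal sketch in pure Python under stated assumptions: the function names, the rectangle-rule integration over bin widths, and the default thresholds `rmse_max` and `unity_tol` are all hypothetical and would need to be derived from the dataset at hand.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted responses."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def integrated_pdf(pdf, bin_widths):
    """Integrate a binned probability density (rectangle rule)."""
    return sum(p * w for p, w in zip(pdf, bin_widths))


def check_reconstruction(measured, predicted, pdf, bin_widths,
                         rmse_max=0.1, unity_tol=0.2):
    """Flag a reconstruction using the residual and the unity check.

    rmse_max and unity_tol are placeholder thresholds, not recommended values.
    """
    residual = rmse(measured, predicted)
    total = integrated_pdf(pdf, bin_widths)
    accepted = residual <= rmse_max and abs(total - 1.0) <= unity_tol
    return {"rmse": residual, "integral": total, "accepted": accepted}


# Hypothetical two-bin example.
result = check_reconstruction(
    measured=[1.0, 2.0],
    predicted=[1.0, 2.1],
    pdf=[2.0, 2.0],
    bin_widths=[0.25, 0.25],
)
```

The returned `rmse` values can be collected over a large dataset to build the histogram from which a rejection threshold is chosen.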

A limitation of the above approach is that the forward model (and
thus matrix

Aerosol size distribution data to contrast inversion schemes were
obtained from measurements taken at Bodega Marine Laboratory (39

Aerosol size distribution and humidified tandem DMA data to illustrate
the tandem DMA inversion schemes were taken from measurements made
by the US Department of Energy (DOE) Atmospheric Radiation Measurement
(ARM) program. The Southern Great Plains (SGP) site is located in
Lamont, OK, USA (36

The instruments and measurements are part of the Aerosol Observing System

The matrix

Figure 5 shows a real-world example size distribution response function
gridded into 120 size bins. The total particle concentration is

Time evolution of the normalized particle size distributions collected
between 16 January and 7 March at Bodega Marine Laboratory. The
normalization is for each size distribution such that the maximum
of the spectral density equals unity. The red color visualizes
the time evolution of the mode diameter of the dominant mode. Top
panel: inverted using

Figure 6 shows the time evolution of the normalized particle size
distributions over a 7-week period. The normalization is to highlight
changes in the mode diameter(s). In general, the aerosol at the site
is dominated by continental rural background conditions and the land–sea
breeze circulation

Figure 7 shows real-world examples of growth factor frequency distributions
for five dry sizes. Also shown for context is the evolution of the
normalized aerosol number size distribution. Figure 7 shows dynamic
evolution of the size distribution with sudden changes in mode diameter,
several apparent new particle formation events, and several prolonged
modal growth events. The distribution of the methods selected for
best inversion was LSQ

RegularizationTools.jl is a general-purpose software package
to invert data using

The software package can be used to simplify the prototyping of a
wide variety of inverse problems that arise in science and engineering
applications. Although the package does not add any novel regularization
methods, it provides a systematic method to categorize inversion methods
via the expression in Eq. (5). A total of 24 basic permutations can
be combined with a set of hyperparameters to attempt the inversion
of ill-posed problems. Hyperparameters include boundary constraints,
values for a priori estimates, and the lower-bound

To the author's knowledge this is the first time

Application of the inversion to a 16 d dataset demonstrates that
the thus-obtained growth factor frequency distribution data can reveal
significant details about the mixing state of the aerosol. The inverted
dataset is suitable as input to carry out common analyses made with
growth factor frequency distributions. Examples include the characterization
of the evolution of the aerosol mixing state as a function of time, characterization
of changes in the growth factor with the dry diameter and its relationship
to chemical composition, or characterization of the growth factor at
the mode diameter of particles during modal growth events

Current and future versions of the DifferentialMobilityAnalyzers.jl and RegularizationTools.jl packages are also hosted on GitHub. Details about the SGP HTDMA data and the SMPS data are provided in the references.
Source code to reproduce the figures, derived datasets, and archived versions of the software packages is available via Zenodo:

The supplement related to this article is available online at:

The contact author has declared that there are no competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Data from the SGP site were obtained from the Atmospheric Radiation Measurement (ARM) program sponsored by the US Department of Energy, Office of Science, Biological and Environmental Research, Climate and Environmental Sciences Division. I thank Janek Uin for providing additional information about the data. Size distribution data at Bodega Marine Laboratory were collected with support from the National Science Foundation grant AGS-1450690. I thank Nicholas Rothfuss, Sam Atwood, and Hans Taylor for help operating the SMPS at Bodega Marine Laboratory. I thank Kimberly Prather, Sonia Kreidenweis, and Paul DeMott for logistical support during the field campaign. I thank Sarah Petters for helpful discussions. I thank Mark Stolzenburg for exceptionally helpful referee comments.

This research has been supported by the US Department of Energy, Office of Science, Biological and Environmental Research (grant no. DE-SC 0021074) and NASA (grant no. 80NSSC19K0694).

This paper was edited by Mingjin Tang and reviewed by Mark Stolzenburg, Christopher Oxford, and one anonymous referee.