Least-squares fitting of overlapping peaks is often needed to separately
quantify ions in high-resolution mass spectrometer data. A statistical
simulation approach is used to assess the statistical precision of the
retrieved peak intensities. The sensitivity of the fitted peak intensities
to statistical noise due to ion counting is probed for synthetic data
systems consisting of two overlapping ion peaks whose positions are
pre-defined and fixed in the fitting procedure. The fitted intensities are
sensitive to imperfections in the

Spectra acquired using techniques such as mass spectrometry (MS) can contain large amounts of information but are inherently complex and can represent a significant challenge for data analysis. The identification and separate quantification of overlapping peaks in measured spectra are often required in order to extract the maximum possible information content. Computational approaches to this deconvolution problem have been extensively reported in the literature, both in mass spectrometry fields such as liquid-chromatography MS (LC-MS; see e.g. Jaitly et al., 2009; Yu and Peng, 2010), matrix-assisted laser desorption/ionisation MS (MALDI-MS; see e.g. Sun et al., 2010; House et al., 2011), proton transfer reaction MS (PTR-MS; see e.g. Titzmann et al., 2010) and electrospray ionisation MS (ESI-MS; see e.g. Horn et al., 2000; Strittmatter et al., 2003), and in other techniques with similar analysis procedures, such as chromatography (see e.g. Fraga and Corley, 2005; Krupcik et al., 2005) and gamma-ray spectroscopy (Hammed et al., 1993; Uher et al., 2010; Gardner et al., 2011). Assessing the precision of the fitting parameters resulting from such deconvolution procedures is important to demonstrate the reliability of the technique and to understand the information content of the retrieved data (Hammed et al., 1993). However, the quantification of this precision is not always discussed in the literature.

The primary goal of many mass spectrometry applications is the correct identification and quantification of ions present in the mass spectrum. Several studies probe the sensitivity of deconvolution algorithms to perturbations in the measurement parameters by applying them to synthetic data (Laeven and Smit, 1985; Blom, 1998; Lee and Marshall, 2000; Sun et al., 2010; Hilmer and Bothner, 2011; Müller et al., 2011). Some studies are also concerned with quantifying the overlapping ion signals, in fields such as proteomics (Link et al., 1999; Mirgorodskaya et al., 2000; Bantscheff et al., 2007, 2012) and atmospheric science (DeCarlo et al., 2006; Titzmann et al., 2010; Müller et al., 2011; Jokinen et al., 2012; Yatavelli et al., 2012). Quantification of such ion signals is difficult and may be confounded by unconstrained peak position parameters, or through the use of falsely constrained peak centroids arising from an automated peak-finding algorithm. The peak intensity and position parameters and their precisions are clearly not independent. The quantification process is thus complex, and assessing the precision of the retrieved intensities is difficult.

Correct identification of unknown ions is difficult once two overlapping but
non-coincident peaks lie so close together that the derivatives of the
measurement profile no longer show an inflection point. However, effects such
as peak width broadening may point to the presence of unknown ions. For
example, Meija and Caruso (2004) use peak width measurements from a
calibration standard to compare with that of a spectrum containing two
overlapping peaks, showing that Gaussian deconvolution as well as shifts in
the peak centroid position can be used to predict the ratio of the
intensities of the ions. Blom (1998) considers the impact of a weak
overlapping interference on two quantities describing peak shape, variance
and skew. That study shows that deviations in the peak shape can point to
the presence of an unknown interfering peak at separations well below those
which would be required to separate it visually. However, Blom also
concludes that an interfering peak with a relative abundance of only a few
percent could cause significant shifts (of a few ppm) in the centroid

Given the challenges encountered by such studies in correctly identifying unknown peaks in the mass spectrum, it is unsurprising that the uncertainties arising during peak identification are often expressed simply by confidence metrics, such as mass accuracy/error and relative ion abundance as compared to theoretical isotope patterns (e.g. Kilgour et al., 2012), rather than by the estimated precision of the fitted intensities, which would generally be preferred.

Similar confidence metrics are also reported for studies attempting to quantify the intensity of known overlapping peaks. Haimi et al. (2006) qualitatively split fits into reliable and unreliable categories by comparing peak ratios for successive measurements at different concentrations. Fits were considered reliable if the standard deviation in the peak ratio was < 25% over eight successive measurements, an arbitrary but consistent metric. This is a useful guide when interpreting experimental results but does not address the intensity precision in a quantitative manner, which limits its scope of applicability. Müller et al. (2011) reported a more systematic approach to quantifying the expected attainable precision of the peak intensity for an example synthesised system subject to counting and estimated calibration errors. Their approach was, however, not extended from a single example to the general case. A generalised metric to describe the performance of such deconvolution procedures is therefore desirable.

This study aims to present a quantitative, systematic analysis of the
statistical precisions arising during the deconvolution of overlapping peaks
for the special case where the peak positions are known a priori and held fixed in
the fitting procedure. This technique is widely employed by the
atmospheric-science community during analysis of data from both field and laboratory
instrumentation (e.g. Farmer and Jimenez, 2010), for example the
high-resolution time-of-flight aerosol mass spectrometer (HR-ToF-AMS;
DeCarlo et al., 2006), the proton transfer reaction time-of-flight mass
spectrometer (Cappellin et al., 2009, 2011; Müller et
al., 2011), the atmospheric pressure interface time-of-flight mass
spectrometer (APi-TOF; Junninen et al., 2010) and the high-resolution
time-of-flight chemical-ionisation mass spectrometer (HRToF-CIMS; Jokinen et
al., 2012; Yatavelli et al., 2012). The ionisation techniques used in these
instruments ionise and fragment the molecules in a very consistent manner.
Thus, one degree of freedom can often be removed from the ion fitting
procedure, which is then based upon a comprehensive list of ions and their
exact

Imprecision in such a constrained fitting procedure may arise from (i) noise
in the measurement distribution, particularly from counting statistics of
the ions of interest; (ii) the

Müller et al. (2011) conducted an error analysis on such a constrained hypothetical system using a peak model and specifications for a typical lower-resolution TOF spectrometer used (amongst other fields) in atmospheric science, and they demonstrated that the precision in the fitted peak intensities is sensitive to the ratio of the peak intensities. The precision with which the less-intense peak intensity can be retrieved becomes extremely poor for peak separations less than the full-width at half-maximum (FWHM). Müller et al. (2011) also concluded that a precise analysis could only be performed for well-separated peaks. We extend this analysis from a single example to the general case for a wide range of measured intensities, separations and resolving powers (peak widths). We investigate the relationship between peak separation and achievable peak intensity precision, and develop a parameterisation to quantify the latter.

A synthetic measurement distribution was constructed consisting of one or
two Gaussian peaks of known width and centroid position. Unless stated
otherwise, the synthetic peaks were generated for a fixed ion time-of-flight
(iToF) resolving power

To address counting error (item i above), the synthetic measurement
distribution was degraded, point by point, with Poisson-distributed error of
magnitude sqrt(

The peak shape model (item iv) was removed as a degree of freedom by utilising Gaussian shapes to represent the instrumental peak shape. The influence of the peak shape on fitted parameters is difficult to assess (Yu and Peng, 2010, and references therein) and is thus not considered here, although its relative impact should be the focus of future studies. The separation of the discrete data points (item iii) is held fixed unless otherwise noted. Further sources of uncertainty in the measurement distribution, such as electronic baseline noise, are not considered; in modern data-acquisition systems they are typically small compared to ion counting noise, which leads to signal degradation and a non-zero mass-spectrum baseline.
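The construction of such a synthetic measurement distribution can be sketched as follows. This is a minimal illustration rather than the code used in this study: the function names are hypothetical, the data-point spacing (0.2 ns) and peak width (1 ns) are borrowed from the values quoted in the CalNex figure caption, the peak width is assumed to be the FWHM, and the peak separation, ion counts and random seed are arbitrary choices.

```python
import numpy as np

def gaussian(x, centre, width_fwhm):
    """Unit-area Gaussian evaluated at x, parameterised by its FWHM."""
    sigma = width_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def synthetic_spectrum(x, centres, counts, width_fwhm, rng):
    """Return the noise-free expected counts per data point and one
    Poisson-degraded realisation of the same profile."""
    dx = x[1] - x[0]  # fixed data-point spacing
    expected = sum(n * gaussian(x, c, width_fwhm) * dx
                   for c, n in zip(centres, counts))
    return expected, rng.poisson(expected)

# Illustrative (assumed) settings: 0.2 ns spacing, 1 ns FWHM,
# two peaks half a FWHM apart holding 1000 and 100 ions.
rng = np.random.default_rng(seed=1)
x = -5.0 + 0.2 * np.arange(56)
expected, noisy = synthetic_spectrum(x, centres=(0.0, 0.5),
                                     counts=(1000.0, 100.0),
                                     width_fwhm=1.0, rng=rng)
```

Drawing each data point from a Poisson distribution whose mean is the noise-free value gives point-by-point scatter of the order of the square root of the expected counts, which is the counting error described above.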

After application of the Poisson-distributed noise and of the

For the case of a system with two overlapping peaks, we define a normalised
separation parameter

Precision theory offers a calculable method to describe the best precision
with which the peak intensity of an isolated ion can be retrieved from a
discrete spectrum with Poisson-distributed noise (Lee and Marshall, 2000).
Lee and Marshall ran simulations of least-squares fits to Gaussian peak
shapes and were able to demonstrate the application of precision theory to
mass spectra, giving the relationships for the standard deviations in fitted
peak amplitude,

Repeating the process for a system with two overlapping but non-coincident
Gaussian peaks leads to histograms that must be at least as broad as those
shown in Fig. 1. Although the centroid positions of the two Gaussians are
still fixed during both generation of the synthetic distribution and the
fitting procedure, the combined Poisson-distributed uncertainties from the
two peaks and the mix of information in the area of peak overlap lead to
increased imprecision in the retrieved peak intensities. Figure 2
demonstrates this tendency for a pair of equally intense peaks with various
peak separations (

A further example where one peak is 1/10 as intense as the other is
shown in Fig. S2. In this case the results are starkly different for the
parent (more intense) vs. child (less intense) ions.
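The parent/child contrast can be reproduced with a short Monte Carlo sketch. It is an illustration under stated assumptions, not the procedure of this study: the function names are hypothetical, the fit is an unweighted linear least-squares solve (the weighting used in the study is not specified here), and the grid, separation, ion counts and number of trials are arbitrary. Because the centroids and width are fixed, only the two intensities are free and the fit is linear.

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def design_matrix(x, centres, width_fwhm):
    """One column per ion: the counts per data point for a unit-intensity peak."""
    sigma = width_fwhm * FWHM_TO_SIGMA
    dx = x[1] - x[0]
    cols = [np.exp(-0.5 * ((x - c) / sigma) ** 2) * dx / (sigma * np.sqrt(2.0 * np.pi))
            for c in centres]
    return np.column_stack(cols)

def fitted_intensity_spread(x, centres, counts, width_fwhm, n_trials, rng):
    """Relative standard deviation of each fitted intensity over many
    Poisson realisations, with peak positions held fixed in the fit."""
    counts = np.asarray(counts, dtype=float)
    A = design_matrix(x, centres, width_fwhm)
    expected = A @ counts
    noisy = rng.poisson(expected, size=(n_trials, x.size)).astype(float)
    # Unweighted least squares (assumption); one column of `fits` per trial.
    fits, *_ = np.linalg.lstsq(A, noisy.T, rcond=None)
    return fits.std(axis=1) / counts

rng = np.random.default_rng(seed=2)
x = -5.0 + 0.2 * np.arange(56)
rel = fitted_intensity_spread(x, centres=(0.0, 0.5), counts=(10000.0, 1000.0),
                              width_fwhm=1.0, n_trials=2000, rng=rng)
# rel[0] is the parent, rel[1] the child (1/10 as intense).
```

In this sketch the absolute scatter of both fitted intensities is driven largely by the counting noise of the parent, so the relative scatter of the child is considerably larger than that of the parent.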

These results are generally observed for other examples: the precision due
to ion counting on an isolated ion

For the constrained fitting procedures investigated in this study, which
constrain a priori the positions of the fitted ions, correct determination of the

Histograms of the normalised deviation in fitted peak intensity,

Relative precision in fitted peak intensity,

The following procedure was used to obtain a quantitative estimate of the
imprecision introduced on the

As a result of the imprecision in the calibrant ion fits, for each iteration
of the calibration procedure, each of the calibrated peak positions is
subject to an error, the magnitude of which depends on the goodness of the
fits to the calibrant peaks and the relative positions in

The limiting calibration precision, where all peaks exhibit identical
signal-to-noise ratios, can be less than 0.1 ppm for high signal-to-noise
situations. This unrealistic scenario exceeds the performance attainable
using current mass spectrometry systems of similar resolving power,
indicating that sources of error other than counting statistics alone may
play a significant role in determining calibration precision. Lee and
Marshall (2000) note that the relative error in

In Sect. 3.4 we incorporate the impact of the limited precision of the

Mean precision in peak position arising from

We now estimate the achievable precision for the peak intensities resulting
from the constrained fitting procedure, combining the precisions from
counting and calibration errors in an overlapping two-ion system. As the
peak positions are held fixed, uncertainties in the calibration propagate to
uncertainties in the fitted intensities. Both these effects contribute to
the precisions summarised in Fig. 4. These simulations use synthetic data
consisting of two overlapping peaks as was demonstrated in Fig. 2 for peaks
of equal intensity. However, in Fig. 4 the peak intensities are different,
with a dominant “parent” peak of intensity
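How a calibration error propagates into the fitted intensities when the peak positions are held fixed can be illustrated with a noise-free sketch. This is a simplified stand-in, not the calibration procedure of this study: both fitted centroids are displaced by the same hypothetical offset, no counting noise is applied so that the effect of the offset is isolated, the fit is unweighted, and all names and numerical values are assumptions.

```python
import numpy as np

def peak_matrix(x, centres, width_fwhm):
    """Columns of unit-intensity Gaussian peaks (counts per data point)."""
    sigma = width_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    dx = x[1] - x[0]
    return np.column_stack([
        np.exp(-0.5 * ((x - c) / sigma) ** 2) * dx / (sigma * np.sqrt(2.0 * np.pi))
        for c in centres])

def intensity_bias_from_offset(x, centres, counts, width_fwhm, offset):
    """Relative error in each fitted intensity when the fit assumes
    centroids displaced by `offset` from their true positions."""
    centres = np.asarray(centres, dtype=float)
    counts = np.asarray(counts, dtype=float)
    true_profile = peak_matrix(x, centres, width_fwhm) @ counts
    shifted = peak_matrix(x, centres + offset, width_fwhm)
    fitted, *_ = np.linalg.lstsq(shifted, true_profile, rcond=None)
    return (fitted - counts) / counts

x = -5.0 + 0.2 * np.arange(56)
settings = dict(centres=(0.0, 0.5), counts=(10000.0, 1000.0), width_fwhm=1.0)
bias_exact = intensity_bias_from_offset(x, offset=0.0, **settings)
bias_shifted = intensity_bias_from_offset(x, offset=0.02, **settings)
```

With a zero offset the true intensities are recovered, whereas a small offset biases both fits, the weaker peak being affected much more strongly in relative terms. In the full simulations the offset would vary from one iteration to the next, so this bias appears as additional scatter in the fitted intensities.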

Precision on fitted peak intensity,

Three regimes are apparent in this plot:

For large

For smaller values of

For the special case of a low signal-to-noise ratio and small

Actual precision in the normalised fitted peak intensities

The different regimes introduced in Fig. 4 can be visualised for a range of

Since our simulations are complex and time consuming, a parameterisation of
the value of

Table 1 demonstrates that

Considered together, these observations imply that the precision owing to
calibration imprecision,

Changes in precision in fitted intensity of the child peak when altering properties of the fitting procedure and its input distributions, whilst holding the peak intensity ratio constant at 2.

Given the weak dependence of

Given that

This estimate of precision, whilst generally applicable for
the resolving power range considered in this study, does break down in the
overlapping counting-error regime (low signal-to-noise ratio and small peak
separation). This is apparent when studying Fig. 4, which demonstrates that a
maximum underestimate (at very low

An example of the application of these results is given in Fig. 6. Lines
of constant

The optimal experimental setup for a given ion pair is where

Ratio of fitted peak intensities to true peak separation for the smaller ion from all pairs of ions used in analysis of the CalNex field campaign data from a high-resolution aerosol mass spectrometer (Hayes et al., 2013). Superimposed are lines of constant estimated precision on fitted peak intensity. The data points are coloured according to the precision in fitted peak intensity expected from counting statistics. The data-point spacing is 0.2 ns, and peak width is 1 ns.

To demonstrate the application of the parameterisation to a problem relevant
to the atmospheric-science community that provided motivation for this
study, we take three commonly observed ions in HR-ToF-AMS spectra as an
example: C

A simple statistical simulation-based approach has been used to demonstrate
the precision to which the intensities for a pair of overlapping mass
spectral peaks can be ascertained using least-squares multi-peak fitting.
Synthetic measurement distributions containing imprecisions from counting
statistics and

The results are demonstrated for a typical instrumental setup employed in atmospheric science but make no assumptions about ionisation or spectrometer type and can thus be generally applied. Further investigations to include the effects of imprecision in peak width, non-Gaussian peak shapes and systems with more than two ion peaks should form the basis of future work.

This work was partially supported by NSF AGS-1243354 and AGS-1360834, NOAA NA13OAR4310063, NASA NNX12AC03G, and DOE (BER, ASR) DE-SC0011105. We thank Marc Gonin for useful discussions on this topic. We thank D. Sueper for providing the CalNex data.

Edited by: F. Stroh