General Comments:
The revision has produced a much stronger and more integrated manuscript that describes a particular machine-learning approach for separating single-particle mass spectra by identity. The authors have provided more detailed information about the conceptual framework for their classification scheme, have carried out a rudimentary comparison to an alternative method, and have done a thorough job of exploring, presenting, and explaining the results from training and "blind" tests of their proposed method. Although the issue is mentioned briefly in the manuscript, the authors have not seriously engaged with assessing the utility of this method for analysis of ambient particle spectra, where presumably it would need to function to be useful. Are there situations in which this method could be used to essentially "pick out" the particles that match one of the training sets, while not attempting to differentiate "other" particles not included in training? If so, how different would the particle spectra need to be for this to succeed?
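To make this question concrete, one possible test (purely a sketch of what such an assessment could look like, not something the manuscript implements; the function and the `min_vote` threshold below are hypothetical) would be to threshold the forest's class-membership vote fractions and route low-confidence spectra to an "other" bin:

```python
import numpy as np

# Hypothetical sketch (not the manuscript's implementation): label a
# spectrum only when the trained random forest is confident, and route
# everything else to an "other" bin. min_vote is an assumed threshold.
def pick_out_known(forest, spectra, class_names, min_vote=0.8):
    probs = forest.predict_proba(spectra)          # vote fraction per class
    best = np.argmax(probs, axis=1)                # most-voted class index
    top_votes = probs[np.arange(len(spectra)), best]
    return [class_names[i] if v >= min_vote else "other"
            for i, v in zip(best, top_votes)]
```

An assessment along these lines would let the authors quantify how spectrally distinct "other" particles must be before they stop contaminating the trained classes.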
Specific Comments:
The authors state, p. 3, lines 11 – 12, that “interpretability is more limited with methods such as cluster analysis and neural networks” without justification. Such statements should include explanations and/or citations, or be removed if they represent opinions.
On p. 5, line 12, the authors state that "algorithms are known to struggle with chemically-similar aerosols…" but again provide neither a definition of "struggle" nor any discussion of how similar is too similar. Furthermore, the discussion on this page, lines 19 – 23, should mention that (as with all of the algorithms discussed in this paper) each method includes user-defined settings, and the choice of those settings significantly influences the outcome. Generalizations about performance are therefore difficult when little information about the settings is provided. An alternative approach the authors could explore is to reference specific articles in which particular methods/algorithms were used, and to comment on the successes and challenges illustrated by the specific results those studies obtained.
On p. 6, lines 4 - 5, the authors mention "measurement uncertainty" without specifying the quantity to which that uncertainty applies. Is it the identification, the peak areas, or something else?
In Section 2.3, the authors discuss binary decision trees without mentioning random forests, even though the term has already been introduced. It would be helpful to contextualize the binary trees within the random forest framework at the beginning of this discussion, accompanied by a short comment that the random forest approach will be described more thoroughly below.
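The point to make explicit is simple: a random forest is an ensemble of such binary trees, each trained on a bootstrap sample and combined by majority vote. A minimal illustration (scikit-learn is used here purely for notation; the parameter values are placeholders, not the authors' settings):

```python
# Illustration only: a random forest is an ensemble of binary decision
# trees combined by majority vote. Parameter values are placeholders,
# not the authors' settings.
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

single_tree = DecisionTreeClassifier(max_leaf_nodes=32)  # one binary tree
forest = RandomForestClassifier(
    n_estimators=100,     # number of trees
    max_leaf_nodes=32,    # nodes per tree
    max_features="sqrt",  # variables considered at each split
)
```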
In the methods section, parameters such as the number of nodes per tree (p. 11, line 10), the number of trees (p. 10, line 11), and the number of variables per split (p. 11, line 11) are stated, but the methodology for choosing these values is not explained in sufficient detail (or at all, in the case of the number of nodes). The criterion used to select the best settings is described as the "values that produce the lowest test error" – is this error simply the rate of incorrect identification?
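If "lowest test error" simply means the misclassification rate on held-out spectra, the selection procedure could be documented compactly along the following lines (a hypothetical grid search; the parameter ranges are placeholders, not the authors' values):

```python
# Hypothetical grid search documenting a "lowest test error" selection;
# the parameter ranges are placeholders, not the authors' values.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [50, 100, 200],    # number of trees
    "max_leaf_nodes": [16, 32, 64],    # nodes per tree
    "max_features": [10, 20, 40],      # variables per split
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",  # test error = 1 - accuracy
    cv=5,
)
# search.fit(spectra, labels) would then report the settings with the
# lowest cross-validated misclassification rate.
```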
On p. 11, line 18, the noun "asymptote" is used as a verb. The sentence should be rewritten.
On pp. 20 - 21, the authors illustrate the advantages of their method by noting that an unexpected contaminant was detected based on the results. The implication is that such detection is possible with their method but not with others; however, a distance-metric-based algorithm would likely also be able to identify this contaminant, since it contained additional peaks. The authors should clarify how this example specifically illustrates the strength of their method (if it does).
Figure 4 illustrates a comparison of results using the random forest and a distance classifier. However, no information is provided about the (user-defined) parameters used to define the clusters in the distance-metric example, which makes the comparison difficult to interpret; if those parameters were changed even slightly, the results would likely differ. Also, the labels a) and b) should be removed from the figure caption; "top row" and "bottom row" are sufficient. The figure would be more useful if the algorithm type were included in the labels of the individual matrices, so that one need not rely on the figure caption to identify what each matrix represents. For example, "Aerosol Confusion Matrix (Positive)" could be replaced with "Random Forest (Positive)" or "Euclidean Distance (Positive)" for clarity.
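To illustrate the sensitivity being raised here: a nearest-reference Euclidean classifier typically involves at least a user-defined match threshold, and shifting that threshold moves particles between classes. A schematic version (hypothetical; not the classifier actually used in the manuscript):

```python
import numpy as np

# Schematic nearest-reference Euclidean classifier (hypothetical, not
# the one used in the manuscript). The match threshold is user-defined,
# and small changes to it move particles between classes.
def classify_by_distance(spectrum, references, labels, threshold=0.5):
    distances = np.linalg.norm(references - spectrum, axis=1)
    nearest = int(np.argmin(distances))
    return labels[nearest] if distances[nearest] < threshold else "unmatched"
```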
Figure 5 is still confusing, in that it shows the ~1/3 of particles (soot) that are introduced into the AIDA chamber but that the PALMS instrument cannot detect. The figure caption suggests that the instrument transmission efficiency is discussed in the text, but that discussion (p. 18, lines 18 – 21) is very brief and is mostly directed at explaining the significant under-counting of the larger particles. This discussion should be expanded, and ideally the data presented in the figure should be corrected for the inlet transmission. As it stands, the pie charts only illustrate that the match between the input and detected concentrations is poor; it is also not specified whether the input aerosol in the chamber is given as a number concentration or a mass concentration, although presumably the PALMS results are number concentrations. The specificity with which the different particle types can be identified differs enough between positive and negative ion spectra to warrant more discussion than is given. Overall, the data presented in this figure cannot give readers confidence that the picture of the aerosol composition obtained in these experiments accurately represents what is actually present.
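For reference, the correction being requested is arithmetically simple: divide each size bin's detected counts by the size-dependent inlet transmission efficiency before forming the pie-chart fractions (the values below are placeholders, not PALMS calibration data):

```python
import numpy as np

# Illustration of the requested inlet-transmission correction; the
# efficiency values are placeholders, not PALMS calibration data.
detected_counts = np.array([120.0, 300.0, 80.0, 10.0])  # counts per size bin
transmission_eff = np.array([0.90, 0.70, 0.30, 0.05])   # fraction transmitted
corrected_counts = detected_counts / transmission_eff   # estimate at the inlet
corrected_fractions = corrected_counts / corrected_counts.sum()
```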