Rolling vs. seasonal PMF: real-world multi-site and synthetic dataset comparison

Via, Marta; Chen, Gang; Canonaco, Francesco; Daellenbach, Kaspar R.; Chazeau, Benjamin; Chebaicheb, Hasna; Jiang, Jianhui; Keernik, Hannes; Lin, Chunshui; Marchand, Nicolas; Marin, Cristina; O'Dowd, Colin; Ovadnevaite, Jurgita; Petit, Jean-Eudes; Pikridas, Michael; Riffault, Véronique; Sciare, Jean; Slowik, Jay G.; Simon, Leïla; Vasilescu, Jeni; Zhang, Yunjiang; Favez, Olivier; Prévôt, André S. H.; Alastuey, Andrés; Cruz Minguillón, María

doi:https://doi.org/10.5194/amt-15-5479-2022

Articles | Volume 15, issue 18

© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 27 Sep 2022

Download

Final revised paper (published on 27 Sep 2022)
Supplement to the final revised paper
Preprint (discussion started on 30 May 2022)
Supplement to the preprint

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2022-269', Anonymous Referee #1, 22 Jun 2022

The manuscript by Marta Via et al. performed a comprehensive comparison between the two methodologies of fine organic aerosol (OA) source apportionment through the Positive Matrix Factorization (PMF) model: rolling and seasonal PMF. They found that the rolling PMF can be considered more accurate and precise, globally, than the seasonal one, although both meet the standards of quality required by the source apportionment protocol. In addition, the results showed that the selection of anchor profiles is highly influencing the OA factors, so local reference profiles are encouraged to minimise this impact. The topic fits well within the scope of Atmospheric Measurement Techniques.

Overall, the data analysis is solid and the manuscript is clearly written. Before its publication, the following comments need to be addressed.

Specific comments:

1 Line 308: What are the contributions of SOA species to m/z 55? Looking into these datasets would be helpful to evaluate the uncertainty of using m/z 55 as a marker for HOA.

2 Are comparisons of Rolling vs. Seasonal PMF dependent on the type of site (e.g., Urban Background, Suburban) and/or the instruments (i.e., Q-ACSM and ToF-ACSM)? Please be specific.

3 Figure 3: I noticed that there are three distinct lines in the triangle plot of f₄₄ vs. f₄₃ using seasonal PMF data, while this phenomenon does not appear using the rolling PMF and the truth PMF. Please describe and explain these differences in detail.

4 The sampling period that appears in Figure 2 is not in Table 1 (Participant sites). Please have a check.

5 It would be better to change the name of “truth PMF”, because no one knows what the truth is like, and our goal is to pursue infinite access to the truth

6 The figure captions for each panel should be clearly stated. Take Figure S4 for example, what is SHINDOA representing? In addition, "58-OOA" must be defined.

Citation: https://doi.org/10.5194/egusphere-2022-269-RC1
- AC1:
  'Reply on RC1', Marta Via, 30 Aug 2022
  Reply to Anonymous Referee #1 (Egusphere-2022-269)
  Received and published: 23rd June 2022
  
  The authors would like to thank the reviewer for all the comments and suggestions, which helped improve the quality of this work. A new version of the manuscript has been prepared following the suggestions from the reviewers. We provide below detailed replies to each of the comments in a point‐by‐point manner. Figures and tables cited in the following document can be found in the supplementary annex.
  Comments from the reviewer
  
  The manuscript by Marta Via et al. performed a comprehensive comparison between the two methodologies of fine organic aerosol (OA) source apportionment through the Positive Matrix Factorization (PMF) model: rolling and seasonal PMF. They found that the rolling PMF can be considered more accurate and precise, globally, than the seasonal one, although both meet the standards of quality required by the source apportionment protocol. In addition, the results showed that the selection of anchor profiles is highly influencing the OA factors, so local reference profiles are encouraged to minimise this impact. The topic fits well within the scope of Atmospheric Measurement Techniques. Overall, the data analysis is solid and the manuscript is clearly written. Before its publication, the following comments need to be addressed.
  
  Line 308: What are the contributions of SOA species to m/z 55? Looking into these datasets would be helpful to evaluate the uncertainty of using m/z 55 as a marker for HOA.
  
  According to the rolling PMF results, m/z 55 is attributed to each source in the percentages shown in Table R.1.1 (in the supplementary document).
  As the table shows, HOA is always the main contributor to m/z 55, except at sites where COA is present, in which case HOA comes second, and in Dublin, where the main contributor to m/z 55 is PCOA. These results demonstrate that the a posteriori assumption that m/z 55 can be used as a marker for HOA is sufficiently accurate. In addition, the SOA (LO-OOA + MO-OOA) share of the apportioned m/z 55 does not exceed 30 %, and the m/z55_HOA-to-m/z55_SOA ratio is always greater than 1.5 (up to 3.4), except at the Dublin site, where the combustion sources are significant sources of this ion. This brief analysis supports the use of m/z 55 as a marker for HOA in the multi-site comparison.
  Regarding the synthetic dataset (Table R.1.2), the apportionment of this ion is more evenly distributed, but one should bear in mind that SOA_tr might have taken up some HOA markers, so the true prevalence of this ion in HOA must be higher. Hence, although not by a wide margin, HOA is still the highest contributor to m/z 55 in the synthetic dataset.
  For these reasons, the following sentence (starting at line 356) can be left unchanged:
  The adaptability of the models can be assessed from Figure 3 (b), where the 60/55 vs. 44/43 (which are proxies for the BBOA-HOA differentiation and the SOA oxidation, respectively) is plotted for the truth and both methods.
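The m/z 55 apportionment discussed in this reply can be illustrated with a minimal sketch. It assumes that the share of an ion attributed to a factor is the factor's mean concentration multiplied by the relative intensity of that ion in the factor profile, normalised over all factors. All numbers and the helper name `mz_share` are hypothetical and are not taken from Table R.1.1 or R.1.2.

```python
import numpy as np

def mz_share(mean_conc, profile_fraction):
    """Fraction of the signal at one m/z attributed to each factor.

    mean_conc: mean factor concentrations (one value per factor).
    profile_fraction: relative intensity of the chosen m/z in each factor profile.
    """
    signal = mean_conc * profile_fraction
    return signal / signal.sum()

# Hypothetical example values (NOT from the study)
factors = ["HOA", "BBOA", "LO-OOA", "MO-OOA"]
mean_conc = np.array([1.0, 0.5, 1.0, 1.5])      # ug m-3, assumed
f55 = np.array([0.08, 0.04, 0.02, 0.01])        # m/z 55 fraction per profile, assumed

share = mz_share(mean_conc, f55)
soa_share = share[2] + share[3]                 # LO-OOA + MO-OOA
hoa_to_soa = share[0] / soa_share               # analogue of m/z55_HOA-to-m/z55_SOA

for name, s in zip(factors, share):
    print(f"{name}: {100 * s:.1f} % of m/z 55")
print(f"HOA-to-SOA ratio for m/z 55: {hoa_to_soa:.2f}")
```

With real PMF output, `mean_conc` and `f55` would be read from the factor time series and profiles of each site rather than typed in by hand.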
  
  Are comparisons of Rolling vs. Seasonal PMF dependent on the type of site (e.g., Urban Background, Suburban) and/or the instruments (i.e., Q-ACSM and ToF-ACSM). Please be specific.
  
  For both the site-type and the instrument-type groupings, the groups are too unbalanced in number of sites to allow a proper assessment of the differences between groups.
  Regarding the type of site:
  Urban background / Suburban / Peri-urban (7): BCN-PR, DUB, ATOLL, INO, MRS-LCP, SIR, TAR.
  
  Remote/Rural (2): CAO-AMX, MAG.
  
  Regarding the type of instrument:
  Q-ACSM (8): BCN-PR, CAO-AMX, DUB, ATOLL, INO, MAG, SIR, TAR.
  
  ToF-ACSM (1): MRS-LCP.
  
  Therefore, the authors believe that this matter should not be addressed as the results would not be sufficiently conclusive.
  
  Figure 3: I noticed that there are three distinguished lines in the triangle plot of f44 vs. f43 using seasonal PMF data, while this phenomenon does not appear using the rolling PMF and the truth PMF. Please, describe and explain these differences in detail.
  
  Figure 3(c) (Figure R.1.3 in the supplementary document) is a triangle plot (Ng et al., 2010) in which the SOA profile f44 concentrations are shown against the SOA profile f43 concentrations. The plot compares the time series behaviour of the truth with the profile concentrations of the rolling PMF (in red) and seasonal PMF (in blue) results. The large grey markers represent the SOA profile concentrations of the truth: round dots for the three SOA sources and a square for their weighted mean.
  The rolling PMF dots move around the triangle plot, whereas the seasonal dots fall on three distinct blue lines. The seasonal PMF was applied to three different seasons in this analysis, so three different profiles are obtained, one per season. Each of these profiles has a specific f44-to-f43 ratio, which could be represented as three dots in this plot. Here, however, we show the adaptation over the time series (or the lack of it in the seasonal case), which is why the plot presents the profile multiplied by the time series concentrations. Plotted this way, the seasonal PMF results form three lines whose slopes correspond to the f44-to-f43 ratios of the three seasons. Each point on a line represents the f44 and f43 concentrations at the i-th time point, and all points of a season share the slope established for that season. In contrast, the rolling PMF dots move freely inside the triangle.
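The geometric argument above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' plotting code: the SOA time series, the profile fractions and the drift imposed on the rolling profile are all hypothetical. It shows that multiplying a fixed seasonal profile by a time series yields points with a single f44-to-f43 slope, whereas a profile that changes over time yields a spread of slopes.

```python
import numpy as np

rng = np.random.default_rng(0)
conc = rng.uniform(0.5, 5.0, size=200)       # hypothetical SOA concentration time series

# Seasonal case: one fixed profile for the whole season (assumed fractions)
f44_season, f43_season = 0.18, 0.06
x_seasonal = f43_season * conc               # f43 concentration at each time point
y_seasonal = f44_season * conc               # f44 concentration at each time point
slopes_seasonal = y_seasonal / x_seasonal    # identical for every time point

# Rolling case: profile fractions are allowed to change from window to window;
# a smooth drift is imposed here purely for illustration
phase = np.linspace(0.0, np.pi, conc.size)
f44_roll = 0.18 + 0.04 * np.sin(phase)
f43_roll = 0.06 + 0.02 * np.cos(phase)
slopes_rolling = (f44_roll * conc) / (f43_roll * conc)

print("seasonal slope range:", slopes_seasonal.min(), slopes_seasonal.max())
print("rolling slope range: ", slopes_rolling.min(), slopes_rolling.max())
```

Repeating the seasonal case with three different profile pairs would reproduce the three straight lines noted by the referee.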
  
  The sampling period that appears in Figure 2 is not in Table 1 (Participant sites). Please have a check.
  
  The sampling period of the synthetic dataset has not been added to the participant sites (Table 1) because its time series do not use real data (they come from the CAMx model), although the Zurich ToF-ACSM data were used to generate the error time series, mimicking the error calculation procedure of a real-world dataset. The authors consider that, for the sake of clarity, this Zurich dataset should not be included in the participant site table, so that the synthetic dataset remains a standalone approach. However, we acknowledge that more information on the sampling period should be provided.
  The following sentence has been added at section 2.2.
  For this purpose, the dataset used is that from the Zurich site which ranges from February 2011 until December 2011. Hence, the same CAMx outcoming time series period was used to generate the concentration matrix.
  
  It would be better to change the name of “truth PMF”, because no one knows what the truth is like, and our goal is to pursue infinite access to the truth.
  
  The label “truth” has been used only in the synthetic analysis to refer to the synthetically created dataset, which consists of the time series and the profiles of five OA sources. This dataset has been arranged as a matrix to serve as the starting point for the OA source apportionment in both rolling and seasonal PMF modelling. In this study, the outcomes of each methodology are compared to the initial data, which are effectively a proxy for the truth, as the matrix is directly the product of the synthetically designed time series and profiles.
  Hence, the use of “truth” in this article does not imply that the synthetic dataset is a representation of the atmosphere. Labelling the synthetic dataset as “truth” is intended to indicate what the results would be if the model could represent the input matrix perfectly. This is why the authors would like to maintain the “truth” labelling.
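A minimal sketch of the idea described in this reply follows, assuming that the input matrix is simply the product of designed factor time series and designed factor profiles. The random values, the matrix dimensions and the helper `relative_bias` are hypothetical; the CAMx time series, the database profiles and the Zurich-based error matrix used in the study are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)
n_times, n_factors, n_mz = 500, 5, 70        # assumed sizes; five OA sources as in the reply

# "Truth": designed factor time series and profiles (random stand-ins here)
G_true = rng.gamma(shape=2.0, scale=1.0, size=(n_times, n_factors))
F_true = rng.random((n_factors, n_mz))
F_true /= F_true.sum(axis=1, keepdims=True)  # each profile normalised to sum to 1

# Input matrix handed to PMF: direct product of time series and profiles
X = G_true @ F_true

def relative_bias(G_model, G_ref):
    """Relative difference in total apportioned mass per factor, model vs. reference."""
    return (G_model.sum(axis=0) - G_ref.sum(axis=0)) / G_ref.sum(axis=0)

# A model that reproduced the input perfectly would show zero bias for every factor
print(relative_bias(G_true, G_true))
```

In this framing, any rolling or seasonal PMF output can be scored against `G_true` and `F_true`, which is the sense in which the designed data act as a reference.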
  
  The figure captions for each panel should be clearly stated. Take Figure S4 for example, what is SHINDOA representing? In addition, “58-OOA” must be defined.
  
  Some figure and table captions have been rephrased in order to promote a quicker understanding of the plot content.
  Figure 1. OA apportionment results for rolling and seasonal methods and truth output.
  Figure 2. Rolling, seasonal and truth (synthetic dataset original values) (a) time series (in hourly averages for the sake of clarity), (b) diel profiles and (c) scatter plots.
  Figure 3. Synthetic dataset solution (a) Profiles; (b) Time-dependent profile variability of ratios 60/55 vs. 44/43; (c) Triangle plot of f44 vs. f43; for rolling PMF (red), seasonal PMF (blue) and truth (black).
  Figure 4. Pie charts of the mean concentrations of the main factors for the ensemble of all sites.
  Figure 8. Kernel density estimation of the histograms of the subtraction of the m/z 44-to-43 ratio from the raw (from input matrices) time series data minus the apportioned quantity profiles. These plots only contain those time lapses among the change of season (transition periods).
  Table S1. Multi-site assessment dataset characteristics.
  Figure S54. Pie plots for rolling and seasonal source apportionment solution for each site. The factor acronyms correspond to: Hydrocarbon-like OA (HOA), Biomass Burning OA (BBOA), Less Oxidized Oxygenated OA (LO-OOA), More Oxidised Oxygenated OA (MO-OOA), Cooking-like OA (COA), Peat Combustion OA (PCOA), Coal Combustion OA (CCOA), Wood Combustion OA (WCOA), 58-related OA (58-OA) and Shipping + Industry OA (SHINDOA).
  
  Figure 6 was changed slightly for the sake of clarity and its caption was modified as follows:
  Figure 6. Rolling and seasonal boxplots of the Pearson-squared correlation coefficient of OA sources with their respective markers for all sites.
  
  References
  Ng, N. L., Canagaratna, M. R., Zhang, Q., Jimenez, J. L., Tian, J., Ulbrich, I. M., Kroll, J. H., Docherty, K. S., Chhabra, P. S., Bahreini, R., Murphy, S. M., Seinfeld, J. H., Hildebrandt, L., Donahue, N. M., Decarlo, P. F., Lanz, V. A., Prévôt, A. S. H., Dinar, E., Rudich, Y. and Worsnop, D. R.: Organic aerosol components observed in Northern Hemispheric datasets from Aerosol Mass Spectrometry, Atmos. Chem. Phys., 10(10), 4625–4641, doi:10.5194/acp-10-4625-2010, 2010.
  
  Citation: https://doi.org/10.5194/egusphere-2022-269-AC1
RC2:
'Comment on egusphere-2022-269', Anonymous Referee #2, 30 Jul 2022
Comments on “Rolling vs. Seasonal PMF: Real-world multi-site and synthetic dataset comparison” by Via et al.

This paper illustrates the comparison results between rolling and seasonal PMF methods using a synthetic dataset and real-world ambient datasets. They deployed multiple tools in many dimensions to evaluate the comparison results. In general, they found the rolling window PMF method performs slightly better than seasonal PMF method when the source apportionment for long-term dataset is required. The whole paper is well written and organized. I only have a few questions, as shown below.

Major comments:

For rolling window method, why were 14 days or 28 days chosen for a timing window. Can other arbitrary days e.g., 7 days or 20 days be applied? I also do not understand why 1 day shift was used. How about the half day or other days.

For the seasonal PMF, can the MO-OOA and LO-OOA be compared among different seasons since free PMF was used. The MO-OOA and/or LO-OOA among different seasons might have different spectra and oxidation level. Are the spectra of MO-OOA and LO-OOA the same compared to the rolling method.

Is there any difference among the spectra of SOA_bio, SOA_bb and SOA_tr? How were these spectra obtained and which oxidation level was chosen? The similarity of BBOA and SOA_bb might obscure the source results.

Figure 3 b-c, This figure needs to be revised, which shows very small legend and label.

Minor comments:

Line 307 delete extra “m/z”

Line 384, line 394 and line 426 What is “58-OA”? There shall be explanation for the abbreviation name of each PMF factor since some of the readers might not be familiar with these names.
Citation: https://doi.org/10.5194/egusphere-2022-269-RC2
- AC2:
  'Reply on RC2', Marta Via, 30 Aug 2022
  Reply to Anonymous Referee #2 (Egusphere-2022-269)
  Received and published: 30th June 2022
  
  The authors would like to thank this reviewer for the comments and suggestions, which helped improve the quality of this work. A new version of the manuscript has been prepared following the suggestions from the reviewers. We provide below detailed replies to each of the comments in a point‐by‐point manner. Figures and tables cited in this document can be found in the supplementary annex.
  
  Comments from the reviewer
  
  This paper illustrates the comparison results between rolling and seasonal PMF methods using a synthetic dataset and real-world ambient datasets. They deployed multiple tools in many dimensions to evaluate the comparison results. In general, they found the rolling window PMF method performs slightly better than seasonal PMF method when the source apportionment for long-term dataset is required. The whole paper is well written and organized. I only have a few questions, as shown below.
  MAJOR COMMENTS
  For rolling window method, why were 14 days or 28 days chosen for a timing window. Can other arbitrary days e.g., 7 days or 20 days be applied? I also do not understand why 1 day shift was used. How about the half day or other days.
  
  The 14-day window length was proposed in Parworth et al. (2015), as it offers a good compromise between covering the lifetimes of the studied pollutants (~5-6 days) and capturing periods short enough to observe inter-window variability. The window length was further tested in Canonaco et al. (2021) in terms of Q/Q_exp and the number of non-modelled points, as shown in Figure 1(a) of that study and Figure R2.1(a) of this document.
  The 14-day window is justified because it presents a low Q/Q_exp value and minimises the number of non-modelled points. The next shorter window length presents four times as many non-modelled points and is therefore discarded, even though its Q/Q_exp is slightly lower. The 28-day window presents a higher Q/Q_exp and a higher percentage of non-modelled points, but the difference is small, so this window length could be used if environmentally reasonable criteria pointed to better performance with it. The window length was set to multiples of 7 because certain sources (e.g., HOA, COA, and BBOA) have stable weekly cycles (Chen et al., 2022), which could help to resolve better results. Time windows shorter than 7 days might therefore not be a representative subset on which to conduct PMF. In the present study, the criterion for modifying the window length consisted of checking whether the correlation of the time series with their external markers outperformed that of the mathematically optimised 14-day window length, as well as following the guidelines in Canonaco et al. (2021).
  The 1-day shift was established on the assumption that the factor profiles remain consistent within a day. Different steps could of course be tested under alternative assumptions, but changing less than 1/7 of the data in the time window will have a limited effect on the final PMF solutions. Moreover, shortening the step increases the computational expense through extra repetitions, which is arguably not worthwhile.
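The window mechanics described here (a fixed-length window advanced by a fixed shift) can be sketched as follows. This is only an illustration of the windowing scheme under the stated 14-day length and 1-day shift, not the SoFi Pro implementation; the function name `rolling_windows` and the example dates are hypothetical.

```python
from datetime import date, timedelta

def rolling_windows(start, end, length_days=14, shift_days=1):
    """Return (first_day, last_day) pairs of windows that fit fully inside [start, end]."""
    windows = []
    first = start
    span = timedelta(days=length_days - 1)
    while first + span <= end:
        windows.append((first, first + span))
        first += timedelta(days=shift_days)
    return windows

# Hypothetical two-month period, used only to show the mechanics
windows = rolling_windows(date(2011, 2, 1), date(2011, 3, 31))
weekly = rolling_windows(date(2011, 2, 1), date(2011, 3, 31), shift_days=7)

print(len(windows), "windows with a 1-day shift")
print(len(weekly), "windows with a 7-day shift")
print("first window:", windows[0])
print("last window: ", windows[-1])
```

With a 1-day shift, consecutive 14-day windows share 13 of their 14 days, which illustrates why shortening the step further adds PMF runs while changing little of the data in each window.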
  
  For the seasonal PMF, can the MO-OOA and LO-OOA be compared among different seasons since free PMF was used. The MO-OOA and/or LO-OOA among different seasons might have different spectra and oxidation level. Are the spectra of MO-OOA and LO-OOA the same compared to the rolling method.
  
  Figure R2.2 depicts the oxidation ratio f44/f43 for the LO-OOA and MO-OOA factors on the left and for SOA on the right.
  The left plot shows that the rolling and seasonal points lie close together in every season and in the overall mean, except for MO-OOA in FMAM. Comparison with the truth shows that this discrepancy is related to an overestimation of f44/f43 by the seasonal method. In the right plot, the rolling method captures a more oxidised aerosol than the seasonal one in all cases except JJA, which brings it closer to the truth.
  
  Is there any difference among the spectra of SOA_bio, SOA_bb and SOA_tr? How were these spectra obtained and which oxidation level was chosen? The similarity of BBOA and SOA_bb might obscure the source results.
  
  There are significant differences between these three SOA factors in terms of oxidation level as can be seen in Figure R1.1 in the RC1 document. These three SOAs were selected blindly from the PMF runner from the spectral database described in Ulbrich et al. (2009) to configure a ‘truth’ synthetic dataset.
  Figure R2.3(a) depicts the significant differences between the truth BBOA and SOA_bb profiles. The profiles already show a lower proportion of m/z 43 and m/z 44 in BBOA than in the SOA, and much higher proportions of combustion markers (m/z 60, 73) in BBOA. Figure R4(b) also depicts much higher proportions of m/z 44 and 28 in the SOA_bb profile. In summary, these two factors are distinct enough to be separated by PMF, as shown in the study.
  Figure 3 b-c, This figure needs to be revised, which shows very small legend and label.
  
  The figure has been revised and included in the new version of the manuscript as can be seen in Figure R.2.4.
  MINOR COMMENTS:
  Line 307 delete extra “m/z”
  
  This modification was applied to the manuscript.
  Line 384, line 394 and line 426 What is “58-OA”? There shall be explanation for the abbreviation name of each PMF factor since some of the readers might not be familiar with these names.
  
  The following sentence has been added to the manuscript to clarify this factor source:
  The 58-related OA, as explained in Chen et al. (2021), is a factor dominated by nitrogen fragments (m/z 58, m/z 84, m/z 94) which appeared as an artefact after the filament replacement in that instrument.
  
  References
  Canonaco, F., Tobler, A., Chen, G., Sosedova, Y., Slowik, J. G., Bozzetti, C., Daellenbach, K. R., El Haddad, I., Crippa, M., Huang, R.-J., Furger, M., Baltensperger, U. and Prévôt, A. S. H.: A new method for long-term source apportionment with time-dependent factor profiles and uncertainty assessment using SoFi Pro: application to 1 year of organic aerosol data, Atmos. Meas. Tech., 14(2), 923–943, doi:10.5194/amt-14-923-2021, 2021.
  Chen, G., Sosedova, Y., Canonaco, F., Fröhlich, R., Tobler, A., Vlachou, A., Daellenbach, K., Bozzetti, C., Hueglin, C., Graf, P., Baltensperger, U., Slowik, J., El Haddad, I. and Prévôt, A.: Time dependent source apportionment of submicron organic aerosol for a rural site in an alpine valley using a rolling PMF window, Atmos. Chem. Phys. Discuss., 43, 1–52, doi:10.5194/acp-2020-1263, 2021.
  Chen, G., Canonaco, F., Tobler, A., Aas, W., Alastuey, A., Allan, J., Atabakhsh, S., Aurela, M., Baltensperger, U., Bougiatioti, A., De Brito, J. F., Ceburnis, D., Chazeau, B., Chebaicheb, H., Daellenbach, K. R., Ehn, M., El Haddad, I., Eleftheriadis, K., Favez, O., Flentje, H., Font, A., Fossum, K., Freney, E., Gini, M., Green, D. C., Heikkinen, L., Herrmann, H., Kalogridis, A.-C., Keernik, H., Lhotka, R., Lin, C., Lunder, C., Maasikmets, M., Manousakas, M. I., Marchand, N., Marin, C., Marmureanu, L., Mihalopoulos, N., Močnik, G., Nęcki, J., O’Dowd, C., Ovadnevaite, J., Peter, T., Petit, J.-E., Pikridas, M., Matthew Platt, S., Pokorná, P., Poulain, L., Priestman, M., Riffault, V., Rinaldi, M., Różański, K., Schwarz, J., Sciare, J., Simon, L., Skiba, A., Slowik, J. G., Sosedova, Y., Stavroulas, I., Styszko, K., Teinemaa, E., Timonen, H., Tremper, A., Vasilescu, J., Via, M., Vodička, P., Wiedensohler, A., Zografou, O., Cruz Minguillón, M. and Prévôt, A. S. H.: European aerosol phenomenology − 8: Harmonised source apportionment of organic aerosol using 22 Year-long ACSM/AMS datasets, Environ. Int., 166(May), 107325, doi:10.1016/j.envint.2022.107325, 2022.
  Parworth, C., Fast, J., Mei, F., Shippert, T., Sivaraman, C., Tilp, A., Watson, T. and Zhang, Q.: Long-term measurements of submicrometer aerosol chemistry at the Southern Great Plains (SGP) using an Aerosol Chemical Speciation Monitor (ACSM), Atmos. Environ., 106, 43–55, doi:10.1016/j.atmosenv.2015.01.060, 2015.
  Ulbrich, I. M., Canagaratna, M. R., Zhang, Q., Worsnop, D. R. and Jimenez, J. L.: Interpretation of organic components from Positive Matrix Factorization of aerosol mass spectrometric data, Atmos. Chem. Phys., 9(9), 2891–2918, doi:10.5194/acp-9-2891-2009, 2009.
  
  Citation: https://doi.org/10.5194/egusphere-2022-269-AC2
RC3:
'Comment on egusphere-2022-269', Anonymous Referee #3, 02 Aug 2022

The manuscript presents a comparison between two different approaches of PM1 organic aerosol (OA) source apportionment through the Positive Matrix Factorization (PMF) source-receptor model applied on mass spectra by Aerosol Chemical Speciation Monitor (ACSM): the widely used “seasonal” PMF against the emerging “rolling” PMF. The two approaches are systematically applied on both real-world ACSM datasets (from 9 European sites) and on a synthetic dataset. The comparison shows that the two approaches lead to similar apportionment results, both addressing the quality standards required by the source apportionment protocol. Overall “Rolling” PMF can be considered more accurate, especially in “transition” periods between seasons because it is able to better adapt to the changes in OA sources along the time. Interestingly the application of PMF on the synthetic dataset performed not so well for both the methodologies. This result shows the strong influence of using and selecting anchor profiles to constrain the PMF solutions and encourages the development of local reference profiles to minimize this impact on OA source apportionment.

This is a well-written paper that clearly describes methodologies, analyses and results. Even if the subject looks quite methodological and possibly suitable for more technical journals, actually the manuscript clarifies important and debated advantages/disadvantages of the approaches and can have a large impact on a wide audience of the atmospheric organic aerosol community.

For this reason, I recommend its publication after consideration of some major/minor comments/changes detailed below.

Major General comments:

-The discrepancies between original synthetic values and the PMF outputs on the same synthetic dataset are important and not negligible. From what I understand, they originate from the choice of the anchor profiles used to constrain PMF solutions. Considering that it is now very common (and indeed strongly recommended by the AMS-data Source Apportionment protocols) to use constraints to apportion OA primary components, I also believe that this result is somewhat worrying. In particular, the fact that POA are underestimated and SOA overestimated with respect to the original synthetic values might indicate that using constraints cannot always be the best option. Although I recognize this alone can be the topic of a specific publication, in my opinion the Authors in Section 3.1 should try to assess the general implications also with respect to previous/future studies applying OA source apportionment protocols. Specifically, the Authors should explain better the reasons for the choice of specific anchor profiles used in the different PMF solutions and possibly show (or at least comment on) the results of the unconstrained PMF solutions on the synthetic dataset. More specific comments are below.

-I find the term “truth” to describe the synthetic dataset quite pretentious and misleading. I would suggest to change the name, especially in Figures, using for instance “original synthetic” or something else.

Specific comments

Abstract

P1-P2, L40-46: too general statements, not easy to really understand the importance of the findings and “quantify” them. Sentences like “although the rolling PMF profile adaptability feature has been proven advantageous” or “these results highlighted the impact of profile anchor on the solution” are quite hasty and vague: what this impact is? And the advantages? Although I acknowledge that it is not easy to find a quantitative and synthetic way (suitable for an abstract) to evaluate these advantages/impacts, I would recommend trying to do it by elaborating more (using also some numbers if possible) and / or removing too general sentences.

P2, L46-47: “The results of this comparison….were scarce.”, it is redundant, please remove the sentence and/or integrate with the rest.

P3, L104: what is “o their” meaning? is it a spelling mistake? In general, this sentence is hard to follow, consider to re-phrase it.

P3, L114: I am not a native English speaker but the use of "concerning" as a comparative clause sounds strange to me. Please check here and also in other parts of the text.

P4, L118: again, the term “granularity” sounds quite strange to me associated with timestamps, consider to replace with “size” or other.

P4, L146-148: the last two sentences of the paragraph appear redundant or not clear. Please, consider to re-phrase.

P5, L168-171: the verb of the main clause seems to be missing. Please add it or re-phrase.

P8, L265-266: what about the unconstrained application of PMF? Did you try? How unconstrained solutions perform in comparison with original synthetic values?

P8, L275-280: As already mentioned, the discrepancies reported in this section of the paper are a major issue that deserves more emphasis and possibly more elaborations. The risk is that, considering that the ability to reconstruct even a synthetic dataset is low, someone could question the OA source apportionment protocols and argue that PMF results are in general not robust in apportioning real-world sources, at least the ones using POA chemical features. I’m not saying this is true, but in my opinion the Authors should not underestimate the importance of these findings and spend more words to explain what their implications are in applying the OA source apportionment protocols in other studies (past and future). For instance, the analysis of the unconstrained-PMF runs and a comparison with the “best” solutions identified following the protocol could be worth of an assessment or at least comments.

P9, L303-305: here (or somewhere else in this section) a comparison with unconstrained PMF would be very welcome.

P9, L308: “m/z” is repeated.

P11, L357-360: could the higher errors on OOA factors be also due to the fact that SOA sources are changing between seasons? I mean for sake of simplicity in the comparison, SOA components are represented here by only two factors (MO- and LO-OOA) but it is possible that the model (especially rolling application) is able to split SOA in more factors, leaving it the freedom to go to higher number of factors. Could the Authors comment on this?

P11, L360-362: it is well known that BBOA profiles have a higher variability among sites and seasons. Could the BBOA positive whiskers be also influenced by the fact that this variability is better reproduced by rolling PMF?

P11, L364-369: the discussion here is quite misleading: if I understand well, Figure 4 shows the pie-charts without site-specific sources (which are very similar in term of relative contributions), while later on there is a list of differences of ratios probably based on the absolute factor-contributions to the total OA concentrations. In general please clarify why the pie charts are so similar while the ratios reported in the text are so different.

P11, L372-373: what about the discrepancy in Cyprus?

P12, L414-415: “The histogram …is plotted as a histogram”, please consider to re-phrase the sentence.

P13, L430: “are always greater for rather than for .” This is not true for BBOA vs BCwb and for MO-OOA vs SO4.

P14, L474: Figure S8 is reported as Figure S7 in the Supplementary. Check for consistency. Moreover, in this figure I suggest to change the color and/or the thickness of the lines because of difficult reading.

P15, L514-521: I would suggest the Authors to use some of these considerations also to improve the vague sentences of the abstract

Supplementary

P1, L29, right-bottom side: there is a “Sorry” to be deleted.

Citation: https://doi.org/10.5194/egusphere-2022-269-RC3
- AC3:
  'Reply on RC3', Marta Via, 30 Aug 2022
  Reply to Anonymous Referee #3 (Egusphere-2022-269)
  Received and published: 2nd August 2022
  The authors would like to thank the reviewer for the comments and suggestions, which helped improve the quality of this work. A new version of the manuscript has been prepared following the suggestions from the reviewers. We provide below detailed replies to each of the comments in a point-by-point manner. Figures and tables cited in this document can be found in the supplementary annex.
  COMMENTS FROM THE REVIEWER
  
  The manuscript presents a comparison between two different approaches of PM₁ organic aerosol (OA) source apportionment through the Positive Matrix Factorization (PMF) source-receptor model applied on mass spectra by Aerosol Chemical Speciation Monitor (ACSM): the widely used “seasonal” PMF against the emerging “rolling” PMF. The two approaches are systematically applied on both real-world ACSM datasets (from 9 European sites) and on a synthetic dataset. The comparison shows that the two approaches lead to similar apportionment results, both meeting the quality standards required by the source apportionment protocol. Overall, “rolling” PMF can be considered more accurate, especially in “transition” periods between seasons, because it adapts better to changes in OA sources over time. Interestingly, the application of PMF to the synthetic dataset did not perform so well for either methodology. This result shows the strong influence of using and selecting anchor profiles to constrain the PMF solutions and encourages the development of local reference profiles to minimize this impact on OA source apportionment.
  This is a well-written paper that clearly describes methodologies, analyses and results. Even if the subject looks quite methodological and possibly suitable for more technical journals, actually the manuscript clarifies important and debated advantages/disadvantages of the approaches and can have a large impact on a wide audience of the atmospheric organic aerosol community.
  For this reason, I recommend its publication after consideration of some major/minor comments/changes detailed below.
  
  MAJOR COMMENTS:
  The discrepancies between the original synthetic values and the PMF outputs on the same synthetic dataset are important and not negligible. As far as I understand, they originate from the choice of the anchor profiles used to constrain PMF solutions. Considering that it is now very common (and indeed strongly recommended by the AMS-data Source Apportionment protocols) to use constraints to apportion OA primary components, I also believe that this result is somewhat worrying. In particular, the fact that POA are underestimated and SOA overestimated with respect to the original synthetic values might indicate that using constraints is not always the best option. Although I recognize this alone can be the topic of a specific publication, in my opinion the Authors in Section 3.1 should try to assess the general implications also with respect to previous/future studies applying OA source apportionment protocols. Specifically, the Authors should explain better the reasons for the choice of specific anchor profiles used in the different PMF solutions and possibly show (or at least comment on) the results of the unconstrained PMF solutions on the synthetic dataset. More specific comments are below.
  
  One of the unexpected outcomes of this paper is the acknowledgement of the influence of the constraints on the solution. The factors resolved by PMF are biased by the a priori information introduced to initialise the F matrix. This finding could therefore be a major reason to overhaul the source apportionment protocols so that they do not introduce too much subjectivity.
  For the multi-site comparison, the profiles were chosen by the participating PMF runners according to previous knowledge of the site-specific OA source profiles. In some cases, the cited studies of the individual source apportionments mention a-value and reference profile testing. The most used reference profiles were HOA from Crippa et al. (2013) and BBOA from Ng et al. (2011), and for this reason these were selected for the synthetic dataset approach.
  In the following graph, the seasonal unconstrained and constrained experiments are compared to the truth. The rolling results are not shown because they are expected to point in the same direction as the seasonal ones, as has been demonstrated multiple times in this paper.
  Figure R.3.1, now included in the SI as Figure S2, shows that the key ions are in all cases better described by the constrained experiment. The HOA truth profile is the same as the anchor, so the proximity of the points to truth was expected. In the BBOA and SOA cases, by contrast, the constrained runs are closer to truth even though: i. the constrained profile differed from the truth one in the case of BBOA; ii. no constraints were applied in the case of SOA. This figure shows that applying constraints is still beneficial, even though this study shows that they significantly influence the solution. Furthermore, even a factor that is not constrained (like SOA in the current situation) is better defined if other factors are constrained.
  The current study does not intend to comprehensively determine the role of anchors in the continuously evolving source apportionment guidelines, but to raise awareness of the effect they might have on the solution. However, anchoring still provides more benefit than unconstrained runs, so these procedures should be kept in source apportionment protocols. Future research in this direction should be promoted in order to build a consensus on profile anchoring that relies on solid evidence.
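To make the anchoring discussed above more concrete, the following is a minimal sketch of the a-value idea mentioned in this reply. It assumes the commonly used formulation in which each element of a constrained profile may deviate from the reference (anchor) profile by at most a relative fraction a. The function names and any numbers used with them are hypothetical and are not taken from the manuscript or from any PMF software; renormalisation of the profile, which can slightly shift the effective limits, is ignored.

```python
def a_value_bounds(reference, a):
    """Per-element limits for a profile anchored to `reference`.

    Assumption: every element f_j of the constrained profile must satisfy
    f_ref_j * (1 - a) <= f_j <= f_ref_j * (1 + a), with 0 <= a <= 1.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0 and 1")
    lower = [f * (1.0 - a) for f in reference]
    upper = [f * (1.0 + a) for f in reference]
    return lower, upper


def within_bounds(profile, reference, a, tol=1e-12):
    """True if every element of `profile` respects the a-value limits."""
    lower, upper = a_value_bounds(reference, a)
    return all(lo - tol <= f <= hi + tol
               for f, lo, hi in zip(profile, lower, upper))
```

With a = 0 only the anchor itself is accepted, while larger a-values leave the solver more room to move away from the anchor, which is one way to picture why the choice of anchor can pull the solution away from the truth profile.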
  
  I find the term “truth” to describe the synthetic dataset quite pretentious and misleading. I would suggest to change the name, especially in Figures, using for instance “original synthetic” or something else.
  
  The choice of “truth” as the word to refer to the original factors was already justified in Comment #5 of the Anonymous Referee #1.
  
  SPECIFIC COMMENTS
  
  P1-P2, L40-46: too general statements, not easy to really understand the importance of the findings and “quantify” them. Sentences like “although the rolling PMF profile adaptability feature has been proven advantageous” or “these results highlighted the impact of profile anchor on the solution” are quite hasty and vague: what is this impact? And what are the advantages? Although I acknowledge that it is not easy to find a quantitative and synthetic way (suitable for an abstract) to evaluate these advantages/impacts, I would recommend trying to do it by elaborating more (using also some numbers if possible) and/or removing too general sentences.
  
  These two sentences were rephrased as follows:
  This approach revealed similar apportionment results amongst methods, although the rolling PMF profile adaptability feature has been proven advantageous as it generated output profiles moving nearer to the truth points. Nevertheless, these results highlighted the impact of the profile anchor on the solution, as the use of a different anchor with respect to the truth led to significantly different results in both methods.
  As the reviewer mentioned, summarising this technical paper in an abstract is difficult, and no further quantification or numbers could be added here.
  
  P2, L46-47: “The results of this comparison.... were scarce.”, it is redundant, please remove the sentence and/or integrate with the rest.
  
  This sentence referred to the multi-site comparison, whose results point in the same direction as those from the synthetic dataset. To make this clearer, the sentence was rephrased as follows:
  The results of this multi-site comparison coincide with the synthetic dataset in terms of rolling-seasonal similarity and rolling PMF reporting moderate improvements.
  
  P3, L104: what is “o their” meaning? is it a spelling mistake? In general, this sentence is hard to follow, consider to re-phrase it.
  
  This sentence was removed as it was confusing and did not add information. The paragraph without it is reproduced here:
  (…) Some of them contain site-specific sources related to instrument artefacts or proximity to pollution hotspots. The factors identified at all sites are Hydrocarbon-like OA (HOA), (…)
  
  P3, L114: I am not a native English speaker but the use of "concerning" as a comparative clause sounds strange to me. Please check here and also in other parts of the text.
  
  This word was replaced here as follows:
  It works at a lower mass-to-charge resolution but it is more robust compared to the Aerosol Mass Spectrometer (AMS, Aerodyne Research Inc, Billerica, MA, USA) allowing for long-term deployment.
  It was also replaced throughout the document to improve readability.
  
  P4, L118: again, the term “granularity” sounds quite strange to me associated with timestamps, consider to replace with “size” or other.
  
  This word was removed from the text and the sentence was modified as follows:
  The resolution of ToF-ACSM datasets (10 minutes) was averaged to 30 minutes (resolution of the Q-ACSM) to harmonise the timestamps.
  
  P4, L146-148: the last two sentences of the paragraph appear redundant or not clear. Please, consider to re-phrase.
  
  The whole paragraph was rephrased for the sake of clarity:
  The first step for the synthetic dataset creation was to select p (number of factors), POA and SOA spectral profiles from the High-Resolution AMS Spectral database (Crippa et al., 2013; Ng et al., 2010; Ulbrich et al., 2009) and multiply them by the time series of the same sources from the model output. The error matrix was generated following the same steps as for real-world data, and real-world parameters were used as detailed in SI. For this purpose, the dataset used is that from the Zurich site, which ranges from February 2011 until December 2011. Hence, the same period of the CAMx output time series was used to generate the concentration matrix. Gaussian noise was subsequently added to the resulting matrix.
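The construction described in this paragraph (profiles multiplied by source time series, followed by Gaussian noise) can be sketched as below. This is a minimal illustration, not the authors' procedure: the function name, the `noise_sd` argument and the choice to scale standard normal noise by a user-supplied standard deviation are assumptions, and the error-matrix calculation detailed in the SI is not reproduced here.

```python
import numpy as np


def make_synthetic_matrix(time_series, profiles, noise_sd, seed=0):
    """Build X = G F and a noisy copy of it.

    time_series: (n_times, p) source contributions G, e.g. from model output.
    profiles:    (p, n_mz) spectral profiles F; each row is normalised to 1.
    noise_sd:    scalar or (n_times, n_mz) array, standard deviation of the
                 Gaussian noise (hypothetical parameter of this sketch).
    """
    G = np.asarray(time_series, dtype=float)
    F = np.asarray(profiles, dtype=float)
    if G.shape[1] != F.shape[0]:
        raise ValueError("time_series and profiles disagree on p")
    F = F / F.sum(axis=1, keepdims=True)  # each profile sums to 1
    clean = G @ F                         # noise-free concentration matrix
    rng = np.random.default_rng(seed)
    noisy = clean + rng.standard_normal(clean.shape) * noise_sd
    return clean, noisy
```

Because each profile is normalised, the row sums of the noise-free matrix equal the summed source contributions at each time step, which gives a quick sanity check on the inputs. Note that additive noise can produce negative values; how these are handled is left open here.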
  
  P5, L168-171: the verb of the main clause seems to be missing. Please add it or rephrase.
  
  The missing verb was introduced as follows:
  In order to reach an environmentally reasonable local Q minimum, the implementation of constraints on Primary Organic Aerosol (POA) factors has been performed according to the COLOSSAL guidelines for source apportionment (COLOSSAL, COST Action CA16109, 2019) and the protocol from Chen et al. (2022).
  
  P8, L265-266: what about the unconstrained application of PMF? Did you try? How unconstrained solutions perform in comparison with original synthetic values?
  
  This issue is extensively tackled in the reply to the first major comment. In summary, the unconstrained solutions are less accurate than the constrained ones, even though the latter also differ significantly from truth. One should therefore be aware that anchors do bias the results but still reproduce the truth better than unconstrained PMF runs, so anchoring should still be advised in source apportionment protocols.
  
  P8, L275-280: As already mentioned, the discrepancies reported in this section of the paper are a major issue that deserves more emphasis and possibly more elaboration. The risk is that, considering that the ability to reconstruct even a synthetic dataset is low, someone could question the OA source apportionment protocols and argue that PMF results are in general not robust in apportioning real-world sources, at least the ones using a priori POA chemical features. I’m not saying this is true, but in my opinion the Authors should not underestimate the importance of these findings and should spend more words explaining their implications for applying the OA source apportionment protocols in other studies (past and future). For instance, the analysis of the unconstrained-PMF runs and a comparison with the “best” solutions identified following the protocol could be worth an assessment or at least a comment.
  
  The discussion requested in this comment is provided in the reply to the following comment.
  
  P9, L303-305: here (or somewhere else in this section) a comparison with unconstrained PMF would be very welcome.
  
  The text was enriched by adding some more discussion as follows:
  The influence of reference profile constraints might have enhanced the misattribution of the profiles. For example, imposing m/z44-to-m/z43 ratios has led to a significant difference in the degree of oxidation of the solution with respect to truth. Nevertheless, constraining profiles has provided more accurate solutions than unconstrained set-ups, as shown in Figure S2. These plots show how seasonal constrained PMF launches always present higher similarity to truth in terms of key ion ratios. Moreover, OA sources of unanchored runs were less robust due to lower reproducibility across repeated runs. By extension, rolling results are expected to reproduce the same results, as it has been proven that the outcomes of both techniques converge sufficiently.
  And the conclusions:
  (…), as it differed significantly from the truth results when the anchor was significantly different to the truth profile. However, the use of profile constraints still provided solutions closer to the truth than unconstrained PMF. Besides, the rolling method has been proven to give a more sensitive representation (…)
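The key-ion comparison referred to in this reply (m/z44-to-m/z43 ratios of a resolved profile versus the truth profile) can be illustrated with the short sketch below. The function names are hypothetical, the profiles are plain lists aligned with an m/z axis, and this is not the calculation behind Figure S2, only one simple way to express how far a resolved ratio lies from the truth.

```python
def ion_fraction(profile, mz_axis, mz):
    """Fraction of the total profile signal found at a given m/z."""
    total = sum(profile)
    return profile[mz_axis.index(mz)] / total


def ratio_44_43(profile, mz_axis):
    """m/z 44 to m/z 43 ratio of a profile (higher means more oxidised)."""
    return ion_fraction(profile, mz_axis, 44) / ion_fraction(profile, mz_axis, 43)


def relative_deviation(value, truth):
    """Signed relative deviation of a resolved value from the truth value."""
    return (value - truth) / truth
```

Applying `relative_deviation` to the ratios of a constrained and an unconstrained run against the same truth profile would give two numbers whose absolute values can be compared directly; the run with the smaller one is closer to truth for this particular ion pair.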
  
  P9, L308: “m/z” is repeated.
  
  This change was already addressed in Anonymous Referee #1 minor comment number one.
  
  References
  Crippa, M., DeCarlo, P. F., Slowik, J. G., Mohr, C., Heringa, M. F., Chirico, R., Poulain, L., Freutel, F., Sciare, J., Cozic, J., Di Marco, C. F., Elsasser, M., Nicolas, J. B., Marchand, N., Abidi, E., Wiedensohler, A., Drewnick, F., Schneider, J., Borrmann, S., Nemitz, E., Zimmermann, R., Jaffrezo, J. L., Prévôt, A. S. H. and Baltensperger, U.: Wintertime aerosol chemical composition and source apportionment of the organic fraction in the metropolitan area of Paris, Atmos. Chem. Phys., 13(2), 961–981, doi:10.5194/acp-13-961-2013, 2013.
  Ng, N. L., Canagaratna, M. R., Jimenez, J. L., Chhabra, P. S., Seinfeld, J. H. and Worsnop, D. R.: Changes in organic aerosol composition with aging inferred from aerosol mass spectra, Atmos. Chem. Phys., 11(13), 6465–6474, doi:10.5194/acp-11-6465-2011, 2011.
  
  Citation: https://doi.org/10.5194/egusphere-2022-269-AC3

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Marta Via on behalf of the Authors (30 Aug 2022)

ED: Publish as is (31 Aug 2022) by Mingjin Tang

AR by Marta Via on behalf of the Authors (31 Aug 2022)

Post-review adjustments

AA: Author's adjustment | EA: Editor approval

AA by Marta Via on behalf of the Authors (23 Sep 2022)

EA: Adjustments approved (23 Sep 2022) by Mingjin Tang

The requested paper has a corresponding corrigendum published. Please read the corrigendum first before downloading the article.

Short summary

This work presents the differences resulting from two techniques (rolling and seasonal) of the positive matrix factorisation model that can be run for organic aerosol source apportionment. The current state of the art suggests that the rolling technique is more accurate, but no proof of its effectiveness has been provided yet. This paper tackles this issue in the context of a synthetic dataset and a multi-site real-world comparison.