the Creative Commons Attribution 4.0 License.
Identification of spikes in continuous ground-based in-situ time series of CO2, CH4 and CO: an extended experiment within the European ICOS-Atmosphere Network
Paolo Cristofanelli
Cosimo Fratticioli
Lynn Hazan
Mali Chariot
Cedric Couret
Orestis Gazetas
Dagmar Kubistin
Antti Laitinen
Ari Leskinen
Tuomas Laurila
Matthias Lindauer
Giovanni Manca
Michel Ramonet
Pamela Trisolino
Martin Steinbacher
Abstract. The identification of spikes (i.e. short and high variability in the measured signals due to very local emissions occurring in the proximity of a measurement site) is of interest when using continuous measurements of atmospheric greenhouse gases (GHGs) in different applications like the determination of long-term trends and/or spatial gradients, the inversion experiments devoted to the top-down quantification of GHG surface-atmosphere fluxes, the characterization of local emissions or the quality control of GHG measurements. In this work, we analysed the results provided by two automatic spike identification methods (i.e. the standard deviation of the background - SD and the robust extraction of baseline signal - REBS) for a 2-year dataset of 1-minute in-situ observations of CO2, CH4 and CO at ten different atmospheric sites spanning different environmental conditions (remote, continental, urban).
The sensitivity of the spike detection frequency, and of its impact on the averaged mole fractions, to the method parameters was investigated. Results for both methods were compared and evaluated against manual identification by the site Principal Investigators (PIs).
The study showed that, for CO2 and CH4, REBS identified a larger number of spikes than SD and it was less “site-sensitive” than SD. This led to a larger impact of REBS on the time-averaged values of the observed mole fractions for CO2 and CH4. Further, it could be shown that it is challenging to identify one common algorithm/configuration for all the considered sites: method-dependent and setting-dependent differences in the spike detection were observed as a function of the sites, case studies and considered atmospheric species. Neither SD nor REBS appeared to provide a perfect identification of the spike events. The REBS tendency to over-detect the spike occurrence shows limitations when adopting REBS as an operational method to perform automatic spike detection. REBS should be used only for specific sites, mostly affected by frequent very nearby local emissions. SD appeared to be more selective in identifying spike events and the temporal variabilities of CO2, CH4 and CO were more consistent with that of the original datasets. Further activities are needed for better consolidating the fitness for purposes of the two proposed methods and to compare them with other spike detection techniques.
Paolo Cristofanelli et al.
Status: final response (author comments only)
RC1: 'Comment on amt-2023-130', Anonymous Referee #1, 14 Aug 2023
Referee comment on manuscript: amt-2023-130
Identification of spikes in continuous ground-based in-situ time series of CO2, CH4 and CO: an extended experiment within the European ICOS-Atmosphere Network
Paolo Cristofanelli et al.
General comments:
In this paper the authors evaluated the effectiveness of two types of algorithms that can be used to automatically identify spikes in continuous long-term data sets. Their application was specifically focussed on GHG species and the observations within the ICOS network (which consists of various sites, each having unique site-specific properties). The explanation of the two algorithms (one based on the standard deviation of the background - SD and the other a robust extraction of baseline signal - REBS) and of the factors that can be tuned for each measuring station is of interest to the wider scientific community and certainly applicable to other, non-ICOS measuring stations that also produce continuous, long-term data records. The graphics in the paper are relevant and clearly show the reader the observed differences between the applied filter(s) and the original data set. This paper is well written and very logically structured. The conclusions that were arrived at are justified and well articulated.
Specific comments:
P5 Line 150: It would be beneficial if the authors could state the approximate number of tourists per day / season. This would give the reader additional information on their possible impact at the sites (also for other sites mentioned where visitor platforms are in close proximity to instrument inlets ...).
P11 Line 332: The authors mentioned “…again, significant impacts of the de-spiking were observed…” Was this “significance” statistically tested? It would add value to the discussion if the authors could mention a confidence level / statistical evaluation of this significance.
A general comment - Have the authors considered making use of CO as a spike filter? It might be worthwhile for the authors to consider adding a second, independent spike filter parameter, such as CO to aid in the refining of the primary spike detection technique. This will certainly assist in instances where anthropogenic emissions are the source of these spikes. It might also only be quite site specific, and not applicable to a general solution…
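One way to picture the suggestion above is a coincidence check, sketched here in Python. This is only an illustration of the referee's idea, not a method from the manuscript: the function name `confirm_with_co`, the `window` parameter and the example flag series are all hypothetical, and the flags are assumed to come from whichever primary detector (SD or REBS) is in use.

```python
def confirm_with_co(ghg_flags, co_flags, window=2):
    """Keep a GHG spike flag only if CO is also flagged within +/- window samples.

    Hypothetical helper: both inputs are equal-length sequences of 0/1 (or
    bool) spike flags on the same time grid, e.g. 1-minute data.
    """
    if len(ghg_flags) != len(co_flags):
        raise ValueError("flag series must have the same length")
    n = len(ghg_flags)
    confirmed = []
    for i, flagged in enumerate(ghg_flags):
        # Look for a CO flag in a small neighbourhood, so that slight timing
        # offsets between the two species still count as coincident.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        co_nearby = any(co_flags[lo:hi])
        confirmed.append(bool(flagged) and co_nearby)
    return confirmed


# Made-up flags: the CO2 spike at index 6 has no CO counterpart nearby.
co2_flags = [0, 1, 1, 0, 0, 0, 1, 0, 0, 0]
co_flags = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
confirmed = confirm_with_co(co2_flags, co_flags, window=1)
```

In this made-up series, the two CO2 flags next to the CO spike are retained and the isolated CO2 flag is dropped. As the referee notes, such a filter would only help where the local source also emits CO, so it is likely to be site specific.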
Technical corrections:
P6 Line 164: Sentence construction requires a re-write/rephrase… “…agricultural activities taking place during the livestock farming…” replace “taking place during the” with “from”
P10 Line 308: Replace “Not” with “No”
P13 Line 403: Rephrase “…by looking into CO also REBS…” with ”…by looking at CO, REBS also …”
P13 Line 406: Rephrase “…the most part of spike events…” with “ …most of the spike events…”
P15 Line 474: Replace “…not…” with “…no…”
Citation: https://doi.org/10.5194/amt-2023-130-RC1
RC2: 'Comment on amt-2023-130', Anonymous Referee #2, 18 Aug 2023
Cristofanelli et al. applied two automatic spike identification methods, SD and REBS, to continuous observations of CO2, CH4 and CO from ICOS-RI network. They conducted a comprehensive comparison of both methods for measurements across different time scales (hourly, monthly) at sites within various environments (remote, continental, urban). The manuscript is well-written and falls within the scope of AMT. Thus, I recommend its publication with a few minor revisions.
- While I agree with the authors that the ability to detect spikes for each method is a function of the sites, events, and considered species, I strongly suggest that the authors compile a summary table at the end, outlining their recommendations for spike identification methods and the parameters utilized for each method at each site.
- The figures are dense, especially Figures 8-10. Please simplify them and consider moving some subplots into the supplement.
- Line 70-71, it is recommended to add the definition of “regional” signal and to ensure a distinct differentiation between "local" and "regional" signals.
- Line 104: Change “to of access” to “to access”
Citation: https://doi.org/10.5194/amt-2023-130-RC2
RC3: 'Comment on amt-2023-130', Anonymous Referee #3, 04 Sep 2023
The paper presents a systematic analysis of spike detection algorithms applied to ICOS atmospheric data for CH4, CO2, and CO, for a variety of sites. In general the paper is well written, and I recommend publication after the following concerns have been addressed.
General Comments:
As the authors correctly point out in the introduction, it is “very local emissions” that are of concern when using the data in inverse atmospheric transport models. I miss a discussion of this main use of ICOS atmosphere data in the discussion section. The basic question that needs discussion is what we expect models to represent. For example, the “very local sources of CH4 due to the systematic venting of cattle farms located in the proximity of the site” (Lines 374-375) for IPR could be included in an atmospheric transport model, if the resolution is sufficiently high.
Furthermore, there was a discussion of using buffer volumes to time-integrate samples at ICOS sites with multiple vertical levels, such that meaningful instantaneous gradient information can be obtained. This is also mentioned in the cited ICOS RI 2020 (Atmosphere Station Specifications V2.0). It should at least be mentioned how many ICOS sites are actually using buffers. Unfortunately, it is unclear whether any sites are using them, as the metadata available through ICOS-CP don’t seem to include any information on the use of a buffer volume (although recommended in ICOS RI 2020). If there are ICOS sites where buffer volumes are deployed, it should be discussed that for those a different filtering strategy needs to be deployed (e.g. de-convolution as demonstrated by Winderlich et al. (2010), followed by spike detection).
Reference: Winderlich, J., Chen, H., Gerbig, C., Seifert, T., Kolle, O., Lavrič, J. V., Kaiser, C., Höfer, A., and Heimann, M.: Continuous low-maintenance CO2/CH4/CO measurements at the Zotino Tall Tower Observatory (ZOTTO) in Central Siberia, Atmos. Meas. Tech., 3, 1113–1128, https://doi.org/10.5194/amt-3-1113-2010, 2010.
Specific comments
Ln 47: “networks with surface footprints representative-enough of the tagged spatial regions” may be replaced with “networks whose surface footprints are representative enough of the tagged spatial regions”
Ln 49: “the measurement sites must carry out accurate measurements“ -> “accurate measurements are required at the measurement sites”
Ln 59-62: Is the only objective of the near-real time delivery the application of QA/QC checks? I would hope that the driving idea behind it is the utilization of the data in NRT.
Ln 79: “basing” -> “based”
Ln 104: “were used to of assess”: drop the “of”
Ln 107: before using site abbreviations (here “PUI”) I suggest using a reference to Table 1, to which I recommend adding the site names and countries as additional columns.
Ln 218: “but not at PUI” -> “except for PUI, where only CO2 and CH4 are measured”
Ln 231: What is meant by “combination”, one with “and” or with “or”? Or, in other words, are all spikes considered as spikes, or only those that are detected in both directions?
Ln 233: What happens if there is an interruption of data (e.g. the sampling switches to a different level), is “n” then still only counting data points, or is it counting time in minutes?
Table 1: In addition to site names and countries, also the sampling levels should be included as additional columns.
Ln 274: “Continental sites” – does this mean sites at a certain distance from the coast? Or non-mountain and non-island sites? Table 1 lists only “Remote” and “Non-remote” as site classifications; maybe one can also add “continental” vs. others (coastal, island, mountain). Now going through the paper again, I see that “continental” is described as an environmental condition, with the others being “remote” and “urban”. In that context, maybe this means “continental background”? There is a need for a clear characterization and nomenclature.
Fig. 4 caption: “differences in the percentiles of hourly mean values between de-spiked and original dataset“ – I would call this “the percentiles of hourly mean value differences between de-spiked and original dataset”. “differences in the percentiles” would not have units of ppb or ppm.
Fig. 4: I suggest for easier comparisons to use the same y-axis range for the REBS and SD results.
Ln 388: “which method was in better agreement”: please add what this agreement is with. Is it the expectation of the expert, as indicated in the following sentence?
Ln 394: what is meant by “ “standard” settings at JFJ”?
Ln 454: “this exercise would allow” may be replaced with “this exercise allows”?
Ln 468-469: note that in Ln 451-452 it is stated that “BIAS” should be as close as possible to 1. With the exception of site UTO, REBS thus shows better (not just “much higher”) BIAS than SD.
Ln 475: maybe replace “In respect to the all-spike analysis, both SD and REBS were more effective in catching events” with “Compared with the all-spike analysis, both SD and REBS were more effective in catching high-spike events” (if I understood this correctly). Also please refer to Table 3.
Ln 528: “running REBS on standard deviation records” this is not clear to me
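The questions raised above for Ln 231 and Ln 233 can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the two “directions” are taken to be a forward and a backward pass over the time series, and the flag values and timestamps are invented for the example rather than taken from the manuscript.

```python
# Hypothetical spike flags from a forward and a backward pass (1 = spike).
forward = [0, 1, 1, 0, 0, 1, 0]
backward = [0, 0, 1, 1, 0, 1, 0]

# "or" combination: a point is a spike if either pass flags it.
spikes_or = [bool(f or b) for f, b in zip(forward, backward)]
# "and" combination: a point is a spike only if both passes flag it.
spikes_and = [bool(f and b) for f, b in zip(forward, backward)]

# Hypothetical timestamps in minutes, with a data gap after minute 4
# (for instance while another sampling level is being measured).
minutes = [0, 1, 2, 3, 4, 35, 36]
i_unflagged, i_candidate = 4, 5

# "n" counted as a number of data points between the two samples ...
n_points = i_candidate - i_unflagged
# ... versus "n" counted as elapsed time in minutes.
n_minutes = minutes[i_candidate] - minutes[i_unflagged]
```

With these invented values the “or” rule flags four points and the “and” rule only two, while across the gap `n_points` is 1 and `n_minutes` is 31, so the two readings of “n” can differ widely whenever the record is interrupted.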
Citation: https://doi.org/10.5194/amt-2023-130-RC3