25 Feb 2022
Status: this preprint is currently under review for the journal AMT.

Automated identification of local contamination in remote atmospheric composition time series

Ivo Beck1, Hélène Angot1, Andrea Baccarini1, Lubna Dada1, Lauriane Quéléver2, Tuija Jokinen2,3, Tiia Laurila2, Markus Lampimäki2, Nicolas Bukowiecki4, Matthew Boyer2, Xianda Gong5, Martin Gysel-Beer6, Tuukka Petäjä2, Jian Wang5, and Julia Schmale1
  • 1Extreme Environments Research Laboratory, École Polytechnique Fédérale de Lausanne, Switzerland
  • 2Institute for Atmospheric and Earth System Research, INAR/Physics, FI-00014 University of Helsinki, Finland
  • 3Climate & Atmosphere Research Centre (CARE-C), The Cyprus Institute, P.O. Box 27456, Nicosia, 1645, Cyprus
  • 4Atmospheric Sciences, Department of Environmental Sciences, University of Basel, Switzerland
  • 5Center for Aerosol Science and Engineering, Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
  • 6Laboratory of Atmospheric Chemistry, Paul Scherrer Institute, Villigen PSI, Switzerland

Abstract. Atmospheric observations in remote locations offer the possibility to explore trace gas and particle concentrations in pristine environments. However, data from remote areas are often contaminated by pollution from local sources. Detecting this pollution is thus a central and frequently encountered issue. Consequently, many different methods exist today to identify local pollution in atmospheric composition measurement time series, but no single method has been widely accepted. In this study, we present a new method to identify primary pollution in remote atmospheric datasets, e.g., from ship campaigns or stations with low background signal compared to the pollution signal. The Pollution Detection Algorithm (PDA) identifies and flags periods of polluted data in five steps. The first and most important step identifies polluted periods based on the gradient (time-derivative) of a concentration over time. If this gradient exceeds a given threshold, data are flagged as polluted. The further pollution identification steps are a simple concentration threshold filter, an optional neighboring points filter, a median filter, and an optional sparse data filter. The PDA relies only on the target dataset itself and is independent of ancillary datasets such as meteorological variables. All parameters of each step are adjustable so that the PDA can be “tuned” to be more or less stringent (i.e., to flag more or fewer data points as polluted).
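To make the first two steps concrete, the following is a minimal sketch and not the published PDA code. It assumes a fixed absolute gradient threshold, flags both samples that bound a steep interval, and then applies a plain upper concentration limit. The function name `flag_pollution`, the parameter names, and the example values are hypothetical and chosen only for illustration. The neighboring points, median, and sparse data filters are omitted because their exact rules are not given in the abstract.

```python
from typing import List, Sequence


def flag_pollution(
    times_s: Sequence[float],
    conc: Sequence[float],
    max_gradient: float,
    max_conc: float,
) -> List[bool]:
    """Return one flag per sample; True means 'treated as polluted'.

    Hypothetical sketch of a gradient filter followed by a
    concentration threshold filter. Thresholds are user-chosen.
    """
    if len(times_s) != len(conc):
        raise ValueError("times_s and conc must have the same length")

    n = len(conc)
    flags = [False] * n

    # Gradient filter: compute the finite-difference time derivative
    # between consecutive samples. Assumption: when its magnitude
    # exceeds max_gradient, both samples bounding the interval are flagged.
    for i in range(1, n):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("times_s must be strictly increasing")
        gradient = (conc[i] - conc[i - 1]) / dt
        if abs(gradient) > max_gradient:
            flags[i - 1] = True
            flags[i] = True

    # Threshold filter: flag any sample above a fixed concentration limit.
    # This catches sustained plateaus whose interior has a small gradient.
    for i in range(n):
        if conc[i] > max_conc:
            flags[i] = True

    return flags


# Made-up one-minute samples with a three-point pollution plateau.
times = [0, 60, 120, 180, 240, 300, 360]
values = [100, 105, 900, 905, 903, 110, 108]
flags = flag_pollution(times, values, max_gradient=2.0, max_conc=500.0)
print(flags)  # [False, True, True, True, True, True, False]
```

In this made-up series the gradient filter alone misses the middle of the plateau (index 3), because the concentration there is high but steady; the threshold filter then flags it. The sketch also flags the clean samples adjacent to the plateau edges, which is a consequence of the both-ends assumption and makes the result more stringent.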

The PDA was developed and tested with a particle number concentration dataset collected during the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition in the Central Arctic. Using strict settings, we identified 62 % of the data as influenced by local pollution. Using a second, independent particle number concentration dataset also collected during MOSAiC, we evaluated the performance of the PDA against the same dataset cleaned by visual inspection. The two methods agreed in 94 % of the cases. Additionally, the PDA was successfully applied to a trace gas dataset (CO2), also collected during MOSAiC, and to another particle number concentration dataset, collected at the high-altitude background station Jungfraujoch, Switzerland. Thus, the PDA proves to be a useful and flexible tool to identify periods affected by local pollution in atmospheric composition datasets without the need for ancillary measurements. It is best applied to data representing primary pollution. The user-friendly and open-access code enables reproducible application to a wide suite of different datasets. It is available at:

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on amt-2021-429', Anonymous Referee #1, 06 Mar 2022
    • AC1: 'Reply on RC1', Julia Schmale, 19 May 2022
  • RC2: 'Comment on amt-2021-429', Anonymous Referee #2, 20 Mar 2022
    • AC2: 'Reply on RC2', Julia Schmale, 19 May 2022

Model code and software

Pollution Detection Algorithm (PDA) Ivo Beck, Hélène Angot, Andrea Baccarini, Markus Lampimäki, Matthew Boyer, Julia Schmale

Total article views: 549 (including HTML, PDF, and XML), calculated since 25 Feb 2022
  • HTML: 424
  • PDF: 108
  • XML: 17
  • Total: 549
  • BibTeX: 9
  • EndNote: 10

Viewed (geographical distribution)

Total article views: 495 (including HTML, PDF, and XML), of which 495 have a defined geography and 0 are of unknown origin.
Latest update: 24 May 2022
Short summary
We present the Pollution Detection Algorithm (PDA), a new method to identify pollution in remote atmospheric aerosol and trace gas data. The PDA identifies periods of polluted data and relies only on the target dataset itself, i.e., it is independent of ancillary data such as meteorological variables. The parameters of all steps are adjustable so that the PDA can be tuned to be more or less stringent. The PDA is available as open-access code.