Articles | Volume 13, issue 7
Atmos. Meas. Tech., 13, 3815–3834, 2020

Research article 15 Jul 2020

Integration and calibration of non-dispersive infrared (NDIR) CO2 low-cost sensors and their operation in a sensor network covering Switzerland

Michael Müller1, Peter Graf1, Jonas Meyer2, Anastasia Pentina3, Dominik Brunner1, Fernando Perez-Cruz3, Christoph Hüglin1, and Lukas Emmenegger1
  • 1Empa, Swiss Federal Laboratories for Materials Science and Technology, Dübendorf, Switzerland
  • 2Decentlab GmbH, Dübendorf, Switzerland
  • 3Swiss Data Science Center, Zurich, Switzerland

Correspondence: Christoph Hüglin (


More than 300 non-dispersive infrared (NDIR) CO2 low-cost sensors labelled as LP8 were integrated into sensor units and evaluated for the purpose of long-term operation in the Carbosense CO2 sensor network in Switzerland. Prior to deployment, all sensors were calibrated in a pressure and climate chamber and in ambient conditions co-located with a reference instrument. To investigate their long-term performance and to test different data processing strategies, 18 sensors were deployed after calibration at five locations equipped with a reference instrument. Their accuracy over 19 to 25 months of deployment was between 8 and 12 ppm. This level of accuracy requires careful sensor calibration prior to deployment, continuous monitoring of the sensors, efficient data filtering, and a procedure to correct drifts and jumps in the sensor signal during operation. High relative humidity (> ∼85 %) impairs the LP8 measurements, and the corresponding data filtering results in a significant loss of data during humid conditions. The LP8 sensors are not suitable for the detection of small regional gradients and long-term trends. However, with careful data processing, the sensors are able to resolve CO2 changes and differences with a magnitude larger than about 30 ppm. Thereby, the sensor can resolve the site-specific CO2 signal at most locations in Switzerland. A low-power network (LPN) using LoRaWAN allowed for reliable data transmission with low energy consumption and proved to be a key element of the Carbosense low-cost sensor network.

1 Introduction

The number of available low-cost sensor types for ambient trace gas observations has increased in recent years. Frequently, these sensors are combined with wireless data transfer capabilities to form a versatile measurement unit. Low-cost sensors for trace gas measurements are based on different working principles such as metal-oxide semiconductors, electrochemical cells or non-dispersive infrared detection (NDIR). For CO2, NDIR is the most common technique (Lewis et al., 2018). Similarly to other instruments, knowledge of the sensors' characteristics such as sensitivity, cross sensitivity or ageing is important for meaningful applications. Moreover, the raw sensor output must be converted into the molar fraction of the target gas using a mathematical function. The mathematical models provided by the manufacturers are often not sufficient to meet the accuracy demands of trace gas measurements in outdoor conditions. Different approaches such as multilinear regression (Mueller et al., 2017; Martin et al., 2017; Spinelle et al., 2017), random forest models (Bigi et al., 2018; Zimmerman et al., 2018) or artificial neural networks (Spinelle et al., 2017) are investigated to derive better-performing sensor models. However, thorough model validation that is adequate with respect to the foreseen application is necessary for this task, especially as many data-driven models include parameters that were not shown to have a reproducible impact on the sensor signal. Some approaches also employ information in the model that is only valid in a statistical manner, such as similar pollutant concentrations at the sensor location and at the closest reference site during selected time periods (Mueller et al., 2017; Kim et al., 2018). The use of a standardized terminology for processing levels, as was recently proposed by Schneider et al. (2019), is recommended to clearly define the type of information a sensor model is based on.

The design of low-cost sensors usually relies on a less stable and less controlled measurement environment than high-end instruments. Therefore, the mathematical description of sensor behaviour must be flexible and robust enough to accommodate a wide range of operating conditions. Nevertheless, the accuracy level achieved by low-cost trace gas sensors is still significantly below that of high-precision instruments. This may be acceptable in view of their lower costs if the achievable data quality remains suitable for a specific application. Usually, low-cost sensors have to be individually calibrated for achieving their best performance, and data processing is an essential element to obtain accurate measurements. This data processing includes filtering to eliminate and report outliers or data of reduced quality and the detection of changes in sensor characteristics which require the adaptation of the model that converts the raw sensor output to the molar fraction.

Smart and dependable sensor integration is crucial for both high data quality and reliable and cost-efficient operation. A long-lasting autonomous sensor deployment requires that the sensor unit has low energy consumption, which depends on the energy consumption of the sensing device, the measurement frequency, the on-site data processing and the method that is used for data transmission. The latter can be achieved using the LoRaWAN protocol (LoRa-Alliance, 2019), which offers data transmission with highly reduced energy consumption compared to mobile communication networks such as GSM, UMTS and LTE.

Increasing the spatial coverage of a measurement network or reducing its costs by the operation of low-cost sensors is appealing. However, the number of long-term applications of low-cost sensors is still sparse (Mueller et al., 2017; Shusterman et al., 2016; Castell et al., 2017; Popoola et al., 2018). The total costs of the sensors, their calibration, deployment, data transmission, and data processing have to be in equilibrium with the information the sensors provide. Further technical and operational progress is required to enhance both the efficiency and the data quality of low-cost sensor networks and to eventually integrate more low-cost sensors into meaningful services. Examples of research activities in the field of lower-cost CO2 measurements and sensor networks are provided by Arzoumanian et al. (2019) and Shusterman et al. (2016).

In this study we present the deployment of more than 250 low-cost CO2 sensors in Switzerland in the framework of the Carbosense project, which aims to assess anthropogenic and natural CO2 fluxes in Switzerland through a combination of dense observations and high-resolution atmospheric transport modelling. The entire CO2 sensor network is formed by high-precision instruments, intermediate precision instruments and low-cost sensors. The accuracy of the low-cost sensors is clearly outside the extended compatibility goal of 0.2 ppm for CO2 proposed within the activities of the World Meteorological Organization (WMO) Global Atmosphere Watch (Tans and Zellweger, 2014). However, these sensors are not intended to resolve small regional gradients and trends in atmospheric CO2. Rather they should complement the high-precision measurements by providing information on short-term and local variations in CO2 on the order of several tens of parts per million as expected near emission sources, e.g. in the city of Zurich, or due to CO2 accumulation when the boundary layer is shallow.

This paper focuses on the calibration of the LP8 CO2 sensors, their operation within the Carbosense network, the sensor data processing and the achieved data quality. Most of the findings and developments carried out by means of the Carbosense sensor network such as aspects of data transmission and data processing are generic and transferable to other low-cost trace gas sensor networks.

2 Hardware and infrastructure

2.1 Carbosense network

The Carbosense CO2 sensor network covers the whole of Switzerland with a regional focus on the city of Zurich (Fig. 1). It is formed by three classes of sensors: (i) seven high-precision cavity ringdown spectrometers (Picarro G1301, G2302 and G2401); (ii) 20 temperature-stabilized, mains-powered NDIR medium-cost sensors with active sampling and reference gas supply (Senseair HPP; Hummelgard et al., 2015); and (iii) 300 nodes of battery-powered CO2 low-cost diffusive NDIR sensors (Senseair LP8). The deployment of the first low-cost sensors started in July 2017, and the network has been continuously extended, reaching 230 sensors by September 2019. The CO2 low-cost sensors are deployed at antenna locations of the telecommunication company Swisscom (4–150 m above ground), meteorological measurement sites of the Federal Office of Meteorology and Climatology (MeteoSwiss; 10 m above ground), and sites of the National Air Pollution Monitoring Network NABEL (5 m above ground). Within the city of Zurich, the sensors are also mounted on lamp posts or electricity poles (3–5 m above ground). The low-cost sensor network covers a wide altitude range from 200 to 2390 m a.s.l. and various orographic conditions and landscape types (urban areas, agricultural lands, forests, mountain areas). This implies a wide range of environmental conditions during the operation of the sensors. The deployment of the HPPs started in August 2018, and instruments were operating at 15 locations as of 1 September 2019.

Figure 1. Carbosense sensor network as of 13 September 2019. Red dots depict LP8 sensor locations; yellow dots depict HPP sensor locations; blue dots depict locations of Picarro instruments. The cantons or administrative divisions of Zurich and Ticino are plotted and marked by ZH and TI. Geographic data used for creating the base map originate from (last access: 29 October 2019) and (last access: 29 October 2019).

2.2 CO2 low-cost sensor unit

2.2.1 Integrated sensors

The CO2 low-cost sensor units (dimensions 110 mm × 80 mm × 65 mm) were engineered by Decentlab GmbH (Fig. 2). A sensor unit comprises a Senseair LP8 sensor (Senseair, 2019), a Sensirion SHT21 sensor (Sensirion, 2019), a LoRaWAN communication module, a microprocessor and two batteries for power supply. There is no active ventilation. The LP8 and SHT21 sensors are located close to the opening of the box to ensure fast response times. Dead volumes are kept as small as possible for the same reason. The LP8 sensor reports the infrared (IR) measurement, a CO2 molar fraction based on factory calibration, temperature and its status. The SHT21 sensor measures temperature and relative humidity (±0.3 °C, ±2 % RH). The measurement interval was set to 1 min for all the sensors, and the measurements are transmitted as 10 min averages together with the last single measurement of the infrared and temperature values over Swisscom's Low Power Network (LPN; based on LoRaWAN). However, during the first weeks of using the sensor units in spring 2017, only the last single values were transmitted for all the measurement types. Since the unit is not equipped with a pressure sensor, pressure has to be measured independently or has to be estimated from other information sources, which is possible with a small uncertainty of ±1 hPa as described in Sect. 3.2.

Figure 2. (a) CO2 low-cost sensor unit and LP8 sensor (front). (b) Schematic view of the sensor unit. The LP8 and SHT21 sensors reside close to the opening of the sensor unit. The sensing area of the LP8 is indicated by a thick line. The volume around the sensors is minimized on three sides (left, back, right) by filling material. The shape of the board separates the lower part of the unit from the upper part.


2.2.2 LP8 sensor

Operating conditions of the LP8 sensor are specified by the manufacturer as 0–50 °C, 0 % RH–85 % RH and 0–2000 ppm CO2. The specifications in terms of accuracy are ±50 ppm and ±3 % of reading (Senseair, 2019), which are insufficient for applications in ambient air. The LP8 sensor provides a CO2 measurement based on the factory calibration, the sensor temperature and sensor status information. In addition, the LP8 infrared measurement (preprocessed by the sensor firmware) is accessible. It enables a calibration based on an extended mathematical sensor model that relates the infrared measurement to the CO2 mole fraction χCO2 in moist air. The parameters of the sensor model have to be determined during a calibration process.

The LP8 is a non-dispersive infrared sensor, and, thus, its working principle is based on the Beer–Lambert law.

$$\log\frac{I_0}{I_1} = \epsilon_\lambda\, c\, d \tag{1}$$

I0 and I1 denote the emitted and detected light; c is the number density of the gas (mol m−3); ϵλ is the molar attenuation coefficient (m2 mol−1); d is the path length (m) of the beam of light through the cell.

The number of moles of CO2 (nCO2) equals

$$n_{\mathrm{CO_2}} = \chi_{\mathrm{CO_2}}\, \frac{p\,V}{R\,T} = \chi_{\mathrm{CO_2}}\, n_{P_0,T_0}\, \frac{p\,T_0}{p_0\,T}, \tag{2}$$

with p, T and V denoting the pressure, temperature and volume of the gas, p0=1013.25 hPa and T0=273.15 K the standard pressure and temperature, R the universal gas constant (8.3145 J K−1 mol−1), and χCO2 the CO2 mole fraction in moist air. With the CO2 number density cCO2=nCO2/V (mol m−3), combining Eqs. (1) and (2) yields

$$\chi_{\mathrm{CO_2}}\, \frac{p\,T_0}{p_0\,T} = \frac{V}{n_{P_0,T_0}\, \epsilon_\lambda\, d} \left(\log I_0 - \log I_1\right). \tag{3}$$

The volume V, the path length d and the molar attenuation coefficient ϵλ are unknown constants. Also the emitted light I0 cannot directly be observed. It is expected to slightly change over time. In order to compensate for temperature effects (e.g. through effects on the optical filter or the detector), pressure effects (e.g. through pressure-dependent spectral line broadening), and changes in the intensity of the emitted light (I0) or in the geometry of the light beam, Eq. (3) is expanded by additional terms as follows:

$$\chi_{\mathrm{CO_2}}\, \frac{p\,T_0}{p_0\,T} = k_0 + k_1 \log I_1 + \sum_{i=1}^{3} u_i T^i + \sum_{i=1}^{3} v_i \frac{T^i}{I_1} + w_1 \frac{p - p_0}{p_0} + f(t). \tag{4}$$

These terms are empirically chosen with the objective of keeping the model simple. The coefficients ki, ui, vi and wi are unknown and have to be determined by calibration. Temperature effects are described by a polynomial of up to the third order and pressure effects by a linear model. The terms associated with the parameters vi are based on the transformation log(I1 + e) = log(I1(1 + e/I1)) = log(I1) + log(1 + e/I1) ≈ log(I1) + e/I1, where e is a small impacting effect.

The function f(t) accounts for possible temporal changes in light intensity I0 or changes in optical path length. For practical reasons, it was modelled as a step function with a temporal resolution of approximately 14 d during calibration. The variable T is the temperature provided by the LP8 sensor. Usually, atmospheric transport models use χCO2,dry as input. The CO2 dry-air mole fraction χCO2,dry can be computed as χCO2/(1 − χH2O), with χH2O being the mole fraction of water vapour in air. This quantity is computed from T, RH (SHT21 sensor) and p. The formula used is given in the Supplement.
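The conversion to the dry-air mole fraction can be sketched as follows. The water vapour formula actually used is given in the Supplement and is not reproduced here; the Magnus approximation for the saturation vapour pressure is an assumption of this sketch, as are all function names.

```python
import math


def saturation_vapour_pressure_hpa(t_celsius):
    """Magnus approximation over liquid water (assumed, not the Supplement's formula)."""
    return 6.112 * math.exp(17.62 * t_celsius / (243.12 + t_celsius))


def water_mole_fraction(t_celsius, rh_percent, p_hpa):
    """Mole fraction of water vapour from temperature, RH and pressure."""
    vapour_pressure = rh_percent / 100.0 * saturation_vapour_pressure_hpa(t_celsius)
    return vapour_pressure / p_hpa


def dry_air_mole_fraction(chi_co2_moist, t_celsius, rh_percent, p_hpa):
    """chi_CO2,dry = chi_CO2 / (1 - chi_H2O)."""
    chi_h2o = water_mole_fraction(t_celsius, rh_percent, p_hpa)
    return chi_co2_moist / (1.0 - chi_h2o)
```

At 20 °C, 50 % RH and 960 hPa, water vapour makes up roughly 1.2 % of the air, so a moist-air value of 400 ppm corresponds to a dry-air value of about 405 ppm.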

For each sensor, the coefficients of Eq. (4) are determined during initial calibration. The final calibrated model describes the CO2 mole fraction based on I1, T and p accounting for the ideal gas law and additional optical and thermal effects of the sensor. Some of the terms compensating for optical and thermal effects include I1. If I1 changes strongly, the respective compensations are not adequate anymore. Therefore, an additional simplified model is defined with only one term depending on I1. This model has a reduced capability to account for different environmental conditions, but it is more robust against large changes in I1.

$$\chi_{\mathrm{CO_2}}\, \frac{p\,T_0}{p_0\,T} = k_0 + k_1 \log I_1 + \sum_{i=1}^{3} u_i T^i + w_1 \frac{p - p_0}{p_0} + f(t) \tag{5}$$

For each sensor and calibration, the coefficients of Eqs. (4) and (5) are determined.
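Once the coefficients are known, Eqs. (4) and (5) can be solved for χCO2 by multiplying the right-hand side by p0T/(pT0). The sketch below illustrates this under stated assumptions: the natural logarithm is used (a different base only rescales k1), the same temperature in kelvin is used in the ideal-gas factor and in the polynomial (this section does not state the unit used in the polynomial), and all function and argument names are hypothetical.

```python
import math

P0_HPA = 1013.25  # standard pressure p0
T0_K = 273.15     # standard temperature T0


def model_rhs(i1, t, p_hpa, k, u, v, w1, f_t=0.0):
    """Right-hand side of Eq. (4); pass v=(0, 0, 0) to obtain Eq. (5)."""
    rhs = k[0] + k[1] * math.log(i1)
    for i in range(1, 4):
        rhs += u[i - 1] * t**i          # temperature polynomial
        rhs += v[i - 1] * t**i / i1     # temperature terms scaled by 1/I1
    rhs += w1 * (p_hpa - P0_HPA) / P0_HPA
    return rhs + f_t


def co2_moist(i1, t_kelvin, p_hpa, k, u, v, w1, f_t=0.0):
    """Solve the model equation for the CO2 mole fraction in moist air."""
    rhs = model_rhs(i1, t_kelvin, p_hpa, k, u, v, w1, f_t)
    return rhs * (P0_HPA * t_kelvin) / (p_hpa * T0_K)
```

During deployment, f_t would be held at the last value of the calibration step function, as described in Sect. 3.2.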

The presented LP8 sensor model corresponds to level-2B in the terminology presented by Schneider et al. (2019). This means that, related to the sensor unit, internal and external information is employed but is limited to parameters that are appropriate for artefact correction and directly related to the measurement principle.

Equations (4) and (5) do not include a term that is dependent on humidity although, from the theory of spectroscopy, a certain impact of humidity on the detected light is likely. We did not find a parametrization with respect to RH that leads to clear improvements in CO2 accuracy compared to reference measurements. The water molar fraction might be more relevant than RH, but as temporal variation in this quantity is much smaller than in RH, we expect that a part of it is absorbed by f(t). Therefore, we did not further investigate this option.

2.2.3 Data transmission over the LPN

The measurements of the sensor units are transmitted every 10 min over Swisscom's Low Power Network (LPN) to a central database hosted by Decentlab GmbH. Swisscom's LPN is based on the LoRaWAN protocol (LoRa-Alliance, 2019), using chirp spread spectrum modulation in the frequency band between 863 and 870 MHz and operating as a commercial service. LoRaWAN is a wireless network protocol focusing on asymmetrically organized, energy-efficient data transmission. Data can be transmitted as far as several tens of kilometres in rural areas.

In our case, the sensor units transmit every 10 min, while the LP8 and SHT21 sensors sample every 1 min. Every transmitted message contains 33 bytes (14 numbers). The energy consumption of data transmission over the LPN depends on the spreading factor (SF). Most sensor units in the Carbosense network operate on SF7. In this case, a sensor unit can independently operate for 5.1 years before the two batteries (alkaline, 1.5 V, IEC LR14) need to be replaced. Here, radio transmission requires 22 % of the total energy used by the sensor unit.

2.3 Sensor calibration infrastructure

2.3.1 Climate and pressure chambers

Calibration data for the determination of the temperature and pressure dependencies of the LP8 sensors were obtained by placing the sensors in climate and pressure chambers. One climate chamber and one pressure chamber at Empa and one pressure chamber at METAS (Federal Institute for Metrology) were used for this task.

In the climate chamber at Empa, the sensors were exposed to at least four 24 h long temperature profiles uniformly decreasing from 50 to −5 °C at CO2 levels of 350, 450, 700 and 1000 ppm. In the pressure chamber at METAS, pressure levels were varied between 780 and 1050 hPa at CO2 levels of 420 and 900 ppm and at a temperature of 24 °C. In the pressure chamber at Empa, pressure levels were varied between 800 hPa and ambient pressure (approximately 960 hPa) at CO2 levels between 350 and 1000 ppm. The three chambers were not completely airtight and thus required a continuous supply of air with a specific CO2 molar fraction, ventilation to ensure a uniform mixture of air within the chambers and a pump for the pressure chamber. Picarro G1301 and G2401 instruments were connected to the chambers to provide CO2 reference values. Pressure was recorded by calibrated instruments (outside the climate chamber, inside the pressure chambers).

2.3.2 High-precision CO2 measurement sites

High-precision CO2 field measurements are performed at several locations in Switzerland. Those used in this project for sensor calibration and assessment of the sensors' long-term performance as well as for correcting the sensor drifts (see Sect. 3.5) are listed in Table 1 and shown in Fig. 1. The CO2 measurement facilities at sites BRM, GIMM and LAEG were initiated within the CarboCount project (Oney et al., 2015; Berhanu et al., 2016). Sites HAE, PAY and RIG belong to the Swiss National Air Pollution Monitoring Network, NABEL (Empa, 2018). The CO2 measurement infrastructure at DUE was specifically set up within the Carbosense project to provide an accurate reference for LP8 sensors during ambient calibration. The CO2 measurement instruments were calibrated using working standards with traceability to the WMO X2007 calibration scale (Zhao and Tans, 2006; Tans et al., 2017).

Table 1. High-precision CO2 measurements available for this study. The locations of the sites are shown in Fig. 1. H denotes the altitude of the instrument; HI denotes the height above ground level of the inlet of the tube that connects to the high-precision instrument; HS denotes the height above ground level of the LP8 sensors deployed at this site.


2.4 Data storage infrastructure

The raw data from the sensor units, after being transmitted via the LPN to a Swisscom server, are forwarded via the Internet to Decentlab where they are stored in an Influx database (InfluxDB, 2019), providing near-real-time access to the data. Decentlab provides web-based dashboards for data visualization as well as APIs for data access in various scripting languages. Information about the sensor network such as deployment history, calibration runs, calibration parameters, observations from reference instruments and processed sensor measurements is stored in a MySQL database hosted by Empa.

3 Data processing

3.1 Important issues for LP8 long-term measurements

The deployment of a large number of LP8 sensors in this study revealed two issues that are important for ambient long-term measurements with this sensor type. First, the response characteristics of the LP8 infrared measurement can change over time, both steadily and abruptly. Sudden changes in the sensor response might be due to mechanical stress of the plastic housing under continuously varying environmental conditions. Second, the infrared measurements are susceptible to humidity exceeding a value of about 85 %. This behaviour is common to all LP8 sensors, but actual thresholds differ among individual sensors. Therefore, additional processing steps subsequent to the application of the calibration function are required to achieve a data set of sufficiently high accuracy and completeness (Sect. 3.3, 3.4, and 3.5).

Several analyses that are presented in the following sections refer to the term deployment. We define deployment as the time period within which a specific sensor unit is placed at a particular outdoor location. A sensor unit can be used in several consecutive deployments.

3.2 LP8 sensor calibration and application of the sensor model

Each LP8 sensor was individually calibrated. For this purpose, each sensor unit was placed in the climate and pressure chambers for at least one complete calibration. Furthermore, each unit was operated under ambient conditions at site DUE until it was shipped for deployment in the Carbosense network. The sensors were run at DUE under ambient conditions in parallel with a Picarro instrument for a time period of between several weeks and several months.

Thus, an extensive data set of both chamber and ambient measurements was collected for each sensor unit to determine the calibration parameters of Eqs. (4) and (5). Filters that exclude conditions near condensation and large changes in IR measurements or in ambient CO2 were applied to this data set for optimal parameter estimation. The data filtering during calibration is more rigorous than the outlier detection applied to the sensors deployed in the Carbosense network (see Sect. 3.4). A robust estimator (Huber loss function) was used for the parameter estimation to minimize the impact of large residuals (e.g. persons breathing near the sensors). The parameters of the LP8 sensor models are stored in the MySQL database. The sensor unit has to pass a new calibration cycle whenever the LP8 sensor is exchanged.
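To illustrate why a Huber loss limits the influence of large residuals, the following minimal sketch fits a straight line by iteratively reweighted least squares with Huber weights. It is not the estimation code used for Eqs. (4) and (5), which involve several predictors; the single-predictor model, the function name and the threshold delta are assumptions chosen for brevity.

```python
def huber_line_fit(x, y, delta=1.0, iterations=50):
    """Fit y = a + b*x by iteratively reweighted least squares with Huber weights."""
    weights = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iterations):
        sw = sum(weights)
        mx = sum(w * xi for w, xi in zip(weights, x)) / sw
        my = sum(w * yi for w, yi in zip(weights, y)) / sw
        sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
        sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
        b = sxy / sxx
        a = my - b * mx
        # Huber weights: 1 within the threshold, delta/|r| beyond it,
        # so a large residual contributes only linearly to the loss.
        weights = []
        for xi, yi in zip(x, y):
            r = abs(yi - a - b * xi)
            weights.append(1.0 if r <= delta else delta / r)
    return a, b
```

With ten points on the line y = 2 + 3x and one of them shifted by 100, ordinary least squares would move the intercept by several units, whereas the Huber fit stays close to the true line.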

Measurements from LP8 sensors deployed within the Carbosense network are processed by using Eqs. (4) and (5) with the corresponding coefficients determined during the calibration phase, yielding a first-guess CO2 molar fraction CO2,CAL. Thereby, there are two parallel processing chains, but besides the computation of CO2,CAL, the further processing is performed equally (outlier detection, drift correction). The function f(t) in Eqs. (4) and (5) is replaced by a constant that equals the last value of this step function during calibration.

The first-guess CO2,CAL is subsequently corrected for sensor drifts as described in Sect. 3.5. Equations (4) and (5) require the pressure at the sensor location. This value is derived from 10 min pressure measurements from the meteorological measurement network SwissMetNet operated by MeteoSwiss (Supplement Fig. S2). A procedure was set up that estimates the vertical pressure gradient in Switzerland and horizontally interpolates the pressure reduced to sea level every 10 min. These values allow for the computation of the pressure for any location and height above ground level with an uncertainty of about 1 hPa. The accuracy of the pressure interpolation has been determined from a comparison to measurements at SwissMetNet sites performing a leave-one-out cross validation.

Results and flags of subsequent processing steps are stored in the MySQL database to guarantee full traceability and to support the comparison of different processing options.

3.3 Flagging for high relative humidity

A relative humidity threshold RHtrsh was determined for every LP8 sensor based on the measurements from the ambient calibration performed at DUE. The purpose is to review the operation limits specified by the manufacturer and to develop a method for flagging the sensor measurements that may be impacted by humidity.

First, the standard deviation of the CO2 residuals (difference between computed CO2 values of the sensors and CO2,moist measured by the Picarro) is computed in 2 % RH intervals in a range of relatively dry conditions between 40 % RH and 70 % RH (resulting in 15 values in total), and the median of these values, denoted as σres, is determined. Second, the 95 % quantile of the residuals is computed in 2 % intervals from 0 % RH to 100 % RH. RHtrsh is then selected as the maximum interval for which the 95 % quantile is smaller than 3⋅σres. The CO2 residuals and the computed RHtrsh values are shown as examples for two sensors in Fig. 3a and b. The operation limits for humidity indicated by the manufacturer (0 % RH–85 % RH) concur with our results (Fig. 3c). All the RHtrsh values are stored in the database.
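The two steps above can be sketched as follows. This is one possible reading rather than the project's code: the threshold is taken as the upper edge of the highest 2 % RH bin that satisfies the criterion, quantiles use linear interpolation, and the function names and the min_count guard are assumptions.

```python
import statistics


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def rh_threshold(rh, residuals, bin_width=2, min_count=2):
    """Return RH_trsh in % RH, or None if no bin meets the criterion."""
    bins = {}
    for h, r in zip(rh, residuals):
        start = min(int(h // bin_width) * bin_width, 100 - bin_width)
        bins.setdefault(start, []).append(r)
    # Step 1: median of the per-bin standard deviations in dry conditions.
    dry_sd = [statistics.stdev(v) for b, v in bins.items()
              if 40 <= b < 70 and len(v) >= min_count]
    sigma_res = statistics.median(dry_sd)
    # Step 2: bins whose 95 % quantile stays below 3 * sigma_res.
    ok = [b + bin_width for b, v in bins.items()
          if quantile(v, 0.95) < 3 * sigma_res]
    return max(ok) if ok else None
```

Measurements of a deployed sensor would then be flagged wherever RH exceeds the returned value.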

Flagging the measurements of the deployed sensors by applying the criterion RH > RHtrsh results in a data set with very few outliers but, concurrently, a significant number of measurements are falsely rejected. In Sect. 3.4 a more adaptive outlier detection algorithm is presented that does not rely on any reference measurements. The choice of the filtering approach depends on the intended use of the measurements and whether the number of undetected outliers or the number of falsely flagged outliers is more important.

Figure 3. (a) and (b) CO2 residuals (sensor minus reference; based on Eq. 4) versus relative humidity during calibration at ambient conditions at the DUE site for sensor units 1062 and 1071. The vertical dashed line indicates RHtrsh; the other three lines depict the 5 %, 50 % and 95 % quantiles of the residuals in 2 % RH intervals. (c) Overlaid histograms of RHtrsh for all the sensors (ALL), from the Carbosense-network-deployed sensors (DEPL) and from sensors deployed at reference sites (REF). The indicated quantiles refer to the set of deployed sensors.


3.4 Outlier detection

We call an LP8 measurement an outlier when it cannot be related to the ambient CO2 molar fraction by means of the sensor models described by Eqs. (4) and (5). Outliers are primarily caused by relative humidity exceeding about 85 % (see Fig. 3). Under these conditions the light absorption within the measurement cell can be increased due to the presence of water droplets or condensation of water on the mirrors. Such conditions may last for periods of between a few minutes and more than a day. The difficulty in detecting such events is that the signals in the LP8 IR and SHT21 RH time series do not follow a characteristic profile but exhibit significant variation depending on the actual progression of the meteorological conditions. The distinction between small humidity effects and a true increase in CO2 is not a simple task as the sensor measurements do not fully describe the conditions in the measurement cell. In addition, temporary enhancements from closely located emission sources can unusually impact the CO2 measurements and should not be treated as outliers.

The outlier detection algorithm was designed to rely entirely on the measurements from the sensor unit itself and to require no auxiliary information such as measurements from a reference instrument. It analyses and processes quantities derived from sensor observations that, under normal conditions, vary only slowly. The algorithm learns the sensor's usual behaviour at its current location from data obtained during the particular deployment and flags unusual measurements. Prerequisites for the algorithm are that environmental conditions and their changes remain within certain limits and that stable relations exist between specific sensor quantities and environmental conditions. Learning sensor behaviour in the field is an important element for minimizing the required calibration time.

Thus, the LP8 outlier detection algorithm is primarily based on the differences in consecutive log(IR) and temperature values plus statistical measures that are derived from a large number of IR measurements. The algorithm also reviews the relative humidity to enhance the robustness of the algorithm. The absolute values of IR and the corresponding values of CO2,CAL were not directly used as both are not stable over time due to drift or jumps and as they depend on CO2, temperature and pressure, which are variable over time.

First, the outlier detection algorithm requires the computation of several auxiliary quantities. In the following, IR, T and RH denote the infrared measurement, the LP8 temperature and the SHT21 relative humidity. The subscripts M and L refer to the mean and the last single measurement in a 10 min interval. Δt indicates the time between subsequent measurements transmitted to the database (subsequent measurements are only considered if the difference does not exceed 20 min).

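  1. Difference in log (IR). $\Delta_{\mathrm{IR},M}(t) = \log(\mathrm{IR}_M(t)) - \log(\mathrm{IR}_M(t-\Delta t))$

  2. Difference in T. $\Delta_{T,M}(t) = T(t) - T(t-\Delta t)$

  3. Mean RH of two measurements. $M_{\mathrm{RH}}(t) = \tfrac{1}{2}\left(\mathrm{RH}(t) + \mathrm{RH}(t-\Delta t)\right)$

  4. Difference between single measurement and mean. $\gamma(t) = \log(\mathrm{IR}_L(t)) - \log(\mathrm{IR}_M(t))$

  5. Variance of log (IRM(t)). $\sigma_M^2(t) = \frac{1}{10}\,\frac{1}{n} \sum_{\tau \in [t-2\,\mathrm{h},\; t-10\,\mathrm{min}]} \gamma(\tau)^2$

  6. Noise in log (IRL). $\mathrm{IR}_{\mathrm{noise}} = \mathrm{MAD}\left\{\log \mathrm{IR}_L(\tau) - \log \mathrm{IR}_M(\tau) \mid \tau \in T_{\mathrm{deployment}}\right\}$

  7. Median absolute deviation of the difference in consecutive log (IRM). $\Delta\mathrm{IR}_{\mathrm{large}} = \mathrm{MAD}\left\{\log \mathrm{IR}_M(\tau) - \log \mathrm{IR}_M(\tau-\Delta t) \mid \tau \in T_{\mathrm{deployment}}\right\}$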

In item 5, n is the number of used γ(t) values and 10 is the number of single sensor measurements within a 10 min interval.
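A minimal sketch of how these auxiliary quantities could be computed from the transmitted 10 min records is given below. Timestamps are assumed to be in minutes, the MAD is taken about the median without a scaling factor, and all names are hypothetical; none of this is taken from the Carbosense code.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation about the median (unscaled; an assumption)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def auxiliary_quantities(t_min, ir_mean, ir_last, temp, rh, max_gap_min=20):
    """Items 1-4 for every pair of subsequent records."""
    out = []
    for j in range(1, len(t_min)):
        if t_min[j] - t_min[j - 1] > max_gap_min:
            continue  # gap too long: not treated as subsequent measurements
        out.append({
            "t": t_min[j],
            "d_ir": math.log(ir_mean[j]) - math.log(ir_mean[j - 1]),  # item 1
            "d_t": temp[j] - temp[j - 1],                             # item 2
            "m_rh": 0.5 * (rh[j] + rh[j - 1]),                        # item 3
            "gamma": math.log(ir_last[j]) - math.log(ir_mean[j]),     # item 4
        })
    return out


def sigma_m_squared(t_now, t_min, gamma, n_single=10):
    """Item 5: variance of log(IR_M) from gamma in [t - 2 h, t - 10 min]."""
    window = [g for t, g in zip(t_min, gamma) if t_now - 120 <= t <= t_now - 10]
    if not window:
        return None
    return sum(g * g for g in window) / (n_single * len(window))
```

Items 6 and 7 then follow by applying mad to all gamma values and to all d_ir values of a deployment, respectively.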

Second, based on the samples in relatively dry conditions (RH < 80 %), a quadratic function ΔIR,M=f(ΔT,M) is robustly determined which describes the normal change in log (IR) with a change in temperature. The corresponding residuals r for all samples are computed, and, again based only on the dry samples, the median absolute deviation (MAD) is calculated (Fig. 4).

A measurement IR(ti) is flagged when |r(ti)| > 3 MAD ∧ |ΔIR,M(ti)| > 3 σM(ti) ∧ RH(ti) > 70 % (value set according to Fig. 3c). The positive flagged residuals are denoted as rflag,pos and the negative flagged residuals as rflag,neg. Starting from ti, consecutive (for all the rflag,pos(ti)) or preceding (for all the rflag,neg(ti)) measurements are also flagged until RH drops below MRH(ti). In general, high relative humidity leads to decreased IR and, concurrently, increased CO2 values for the LP8 sensor. Consequently, the sign of r(ti) determines the direction of backward or forward flagging of temporally adjacent measurements.
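The flagging rule and its propagation to adjacent measurements might be sketched as follows. The three conditions are read as a conjunction, and propagation runs forward for positive and backward for negative residuals until RH falls below MRH(ti); both readings, as well as the function and argument names, are assumptions of this sketch.

```python
def flag_humidity_outliers(r, d_ir, sigma_m, rh, m_rh, mad_dry, rh_min=70.0):
    """Return a list of booleans marking flagged measurements."""
    n = len(r)
    flagged = [False] * n
    for i in range(n):
        primary = (abs(r[i]) > 3 * mad_dry
                   and abs(d_ir[i]) > 3 * sigma_m[i]
                   and rh[i] > rh_min)
        if not primary:
            continue
        flagged[i] = True
        # Positive residual: flag following samples; negative: preceding ones.
        step = 1 if r[i] > 0 else -1
        j = i + step
        while 0 <= j < n and rh[j] >= m_rh[i]:
            flagged[j] = True
            j += step
    return flagged
```

Here r holds the residuals of the quadratic fit, mad_dry their MAD in dry conditions, and sigma_m the square root of item 5 for each time step.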

In addition, two more quantities are determined based on rflag,pos: RHQ75 is the 75 % quantile of the RH values, and TDPQ25 is the 25 % quantile of the difference between T and the dew point (Td).

Measurements are also flagged if (i) |γ(ti)| > 5 IRnoise and RH(ti) > 85 %, (ii) |ΔIR,M(t)| > 5 ΔIRlarge, (iii) RH > RHQ75 or (iv) T − Td < TDPQ25. Under (i), the first criterion is already fulfilled if two of seven |γ(ti)| adjacent to ti indicate increased noise.

Figure 4. (a) Differences in consecutive log (IR) values versus differences in LP8 temperatures. Positive outliers are coloured in red; negative outliers are coloured in green. The orange line depicts the quadratic fit of Δlog(IR) ∼ ΔLP8_T. (b) Histogram of the residuals of Δlog(IR) with relation to the fitted curve. The vertical red lines depict ±3 MAD. Positive outliers are in red; negative outliers are in green. Results from sensor unit 1010 deployed in Leibstadt are depicted.


3.5 Drift correction

IR measurements from LP8 sensors and the corresponding calibrated molar fractions CO2,CAL, corresponding to χCO2 in Eqs. (4) and (5), are not stable in time. For sensors deployed in the field, this drift has to be corrected in order to compute unbiased CO2 molar fractions. Since usually no reference measurement is available at the location of the LP8 sensor to determine the drift, a method was developed that makes use of specific weather conditions during which horizontal gradients in CO2 are small and links the measurements of the LP8 sensor to those of the closest accurate instrument. The criterion of small horizontal gradients and a well-mixed planetary boundary layer is best met during situations of high wind speeds.

The drift correction algorithm involves two consecutive steps. The first step identifies time periods Pslow, when the sensor behaviour evolves slowly and the drift can be corrected, and periods Pfast, when the behaviour changes abruptly. The second step determines the drift and corrects it.

For the first task, the identification of Pslow, the calibrated measurements CO2,CAL from the afternoon are analysed because CO2 molar fractions are most comparable from day to day in the afternoon when the planetary boundary layer is usually well mixed (Fig. S1).

The algorithm computes for each sensor and day td the following quantities from the calibrated measurements CO2,CAL (time in Switzerland refers to CET or CEST):

  • Qprev7d (td) – 20 % quantile of CO2,CAL(τ) where τ ∈ [td − 7 d … td − 1 d] and τ ∈ (13:00–17:00 UTC)

  • Qnext7d (td) – 20 % quantile of CO2,CAL(τ) where τ ∈ [td + 1 d … td + 7 d] and τ ∈ (13:00–17:00 UTC)

  • Qprev15d (td) – 20 % quantile of CO2,CAL(τ) where τ ∈ [td − 15 d … td − 1 d] and τ ∈ (13:00–17:00 UTC)

  • Qnext15d (td) – 20 % quantile of CO2,CAL(τ) where τ ∈ [td + 1 d … td + 15 d] and τ ∈ (13:00–17:00 UTC)

  • Q15d (td) – 20 % quantile of CO2,CAL(τ) where τ ∈ [td − 7 d … td + 7 d] and τ ∈ (13:00–17:00 UTC)

  • b15d (td) – slope of CO2,CAL(τ) where τ ∈ [td − 7 d … td + 7 d] and τ ∈ (13:00–17:00 UTC).

Further, an empirical threshold ΔQTRSH is computed as median (Qnext7d − Qprev7d) + 5 MAD (Qnext7d − Qprev7d). Sensor behaviour is considered unsteady (Pfast) if |Qnext7d(td) − Qprev7d(td)| > ΔQTRSH or if |b15d(td)| > 3 ppm d−1 and |Qnext15d(td) − Qprev15d(td)| > 40 ppm. Drift correction is applied separately to the measurements of each sensor deployment and continuous time period Pslow. An example of the working principle of the algorithm is shown in Fig. 5.
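A minimal sketch of the threshold and of the test for unsteady behaviour is given below. It assumes a linear-interpolation quantile, the unscaled MAD and a logical AND between the slope criterion and the 15 d criterion; the function names are hypothetical, and the quantiles and the slope are expected to be computed beforehand from the afternoon values.

```python
from statistics import median


def quantile(values, q):
    """Quantile with linear interpolation between order statistics (assumed)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def mad(values):
    """Unscaled median absolute deviation."""
    m = median(values)
    return median(abs(v - m) for v in values)


def delta_q_threshold(q_next7_minus_prev7):
    """Threshold: median + 5 MAD of the daily differences Qnext7d - Qprev7d."""
    return median(q_next7_minus_prev7) + 5 * mad(q_next7_minus_prev7)


def is_unsteady(q_prev7, q_next7, q_prev15, q_next15, slope15, threshold,
                slope_max=3.0, jump15_max=40.0):
    """True if day t_d belongs to a period P_fast."""
    jump7 = abs(q_next7 - q_prev7) > threshold
    drift15 = abs(slope15) > slope_max and abs(q_next15 - q_prev15) > jump15_max
    return jump7 or drift15
```

For one day, `quantile(afternoon_values, 0.2)` applied to the pooled 13:00–17:00 UTC values of the respective window yields the Q quantities listed above.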

Figure 5. (a) Time series of Qd, Qprev7d, Qnext7d and Q15d for sensor unit 1012 deployed at Hallau (HLL). The vertical red lines depict days when |Qnext7d − Qprev7d| is larger than the threshold. Shaded periods indicate time periods with increased |b15d|. For comparison, the Q15d values for the reference sites DUE, PAY, RIG and HAE are shown. (b) Time series of Qnext7d − Qprev7d and Qnext1d − Qprev1d for the same sensor. The horizontal lines depict the threshold ΔQTRSH.


Figure 6. Time series of ΔCO2 for all the sensors deployed in the Canton of Zurich (Fig. 1). The red dots depict the dates of the sensor adjustments.


The actual drift correction is based on wind measurements from MeteoSwiss sites (Fig. S2) and CO2 measurements from the high-precision instruments deployed in the network (both 10 min averages). The drift correction algorithm is applied to the calibrated measurements CO2,CAL from the sensors deployed in the Carbosense network.

First, all the MeteoSwiss sites within a distance of 40 km from a sensor are selected. Time periods are identified when all the selected sites report for at least 90 min

  • (i) wind speed > 2 m s−1 or

  • (ii) wind speed > 0.75 m s−1 and median (wind speed at selected sites) > 3 m s−1.

Time periods lasting longer than 4 h are split into shorter intervals with a duration of about 2 h each. Second, the most closely located CO2 reference is chosen (Fig. 1). Its data are checked for completeness (number of measurements n ≥ 6) and variability (SD ≤ 4 ppm) within each windy period. Similarly, the sensor data are checked for completeness (n ≥ 6 and SHT21 RH < RHtrsh) and variability (SD ≤ 15 ppm). Third, the CO2 offset ΔCO2(t) between the sensor's and the reference's median is computed for each windy period, and a continuous CO2 offset time series is derived by linear interpolation between these periods. Drift-corrected sensor measurements are derived by adding the linearly interpolated ΔCO2(t) to the measurements CO2,CAL (Fig. 6).
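The three steps can be sketched as follows. Several details are assumptions made for this illustration only: the offset is defined as reference median minus sensor median so that adding it corrects the sensor, the sample standard deviation is used, the sensor values passed in have already been restricted to RH < RHtrsh, and the offset is held constant before the first and after the last windy period. All function names are hypothetical.

```python
from bisect import bisect_right
from statistics import median, stdev


def is_windy(speeds):
    """Wind criterion for one time step; speeds (m/s) of all selected sites."""
    all_strong = all(s > 2.0 for s in speeds)
    moderate = all(s > 0.75 for s in speeds) and median(speeds) > 3.0
    return all_strong or moderate


def period_offset(sensor_vals, ref_vals, n_min=6,
                  sd_ref_max=4.0, sd_sensor_max=15.0):
    """CO2 offset (ppm) for one windy period, or None if a check fails."""
    if len(ref_vals) < n_min or len(sensor_vals) < n_min:
        return None                                   # completeness
    if stdev(ref_vals) > sd_ref_max or stdev(sensor_vals) > sd_sensor_max:
        return None                                   # variability
    return median(ref_vals) - median(sensor_vals)


def interpolate_offset(t, knots):
    """Linear interpolation between (time, offset) pairs sorted by time."""
    times = [k[0] for k in knots]
    if t <= times[0]:
        return knots[0][1]
    if t >= times[-1]:
        return knots[-1][1]
    i = bisect_right(times, t)
    (t0, o0), (t1, o1) = knots[i - 1], knots[i]
    return o0 + (o1 - o0) * (t - t0) / (t1 - t0)


def drift_correct(times, co2_cal, knots):
    """Add the interpolated offset to the calibrated measurements."""
    return [c + interpolate_offset(t, knots) for t, c in zip(times, co2_cal)]
```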

For the data set presented in this study, we use only the high-precision measurements from the sites DUE, PAY and GIMM for the adjustment of the LP8 sensors. This procedure allows for quantifying the accuracy of the concept by means of the remaining reference sites. In fact, measurements from GIMM are only used to adjust LP8 sensors deployed at PAY. Thereby, co-located sensor and reference measurements are independent in this data set (see Sect. 4.2). Obviously, three reference sites are not sufficient to accurately adjust all the LP8 sensors deployed in Switzerland as weather conditions often differ from region to region. Drift correction for a final and optimized LP8 data set will rely on measurements from all the reference sites and also from the HPP instruments (Fig. 1).

The assumption of spatially homogeneous CO2 mole fractions during strong wind events was tested by treating measurements from reference instruments in the same way as those from the LP8 sensors. Whenever an LP8 sensor was corrected at sites BRM, LAEG, HAE and RIG relative to DUE or at site PAY relative to GIMM, the 10 min CO2 molar fractions measured by the Picarro instruments at the two sites were compared. RH and the measurement completeness of the LP8 sensor are not considered in this comparison. Figure 7 shows CO2 differences in measurements from sites LAEG (distance d=19 km; height difference Δh=423 m), RIG (d=39 km; Δh=598 m), BRM (d=41 km; Δh=366 m) and HAE (d=61 km; Δh=-2 m) with respect to DUE as well as CO2 differences at PAY (d=35 km; Δh=57 m) with respect to GIMM. All these sites are located in or adjacent to the Swiss Plateau (Figs. 1 and 7f) and therefore have mostly similar weather conditions. The CO2 differences are depicted in two histograms placed on top of each other. The histograms in light grey show all 10 min CO2 differences, while the histograms in dark grey only present those differences during windy conditions. The concept works well for background sites (LAEG, RIG, BRM) but has limitations for sites that are locally impacted by emissions (HAE is located next to a motorway). For all the site pairs, the differences in the CO2 measurements show a small bias (−2.1 to 0.8 ppm) and a scatter component (2.2–2.8 ppm at background sites, 6.0 ppm at the traffic site HAE). The root mean square error (RMSE) of the differences amounts to 2.3 to 3.6 ppm (background sites) and 6.2 ppm (traffic site). The situation for HAE can be improved if the effect of local emissions is reduced and only measurements between 22:00 and 04:00 UTC (LT = UTC+01:00 or 02:00) and/or wind directions upward of the motorway are selected (Fig. S5). Obviously, that coincides with a reduction in the number of adjustment periods.
For three sites in the city of Zurich that are located next to a busy road, sensor corrections are performed only during windy conditions at night-time in order to reduce the effect of local traffic emissions.

Figure 7. CO2 differences in measurements at (a) LAEG, (b) RIG, (c) BRM and (d) HAE with respect to DUE, and CO2 differences in measurements at (e) PAY with respect to GIMM. DIST denotes the distance between the two sites (km); H1 and H2 denote the altitudes of the two sites (m). Q005 WP, Q050 WP, Q095 WP denote the 5 %, 50 % and 95 % quantiles of the CO2 differences in windy conditions, respectively. MAD WP and SD WP denote the median absolute deviation and standard deviation of the CO2 differences in windy conditions. RMSE WP denotes the RMSE of the CO2 molar fraction of the two sites in windy conditions. (f) Map of the locations of the reference sites, their 40 km perimeters and the names of geographic regions. SP: Swiss Plateau; ZH: Canton of Zurich. Geographic data used for creating the base map originate from (last access: 29 October 2019) and (last access: 29 October 2019).

3.6 Consistency check

There are instances when the LP8 sensor drift cannot be corrected as frequently as required. This can be caused by an extended meteorological situation with low wind speeds or by sensor-related issues (e.g. unstable behaviour, simultaneous wind and high relative humidity). Consequently, the difference between the computed and the true CO2 molar fraction may increase over time. In addition, the outlier detection algorithm can be less effective during prolonged time periods with no dry conditions.

In order to identify such periods of suspicious or less accurate data, the measurements of individual sensors were checked for consistency with the more accurate measurements from HPP and Picarro instruments in a similar geographic setting. Although the true CO2 mole fractions at a given site are unknown, CO2 time series of sites within a particular region are expected to exhibit similarities, e.g. similar daily CO2 minima in the afternoon when the boundary layer is usually well mixed.

For this purpose, all the locations of the Carbosense network were divided into three groups based on their region and the surrounding topography.

All the sites in the Canton of Ticino (Fig. 1) are part of group 1 as only two HPPs operate in this region. The sites in the other regions of Switzerland are divided into two groups depending on whether they are located on a hilltop (group 3) or not (group 2). A hilltop location is defined by the following criteria: (i) the difference in altitude, i.e. the topography in a 2.5 km perimeter including the actual altitude of the mounted sensor, is larger than 300 m and (ii) more than 90 % of the topography in a 2.5 km perimeter lies at a lower altitude than the sensor location. The second criterion is omitted if the difference in altitude exceeds 400 m.

The CO2 molar fractions of the reference instruments and the HPPs are analysed group by group. The 10 % quantile of the preceding 24 h is computed for each instrument and HPP every sixth hour (CO2,Q10%). Afterwards, a band is derived (CO2,limits = median (CO2,Q10%) ± 2.0 · range (CO2,Q10%)) that indicates plausible daily minimum CO2 molar fractions. The preceding 24 h of measurements from a sensor are flagged in case the sensor's daily CO2 minimum is outside the computed band.
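The band computation is simple enough to be written out directly. In this sketch the range is interpreted as maximum minus minimum of the 10 % quantiles within a group, which is an assumption, and the function names are hypothetical.

```python
from statistics import median


def co2_limits(ref_q10, factor=2.0):
    """Plausibility band from the 10 % quantiles of all references in a group."""
    centre = median(ref_q10)
    spread = max(ref_q10) - min(ref_q10)      # assumed meaning of "range"
    return centre - factor * spread, centre + factor * spread


def flag_preceding_24h(sensor_daily_min, ref_q10, factor=2.0):
    """True if the sensor's daily CO2 minimum lies outside the band."""
    lower, upper = co2_limits(ref_q10, factor)
    return not (lower <= sensor_daily_min <= upper)
```

With reference quantiles of 400, 405 and 410 ppm, for example, the band extends from 385 to 425 ppm, so a sensor minimum of 430 ppm would lead to flagging of the preceding 24 h.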

Figure 8. (a) Calibrated sensor measurements (Eq. 4) versus measurements from Picarro instruments exemplarily shown for sensor 1060. The data set contains measurements from the climate and pressure chambers and ambient measurements. The band between the dashed red lines denotes a range of ±20 ppm. (b) Same as in (a) but for Eq. (5). (c) Same as in (a) but for factory-calibrated sensor measurements. Measurements outside the sensor specifications are depicted in grey and included for the RMSE and correlation. (d) CO2–LP8 T plot and (e) CO2–P plot depicting the environmental conditions covered during calibration. Same data set as in (a). (f) LP8 IR measurements versus CO2 from the Picarro instruments. Same data set as in (a).


4 Results

4.1 Sensor calibration

The employed sensor model, which is based on the Beer–Lambert law and extended by an empirical parametrization, can relate the sensor IR measurements and the ambient CO2 molar fraction in all relevant CO2, temperature and pressure conditions (Fig. 8a, b, d, e). The sensor's factory calibration is intended for using the sensor in a narrower temperature range, like that encountered indoors, and does not include pressure information. Under outdoor conditions, measurements based on the factory calibration are less accurate than those obtained with an extended model such as those described by Eqs. (4) and (5) (Fig. 8c). The 1 / IR terms in Eq. (4) require that the sensor's IR measurement does not heavily drift or jump because in this case the error cannot be compensated for by a simple offset. The data quality of sensors whose IR values significantly jumped during deployment (∼300 ppm in CO2) is usually better when the simplified sensor model (Eq. 5) without 1 / IR terms is applied (see Fig. 12). Equation (5) provides a less optimal fit under particular operating conditions (e.g. for high CO2 molar fractions; compare with Fig. 8a and b). However, this is of minor importance for most locations. The CO2 molar fraction at locations which are not impacted by nearby emissions is usually within 380–550 ppm. Temperature effects cause by far the largest deviation of the sensor response from the ideal gas law (Fig. 8c, d, f). Pressure effects are of a much smaller magnitude (∼0.1 ppm hPa−1).

Figure 9. RMSE values of sensor calibration (a) using Eq. (4) and (b) using Eq. (5). Three histograms are overlaid: all calibrated sensors, sensors deployed in the Carbosense network (DEPL) and sensors at locations with a reference instrument (REF). The indicated quantiles refer to the set of deployed sensors.


As shown in Fig. 9, the RMSE of the LP8 CO2 measurements with respect to the Picarro during chamber and ambient calibration is between 6.8 and 12.5 ppm when applying Eq. (4) and between 8.0 and 13.9 ppm when applying Eq. (5) for the deployed sensors. Data filtering during calibration is chosen to be very selective in order to optimally determine the sensor model parameters.

4.2 Drift correction and outlier detection

The performances of the outlier detection and drift correction algorithms are presented together as both processing steps have to be applied to obtain accurate CO2 measurements. The results shown in this section refer to sensor measurements in the period 1 July 2017 to 1 September 2019.

Several sensor units are operated at sites equipped with a CO2 reference instrument (HAE – five sensor units; PAY – five sensor units; RIG – five sensor units; LAEG – two sensor units; BRM – one sensor unit) in order to test different calibration and processing options. Drift correction for the sensors at PAY relies on the CO2 measurements from GIMM and for the sensors at RIG, HAE, LAEG and BRM on the CO2 measurements from DUE (Fig. 7). Thus, the sensor and reference instrument measurements are independent at these sites.

A slightly modified data processing scheme was applied to the data from the 141 sensor units that were operated at DUE beyond 1 December 2017. This additional data treatment provides the opportunity to assess the data quality for a larger set of sensors. The calibration data set for these sensors contains all data before 1 December 2017 and is applied to the measurements thereafter. The sensor data are processed as described in Sect. 3, but drift is corrected by referring to measurements from sites LAEG and BRM instead of DUE (site LAEG, being located closer to DUE, is used when both instruments provide data). The accuracy of the CO2 molar fraction from these sensors located at DUE can therefore be compared to that from sensors deployed in the Carbosense network. Among the sensors at DUE there are also those with a performance that is not sufficient for deployment, and therefore they are held back at DUE.

The comparison of the median difference between CO2 measurements from the sensors and from the reference instruments reveals that sensor drift can be adjusted over the long term when the sensor measurements can regularly be referred to CO2 predictions (Fig. 10). The frequency of the required adjustments depends on the individual sensor as the change in sensor behaviour and the corresponding drift do not evolve constantly.

Figure 10. Weekly median deviation of sensors operated at HAE, PAY, RIG, LAEG, BRM and DUE before (particular colours correspond to individual sensors as indicated in the legend) and after (grey) drift correction. Note the different scales on the y axis.


Figure 11. (a) Weekly RMSE values for all the sensors deployed at HAE, PAY, RIG, LAEG, BRM and DUE. For each site four versions are presented for the drift-adjusted measurements: (i) no filtering applied, (ii) outlier detection based on sensor-specific RHtrsh value, (iii) outlier detection algorithm and (iv) outlier detection algorithm plus consistency check. When two bars have the same colour, the left bar refers to Eq. (4) and the right bar to Eq. (5). (b) Same as in (a) for the weekly correlation. (c) Same as in (a) for the weekly fraction of used data. Here, the fraction refers to the number of measurements transmitted to the database.


By means of the sensors which operate co-located with reference instruments the effect of different processing options can be assessed. This includes the employed sensor model (Eq. 4 or 5), the applied outlier detection (no outlier detection, outlier detection based on RHtrsh or the algorithm presented in Sect. 3.4) and the use of additional consistency checks. The sensor and reference measurements are compared for weekly periods by means of the root mean square error (RMSE) and the correlation (Fig. 11a and b). In addition, the fraction of valid measurements with respect to the total number of measurements in the database is indicated (Fig. 11c). It shows the effect of data filtering on the number of usable measurements. Scatter plots of the comparisons between the LP8 measurements and the measurements from the reference instruments at HAE, PAY, RIG, LAEG and BRM are shown in Figs. S6 to S9.
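The weekly comparison metrics can be computed as sketched below. The grouping by integer week index (day number divided by seven) and the function names are assumptions for illustration; the correlation is the Pearson coefficient.

```python
import math
from collections import defaultdict


def rmse(sensor, ref):
    """Root mean square error of paired sensor and reference values."""
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(sensor, ref)) / len(sensor))


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def weekly_stats(days, sensor, ref):
    """Return {week index: (RMSE, correlation)} for paired valid samples."""
    groups = defaultdict(lambda: ([], []))
    for d, s, r in zip(days, sensor, ref):
        groups[d // 7][0].append(s)
        groups[d // 7][1].append(r)
    return {w: (rmse(s, r), pearson(s, r)) for w, (s, r) in groups.items()}
```

The fraction of used data per week follows analogously as the number of valid samples divided by the number of samples transmitted to the database.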

The median of the weekly RMSE of the sensor measurements with respect to the reference measurements at BRM, HAE, LAEG, PAY and RIG amounts to 10 ppm (25 % and 75 % quantiles – 6.8 and 14.3 ppm, respectively). The accuracy of the measurements is not constant over time but has a dependency on the effectiveness of the outlier detection and drift correction algorithms and thereby also on the prevailing weather conditions. The two described outlier detection algorithms differ in terms of the resulting RMSE values. Rigorous data filtering using RHtrsh leads to the best RMSE values. The outlier detection algorithm performs slightly worse in terms of the RMSE. Overall, it classifies a slightly larger number of measurements as valid than the filtering using RHtrsh. Differences in performance between the sensor models described by Eqs. (4) and (5) are small for this set of sensors. The accuracy of the measurements can be further improved when they are validated against measurements from high-precision instruments operated in the Carbosense network. This is shown for the combination of the outlier detection algorithm and the consistency check. Correlation between sensor and reference is about 0.9 on average. At sites RIG, LAEG and BRM the correlation coefficients are smaller due to smaller CO2 variations encountered at these locations (Fig. S1).

The extended sensor model described in Eq. (4) is applicable for a wider range of environmental conditions (CO2, T, P) than the reduced version (Eq. 5). However, when the IR signal shows large changes (> |300| ppm expressed in molar fraction) as in the case of sensor unit (SU) 1314 deployed at HAE, the application of the simplified sensor model provides more accurate results (Fig. 12).

Figure 12. Comparison of sensor and reference CO2 measurements for SU 1314 deployed at HAE and SU 1100 and SU 1139 deployed at PAY. The sensor measurements depicted in panels (a–c) are based on the sensor model given by Eq. (4), and in panels (d–f) they are based on the sensor model given by Eq. (5). The sensor measurements are drift corrected, and the outlier detection algorithm was applied. Points in grey are outliers.


4.3 Differences between co-located sensors

Co-located sensor units are an additional option to assess the sensor performance. They reveal how similarly two sensors behave when they encounter comparable environmental conditions. There are 12 locations where two sensor units operate in parallel but where no reference instrument is available (Figs. 13 and S6). The horizontal distance between the sensor unit pairs does not exceed 45 m. There are no close emission sources for cases with distance ≠0. The indicated RMSE refers to the difference in simultaneous measurements. The sensor pairs exhibit fairly good correlations at most locations. For the sensor pairs operated at Hallau (HLL) and Birmensdorf (BSCR) there is better agreement when processing the measurements using the sensor model given by Eq. (5) instead of the model given by Eq. (4). The IR measurement of sensor unit 1012 changed significantly in January 2018; those of sensor unit 1120 changed significantly in March 2018. The difference between the processing models is small for the other sensor pairs.

Figure 13. Comparison of LP8 measurements (drift corrected, outlier detection algorithm) from co-located sensors (distance between sensors < 45 m). Points in grey are flagged as outliers. The header of the individual figures indicates the sensor pairs by the location name and the sensor unit ID as well as the sensor model.


Eight sensor units were deployed in the Carbosense network, and they were brought back to site DUE to review their performance due to sensor malfunctioning (e.g. LP8 sensor dropped out of the board) or suspicious CO2 measurements. For completeness, the comparison between the measurements from these sensors and from the Picarro instrument is shown in the Supplement.

Figure 14. (a) Analysis of measurement yield in the Carbosense network. Grey: Difference between expected and actual number of measurements in the database. Red: Measurements transmitted to the database with nonzero LP8 status (e.g. temperature below −8.5 °C, sensor malfunctioning). Cyan: Measurements with no drift adjustment (e.g. periods with unstable sensor behaviour). Orange: Measurements flagged by outlier detection. Light green: Measurements that did not pass the consistency check. Dark green: Usable measurements. (b) Distribution of the measured relative humidity. The 10 %, 50 % and 90 % quantiles of RH. Ordering as in (a).


Figure 15. Analysis of the results of measurement filtering referring to time of day (a) and relative humidity (b). Filtering is based on (i) a sensor-specific RHtrsh value, (ii) the outlier detection algorithm (OutDet) and (iii) the outlier detection algorithm plus a consistency check (OutDet/CC). For the calculation of the fraction of flagged measurements, the number of measurements and flags of all deployments are summed. The numbers of measurements are depicted as red dots.


4.4 Overall data coverage

The Carbosense network consists of 230 LP8 sensors as of 1 September 2019. In total, there were 262 deployments in the period 1 July 2017 to 1 September 2019. Over 75 % of the deployments lasted longer than 1 year, and five lasted less than 30 d.

The data transmission over Swisscom's Low Power Network (LPN) works reliably. The 25 %, 50 % and 75 % quantiles of the fraction of transmitted data for individual deployments at MeteoSwiss and NABEL locations and at locations within the city of Zurich amount to 88 %, 95 % and 98 %, respectively (Fig. 14a). Performance is even better at Swisscom's transmitter locations (25 % quantile – 98 %). However, these are usually equipped with an LPN gateway and built at elevated locations. We cannot assess to which part of the data transmission process the data loss is attributed (transmission module used in the sensor unit, LPN infrastructure, LPN network coverage). The transmission module (Microchip RN2483) of several sensor units was found to have a reduced reliability at high temperatures (above about 30 °C).

Figure 16. Comparison between LP8 and reference measurements. The LP8 measurements are outlier screened, drift corrected and checked for consistency. (a) Site HAE is located next to a motorway; (b) site PAY is located in a rural landscape; (c) site RIG is an elevated background site. Points in grey are outliers.


A small number (∼1 %) of the transmitted LP8 measurements had a nonzero status flag, for instance, when temperature was below −8.5 °C (LP8-specific threshold) or the sensor was malfunctioning. For a minor fraction of measurements a drift adjustment could not be performed as the sensor was assessed to be in an unstable phase. The outlier detection algorithm flags 23 % of the measurements that were drift corrected. In combination with the consistency check, 29 % of the measurements are flagged. There is considerable variability in these fractions related to the individual sensor performance and the location. A clear relationship is evident between the fraction of outliers and the humidity conditions encountered at the deployment location (Fig. 14b). Overall, the median of usable measurements from all individual deployments amounts to 67 %. There is a diurnal variation in the fraction of flagged measurements closely related to the diurnal variation in relative humidity (Fig. 15a). The outlier detection algorithm has the advantage of retaining a larger number of measurements in conditions of high relative humidity compared to the method using RHtrsh (Fig. 15b).

4.5 Computation of the water volume fraction

The conversion of wet CO2 to dry CO2 requires the water molar fraction χH2O. This value is computed for the sensor units based on the SHT21 T and RH measurements and the pressure that is interpolated for the specific location. The uncertainty in the estimation of χH2O and the corresponding uncertainty in the dry-air mole fraction of CO2 can be assessed for a total of 55 sensor units operated at MeteoSwiss SwissMetNet sites that are equipped with more accurate meteorological instruments. At those sites, χH2O has been computed from the sensor units and from reference T, RH and p measurements (Figs. S3 and S4). The agreement is best (±0.07 %) when global radiance is low (< 50 W m−2). In this case T and RH measured inside the box are representative for the outside conditions. Deviation is slightly worse (±0.15 %) for higher global radiance. For the majority of the measurements, the conversion of wet-CO2 to dry-CO2 molar fraction is associated with an error below 1.2 ppm (assessment of deviation for an error ϵ = 0.2 % and χCO2,wet = 600 ppm is as follows: χCO2,dry = χCO2,wet / (1 − ϵ/100) = 600 ppm / (1 − 0.2/100) = 601.2 ppm).
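The conversion chain can be sketched as follows. The formula used for the saturation vapour pressure is not stated in the text; the Magnus approximation over liquid water with the coefficients 6.112 hPa, 17.62 and 243.12 °C is assumed here purely for illustration, and the function names are hypothetical.

```python
import math


def saturation_vapour_pressure_hpa(t_celsius):
    """Magnus approximation over liquid water (assumed formula), in hPa."""
    return 6.112 * math.exp(17.62 * t_celsius / (243.12 + t_celsius))


def water_molar_fraction(t_celsius, rh_percent, p_hpa):
    """Water molar fraction from T (deg C), RH (%) and pressure (hPa)."""
    e = rh_percent / 100.0 * saturation_vapour_pressure_hpa(t_celsius)
    return e / p_hpa


def wet_to_dry(co2_wet_ppm, chi_h2o):
    """Dry-air CO2 molar fraction from the wet molar fraction."""
    return co2_wet_ppm / (1.0 - chi_h2o)
```

Calling `wet_to_dry(600.0, 0.002)` reproduces the 601.2 ppm of the error assessment above, i.e. a deviation of 0.2 % in the water molar fraction shifts the dry-air value by about 1.2 ppm at 600 ppm.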

5 Discussion and conclusions

Calibration, drift correction and outlier detection are crucial elements for the operation of the LP8 sensors in a sensor network. Due to the number of employed sensors and the slight differences in their individual response characteristics the processing scheme has to be optimized in terms of accuracy, yield of usable measurements and processing efficiency. As the processing scheme consists of several independent elements, each of them can be further improved in the future.

The sensor calibration reveals the dependencies of the sensor signal on CO2, temperature and pressure. The mathematical sensor model has to account for a varying sensor response over time. Our approach is to use an extended model as long as the sensor behaviour does not drift significantly. After large jumps in the IR signal, sensor measurements can be processed based on a simpler sensor model to optimize the measurement accuracy until the sensor is replaced. Moreover, the analysis of the data during calibration also shows the impact of environmental conditions, such as increased relative humidity (> 85 %), that are not captured by the calibration model. It demonstrates the need for dedicated measurement filtering.

We present two methods for the detection of outliers. Applied to individual sensors, the two methods flag different numbers of measurements and consequently lead to different RMSE values. Flagging the measurements based on a conservative RH threshold yields the most accurate results. The presented outlier detection algorithm, which relies on no reference measurements, is similarly powerful. The possibility of learning individual sensor characteristics in the field is an important feature to reduce calibration time.

The response of the LP8 sensors is not stable over time, and frequent adjustments are required. The performed correction during windy periods works well for the regions in and adjacent to the Swiss Plateau (Fig. 7). The method relies on a dense network of meteorological observations and high-precision CO2 measurements. Moreover, it strongly depends on the prevailing meteorology, and, therefore, it is prone to a shortage of suitable adjustment periods. This situation could possibly be improved by using the results of an operational atmospheric transport model. Two aspects are expected to be improved by using such a model: (i) the identification of time periods when the CO2 molar fraction in the atmosphere is homogeneous and sensors and reference instruments can be related and (ii) the determination of the vertical CO2 gradient. Such an atmospheric transport model is currently under development at Empa, and its applicability for the sensor network will be investigated.

The data processing for sensors in the Swiss Plateau and especially in the region of Zurich (Fig. 1) where the Carbosense network is most dense is operational. For these regions, the analysis of measurements from reference sites shows that drift correction within selected time periods works well. Results from atmospheric transport models will be required to achieve a similar level of data quality for the sensors located in the Swiss Alps.

The LP8 sensor measures the CO2 molar fraction with an accuracy of about 10 ppm on average if the sensors are calibrated, continuously monitored and drift corrected during operation and if the measurements are filtered. The resulting accuracy is not constant because it depends not only on the sensor characteristics but also on the performance of the drift correction and outlier detection algorithms and thereby on the prevailing weather conditions. The LP8 sensors are well capable of resolving differences in the CO2 molar fraction exceeding 30 ppm (3⋅σ if the RMSE value computed in Sect. 4.3 is taken for σ). CO2 variations encountered at locations in Switzerland usually exceed this threshold (see Fig. S1). Exceptions are high-altitude locations such as Jungfraujoch (3580 m a.s.l.; Sturm et al., 2013). Near-surface CO2 signals depend on anthropogenic emissions, the activity of the biosphere (uptake, respiration) and meteorology (boundary layer height, transport of CO2). LP8 sensors can resolve the site-specific CO2 signals for a wide range of locations, from elevated background sites to sites next to motorways (Fig. 16). The sensors are not capable of detecting small-scale signals and long-term trends under outdoor conditions.

Data availability

Periodic data releases on the ICOS Carbon Portal are in preparation. Temperature and RH measurements from the sensor units of the period 1 July 2017 to 1 October 2019 are already available at (EMPA et al., 2019)


The supplement related to this article is available online at:

Author contributions

MM, CH, DB, LE and PG designed, established and operated the Carbosense network. MM, AP, CH, DB and FP were involved in the development of the data analysis strategies. MM and PG performed the sensor calibrations. JM developed and manufactured the sensor units. MM prepared the paper with contributions from all co-authors.

Competing interests

Author Jonas Meyer is the CTO of Decentlab, the manufacturer of the sensor units. Jonas Meyer was involved solely in the development and manufacturing of the sensor units and in the transmission of the raw sensor data into the database hosted by Decentlab and was not involved in sensor deployment or in data analysis. The authors declare that they have no other competing interests.


Acknowledgements

We acknowledge MeteoSwiss, Swisscom, the Swiss National Air Pollution Monitoring Network (NABEL), the Environment and Health Department of the City of Zurich (UGZ), Agroscope, and the Amrein Futtermühle AG for their generous support of the Carbosense sensor network. In addition, we acknowledge Swisscom's contribution to the sensor units and data transmission. We are grateful for Senseair's support of the project. We thank Markus Leuenberger (University of Bern) for providing CO2 measurements from the sites Beromünster and Gimmiz. Günter Grossmann (Empa) is acknowledged for his support in the operation of the climate chamber. We thank Antoine Berchet (now at CEA) for his contributions at an early stage of the Carbosense project.

Financial support

Funding is provided by the Swiss Data Science Center (SDSC) through the project CarboSense4D (grant number 17-06) and the Swiss State Secretariat for Education, Research and Innovation (SERI) through the Eurostars project CO2.GLOBAL (Eurostars project ID 11401).

Review statement

This paper was edited by Russell Dickerson and reviewed by three anonymous referees.


References

Arzoumanian, E., Vogel, F. R., Bastos, A., Gaynullin, B., Laurent, O., Ramonet, M., and Ciais, P.: Characterization of a commercial lower-cost medium-precision non-dispersive infrared sensor for atmospheric CO2 monitoring in urban areas, Atmos. Meas. Tech., 12, 2665–2677, 2019.

Berhanu, T. A., Satar, E., Schanda, R., Nyfeler, P., Moret, H., Brunner, D., Oney, B., and Leuenberger, M.: Measurements of greenhouse gases at Beromünster tall-tower station in Switzerland, Atmos. Meas. Tech., 9, 2603–2614, 2016.

Bigi, A., Mueller, M., Grange, S. K., Ghermandi, G., and Hueglin, C.: Performance of NO, NO2 low cost sensors and three calibration approaches within a real world application, Atmos. Meas. Tech., 11, 3717–3735, 2018.

Castell, N., Dauge, F., Schneider, P., Vogt, M., Lerner, U., Fishbain, B., Broday, D., and Bartonova, A.: Can commercial low-cost sensor platforms contribute to air quality monitoring and exposure estimates?, Environ. Int., 99, 293–302, 2017.

Empa: Technischer Bericht zum Nationalen Beobachtungsnetz für Luftfremdstoffe (NABEL), Federal Office for the Environment, 2018. 

EMPA, Swisscom AG, Decentlab GmbH and Swiss Data Science Center: Carbosense T and RH Data – Release October 2019, ICOS Carbon Portal, 2019.

Hummelgard, C., Bryntse, I., Bryzgalov, M., Henning, J., Martin, H., Norén, M., and Rödjegard, H.: Low-cost NDIR based sensor platform for sub-ppm gas detection, Urban Climate, 14, 342–350, 2015.

InfluxDB: available at:, last access: 5 July 2019. 

Kim, J., Shusterman, A. A., Lieschke, K. J., Newman, C., and Cohen, R. C.: The BErkeley Atmospheric CO2 Observation Network: field calibration and evaluation of low-cost air quality sensors, Atmos. Meas. Tech., 11, 1937–1946, 2018.

Lewis, A., Peltier, W. R., and Schneidemesser, E.: Low-cost sensors for the measurement of atmospheric composition: overview of topic and future applications, World Meteorological Organization (WMO), 2018. 

LoRa-Alliance: available at:, last access: 5 July 2019. 

Martin, C. R., Zeng, N., Karion, A., Dickerson, R. R., Ren, X., Turpie, B. N., and Weber, K. J.: Evaluation and environmental correction of ambient CO2 measurements from a low-cost NDIR sensor, Atmos. Meas. Tech., 10, 2383–2395, 2017.

Mueller, M., Meyer, J., and Hueglin, C.: Design of an ozone and nitrogen dioxide sensor unit and its long-term operation within a sensor network in the city of Zurich, Atmos. Meas. Tech., 10, 3783–3799, 2017.

Oney, B., Henne, S., Gruber, N., Leuenberger, M., Bamberger, I., Eugster, W., and Brunner, D.: The CarboCount CH sites: characterization of a dense greenhouse gas observation network, Atmos. Chem. Phys., 15, 11147–11164, 2015.

Popoola, O. A. M., Carruthers, D., Lad, C., Bright, V. B., Mead, M. I., Stettler, M. E. J., Saffell, J. R., and Jones, R. L.: Use of networks of low cost air quality sensors to quantify air quality in urban settings, Atmos. Environ., 194, 58–70, 2018.

Schneider, P., Bartonova, A., Castell, N., Dauge, F. R., Gerboles, M., Hagler, G. S., Hueglin, C., Jones, R. L., Khan, S., Lewis, A. C., Mijling, B., Mueller, M., Penza, M., Spinelle, L., Stacey, B., Vogt, M., Wesseling, J., and Williams, R. W.: Toward a Unified Terminology of Processing Levels for Low-Cost Air-Quality Sensors, Environ. Sci. Technol., 53, 15, 8485–8487, 2019.

Senseair: Senseair LP8, available at:, last access: 5 July 2019. 

Sensirion: Digital Humidity Sensor SHT2x (RH/T), available at:, last access: 5 July 2019. 

Shusterman, A. A., Teige, V. E., Turner, A. J., Newman, C., Kim, J., and Cohen, R. C.: The BErkeley Atmospheric CO2 Observation Network: initial evaluation, Atmos. Chem. Phys., 16, 13449–13463, 2016.

Spinelle, L., Gerboles, M., Villani, M. G., Aleixandre, M., and Bonavitacola, F.: Field calibration of a cluster of low-cost commercially available sensors for air quality monitoring. Part B: NO, CO and CO2, Sensor. Actuat. B-Chem., 238, 706–715, 2017.

Sturm, P., Tuzson, B., Henne, S., and Emmenegger, L.: Tracking isotopic signatures of CO2 at the high altitude site Jungfraujoch with laser spectroscopy: analytical improvements and representative results, Atmos. Meas. Tech., 6, 1659–1671, 2013.

Tans, P. and Zellweger, C.: GAW Report No. 213, World Meteorological Organization Global Atmospheric Watch, 2014. 

Tans, P. P., Crotwell, A. M., and Thoning, K. W.: Abundances of isotopologues and calibration of CO2 greenhouse gas measurements, Atmos. Meas. Tech., 10, 2669–2685, 2017.

Zhao, C. L. and Tans, P. P.: Estimating uncertainty of the WMO mole fraction scale for carbon dioxide in air, J. Geophys. Res., 111, D08S09, 2006.

Zimmerman, N., Presto, A. A., Kumar, S. P. N., Gu, J., Hauryliuk, A., Robinson, E. S., Robinson, A. L., and R. Subramanian: A machine learning calibration model using random forests to improve sensor performance for lower-cost air quality monitoring, Atmos. Meas. Tech., 11, 291–313, 2018.