Sensor networks are increasingly used to
characterize and understand compounds in the atmosphere like ozone
(O
Tropospheric ozone formation and destruction are complex chemical processes
involving a series of interdependent chemical reactions of volatile organic
compounds (VOCs) and nitrogen oxides (NO
The equipment employed at air quality monitoring stations (AQMSs) is
relatively expensive (> USD 100 000 station
Regulatory monitoring for compliance with the ozone NAAQS is undertaken as dictated by the Code of Federal Regulations (CFR), which states, “The goal in locating monitors is to correctly match the spatial scale represented by the sample of monitored air with the spatial scale most appropriate for the monitoring site type, air pollutant to be measured, and the monitoring objective” (EPA, 2006). Ozone monitoring site types include highest concentration, population orientation, source impact, general/background and regional transport, and welfare-related impacts. Siting involves choosing a monitoring objective, selecting a location that best achieves those goals, and determining a spatial scale that fits the monitoring objective.
The minimum number of ozone monitoring sites required by the US Environmental Protection Agency (EPA) via the CFR in Riverside and San Bernardino counties is three, given that the population is between 4 and 10 million. As of 2013, there were 20 active regulatory sites measuring ozone in Riverside and San Bernardino counties (California Air Resources Board, 2013). While this monitor density is more than sufficient for regulatory requirements, recent studies suggest that the current spacing is not sufficient to capture concentration variations at high spatial resolution (Bart et al., 2014; Moltchanov et al., 2015). This variability could potentially be used to inform exposure assessment for health studies as well as improve our understanding of pollutant sources and fate (Simon et al., 2016; Lin et al., 2015; Blanchard et al., 2014).
Networks of air quality sensors have been deployed in various settings.
Moltchanov et al. (2015) measured O
Here we specifically seek to answer the following question: are these metal oxide
sensors able to detect significant differences on spatial scales smaller
than the spacing of current EPA reference stations, given their quantification uncertainty?
This study is unique in that the Inland Empire region of greater Los
Angeles frequently experiences high levels of ozone resulting in
nonattainment of the NAAQS ozone standard. The combination of abundant
sunlight and high VOC concentrations in the presence of NO
Demonstration of the U-Pod layout
This field study was conducted within a 200 km
Measurements were taken using the University of Colorado U-Pod air quality
monitoring platform (
MO
Field calibration results of the model (see Eq. 1) for ozone
sensors, showing
Sensors were calibrated using a field calibration technique commonly employed
with low-cost sensor networks which involves collocating sensors with a
reference-grade monitor for an extended period of time prior to and/or
directly following a field deployment (Piedrahita et al.,
2014). The concept of field calibration is
straightforward: develop regressions between the reference measurement and
gas sensor signal using combinations of concurrently collected environmental
data. All U-Pods were calibrated at the SCAQMD Rubidoux AQMS (elevation 248 m
above sea level) for 3 weeks, 22 July–10 August, prior to the field
deployment. The Rubidoux station spatial scale is classified as “urban”
for ozone and is located 119 m from Hwy 60 (SCAQMD, 2017). Reference ozone
is measured using a designated Federal Equivalent Method (FEM) Thermo 49i
dual-cell UV photometric monitor. This monitor is equipped with temperature
and pressure compensation, which adjusts for changes in sensor signal due to
changes in the sample gas. Numerous field calibration relationships were
developed using a suite of custom MATLAB codes. This process involves
performing linear and nonlinear regressions using sensor signal, measured
U-Pod enclosure temperature, absolute humidity, and time (to account for
sensor drift) against the reference gas concentrations. MO
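The field calibration idea described above can be sketched in a few lines. The following is a minimal illustration, assuming a purely linear form with an intercept and four predictors (sensor signal, enclosure temperature, absolute humidity, and elapsed time); it is not the paper's Eq. (1), and the function and variable names are hypothetical.

```python
import numpy as np


def _design_matrix(signal, temp_c, abs_humidity, elapsed_days):
    # Columns: intercept, sensor signal, temperature, absolute humidity, time.
    return np.column_stack([
        np.ones(len(signal)), signal, temp_c, abs_humidity, elapsed_days,
    ])


def fit_field_calibration(signal, temp_c, abs_humidity, elapsed_days, ref_o3):
    """Least-squares fit of reference ozone on collocated sensor data.

    Assumed illustrative form (not the paper's Eq. 1):
        O3 = b0 + b1*signal + b2*T + b3*AH + b4*t
    The time term stands in for sensor drift.
    """
    X = _design_matrix(signal, temp_c, abs_humidity, elapsed_days)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(ref_o3, dtype=float), rcond=None)
    return coeffs


def apply_field_calibration(coeffs, signal, temp_c, abs_humidity, elapsed_days):
    """Convert raw sensor data to ozone estimates with fitted coefficients."""
    return _design_matrix(signal, temp_c, abs_humidity, elapsed_days) @ coeffs


def rmse(predicted, reference):
    """Root-mean-square error between calibrated and reference ozone."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

In this sketch, coefficients are fitted on the collocation data and then applied unchanged to data recorded after the sensors are moved, which mirrors the calibrate-then-deploy workflow in general terms.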
Following the field calibration, the U-Pods were relocated throughout the study area to the sites shown in Fig. 1. Sites were chosen based on availability and zoning. A mix of industrial, residential, and commercial areas were selected including a university campus and public parks. U-Pod D7 remained at the Rubidoux station, while D0 and D5 were relocated to Mira Loma Reference station for the purpose of validation.
To quantify the performance of the calibration model coefficients, a nearly
3-month-long validation dataset was collected comparing reference-grade
gas concentration measurements to sensor data after applying the model
coefficients to the raw sensor data. Previous air quality sensor campaigns
either have reported mixed results when performing validation in the field or
have included no validation. Moreover, no study, to our knowledge, has validated
ozone sensor measurements against reference-grade monitors at 1 min
resolution. Two validation approaches were investigated. First, we compared
sensor measurements to reference-grade observations in the
Example calibration results for one ozone sensor in U-Pod D0.
Panel
Calibration results for various models showing correlation and RMSE of the
calibrated ozone data against the reference monitor data are provided in
Table S1 in the Supplement. For the sake of simplicity, results from the overall
best-performing model (see Eq. 1) are shown in Table 1.
Figure 3 illustrates the calibration results for U-Pod D0. Residuals were
calculated as modeled minus reference instrument concentrations. The
normally distributed residuals shown in panel c were indicative of an
unbiased model. Residuals were plotted versus various model parameters to
assess bias in the model performance as a function of the predictors. The
slightly negative slope of the trend line in panel e indicated underpredicting
at increasing absolute humidity, whereas positive slopes in panels d and f
show the opposite trend, slight overprediction at higher values of
concentration and temperature. The
The quickly expanding sensor community has been convening to discuss
practical and theoretical considerations of low-cost sensor applications in
the modern landscape, identifying a need for increased understanding of
inter-sensor variability (Clements et al., 2017). Few groups have thoroughly
investigated the physicochemical relationships governing MO
Additional insight into this effort can be gleaned by exploring the results
of sensor-specific model parameters from the nearly 3-week calibration
period of this study. To directly compare model parameters (i.e.,
coefficients), standardized regression coefficients were generated by
rescaling model input variables from 0 to 1. Rescaling was achieved by
dividing the difference between each variable data point from its respective
distribution minimum by the maximum difference measured (i.e.,
[
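The rescaling described in words above, subtracting the distribution minimum and dividing by the maximum difference, is ordinary min-max scaling. A minimal sketch follows; the function name is hypothetical and the handling of constant inputs is an assumption.

```python
def rescale_unit(values):
    """Rescale a variable to the 0-1 range: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant variable has no spread, so it cannot be rescaled.
        raise ValueError("cannot rescale a constant variable")
    return [(v - lo) / span for v in values]
```

With every input variable on the same 0 to 1 scale, the fitted regression coefficients can be compared directly as relative effect sizes.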
Average relative effect size of model parameters predicting sensor
signal (
It is important to note that the reference resistance,
Some temperature and humidity values experienced by the U-Pods during the deployment were not experienced during the calibration time period, meaning that the environmental parameter space sampled during calibration did not cover the parameter space experienced during deployment. Deployment data were therefore filtered for conditions that would require extrapolation, an example of which is shown in Fig. 5. Because ozone measurements are dependent on temperature and humidity, one way to reduce error in the deployment data is to use only ozone data points whose temperature and humidity were within the range of the calibration data. All U-Pod data from the deployment period were filtered to eliminate points that had temperature and relative humidity values outside the ranges recorded during calibration. The global absolute humidity in Fig. 5a is the same for all U-Pods. Normally, the absolute humidity would be calculated for each U-Pod using its individual recorded temperature, relative humidity, and pressure. However, during the deployment, the relative humidity sensors failed in several U-Pods. The relatively high chance of sensor failure in the field is one of the limitations of low-cost sensor networks. Four of the U-Pods experienced RH values below zero; because the RH sensor sets these values to zero, there was no way to recover any data below zero. All of the U-Pods experienced, at some point, at least 1 week of missing data. Because of this, temperature and relative humidity data from the Rubidoux AQMS, along with a constant pressure value, were used to calculate the global absolute humidity for the Riverside area for each minute. During calibration, the same values of absolute humidity were used for each U-Pod, but temperatures were U-Pod specific.
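The extrapolation filter described above amounts to a range check against the calibration data. A minimal sketch, assuming each deployment record is a hypothetical `(temperature, relative_humidity, ozone)` tuple and that the range bounds are inclusive:

```python
def calibration_ranges(cal_temp, cal_rh):
    """Return ((t_min, t_max), (rh_min, rh_max)) seen during calibration."""
    return (min(cal_temp), max(cal_temp)), (min(cal_rh), max(cal_rh))


def filter_to_calibration_space(records, temp_range, rh_range):
    """Keep only records whose temperature and RH fall inside the
    calibration ranges, so the model is never asked to extrapolate.

    records: iterable of (temperature, relative_humidity, ozone) tuples
    (a hypothetical layout chosen for this sketch).
    """
    (t_lo, t_hi), (rh_lo, rh_hi) = temp_range, rh_range
    return [
        rec for rec in records
        if t_lo <= rec[0] <= t_hi and rh_lo <= rec[1] <= rh_hi
    ]
```

The fraction of records removed by such a filter gives a quick measure of how well the calibration period covered the deployment conditions.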
Example filtering for a U-Pod (D3) showing lower absolute
humidity
In addition, deployment data were filtered for maximum values of O
Lastly, data were filtered using consecutive differences. Data were omitted when they fell more than 8 standard deviations away from the mean consecutive difference in values. This is a standardized way to cut out spikes in data caused by power control issues. The results of the deployment data filtering, including percent of data lost, are shown in Table S2. Most U-Pods (except D8 and DB) have two ozone sensors. For U-Pods with two ozone sensors, only one was used for the analysis. The data from the calibration time period for each sensor were compared to the reference data at Rubidoux. Whichever sensor had the highest correlation and lowest RMSE with the reference was chosen for subsequent analysis.
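The consecutive-difference filter can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the point that ends an outlying step is the one dropped, and the standard deviation is the sample standard deviation of all consecutive differences.

```python
import statistics


def filter_spikes(values, n_sd=8.0):
    """Drop points reached by an outlying consecutive difference.

    A difference d_i = x_i - x_(i-1) is outlying when it lies more than
    n_sd standard deviations from the mean consecutive difference.
    Assumption of this sketch: x_i is dropped when d_i is outlying, so the
    return step after a one-point spike removes the following point too.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    if len(diffs) < 2:
        return list(values)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)
    if sd_d == 0:
        return list(values)
    kept = [values[0]]
    for value, diff in zip(values[1:], diffs):
        if abs(diff - mean_d) <= n_sd * sd_d:
            kept.append(value)
    return kept
```

Because the spike itself inflates the standard deviation, an 8 standard deviation threshold only triggers on long records, which is consistent with applying it to minute-resolution data.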
U-Pod DD was omitted from this analysis due to a lack of data. This pod lost almost 46 % of its data after the filtering process and collected significantly fewer data than the others due to site security issues. U-Pods D4, D5, D6, D8, and DF required a modification be made to their electronics boards. This modification to the U-Pod system appeared to have shifted ozone baseline signal values, resulting in biased values for D5 (see Sect. 3.3 below). In a conservative effort, all U-Pods that were modified as described above were removed from the subsequent ozone analysis. Since some U-Pods were at the same location, the removal of these U-Pods resulted in the loss of three sites from the study. All the remaining sites were left with one U-Pod each.
Overall validation sensitivity results showing mean residuals,
median residuals,
Validation of the field calibration models was achieved by deploying U-Pods
next to reference instruments during times when the others were spread out
over the study area. The validation time period (11 August–25 October)
overlapped with the deployment time period (17 August–20 October). Coefficients
generated from the regression models (Table S1) were applied to the filtered
data from D7, D0, and D5. The best-performing model was selected based on
Validation results from the
The first validation method (U-Pod in the same location as the reference
station, D7) would be expected to have better validation statistics than
U-Pods validated using the second method (U-Pod relocated to a different
location, D0 and D5) because the environmental conditions (e.g., temperature,
humidity, distance to roadway and other site-specific conditions)
encountered by the pods were the same as the reference for the first
validation method. However, this is not the case as both O
Organizations using or planning to use sensors to monitor ambient air quality are interested in how frequently sensors require calibration so as to keep them within a specified “tolerance” of reference-grade measurements. As a precautionary note, durations between suggested calibrations are highly dependent on the environment, the quality and robustness of the calibration, and the gas species of interest. The validation statistics presented so far have been aggregated over the entire deployment period (or have been selected at random in the case of the iterative validation described above). However, to further inform the sensor community on how robust calibration models can be through time and environmental space (e.g., humidity and temperature), validation was performed independently for the first week and last full week of the deployment, and the results for each week are shown below in Fig. 6.
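Week-by-week validation of this kind can be sketched as computing residual statistics on a time-restricted subset. The example below assumes residuals are defined as sensor minus reference, as in the calibration analysis, and uses a hypothetical integer day index to select a week.

```python
import math


def validation_stats(sensor, reference):
    """Return (mean residual, RMSE) with residual = sensor - reference."""
    residuals = [s - r for s, r in zip(sensor, reference)]
    mean_residual = sum(residuals) / len(residuals)
    rmse = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return mean_residual, rmse


def weekly_stats(day_index, sensor, reference, first_day, n_days=7):
    """Validation statistics for days in [first_day, first_day + n_days).

    day_index holds the deployment day of each sample (hypothetical layout).
    """
    picked = [
        (s, r) for d, s, r in zip(day_index, sensor, reference)
        if first_day <= d < first_day + n_days
    ]
    if not picked:
        raise ValueError("no samples in the requested week")
    sensor_week, reference_week = zip(*picked)
    return validation_stats(sensor_week, reference_week)
```

Comparing the statistics for the first and last week gives a simple indication of whether calibration performance has drifted over the deployment.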
Within the first week of the validation (panel a), the range of reference
ozone concentrations (
Figure 6 has two identifiable deviations from the 1 : 1 line. These two
events, identifiable as the “claws” in week 1 (shown in panel a
(i–ii)), demonstrate higher reference measurements than both D7 sensors,
leading to large residuals. These claws are separated in time, but each claw
is a single event (consecutive measurements) lasting 1 and 8 h in
duration. To explore these claws further, a scatterplot for each sensor
colored by temperature and humidity at each time point was created (Fig. S5). They show that the two events visible for D7 occur at drastically
different temperatures and humidity. The first (lower) claw has low
temperature and high humidity, and the second has the reverse conditions.
This finding provides evidence for a separate confounding variable, as it is
not the same condition in temperature or humidity that causes these
underpredictions in ozone measurements. In future studies, the U-Pod could be
outfitted with sensors to detect other possibly confounding gases, such as
NO
SCAQMD performed nightly precision checks (PCs) consisting of measuring the
ozone concentration of a known gas standard that typically ranges between
90 and 100 ppbv for 1 h. When PC measurements deviated more than 5 % from
expected values (corresponding to approximately 5 ppbv), subsequent data
would be flagged and a work order would be generated for service or
calibration. Values that are within 5 % of the standard would not be
flagged. This serves as a reference point for the quality of the reference
ozone measurements. During validation, O
As mentioned above, U-Pods were deployed, spread out across 200 km
The U-Pods sampled for approximately 2900 h total, 58 % of which consisted of the deployment period data. The medians of the ozone distributions during calibration ranged from 29 to 30 ppbv. During calibration, the 5th and 95th percentiles ranged from 2 to 5 and 70 to 83 ppbv, respectively. During deployment, the median ozone values were between 14 and 31 ppbv, while the 5th- and 95th-percentile ranges were 0–6 and 67–99 ppbv, respectively.
Ozone concentrations experience a diurnal cycle. This cycle usually
incorporates low ozone at night and during the early morning, and a peak in
concentration sometime during the day. Gao (2007) used hourly ozone
measurements recorded over southern California from 16 June to
15 October 1997 and found that ozone began to increase in the region
around 08:00, peaked between noon and 15:00, and then declined until
about 21:00. The precursors to forming ozone – sunlight, VOCs, and NO
Figure 7 illustrates the temporal variability in ozone concentrations observed in this study. The trends expected for southern California are present: ozone is lowest from midnight to 06:00, accumulates between 06:00 and 14:00, peaks between 14:00 and 16:00, and decreases again over the remaining hours.
In order to assess spatial variability, we examined the
The diurnal cycle of ozone during the deployment. Distributions are
concentrations from all U-Pods during each hour. Whiskers indicate the 5th
and 95th percentile, with
U-Pod ozone measurements are more correlated to each other during
calibration than deployment. The
Each box plot is a collection of the
Absolute O
We expected that at times of day where the spatial variability was the lowest
(
Distributions of medians of absolute differences between all pairs of pods for each hour of the day. Whiskers show 95 % intervals. The black line connects the medians of the deployment. The “all” category includes all hours of the day.
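The statistic summarized in this figure, the median absolute difference for every pair of pods within each hour of the day, can be sketched as below. The nested dictionary layout and the `(day, hour)` keys are assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def median_abs_diff_by_hour(pod_series):
    """Median absolute ozone difference per pod pair and hour of day.

    pod_series: {pod_id: {(day, hour): ozone}} (hypothetical layout).
    Returns {hour: {(pod_a, pod_b): median |ozone_a - ozone_b|}} using only
    timestamps that both pods in a pair recorded.
    """
    result = {}
    for pod_a, pod_b in combinations(sorted(pod_series), 2):
        series_a, series_b = pod_series[pod_a], pod_series[pod_b]
        diffs_by_hour = {}
        for key in series_a.keys() & series_b.keys():
            hour = key[1]
            diffs_by_hour.setdefault(hour, []).append(
                abs(series_a[key] - series_b[key])
            )
        for hour, diffs in diffs_by_hour.items():
            result.setdefault(hour, {})[(pod_a, pod_b)] = median(diffs)
    return result
```

The distribution of the pairwise medians within each hour is what a box plot per hour would then display.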
Towards the end of daylight hours, between 16:00 and 20:00, the medians of
absolute concentration differences decrease with time of day,
which would indicate that the U-Pods are becoming more similar because
their differences are smaller. However, in the same hours and later, the
U-Pod D7 ozone concentrations are plotted on the
Data from D3, at industrial zone 1, plotted against D7 (at Rubidoux). In each scatterplot, colored data in the legend represent 4 h of the day, and the black data represent the complete deployment dataset (all hours). The black line is a 1 : 1 line, not a line of best fit.
To further understand the factors impacting the observed spatial variability, we examined U-Pods individually in more detail. We undertook this investigation by comparing each U-Pod to a common reference U-Pod, to illuminate differences between locations in a normalized way. If no spatial variability was observed, then comparing two U-Pods' ozone measurements would show a 1 : 1 relationship with spread near the RMSE values determined in the validation (4.4–5.9 ppbv). To explore this analysis, D7 was used for normalization. U-Pod D7 was never moved from the Rubidoux station throughout the project and as such was employed in the validation effort mentioned previously. This U-Pod was used as the normalization instead of an AQMS reference monitor in order to compare two similar types of measurement. The U-Pod to U-Pod comparisons are shown with the differences between calibration period trends and deployment trends in Fig. 10 as well as hourly patterns in Fig. 11.
In Fig. 10, the calibration data points, representing collocated O
Data from DA, located at Commercial Zone 1, plotted against D7 (Rubidoux). Each scatterplot represents 4 h of the day, with the black data representing the complete deployment dataset (all hours), and data points recorded within each hour bin are marked by the colors and times in the legend. The black line is a 1 : 1 line, not a line of best fit.
Examining the data in this way allows for detailed comparison of U-Pods at different sites. For example, sites D0, D3, and DE were not more than 1.8 km away from each other, near Van Buren Blvd in the northwest of the project area, and all were less than 1.2 km from the road. Therefore, one might expect data from these U-Pods to be very similar. Indeed, D0 and DE have similar data cloud shapes in Fig. 10. However, data from D3 look rather different. This could indicate that a localized source is affecting the ozone concentrations at that site. Perhaps a local emission of NO was scavenging ozone at industrial zone 1 as a result of industrial operations. Alternatively, this difference could be caused by unique meteorological conditions at this site. However, when investigated further, the lower ozone values at D3 relative to D7 also appear more pronounced on weekdays (Fig. S7), reinforcing the hypothesis that industrial activities cause such differences.
U-Pod DA was the farthest away from the other monitors (
Temporal variation in ozone values can be visually examined in more detail by singling out certain hours of data, compared to the full set. Figures 11 and 12 demonstrate this concept.
Figures 10 and 11 show that, relative to D7, the deployment data for D3 are
consistently lower than those of the other U-Pods. D3 is 7 km from D7, in the north
of the project area. U-Pod D3 was sited at a company in an industrial area
where there are potentially more VOCs in the air. This site was half a
kilometer from the Van Buren roadway, and as such there is also the potential
for elevated levels of NO
Figure 12 shows the relationship between DA and D7 at varying hours during
the day, highlighting some interesting observations. First, there was far
less spread around the 1 : 1 line for DA than for D3, indicating that ozone
measurements from DA were more similar to those from D7 than measurements
from D3 were. DA is about as far from D7 as D3 is, about 7.5 km away, but
still in the northern area of the study. Also of interest is the strange claw shape on
the underside of the black data cloud. The analysis in Fig. 12 was conducted
for all pods, but not all are shown here. It appears that many of these
points occur mostly in hours 09:00 through 11:00 for all affected U-Pods. The
data points from the claws in DA occur in a few consecutive hours on three
different days, similar to D7. The claw in D7 is not causing this effect in
DA, because they occur at different times. One possible explanation for this
may be the presence of one or more gas species that is not captured by the
model but that affects either the sensor directly or the concentration of
ozone in the vicinity for a short time. These gases could be localized ozone
precursor emissions such as NO
In the region of Riverside, CA, we were able to observe spatial and temporal
variability of ozone across an area of roughly 200 km
Technological difficulties of obtaining sensor data through environmental
extremes, increased sensor variability with high ozone values, electrical
issues, and data retrieval are all issues encountered when using a U-Pod
sensor network. Although the sensors themselves are low-cost, the data
retrieval, validation, and analysis are not. Data were retrieved every two weeks, which required a field visit to each site. Sensor platforms
that wirelessly transmit data (or stream data) require additional hardware
and may limit sensor placement yet are promising for many applications. The
U-Pod has since evolved to incorporate wireless data transmission in some
units. Processing (e.g., QAQC, filtering) and analysis of these data
(
The highest amount of variability between U-Pods based on the
For future sensor research, an analysis comparing the amount of time spent collocating (calibrating) with the amount of time deployed (applying calibration) would be very beneficial for the sensor community. This information can inform how long sensors can be deployed in a given region under given environmental conditions before recalibration is warranted. In this study, for nearly 3 weeks of collocation time, sensors were deployed for more than 9 weeks with only slight variation in performance from week 1 to week 9. It is important to collocate the sensors as frequently as possible while balancing other resources. Sensor quantification using mathematical approaches other than linear regression could improve the performance. Since higher values of ozone are of the greatest interest to regulators and the public from a human health standpoint, and the sensor variability increases at those higher values, perhaps the regression could be fit differently to suit those needs. An example could be to fit a piecewise function, to better capture the low-ozone and high-ozone regimes separately, or other nonlinear models.
Additionally, including contemporaneous measurements of other gaseous compounds could help explain spatial and temporal ozone variability. For example, including information on nitrogen oxides and volatile organic compound concentrations could help inform the effects of traffic on ozone measurements, while land use data could reveal the effect of vegetation or industrial operations on measurements. Furthermore, this study was conducted in an area with relatively high levels of ozone, which can be simpler to detect. Many people live in areas that have ozone levels closer to EPA-required levels, though they still experience some periods of nonattainment. To make this research more relevant to all people, the next step could be to try to detect the same spatial and temporal variability in these places as well.
The final, filtered dataset and the codes used to make the plots in this
paper are available on Mendeley at
KS helped conduct the field experiment and analyze deployment data, and prepared the manuscript with contribution from all authors. EC was the lead field scientist, performed the calibrations and validation analysis, and conducted the literature review. AP and BF facilitated collaboration between the Hannigan group and the South Coast Air Quality Management District, and provided useful information on air quality conditions in Riverside County. QL, DKH, and MH provided guidance and academic support for the project.
The authors declare that they have no conflict of interest.
This material is based upon work supported by the National Science Foundation under grant no. 1442971 for the CyberSEES project. Thank you to members of the Hannigan group for your help: Ashley Collier, Ricardo Piedrahita, and Joanna Casey. Thank you to Drew Meyers for your support. Also, thanks to our field technician, Brandon Wong. We also acknowledge the individual participants who accommodated U-Pods during the deployment.
Edited by: William R. Simpson
Reviewed by: three anonymous referees