Application of Low-Cost Fine Particulate Mass Monitors to Convert Satellite Aerosol Optical Depth Measurements to Surface Concentrations in North America and Africa

Low-cost particulate mass sensors provide opportunities to assess air quality at unprecedented spatial and temporal resolutions. Established traditional monitoring networks have limited spatial resolution and are frequently absent in less-developed countries (e.g. in sub-Saharan Africa). Satellites provide snapshots of regional air pollution, but require ground-truthing. Low-cost monitors can supplement and extend data coverage from these sources worldwide, providing a better overall air quality picture. We demonstrate such a multi-source data integration using two case studies. First, in Pittsburgh, Pennsylvania, both traditional monitoring and dense low-cost sensor networks are present, and are compared with satellite aerosol optical depth (AOD) data from NASA's MODIS system. We assess the performance of linear conversion factors for AOD to surface PM2.5 using both networks, and identify relative benefits provided by the denser low-cost sensor network. In particular, with 10 or more ground monitors in the city, there is a two-fold reduction in worst-case surface PM2.5 estimation mean absolute error compared to using only a single ground monitor. Second, in Rwanda, Malawi, and the Democratic Republic of the Congo, traditional ground-based monitoring is lacking and must be substituted with low-cost sensor data. Here, we assess the ability of regional-scale satellite retrievals and local-scale low-cost sensor measurements to complement each other. In Rwanda, we find that combining local ground monitoring information with satellite data provides a 40% improvement (in terms of surface PM2.5 estimation accuracy) with respect to using ground monitoring data alone. Overall, we find that combining ground-based low-cost sensor and satellite data can improve and expand spatio-temporal air quality data coverage in both well-monitored and data-sparse regions.
https://doi.org/10.5194/amt-2020-67 Preprint. Discussion started: 3 March 2020. © Author(s) 2020. CC BY 4.0 License.


Introduction
Poor air quality is the single largest environmental risk factor for human health. Outdoor air pollution exposure is estimated to have caused about four million premature deaths annually in recent years (WHO, 2016, 2018a). Particulate matter (PM), which represents a mixture of solid and liquid substances suspended in the air, is one of the most commonly tracked and regulated atmospheric pollutants globally (WHO, 2006). Not only does it have a major adverse health impact by itself (e.g. Schwartz et al., 1996; Pope et al., 2002; Brook et al., 2010), but its concentration is also often used as a proxy for overall air quality (WHO, 2018a). PM mass concentration is typically tracked as PM10 (total mass of particles with diameters below 10 micrometers) and/or PM2.5 (total mass of particles with diameters below 2.5 micrometers). Even at low concentrations, PM can have significant health impacts (Bell et al., 2007; Apte et al., 2015). These health impacts are especially notable in low-income communities and countries, where they can interact with other socio-economic risk factors (Di et al., 2017; Ren et al., 2018).
Sub-Saharan Africa (SSA) in particular is affected by poor air quality, with less than 10% of communities assessed by the WHO meeting recommended air quality guidelines, compared with 18% globally, and 40 to 80% in Europe and North America (WHO, 2018b). This poor air quality manifests in terms of high infant mortality (Heft-Neal et al., 2018), increased risk of chronic respiratory and cardiovascular diseases (Matshidiso Moeti, 2018), and reduced gross domestic product (World Bank, 2016). Industrial development and climate trends will likely only exacerbate this problem in the future (Liousse et al., 2014; UNEP, 2016; Silva et al., 2017; Abel et al., 2018). Many African countries have among the highest estimated annual average PM10 and PM2.5 concentrations, yet are also among those with the lowest number of in situ reference-grade PM monitoring sites per capita. Figure 1 shows estimated average annual PM2.5 concentrations for various regions of the world versus the density of reference-grade monitoring sites in these regions (note that low-cost monitors are not considered). This comparison is based on information from the Global Health Observatory, which combines data from multiple sources, including data collected in different years and by sporadic monitoring from field campaigns, and so does not necessarily reflect continuous routine monitoring for all regions (WHO, 2017). This lack of continuous surface monitoring data makes it difficult to answer basic scientific and policy questions related to air quality assessment and mitigation (Petkova et al., 2013; Martin et al., 2019). A major reason for this gap is the high capital and operational costs of traditional ground-based air quality monitoring equipment. Two emerging technologies have the capacity to close this gap: satellite-based air quality monitoring and ground-based low-cost sensor systems.
Satellites are much more expensive than traditional ground-based monitors, but their mobility and unique vantage point allow them to provide near-global coverage. Data from earth-observing satellites can be used to assess air quality in a variety of ways. In particular, aerosol optical depth (AOD) represents a measurement of the absorption and scattering (extinction) of light by the atmosphere, and can be related to the concentration of light-absorbing or light-scattering pollutants in the atmosphere. Several factors complicate the relationship between AOD and surface-level particulate matter mass concentrations. As a vertically-integrated quantity, AOD is related to total light extinction by a column of atmosphere. The spatial distribution of particulate matter, especially vertical stratification, the presence or absence of plumes aloft, humidity, and the size and optical properties of particles drive the relationship between AOD and surface concentrations (Kaufman and Fraser, 1983; Liu et al., 2005; Paciorek et al., 2008; Superczynski et al., 2017; Zeng et al., 2018). Cloud cover also makes AOD retrievals impossible; the frequency of cloudy days in an area can therefore make it difficult to establish reliable relationships between AOD and surface PM (Belle et al., 2017). Changes in surface brightness can also confound this relationship, although this may be less of an issue in developing countries with higher aerosol levels (Paciorek et al., 2012).
Nevertheless, early examinations of AOD data from the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments, launched aboard the Terra and Aqua satellites in 1999 and 2002, respectively, showed good correlation with long-term average surface PM2.5 concentrations in the United States, although these relationships varied from region to region (Wang, 2003; Engel-Cox et al., 2004). For shorter timescales, correlations between AOD and hourly surface PM2.5 were found to vary from an r² of 0.36 in the southeastern United States to an r² of 0.04 in the southwestern United States (Zhang et al., 2009). Using additional covariates, such as land cover, land usage, and meteorological information, can further improve these relationships. In particular, surface PM2.5 estimation models combining daily-averaged, 1-kilometer-resolution AOD data with meteorological and land use regression variables can achieve an agreement (quantified as r²) with EPA ground-based monitors of about 0.9 in the northeastern and 0.8 in the southeastern United States (Chang et al., 2014; Kloog et al., 2014). Methods incorporating the outputs of chemical transport models can further improve these results (e.g. Murray et al., 2019).
Models combining satellite AOD data with vertical profiles derived from chemical transport models tend to underestimate surface-level PM2.5 outside of Europe and North America, mainly in India and China where ground-based comparison data are available (van Donkelaar et al., 2010, 2015). In China, the r² between surface PM2.5 estimates derived from satellite AOD, meteorological, and land use information and measured surface PM2.5 was found to be about 0.7, corresponding to a root mean square error (RMSE) of about 30 µg/m³ in resulting satellite-derived surface concentration estimates (Ma et al., 2014). A method that updates the relationships between AOD and surface PM2.5 on a daily basis (Lee et al., 2011) was able to improve these results, increasing r² above 0.8 while reducing RMSE to about 20 µg/m³ (Han et al., 2018). This method, however, relies on local ground-based measurements to provide the data necessary to perform this daily updating. In Africa, although satellite-based and ground-based AOD measurements agreed well during a recent assessment in West Africa (Ogunjobi and Awoleye, 2019), an assessment in South Africa found a poor relationship between satellite AOD and surface PM2.5, with maxima in the surface concentrations coinciding with minima in the AOD (Hersey et al., 2015). Similar results were found in India, with anticorrelation observed between satellite AOD and surface PM2.5 for some locations (see supplemental information, Fig. S1). Overall, while satellites have the potential to provide broad data coverage to previously unmonitored areas such as in SSA, relationships between AOD and surface PM2.5 developed using ground monitoring data elsewhere in the world may not transfer well to SSA, leading to inaccurate air quality quantification.
Low-cost air quality monitors, defined in contrast to traditional or regulatory-grade monitors, have much lower purchase and operational costs, e.g. on the order of five thousand US dollars per multi-pollutant monitor (measuring gases and PM), while a comparable suite of traditional air quality monitoring instruments would cost a hundred thousand US dollars or more. This cost reduction is made possible by a combination of lower-cost measurement technologies, such as electrochemical sensors for gases and optical particle detectors for PM, and the recent decreasing costs of battery, data storage, and communications technologies. Much recent research interest has been focused on assessing the performance of these technologies (e.g. AQ-SPEC, 2015), developing methods for accounting for cross-interference effects in gas sensors (e.g. Cross et al., 2017; Zikova et al., 2017; Kelly et al., 2017; Zimmerman et al., 2018; Crilley et al., 2018; Malings et al., 2019a) and humidity dependence in optical PM measurement methods (e.g. Malings et al., 2019b) to improve data quality, and demonstrating the utility of these low-cost monitors in various use cases (e.g. Subramanian et al., 2018; Tanzer et al., 2019; Bi et al., 2020). Because of their relatively low cost, these instruments can be deployed more widely than traditional monitoring technologies, enabling measurements in previously unmonitored areas. The tradeoff for this increased affordability is a decrease in accuracy compared to traditional air quality monitoring instruments.
While there are currently no agreed-upon criteria for assessing low-cost monitor performance (Williams et al., 2019), several schemes suggest tiered rankings ranging from, for example, 20% relative uncertainty for reasonable quantitative measurements to 100% uncertainty for indicative measurements (Allen, 2018); this gives a general sense of the expected performance characteristics of such instruments. In particular, recent testing of two types of such low-cost monitors (which are the types used in this paper) found relative uncertainties on the order of 40% and a correlation coefficient of 0.7 (r² of 0.5) with regulatory-grade instruments for hourly PM2.5 measurements (Malings et al., 2019b). These results are generally consistent with similar studies conducted in a variety of environments and concentration regimes, although relative performance tends to improve at higher concentrations (Kelly et al., 2017; Zheng et al., 2018).
The potential exists to use both satellite and low-cost sensor data together in order to address the shortcomings of each data source individually and thereby to fill existing data gaps globally. Satellite data provide near-global coverage, but relationships between AOD and surface PM2.5 do not generalize well across regions, and so local ground-based data are needed for establishing conversion factors. Low-cost sensors can provide these local data in areas where existing monitoring networks are sparse or where reference-grade data are only sporadically available. Although individual low-cost sensors are subject to noise and drift, if a large number of such sensors are covered by a single satellite pass, these errors may be averaged out. This paper examines the use of low-cost PM sensors as ground data sources for converting satellite AOD measurements into surface information for two case studies. First, using a dense network of low-cost monitors in Pittsburgh, Pennsylvania, USA, where a regulatory-grade monitoring network already exists, we assess the utility of low-cost sensors as compared to these traditional instruments. Second, using low-cost monitors deployed in SSA in various locations in Rwanda, Malawi, and the Democratic Republic of the Congo, we explore the utility of these low-cost sensors in previously unmonitored areas. Although we have no overlapping networks of regulatory-grade and low-cost monitors in SSA to refer to, we use data (freely available from various sources, including the US State Department and openaq.org) from regulatory monitors at the US Embassies in Kampala, Uganda and Addis Ababa, Ethiopia to supplement our analysis of the relationship between converted satellite AOD data and surface-level PM2.5 across SSA. In this work, we focus on high spatial and temporal resolution satellite data, which best align with the capacity of low-cost sensors to provide local air quality information in near-real-time. The techniques presented here are likely to translate to other data sources (e.g. new reference-grade monitors, new geostationary satellites) as they become available in the future.

Low-cost PM2.5 sensor data
Surface PM2.5 data were collected with three types of low-cost sensor systems, as described below.

Met One Neighborhood Particulate Monitor (NPM)
The Met One Neighborhood Particulate Monitor (NPM) sensor uses a forward light scattering laser to provide estimates of PM mass. It is equipped with an inlet heater and PM2.5 cyclone. The performance of these instruments has been assessed in previous studies (AQ-SPEC, 2015; Malings et al., 2019b), and they have been shown to have moderate correlation to regulatory-grade instruments. The cost of an NPM unit is about $2000, or about one-tenth that of a regulatory-grade instrument. It is recommended that these units be cleaned and re-calibrated regularly between field deployments; such maintenance activities are not always possible in certain remote deployment locations, however, and so long-term calibration drift and accumulation of debris in the cyclone are potential sources of error for these devices.

PurpleAir II (PA-II)
The PurpleAir PA-II monitor uses a pair of Plantower PMS 5003 laser sensors to detect particles. Estimates of PM1, PM2.5, and PM10 mass concentrations are provided by these sensors. The units also have internal temperature and humidity sensors and wireless communications capability, allowing them to transmit data over local networks. Several units were also modified to interface with an external device for data collection (see Sect. 2.1.4). Previous tests have shown high correlation between these units and regulatory instruments, although this can vary, especially at high humidity (AQ-SPEC, 2017; Malings et al., 2019b). Individual Plantower sensors are also subject to malfunctions and performance degradation; a comparison between the Plantower sensors within the PA-II can be useful in detecting when these errors occur. These sensors are sold for about $250, or roughly one hundredth of the price of a regulatory-grade monitor.

Alphasense Optical Particle Counter (OPC)
The Alphasense OPC-N2 optical particle counter measures particles in the 0.38 to 17 µm range, and converts particle counts to PM1, PM2.5, and PM10 mass concentrations using proprietary internal calibrations. Previous tests of these sensors showed moderate correlation with regulatory-grade instruments in field conditions (AQ-SPEC, 2016; Crilley et al., 2018). The Alphasense OPC sensors used in this paper were integrated into ARISense low-cost monitor nodes (see Sect. 2.1.4), which provided temperature and humidity information along with data collection and transmission services. The sensors themselves cost about $350, but this does not include the cost of the necessary electronics for logging and transmitting data nor of a weatherproof housing.

Data collection and processing
For data collection, all NPM and most PA-II units were paired with RAMP lower-cost monitoring packages. The RAMP (Real-time Affordable Multi-Pollutant) monitor is produced by SENSIT Technologies (Valparaiso, IN; formerly Sensevere), and has internal gas, temperature, and humidity sensors, along with the capability to interface with external PM monitors. This allows data collected by these PM monitors to be stored and transmitted over cellular networks by the RAMP. The characteristics and operation of the RAMP are described elsewhere (Zimmerman et al., 2018; Malings et al., 2019a). The ARISense node, manufactured by Quant-AQ (Somerville, MA; formerly manufactured by Aerodyne Research), is a lower-cost sensor package which combines internal gas, humidity, temperature, wind, and noise sensors, together with the Alphasense OPC-N2 PM sensor, and provides internet connectivity for data transmission (Cross et al., 2017). Most low-cost PM2.5 data are collected via one of these two systems; the exception is a single independently deployed PA-II unit in Kinshasa, Democratic Republic of the Congo.
Collected data are down-averaged from their device-specific collection frequencies to a common hourly timescale.
Erroneous data identified either automatically (e.g. negative concentration values) or manually (e.g. devices exhibiting abnormal performance characteristics identified during periodic inspections) are removed. To correct for particle hygroscopic growth effects (i.e. the impact of ambient humidity on the PM mass as measured by the low-cost sensors), previously developed calibration methods were implemented for the NPM and PA-II sensors (these are described in detail by Malings et al., 2019b). Based on previous assessments of these methods (Malings et al., 2019b), hourly average PM2.5 concentration measurements from both sensors (after calibration) differed from those of co-located regulatory-grade instruments by about 4 µg/m³, on average, with low long-term biases (on the order of 1 µg/m³ for annual averages). For the Alphasense OPC sensors, raw bin count numbers were integrated to produce a new concentration estimate for PM2.5, and a similar relative humidity correction was applied (Di Antonio et al., 2018). Finally, an additional correction factor of 1.69 (for work-days) or 1.39 (for non-work-days) was applied to data collected by NPM sensors in Rwanda, based on previous results showing that current calibration methods tended to underestimate PM2.5 there (Subramanian et al., in preparation).
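The screening, hourly averaging, and Rwanda-specific scaling steps described above can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the processing code used in this work: the function names are hypothetical, the readings are made-up values, Monday to Friday is assumed to define a work-day, and the humidity calibration of Malings et al. (2019b) is not reproduced.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Correction factors quoted in the text for NPM data collected in Rwanda.
WORKDAY_FACTOR = 1.69
NON_WORKDAY_FACTOR = 1.39


def hourly_means(readings):
    """Average (timestamp, pm25) readings into hourly bins, dropping negatives."""
    bins = defaultdict(list)
    for timestamp, pm25 in readings:
        if pm25 < 0:  # automatic screening: negative concentrations are erroneous
            continue
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        bins[hour].append(pm25)
    return {hour: mean(values) for hour, values in sorted(bins.items())}


def rwanda_npm_correction(hour, pm25):
    """Scale an hourly value; Monday-Friday counted as work-days (assumption)."""
    factor = WORKDAY_FACTOR if hour.weekday() < 5 else NON_WORKDAY_FACTOR
    return pm25 * factor


# Made-up readings, for illustration only.
readings = [
    (datetime(2018, 1, 5, 10, 0), 20.0),   # a Friday
    (datetime(2018, 1, 5, 10, 15), 30.0),
    (datetime(2018, 1, 5, 10, 30), -1.0),  # negative value, removed
    (datetime(2018, 1, 6, 9, 45), 10.0),   # a Saturday
]
hourly = hourly_means(readings)
corrected = {h: rwanda_npm_correction(h, v) for h, v in hourly.items()}
```

In this sketch the two valid Friday readings average to 25 µg/m³ before the work-day factor is applied, while the Saturday reading is scaled by the non-work-day factor.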

Ground-based sampling locations
Surface PM2.5 data analyzed in this paper were collected in six different areas, as described below. Additional information about these areas is also provided in the supplemental information.

Pittsburgh, United States of America
This area represents the city of Pittsburgh, Pennsylvania, USA, as well as the surrounding Allegheny County. Data from this area were collected during the calendar year of 2018 (i.e. January 1, 2018 to December 31, 2018). All ground measurement locations for this area were contained within a rectangular region ranging from 40.1ºN, 80.5ºW to 40.8ºN, 79.7ºW. Low-cost monitoring data for this area were collected by a mixture of NPM and PA-II sensors, all of which were connected to RAMP monitors. During the data collection period, the number of active instruments in this area at any given time varied from 10 to 46. Calibration of these measurements was performed according to the methods described by Malings et al. (2019b), as summarized in Sect. 2.1.4.
In the Pittsburgh area, ground-level PM2.5 data were also available from a local regulatory-grade monitoring network operated by the Allegheny County Health Department (ACHD). These data are collected at five sites in Allegheny County, with Beta Attenuation Monitors (BAMs), a federal equivalent monitoring method, providing hourly concentration measurements for air quality index calculation purposes (Hacker, 2017; McDonnell, 2017). Nominally, such federal equivalent methods are required to be accurate within 10% of federal reference methods (Watson et al., 1998; US EPA, 2016). Since BAM data have been used to establish the calibration methods for low-cost PM sensor data, the data from the BAM instruments are used as provided for uniformity, without any additional corrections being applied.

Rwanda
Data collection in Rwanda occurred mainly in the capital city of Kigali, along with a single rural monitoring site co-located with the Mount Mugogo Climate Observatory in Musanze. Data in this area were collected between April 1, 2017 and May 27, 2018. The sites were located in a rectangle ranging from 2.2ºS, 29.4ºE to 1.4ºS, 30.5ºE. In this area, NPM sensors paired with RAMP monitors were used exclusively. A total of four ground sites were active in this area, with a maximum of three sites being active simultaneously.

Malawi
Data in Malawi were collected by three ARISense monitors using Alphasense OPC sensors, deployed to three locations in the vicinities of Lilongwe and Mulanje between June 25, 2017 and July 30, 2018. These sites were contained within a rectangular region spanning from 16.2ºS, 33.6ºE to 14.0ºS, 35.7ºE. The two locations in the vicinity of Mulanje are village center sites, and so may be influenced by nearby combustion activities.
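For readers wishing to subset satellite or ground data to the rectangular regions quoted above for Pittsburgh, Rwanda, and Malawi, a minimal Python sketch of a bounding-box test is given here. The dictionary layout and function name are hypothetical, and the example coordinates for Kigali and central Pittsburgh are approximate values chosen only for illustration.

```python
# Bounding boxes from the text as (south, west, north, east) in decimal degrees.
# Southern latitudes and western longitudes are written as negative numbers.
REGIONS = {
    "Pittsburgh": (40.1, -80.5, 40.8, -79.7),
    "Rwanda": (-2.2, 29.4, -1.4, 30.5),
    "Malawi": (-16.2, 33.6, -14.0, 35.7),
}


def in_region(name, lat, lon):
    """Return True if the point (lat, lon) lies inside the named rectangle."""
    south, west, north, east = REGIONS[name]
    return south <= lat <= north and west <= lon <= east


# Approximate, illustrative coordinates (not site locations from this study).
kigali_inside = in_region("Rwanda", -1.95, 30.06)
pittsburgh_inside = in_region("Pittsburgh", 40.44, -79.99)
kinshasa_in_rwanda_box = in_region("Rwanda", -4.3, 15.3)
```

The same test could be applied to satellite pixel centers to discard retrievals far from any ground site, although the matching procedure actually used is not specified in this section.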

Kinshasa, Democratic Republic of the Congo
Data in Kinshasa, Democratic Republic of the Congo were collected by a single PurpleAir PA-II sensor deployed at the US Embassy, at approximately 4.3ºS, 15.3ºE. This sensor was deployed independently, i.e. without an associated RAMP unit as in Pittsburgh. Temperature and humidity data were therefore obtained from the internal sensors within the device itself, and data connectivity was achieved using the local wireless internet network. Data from this device collected between March 20, 2018 and October 31, 2019 are used in this paper.

Kampala, Uganda
In Kampala, Uganda, regulatory-grade monitoring data collected at the US Embassy are used to provide ground comparison data for concentration estimates derived from satellite AOD data. The embassy is located at approximately 0.3ºN, 32.6ºE, and hourly data collected from January 1, 2019 to December 31, 2019 are used in this paper. These data are collected by BAM monitors, and no additional corrections have been applied.

Addis Ababa, Ethiopia
In Addis Ababa, Ethiopia, a regulatory-grade monitor deployed at the US Embassy is also used as a ground comparison data source, with data collected from January 1, 2019 to December 31, 2019 being used in this paper. The embassy is located at approximately 9.0ºN, 38.8ºE. These data are also collected by BAM monitors, and no additional corrections have been applied.

Satellite data
The satellite data product used in this paper is the MODIS MCD19A2v006 dataset, available through NASA's Earth Data Portal (earthdata.nasa.gov). This dataset consists of AOD information for the 470 nm and 550 nm wavelengths from the MODIS system, processed using the Multi-angle Implementation of Atmospheric Correction (MAIAC) algorithm, and presented at 1-kilometer pixel resolution for every overpass of either the Aqua or Terra satellites.
This represents a Level 2 data product, meaning that it includes geophysical variables derived from raw satellite data, but has not yet been transformed to a new temporal or spatial resolution, as is the case for data derived from multiple satellite passes, e.g. monthly average AOD data. Data from identified cloudy pixels are masked as part of the data product; possible misidentification of cloudy pixels is one source of error in relating surface PM2.5 and AOD. This dataset was chosen as it represents the highest possible spatial and temporal resolution for AOD, thus providing the most points for comparison with the high spatio-temporal resolution low-cost monitor data.

Conversion methods for satellite AOD data
A linear regression approach is used to establish relationships between satellite AOD and surface-level PM2.5. Let $y_{i,t}$ denote the ground-level PM2.5 measurement at location $i$ and time $t$, and let $\mathbf{x}_{i,t}$ represent the vector of satellite AOD measurements (i.e., the AOD measurements at 470 nm and 550 nm wavelengths, together with a "placeholder" constant of one to allow fitting of affine functions) corresponding to location $i$ and time $t$. The total set of ground measurement sites in an area, $S$, is partitioned into two disjoint subsets. Subset $S_{\mathrm{in}}$ represents the sites used to establish the linear relationship between AOD and surface PM2.5 concentrations. The remaining sites, in the subset $S_{\mathrm{ap}}$, are used for the application, i.e., to serve as an independent set to evaluate the performance of the linear relationship established from the $S_{\mathrm{in}}$ sites. Likewise, the time domain is partitioned into an initialization phase $T_{\mathrm{in}}$, during which linear relationships are established, and an application phase $T_{\mathrm{ap}}$, during which these relationships are applied and evaluated.
Linear relationships are determined as follows. First, satellite AOD data and surface PM2.5 monitor data from the $S_{\mathrm{in}}$ sites during the $T_{\mathrm{in}}$ phase are collected together:
$$X_{\mathrm{in}} = \left[\mathbf{x}_{i,t}^{\top}\right]_{i \in S_{\mathrm{in}},\, t \in T_{\mathrm{in}}}, \qquad Y_{\mathrm{in}} = \left[y_{i,t}\right]_{i \in S_{\mathrm{in}},\, t \in T_{\mathrm{in}}}$$
A linear relationship is established between these, defined by parameters $\boldsymbol{\beta}_{\mathrm{in}}$, using classical least-squares linear regression (e.g., Goldberger, 1980):
$$\boldsymbol{\beta}_{\mathrm{in}} = \left(X_{\mathrm{in}}^{\top} X_{\mathrm{in}}\right)^{-1} X_{\mathrm{in}}^{\top} Y_{\mathrm{in}}$$
The covariance matrix of the parameters, $\Sigma_{\mathrm{in}}$, is also obtained:
$$\Sigma_{\mathrm{in}} = \frac{\left(Y_{\mathrm{in}} - X_{\mathrm{in}} \boldsymbol{\beta}_{\mathrm{in}}\right)^{\top} \left(Y_{\mathrm{in}} - X_{\mathrm{in}} \boldsymbol{\beta}_{\mathrm{in}}\right)}{\mathrm{length}(Y_{\mathrm{in}}) - \mathrm{length}(\boldsymbol{\beta}_{\mathrm{in}})} \left(X_{\mathrm{in}}^{\top} X_{\mathrm{in}}\right)^{-1}$$
where length(•) is a function returning the number of elements in the input. During the application phase, the linear relationship can be used to estimate the surface PM2.5 concentration at location $i$ and time $t$, $\hat{y}_{i,t,\mathrm{prior}}$, from the satellite AOD data corresponding to that location and time:
$$\hat{y}_{i,t,\mathrm{prior}} = \mathbf{x}_{i,t}^{\top} \boldsymbol{\beta}_{\mathrm{in}}$$
The above procedure constitutes an offline or (in Bayesian terminology) prior conversion, i.e., it uses data collected during the initialization phase to define a single conversion factor which is applied throughout the application phase. An online or (in Bayesian terminology) posterior approach can also be adopted, in which this relationship is modified as additional data become available. This approach has been proposed by Lee et al. (2011) and evaluated by Han et al. (2018), and allows for the potentially time-varying relationship between satellite AOD and surface PM2.5 concentration to be accounted for.
In the online approach, for a time $t$ during the application phase, a new data set consisting of $X_{\mathrm{in},t}$ and $Y_{\mathrm{in},t}$ is created by combining all data available from the $S_{\mathrm{in}}$ ground sites together with satellite AOD data for that time:
$$X_{\mathrm{in},t} = \left[\mathbf{x}_{i,t}^{\top}\right]_{i \in S_{\mathrm{in}}}, \qquad Y_{\mathrm{in},t} = \left[y_{i,t}\right]_{i \in S_{\mathrm{in}}}$$
Based on these new data, a linear relationship is established for that time, as above:
$$\boldsymbol{\beta}_{t} = \left(X_{\mathrm{in},t}^{\top} X_{\mathrm{in},t}\right)^{-1} X_{\mathrm{in},t}^{\top} Y_{\mathrm{in},t}$$
This relationship is combined with the prior relationship established during the initialization phase (using a Bayesian approach and assuming normally-distributed parameter values) to establish a new posterior relationship specific to that time. In this update, diag(•) denotes a matrix diagonalization, and a relative error scale parameter defines how much "weight" is given to the time-specific relationship parameters versus the prior relationship parameters $\boldsymbol{\beta}_{\mathrm{in}}$ in the updating process. Both the offline and online approaches are used in this paper, and their performance is compared (see Sect. 3.1).
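As an illustration of the offline (prior) conversion described above, the following Python sketch fits an affine relationship between two-wavelength AOD and surface PM2.5 by least squares and then applies it at a held-out site. It is a minimal sketch under stated assumptions: the function names, site labels, and AOD values are hypothetical, the PM2.5 values are generated from a known synthetic relationship so that the fit can be checked, and the normal equations are solved with a small Gaussian elimination routine rather than a numerical library.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares parameters for targets ~ rows, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def design_row(aod_470, aod_550):
    """AOD at both wavelengths plus the constant one for the affine term."""
    return [aod_470, aod_550, 1.0]


def estimate_pm25(beta, aod_470, aod_550):
    """Apply fitted parameters to one satellite AOD observation."""
    return sum(b * x for b, x in zip(beta, design_row(aod_470, aod_550)))


def synthetic_pm25(aod_470, aod_550):
    """Made-up 'true' relationship used only to generate example data."""
    return 2.0 * aod_470 + 3.0 * aod_550 + 5.0


# (site, phase, AOD at 470 nm, AOD at 550 nm); all values are illustrative.
records = [
    ("site_A", "in", 0.1, 0.2),
    ("site_A", "in", 0.3, 0.1),
    ("site_B", "in", 0.5, 0.4),
    ("site_B", "in", 0.2, 0.6),
    ("site_C", "ap", 0.4, 0.3),
]
fit_sites = {"site_A", "site_B"}  # initialization sites; site_C is held out
fit = [(a, b) for site, phase, a, b in records if site in fit_sites and phase == "in"]
beta = fit_ols([design_row(a, b) for a, b in fit], [synthetic_pm25(a, b) for a, b in fit])
prior_estimate = estimate_pm25(beta, 0.4, 0.3)  # estimate at the held-out site
```

Because the example data are noise-free, the fitted parameters recover the synthetic coefficients; with real data the residuals would also be used to estimate the parameter covariance.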
This simple linear correction factor method does not explicitly account for vertical distribution profiles, cloud cover, or any other variables which affect the relationship of AOD to surface PM2.5. Instead, the aggregate effect of these variables is accounted for implicitly in an empirical relationship. The offline approach uses fixed relationships, which cannot account for time-varying effects such as changes in vertical distribution profiles. The online approach can account for these time-varying effects to some degree, assuming their observed impact on the AOD to surface PM2.5 relationship at the initialization sites is representative of their short-term impact throughout the region where the corresponding correction factors are applied.
Finally, note that all parameters described above can be solved for analytically using the equations presented in this section (i.e. no iterative or approximate solution methods are necessary).
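To illustrate why the online (posterior) update also has a closed-form solution, the Python sketch below combines two normally distributed parameter estimates by precision weighting. This is a generic textbook update and should not be read as the exact formulation used in this paper: it assumes independent (diagonal) parameter uncertainties, and the function names, the choice of standard deviation equal to `tau` times the parameter magnitude, and all numerical values are hypothetical.

```python
def relative_variances(beta, tau):
    """Hypothetical choice: std. dev. of each parameter is tau * |parameter|."""
    # A small floor avoids a zero variance when a parameter is exactly zero.
    return [max((tau * b) ** 2, 1e-12) for b in beta]


def combine_gaussian(beta_prior, var_prior, beta_new, var_new):
    """Precision-weighted mean and variance of two independent Gaussian estimates."""
    post_mean, post_var = [], []
    for bp, vp, bn, vn in zip(beta_prior, var_prior, beta_new, var_new):
        precision = 1.0 / vp + 1.0 / vn
        post_mean.append((bp / vp + bn / vn) / precision)
        post_var.append(1.0 / precision)
    return post_mean, post_var


# Illustrative numbers only: prior parameters and their variances, plus
# parameters fitted to a single satellite pass.
beta_in = [2.0, 3.0, 5.0]
var_in = [0.04, 0.09, 0.25]
beta_pass = [4.0, 3.0, 7.0]
var_pass = relative_variances(beta_pass, tau=0.1)

beta_post, var_post = combine_gaussian(beta_in, var_in, beta_pass, var_pass)
```

A larger `tau` inflates the uncertainty assigned to the single-pass parameters, so the posterior stays closer to the prior; a smaller `tau` lets each pass pull the relationship more strongly.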

Results
In this section, we apply the proposed method for satellite AOD to surface PM2.5 concentration conversion in several use cases. In Sect. 3.1, 3.2, and 3.3, we assess the performance in Pittsburgh, comparing the use of regulatory-grade monitors and low-cost monitors as ground sites for establishing conversion factors. In Sect. 3.4 and 3.5, we extend the comparison to Rwanda, examining the impact of using the relatively sparser low-cost sensor network there, and examining seasonal variations in the conversions. Finally, in Sect. 3.6, we examine the generalization of Rwanda-based conversion factors to other locations across SSA. Assessment metrics used in this section, including correlation (r²), coefficient of variation of the mean absolute error (CvMAE), and mean-normalized bias (MNB), are described in the supplemental information.
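Since the metric definitions are given only in the supplemental information, the Python sketch below uses commonly adopted forms as an assumption: r² as the squared Pearson correlation, CvMAE as the mean absolute error divided by the mean observed concentration, and MNB as the mean error divided by the mean observed concentration. The function names and example values are hypothetical.

```python
from statistics import mean


def r_squared(est, obs):
    """Squared Pearson correlation between estimates and observations."""
    me, mo = mean(est), mean(obs)
    cov = sum((e - me) * (o - mo) for e, o in zip(est, obs))
    var_e = sum((e - me) ** 2 for e in est)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov * cov / (var_e * var_o)


def cvmae(est, obs):
    """Mean absolute error normalized by the mean observation (assumed form)."""
    return mean(abs(e - o) for e, o in zip(est, obs)) / mean(obs)


def mnb(est, obs):
    """Mean error normalized by the mean observation (assumed form)."""
    return mean(e - o for e, o in zip(est, obs)) / mean(obs)


# Illustrative values in µg/m³, not results from this study.
observed = [11.0, 12.0, 13.0]
estimated = [10.0, 13.0, 14.0]
scores = {
    "r2": r_squared(estimated, observed),
    "CvMAE": cvmae(estimated, observed),
    "MNB": mnb(estimated, observed),
}
```

Under these assumed definitions, a CvMAE of 0.5 would mean that the typical absolute error is half of the average observed concentration.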

Comparing the use of regulatory and low-cost monitors as ground stations to develop conversion factors for AOD
We first evaluate the utility of low-cost sensors as substitutes for regulatory-grade monitors when developing factors to convert satellite AOD data to surface PM2.5 estimates, using the Pittsburgh area as our case study. The five ACHD regulatory monitoring locations are used to assess the performance of the satellite AOD conversion in all cases. First, we use these same locations to develop the conversion factors; in this case, we use four of five locations to develop a conversion factor, and apply it to the fifth. All sites are rotated through in this manner, providing a performance metric assessed for each site.
Second, we use low-cost sensors for developing the conversion factor; in this case, we select a subset of four locations where RAMP low-cost monitors are deployed, so that the number of ground sites used matches the number of ACHD sites used in the other case. These low-cost monitor locations are chosen to provide a similar spatial coverage over Allegheny County as the ACHD sites, although monitors co-located with ACHD sites were specifically not chosen, to allow for a fairer comparison when performance is assessed against the ACHD network (as a measurement will never be available at the exact location where the concentration is to be estimated, as was the case when the ACHD sites alone were used). In this case, a conversion factor developed using the four low-cost sensor sites is applied across all five ACHD sites, with performance assessed at each site. A diagram of this procedure is provided in the supplemental information (Fig. S6).
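The two evaluation designs described above can be summarized in a short Python sketch. This is only an illustration of the site bookkeeping: the function name and the site labels are placeholders, not the actual ACHD or RAMP site identifiers.

```python
def rotation_splits(sites):
    """Yield (fit_sites, test_site) pairs in which each site is held out once."""
    for held_out in sites:
        yield [s for s in sites if s != held_out], held_out


# Placeholder labels standing in for the five regulatory and four low-cost sites.
achd_sites = ["ACHD-1", "ACHD-2", "ACHD-3", "ACHD-4", "ACHD-5"]
lowcost_sites = ["RAMP-a", "RAMP-b", "RAMP-c", "RAMP-d"]

# Case 1: fit on four regulatory sites, evaluate on the fifth, rotating through all.
case_1 = list(rotation_splits(achd_sites))

# Case 2: fit once on the four low-cost sites, evaluate at every regulatory site.
case_2 = [(lowcost_sites, test_site) for test_site in achd_sites]
```

In both cases a conversion is evaluated at each of the five regulatory sites, and the evaluation site is never among the sites used for fitting.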
Different application cases of the satellite AOD conversion method were also tested. For a "yearly" conversion, data from the entire calendar year are used to develop the conversion factors, while in the "monthly" case, data from the previous month are used to develop conversion factors used in the current month; the median performance across months is presented. Although the "yearly" case would technically require having access to data which have not yet been collected (assuming this method is being applied for data collected in the current year), we use this to represent a case where data from a previous year are used to develop conversions applied to the current year, as we assume that the annual average AOD to surface PM2.5 concentration relationship for a given area will not significantly change from one year to the next. In addition, we also assess the relative performance of the offline (prior) conversion factors, where the same relationship parameters determined during the initialization period are applied to the entire application period, and the online (posterior) conversion, where these initial parameters are modified based on the AOD to surface PM2.5 relationships specific to each individual satellite pass.
Overall, these results indicate relatively weak relationships between satellite AOD and surface PM2.5 for Pittsburgh, regardless of the method used. Correlations are weak (r² < 0.3), and mean absolute errors are on the order of half to three-quarters the concentration values (annual average concentrations range from about 10 to 12 µg/m³ across most of Pittsburgh). However, these results are consistent with similar comparisons conducted between hourly AOD and surface PM2.5 in the eastern United States, which found r² between 0.04 and 0.36 depending on season and location (Zhang et al., 2009). Biases are low on average, but can vary across locations. In comparing the different application modes, it seems that the "posterior" method provides slightly worse performance, especially on ACHD data, than the "prior" method. This suggests that variability in AOD to surface PM2.5 relationships between satellite passes (due for example to differences in the vertical profile of PM2.5 over the area, and/or to differences between "point" measurements of the ground monitors and "area" AOD measurements) is not being well captured through the "posterior" method, i.e., that the additional uncertainty incurred by calibrating relationships using satellite data from a single pass (versus relying only on the more robust calibration from multiple passes as in the "prior" method) tends to degrade performance. This may be due to the specific conditions of Pittsburgh, however; the comparatively low PM2.5 concentrations in this area (averaging less than 10 µg/m³ during the study period) may reduce the signal-to-noise ratio to the point where the noise is dominant.
In all cases, performance using low-cost sensor data is comparable or superior to that of the same conversion approaches utilizing the regulatory-grade instruments. Thus, there is no evidence from this analysis of any inherent disadvantage to the use of low-cost sensors to provide ground data as compared to more traditional instruments. Data quality differences between low-cost sensors and regulatory-grade instruments seem negligible compared to the difficulties associated with relating satellite AOD to surface-level PM2.5, and therefore have little to no impact on the performance of the assessed conversion methods, at least for this study area.

How many ground stations are needed to improve surface PM2.5 estimates from AOD data?
A significant advantage of low-cost monitors compared to traditional instruments is the ability to deploy dense networks of the former for the same cost as a sparse network of the latter. To assess the potential benefits of this in terms of conversion of satellite AOD data to surface PM2.5, we analyze the effect of the number of surface sites used on the performance of the surface PM2.5 estimates from AOD conversion. We again examine the Pittsburgh region, and take the ACHD regulatory monitoring network as the "ground truth" against which performance is assessed. Here, the number of ground sites is varied, with sites being chosen from the set of possible sites. For the ACHD network, the possible sites are the ACHD sites minus the one site against which performance is assessed (all ACHD sites are rotated through); this is schematically shown in Fig. S7. For the low-cost sensors, the possible sites are all RAMP deployment locations in the area. Subsets of varying size are randomly selected (10 different random set selections are used in this example); the mean of the performance metric across these 10 randomly selected sets is used as the assessed performance (as depicted in Fig. S8). In this case, a monthly offline conversion factor is used (with the factor developed in one month being applied in the following month without modification). Figure 3 shows results of this assessment in terms of the CvMAE metric.
For small numbers of ground sites, results for the ACHD network and the low-cost sensor network are similar in terms of mean performance across different randomly selected subsets of the network. The spread in performance across selected sites is lower for the ACHD network; this is related to the smaller number of possible combinations of ACHD sites to be randomly selected compared to the RAMP sites, which would lead to lower variability in the results. The limited number of ACHD sites prevents this analysis from being carried forward to larger numbers of locations; at four chosen locations, there is only one possible combination to be selected, and so the spread in performance collapses to match the mean. With the low-cost sensor network, as more ground sites are included, mean performance stays roughly constant, but performance variability decreases, indicating that by adding additional ground sites, even sites positioned at random throughout the domain, the conversion relationship becomes increasingly robust. In particular, while for a single ground monitor, worst-case CvMAE is on the order of 1.5 to 2, with 10 or more monitors, worst-case CvMAE improves to below 0.8, a more than two-fold improvement in worst-case performance. This performance increase slows beyond about 15 ground stations, indicating that this may be an optimal density (at least in the Pittsburgh area) for ground sites for establishing conversion relationships to satellite AOD data. Overall, this demonstrates the potential benefits of dense low-cost sensor networks for conversion of satellite AOD data, even over a limited spatial domain (covering about 600 square kilometers).
Furthermore, it shows that even with quasi-random placement of the ground sites, such as might be achieved by citizens making personal decisions to deploy low-cost monitors on their own properties, increasingly robust conversion results can be achieved as more sensors are included, although these benefits diminish beyond (in this case) 15 monitors across 600 square kilometers.
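The subset experiment in this section can be sketched as follows: for each network size, several random subsets of ground sites are drawn, a conversion is fitted on each, and the mean and worst-case CvMAE at a held-out target site are recorded. This is an illustrative sketch only; the linear fit, the CvMAE definition (MAE divided by mean observation), and the name `subset_spread` are assumptions rather than the procedure actually used for Fig. 3.

```python
import numpy as np

def subset_spread(train_sites, target, sizes, n_draws=10, seed=0):
    """Return {size: (mean CvMAE, worst CvMAE)} over random site subsets.

    train_sites maps a site name to (aod array, pm array); target is the
    (aod array, pm array) pair of the site where performance is assessed.
    """
    rng = np.random.default_rng(seed)
    names = sorted(train_sites)
    target_aod, target_pm = target
    out = {}
    for n in sizes:
        scores = []
        for _ in range(n_draws):
            # Draw n distinct sites at random for this realization.
            idx = rng.choice(len(names), size=n, replace=False)
            aod = np.concatenate([train_sites[names[i]][0] for i in idx])
            pm = np.concatenate([train_sites[names[i]][1] for i in idx])
            slope, intercept = np.polyfit(aod, pm, 1)
            est = slope * target_aod + intercept
            scores.append(np.mean(np.abs(est - target_pm)) / np.mean(target_pm))
        out[n] = (float(np.mean(scores)), float(np.max(scores)))
    return out
```

When the subset size equals the number of available sites, every draw selects the same set, so the worst case coincides with the mean; this mirrors the collapse of the spread noted for the ACHD network at four locations.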

Comparison of AOD-based surface PM2.5 to measurements from a dense ground network
In this section, we assess the benefits of combining satellite AOD and ground-based sensor data, as compared to using ground-based sensor data alone. For this assessment, we compare estimates of surface PM2.5 derived from satellite AOD data, using the methods presented previously in this paper, with estimates based on the surface PM2.5 measurements alone, which we denote as "nearest monitor" estimates. For this estimation, we make use of a locally constant or naïve interpolation, in which the surface PM2.5 estimate for a given time and location is the same as the measurement of the nearest available ground monitor (i.e., one of the ground monitors used for establishing conversion factors for the satellite AOD data) at that time:
$$\widehat{\mathrm{PM}}_{2.5}(i, t) = \mathrm{PM}_{2.5}(k^{*}, t), \qquad k^{*} = \arg\min_{k} \, \mathrm{dist}(i, k),$$
where dist(i, k) indicates the distance between locations i and k, and argmin denotes the input which minimizes this objective. Performance of both this nearest monitor method and the satellite AOD conversion method is assessed for Pittsburgh in Fig. 4. In this case, the low-cost sensor data are used to represent the "ground truth" against which performance is assessed.
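A minimal sketch of the nearest monitor estimate follows. The distance function is not specified in this section, so great-circle (haversine) distance is used here as an assumption; the function names and the use of `None` to mark a monitor with no measurement at the given time are likewise hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (assumed choice of dist)."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_monitor_estimate(target, monitors):
    """Return (name, pm) of the closest monitor reporting at this time.

    target is (lat, lon); monitors maps a name to ((lat, lon), pm), where
    pm is None if the monitor has no measurement at the time of interest.
    """
    available = {k: v for k, v in monitors.items() if v[1] is not None}
    if not available:
        raise ValueError("no monitor reported a measurement at this time")
    nearest = min(available, key=lambda k: haversine_km(*target, *available[k][0]))
    return nearest, available[nearest][1]

# Illustrative coordinates and concentrations only (µg/m³).
monitors = {
    "A": ((40.45, -79.98), 9.0),
    "B": ((40.60, -80.20), 12.0),
    "C": ((40.441, -79.991), None),  # closest, but not reporting
}
print(nearest_monitor_estimate((40.44, -79.99), monitors))
```

Restricting the search to monitors that report at the time of interest reflects the "nearest available" wording above: a closer monitor with no measurement is skipped.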
Again, conversion factors are developed and applied on a monthly basis. All but one of the low-cost sensor sites are used for development of these factors, with application and assessment on the final site; it should be noted that this represents a greater number of ground sites than was evaluated in Sect. 3.1, leading to improved performance following the trend noted in Sect. 3.2. These sites are then cycled through, to provide performance metrics across all sites. To allow for comparability between the nearest monitor approach and surface PM2.5 estimation from satellite AOD, we make use of the same set of ground sites for both, i.e., for each site, data from the closest available sites are used as inputs to the nearest monitor method, and all sites are cycled through in this manner, providing performance metrics for each site as above. A diagram of this procedure is provided in the supplemental information, Fig. S9.
In Pittsburgh, we see reduced performance (lower correlation, larger CvMAE, larger spread in the bias) when using converted satellite data as compared to nearest monitor data. This is likely a result of the quite dense network of low-cost sensors present in Pittsburgh, where the median distance between sensors in the network is about 1 km. With this dense network, there is a good chance that the nearest ground monitor will be quite close to the location at which concentrations are to be estimated, and the resulting estimate is therefore likely to be quite good, as PM concentrations tend to be homogeneous at this spatial scale in Pittsburgh (Li et al., 2019). When PM2.5 is instead estimated from satellite data, spatial and temporal variability in surface PM2.5 to AOD relationships is introduced, which can confound the assessment. This is especially important considering the relatively low levels of surface PM2.5 concentration and AOD in and above Pittsburgh, meaning that any introduced noise will be relatively large in proportion to the signal being assessed.

The utility of AOD-based surface PM2.5 in regions with a sparse ground monitoring network
Performance of the nearest monitor method and the satellite AOD conversion method is assessed for Rwanda in Fig. 5, in a similar manner as was done for Pittsburgh in Fig. 4. In Rwanda, we see an improvement across all metrics (slightly higher correlation, much smaller CvMAE, less spread in bias) as satellite data are combined with surface PM2.5 monitor data. In particular, median CvMAE is reduced from about 0.5 to 0.3, a 40% improvement. Because of the relative sparsity of the low-cost monitor network in Rwanda (4 measurement sites, not all of which were simultaneously operational) compared to that in Pittsburgh, the assumption of spatial homogeneity of concentrations between monitoring sites is less valid, and so the inclusion of satellite data is beneficial in resolving these spatial differences. Furthermore, the relatively high levels of PM2.5 concentration in Rwanda (average of about 40 µg/m³ over the study period) allow for a higher signal-to-noise ratio relative to Pittsburgh. Together, these results indicate the high utility of low-cost sensors, used in conjunction with satellite data, when these are deployed even in relatively sparse networks to previously unmonitored areas with high surface PM2.5 concentrations.

Seasonal effects on satellite AOD conversion to surface PM2.5
Changing seasons can affect the relationship between satellite AOD and surface PM2.5 due to changes in confounding factors like surface reflectance, aerosol vertical profiles, and particle composition. Many of these seasonally varying factors are not accounted for in current AOD retrievals. Here, we assess the utility of developing seasonal AOD conversion factors for Pittsburgh and Rwanda. For this assessment, conversions are developed and applied in specific seasons (information on these seasons is presented in the supplemental information). For Pittsburgh, these approximately correspond to a winter, spring, summer, and fall season, while in Rwanda, these represent alternating wet and dry seasons.
For Pittsburgh, the major differences between seasons are related to temperature, with humidity varying to a lesser degree, as depicted in Fig. S2. In Rwanda, temperatures are relatively stable year-round, with seasons mainly differentiated by humidity changes (although the second dry season appears to have been unusually wet, comparable to the previous wet season).
RAMP data are used to represent "ground truth" concentrations for both areas. An offline or "prior" approach is used here, i.e., calibrations are not modified based on data collected within the application period, in order to investigate the effect of generalizing a calibration developed in one season to a different season. Metrics are assessed for each individual site in each area, with all other sites being used to establish AOD conversion factors as in the previous section. The median results across all sites are presented in Fig. 6 for each combination of initialization and application season.
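The initialization-by-application layout of this assessment can be sketched as a matrix in which entry (i, j) is the score obtained when a conversion fitted in season i is applied, unmodified, in season j. This is a simplified sketch under stated assumptions: a single linear fit per season, CvMAE as MAE divided by mean observation, a row/column orientation chosen for illustration, and a hypothetical function name `season_matrix`; the per-site rotation and median across sites are omitted.

```python
import numpy as np

def season_matrix(data, seasons):
    """Cross-season CvMAE matrix for an offline ("prior") conversion.

    data maps a season label to (train_aod, train_pm, test_aod, test_pm).
    Rows index the initialization season, columns the application season,
    so diagonal entries correspond to the same season.
    """
    n = len(seasons)
    m = np.full((n, n), np.nan)
    for i, init in enumerate(seasons):
        tr_aod, tr_pm, _, _ = data[init]
        slope, intercept = np.polyfit(tr_aod, tr_pm, 1)
        for j, app in enumerate(seasons):
            _, _, te_aod, te_pm = data[app]
            est = slope * te_aod + intercept  # factor applied without modification
            m[i, j] = np.mean(np.abs(est - te_pm)) / np.mean(te_pm)
    return m
```

Entries left as NaN would correspond to combinations that cannot be assessed, for example when no independent site is available in a given season.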
For Pittsburgh, the summertime conversion factors perform best across all seasons, while the wintertime conversion factor performs worst (except when applied to winter). Thus, while there are some seasonal differences, a conversion factor developed during summer (or a conversion factor developed over the course of spring through fall) might generalize reasonably well to the entire year. In Rwanda, an alternating pattern is revealed, with wet season conversion factors applying well to other wet seasons, and dry season conversion factors applying to other dry seasons. Many factors could contribute to this pattern, including changes in humidity and the resulting impact on extinction, as well as seasonal burning patterns affecting particle sizes and compositions. Conversion factors appear to generalize better between wet seasons than between dry seasons. Correlations are highest during the first dry season (DS1), regardless of when the conversion factor is developed; this was also the driest season and the season with the highest PM2.5 concentrations of the seasons measured. Applications of conversion factors developed in other seasons to DS1 underestimate PM2.5 in this season, especially applications of factors developed during the wet seasons (when PM2.5 levels were much lower). This indicates that there is seasonality to PM2.5 concentrations which is not being reflected in the AOD data alone, and would require local monitoring to identify. Overall, these results indicate that conversion factors should be developed or updated at least on a seasonal basis, especially in Rwanda; a conversion factor developed during a limited monitoring campaign occurring in one specific season may fail to generalize well to other seasons.

Regional generalization of AOD conversion factors developed in Rwanda
Finally, given the lack of ground-based monitoring in many parts of SSA, we assess whether a conversion factor developed in one city can be generalized to other cities and towns across SSA. Here, a single AOD conversion factor is developed using one site in Kigali, Rwanda, and this factor is applied without modification to other sites across SSA. These include a second site in Kigali, a site in Musanze in rural Rwanda, a site in Kinshasa (DR Congo), and three sites in Malawi (one near the urban area of Lilongwe and two other sites in more rural areas to the south, near Mulanje) where low-cost sensor systems are deployed. There are also two sites (Kampala, Uganda and Addis Ababa, Ethiopia) where hourly-resolution long-term regulatory-grade monitoring data are available; data from these sites are included for comparative purposes. An offline approach is used here, with a single factor being initialized over the entire study period. Results are presented in Fig. 7.
Correlation is relatively low across all application areas, with a weak decreasing trend as distance from the initialization site increases (the exception to this is found at the Mugogo site). Best performance in terms of CvMAE and normalized bias is found in Kigali, Kampala, and Kinshasa; these urban zones are likely most similar to the initialization site in terms of land use and resulting source mix; relatively best performance is found at the Kigali site, which is much closer spatially. The Kampala site, with data collected via a traditional monitoring instrument, shows results similar to those obtained at these other urban sites where low-cost monitors are used. The other, more rural locations show poorer performance regardless of distance from the initialization site. However, the Addis Ababa site also shows much poorer performance, despite also being an urban area. This may be due to climate differences between Addis Ababa and the other cities considered, as well as differences in particle composition and size distributions, especially higher contribution to AOD from coarse (larger than
These results indicate that, while conversion factors may generalize to sites with similar land use characteristics, physical distance alone is not as significant in determining AOD-PM relationship generalizability. Also, the overall low correlation values indicate the importance of local data, as spatial heterogeneity in satellite AOD to surface PM2.5 relationships can be a concern even for nearby sites. Finally, it should be noted that a single annual conversion factor, as is assessed here, could fail to take into account seasonal variabilities (Sect. 3.5) and so can correlate poorly with surface PM2.5 even in or near the area where it is developed (as seen for the Kigali site here). A conversion factor which varies on at least a seasonal basis is therefore preferred; however, determining how to generalize such a time-varying conversion factor to other regions where seasonal definitions and characteristics can be quite different is a challenging problem.

Discussion
We have examined the feasibility of using low-cost sensors as a data source in developing relationships between surface PM2.5 concentrations and satellite AOD measurements. In a case study in Pittsburgh, there was no decrease in performance associated with the use of low-cost sensors for this purpose rather than more traditional regulatory-grade monitors.
Furthermore, the increased density of ground sites possible using low-cost sensors provided benefits in terms of more robust conversion factors compared to the more sparsely deployed traditional monitoring network. However, it was found that for Pittsburgh, with a relatively dense low-cost sensor network and low PM2.5 concentrations, use of the nearest ground measurement sites outperformed the use of satellite AOD data to estimate surface PM2.5. Partly, this could be because AOD is rather low over this area (average of about 0.2), leading to lower signal-to-noise ratios which reduce AOD to surface PM correlation. Conversely, in Rwanda, where PM2.5 concentrations are higher and more variable, a relatively sparse low-cost sensor network combined with satellite data provided better estimates of surface PM2.5 concentrations than were available using only the nearest surface monitor. This is highly relevant to SSA, as sparse local monitoring and high average PM2.5 concentrations (as measured by the few available ground-based monitors) are common features. Differences in seasonal characteristics (especially at the Rwanda locations) show the added value of season-specific conversion factors, while differences in characteristics between areas, especially urban and rural locations with highly variable particle types, limit the generalizability of conversion factors across regions.
It should be noted that the results of this paper pertain to local and instantaneous relationships, using the highest spatial and temporal resolution of satellite data currently available. Results may differ for spatially or temporally aggregated satellite and ground site data. In particular, such spatial and temporal aggregation is likely to reduce the impact of noise (but not bias) both from low-cost instruments and from satellite retrievals. However, such aggregate information does not take advantage of the potential inherent in low-cost sensors to provide near-real-time information on local air pollution. On a related point, satellite data (at least, for most of the world using current platforms) cannot provide diurnal concentration profiles, instead presenting a "snapshot" of concentrations for a wide spatial domain but only for a specific time of day. Ground-based monitoring, including monitoring with low-cost sensors, will still be essential for this function, at least until new geostationary platforms with truly global coverage are available (Judd et al., 2018; She et al., 2020). Such satellites are planned for coverage of North America (the TEMPO satellite mission), Europe (Sentinel 4), and East Asia (GEMS); unfortunately, there are no current plans for coverage of Africa by similar satellites.
The results presented here continue to highlight the need for ground-based PM2.5 monitoring in previously unmonitored areas such as SSA, especially in light of the benefits observed in Rwanda from having even a sparse ground monitoring network combined with satellite data on local spatial heterogeneity. These efforts should make use of traditional regulatory-grade instruments wherever possible, supplemented with low-cost monitors to increase network density and extend spatial coverage. Findings in Pittsburgh indicate that denser monitoring networks, such as those made possible by low-cost sensors, improve accuracy and robustness of surface PM2.5 estimates from satellites (up to a certain point of diminishing returns).
Verification that the same trend will hold in other regions, especially in SSA, requires further dense deployments of low-cost sensors, and is the subject of ongoing work.
Further technical and research developments in this area have enormous promise for improving the understanding of local air quality worldwide. A functioning system for converting satellite to ground-level air pollution data, relying on a group of "trusted" ground data sources, could be a valuable resource for assessing and correcting low-cost sensor data, allowing for in-field recalibration of drifting instruments, and better identification of malfunctioning sensors. Low-cost systems combining PM mass measurement and ground-up AOD data can help to establish AOD to surface PM relationships at finer spatio-temporal resolution (Ford et al., 2019). Open questions related to this research area include finding appropriate timescales over which conversion factors can be considered constant within regions as well as continuing to examine the question of conversion factor generalizability between regions separated by spatial distances and across different climates and land use characteristics. More sophisticated conversion methods incorporating meteorological and land use information and outputs of chemical transport models can also be considered, albeit with the recognition that some of these inputs may not yet be readily available or well validated for SSA.

Code and data availability
Data related to the results and figures presented in this paper are available online at https://doi.org/10.5281/zenodo.3691833.
Code related to the analysis of data and generation of figures is also provided at the same site.

Note that, in Rwanda, only one sensor was operational during Dry Season 2 (DS2) and Wet Season 3 (WS3), and so application of these conversions to an independent site was impossible; therefore, performance metrics are blacked out. In each figure, diagonal elements (from top left to bottom right) correspond to the same season.