Supplemental Material for: Combining low-cost, surface-based aerosol monitors with size-resolved satellite data for air quality applications

1 Senseable City Lab, Massachusetts Institute of Technology, Cambridge MA, United States 2 Earth Sciences Division, NASA Goddard Space Flight Center, Greenbelt, Maryland 20771, United States 3* School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, United States 4 Pontifícia Universidade Católica do Paraná, Brazil * Now at: Department of Physics and Astronomy, University of Leicester, Leicester, UK. ϯ Correspondence to: Priyanka deSouza (desouzap@mit.edu)

Particle hygroscopicity is a consideration, as the OPCs operate at ambient relative humidity (RH), whereas PM 2.5 is assessed as dry mass. The Alphasense OPC-N2 sizing does depend on RH and aerosol hygroscopicity (Crilley et al., 2018; Hagan et al., 2019). However, the ambient average RH for the current study was 34% (Table S2); RH this low is unlikely to affect the OPC sizing. Further, previous studies (Gatari et al., 2005, 2009) found that organic carbon and black carbon dominate the urban air pollution particles in Nairobi, which are sourced from various types of combustion. The higher the organic content of the particles, the less hygroscopic they tend to be (e.g., Samset et al., 2018). This means that the particles present are likely to have low hygroscopicity.
Figure S2 shows the derived PM 2.5 time series (in μg/m 3 ) for each Nairobi site from May 1 2016 to March 2 2017. PM data are recorded every minute. (The actual sampling time of the OPC-N2 is 5 seconds, after a 20 second warm-up period.) Some of the registered PM spikes are as high as 1000 μg/m 3 . To smooth the time series and suppress outliers that are likely due to instrument noise, we averaged the OPC measurements over 30-minute time intervals.
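The 30-minute averaging step can be illustrated with a minimal sketch. The function name `average_in_bins`, the bin alignment to the clock half-hour, and the sample values are assumptions made for illustration; they are not taken from the processing code used in the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def average_in_bins(readings, bin_minutes=30):
    """Average (timestamp, pm25) pairs in fixed clock-aligned bins.

    bin_minutes is assumed to divide 60 evenly (hypothetical helper).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for ts, pm in readings:
        # Floor the timestamp to the start of its bin.
        start = ts.replace(minute=ts.minute - ts.minute % bin_minutes,
                           second=0, microsecond=0)
        sums[start][0] += pm
        sums[start][1] += 1
    return {start: total / count
            for start, (total, count) in sorted(sums.items())}


# Hypothetical one-minute readings: a flat 20 ug/m3 with one 1000 ug/m3 spike.
t0 = datetime(2016, 5, 1, 10, 0)
readings = [(t0 + timedelta(minutes=i), 20.0) for i in range(60)]
readings[5] = (readings[5][0], 1000.0)
binned = average_in_bins(readings)
```

In this example the single spike raises the first 30-minute mean to about 52.7 μg/m 3 rather than 1000 μg/m 3 , which shows how averaging damps isolated outliers without removing them.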
Co-locating the OPC with a reference monitor to obtain high-quality PM data would be required to calibrate the raw OPC measurements and distinguish the signal from noise directly. However, this would be costly and possibly time-consuming (Castell et al., 2017; Rai et al., 2017). Due to limited resources, we were unable to co-locate our low-cost sensors with a reference monitor in Nairobi. As such, we rely primarily upon the more robust raw particle counts per size bin reported by the monitors, rather than the reported PM 2.5 .
(Limbacher and Kahn, 2014; with the standard set of 74 mixtures. Each mixture is composed of up to three of eight aerosol components (Kahn et al., 2010; Table S3). Separate MISR RA AOD retrievals, reported in the MISR green band (centered at 558 nm wavelength), were obtained for all eight MISR aerosol components (Table S3) for pixels within a radius of 1.6 km from each of the five Nairobi surface monitoring sites.
Over the course of the study period, we had only 28 successful MISR retrievals, during eight unique days. This is the main motivation for combining MISR retrievals with those of MODIS, as described below. The relatively low AOD over Nairobi during the study period makes estimating aerosol-type uncertainty difficult. We restricted the MISR retrievals used in this study to the 10 that have total MISR AOD ≥ 0.15 (over three unique days), i.e., those meeting the good-quality MISR aerosol property criterion (Kahn et al., 2010).
Figure S3: Examination of monthly particle-type variability for three cases where MISR obtained two retrievals at a given site in the same month. The fraction of the total near-surface AOD (AOD N-S ) ascribed to different MISR aerosol components is plotted for each case. Note that Components 2, 8, and 14 are aggregated, as discussed in Section 2.2.1 in the main text.
Unfortunately, we did not have multiple MISR observations with AOD ≥ 0.15 corresponding to a specific site in the same month. Therefore, we used the available MISR component AOD values obtained for a given site to represent the 'monthly effective' particle size distribution for aerosols over that site. We excluded months when there were no MISR retrievals for a site. As such, we assume the fractional contributions of the near-surface MISR aerosol components to the total AOD remain nearly unchanged in the study region for a given month, even if the total AOD varies. We tested this assumption to some extent using all 28 MISR retrievals, as we do have several instances of two observations in a month over a specific site if we include the lower-AOD cases. Figure S3 shows the fractional AOD values for different MISR aerosol components where two observations over a specific site were made in the same month. It can be seen that the fractions are very similar, which lends some credence to our particle property assumption.
Figure S4 shows the MISR columnar AOD for the 10 cases where the average total AOD ≥ 0.15. Small, spherical, light-absorbing (components 8 and 14) and non-absorbing (component 2) particles are characteristic of organic aerosols and sulfate/nitrate, respectively (Kahn and Gaitley, 2015), and are typical of urban pollution particles. The component AOD values for these particle types are relatively high for all coincident observations, consistent with our expectation that local pollution dominates the aerosol loading in the Nairobi region (see Section S1.3 on the GEOS-Chem model, below). Further, we know that the primary aerosol source in Nairobi during the study period is local pollution, so, lacking other constraints, assuming the particle type is constant on a monthly basis seems plausible.
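The comparison of component fractions between two retrievals in the same month can be sketched as follows. This is an illustrative sketch only: the helper names and all AOD values are hypothetical, and the aggregation of components 2, 8, and 14 simply mirrors the grouping mentioned in the Figure S3 caption.

```python
def component_fractions(component_aod, aggregate=(2, 8, 14)):
    """Return each component's share of the total AOD.

    component_aod maps MISR component number -> AOD (hypothetical input).
    Components listed in `aggregate` are summed into one group.
    """
    total = sum(component_aod.values())
    if total <= 0:
        raise ValueError("total AOD must be positive")
    fractions = {"2+8+14": sum(aod for comp, aod in component_aod.items()
                               if comp in aggregate) / total}
    for comp, aod in component_aod.items():
        if comp not in aggregate:
            fractions[str(comp)] = aod / total
    return fractions


def max_fraction_difference(frac_a, frac_b):
    """Largest absolute difference in component fraction between two retrievals."""
    keys = set(frac_a) | set(frac_b)
    return max(abs(frac_a.get(k, 0.0) - frac_b.get(k, 0.0)) for k in keys)


# Hypothetical component AODs for two retrievals at one site in one month.
retrieval_a = {2: 0.05, 8: 0.03, 14: 0.02, 6: 0.06, 21: 0.0}
retrieval_b = {2: 0.04, 8: 0.02, 14: 0.02, 6: 0.05, 21: 0.0}
frac_a = component_fractions(retrieval_a)
frac_b = component_fractions(retrieval_b)
difference = max_fraction_difference(frac_a, frac_b)
```

A small `difference` for retrievals with different total AOD is what the constant-fraction assumption would predict; a large value would argue against it.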
The AOD of component 6, a medium-large spherical particle often retrieved for dust, but also representing a generic large, spherical non-absorbing particle (Kahn and Gaitley, 2015), is moderately high for all coincident observations, especially for the five cases with the lowest overall AOD. This could be a transported aerosol, representing a background component, less likely to be concentrated near the surface. The AOD for component 21 is 0, implying this component was not present at a detectable level in any of the MISR retrievals considered. In light of limited MISR aerosol-type retrievals, the assumption of relatively constant aerosol-type fractions remains a limitation of the current study. Future experiments, performed farther away from the equator, will have more frequent MISR observations, and likely higher numbers of successful retrievals.
It takes about seven minutes for all nine MISR cameras to observe a given location on the surface. We averaged the OPC PM 2.5 readings over 30 minutes, centered on the MISR overpass time. We matched these OPC values with the corresponding MISR RA AOD values for each component group, using a universe of 74 mixtures, within a radial distance of 1.6 km from each ground site. Each of the five sites has up to ~7 MISR RA coincidences within this radius. The MISR retrievals yielded a spatial distribution of spherical, light-absorbing and non-absorbing particles, with an effective radius of about 0.12 µm everywhere. Note that the size distribution for the 0.12 µm particles extends to 0.75 µm (Table S3); we rely on the tail of this distribution for comparisons with the OPCs, due to limited OPC sensitivity to smaller sizes.
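Averaging the OPC readings over a 30-minute window centered on the overpass time can be sketched as below. The function name `centered_mean`, the inclusive window edges, and the sample data are assumptions for illustration and may differ from the actual processing.

```python
from datetime import datetime, timedelta


def centered_mean(readings, overpass, window_minutes=30):
    """Mean of (timestamp, pm25) pairs within +/- window_minutes/2 of overpass.

    Window edges are treated as inclusive (an assumption). Returns None
    when no reading falls inside the window.
    """
    half = timedelta(minutes=window_minutes / 2)
    values = [pm for ts, pm in readings if abs(ts - overpass) <= half]
    return sum(values) / len(values) if values else None


# Hypothetical one-minute readings whose value equals the minute index.
start = datetime(2016, 6, 1, 10, 0)
readings = [(start + timedelta(minutes=i), float(i)) for i in range(61)]
overpass = datetime(2016, 6, 1, 10, 30)
window_mean = centered_mean(readings, overpass)
```

Here the window spans 10:15 to 10:45, so the mean of the minute indices 15 through 45 is 30.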

S1.2.2. MAIAC/MODIS over Nairobi Sites
During the study period, the Terra satellite passed over Nairobi between ~10:30 am and 11:30 am local time. Aqua overpasses occurred approximately three hours after each Terra overpass. MAIAC AOD at 550 nm was extracted for the study period from the Land Processes Distributed Active Archive Center (LP DAAC). Although the MAIAC AOD at 550 nm is slightly more error-prone than that at 470 nm, we that the adjacent pixel cloud masks were also clear. This led to a reduction of ~70% in the number of successful AOD retrievals, because Nairobi is at relatively high elevation and is often cloudy. Hu et al. (2014) filled in the gaps of Terra AOD retrievals for different overpasses, when a successful Aqua AOD retrieval was obtained, and vice versa, using a simple linear regression. However, predicted AOD values inevitably contain additional uncertainties. Thus, we only used Terra AOD in this study. An additional advantage of working exclusively with Terra AOD is that MISR is also aboard Terra, so their observations are temporally coincident.
As with MISR, we matched the PM 2.5 reading of each OPC site with the corresponding MAIAC AOD values from the MODIS grid cells within a radial distance of 1.6 km, at Terra overpass time. We tested the robustness of our results to using radial distances of 1 km and 0.5 km. There were up to nine MAIAC grid cells linked with each site. As the KGSA and All Saints sites are 2.63 km apart, some of the nine grid cells associated with the two sites overlap.
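The radial matching of satellite grid cells to a ground site can be sketched with a great-circle distance test. This is a sketch under stated assumptions: distances are measured to grid-cell centers with the haversine formula on a spherical Earth, and the site and pixel coordinates are hypothetical rather than the actual site locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pixels_within(site, pixels, radius_km=1.6):
    """Keep the (lat, lon) pixel centers within radius_km of the site."""
    return [p for p in pixels
            if haversine_km(site[0], site[1], p[0], p[1]) <= radius_km]


# Hypothetical site and pixel centers offset northward by 0, ~1.1 and ~2.2 km.
site = (-1.30, 36.80)
pixels = [(-1.30, 36.80), (-1.29, 36.80), (-1.28, 36.80)]
matched_1p6 = pixels_within(site, pixels)
matched_1p0 = pixels_within(site, pixels, radius_km=1.0)
```

Rerunning the selection with `radius_km` set to 1.0 or 0.5 corresponds to the robustness check described above.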
The mean AOD of the 1712 successful Terra MAIAC retrievals is 0.095 (min=0.014, max=0.301, sd=0.053). The successful retrievals occur over 66 unique days. A total of 304 successful Terra MAIAC AOD retrievals were acquired, over 20 unique days, during months when favorable MISR retrievals (MISR AOD ≥ 0.15) were also obtained. Of these, 79 retrievals were made over UNEP, 48 over St Scholastica, 78 over KGSA, 87 over All Saints and 12 over Alliance. The mean AOD of the successful Terra MAIAC retrievals over these days is 0.126 (min=0.038, max=0.268, sd=0.048). We separately considered cases where the total MAIAC AOD ≥ 0.15 (85 of the 304 retrievals), in order to ensure that we were considering days when the surface aerosol dominated in the column.

S1.3. The GEOS-Chem Model
The GEOS-Chem model results are for 2012, the latest available, following two months of spin-up for chemical initialization, whereas the observations used in this paper are from 2016. Therefore, the model will underestimate any pollution sources that track population growth from 2012 to 2016. Population growth rates in Nairobi between 2012 and 2016 were ~8%; this is well within the GEOS-Chem model uncertainties, and given other uncertainties in the Nairobi example as well, we accept the 2012 estimates in the current analysis.
The GEOS-Chem model results indicate that the dominant aerosol components over Nairobi are organic aerosols (OA), typically produced from biofuel use ( Figure S5). The model also shows negligible sulfate-nitrate-ammonium that is typical in most rapidly developing countries. The Nairobi low-cost sensor sites coincide with two GEOS-Chem grid cells. The UNEP site is in one cell, and the other sites are all in the other. The GEOS-Chem grid cells are of a different size compared with MISR grid cells, and we used the GEOS-Chem results for the non-UNEP sites only, as this cell has a maximum areal overlap with the corresponding MISR cell.
The fraction of AOD in the first layer of the GEOS-Chem model, as a proportion of the total-column AOD, provides the AOD vertical scaling factor (Figure S5). This factor is used to calculate the fractional AOD at the surface of Earth, to connect the near-surface aerosol mass results with the AOD from satellite data. The vertical scaling factors from GEOS-Chem were ~0.4 for anthropogenic aerosol and 0.07 for dust. We attempted to validate the GEOS-Chem-derived vertical scaling factor by comparing with statistical data from Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) L2 aerosol extinction coefficient retrievals for the same season over the Nairobi region. Unfortunately, the narrow CALIPSO swath (~100 meters) passes no closer than ~40 km from the urban center of Nairobi, making the AOD vertical scaling factor derived from CALIPSO difficult to compare with that from GEOS-Chem. We nevertheless calculated the monthly-averaged vertical scaling from CALIPSO, by dividing the sum of the aerosol extinction coefficient at 532 nm for the near-surface aerosol layer by the sum of all the aerosol extinction coefficients for the entire column. For cells within ~100 km from the ground-based monitoring sites, the scaling factor varied between 0.1 and 0.2 over the months in 2016 for which we had MISR retrievals > 0.15. This is less than that from the GEOS-Chem grid cells, most likely because over rural areas, a larger fraction of aerosol in the column is transported, whereas the dominant aerosol in the urban center is locally sourced and concentrated nearer the surface.
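The vertical scaling calculation described above can be written out as a short sketch. The function names, the choice of a single near-surface layer, and the profile values are hypothetical. The sketch also assumes layers of equal thickness, so that the ratio of summed extinction coefficients equals the ratio of layer AODs.

```python
def vertical_scaling_factor(extinction_profile, n_surface_layers=1):
    """Fraction of the column extinction found in the lowest layers.

    extinction_profile lists extinction coefficients from the surface
    upward; equal layer thickness is assumed (hypothetical helper).
    """
    total = sum(extinction_profile)
    if total <= 0:
        raise ValueError("column extinction must be positive")
    return sum(extinction_profile[:n_surface_layers]) / total


def near_surface_aod(column_aod, scaling_factor):
    """Scale a total-column AOD to the near-surface AOD (AOD N-S)."""
    return column_aod * scaling_factor


# Hypothetical profile in which the lowest layer holds 40% of the extinction.
profile = [0.04, 0.03, 0.02, 0.01]
factor = vertical_scaling_factor(profile)
aod_ns = near_surface_aod(0.20, factor)
```

With a factor of ~0.4, as reported for anthropogenic aerosol from GEOS-Chem, a total-column AOD of 0.20 corresponds to a near-surface AOD of about 0.08.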
We also attempted to evaluate this vertical scaling factor using the Cloud-Aerosol Transport System (CATS) instrument (on board the International Space Station) L2 particle backscatter coefficient at 1064 nm wavelength over the Nairobi region. The monthly-averaged vertical scaling from CATS reported for cells within ~50 km from the ground-based monitoring sites, calculated by dividing the sum of the particle backscatter coefficient at 1064 nm for the near-surface aerosol layer by the sum of all the particle backscatter coefficients for the entire column, varies between 0.01 and 0.2 over the months in 2016 for which we had MISR retrievals > 0.15. Unfortunately, the L2 CATS product does not contain information about the backscattering coefficient at or near 550 nm wavelength. The 1064 nm measurements are much less sensitive to the small urban particles and are more likely weighted toward any transported soil or dust particles, making it difficult to compare the vertical scaling parameter from CATS with that from GEOS-Chem. However, GEOS-Chem has been validated at the regional scale (Marais et al., 2019), so we use the available vertical scaling parameters from the model. The GEOS-Chem monthly scaling factors account for seasonal variation in the aerosol vertical distribution (Figure S5). However, as all of the ground-based sites fall within one GEOS-Chem grid cell, they do not account for any variation between sites in the local AOD-PM 2.5 relationships. In the ensuing analysis, we allowed for site-specific factors in our regression analysis to compare these relationships over the study area. For future deployments, local, direct measurements of vertical extinction or at least backscatter from lidar would reduce our reliance solely on model-simulated scaling coefficients. Also, having higher-spatial-resolution modeling would allow us to account for any variability on scales smaller than the ~200 km resolution of the GEOS-Chem simulations. One barrier to running the model at higher spatial resolution is the uncertainty in current emissions inventories, especially over much of the developing world.

S2. Details of the Nairobi Example Regression Analysis
Because the satellite data are stable and self-consistent, we explore how well the PM 2.5 reported by the OPCs tracks the variability in AOD N-S as estimated by MISR and by MAIAC.
We first ran a regression analysis to relate surface PM 2.5 to the total satellite MISR AOD N-S using Equation S1 for our 10 coincident readings, using the reported average PM 2.5 from the OPC-N2 network acquired over the 30 minutes around the Terra overpass time.
PM 2.5 = β × AOD N-S (with no intercept) (S1)
where AOD N-S is the total near-surface AOD and β is the regression slope. We obtained an adjusted R squared of 0.88 and a β of 138.6 (95% Confidence Interval: 107, 170.2). This high correlation indicates that the vertical scaling from GEOS-Chem works well for our sites. Table 1 in Section 4 of the main text shows the total near-surface scaling of the MISR AOD N-S and the corresponding PM 2.5 (units: μg/m 3 ) for the ten cases where there are MISR RA retrievals over individual OPC sites in the Nairobi area, and the total-column mid-visible AOD exceeded 0.15.
We also scaled the MAIAC total-column AOD using the GEOS-Chem vertical distribution. We then ran simple linear regressions using the reported 30-minute-averaged PM 2.5 from the OPC-N2 network against the MAIAC AOD N-S in Equation S1. We had a total of 1712 coincident MAIAC AOD and PM measurements over the course of this study.
We obtained an adjusted R squared of 0.66 and a slope of 170.7 (95% CI: 165.0, 176.5). Note, however, that when we restrict the AODs considered in the regression to retrievals where the total MAIAC AOD ≥ 0.15 (85 retrievals), i.e., when near-surface pollution likely dominates the total-column aerosol concentration, we obtain an adjusted R squared of 0.80 and a slope of 138.6 (95% CI: 130.7, 146.5).
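A regression through the origin of the kind used in Equation S1 can be sketched as follows. This is an illustrative sketch, not the analysis code: the data are hypothetical, and the adjusted R squared uses the uncentered total sum of squares, which is the convention some statistical packages (for example R's `summary.lm`) apply to models without an intercept. Whether the values reported here follow that convention is an assumption.

```python
import math


def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x with no intercept.

    Returns (slope, standard_error, adjusted_r_squared). The R squared
    is uncentered, i.e. relative to sum(y**2) (assumed convention).
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx
    ssr = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    syy = sum(yi * yi for yi in y)
    # One fitted parameter, so the residual degrees of freedom are n - 1.
    adj_r2 = 1.0 - (ssr / syy) * n / (n - 1)
    std_err = math.sqrt(ssr / (n - 1) / sxx)
    return slope, std_err, adj_r2


# Hypothetical near-surface AOD and PM2.5 (ug/m3) pairs.
aod_ns = [0.02, 0.04, 0.06, 0.08]
pm25 = [3.0, 5.0, 9.0, 11.0]
slope, std_err, adj_r2 = fit_through_origin(aod_ns, pm25)
```

A 95% confidence interval would follow from the slope plus or minus the standard error multiplied by the appropriate t quantile for n - 1 degrees of freedom, which is omitted here to keep the sketch free of external dependencies.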
When we considered site-specific effects and added a site-specific interaction term to the regression model, i.e., using Equation S2 with all 1712 data points, we obtained an adjusted R squared of 0.68 and a site-specific slope β1 of 183.2 (95% CI: 167.0, 199.4) for the Alliance site, 141.4 (95% CI: 130.0, 152.7) for All Saints, 194.8 (95% CI: 181.8, 207.8) for Kibera Girls Soccer Academy, 207.7 (95% CI: 195.8, 219.6) for St Scholastica and 140.5 (95% CI: 128.8, 152.1) for UNEP. When we restricted the retrievals to those where the total MAIAC AOD ≥ 0.15, we obtained an adjusted R squared of 0.85.
PM 2.5 = β1 × site-specific-factor × AOD (excluding intercept terms) (S2)
The lower R squared obtained with the MAIAC measurements using all 1712 measurements compared to MISR is probably because we used only a subset of MISR retrievals for which AOD ≥ 0.15, to ensure we were including only measurements for which the particle-type retrievals are likely of good quality. These are also cases where the near-surface aerosol likely dominates, favoring assumptions in our application. When we include the low-AOD MAIAC retrievals, noise from the land surface can contribute significantly to the satellite retrieval. In addition, small amounts of transported aerosol above the near-surface layer can introduce larger uncertainties to the analysis when the AOD is low. When we restrict retrievals to total AOD ≥ 0.15, we obtain a much closer correlation between the surface PM and MAIAC AOD retrievals as well.
Table 2 (Remember that only 85 satellite observations with the total MAIAC AOD ≥ 0.15 are considered in this analysis). The corresponding daily-averaged PM 2.5 from the ground-based OPC in units of μg/m 3 are shown in red. The correlation between the two estimates of PM is 0.47.
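One way to read the site-specific model in Equation S2 is as a no-intercept regression with a separate slope for each site. The sketch below follows that interpretation, which is an assumption; the function name, the records, and the uncentered adjusted R squared convention are likewise illustrative. With one slope per site and no shared terms, the least-squares problem separates into one through-origin fit per site, with the goodness of fit pooled over all records.

```python
from collections import defaultdict


def fit_site_slopes(records):
    """Fit PM2.5 = slope[site] * AOD with no intercept.

    records is a list of (site, aod, pm25) tuples (hypothetical input).
    Returns (slopes_by_site, pooled_adjusted_r_squared).
    """
    sxx = defaultdict(float)
    sxy = defaultdict(float)
    for site, aod, pm in records:
        sxx[site] += aod * aod
        sxy[site] += aod * pm
    slopes = {site: sxy[site] / sxx[site] for site in sxx}
    ssr = sum((pm - slopes[site] * aod) ** 2 for site, aod, pm in records)
    syy = sum(pm * pm for _, _, pm in records)
    n, p = len(records), len(slopes)  # p fitted slopes, one per site
    adj_r2 = 1.0 - (ssr / syy) * n / (n - p)
    return slopes, adj_r2


# Hypothetical records for two sites with different AOD-PM2.5 slopes.
records = [
    ("UNEP", 0.02, 2.8), ("UNEP", 0.04, 5.6), ("UNEP", 0.06, 8.7),
    ("KGSA", 0.02, 4.0), ("KGSA", 0.04, 8.0), ("KGSA", 0.06, 11.7),
]
site_slopes, pooled_adj_r2 = fit_site_slopes(records)
```

Differences between the fitted site slopes, judged against their confidence intervals, indicate how much the local AOD-PM 2.5 relationship varies across the network.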