Strategies of Method Selection for Fine Scale PM2.5 Mapping in an Intra-Urban Area Using Crowdsourced Monitoring

Fine particulate matter (PM2.5) is of great concern to the public because of its significant risk to human health. Numerous methods have been developed to estimate spatial PM2.5 concentrations at unobserved locations because fixed monitoring stations are sparse. With the growth of low-cost sensing for air pollution monitoring, crowdsourced monitoring for fine exposure control has gradually been introduced into cities. However, the optimal mapping method for conventional sparse fixed measurements may not be suitable for this new high-density monitoring approach. This study presents a crowdsourced sampling campaign and strategies of method selection for hundred-metre-scale PM2.5 mapping in an intra-urban area of China. PM2.5 concentrations were measured with laser air quality monitors and uploaded by a group of volunteers via their smartphone applications during two periods. Three extensively employed modelling methods (ordinary kriging (OK), land use regression (LUR), and regression kriging (RK)) were adopted and their performance was evaluated. An interesting finding is that PM2.5 concentrations in micro-environments varied significantly across the intra-urban area. These local PM2.5 variations can be effectively identified by crowdsourced sampling rather than by national air quality monitoring stations (light-polluted period: (69.67±18.81) – (76.45±14.55) µg m⁻³ vs. (36.9±10.97) – (41.2±8.68) µg m⁻³; heavy-polluted period: (162.72±15.96) – (171.89±21.5) µg m⁻³ vs. (177.8±16.91) – (188.3±22.4) µg m⁻³). The selection of models for fine-scale PM2.5 concentration mapping should be adjusted according to the changing sampling and pollution circumstances. Generally, OK interpolation performs best under non-peak traffic conditions during the light-polluted period (hold-out validation R²: 0.47–0.82), while RK modelling can perform better during the heavy-polluted period (0.32–0.68) and under peak traffic conditions with relatively few sampling sites (fewer than ~100) during the light-polluted period (0.40–0.69). Additionally, the LUR model demonstrates limited ability to estimate PM2.5 concentrations on very fine spatial and temporal scales in this study (0.04–0.55), which challenges the traditional view that the LUR model performs well for air pollution mapping. This method selection strategy provides empirical evidence for choosing the best method for PM2.5 mapping using crowdsourced monitoring and offers a promising way to reduce exposure risks for individuals in their daily lives.


Introduction
Fine particulate matter (PM2.5) has been associated with an increased risk of morbidity and mortality in both the long term and the short term (Beverland et al., 2012; Cohen et al., 2017; Di et al., 2017; Lelieveld et al., 2017). The persistent cumulative effects of exposure during daily activities, especially daily travelling, are critical (Kingham et al., 2013; Hankey et al., 2017). If individuals could consciously choose the location and time of their outdoor activities based on detailed knowledge of the spatiotemporal variation in PM2.5 concentration, their health protection could be improved.
In situ measurement is the most reliable way to capture the PM2.5 concentrations across every corner of a city in real time. However, fixed monitoring stations in conventional air quality monitoring networks are sparse. As a result, site-based observations encounter challenges in capturing spatiotemporal variations of air pollutants, especially in intra-urban areas with unevenly distributed emission sources and dispersion conditions (Kumar et al., 2015; Zou et al., 2016; Apte et al., 2017). Spatial mapping methods, including air dispersion modelling, spatial interpolation, satellite remote sensing (RS), and empirical models, have been increasingly employed to estimate concentrations of PM2.5 at unobserved locations over the past two decades (Jerrett et al., 2005; Henderson et al., 2007; El-Harbawi, 2013; Kim et al., 2014; Rice et al., 2015; Fang et al., 2016; Zou et al., 2017; Zhai et al., 2018; Xu et al., 2018; Liu et al., 2018). The outputs of a dispersion model depend considerably on detailed emission inventories and meteorological information, which are usually not available for many cities. The coarse spatial resolution (≥1-10 km) of satellite instruments and the missing data caused by cloud cover prohibit the widespread use of RS for PM2.5 concentration mapping in urban environments (Zou et al., 2015; Apte et al., 2017).
Conversely, geostatistical and empirical models can estimate concentrations at high spatial resolution with a rather low requirement for data. The most commonly employed models are ordinary kriging (OK) interpolation and land use regression (LUR) modelling. Some studies have improved the estimation accuracy by combining these two techniques (Mercer et al., 2011; De Hoogh et al., 2018). While they have been successfully applied to map the spatial variability of PM2.5 concentrations in various geographic areas, their accuracy varies as the concentration levels and sample sizes change (Wang et al., 2012; Mercer et al., 2011; Lee et al., 2014; Zou et al., 2015; Gillespie et al., 2016; Choi et al., 2017; De Hoogh et al., 2018).
With the growth of low-cost sensing for air pollution monitoring, real-time strategies for fine exposure control in cities have been further developed (Kumar et al., 2015). Crowdsourced monitoring, which enables citizens to produce geospatial data, is constantly growing and shows considerable potential (Heipke, 2010). Large and diverse groups of people who lack formal training can easily describe their environments with a mobile phone or smartphone and upload data via informal social networks and web technology. Unlike traditional fixed monitoring stations, which are usually mounted on roofs (i.e., 3 to 20 metres above the ground) to protect the instruments, crowdsourced monitoring provides real-time PM2.5 monitoring that reflects the real exposure of individuals who live and work on the ground. Although crowdsourced monitoring tends to produce observations of questionable quality, it enables us to obtain measurements of ambient air pollution in dense networks at relatively low cost. Some studies have employed these data to display air pollution concentrations and investigate exposure risks (Thompson, 2016; Miskell et al., 2017; Jerrett et al., 2017). However, these observations are still point measurements that are representative only of the limited area around the site and cannot satisfy the demand for air pollution concentrations whenever and wherever they are needed.
One way to address this challenge is to combine high-density crowdsourced observations with spatial mapping methods. An important investigation was performed by Schneider et al. (2017) in Oslo, Norway. They presented a universal kriging technique for urban NO2 concentration mapping that combines near-real-time crowdsourced observations of urban air quality with output from an air pollution dispersion model. However, high-density crowdsourced measurements may vary among urban microenvironments with different daily human activities and may differ from sparsely distributed conventional in situ measurements. Using the mapping methods selected in previous studies to depict the variation in air pollution on a very fine spatial and temporal scale with new monitoring approaches may cause the misclassification of exposure and an underestimation of risk. As the number of valid crowdsourced observations may change significantly due to instrument faults, human error, and other quality issues, the applicability of mapping methods to different sample sizes needs sound scientific evidence.
In this study, we present strategies of method selection for PM2.5 concentration mapping based on crowdsourced datasets of varying size. The intra-urban crowdsourced sampling campaign was conducted in the city of Changsha, China, over two periods with different pollution scenarios. The performance of OK, LUR and regression kriging (RK) in estimating PM2.5 pollution was evaluated and compared with an increasing number of training sites. The best performing method was employed to plot the variation in the hourly PM2.5 concentration and identify the pollution hotspots in the intra-urban area. The results from this study provide evidence for the method selection of PM2.5 mapping using crowdsourced monitoring and contribute to efficient air pollution mapping and exposure assessment in intra-urban areas.

Measurement instrument
The portable laser air quality monitor SDL307 (produced by NOVA FITNESS Co., Ltd.) was employed for sampling. The monitor manual can be downloaded from http://www.inovafitness.com/index.html. With a total size of 25×34×14 cm, the monitor can be carried conveniently (Fig. 1a). According to the test report provided by the Center for Building Environment Test at Tsinghua University, the maximum relative error of this monitor is ±20% compared with a regulatory monitor in the 20-1000 µg m⁻³ range, and its resolution is 0.1 µg m⁻³. The concentration of particulate matter is measured using the light-scattering method (Fig. 1b). The monitor contains a special laser module, and the signals are recorded by a photoelectric receptor when particulate matter passes through the laser light. The count and size of the particles are then analysed by a microcomputer after the signals are amplified and converted. Their mass concentrations are calculated based on the conversion factor between the light-scattering method and the tapered element oscillating microbalance technology.
To ensure the data quality of this monitor, we placed 115 laser air quality monitors in the same environment and continuously observed them for one week during each of the four seasons. If the relative error between the observations of one monitor and the average observations of the other monitors exceeded 5%, this monitor was excluded. This procedure was conducted both indoors and outdoors. Subsequently, 86 monitors with rather stable performance and small differences between their observations remained. In addition, we randomly selected 30 portable laser air quality monitors to compare with the national monitoring instruments to further guarantee the reliability of the sampling data. First, for ease of operation, three national air quality monitoring stations were selected. Second, for each station, 10 monitors were operated next to the national monitoring instrument (~15 metres above the ground in the study area) from 8:00 to 20:00 on December 20-22, 2015 and from 8:00 to 20:00 on December 29-31, 2015. The weather on December 20-22 was overcast with patchy drizzle and light rain at times, and the relative humidity (RH) ranged from 77% to 94%, while the weather on December 29-31 was cloudy with some sunshine and an RH that ranged from 38% to 67%.
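The co-location screening described above can be illustrated with a minimal sketch. It assumes that each monitor's readings are compared, time step by time step, with the mean of all other monitors, and that a monitor is excluded when its mean relative error exceeds 5%. The function name `screen_monitors`, the data layout, and the single-pass (non-iterative) exclusion are assumptions made for illustration; the text does not specify how the comparison was implemented.

```python
from statistics import mean

def screen_monitors(readings, threshold=0.05):
    """Return IDs of monitors whose mean relative error against the
    average of all other co-located monitors is at most `threshold`.

    readings: dict mapping monitor ID -> list of PM2.5 readings taken
    at the same time steps in the same environment (hypothetical layout).
    """
    ids = list(readings)
    n_steps = len(readings[ids[0]])
    kept = []
    for monitor in ids:
        errors = []
        for t in range(n_steps):
            # Average of every other monitor at this time step.
            others = mean(readings[o][t] for o in ids if o != monitor)
            errors.append(abs(readings[monitor][t] - others) / others)
        if mean(errors) <= threshold:
            kept.append(monitor)
    return kept

# Illustrative values only (not measured data).
demo = {
    "m1": [50, 60, 70],
    "m2": [51, 61, 71],
    "m3": [50, 59, 70],
    "m4": [50, 60, 69],
    "m5": [56, 67, 78],  # reads consistently high
}
stable = screen_monitors(demo)
```

Because a faulty monitor also shifts the reference average seen by the others, a screening of this kind is more robust when many monitors are co-located, as was the case with the 115 monitors tested here.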
The scatter plots and descriptive statistics of the valid hourly average PM2.5 concentrations from the laser air quality monitors and the national monitoring instruments were presented in Fig. 1c and

Sampling design
The sampling area is located in the Changsha metropolitan area (112°49′-113°14′E, 27°58′-28°24′N), which covers an area of approximately 920 km² and seven districts (refer to Fig. 2). Changsha is the capital of Hunan Province, with a population that exceeds 7 million people. The area has experienced high-level exposure to air pollutants due to an increase in anthropogenic activities and intensive energy consumption.
To ensure that the sampling sites exhibit a relatively even and typical distribution across different urban microenvironments (i.e., residential community, building site, school, and park), a series of rules was designed to determine the potential PM2.5 sampling sites based on the distribution of potential emission sources (refer to Table 1). The data that support the sampling design consist of important points of interest (POI), dust surfaces, and main road networks. The POI data include industrial parks, enterprises, factories, depots, hospitals, schools, and parks. Dust surfaces refer to natural and artificial bare surfaces with a vegetation cover of less than 10%, which easily produce atmospheric particulate matter, such as construction sites, stacked substances, and natural bare land. These data were collected from the Information Center of Land and Resources of Hunan Province. More than three observations of PM2.5 concentrations are required every hour for each potential sampling site to improve the reliability of the sampling data. Given that the number of laser air quality monitors and the distance that a volunteer can walk in one hour are limited, only 2-4 sites can be set in the area that a monitor can cover during the sampling. Therefore, a total of 208 potential PM2.5 sampling sites were selected. The centre of each area covered by a monitor was numbered in sequence (i.e., 1-86). The monitors were also numbered and labelled.

Sampling and data processing
Sampling was performed in two time periods in the winter of 2015 to examine the effect of air quality grades on the mapping results. The first period fell between 8:00 and 12:00 on December 24. In this period, the official air pollution levels were "Good" and "Moderate" (i.e., Period 1, light-polluted period). The weather was overcast with occasional rain or drizzle, and the relative humidity (RH) ranged from 95% to 98%. The second period extended between 14:00 and 18:00 on December 25, when an orange warning signal of haze (i.e., the official air pollution level was "Heavily Polluted") was released by the Changsha Meteorology Bureau (i.e., Period 2, heavy-polluted period). The weather was cloudy with some sunshine, and the RH ranged from 39% to 43%.
Before sampling started, every volunteer received one monitor and went to the corresponding area. At each potential monitoring site, the volunteer lifted the monitor (~2 metres above the ground) and held it for at least 60 seconds to measure the PM2.5 concentration. The observations were uploaded two to four times per hour using a smartphone application (App) that we developed. The geographic coordinates of the sampling sites were also uploaded. For each hour, we eliminated the sampling sites with fewer than three observations. The valid observations were then averaged at each site. As some volunteers quit after the sampling of the first period, the sampling sites in Period 2 were concentrated in the central study area. A total of 179-208 samples were successfully collected at each hour in Period 1, and 105-118 samples were successfully collected in Period 2. The official observations at 10 national monitoring stations in the study area were also obtained (China Environmental Monitoring Center, CEMC: http://106.37.208.233:20035/) and averaged for comparison purposes.
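The hourly filtering and averaging step can be sketched as follows. This is a minimal illustration, assuming each uploaded observation is a `(site_id, hour, pm25)` tuple; the record layout and the function name `hourly_site_means` are hypothetical and do not describe the App's actual data format.

```python
from collections import defaultdict
from statistics import mean

def hourly_site_means(records, min_obs=3):
    """Average PM2.5 per (site, hour), dropping site-hours that have
    fewer than `min_obs` uploaded observations."""
    grouped = defaultdict(list)
    for site_id, hour, pm25 in records:
        grouped[(site_id, hour)].append(pm25)
    return {
        key: mean(values)
        for key, values in grouped.items()
        if len(values) >= min_obs
    }

# Illustrative uploads only (not measured data).
uploads = [
    (1, 8, 60.0), (1, 8, 62.0), (1, 8, 64.0),   # three observations: kept
    (2, 8, 70.0), (2, 8, 72.0),                 # two observations: dropped
    (1, 9, 66.0), (1, 9, 68.0), (1, 9, 70.0), (1, 9, 72.0),
]
site_means = hourly_site_means(uploads)
```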

Ordinary kriging
OK estimates the target variable at an unsampled location as a linear combination of neighbouring observations. OK relies on a weighting scheme in which closer observations have a greater impact on the final prediction. The weighting scheme is dictated by the variogram (Pang et al., 2010; Zou et al., 2015) and can be described as

$$Z^{*}(X_{0}) = \sum_{i=1}^{n} \omega_{i} Z(X_{i}),$$

where $Z^{*}(X_{0})$ is the estimate at an unknown sample point; $Z(X_{i})$ and $\omega_{i}$ are the value of the $i$th known sample point surrounding the unknown sample point and its corresponding weight, respectively; and $n$ is the number of known sample points.
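As an illustration of how variogram-based weights are obtained, the following is a minimal pure-Python sketch of the standard ordinary kriging system, in which the weights are constrained to sum to one through a Lagrange multiplier. The exponential variogram and its parameters (`nugget`, `sill`, `rng`) are illustrative assumptions, not values fitted in this study, and the function names are hypothetical; the study itself performed OK with ArcGIS.

```python
import math

def exp_variogram(h, nugget=0.0, sill=1.0, rng=500.0):
    """Exponential variogram model (parameters are illustrative only)."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ok_weights(coords, target, variogram=exp_variogram):
    """Ordinary kriging weights for `target` given sample coordinates."""
    n = len(coords)
    # Variogram matrix bordered by the unbiasedness constraint.
    a = [[variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(c, target)) for c in coords] + [1.0]
    return solve(a, b)[:n]  # the last unknown is the Lagrange multiplier

def ok_predict(coords, values, target):
    """Weighted sum of observed values at the target location."""
    weights = ok_weights(coords, target)
    return sum(w * v for w, v in zip(weights, values))

# Four hypothetical sites (metres) at the corners of a 100 m square.
sites = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
pm25 = [70.0, 80.0, 60.0, 90.0]
centre_estimate = ok_predict(sites, pm25, (50.0, 50.0))
```

With a zero nugget, the estimate at a sampled location reproduces the observed value, and at the centre of the square the four weights are equal by symmetry.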

Land use regression
LUR modelling predicts the air pollution concentration by linking measurements at monitoring sites with the geographic elements around them using the least squares method. LUR is composed of predictor variable extraction and selection, followed by regression modelling and validation.
Geographic factors, including pollution sources (dust surfaces and pollution industries), road networks, and land use/cover, were employed to indirectly characterise the PM2.5 emissions in this study. These data were generated using multiple ring buffers with different radii (50-1000 m) at each monitoring site. Meteorological data from 107 sites in and around the sampling area, including wind speed, atmospheric pressure, relative humidity, and temperature, which may affect the dispersion of PM2.5, were also obtained. Geographic factors were made available by the Information Centre of Department of Land and Resources of Hunan Province. Meteorological data were released by the Hunan Meteorology Bureau. All variables (Table 2) were extracted using ArcGIS (version 10.0). The optimal buffer radii for the percentage of dust surfaces and land use, pollution industries density, and road density were defined based on the maximum Pearson correlation coefficients.
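The buffer selection rule can be sketched in a few lines. The sketch assumes that, for each predictor, the radius whose values have the largest absolute Pearson correlation with the observed PM2.5 is retained; the use of the absolute value, the function names, and the example numbers are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_buffer(predictor_by_radius, pm25):
    """Return the buffer radius whose predictor values correlate most
    strongly (in absolute value) with the PM2.5 observations."""
    return max(
        predictor_by_radius,
        key=lambda radius: abs(pearson(predictor_by_radius[radius], pm25)),
    )

# Hypothetical road density values at four sites for three radii (m).
road_density = {
    50: [0.1, 0.3, 0.2, 0.4],
    100: [0.1, 0.2, 0.3, 0.4],
    500: [0.4, 0.1, 0.3, 0.2],
}
observed = [60.0, 70.0, 80.0, 90.0]
chosen_radius = best_buffer(road_density, observed)
```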
An automatic forward-backward stepwise regression procedure was employed to select the best fitting LUR models based on the screened-out predictors. The final LUR models in this study were determined based on the criteria of the lowest Akaike information criterion (AIC) value and the highest fitting R². The model structure can be expressed as

$$\mathrm{PM}_{2.5,s} = \beta_{0} + \beta_{1} x_{1,s} + \beta_{2} x_{2,s} + \cdots + \beta_{n} x_{n,s} + \mu,$$

where $\mathrm{PM}_{2.5,s}$ is the estimate of the hourly averaged PM2.5 concentration at site $s$, $x_{i,s}$ ($i = 1, 2, \cdots, n$) are independent variables, $\beta_{0}$ is a constant, $\beta_{i}$ ($i = 1, 2, \cdots, n$) are regression coefficients, and $\mu$ is the random error estimated using the least squares method. This process was conducted in R statistical software (version 3.3.2) (Fox and Weisberg, 2011; R Core Team, 2016).
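To make the AIC criterion concrete, the sketch below compares candidate least-squares models using the common form of AIC for Gaussian errors, n·ln(RSS/n) + 2k, which is defined up to an additive constant. The candidate names, residual sums of squares, and parameter counts are invented for illustration, and the sketch only ranks models that have already been fitted; it is not the stepwise procedure run in R for this study.

```python
import math

def aic_least_squares(rss, n_obs, n_params):
    """AIC (up to an additive constant) for a least-squares fit with
    residual sum of squares `rss` and `n_params` estimated coefficients."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

def pick_lowest_aic(candidates, n_obs):
    """candidates: dict mapping model name -> (rss, n_params).
    Returns the name with the lowest AIC and all AIC scores."""
    scores = {
        name: aic_least_squares(rss, n_obs, k)
        for name, (rss, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical candidate models fitted to 100 training sites.
models = {
    "roads_only": (5000.0, 2),
    "roads_plus_dust": (4000.0, 3),
    "all_predictors": (3990.0, 6),
}
best_name, aic_scores = pick_lowest_aic(models, n_obs=100)
```

In this example the largest model reduces the residual sum of squares only marginally, so the penalty on its extra coefficients leaves the intermediate model with the lowest AIC.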

Regression kriging
RK is a two-stage statistical procedure in this study. First, separate standard LUR models were developed based on crowdsourced observations in the training dataset for each hour. Second, the residuals of the LUR models were calculated and interpolated for each hour using OK. Finally, the estimates of the residuals at the validation sites were extracted and added to the LUR estimates.
In this study, OK was performed using the Geostatistical Analyst Tool of ArcGIS (version 10.0), and interpolated residuals were obtained using the Extract Values to Point Tool. The entire process was implemented with Python scripts.
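The two stages can be written compactly as follows. The notation is introduced here for illustration and is an assumption rather than the notation of the original equations: $s_i$ are the training sites, $s_0$ is a validation site or grid cell, and $\omega_i(s_0)$ are ordinary kriging weights derived from the variogram of the residuals.

```latex
\begin{align}
e(s_i) &= \mathrm{PM}_{2.5}(s_i) - \widehat{\mathrm{PM}}_{2.5}^{\mathrm{LUR}}(s_i),
  \qquad i = 1, \dots, n \\
\hat{e}(s_0) &= \sum_{i=1}^{n} \omega_i(s_0)\, e(s_i),
  \qquad \sum_{i=1}^{n} \omega_i(s_0) = 1 \\
\widehat{\mathrm{PM}}_{2.5}^{\mathrm{RK}}(s_0) &=
  \widehat{\mathrm{PM}}_{2.5}^{\mathrm{LUR}}(s_0) + \hat{e}(s_0)
\end{align}
```

Read this way, RK reduces to LUR wherever the interpolated residual is zero, so any gain over LUR depends on the LUR residuals being spatially autocorrelated.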

PM2.5 concentration mapping
The method that performed best with 90% of the sites used for training was chosen as the mapping method. Using this method, the spatial distributions of the PM2.5 concentration for each hour were estimated with all samples. In this study, nearest-neighbour distances between two sampling sites ranged from 15 to 60 metres for Period 1 and from 54 to 98 metres for Period 2. Considering the resolutions of the potential predictors, 100 metres was used as the mapping grid size. The spatial distributions of the PM2.5 concentration for each hour with measurements from the 10 national monitoring stations were estimated using the same method for comparison.

Table 3 ... sampling sites in areas along the Xiangjiang River, especially in the higher education mega centre, experienced extreme PM2.5 pollution (>210 µg m⁻³).

Model performance for OK, LUR and RK
The box plots of Fig. 4 show the variation in the hold-out validation R² for the three mapping approaches in relation to the number of training sites. The average and standard deviation of the RMSE and MRE between the observed and predicted PM2.5 concentrations in the hold-out validation are presented in the Supporting Information (Tables S3-S4). The average values and variability ranges of R² for OK, LUR and RK were positively associated with an increase in the number of training sites. RK performed best in Period 2 and at 8:00 and 12:00 of Period 1 when fewer than ~100 training sites were used. Of the models tested, LUR demonstrated the poorest performance in both periods.
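The validation statistics can be sketched as below. The definitions are assumptions, since the text does not spell them out: R² is taken as the squared Pearson correlation between observed and predicted values, RMSE as the root mean square error, and MRE as the mean of absolute errors relative to the observations. The random split helper and its fixed seed are likewise hypothetical.

```python
import math
import random

def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mre(obs, pred):
    """Mean relative error: average of |obs - pred| / obs."""
    return sum(abs(o - p) / o for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation between observations and predictions."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)

def holdout_split(site_ids, n_train, seed=0):
    """Randomly split site IDs into training and validation subsets."""
    shuffled = list(site_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Illustrative values only (not measured data).
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
train_ids, valid_ids = holdout_split(range(208), n_train=100)
```

Repeating the split for a growing `n_train` and collecting the three statistics at each step would yield distributions of the kind summarised by the box plots.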

Spatial patterns of crowdsourced PM2.5 concentration
For Period 1, crowdsourced PM2.5 concentrations generally increased from south-east to north-west with multiple hot spots. In the central and southern regions of the study area, areas with a larger number of factories experienced relatively higher PM2.5 concentrations than other areas. The national monitoring PM2.5 concentrations, however, were less than 55 µg m⁻³ with limited spatial variation. For Period 2, with the exception of 14:00, the national monitoring PM2.5 concentration maps showed high-east and low-west patterns. PM2.5 concentrations in central Yuelu district were rather low (<175 µg m⁻³). Crowdsourced PM2.5 concentrations showed extensive cold spots in southern Changsha County and the southern Kaifu district, while southern Yuelu and western Tianxin, which have a high density of factories and roads, were hot spots of PM2.5 concentration.

Discussion
To map PM2.5 concentrations efficiently at a fine scale in an intra-urban area using crowdsourced monitoring, this study presented, for the first time in China, a high-density crowdsourced sampling campaign and strategies for selecting among popular mapping methods as the number of training sites increases.
The numbers of sampling sites were 18 and 10 per 100 km² for Period 1 and Period 2, respectively. These densities are a considerable improvement compared with a density of approximately 0.015 sites per 100 km² in the national air quality monitoring network in China. As expected, crowdsourced PM2.5 measurements demonstrated detailed spatial variation among urban microenvironments, and these variations can hardly be disclosed by sparse national air quality monitoring stations. This finding suggests that crowdsourced sampling can effectively improve the density of PM2.5 monitoring at a rather low monetary cost and can support short-term air pollution exposure assessment for epidemiologic studies at a fine scale. To explore the spatial variation in the PM2.5 concentration across various urban microenvironments and compare it with the national air quality measurements, crowdsourced monitoring needs to cover a certain number of areas. However, persuading the general public in these areas to continuously observe and upload PM2.5 concentrations during their activities of daily living through a designed study is difficult. We therefore recruited a group of volunteers who imitated the behaviour of the general public and collected data simultaneously. This approach is a preliminary practice of crowdsourced monitoring and can be further developed and improved for long-term exposure assessment at the fine scale with future progress in low-cost wearable air quality monitors and automatic processing techniques for crowdsourced data.
The hourly PM2.5 concentrations at crowdsourced sampling sites and national monitoring stations were rather different, and this difference varied as the official air quality level changed. The crowdsourced PM2.5 concentrations were substantially larger than the national concentrations in Period 1 (light-polluted) and slightly lower in Period 2 (heavy-polluted). One possible reason is that the national monitoring stations in the study area were installed on the roofs of mid-rise buildings (i.e., ~15 m) in ventilated and open surroundings, while crowdsourced sampling was conducted at ground level (i.e., ~2 m). The change in the major pollution sources and meteorological conditions in the study area may contribute to the difference between the two periods: the major contribution of local sources, especially vehicle emissions, together with the very high RH (95%-98%) during the light-polluted period may have caused PM2.5 to accumulate near the ground, whereas long-range transport of regional pollution during the heavy-polluted period can increase the PM2.5 concentration in the upper layer. This finding suggests that the air pollution exposure risk may remain relatively high for the public on the ground in some urban microenvironments, even when official air pollution levels are "Good" and "Moderate", and that sensitive groups should consider reducing some outdoor activities. The results confirm the necessity of developing real-ground high-density crowdsourced PM2.5 monitoring networks.
Although the low-cost sensors and the optical particle detection used by the monitors may cause inaccuracies in measurements, we attempted to minimise the uncertainty by excluding the relatively inaccurate monitors (MRE>5%) identified in preliminary indoor and outdoor experiments. Comparison experiments between laser air quality monitors and the national monitoring instruments were also conducted at the same positions and heights for two time slots; the weather conditions and air quality scenarios of the two time slots were similar to those of the two sampling periods (i.e., overcast with light rain, RH≥76%: December 20-22 vs. Period 1; cloudy with sunshine, RH≤67%: December 29-31 vs. Period 2). The relatively good agreement between the hourly PM2.5 concentrations of the laser monitors and those of the national instruments supports the reliability of the sampling data to a certain extent. The relative humidity may have slightly influenced the crowdsourced PM2.5 concentrations in the light-polluted period, since December 20-22 yielded a slightly lower R² and RMSE than December 29-31 but a higher MRE. However, the relative errors of PM2.5 observations in the preliminary and comparison experiments were generally small and fluctuated without distinct trends or leading factors. During the subsequent mapping method selection, the three methods were applied to the same dataset, which limited the influence of measurement uncertainty on the method comparison results; therefore, we did not correct the measurements in this study. However, more effort is needed on the correction of crowdsourced measurements and on uncertainty analysis in high-resolution air pollution concentration mapping for accurate exposure assessment in the future.
Unlike previous studies that compared the performance of OK, LUR and RK in estimating air pollution concentrations on annual and seasonal scales based on measurements from sparse regulatory stations (Mercer et al., 2011; Lee et al., 2014; Zou et al., 2015; Choi et al., 2017; De Hoogh et al., 2018), this research is the first study to evaluate and compare their performance with an increase in the number of training sites at an hourly scale using crowdsourced monitoring.
As expected, the performance of the three methods improved with an increase in the number of training sites. Consistent with earlier studies, which were mostly conducted in other fields (e.g., spatial variability analysis of soil components in the environmental sciences) (Li and Heap, 2014), this study further confirmed the better performance of OK interpolation with larger training data sets in air pollution estimation. Using real-ground PM2.5 measurements, we substantiated the findings of Johnson et al. (2010), who found that LUR models developed with fewer sampling sites may perform poorly. However, the average hold-out validation R² values (0.04-0.55) between the observed and predicted PM2.5 concentrations in this study were smaller than the results of Johnson et al. (2010) (0.29-0.67) and of similar studies of NO2 presented by Wang et al. (2012) and Gillespie et al. (2016) (0.44-0.85). The variations in the hourly average PM2.5 concentration between two sampling sites were generally sharper than those in the annual average values. Meteorological conditions played a more sensitive role in the short-term transmission and diffusion of PM2.5 than in the long-term processes. These findings suggest that the most effective way to improve the accuracy of the mapping method remains increasing the number of sampling sites, and they confirm the necessity of developing high-density crowdsourced sampling for PM2.5 monitoring. However, the increased variability ranges of R² and the increased standard deviations of RMSE and MRE with an increase in the number of training sites also suggest that the performance of these methods was affected by more than sample size. The spatial distribution of the samples, for example, may influence their estimation accuracy (Li and Heap, 2014).
Contrary to the findings of Zou et al. (2015) and Choi et al. (2017), which were obtained at the annual scale, OK interpolation surprisingly performed better than LUR modelling in estimating the PM2.5 concentrations, with a substantially higher average R² and lower RMSE and MRE. RK also performed better than LUR (0.32-0.71 vs. 0.04-0.55), which is consistent with the findings of Mercer et al. (2011) (0.67-0.75 vs. 0.48-0.74) and De Hoogh et al. (2018) (0.66 vs. 0.59). RK had the highest accuracy in Period 2 and at 8:00 and 12:00 of Period 1 with fewer than ~100 training sites. These results suggest that OK interpolation based on crowdsourced sampling is the best strategy for PM2.5 mapping in the intra-urban area when the official air pollution levels are "Good" and "Moderate" under non-peak traffic conditions in this study, while RK is the best strategy when the pollution level is "Heavily Polluted". These findings challenge the traditional view of the LUR model's good performance in air pollution mapping and verify that the applicability of mapping methods varies as the monitoring technology and sampling density change. In addition, the accuracy of OK and LUR was distinctly higher for Period 1 (0.24-0.82; 0.13-0.55) than for Period 2 (0.18-0.59; 0.04-0.42), while that of RK was rather stable (0.40-0.71 vs. 0.32-0.68). This finding indicates the robustness and generalisation capability of RK in estimating the PM2.5 concentration.
Using the selected mapping method, the spatial distributions of the hourly PM2.5 concentration based on crowdsourced sampling data and national air quality observations were successfully plotted and compared. The former distribution provides more information about the intra-urban PM2.5 variations than the latter. The nearest-neighbour distances between two crowdsourced sampling sites, which range from 15 to 60 m, enable PM2.5 concentration mapping to attain the hundred-metre-scale level. In the light-polluted period, this phenomenon was more pronounced. These findings not only suggest the value of crowdsourced activities for PM2.5 monitoring on a fine scale but also prompt us to pay more attention to scenarios with low-level air pollution. This outcome is critical to the long-term future of air pollution prevention and control and public health protection in China, since the main emphasis has gradually shifted from the control of heavy pollution to the prevention of exposure risks.
As the crowdsourced PM2.5 concentration maps revealed, areas with a larger number of factories and a high density of roads experienced relatively higher PM2.5 concentrations, while areas with high levels of green vegetation cover had lower PM2.5 concentrations. The relatively high concentration in the northwest corner of the study area, which has few factories, in Period 1 may be attributed to dust deposition from construction activities, promoted by a high RH, in this newly developed zone. This finding suggests that optimising the distribution of land use may improve the air quality to some extent and that strengthening the control of local emissions may be the primary way to reduce pollution in the light-polluted period. As the urban air quality grade has an important effect on the spatial distribution of samples (spatial autocorrelation and heterogeneity), which may also be affected by sample size, the mechanism of this influence is somewhat equivocal and needs further research.

Conclusions
This study presented strategies of method selection for efficient PM2.5 concentration mapping with an increasing number of training sites using crowdsourced monitoring. The results confirmed that PM2.5 concentrations in microenvironments varied across the intra-urban area in China's cities. These variations can be clearly disclosed by crowdsourced PM2.5 sampling rather than by the national air quality monitoring stations. The selection of models for fine-scale PM2.5 concentration mapping should be adjusted with changing sampling and pollution circumstances. Generally, ordinary kriging (OK) interpolation performs best under non-peak traffic conditions in the light-polluted period, while regression kriging (RK) can perform better in the heavy-polluted period and under peak traffic conditions with relatively few sampling sites in the light-polluted period. Additionally, the land use regression (LUR) model demonstrates a limited ability to estimate PM2.5 concentrations at a very fine scale in this study. This method selection strategy provides empirical evidence for the method selection of PM2.5 mapping using crowdsourced monitoring and a promising way to reduce exposure risks for individuals in their daily lives.

a Ui (i=1, 2, …): ith subset of the set of potential PM2.5 sampling sites.
b Ai (i=1, 2, …): ith subset of the union of supporting data.
c X: element belongs to the set.
Table 2. Description of potential predictor variables for LUR. Columns: GIS dataset; Predictor variables; Unit; Buffer size (radius in metres).

Figure 1: Principle and accuracy of measurement instrument. Y and X are laser air quality monitors and national monitoring instruments, respectively. The black dots, blue dots and red dots indicate PM2.5 observations with relative error of <10%, 10%−20%, and >20%, respectively, between two instruments. The black dotted line and red dotted line are the 1:1 line and 1:1.2 line as references.

... RK with an increase in training sites: (a) Period 1; (b) Period 2. The boundaries of the boxes indicate the 75th percentile and 25th percentile (Q3 and Q1, respectively). The line within the box denotes the median (Q2), and the crosses denote the averages. The error bars above and below indicate the highest datum (Q3+1.5IQR, IQR is the interquartile range, IQR=Q3-Q1) and the lowest datum (Q1-1.5IQR), respectively. Dots above and below the error bars indicate the outliers.