https://doi.org/10.5194/amt-2021-64

  30 Mar 2021

Review status: this preprint is currently under review for the journal AMT.

Estimation of PM2.5 Concentration in China Using Linear Hybrid Machine Learning Model

Zhihao Song1, Bin Chen1, Yue Huang1, Li Dong1, and Tingting Yang2
  • 1Atmospheric Science College of Lanzhou University, Lanzhou 730000, China
  • 2Gansu Seed General Station, Lanzhou 730030, China

Abstract. Satellite remote-sensing aerosol optical depth (AOD) and meteorological elements were used to retrieve PM2.5 concentrations in support of more effective air pollution control. This paper proposes a restricted gradient-descent linear hybrid machine learning model (RGD-LHMLM) that integrates a random forest (RF), a gradient boosting regression tree (GBRT), and a deep neural network (DNN) to estimate PM2.5 concentrations in China in 2019. The input data comprised Himawari-8 AOD with high spatiotemporal resolution, ERA-5 meteorological data, and geographic information. In the hybrid model obtained by linear fitting, the DNN received the largest share, with a weight coefficient of 0.62. The R2 values of RF, GBRT, and DNN were 0.79, 0.81, and 0.8, respectively. The generalization ability of the hybrid model was better than that of each sub-model: its R2 reached 0.84, with an RMSE of 12.92 µg/m3 and an MAE of 8.01 µg/m3. For the RGD-LHMLM, R2 was above 0.7 at more than 70 % of the sites, and RMSE and MAE were below 20 µg/m3 and 15 µg/m3, respectively, at more than 70 % of the sites. Because the correlation between the meteorological factors and PM2.5 differs by season, the hybrid model performed best in winter (mean R2 of 0.84) and worst in summer (mean R2 of 0.71). The spatiotemporal distribution of PM2.5 in China was then estimated and analyzed. Pollution was severe in winter, with an average PM2.5 concentration of 62.10 µg/m3, and lighter in summer, with an average PM2.5 concentration of 47.39 µg/m3. The findings also indicate that North China and East China are more polluted than other areas, with an average annual PM2.5 concentration of 82.68 µg/m3. Pollution was relatively low in Inner Mongolia, Qinghai, and Tibet, where average PM2.5 concentrations were below 40 µg/m3.
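
The linear hybrid described in the abstract can be illustrated with a short sketch. The abstract does not state what the "restricted" in restricted gradient descent constrains, so the code below assumes, purely for illustration, that the sub-model weights are non-negative and sum to 1, and it enforces this by projecting onto the probability simplex after each gradient step. The function names (`project_to_simplex`, `fit_hybrid_weights`) are hypothetical and do not come from the preprint.

```python
import numpy as np


def project_to_simplex(w):
    """Euclidean projection of w onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(w)[::-1]                      # sort in descending order
    css = np.cumsum(u)
    k = np.arange(1, len(w) + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(w - theta, 0.0)


def fit_hybrid_weights(preds, y, n_iter=5000):
    """Fit weights for a linear blend of sub-model predictions.

    preds: array of shape (n_samples, n_models), one column per sub-model
           (for example RF, GBRT, DNN).
    y:     array of shape (n_samples,), observed PM2.5.
    Assumption (not from the preprint): weights are constrained to the simplex.
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = preds.shape
    gram = preds.T @ preds / n                # (m, m) second-moment matrix
    b = preds.T @ y / n
    # Step size bounded by 1/L, where L = 2 * lambda_max(gram) <= 2 * trace(gram).
    step = 1.0 / (2.0 * np.trace(gram))
    w = np.full(m, 1.0 / m)                   # start from the plain average
    for _ in range(n_iter):
        grad = 2.0 * (gram @ w - b)           # gradient of the mean squared error
        w = project_to_simplex(w - step * grad)
    return w
```

Given held-out predictions from the three sub-models stacked column-wise, `preds @ fit_hybrid_weights(preds, y)` would give the blended estimate. Under this constraint the blend cannot fit the fitting data worse than the best single sub-model, since each sub-model corresponds to a corner of the simplex.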

Status: open (until 25 May 2021)

Latest update: 19 Apr 2021
Short summary
The results show that the RGD-LHMLM meets its intended target well. The overall inversion accuracy (R2) of the model reaches 0.84, and the RMSE is 12.92 µg/m3. For the model, R2 was above 0.7 at more than 70 % of the sites, and RMSE and MAE were below 20 µg/m3 and 15 µg/m3, respectively. Pollution was severe in winter, with an average PM2.5 concentration of 62.10 µg/m3, and lighter in summer, with an average PM2.5 concentration of 47.39 µg/m3.
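
The accuracy figures quoted above (R2, RMSE, MAE, and the share of sites above a threshold) could be computed along the following lines. This is a minimal sketch: it assumes R2 means the coefficient of determination, whereas the preprint may instead use the squared correlation coefficient, and the function names and the per-site grouping are illustrative rather than taken from the paper.

```python
import numpy as np


def r2_rmse_mae(obs, est):
    """Return (R2, RMSE, MAE) for observed vs. estimated PM2.5."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    resid = obs - est
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                # coefficient of determination (assumed)
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    return float(r2), float(rmse), float(mae)


def fraction_of_sites_above(site_ids, obs, est, r2_threshold=0.7):
    """Share of monitoring sites whose per-site R2 exceeds r2_threshold."""
    site_ids = np.asarray(site_ids)
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    sites = np.unique(site_ids)
    hits = 0
    for s in sites:
        mask = site_ids == s                  # samples belonging to this site
        r2, _, _ = r2_rmse_mae(obs[mask], est[mask])
        hits += int(r2 > r2_threshold)
    return hits / len(sites)
```

RMSE and MAE share the units of the input (µg/m3 here), while R2 is dimensionless.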