Research of Low-cost Air Quality Monitoring Models with Different Machine Learning Algorithms

To improve the performance of the calibration model for air quality monitoring, a low-cost multi-parameter air quality monitoring system (LCS) based on different machine learning algorithms is proposed. The LCS can measure particulate matter (PM2.5 and PM10) and gas pollutants (SO2, NO2, CO and O3) simultaneously. The multi-input multi-output (MIMO) prediction model is developed based on the original signals of the sensors, ambient temperature (T) and relative humidity (RH), and the measurements of the reference instrumentations. The performance of the different algorithms (RF, MLR, KNN, BP, GA-BP) is compared and discussed using parameters such as the determination coefficient R², root mean square error (RMSE), mean square error (MSE) and mean absolute error (MAE). Using these methods, the R² of the algorithms (RF, MLR, KNN, BP, GA-BP) for PM is in the range 0.68-0.99; the RMSE values of PM2.5 and PM10 are within 2.36-18.68 µg m⁻³ and 4.55-45.05 µg m⁻³, respectively; the MAE values of PM2.5 and PM10 are within 1.44-12.80 µg m⁻³ and 3.21-23.20 µg m⁻³, respectively. The R² of the algorithms (RF, MLR, KNN, BP, GA-BP) for the gas pollutants (O3, CO and NO2) is within 0.70-0.99; the RMSE values for these pollutants are 4.05


Introduction
Development, along with population growth and urbanization, brings disadvantages such as decreasing air quality and impacts on public and individual health (Khreis et al., 2022; Manisalidis et al., 2020; Singh et al., 2021). Among the atmospheric pollutants, the primary pollutant is fine particulate matter, which affects the human respiratory system and cardiac activity. The secondary pollutants are SO2, CO, NOx, and O3, which also induce disease or chronic poisoning. To improve the understanding of air pollution exposure and predict future air quality trends (Zimmerman et al., 2018), air quality assessment and forecasting are essential. Conventional air quality monitoring instruments are expensive, which has limited the spatial coverage of monitoring stations (Zimmerman et al., 2018). The development and application of low-cost, commercially available sensor-based air quality monitoring systems (LCSs) would considerably reduce both installation and maintenance costs (Spinelle et al., 2017). A denser air quality grid monitoring network then becomes possible, which would play an important role in monitoring pollution trends, locating pollution sources, supporting environmental management (Zhao et al., 2019), and supporting better epidemiological models (Khreis et al., 2022; Zimmerman et al., 2018). These demands have promoted the gradual growth of LCSs (Cui et al., 2021; Wang et al., 2016).
Published by Copernicus Publications on behalf of the European Geosciences Union.
G. Wang et al.: Research of models with different machine learning algorithms
The LCS typically utilizes electrochemical or light-scattering sensors to measure gas-phase or particulate pollutants, such as sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), ozone (O3), and particulate matter (PM). These electrochemical sensors have intrinsic problems, such as temperature or humidity impacts and gaseous cross-sensitivities (Spinelle et al., 2015, 2017; Jiao et al., 2016; Zimmerman et al., 2018). For example, limited by its poor selectivity, the NO2 electrochemical sensor also undergoes redox reactions in the presence of O3. The diffusion coefficient of the electrochemical sensor can be affected by temperature and relative humidity (Hitchman et al., 1997; Masson et al., 2015). The reagent of the electrochemical sensor is consumed over time, which affects the stability of the sensor. These features of the sensors have historically been poorly addressed by laboratory calibrations, limiting their utility for air quality monitoring (Zimmerman et al., 2018).
De-convolving the effects of cross-sensitivity and stability on sensor performance is complex (Zimmerman et al., 2018). Linear or multivariate linear calibration models (Alexopoulos, 2010; Khreis et al., 2022; Zoest et al., 2019) have been developed. However, their performance is poor on ambient data (Khreis et al., 2022). Accurate and precise calibration models for low-cost sensors are particularly critical to the success of dense sensor networks, as poor signal-to-noise ratios and cross-sensitivities hamper their ability to distinguish the pollutant concentrations. There has been increasing interest in a variety of algorithms for low-cost sensor calibration, and many studies using multi-input multi-output models (Alexopoulos, 2010) and neural networks (Spinelle et al., 2015) have been published. The artificial neural network (ANN) calibration model can process nonlinear data (Amuthadevi et al., 2021; Janabi et al., 2021) and has been used in calibration models for measuring ozone or nitrogen oxide (Esposito et al., 2016; Spinelle et al., 2015). For example, the ANN calibration model was used to calibrate O3, and the uncertainty could meet the European data quality objectives; however, meeting these objectives for NO2 remained a challenge (Spinelle et al., 2015). Dynamic neural network calibrations of NO2 sensors were demonstrated with a mean absolute error of less than 2 ppb; however, the performance for O3 was not as good (Esposito et al., 2016). A high-dimensional multi-response model was used to calibrate CO, NO, NO2, and O3, with 5 min average RMSE values of 39.2, 4.52, 4.56, and 9.71, respectively (Cross et al., 2017). A random-forest-based machine learning algorithm was used to improve the calibration strategies of low-cost sensors, with mean absolute error values of 38 ppb for CO, 10 ppm for CO2, 3.5 ppb for NO2, and 3.4 ppb for O3 (Zimmerman et al., 2018). Furthermore, multiple-linear-regression-based (Ionascu et al., 2021) temperature and humidity correction and ANN-based calibration have shown potential for significant further improvement under leave-one-out cross-validation (Ali et al., 2021).
With a 16 d process, the combined supervision calibration model was used to improve the R² of SO2, NO2, and O3 by 75.8 %, 38.6 %, and 4.7 % to 0.58, 0.61, and 0.90, respectively (Cui et al., 2021). An integrated genetic programming dynamic neural network model was used to accurately estimate the carbon monoxide and nitrogen dioxide concentrations from multi-sensor measurement data (Ari and Alagoz, 2022). A predictive model using multilayer perceptron, support vector regression, and linear regression was developed to analyze CO2 and in-vehicle particulate matter, with an R² of 0.9981 (Goh et al., 2021). The convolutional neural network (CNN), long short-term memory convolutional neural network (LSTM-CNN), and CNN-LSTM models were used to improve the prediction performance for ozone by 3.58 %, 1.68 %, and 3.37 %, respectively (Rezaei et al., 2023). However, these calibrations have only been tested with few models, a short measurement period, and a small number of sensor matrices, each containing one sensor per pollutant (Cross et al., 2017; Esposito et al., 2016; Spinelle et al., 2015); they have not been used to evaluate and predict the concentrations of multiple pollutants simultaneously, such as PM2.5, PM10, SO2, NO2, CO, and O3.
The random forest (RF) (Breiman, 2001; Liu et al., 2012), multivariate linear regression (MLR) (Alexopoulos, 2010), K-nearest neighbor (KNN) (Zhao and Lai, 2021), back propagation (BP) neural network (Xu et al., 2021), and genetic-algorithm-back-propagation (GA-BP) neural network (Ning et al., 2019; Wang et al., 2019) are five commonly used machine learning algorithms with different characteristics. With their strong nonlinear mapping and adaptive abilities, RF, BP, and GA-BP are suitable for processing complex, high-dimensional, and nonlinear data with high prediction accuracy, as in air quality monitoring. Because it quantifies the degree of influence of each independent variable, MLR is suitable for evaluating the influence of multiple independent variables on the dependent variable, such as the cross-sensitivity effect between different factors. KNN is also a widely used algorithm and is included for comparison with RF, BP, GA-BP, and MLR.
In this work, the LCS is developed to measure PM2.5, PM10, SO2, NO2, CO, and O3 simultaneously, and the performances of the calibration strategies based on the five machine learning algorithms are contrasted. Taking the original electronic signals of the sensors as input and the measurements obtained by the reference instrumentations as output, five calibration strategies are applied and contrasted. The measurement is implemented under real-world conditions over an almost 12-month period (1 March 2021 to 28 February 2022) spanning multiple seasons and a wide range of meteorological conditions to ensure calibration model robustness. The performance of the different algorithms is compared and discussed using parameters such as the determination coefficient (R²), root mean square error (RMSE) (Janabi et al., 2021), and mean absolute error (MAE). The rest of this paper is organized as follows. The measurement setup is described in Sect. 2. The principles of the calibration strategies are presented in Sect. 3. The results and discussion are shown in Sect. 4. The conclusion and discussion are drawn in Sect. 5.

Measurement setup
This section describes the measurement site and data collection, the schematic block of the LCS, and the reference instrumentation. Low cost is defined here as below USD 150 per pollutant, with commercial availability and low maintenance. The sensors typically utilize electrochemical signals and scattered light intensity for gas-phase pollutant (SO2, NO2, CO and O3) and particle pollutant (PM2.5, PM10) measurements, respectively.

Measurement site and data collection
Measurements of gas-phase pollutants and particle pollutants were made continuously between 1 March 2021 and 28 February 2022, which were used as the start and end dates for the analyses. The location, shown in Fig. 1, was 30 Yaochang Street, Zhongyuan District, Zhengzhou, Henan Province, China. There was an independent reference monitoring system for PM2.5, PM10, CO, SO2, NO2, and O3 measurement. The LCS was mounted at the same height as the reference monitoring system. One set of data collection took 1 min and was repeated four times. Outliers among the four sets of data were eliminated using the Dixon principle. The remaining data were averaged to obtain the mean value for each experiment. The values of the LCS and reference instruments were separately logged to the server at an interval of 5 min. During the measurement period, the ambient temperature and relative humidity ranged from −5 to +50 °C and from 10 % to 98 %, respectively.
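The outlier rejection and averaging of the four repeated readings can be sketched as follows. This is a minimal illustration, assuming that the Dixon principle refers to Dixon's Q test and that the commonly tabulated two-sided 95 % critical values apply; the source states neither the confidence level nor the exact variant, and the function name `dixon_filtered_mean` is hypothetical.

```python
# Commonly tabulated two-sided critical Q values at 95 % confidence.
# ASSUMPTION: the confidence level actually used in the study is not stated.
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710}


def dixon_filtered_mean(readings):
    """Drop at most one extreme reading via Dixon's Q test, then average."""
    values = sorted(readings)
    span = values[-1] - values[0]
    # Nothing to test if all readings coincide or n is outside the table.
    if span == 0 or len(values) not in Q_CRIT_95:
        return sum(values) / len(values)
    q_low = (values[1] - values[0]) / span      # gap at the low end
    q_high = (values[-1] - values[-2]) / span   # gap at the high end
    if max(q_low, q_high) > Q_CRIT_95[len(values)]:
        # Remove whichever end is the more suspicious one.
        values = values[1:] if q_low > q_high else values[:-1]
    return sum(values) / len(values)
```

For four readings such as 10.0, 10.2, 10.1 and 15.0, the last value is rejected and the mean of the remaining three is returned.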

Schematic block of LCS
In this study, the LCS is developed by Hanwei Electronics Group Corporation, and its schematic block diagram is shown in Fig. 2. The LCS uses a commercially available particulate matter sensor (PM3006, Cubic Sensor and Instrument Co., China) and electrochemical SO2, NO2, O3, and CO sensors (B4, Alphasense, UK). The particulate matter sensor device is a laser-diode (LD)-based particle sensor, using a spectrophotometer to measure the intensity of light scattered by the particles. The PM sensor device (PM3006) can measure size-dependent PM2.5 and PM10 concentrations for particles in the size range of 0.3 to 10 µm. The gas pollution (SO2/NO2/O3/CO) sensors have four electrodes (i.e., reference, working, counter, and auxiliary electrodes), where the auxiliary electrode is not exposed to the target analyte in order to account for changes in the sensor baseline signal under different meteorological conditions (Mead et al., 2013).
The electrochemical sensor outputs are measured using electronic circuitry designed by Hanwei and optimized for signal stability. The circuitry is developed with custom electronics to drive the device, multiple stages of filtering circuitry for specific noise signatures, and an analog-to-digital converter for measurement of the conditioned signal.
Due to the redox reaction on the anode and the cathode of the electrochemical sensor, the movement of charge between the electrodes produces a current proportional to the analyte reaction rate, which can be used to determine the analyte concentration (Mead et al., 2013) and whether the sensor is working effectively.
Before the sensors are installed into the LCS, calibrated with the different models, and used in real-world conditions, their performance should be checked in the laboratory. The linearity of the gas sensors was tested under steadily increasing concentrations, from 0-5 mg m⁻³ for the CO sensor, 0-0.2 mg m⁻³ for NO2, 0-1.1 mg m⁻³ for O3, and 0-1.4 mg m⁻³ for SO2, with five or more test points, as shown in Fig. 3. Since the units of the outputs of the reference instruments and the sensors were different, the slope was not expected to be 1 (Cui et al., 2021). As shown in Fig. 3, the R² for the gas sensors is more than 0.93, which indicates that these gas sensors have good linear responses before calibration and verifies that the sensors work properly and can be applied to the LCS.
However, even with an auxiliary electrode, electrochemical sensors may insufficiently account for the impacts of temperature and relative humidity. With standard gases passed through the test chamber and the concentrations stabilized at 27 ppb for SO2, 3.9 ppb for NO2, 13 ppb for O3, and 1.22 ppm for CO, the output voltages of the four types of gas sensors fluctuate nonlinearly with linearly increasing temperature and relative humidity (RH) (Cui et al., 2021). To eliminate the influence of the external environment on the sensors as much as possible, the particles flow through a sampling cutter and a heat-tracing pipeline to the particulate matter sensor, and the gaseous pollutants are pumped to the electrochemical sensors, which are secured in a thermo-tank. The temperatures of the heat-tracing pipeline and thermo-tank are maintained at 60 ± 2 °C to reduce the influence of relative humidity and at 25 ± 2 °C (Wei et al., 2018) to keep the sensors operating at a stable temperature, respectively.
The measurement results of the particulate matter sensor and gas pollution sensors, transmitted to the system control module through the data buses, are directly displayed on the local display module and wirelessly transmitted to the corresponding online server through the transmission module. As the uni-variate linear models do not incorporate any cross-sensitivities to other pollutants or any nonlinearities in the response, we attempt to use the sensor electronic results as the input and the reference measurements as the output to build multi-dimensional multi-response prediction models that de-convolve the effects of cross-sensitivity and stability on sensor performance, utilizing MLR, RF, KNN, BP, and GA-BP calibration models.
https://doi.org/10.5194/amt-17-181-2024 Atmos. Meas. Tech., 17, 181-196, 2024

Reference instrumentation
In order to reduce the adsorption effect on particulate matter and gaseous pollutants, the reference measurements are made on ambient air continuously drawn through Teflon fluorinated ethylene propylene (FEP) (Wei et al., 2018) tubing with a six-port stainless-steel manifold for flow distribution to the gas analyzers and particulate monitors (Mead et al., 2013). It should be pointed out that the LCS was mounted at the same height as the reference monitoring system during the measurement period.
The reference ambient particulate monitor 5014i, which uses beta attenuation of the ambient particulate deposited onto a filter tape, is applied to measure the mass concentration of suspended and refined particulates. The reference NO-NO2-NOx monitor 42i, which relies on the linear proportionality of the chemiluminescence reaction of NO and O3 after NO2 is transformed into NO, is utilized to measure the NO2 concentration. The SO2 reference analyzer is the 43i, which uses the intensity of the ultraviolet light emitted as the excited SO2 molecules decay to lower energy states; this intensity is proportional to the SO2 concentration. The CO reference monitor is the 48i, which utilizes the principle that CO absorbs infrared radiation at a wavelength of 4.6 µm, and the infrared absorption can be transformed to be proportional to the CO concentration. The 49i O3 analyzer operates on the principle that O3 molecules absorb UV light at a wavelength of 254 nm, and the absorption intensity of the UV light is directly related to the ozone concentration. All these reference monitors are produced by Thermo

Principles
This section describes the principles of the calibration methods (MLR, BP, GA-BP, KNN, and RF) and the metrics for performance evaluation. The calibration models are constructed with the sensors' (i.e., PM2.5, PM10, CO, SO2, NO2, and O3 sensors) electronic results as the input and the reference measurements as the output.

Multiple linear regression model
After the data are collected by the LCS, the raw data should be preprocessed. The PM3006 particulate matter sensor can output six kinds of particle range (i.e., > 0.3, > 0.5, > 1.0, > 2.5, > 5.0 and > 10 µm, respectively). By subtracting the six particle range values in turn, the individual particle counters are obtained and expressed as x0.5, x1.0, x2.5, x5.0, and x10.0 (listed in Table 1). The measured particle number concentration is converted to PM mass concentrations in the PM2.5 and PM10 size fractions.

Table 1. Size range of the particulate matter sensor. The sensor can measure particles with the size range of 0.3-0.5, 0.5-1.0, 1.0-2.5, 2.5-5.0, and 5.0-10 µm simultaneously. The corresponding particle counters are expressed as x0.5, x1.0, x2.5, x5.0, and x10.0, respectively.
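The subtraction of adjacent particle ranges can be illustrated with a short sketch. It assumes that the six sensor outputs are cumulative counts ordered from > 0.3 µm to > 10 µm; the function name `cumulative_to_bins` is hypothetical and the numbers in the usage note are made up for illustration.

```python
def cumulative_to_bins(cumulative):
    """Convert cumulative counts into per-size-bin counts.

    cumulative: counts for > 0.3, > 0.5, > 1.0, > 2.5, > 5.0 and > 10 um,
    in that order (ASSUMED ordering).
    Returns [x0.5, x1.0, x2.5, x5.0, x10.0], i.e. the counts in the
    0.3-0.5, 0.5-1.0, 1.0-2.5, 2.5-5.0 and 5.0-10 um bins.
    """
    return [wider - narrower
            for wider, narrower in zip(cumulative, cumulative[1:])]
```

With illustrative counts of 1000, 600, 300, 100, 20 and 5, the five bin counters are 400, 300, 200, 80 and 15.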
Taking the particle counters listed in Table 1 as input and the concentrations Y_PM2.5 and Y_PM10 of PM2.5 and PM10 measured by the 5014i as output, the multivariate linear regression (MLR) models (Alexopoulos, 2010; Zoest et al., 2019) are built. Due to the previously established influence of ambient temperature (T) and relative humidity (RH) on sensor response (Masson et al., 2015; Jiao et al., 2016), the particle counter terms are pretreated and treated as independent of each other. The multi-input one-response preprocessing and prediction model for obtaining the Y_PM2.5 concentration can be written as Eq. (1).
To obtain the concentration $Y_{\mathrm{PM10}}$, the multi-input one-response preprocessing and prediction model can be written as Eq. (2).

$$Y_{\mathrm{PM10}} = W_{\mathrm{PM10}}\, X_{\mathrm{PM10}}^{\mathsf{T}} + b_{\mathrm{PM10}} \qquad (2)$$

where $W_{\mathrm{PM10}} = [w_{1\_\mathrm{PM10}}, w_{2\_\mathrm{PM10}}, w_{3\_\mathrm{PM10}}, w_{4\_\mathrm{PM10}}, w_{5\_\mathrm{PM10}}, w_{6\_\mathrm{PM10}}, w_{7\_\mathrm{PM10}}]$ denotes the corresponding weight coefficients; $X_{\mathrm{PM10}} = [x_{0.5}, x_{1.0}, x_{2.5}, x_{5.0}, x_{10.0}, T, \mathrm{RH}]$ represents the individual particle counters and the readings of the temperature and humidity sensors; and $b_{\mathrm{PM10}}$ is the intercept value of the model. Due to the poor selectivity and cross interference of the electrochemical sensor response, the output values from the sensors and the concentrations of the target pollutants, such as O3, NO2, and SO2, measured by the reference monitor are used to build the MLR model. CO is also one of the criteria pollutants that must be measured in China. Thus, the multi-dimensional multi-response preprocessing and prediction model for the four types of gas pollution, T, and RH can be written as Eq. (3).
Equation (3) can be simplified as Eq. (4), whose terms are the corresponding weight coefficients, the converter output values of the sensors through the electronic circuitry, and the intercept value of the model. Hereto, the MLR models for the gas sensors and the PM sensor are developed separately. The training data are used to calculate the model regression coefficients and intercept values, and the withheld testing data are utilized to evaluate the model performance.
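As an illustration of how the regression coefficients and intercept of such a linear model can be estimated from training data, the following sketch fits y = w · x + b by ordinary least squares via the normal equations. It is a generic example rather than the implementation used in this study; the function names are hypothetical, and a multi-response model is assumed to be obtained by fitting one such model per pollutant.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_mlr(X, y):
    """Least-squares fit of y = w . x + b; returns (w, b)."""
    A = [list(row) + [1.0] for row in X]  # extra column for the intercept
    p = len(A[0])
    AtA = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    Aty = [sum(r[i] * t for r, t in zip(A, y)) for i in range(p)]
    coef = solve_linear_system(AtA, Aty)
    return coef[:-1], coef[-1]


def predict_mlr(w, b, x):
    """Evaluate the fitted linear model for one input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b
```

In practice each row of `X` would hold the sensor signals together with T and RH, and `y` the corresponding reference concentration; the withheld test rows are then passed to `predict_mlr`.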

BP neural network model
The BP neural network algorithm is one of the most widely used ANN models. It is a multi-layer feed-forward network trained through an error back propagation algorithm by constantly adjusting the weights and intercepts of the network. The feed-forward topological structure of the BP neural network model, shown in Fig. 4, includes the input layer, hidden layer, and output layer. To avoid the numerical problems caused by extreme values, eliminate misleading effects on feature extraction, and obtain an accurate estimation of pollutant concentrations (Janabi et al., 2021), the collected input sensor data X_I and output data Y_O should each be normalized with min-max normalization to limit the values in each dimension to between 0 and 1 (Bakiler and Guney, 2021).
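Min-max normalization of each input or output dimension can be sketched as below. The helper names are hypothetical, and the handling of a constant column (mapped to 0) is an assumption added only to avoid division by zero.

```python
def min_max_fit(rows):
    """Return the per-column minima and maxima of a list of samples."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_transform(rows, lo, hi):
    """Scale every column to [0, 1] using previously fitted bounds."""
    return [[(v - l) / (h - l) if h > l else 0.0   # constant column -> 0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


def min_max_inverse(rows, lo, hi):
    """Map normalized values back to the original units."""
    return [[v * (h - l) + l for v, l, h in zip(row, lo, hi)]
            for row in rows]
```

The bounds would normally be fitted on the training data only and then reused for the test data and for converting model outputs back to concentration units.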
After the normalization process, the BP network can be established. To optimize the parameters of the network, the number of hidden layers, the transfer functions of the layers, and the end conditions should be determined. If the parameters are inappropriate, the BP model will be overtrained or insufficiently trained. In this study, a shallow structure with a single hidden layer is chosen, as extensive testing did not show any noticeable improvement in calibration performance with deeper structures consisting of multiple hidden layers (Ali et al., 2021). This also reduces the complexity and the training time.
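A single-hidden-layer network trained by error back propagation can be sketched in a few lines. The example below is illustrative only: the tanh hidden activation, linear output layer, per-sample gradient descent, learning rate, and layer size are assumptions chosen for the sketch, and the function names are hypothetical.

```python
import math
import random


def forward(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    out = [sum(w * hj for w, hj in zip(row, h)) + b
           for row, b in zip(W2, b2)]
    return h, out


def train_bp(X, Y, n_hidden=8, lr=0.05, epochs=200, seed=0):
    """Train with per-sample gradient descent; returns (params, mse_history)."""
    rng = random.Random(seed)
    n_in, n_out = len(X[0]), len(Y[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    params = (W1, b1, W2, b2)
    mse_history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(X, Y):
            h, out = forward(params, x)
            err = [o - t for o, t in zip(out, y)]
            total += sum(e * e for e in err)
            # Hidden-layer deltas are computed before W2 is updated.
            delta_h = [(1.0 - hj * hj) * sum(err[k] * W2[k][j] for k in range(n_out))
                       for j, hj in enumerate(h)]
            for k in range(n_out):
                for j in range(n_hidden):
                    W2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h[j] * x[i]
                b1[j] -= lr * delta_h[j]
        mse_history.append(total / (len(X) * n_out))
    return params, mse_history
```

The returned `mse_history` gives the mean squared error accumulated during each epoch, which is the kind of curve that can be inspected to choose a stopping point.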

Genetic algorithm-BP model
In the traditional BP neural network, the initial weights and thresholds are randomly generated. The results often fall into a local minimum rather than a global minimum, which distorts the prediction result. In addition, the convergence speed of the BP neural network is usually slow. To solve these problems, the genetic algorithm (GA) (Liang et al., 2018) is combined with the BP algorithm to avoid the inherent defects of the BP algorithm. The GA method is essentially a direct search method that does not rely on specific problems or gradient information. It follows the survival and elimination rule of biological evolution, generates new hypotheses by mutating and recombining the best existing hypotheses, and thereby makes it possible to solve the problem (Ning et al., 2019). Generally, the GA is used to find optimal initial weights and threshold values for the model, so that the model can converge in the direction of a minimum value (Wang et al., 2019). The GA-BP hybrid algorithm is used to reduce the time the BP neural network needs to adjust the weights and thresholds itself and thus to improve work efficiency.
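The search for good initial weights can be illustrated with a small elitist genetic algorithm that minimizes an arbitrary fitness function over a real-valued vector. This is a generic sketch and not the GA configuration used in this study: the population size, selection scheme, one-point crossover, and Gaussian mutation are all assumptions, and `genetic_search` is a hypothetical name. For GA-BP, the fitness would be the network error obtained with the candidate vector used as the flattened initial weights and thresholds.

```python
import random


def genetic_search(fitness, n_genes, pop_size=30, generations=40,
                   mutation_rate=0.1, seed=0):
    """Minimize `fitness` over real vectors; returns (best, best_history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=fitness)                  # lower fitness is better
        best_history.append(fitness(pop[0]))
        elite = pop[:pop_size // 2]            # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = p1[:cut] + p2[cut:]        # one-point crossover
            child = [g + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else g
                     for g in child]           # Gaussian mutation
            children.append(child)
        pop = elite + children
    best = min(pop, key=fitness)
    best_history.append(fitness(best))
    return best, best_history
```

Because the better half of the population is carried over unchanged, the best fitness value never gets worse from one generation to the next; the returned vector would then serve as the starting point for back propagation.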

K-nearest neighbor model
The k-nearest neighbor (KNN) method is one of the simplest methods for classification as well as regression problems (Kumar, 2015; Zhao and Lai, 2021). KNN is a supervised method that bases its estimates on the values of neighbors and can automatically adapt to supervised learning problems with arbitrary Bayes decision boundaries (Zhao and Lai, 2021). From the supervised data set, the KNN solution uses the values of the given dependent variables y_i to approximate the dependent variable y*, choosing those whose corresponding model parameters are close in terms of distance. For the regression problem, the mean of the observed labels of the k nearest neighbors of the independent variable X is assigned as the predicted label. In this study, k is set to 10, as the performance showed no obvious difference for other values.
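KNN regression as described above amounts to averaging the labels of the k closest training samples. The sketch below assumes Euclidean distance and uniform weighting, neither of which is specified in the text; `knn_predict` is a hypothetical name.

```python
import math


def knn_predict(X_train, y_train, x_query, k=10):
    """Predict by averaging the labels of the k nearest training samples."""
    # Pair every training label with its Euclidean distance to the query.
    by_distance = sorted((math.dist(x, x_query), y)
                         for x, y in zip(X_train, y_train))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

The default `k=10` follows the value used in this study; inputs would normally be normalized first so that no single sensor channel dominates the distance.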

Random forest model
The random forest (RF) model is used for solving regression or classification problems (Breiman, 2001; Liu et al., 2012). It works by constructing an ensemble of decision trees using a training data set; the mean value from that ensemble of decision trees is then used to predict the value for new input data (Zimmerman et al., 2018). To establish an RF model, the maximum number of decision trees in the forest should be specified. Each tree is constructed using a bootstrapped random sample from the training data set. By considering a random subset of the possible explanatory variables and selecting the strongest predictor of the response, the origin node of the decision tree is split into sub-nodes. The node-splitting process is repeated until a terminal node is reached. The terminal node can be specified using the maximum number of sub-nodes or the minimum number of data points in the node. To illustrate the method, consider building a random forest model for one LCS using a single decision tree and a subset of 20 490 data points to build a calibration model, shown in Fig. 5. The RF model can predict data with variable parameters within the training range. Therefore, a larger and more variable training data set should create a better final model. To avoid missing any spikes during the training window, a 5-fold cross-validation approach (Zimmerman et al., 2018) is also used to maximize utilization of the training data set. This approach helps to minimize bias in training data selection when predicting new data and ensures that every point in the training window is used to build the model.
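The construction described above (bootstrap sampling, random feature subsets, recursive node splitting, and averaging over trees) can be sketched compactly. This is a simplified teaching example, not the implementation used in this study: the split criterion (sum of squared errors), the feature-subset size of one third, and the stopping rule by minimum node size are assumptions, and all function names are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def _sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, rng, n_try, min_leaf):
    """Recursively split until a node is small or pure; leaves store means."""
    if len(y) <= min_leaf or min(y) == max(y):
        return _mean(y)                               # terminal node
    best = None
    for f in rng.sample(range(len(X[0])), n_try):     # random feature subset
        for thr in sorted({row[f] for row in X})[1:]:
            left = [t for row, t in zip(X, y) if row[f] < thr]
            right = [t for row, t in zip(X, y) if row[f] >= thr]
            score = _sse(left) + _sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr)
    if best is None:                                  # sampled features are constant
        return _mean(y)
    _, f, thr = best
    li = [i for i, row in enumerate(X) if row[f] < thr]
    ri = [i for i, row in enumerate(X) if row[f] >= thr]
    return (f, thr,
            build_tree([X[i] for i in li], [y[i] for i in li], rng, n_try, min_leaf),
            build_tree([X[i] for i in ri], [y[i] for i in ri], rng, n_try, min_leaf))


def predict_tree(node, x):
    while isinstance(node, tuple):                    # descend to a leaf
        f, thr, left, right = node
        node = left if x[f] < thr else right
    return node


def fit_forest(X, y, n_trees=20, min_leaf=5, seed=0):
    rng = random.Random(seed)
    n, n_try = len(X), max(1, len(X[0]) // 3)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]    # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, n_try, min_leaf))
    return forest


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return _mean([predict_tree(tree, x) for tree in forest])
```

Since every leaf stores the mean of training targets, the forest cannot predict outside the range of values seen during training, which is why a larger and more variable training set is expected to help.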

Metrics for performance evaluation
To quantitatively compare the performances of the five calibration models applied to the LCS, and to balance the disadvantages of the different metrics, the determination coefficient (R²), root mean square error (RMSE) (Janabi et al., 2021), and mean absolute error (MAE) are utilized. The R² reflects the degree of fit between the model output data and the reference monitor measurement. The measurement results should meet the requirements of the environmental standards of China (Jiao et al., 2016). The RMSE measures how much error there is between the predicted values and the reference measurements and is sensitive to extreme values (Chai and Draxler, 2014). The MAE is a good choice to evaluate the error when the distribution is not Gaussian (Rezaei et al., 2023). The formulas for the evaluation metrics are presented as Eqs. (5)-(7), respectively.
where ŷ_i, y_i, and ȳ represent the ith model output data from the algorithm-based LCS system, the reference data from the reference instrumentations, and the mean value of the reference instrumentations, respectively, and n is the number of measurement data in the data set.
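The three metrics can be computed as in the following sketch. It uses the standard textbook definitions, which are assumed here to correspond to Eqs. (5)-(7); in particular, R² is taken as one minus the ratio of the residual to the total sum of squares.

```python
import math


def r2_score(y_ref, y_pred):
    """R^2 = 1 - sum((y_hat_i - y_i)^2) / sum((y_i - y_mean)^2)."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((p - r) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


def rmse(y_ref, y_pred):
    """RMSE = sqrt( (1/n) * sum((y_hat_i - y_i)^2) )."""
    n = len(y_ref)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(y_ref, y_pred)) / n)


def mae(y_ref, y_pred):
    """MAE = (1/n) * sum(|y_hat_i - y_i|)."""
    n = len(y_ref)
    return sum(abs(p - r) for r, p in zip(y_ref, y_pred)) / n
```

Because the RMSE squares each error before averaging, a single large deviation raises it more than it raises the MAE, which is why the two are reported together.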

Results and discussion
Following the model building, the goodness of regression and root mean square error between the model output concentrations and the reference monitor concentrations are evaluated for all calibration model approaches. The plots for PM2.5, PM10, O3, CO, NO2, and SO2 illustrating the time series and goodness of fit of the models are provided in Figs. 6-15. The R² and RMSE values are listed in Tables 2-7.

Parameters of the model
For the BP and GA-BP models, the parameters are the functions for the hidden layer and output layer, the type of the hidden layer, the number of iterations, and the number of nerve units (Xu et al., 2021). The functions for the hidden layer and output layer in this study are the default tansig and purelin functions, respectively. Because more complex hidden-layer types brought no obvious improvement, a single hidden layer is used for the sake of work efficiency.
To determine the best number of iterations and nerve units, the measurements from the LCS and reference monitor between 1 March and 30 June 2021 are used. The number of iterations is optimized using the mean squared error (MSE) between the model output value and the reference monitor output value. The tendency of the MSE is shown in Fig. 6. It is observed that the MSE decreases as the number of iterations increases; the rate of decrease and the variation of the MSE are negligible beyond 100 iterations. More iterations incur a higher computational cost for training with only a small performance improvement. There is also the risk of overtraining, resulting in poor generalization capability. Using this method, the same number of iterations is obtained for the different gas pollutants between 1 July and 31 October 2021 as well as between 1 November 2021 and 28 February 2022.
The number of nerve units is determined by comparing the determination coefficient R² for the different gas and PM pollutants between 1 March and 30 June 2021. The results are shown in Fig. 7. The R² improves as the number of nerve units increases. The rate of increase and the variation of R² are negligible beyond 70 units. More units incur a higher computational cost and training time with only a small performance improvement. Using this method, the same number of nerve units is obtained for the different gas pollutants between 1 July and 31 October 2021 as well as between 1 November 2021 and 28 February 2022.
For the RF model, the number of trees is determined using the grid search method, which searches for the optimal hyperparameter by traversing a given hyperparameter combination (Zhu et al., 2022). A total of 11 tree numbers between 2 and 22 are set, and grid search traverses these 11 values to obtain the corresponding R². For instance, the R² values for the different gas pollutants between 1 March and 30 June 2021 are shown in Fig. 8. The R² improves as the number of trees increases. The rate of increase and the variation of R² are negligible beyond 20 trees. The terminal node is specified using a maximum number of sub-node points per node. The R² also improves as the number of sub-nodes increases for the same tree number. The rate of increase and the variation of R² are negligible beyond 100. A higher number of trees or sub-nodes incurs a higher computational cost and training time with only a small performance improvement. Using this method, the same number of trees is obtained for the different gas pollutants between 1 July and 31 October 2021 as well as between 1 November 2021 and 28 February 2022.

Measurement results of PM
To avoid over-fitting in the five models, the random partition parameters of train ratio and test ratio are 80 % and 20 %, respectively. To ensure the robustness of the model evaluation, 5-fold cross-validation is also conducted. The data set is divided into five mutually exclusive subsets of the same size; each time, four subsets are randomly selected as the training set, and the remaining subset is used as the test set. After completing each round of validation, four subsets are selected again to train the model and the remaining subset is used for validation. After several rounds (less than five), the loss function is used to select the optimal model and parameters (Mahesh et al., 2023; Zimmerman et al., 2018).
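The partition into five mutually exclusive subsets can be sketched as an index generator. The shuffling seed and the function name `k_fold_indices` are assumptions made for illustration; each fold holds roughly 20 % of the samples, which matches the 80 %/20 % train/test ratio.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal subsets
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so every point in the training window contributes to both model building and evaluation over the full set of rounds.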
Using the results from 1 March 2021 to 28 February 2022 and according to the trend of the ambient temperature, the total data are divided into three segments. The three segments (I, II, and III) cover 1 March to 30 June 2021 with a size of 32 481, 1 July to 31 October 2021 with a size of 31 287, and 1 November 2021 to 28 February 2022 with a size of 32 053, respectively.
During the measurement period, the ambient temperature and relative humidity ranged from −5 to +50 °C and from 10 % to 98 %, respectively, shown in [...]

Table 2. Performance of different calibration models for PM2.5 and PM10 against the reference monitor. The determination coefficient R² (higher is better, maximum of 1) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.

[...] performance, followed by KNN, BP, and GA-BP, with MLR having the worst.
The R² values between the reference data and the five model data are also shown in Figs. 10b and 11b and listed in Table 2. The R² of RF for PM is better than 0.98. The R² of MLR for PM is less than 0.91 and in some cases even less than 0.7. The R² values of the other three models are between 0.86 and 0.98.
The performance of the different calibration models for the PM against the reference monitor is also evaluated using RMSE and MAE. The results are listed in Tables 3 and 4, respectively. According to Table 3, the RMSE values from the first (I) and third (III) periods are larger than those from the second (II) period. The main reason may be the large fluctuation range of the PM caused by the climatic factors in winter and spring, resulting in a poorer model fit. The RMSE values of PM2.5 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 2.36–5.49, 12.63–18.68, 5.67–13.05, 6.56–14.35, and 6.61–14.35 µg m−3, respectively. The RMSE values of PM10 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 4.55–10.37, 16.43–45.05, 11.14–27.08, 12.15–23.10, and 11.99–23.65 µg m−3, respectively.
According to Table 4, the MAE values show the same characteristics as the RMSE values. The MAE values of PM2.5 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 1.44–3.45, 8.37–12.80, 3.56–8.31, 4.46–9.55, and 4.48–9.54 µg m−3, respectively. The MAE values of PM10 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 3.21–5.28, 12.21–23.20, 8.00–13.35, 8.99–15.26, and 8.89–15.43 µg m−3, respectively.
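For reference, the three evaluation metrics can be computed as in the sketch below. It assumes the common definition R² = 1 − SS_res/SS_tot; this section does not state which definition the authors used, and a squared Pearson correlation would give different values. The sample values are toy numbers, not measurements.

```python
import math
from typing import Sequence


def _check(reference: Sequence[float], predicted: Sequence[float]) -> None:
    if len(reference) == 0 or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between reference and calibrated values."""
    _check(reference, predicted)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted))
                     / len(reference))


def mae(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between reference and calibrated values."""
    _check(reference, predicted)
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def r2(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    _check(reference, predicted)
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
ref = [10.0, 20.0, 30.0, 40.0]
cal = [12.0, 18.0, 33.0, 39.0]
print(rmse(ref, cal), mae(ref, cal), r2(ref, cal))
```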

Measurement results of gas pollution
With the results from 1 March 2021 to 28 February 2022 and according to the trend of the ambient temperature shown in Fig. 9, the gas pollutant data are also analyzed in three segments.
For the O3 model, the R² of RF is better than 0.98. The R² of MLR is less than 0.90 and in some cases less than 0.8. The R² values of the other three models are between 0.82 and 0.97. For the CO model, the R² of RF is better than 0.97. The R² of MLR is less than 0.81 and in some cases less than 0.7. The R² values of the other three models are between 0.81 and 0.94. For the NO2 model, the R² of RF is better than 0.96. The R² of MLR is less than 0.60 and in some cases less than 0.5. The R² values of the other three models are between 0.70 and 0.90. For the SO2 model, the R² of RF is better than 0.93. The R² of MLR is less than 0.40 and in some cases less than 0.1. The R² values of the other three models are between 0.27 and 0.80.
The performances of the different calibration models for the gas pollution against the reference monitor are also evaluated using RMSE and MAE. The results are listed in Tables 6 and 7, respectively.
According to Table 6, the RMSE values of O3, CO, and NO2 from the first (I) and third (III) periods differ little from those of the second (II) period, indicating that the O3, CO, and NO2 electrochemical sensors are suitable for ambient O3, CO, and NO2 measurements. The RMSE values of O3 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 4.05–4.08, 14.00–17.79, 9.84–10.57, 11.46–14.67, and 11.41–14.40, respectively. The RMSE values of CO between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 0.02–0.06, 0.12–0.23, 0.06–0.16, 0.09–0.18, and 0.09–0.18, respectively. The RMSE values of NO2 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 2.88–3.99, 13.54–14.54, 6.93–9.61, 9.37–11.07, and 9.21–11.21, respectively. Using the RF model, the RMSE values of SO2 are better than those of the other methods but still differ among the three periods. However, using the other models, the RMSE values of SO2 from the first (I) and third (III) periods are larger than those from the second (II) period. The main reason may be the large ambient fluctuation caused by the climatic factors in winter and spring, resulting in a poorer model fit. The RMSE values of SO2 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 0.64–1.68, 2.69–5.37, 1.49–4.05, 2.06–4.63, and 2.03–4.60, respectively.
According to Table 7, the MAE values show the same characteristics as the RMSE values. The MAE values of O3 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 2.76–2.88, 10.79–13.46, 7.06–7.33, 8.70–11.14, and 8.67–10.90, respectively. The MAE values of CO between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 0.02–0.05, 0.09–0.19, 0.04–0.11, 0.07–0.14, and 0.07–0.14, respectively. The MAE values of NO2 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 1.84–2.80, 10.41–11.08, 4.45–6.85, 6.59–8.27, and 6.48–8.41, respectively. The MAE values of SO2 between the reference data and the RF, MLR, KNN, BP, and GA-BP algorithm data are within 0.39–1.16, 1.96–4.24, 0.91–2.84, 1.41–3.43, and 1.36–3.40, respectively. As shown in Figs. 12–15 and listed in Tables 5–7, the results of each model differ little among the three periods for the O3, CO, and NO2 measurements, and the RF model outperforms the other models.
For SO2, the results of RF are better than those of the other methods and differ little among the three periods. However, the performances of the other methods (MLR, KNN, BP, GA-BP) are poorer during the first and third periods than during the second period. There may be several reasons for this phenomenon. The first is the cross-interference from NO2 and O3, which show a wide range of fluctuations (from about 20 to 125 µg m−3) and an increasing tendency in period I, respectively. NO2 and SO2 can react chemically under certain conditions to produce sulfuric acid (H2SO4) and nitric acid (HNO3), which affects the reading of the SO2 sensor. O3, a highly oxidizing gas, may react with SO2 to form H2SO4 or sulfurous acid (H2SO3), resulting in inaccurate sensor readings. The second is that the ambient temperature fluctuates over a wide range (from about −5 to +45 °C) during the first and third periods, which affects the stability of the electrode material and the readings of the sensor. The last is that the concentration of ambient SO2 is high (more than 30 µg m−3) in periods I and III, beyond the actual measurement range of the SO2 sensor; this will be investigated in future work.

Conclusions and discussion
A low-cost air quality monitoring system (LCS) based on RF, MLR, KNN, BP, and GA-BP algorithms is proposed. The system can measure gas-phase pollutants (SO2, NO2, CO, and O3) and particle pollutants (PM2.5 and PM10) simultaneously. To estimate the performance of the five algorithms, the LCS was mounted at the same location (Zhengzhou, China) and at the same height as the reference monitoring system. The measurement was made continuously from 1 March 2021 to 28 February 2022, during which the ambient temperature and relative humidity ranged from −5 to +50 °C and from 10 % to 98 %, respectively. The values of the LCS and the reference instruments were logged separately to the server for further comparative analysis.
With the pretreated signals of the individual particle counters, T, and RH as input, and the concentrations of PM2.5 and PM10 measured by the reference instrumentation separately as output, multi-input single-output evaluation models based on the RF, MLR, KNN, BP, and GA-BP algorithms are obtained. With the raw data of the four types of electrochemical sensors, T, and RH as input, and the measurements from the reference monitors as output, multi-input multi-output evaluation models based on the five algorithms are obtained. The performances of the calibration models are quantitatively compared using R², RMSE, and MAE.
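To make the input and output layout of the gas model concrete, the sketch below shows a linear multi-input multi-output map with six inputs (four raw sensor signals, T, and RH) and four outputs. It is only an illustration in the spirit of MLR: the ordering in `INPUTS` and `OUTPUTS`, the function `predict_mimo`, and the coefficients are assumptions, not the authors' fitted model.

```python
from typing import List, Sequence

# Assumed ordering, chosen only for this sketch.
INPUTS = ["SO2_raw", "NO2_raw", "CO_raw", "O3_raw", "T", "RH"]
OUTPUTS = ["SO2", "NO2", "CO", "O3"]


def predict_mimo(weights: Sequence[Sequence[float]], bias: Sequence[float],
                 x: Sequence[float]) -> List[float]:
    """Linear MIMO map: y[k] = sum_j weights[k][j] * x[j] + bias[k]."""
    if len(weights) != len(bias):
        raise ValueError("one bias term is needed per output")
    if any(len(row) != len(x) for row in weights):
        raise ValueError("each weight row needs one entry per input")
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# Placeholder coefficients (not fitted values): each output simply copies
# its own sensor signal and ignores T and RH.
W = [[1.0 if j == k else 0.0 for j in range(len(INPUTS))]
     for k in range(len(OUTPUTS))]
b = [0.0] * len(OUTPUTS)
sample = [0.2, 0.4, 0.6, 0.8, 25.0, 50.0]
print(dict(zip(OUTPUTS, predict_mimo(W, b, sample))))
```

A multi-input single-output model, as used for PM2.5 and PM10, corresponds to the same map with a single weight row.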
The experimental results show that the R² of RF for the PM is better than 0.98; the R² of MLR for the PM is less than 0.91; and the R² values of the other three models are between 0.86 and 0.98. The R² of RF for the gas pollutants (SO2, NO2, CO, and O3) is better than 0.93; the R² of KNN, BP, and GA-BP for the gas pollutants (SO2, NO2, CO, and O3) is between 0.27 and 0.97; the R² of MLR for NO2, CO, and O3 is between 0.46 and 0.90, but for SO2 it is less than 0.40 and in some cases less than 0.1.
Overall, we conclude that, with careful data management and calibration using the machine learning algorithms, especially the RF method, these measurements are consistent with the national environmental protection standard requirement of China. The LCS may significantly improve our ability to resolve spatial heterogeneity in air pollutant concentrations. The resulting air pollutant maps will assist researchers, policymakers, and communities in developing new policies or mitigation strategies to enhance human health. In the next study, we will focus on improving the matching of the measurement precision and range, the generalization of the algorithms to more applications, and the performance of the SO2 sensor.
Code and data availability. The data presented in this study are available on request from the corresponding author. The models and associated codes are not available online due to a provisional patent application.
Author contributions. Conceptualization: GW and CY; methodology: GW and KG; software: GW and KG; data curation: YW; writing (original draft preparation): GW; writing (review and editing): KG and CY; supervision: HG. All authors have read and agreed to the published version of the manuscript.

Figure 1. Location of the air quality monitoring station during the measurement period.

Figure 2. Schematic block and site photo of the LCS. Panel (a) is the schematic block of the LCS. The system control module can ensure the temperature stability of the heat tracing pipeline and thermo-tank through the heat tracing control module and thermostatic control module, respectively. The sampling cutter is used to filter particles larger than 10 µm. The sampling pump is utilized to deliver ambient air to the surface of the sensors. Panel (b) is the site photo of the LCS.

Figure 3. Linearity of gas sensors before calibration. Electrical output signals versus single standard gas concentrations are tested in laboratory conditions. Panels (a) and (b) show the proportional relations for the CO and O3 sensors and for the SO2 and NO2 sensors, respectively. The duration of each measurement is about 30 min.

Figure 4. Topological structure of the BP neural network model. Panel (a) is the feed-forward topological structure. X_I and Y_O are the input data and output data, respectively. X_i and Y_i indicate the normalized items of X and Y, respectively. w_i1 and b_i1 as well as w_j2 and b_j2 represent the weight value and intercept value of the hidden layer and output layer, respectively. Panel (b) is equivalent to panel (a) and is used to simplify the formulas.

Figure 5. Simplified illustration of the RF with a single decision tree and a subset. The x[0], x[2], and x[3] represent the CO, SO2, and O3 pollutants. At the first split, points with normalized CO sensor signal ≤ 0.052 are sent to a terminal node; the remaining points go to the other splitting node. The sample is the number of data points in each terminal node. The value is the average in each terminal node.

Figure 6. Variation of the MSE with the number of iterations.

Figure 7. The R² for the pollutants with different numbers of neuron nodes.

Figure 8. The R² of the RF model for the pollutants with different tree numbers.
As shown in Fig. 9, the ambient temperature increased, decreased, and fluctuated during 1 March to 30 June 2021, 1 July to 31 October 2021, and 1 November 2021 to 28 February 2022, respectively. The time series data and regressions of the five models for PM from the reference monitor and the LCS calibration output are shown in Figs. 10 and 11. As shown in Figs. 10a and 11a, the general tendencies of the data fluctuation of the reference monitor and of the RF, MLR, KNN, BP, and GA-BP algorithms of the LCS are consistent with each other. The RF model has the best performance, followed by KNN, BP, and GA-BP, with MLR having the worst.

Figure 9. Temperature and relative humidity ranges during the measurement period.

Figure 10. Time series and regressions comparing the reference monitor PM2.5 data (black) to five calibration model PM2.5 results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.

Figure 11. Time series and regressions comparing the reference monitor PM10 data (black) to five calibration model PM10 results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.

Figure 12. Time series and regressions comparing the reference monitor O3 data (black) to five calibration model O3 results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.

Figure 13. Time series and regressions comparing the reference monitor CO data (black) to five calibration model CO results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.
As shown in Fig. 9 and in Sect. 4.2, the total data are also divided into the same three segments. To avoid overfitting in the five models and to ensure the robustness of the model evaluation, the training and test ratios of the random partition are again 80 % and 20 %, and 5-fold cross validation is also conducted. The time series data and regressions of the five models for gas pollution from the reference monitor and the LCS calibration output are shown in Figs. 12–15. As shown in Figs. 12a–15a, the general tendencies of the data fluctuation of the reference monitor and of the RF, MLR, KNN, BP, and GA-BP algorithms of the LCS are consistent with each other. The RF model has the best performance, followed by KNN, BP, and GA-BP, with MLR having the worst. The R² values between the reference data and the five model data are also shown in Figs. 12b–15b and listed in Table 5.

Figure 14. Time series and regressions comparing the reference monitor NO2 data (black) to five calibration model NO2 results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.

Figure 15. Time series and regressions comparing the reference monitor SO2 data (black) to five calibration model SO2 results, where red, blue, magenta, olive, and navy represent RF, MLR, KNN, BP, and GA-BP, respectively. Panel (a) shows the whole time series data of the measurement period. Panel (b) shows the regressions of the five calibration models.
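$$
\begin{bmatrix}
w_{11} & w_{12} & w_{13} & w_{14} & w_{15} & w_{16} \\
w_{21} & w_{22} & w_{23} & w_{24} & w_{25} & w_{26} \\
w_{31} & w_{32} & w_{33} & w_{34} & w_{35} & w_{36} \\
w_{41} & w_{42} & w_{43} & w_{44} & w_{45} & w_{46}
\end{bmatrix}
$$
The training is performed for 500 iterations.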

Table 3. Performance of different calibration models for the PM2.5 and PM10 against the reference monitor. The RMSE values (lower is better) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.

Table 4. Performance of different calibration models for the PM2.5 and PM10 against the reference monitor. The MAE values (lower is better) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.

Table 5. Performance of different calibration models for the gaseous pollutants (SO2, CO, NO2, and O3) against the reference monitor. The determination coefficient R² (higher is better, maximum of 1) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.

Table 6. Performance of different calibration models for the gaseous pollutants (SO2, CO, NO2, and O3) against the reference monitor. The RMSE values (lower is better) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.

Table 7. Performance of different calibration models for the gaseous pollutants (SO2, CO, NO2, and O3) against the reference monitor. The MAE values (lower is better) of different calibration models (RF, MLR, KNN, BP, GA-BP) versus the reference monitor.