Neural Network Based Estimation of Regional Scale Anthropogenic CO2 Emissions Using OCO-2 Dataset Over East and West Asia

Atmospheric carbon dioxide (CO2) is the most significant greenhouse gas, and its concentration is continuously increasing, mainly as a consequence of anthropogenic activities. Accurate quantification of CO2 is critical for addressing the global challenge of climate change and for designing mitigation strategies aimed at stabilizing CO2 emissions. Satellites provide the most effective way to monitor the concentration of CO2 in the atmosphere. In this study, we utilized the column-averaged dry-air mole fraction of CO2 (XCO2) retrieved from a CO2 monitoring satellite, the Orbiting Carbon Observatory 2 (OCO-2), to estimate anthropogenic CO2 emissions over East and West Asia using a Generalized Regression Neural Network (GRNN). OCO-2 XCO2 and the Open-Data Inventory for Anthropogenic Carbon dioxide (ODIAC) CO2 emission datasets for a period of 5 years (2015-2019) were used in this study. Annual XCO2 anomalies were calculated from the OCO-2 retrievals for each year to remove the large background CO2 concentrations and seasonal variability. The XCO2 anomaly and ODIAC emission datasets from 2015 to 2018 were then used to train the GRNN model, and finally the anthropogenic CO2 emissions were estimated for 2019 based on the XCO2 anomalies derived for the same year. The XCO2-based estimated emissions and the actual ODIAC CO2 emissions were compared, and the results showed good agreement in terms of spatial distribution. The CO2 emissions were estimated separately over East and West Asia. In addition, correlations between the ODIAC emissions and XCO2 anomalies were determined separately for East and West Asia, and East Asia exhibited relatively better results. The results showed that satellite-based XCO2 retrievals can be used to estimate regional scale anthropogenic CO2 emissions, and that the accuracy of the results can be enhanced by further improving the GRNN model with additional CO2 emission and concentration datasets.
https://doi.org/10.5194/amt-2021-222 Preprint. Discussion started: 11 August 2021. © Author(s) 2021. CC BY 4.0 License.

from the satellite measurements show a positive correlation with the CO2 emission inventories (Hakkarainen et al., 2016; Yang et al., 2019), which implies that these space-based observations can be used to assess the anthropogenic CO2 emissions by enhancing the anthropogenic XCO2 concentration.
Asia is home to the most populous nations with the highest CO2 emissions. In East Asia, China in particular contributes significantly to the global carbon budget and has accounted for ~30% of the overall growth in global CO2 emissions over the past 15 years (Edgar, 2017). This increase in CO2 levels is mainly due to rapid economic growth and anthropogenic activities (Shan et al., 1997). China has pledged to make aggressive efforts to reduce CO2 emissions per unit GDP by 60-65% relative to 2005 levels, and to peak overall carbon emissions, by 2030 (Unfcc, 2015). West Asia is also a region with high rates of anthropogenic CO2 emissions (Mustafa et al., 2020), and some of its countries, such as Iran, Saudi Arabia, and Turkey, are listed among the 10 largest CO2 emitting nations in the world. Several studies have estimated CO2 emissions using various machine learning techniques, but most of them do not deal with the spatial distribution. Rao (2021) estimated CO2 emissions using a Support Vector Machine (SVM). Zhonghan et al. (2018) predicted CO2 flux emissions based on published data including latitude, age, potential net primary productivity (NPP), and mean depth using a Back Propagation Neural Network (BPNN) and a Generalized Regression Neural Network (GRNN). Yang et al. (2019) estimated anthropogenic CO2 emissions using GOSAT XCO2 retrievals over China, and the results showed good agreement between the estimated emissions and the ODIAC CO2 emission dataset. In this study, we have improved the model initially developed by Yang et al. (2019) to estimate regional scale anthropogenic CO2 emissions using OCO-2 XCO2 retrievals over East and West Asia. MODIS NPP, OCO-2, and ODIAC CO2 datasets were obtained for a period of five years from January 2015 to December 2019.
XCO2 anomalies were calculated from the OCO-2 retrievals for each year. The GRNN model was trained using XCO2 anomalies, MODIS NPP, and ODIAC CO2 emissions with four years of data from 2015 to 2018, and then anthropogenic CO2 emissions were estimated for the year 2019 based on the 2019 NPP and XCO2 anomalies. Atmospheric CO2 monitoring satellites can detect and analyze anthropogenic CO2 signatures, and the satellite-based estimation of anthropogenic CO2 emissions can help in investigating carbon emissions as a data-driven method, which differs from the conventional method of calculating emission inventories. Estimating anthropogenic CO2 emissions using satellite datasets is a challenging task because other factors, such as atmospheric transport and the terrestrial ecosystem, play notable roles in controlling the spatial distribution of atmospheric CO2 (Cao et al., 2017). Nevertheless, this data-driven method can provide meaningful help in quantifying anthropogenic CO2 emissions, which will be important for evaluating the effects of anthropogenic CO2 emission reductions at regional as well as global scales. The details about the datasets and methods are provided in Section 2. The results, including the estimated CO2 emissions, the evaluation of these emissions, and the correlation between ODIAC CO2 emissions and XCO2 anomalies, are discussed in Section 3.

OCO-2 Dataset
The Orbiting Carbon Observatory 2 (OCO-2) was launched by the National Aeronautics and Space Administration (NASA) on 2 July 2014 to monitor the concentration of atmospheric CO2 at regional and global levels. It carries a three-channel imaging grating spectrometer that collects high-resolution, bore-sighted spectra of reflected sunlight. Spectra are collected in the molecular oxygen A-band at 0.765 microns and the CO2 bands at 1.61 and 2.06 microns (Hakkarainen et al., 2019). Information from all these bands is combined to calculate XCO2. The spatial resolution of OCO-2 is 2.25 km × 1.29 km. More details about the instrument design, calibration approach, on-orbit performance, and measurement principles are provided in a previous study. In this study, we used the OCO-2 ACOS/XCO2 version 10r product, which was generated using the ACOS Level 2 Full Physics (L2FP) retrieval algorithm. This algorithm uses a Bayesian optimal estimation framework to derive estimates of XCO2 from spectral measurements of reflected solar radiation (Crisp et al., 2012). A comprehensive validation of OCO-2 XCO2 retrievals against the Total Carbon Column Observing Network (TCCON) CO2 dataset reported an absolute median difference of less than 0.4 ppm and an RMS difference of less than 1.5 ppm between the two datasets. Similar experiments have been carried out to validate different versions of the OCO-2 XCO2 products, and the results showed that the OCO-2 dataset was consistent and reliable for atmospheric CO2 monitoring (Kiel et al., 2019; O'Dell et al., 2018). The quality and quantity of the XCO2 product have improved with the developments in the ACOS FP retrieval algorithm. The latest OCO-2 XCO2 product has a single-sounding precision of ~0.8 ppm over land and ~0.5 ppm over water, and RMS biases of 0.5-0.7 ppm over both land and water (O'Dell et al., 2021). The evolution of the ACOS L2FP retrieval algorithm from v7 to v10 is summarized in Table 1.
No major changes were made in the ACOS v9 L2FP retrieval algorithm relative to v8 except for the sampling of the meteorological prior. The trace gas absorption coefficient tables (ABSCO) were updated in various versions of the ACOS L2FP retrieval algorithm. The source of the prior meteorology was changed from the European Centre for Medium-Range Weather Forecasts (ECMWF) in ACOS v7 to the NASA Global Modeling and Assimilation Office (GMAO) Goddard Earth Observing System (GEOS) Forward Processing-Instrument Team (FP-IT) products for v8/9. The aerosol prior source was changed from the GMAO Modern-Era Retrospective analysis for Research and Applications (MERRA) product in v7-9 to GEOS5 FP-IT in v10. Moreover, an additional stratospheric aerosol layer was introduced in ACOS v8-10. The prior value of aerosol optical depth for each retrieved aerosol type was lowered from 0.0375 in v7 to 0.0125 in v8-10. The CO2 prior developed by the TCCON team using the GGG2014 algorithm remained the same in v7/8/9 of the algorithm. Another major change was switching the land surface model from a purely Lambertian model to a Bi-Directional Reflectance Distribution Function (BRDF) model (Taylor et al., 2021).

ODIAC Dataset
ODIAC is a global fossil-fuel CO2 (FFCO2) emission dataset with 1 km × 1 km monthly resolution over land and 1° × 1° annual resolution for international bunkers from the year 2000 onward (Oda et al., 2018). It shares country scale estimates with the Carbon Dioxide Information Analysis Center (CDIAC) but distributes the emissions differently within countries and includes gridded international bunker emissions (Oda and Maksyutov, 2015). CDIAC distributes CO2 emissions based on population density, while ODIAC incorporates power plant profiles and nighttime light observations for emission distribution (Wang et al., 2020). ODIAC shows better agreement with the US bottom-up inventory (Gurney et al., 2009) than CDIAC does, and it is commonly used in flux inversions (Crowell et al., 2019; Lauvaux et al., 2016; Maksyutov et al., 2013; Takagi et al., 2011). In this study, we used the 2020 version of the ODIAC emission dataset, which is freely available and can be downloaded from http://db.cger.nies.go.jp/dataset/ODIAC/.

Methods
Estimation of anthropogenic CO2 emissions includes three major steps, as shown in Figure 1. The first step is enhancing the XCO2 concentration influenced by anthropogenic activities, the second step is setting up the GRNN model using the XCO2, NPP, and ODIAC datasets, and the final step is the validation of the estimated CO2 emissions against the actual ODIAC emission dataset.
The OCO-2 XCO2 dataset was downloaded from the Earthdata platform (https://earthdata.nasa.gov/). To ensure the reliability of the data, the dataset was screened and filtered following the instructions given in the OCO-2 Data User Guide (DUG). Each sounding processed using the ACOS L2FP retrieval algorithm is assigned either a "good" (=0) or "bad" (=1) quality flag based on screening criteria derived from comparisons with TCCON and modelled CO2 fields. Users are generally advised to use the "good" quality soundings for regional and local scale studies because soundings flagged as "bad" might include biases that compromise their utility for the application. In this study, the OCO-2 XCO2 retrievals were included if (i) they were flagged good (flag=0) and (ii) the standard deviation of the good soundings for the day was less than 2 ppm. CO2 has a larger background concentration and a longer atmospheric lifetime than other greenhouse gases (Hakkarainen et al., 2019). Because of this, XCO2 varies by nearly 2% over the seasonal cycle and from pole to pole. In addition, XCO2 variations influenced by anthropogenic activities are small at the scale of a satellite sounding (2-4 km²). Therefore, high precision is critical for accurate quantification of the XCO2 anomalies related to anthropogenic activities. To highlight the emission areas, the CO2 seasonal variability and the large background concentrations must be removed.
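The two screening rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `screen_soundings`, the per-sounding `day_index` array, and the use of the population standard deviation are all assumptions made for the example.

```python
import numpy as np

def screen_soundings(xco2, quality_flag, day_index, max_daily_std=2.0):
    """Return a boolean mask of soundings that pass both screening rules.

    Rule (i): the quality flag is 0 ("good").
    Rule (ii): the standard deviation of the good soundings on the same
    day is below max_daily_std (ppm). Population std is an assumption.
    """
    xco2 = np.asarray(xco2, dtype=float)
    quality_flag = np.asarray(quality_flag)
    day_index = np.asarray(day_index)

    good = quality_flag == 0
    keep = np.zeros(xco2.shape, dtype=bool)
    for day in np.unique(day_index[good]):
        sel = good & (day_index == day)      # good soundings of this day
        if np.std(xco2[sel]) < max_daily_std:
            keep |= sel                      # accept the whole day's good soundings
    return keep
```

A day whose good soundings scatter by 2 ppm or more is dropped entirely, while a day with a single good sounding passes because its standard deviation is zero.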

This equation calculated the XCO2 anomalies for each observation. Subtracting the daily background concentration removes the seasonal variability. The space-based soundings are irregularly distributed and have spatiotemporal gaps because a large share of the satellite observations is removed after screening for clouds and other artifacts. To deal with the spatiotemporal gaps, kriging interpolation was used and a mapping dataset was generated with a spatial resolution of 0.5°×0.5° Longitude/Latitude and a temporal resolution of 16 days. Finally, the mean of each grid cell was calculated for each year from 2015 to 2019. The annual mean of the XCO2 anomaly can detrend the seasonal variation (Hakkarainen et al., 2016). The annually-averaged XCO2 anomalies were resampled to a grid with a spatial resolution of 1°×1° Longitude/Latitude and used along with the 1°×1° Longitude/Latitude ODIAC emission dataset to set up the GRNN model.
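The background subtraction described above can be illustrated with a short sketch. It assumes that the daily background is the median of that day's soundings over the study region; the text only states that a daily background concentration is subtracted, so the median, the function name `xco2_anomalies`, and the `day_index` array are hypothetical choices for illustration.

```python
import numpy as np

def xco2_anomalies(xco2, day_index):
    """Subtract a per-day background from each sounding.

    Assumption: the background is the median of all soundings that share
    the same day index; the paper's own definition may differ.
    """
    xco2 = np.asarray(xco2, dtype=float)
    day_index = np.asarray(day_index)
    anomalies = np.empty_like(xco2)
    for day in np.unique(day_index):
        sel = day_index == day
        anomalies[sel] = xco2[sel] - np.median(xco2[sel])
    return anomalies
```

Because the background is recomputed for every day, the seasonal cycle and the long-term growth of XCO2 cancel out, leaving only the spatial departures within each day.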
During photosynthesis, living plants convert CO2 into the sugar molecules they use for food. In the process of making food, they also release the oxygen we breathe. Plant productivity plays a crucial role in the global carbon cycle by absorbing the CO2 released by anthropogenic activities. The net primary productivity (NPP) is the amount of CO2 absorbed by plants during photosynthesis minus the amount of CO2 released during respiration. A negative value of NPP means that CO2 is released into the atmosphere, and a positive value represents the absorption of atmospheric CO2. To improve the model results, an NPP dataset (MOD17A3HGF) provided by MODIS has also been used in this study. It provides information about annual NPP and is distributed by NASA's Land Processes Distributed Active Archive Center (LP DAAC). The NPP dataset with a spatial resolution of 500 meters (m) was downloaded from the LP DAAC website (https://lpdaac.usgs.gov/products/mod17a3hgfv006/). The annual NPP is derived from the sum of all 8-day Net Photosynthesis (PSN) products (MOD17A2H) from the given year. The MODIS NPP dataset was reprojected and resampled to a spatial resolution of 1°×1° Longitude/Latitude for each year and used along with the ODIAC and OCO-2 datasets to train the GRNN model as well as to predict the CO2 emissions.
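Bringing a finer regular grid onto the common 1°×1° grid can be sketched as a block average. This is only an illustration of the aggregation step under simplifying assumptions: the input is already on a regular Longitude/Latitude grid whose shape is divisible by the aggregation factor, cells are weighted equally (no area weighting by latitude), and the name `block_mean` is hypothetical. Reprojection of the 500 m MODIS tiles is not covered.

```python
import numpy as np

def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    NaN cells (no data) are ignored within each block.
    """
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be divisible by factor")
    # Split each axis into (block index, position inside block).
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))
```

For example, `block_mean(anomaly_half_degree, 2)` would map a 0.5°×0.5° field to 1°×1°. A block that contains only NaN values yields NaN and triggers a NumPy runtime warning.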
XCO2 variations are primarily influenced by anthropogenic activities and terrestrial ecosystems, and there are both linear and non-linear mappings between XCO2 and the emissions. We adopted the GRNN algorithm to represent the non-linear mapping between the independent variables (XCO2 anomaly and NPP) and the dependent variable (CO2 emission). The GRNN is a memory-based network that provides estimates of continuous variables and converges to the underlying regression. The regression of a dependent variable on an independent variable is the computation of the most probable value of the dependent variable for each value of the independent variable, based on a finite number of possibly noisy measurements of the independent variable and the associated values of the dependent variable. The dependent and independent variables are usually vectors (Rooki, 2016). The architecture of the GRNN is shown in Figure 2. It consists of four layers: an input layer, a hidden layer, a summation layer, and a decision layer. In the input layer, each neuron corresponds to an independent variable that is expressed as a mathematical function, and the independent variable values are standardized. The standardized values of the independent variables are then transferred to the neurons in the hidden layer. In this layer, each neuron stores the values of the dependent and independent variables and calculates a scalar function. The third layer, known as the summation layer, contains two neurons: the denominator summation unit, which sums the weight values received from the hidden layer, and the numerator summation unit, which sums the weight values multiplied by the actual target-dependent variable value for each hidden neuron. Finally, the target-dependent value is obtained in the decision layer by dividing the value accumulated in the numerator summation unit by the value in the denominator summation unit.
To develop a neural network, the dependent and independent training variables must be standardized so that all training data in the input layer have the same order of magnitude (Yang et al., 2019). In the standard GRNN formulation (Specht, 1991), the scalar function calculated by the $i$-th hidden neuron for an input vector $x$ is
$$\theta_i = \exp\left(-\frac{\sum_{j=1}^{p}\left(x_j - x_{ij}\right)^2}{2\sigma^2}\right)$$
where $p$ is the dimension of the variable vector $x_i$ and $\sigma$ is the spread parameter. An optimal spread parameter value is obtained after several runs by keeping the mean squared error of the estimated values at a minimum (Rooki, 2016). In this study, the values of the spread parameter were optimized using the Holdout Method. More detail about the Holdout Method is provided in a previous study (Specht, 1991). The weight of the denominator neuron was set to 1.0. The predicted target dependent variable was defined by the following equation:
$$\hat{y}(x) = \frac{\sum_{i=1}^{n} y_i\,\theta_i}{\sum_{i=1}^{n} \theta_i}$$
where the values $\theta_i$ calculated with the scalar function in the hidden neurons are weighted with the corresponding values of the training samples $y_i$, and $n$ denotes the number of training samples.
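As an illustration of how such a network produces a prediction, the sketch below implements the textbook GRNN estimator with a Gaussian kernel, together with a simple hold-out search for the spread parameter. It is not the authors' implementation: the function names `grnn_predict` and `select_spread`, the candidate list, and the single train/hold-out split are assumptions, and the inputs are assumed to be standardized beforehand.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, spread):
    """Kernel-weighted average of training targets (standard GRNN form)."""
    x_train = np.asarray(x_train, dtype=float)   # shape (n_train, p)
    y_train = np.asarray(y_train, dtype=float)   # shape (n_train,)
    x_query = np.asarray(x_query, dtype=float)   # shape (n_query, p)
    # Hidden layer: squared Euclidean distance to every stored sample.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * spread ** 2))
    # Summation layer: numerator and denominator units.
    numerator = weights @ y_train
    denominator = weights.sum(axis=1)
    # Decision layer: ratio of the two sums.
    return numerator / denominator

def select_spread(x_fit, y_fit, x_hold, y_hold, candidates):
    """Pick the candidate spread with the lowest hold-out mean squared error."""
    y_hold = np.asarray(y_hold, dtype=float)
    errors = [
        np.mean((grnn_predict(x_fit, y_fit, x_hold, s) - y_hold) ** 2)
        for s in candidates
    ]
    return candidates[int(np.argmin(errors))]
```

In the setting of this study, the rows of `x_train` would hold the standardized XCO2 anomaly and NPP of each grid cell for 2015-2018 and `y_train` the matching ODIAC emissions. A very small spread can make all kernel weights underflow to zero for a distant query point, in which case the ratio is undefined, so candidate spreads should be chosen relative to the scale of the standardized inputs.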

Spatial Distribution of XCO2 Observations and Anomalies
Satellite-based observations are sensitive to clouds and aerosols; therefore, much of the data is discarded during preprocessing due to the presence of clouds and aerosols (Mustafa et al., 2021b). Figures 3a and 3b show the number of XCO2 retrievals from 2015 to 2019 on a spatial grid of 0.5°×0.5° Longitude/Latitude over West and East Asia, respectively.
OCO-2 shows good spatial coverage over East Asia; however, the southern parts of the region, in particular the Tibetan Plateau, have a relatively lower number of XCO2 retrievals. The Tibetan Plateau is the most extensive elevated surface on Earth, and satellite measurements show larger uncertainties over this region (Yang et al., 2019). In the case of West Asia, the southern parts of the region also have a lower number of XCO2 retrievals. These parts contain a very large desert, the Rub' al Khali, which stretches across Saudi Arabia, Yemen, Oman, and the United Arab Emirates (UAE) and often experiences dust storms. The lower number of XCO2 retrievals in these parts of the region might be due to the ACOS XCO2 retrieval algorithm, which excludes satellite measurements with high aerosol optical depth and cloud optical thickness (Crisp et al., 2012; O'Dell et al., 2012).
Figure 3c shows the spatial distribution of the five-year-averaged XCO2 anomalies over West Asia, calculated using the method described in Section 2.2. Higher XCO2 anomalies were observed over the central parts of the region, including Iran, Kuwait, Saudi Arabia, and Iraq. Iran and Saudi Arabia are listed among the top 10 CO2 emitting nations and produce over 6% of global CO2 emissions (Jalil, 2014). In addition, Iran, Saudi Arabia, and Iraq are the major fuel consumers of the region and contribute more than 60% of the region's total fossil fuel CO2 emissions (Boden et al., 2017).
[...] reaches the minimum value in August. The decrease in its concentration from May to August has several causes, primarily strong photosynthesis and weak respiration by plants, and this process is enhanced during the monsoon or rainy season (Mustafa et al., 2020). The increase in XCO2 concentration from September to April is likely caused by weak photosynthesis and strong respiration, the use of heating systems in winter, and strong microbial activity (Cao et al., 2017; Mustafa et al., 2021a).

Estimated CO2 Emissions
The annually-averaged XCO2 anomalies, MODIS NPP, and ODIAC CO2 emission datasets for the four years from 2015 to 2018 were used as the training dataset for the GRNN model built to estimate CO2 emissions using the method described in Section 2.2. The GRNN model was then applied to the 2019 annually-averaged XCO2 anomalies and NPP datasets to predict CO2 emissions in the same unit as the ODIAC CO2 emissions. The analyses were carried out separately over East and West Asia. Figures 4a and 4b show the estimated and the ODIAC CO2 emissions over East Asia, respectively. The results show that the estimated and the inventory CO2 emissions exhibit nearly the same spatial distribution pattern. The eastern part of the region shows higher CO2 emissions, and the western and northern parts, in particular the Tibetan Plateau and Mongolia, show the lowest CO2 emissions. The pattern is also similar to the distribution of XCO2 anomalies over East Asia (Figure 3d). The estimated CO2 emissions have a relatively smoother distribution pattern than the ODIAC CO2 emissions, which might be due to the interpolation of the OCO-2 dataset. Figure 4c shows the difference between the estimated and the inventory CO2 emissions over East Asia. The estimated CO2 emissions are generally overestimated relative to the ODIAC CO2 emissions; however, the emissions are underestimated over some parts of the region as well. Figure 4d shows the landcover distribution of East Asia provided by the Copernicus Global Land Services (Buchhorn et al., 2020). The predicted CO2 emissions are overestimated over most parts of the region, and this overestimation is more pronounced over agricultural areas located near the high-density region, i.e., eastern China.
Eastern China, Japan, and Korea are known to be among the regions with the highest CO2 emissions, and the overestimation over the agricultural areas might be caused by nearby CO2 emitting sources, which raise the CO2 concentration of the surrounding areas through atmospheric transport. Previous studies demonstrated that the concentration of atmospheric CO2 is influenced by atmospheric transport (Cao et al., 2017; Kumar et al., 2014). The areas where the predicted CO2 emissions are underestimated are covered by agriculture, forest, and vegetation.
This underestimation of the predicted CO2 emissions over these areas indicates the presence of uncertainties in the XCO2 anomalies, likely produced by the CO2 uptake of the biosphere that still remains in the XCO2 anomalies. In addition, the areas where the estimated CO2 emissions are overestimated have higher elevations. OCO-2 observations show larger uncertainties over elevated and mountainous areas, especially the Tibetan Plateau, where the OCO-2 retrievals are significantly overestimated (Kong et al., 2019; Mustafa et al., 2020), and this might also contribute to the overestimation of the estimated CO2 emissions. The difference between the estimated and the ODIAC CO2 emissions ranged from -0.06 × 10^9 kg to 3.2 × 10^9 kg, and differences between -1 × 10^9 kg and 1 × 10^9 kg accounted for 84% of the total number of grid cells. Yang et al. (2019) estimated CO2 emissions by a similar machine learning approach using GOSAT XCO2 retrievals over China, and the differences between the estimated and the ODIAC CO2 emissions were between -5 × 10^9 kg and 5 × 10^9 kg. Moreover, the predicted results from the referenced study exhibited overall lower CO2 emissions relative to the ODIAC emissions, in contrast to our results. Our study showed better results, which might be due to several reasons: (i) we improved the prediction model with the addition of the NPP dataset (Figure 4e), (ii) we utilized the higher resolution XCO2 retrievals provided by OCO-2, and (iii) we incorporated the OCO-2 XCO2 retrievals processed using the latest version of the retrieval algorithm. The newer version of the ACOS L2FP retrieval algorithm has improved the quantity as well as the quality of the satellite-based observations (Taylor et al., 2021).
[...] is similar with some differences in their magnitudes. CO2 emissions in the eastern parts are relatively larger compared to other parts of the region.
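The difference statistics quoted in this section (the range of the differences and the share of grid cells within ±1 × 10^9 kg) can be computed as in the following sketch. The function name `difference_summary` and the treatment of missing cells as NaN are assumptions for illustration; the arrays stand for the estimated and ODIAC emissions on the same 1°×1° grid.

```python
import numpy as np

def difference_summary(estimated, inventory, bound=1e9):
    """Return (min, max, fraction within +/- bound) of estimated - inventory.

    Cells where either field is missing (NaN) are excluded.
    """
    diff = np.asarray(estimated, dtype=float) - np.asarray(inventory, dtype=float)
    diff = diff[np.isfinite(diff)]
    within = np.mean(np.abs(diff) <= bound)   # share of valid grid cells
    return diff.min(), diff.max(), within
```

A positive difference marks a cell where the satellite-based estimate exceeds the inventory value, which is the sign convention used in the difference maps.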
Figure 5c shows the difference between the estimated and the ODIAC CO2 emissions. The satellite-based estimated CO2 emissions are generally overestimated compared to the actual ODIAC CO2 emissions. The estimated CO2 emissions are notably larger over Iran and Saudi Arabia. Figure 5d shows the landcover distribution of West Asia. It can be seen that the predicted CO2 emissions are overestimated over areas covered by either urban settlements or bare land. This overestimation is likely caused by atmospheric transport, which influences the spatial distribution of atmospheric CO2 (Cao et al., 2017). Moreover, a large part of West Asia is covered by deserts, and these deserts have a notably lower number of OCO-2 retrievals (Figure 3a). The overestimation of the predicted CO2 emissions over the largest desert of the region, the Rub' al Khali, located in the southern parts, is likely caused by uncertainties in the satellite-based XCO2 anomalies, which in turn are likely produced by the lower number of OCO-2 retrievals. In addition, a previous study indicated that the ACOS XCO2 retrieval algorithm showed uncertainties over deserts (Bie et al., 2018). Similar to East Asia, the predicted CO2 emissions over West Asia are also underestimated over the areas that are covered by agriculture or vegetation, and this underestimation might be due to the CO2 uptake of the biosphere remaining in the XCO2 anomalies calculated using the satellite-based retrievals. The difference between the estimated and the ODIAC CO2 emissions ranged from -0.16 × 10^9 kg to 2.8 × 10^9 kg, and differences between -1 × 10^9 kg and 1 × 10^9 kg accounted for 88% of the total number of grid cells.
Figure 6 shows the correlation analysis between the ODIAC CO2 emissions and the XCO2 anomalies calculated using the OCO-2 retrievals over East and West Asia. Yang et al. (2019) found that clusters of XCO2 changes derived from satellite-based observations showed a better and more significant correlation with CO2 emissions than single grid cells of XCO2, possibly because an atmospheric CO2 measurement is an instantaneous snapshot of the real atmosphere. We therefore segmented the ODIAC emissions into bins at intervals of 0.3 in lgE (tons/year), using mean emissions calculated from the annual emissions during 2015-2019, and then carried out the correlation analysis between the mean of the emissions and the mean of the XCO2 anomalies within the binned regions. The results showed a positive and significant correlation between the two datasets. Figures 6a and 6b show the spatial distribution of the segmented ODIAC emissions over East Asia and the scatterplot between the mean of emissions and the mean of XCO2 anomalies, respectively. The two datasets show a positive and significant correlation with a coefficient of determination (R²) of 0.81.
The spatial distribution of the segmented ODIAC emissions over West Asia and the scatterplot between the mean of emissions and the mean of XCO2 anomalies are shown in Figures 6c and 6d, respectively. The two datasets showed a good correlation with a coefficient of determination (R²) of 0.60. Several studies have correlated satellite-based XCO2 anomalies with CO2 emissions (Fu et al., 2019; Shekhar et al., 2020). Yang et al. (2019) performed a correlation analysis between GOSAT-based XCO2 anomalies and ODIAC CO2 emissions over China and found a significant correlation with an R² of 0.82, which increased up to 0.95 when the analysis was restricted to higher values of CO2 emissions. In our study, the correlation between the CO2 emissions and the XCO2 anomalies is relatively low for West Asia, which might be due to uncertainties in the OCO-2 retrievals. A large part of West Asia is covered by deserts, and Bie et al. (2018) reported that the ACOS XCO2 retrieval algorithm showed uncertainties over deserts.
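The binned correlation analysis described above can be sketched as follows. Several details are assumptions rather than the authors' stated procedure: lgE is taken as the base-10 logarithm of the mean emission, bins are aligned to multiples of 0.3, cells with non-positive emissions or missing anomalies are dropped, and R² is computed as the squared Pearson correlation between the logarithm of the bin-mean emission and the bin-mean anomaly. The name `binned_r2` is hypothetical.

```python
import numpy as np

def binned_r2(emissions, anomalies, bin_width=0.3):
    """Bin grid cells by log10(emission) and correlate the bin means.

    Returns (mean emission per bin, mean anomaly per bin, R^2).
    """
    emissions = np.asarray(emissions, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    valid = np.isfinite(anomalies) & (emissions > 0)
    emissions, anomalies = emissions[valid], anomalies[valid]

    # Integer bin label for every grid cell.
    bin_id = np.floor(np.log10(emissions) / bin_width).astype(int)
    labels = np.unique(bin_id)
    mean_e = np.array([emissions[bin_id == b].mean() for b in labels])
    mean_a = np.array([anomalies[bin_id == b].mean() for b in labels])

    r = np.corrcoef(np.log10(mean_e), mean_a)[0, 1]
    return mean_e, mean_a, r ** 2
```

Averaging within emission bins suppresses the sounding-level noise of individual grid cells, which is the reason given in the text for correlating clusters rather than single cells.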

4 Summary and Conclusions
In this study, the anthropogenic CO2 emissions were estimated using satellite datasets employing a neural network-based method. The study was carried out using ODIAC CO2 emission, OCO-2 XCO2, and MODIS NPP datasets from 2015 to 2019.
To remove the CO2 seasonal variability and the large background concentration from the OCO-2 XCO2 retrievals, XCO2 anomalies were calculated for each year. Then a GRNN model was built and XCO2 anomalies, NPP, and CO2 emissions from [...] spatial distribution. The estimated CO2 emissions showed better results over East Asia than over West Asia, which might be due to uncertainties in the XCO2 retrievals, because previous studies reported that the ACOS XCO2 retrieval algorithm produced uncertainties over deserts. The predicted CO2 emissions were generally overestimated, and this overestimation was larger over areas closer to the high-density urban regions. The overestimation might be due to nearby high-emission CO2 sources that raised the XCO2 concentration through atmospheric transport. The satellite-based estimated CO2 emissions were underestimated over some parts of the regions that were mostly covered by agricultural areas and vegetation; this was likely caused by uncertainties in the calculated XCO2 anomalies produced by the CO2 uptake of the biosphere. We compared our results with a previous study carried out using a similar prediction model incorporating GOSAT XCO2 retrievals. The referenced study generally underestimated the predicted CO2 emissions, with larger differences relative to the ODIAC CO2 emissions, in contrast to our results. Our study showed relatively better results, which might be due to several reasons: (i) we improved the prediction model with the addition of the NPP dataset, (ii) we incorporated OCO-2 XCO2 retrievals, which have a higher spatial resolution than the GOSAT XCO2 retrievals, and (iii) we used the XCO2 product processed using the latest version of the ACOS L2FP retrieval algorithm.
The newer version of the algorithm has improved the quantity as well as the quality of the XCO2 retrievals. Moreover, a correlation analysis was carried out between the ODIAC CO2 emissions and the OCO-2 XCO2 anomalies, and the results were significant, with R² of 0.81 and 0.60 over East and West Asia, respectively. The results were in agreement with previous studies.
The results from our study suggest that CO2 emissions can be estimated using the observations obtained from CO2 monitoring satellites. Currently, several satellites are orbiting the Earth and are dedicated to monitoring atmospheric CO2. Joint utilization of the observations from the older and the latest satellites, such as OCO-3, GOSAT-2, and TanSat, might reduce the spatiotemporal gaps and uncertainties. In future studies, we intend to improve the GRNN model by adding CO2 uptake datasets and through the joint utilization of multi-sensor data.

Author contributions. FM carried out the analysis under the supervision of LB, with input and support from QW, NY, MS, MB, RWA, and RI. FM wrote the original article with feedback from all the co-authors.
Competing Interests. The authors declare that they have no personal or financial conflict of interest.

Acknowledgements. The authors acknowledge the efforts of NASA to provide the OCO-2 data products. The authors are also thankful to the National Institute for Environmental Studies (NIES) for providing the ODIAC CO2 emission dataset. The first author (Farhan Mustafa) is thankful to Thomas E. Taylor (Colorado State University, USA) for providing help in summarizing the evolution of the ACOS L2FP retrieval algorithm.