To retrieve aerosol properties from satellite measurements of the oxygen A-band in the near-infrared, a line-by-line radiative transfer model implementation requires a large number of calculations. These calculations severely restrict a retrieval algorithm's operational capability as it can take several minutes to retrieve the aerosol layer height for a single ground pixel. This paper proposes a forward modelling approach using artificial neural networks to speed up the retrieval algorithm. The forward model outputs are trained into a set of neural network models to completely replace line-by-line calculations in the operational processor. Results comparing the forward model to the neural network alternative show an encouraging outcome with good agreement between the two when they are applied to retrieval scenarios using both synthetic and real measured spectra from TROPOMI (TROPOspheric Monitoring Instrument) on board the European Space Agency (ESA) Sentinel-5 Precursor mission. With an enhancement of the computational speed by 3 orders of magnitude, TROPOMI's operational aerosol layer height processor is now able to retrieve aerosol layer heights well within operational capacity.

Launched on 13 October 2017, the TROPOspheric Monitoring Instrument

The ALH retrieval algorithm is computationally expensive, requiring several minutes to compute

The bottleneck identified here is the large number of calculations that the forward model has to compute to retrieve information on weak scatterers such as aerosols. Several steps to circumvent this bottleneck exist, such as using correlated

The studies of

Section

The TROPOMI aerosol layer height is one of the many algorithms that exploit vertical information on scattering aerosol species in the oxygen A-band

The cost function

Optimal estimation iteratively simulates TOA radiance spectra until the convergence of
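The iterative scheme can be illustrated with a Gauss-Newton update in the standard optimal estimation form. The following is a minimal sketch and not the operational processor: the toy forward model `toy_forward`, the diagonal covariances, the convergence test on the state change, and all numerical values are assumptions made for the example. Only the two-element state (aerosol layer height and aerosol optical thickness) follows the text.

```python
import math


def toy_forward(state, strengths):
    """Hypothetical stand-in for the forward model (NOT DISAMAR).

    Returns a synthetic spectrum and its Jacobian with respect to
    (layer height, optical thickness).
    """
    height, tau = state
    spectrum, jacobian = [], []
    for s in strengths:
        transmission = math.exp(-s * height)
        spectrum.append(tau * transmission)
        # (d/d height, d/d tau) of tau * exp(-s * height)
        jacobian.append((-s * tau * transmission, transmission))
    return spectrum, jacobian


def gauss_newton_step(state, prior, prior_var, meas, meas_var, strengths):
    """One Gauss-Newton update, assuming diagonal covariance matrices."""
    spectrum, jacobian = toy_forward(state, strengths)
    # Normal matrix A = K^T Se^-1 K + Sa^-1 (2 x 2, symmetric)
    a00, a01, a11 = 1.0 / prior_var[0], 0.0, 1.0 / prior_var[1]
    # Right-hand side b = K^T Se^-1 (y - F(x)) - Sa^-1 (x - x_a)
    b0 = -(state[0] - prior[0]) / prior_var[0]
    b1 = -(state[1] - prior[1]) / prior_var[1]
    for y, f, (k0, k1) in zip(meas, spectrum, jacobian):
        a00 += k0 * k0 / meas_var
        a01 += k0 * k1 / meas_var
        a11 += k1 * k1 / meas_var
        b0 += k0 * (y - f) / meas_var
        b1 += k1 * (y - f) / meas_var
    # Solve the 2 x 2 system A dx = b with Cramer's rule
    det = a00 * a11 - a01 * a01
    dx0 = (a11 * b0 - a01 * b1) / det
    dx1 = (a00 * b1 - a01 * b0) / det
    return (state[0] + dx0, state[1] + dx1)


def retrieve(meas, prior, prior_var, meas_var, strengths,
             max_iter=20, tol=1e-8):
    """Iterate until the state update is smaller than tol (assumed criterion)."""
    state = tuple(prior)
    for iteration in range(1, max_iter + 1):
        new_state = gauss_newton_step(state, prior, prior_var,
                                      meas, meas_var, strengths)
        change = max(abs(n - s) for n, s in zip(new_state, state))
        state = new_state
        if change < tol:
            return state, iteration, True
    return state, max_iter, False


# Noise-free synthetic measurement generated from a known "true" state
strengths = [0.1, 0.3, 0.5, 0.8, 1.2]
truth = (2.0, 0.5)
measured, _ = toy_forward(truth, strengths)
state, iterations, converged = retrieve(
    measured, (1.5, 0.8), (100.0, 100.0), 1e-6, strengths)
```

Every iteration calls the forward model once for the spectrum and its derivatives, which is why the cost of that call dominates the retrieval time.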

Reflectances are calculated by accounting for scattering and absorption of photons from their interactions with aerosols, the surface, and molecular species. Molecular scattering of photons in the oxygen A-band is described by Rayleigh scattering, and absorption is described by photon-induced magnetic dipole transition between

To reduce the number of calculations, various atmospheric properties are simplified. As the Rayleigh optical thickness is low at 760 nm, DISAMAR only computes the monochromatic component of light by calculating the first element of the Stokes vector. The exclusion of higher-order Stokes vector elements of the radiation fields has not been shown to be a significant source of error

The influence of rotational Raman scattering (RRS) is also ignored, as calculating it is computationally expensive. While this exclusion of RRS is not advised by the literature

Perhaps the largest simplification of the atmosphere lies in the model's description of aerosols, which are assumed to be distributed in a homogeneous layer at a height

These simplifications in the DISAMAR forward model are a necessity for the line-by-line aerosol layer height algorithm, owing to its slow computational speed. The speed-up of forward model simulation encourages an increase in the complexity of the simulation assumptions.

TROPOMI's near-infrared (NIR) spectrometer records data between 675 and 775 nm, spread across two bands: band 5 contains the oxygen B-band and band 6 contains the oxygen A-band. The spectral resolution, which is described by the full width at half maximum (FWHM) of the instrument spectral response function (ISRF), is 0.38 nm with a spectral sampling interval of 0.12 nm. The spatial resolution is around

Input parameters required by the TROPOMI ALH retrieval algorithm encompass satellite observations of the radiance and the irradiance, solar–satellite geometry, and a host of atmospheric and surface parameters required for modelling the interactions of photons within the Earth's atmosphere (see Table

TROPOMI incorporates information from the VIIRS instrument to detect the presence of cirrus clouds in the measured scene (using a cirrus reflectance threshold of 0.01). This information is further combined with cloud fraction retrievals by the TROPOMI FRESCO algorithm (maximum cloud fraction of 0.6), and the difference between the scene albedo in the database in the UV band and the apparent scene albedo at the same wavelength calculated using a lookup table (if the difference is larger than 0.2, it suggests cloud contamination). A combination of these different cloud detection strategies results in the “cloud_warning” flag in the Level-2 TROPOMI ALH product. In this paper, however, we use a strict FRESCO cloud fraction filter of 0.2 to remove cloudy pixels.
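The screening thresholds quoted above can be collected in a short sketch. The function and argument names are hypothetical, and combining the three tests with a logical OR is an assumption: the text only states that the flag results from a combination of these strategies.

```python
def cloud_warning(cirrus_reflectance, cloud_fraction, albedo_difference,
                  cirrus_threshold=0.01, fraction_threshold=0.6,
                  albedo_threshold=0.2):
    """Return True if any of the three cloud tests is triggered.

    The OR combination and the strict inequalities are assumptions;
    only the threshold values are taken from the text.
    """
    return (cirrus_reflectance > cirrus_threshold
            or cloud_fraction > fraction_threshold
            or albedo_difference > albedo_threshold)


def passes_strict_filter(cloud_fraction, max_cloud_fraction=0.2):
    """Stricter FRESCO cloud fraction filter used in this paper."""
    return cloud_fraction <= max_cloud_fraction


# Hypothetical pixel: no cirrus, moderate cloud fraction, small albedo difference
flagged = cloud_warning(0.0, 0.3, 0.05)
kept = passes_strict_filter(0.3)
```

In this example the pixel does not raise the warning, yet it is still rejected by the stricter cloud fraction filter of 0.2.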

Input parameters required for retrieving aerosol layer height using TROPOMI measured spectra.

Calculation of TOA reflectance and its derivatives with respect to

Artificial neural networks consist of connected processing units, each individually producing an output value given a certain input value. The interaction of these individual processing units, also known as nodes (or neurons), enables the connecting network to map a set of inputs (also known as the input layer) to a set of outputs (also known as the output layer). The connections are known as weights and their values symbolise the strength of a connection between two nodes. As the nodes connect inputs to the outputs, higher values in a set of connecting weights represent a stronger influence of a particular parameter in the input layer over a particular parameter in the output layer. These weights are determined after training the neural network.
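The mapping from input layer to output layer can be sketched as a forward pass through a small fully connected network. This is an illustration under stated assumptions: the layer sizes, the bias terms, the `tanh` activation, and the random initial weights are hypothetical and are not the configuration used in the paper.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create randomised (non-optimised) weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Each node outputs a weighted sum of its inputs plus a bias."""
    outputs = []
    for row, bias in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(value) if activation else value)
    return outputs


def forward(features, layers):
    """Propagate a feature vector through all layers.

    Hidden layers use tanh (an assumption); the output layer is linear.
    """
    values = features
    for index, (weights, biases) in enumerate(layers):
        is_output = index == len(layers) - 1
        values = dense(values, weights, biases,
                       None if is_output else math.tanh)
    return values


rng = random.Random(0)
# Hypothetical sizes: 4 inputs, two hidden layers of 8 nodes, 3 outputs
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 3, rng)]
prediction = forward([0.1, -0.2, 0.3, 0.5], layers)
```

With untrained weights the prediction is meaningless; training adjusts the weights so that the outputs approach the values in the training data set.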

The training (or optimisation) of a neural network begins with a training data set containing many instances of input and output layer elements. As the true values of the output layer for a given set of inputs are exactly known in the training data set, the error in the output that the neural network produces with randomised, non-optimised weights can be easily calculated. These errors are called prediction errors, and are an essential element in the optimisation of the neural network weights. The mean squared error (MSE) between the true output and the calculated output is also called the loss function (henceforth annotated as
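The role of the mean squared error in the optimisation can be illustrated with a single linear node trained by plain gradient descent. The optimiser, the learning rate, the number of steps, and the toy data are assumptions chosen for the example; they are not the settings used to train the models in this paper.

```python
def mean_squared_error(true_values, predicted_values):
    """Mean of the squared prediction errors."""
    return sum((t - p) ** 2
               for t, p in zip(true_values, predicted_values)) / len(true_values)


def train_linear_node(inputs, targets, learning_rate=0.5, steps=500):
    """Fit prediction = weight * x + bias by gradient descent on the MSE."""
    weight, bias = 0.0, 0.0  # non-optimised starting values
    history = []
    n = len(inputs)
    for _ in range(steps):
        predictions = [weight * x + bias for x in inputs]
        history.append(mean_squared_error(targets, predictions))
        errors = [p - t for p, t in zip(predictions, targets)]
        # Gradients of the MSE with respect to weight and bias
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2.0 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias, history


# Toy training data generated from a known linear relation
inputs = [0.0, 0.25, 0.5, 0.75, 1.0]
targets = [2.0 * x + 1.0 for x in inputs]
weight, bias, history = train_linear_node(inputs, targets)
```

The recorded `history` shows the loss falling from its initial value as the weight and bias approach the values used to generate the targets.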

The standard architecture of the NN-augmented operational aerosol layer height processor includes three neural network models for estimating top-of-atmosphere sun-normalised radiance, the derivative of the reflectance with respect to

The models are trained using the Python TensorFlow module

The inputs for NN are collectively referred to as the feature vector. The choice of parameters included in the feature vector strongly influences the performance of the neural network. The primary classes of model parameters (relevant to retrieving

Scene-dependent input model parameters for the NN model. See also Fig.

As the NN forward model is specifically designed for TROPOMI, the solar–satellite geometry is selected to represent TROPOMI orbits for the training data. Meteorological parameters for the locations associated with these solar–satellite geometries are derived from the 2017 60-layer ERA-Interim reanalysis data

Generally, the required training data size increases with increasing nonlinearity between the input and output layers of a neural network; there is no specific method to accurately determine the required sample size before training. The number of spectra generated for the training set was therefore determined by training different models on training sets ranging from 1000 to 600 000 spectra. In general, incorporating more data resulted in a better neural network model. For the final training data set, 500 000 spectra were selected. Finding the optimal neural network configuration requires testing the trained neural network model. To that end, the training data set was divided into a training–testing split, where the model was trained on the majority of the training data set and tested on the remaining minority. Once trained, the model was tested again on a test data set with 100 000 scenes outside of the training data set. These spectra were generated using DISAMAR with the model parameter ranges described in Table

Histograms of the various input parameters for each of the neural network models in NN. Minimum and maximum values for each of the parameters are shown in Table

The optimal configuration for each of the three NN models is the combination of the number of hidden layers, the number of nodes on each layer, and the activation function for which the discrepancy between the modelled output for specific inputs and the truth (derived from DISAMAR) is minimal. The difference between the outputs calculated by DISAMAR and NN for these three models provides insight into their performance.

To determine the optimal number of layers, the optimal number of nodes per layer, and the activation function, several neural network configurations were trained for 250 000 iterations and their summed losses (defined as

Summed loss as a function of training step for different neural network model configurations.

To begin, with 50 nodes per hidden layer, three neural networks – one-layered, two-layered, and three-layered – for each of the three models were trained. The neural network models performed best with at least two hidden layers (Fig.

Schematic of each of the three neural networks in NN. There are two hidden layers, each containing 100 nodes.

Performance of the finalised neural network. Panels

The finalised configurations were then trained for 1 million iterations after which they were applied to the test data set to study prediction errors. Figure

To test the NN-augmented retrieval algorithm, we apply the generated NN models to synthetic test data and real data from TROPOMI and compare their retrieval capabilities to those of DISAMAR. The synthetic data were produced using the DISAMAR radiative transfer model; therefore, we expect the online radiative transfer retrievals to be generally better than the NN-based retrievals. The aerosol model utilised in the retrieval is the same as that in Sect.

A comparison of biases (in the presence of model errors) in the final retrieved solution is indicative of the efficacy of NN in replacing DISAMAR to retrieve ALH. To directly compare the

Retrieved layer heights compared between DISAMAR and NN for 2000 synthetic spectra in the presence of model errors. The dots represent converged scenes only, with the

Histogram of differences between the retrieved

A count of converged and non-converged results from synthetic experiments (sim) comparing retrieved (ret) aerosol layer heights between DISAMAR and NN.

The retrieved aerosol layer heights from DISAMAR and NN in the presence of model errors in aerosol layer thickness were found to be similar (Fig.

A total of 5558 retrievals from the 8000 different cases converged to a final solution. On average,

From the 8000 scenes within the synthetic experiment, NN retrieved aerosol layer heights for 546 scenes where DISAMAR did not. Conversely, 586 scenes converged for DISAMAR and not for NN. A comparison of the biases from these odd retrieval results is plotted in Fig.
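The bookkeeping behind such a comparison can be sketched as a tally of convergence outcomes per scene, together with the bias (retrieved minus true) of each converged retrieval. The function names and the small example lists are hypothetical and are not the data from the synthetic experiment.

```python
from collections import Counter


def tally_convergence(disamar_converged, nn_converged):
    """Count scenes by which of the two forward models converged."""
    labels = {
        (True, True): "both",
        (True, False): "disamar_only",
        (False, True): "nn_only",
        (False, False): "neither",
    }
    return Counter(labels[(bool(d), bool(n))]
                   for d, n in zip(disamar_converged, nn_converged))


def biases(retrieved, true, converged):
    """Retrieved minus true value, for converged scenes only."""
    return [r - t for r, t, c in zip(retrieved, true, converged) if c]


# Hypothetical flags and heights for five scenes
disamar_flags = [True, True, False, False, True]
nn_flags = [True, False, True, False, True]
counts = tally_convergence(disamar_flags, nn_flags)
nn_bias = biases([1.5, 0.0, 2.5, 0.0, 3.0], [1.0, 2.0, 2.0, 4.0, 3.0], nn_flags)
```

Scenes in the `disamar_only` and `nn_only` groups correspond to the cases where only one of the two retrievals reaches a solution.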

Histogram of biases (retrieved minus true) for scenes in the synthetic experiment for which either NN converges to a solution and DISAMAR does not (red bar plot), or DISAMAR converges to a solution and NN does not (blue bar plot).

The December 2017 southern California wildfires have been attributed to very low humidity levels, following delayed autumn precipitation and severe multi-annual drought

Figure

Comparison of retrieved aerosol layer heights from TROPOMI-measured spectra (orbit number 858) for the 12 December 2017 southern California fires using DISAMAR and NN.

Statistics of difference between retrieved

The time required by the line-by-line operational processor was

Of the algorithms that currently retrieve TROPOMI's suite of Level-2 products, the aerosol layer height processor is an example of one that requires online radiative transfer calculations. These online calculations have traditionally been tackled with KNMI's radiative transfer code DISAMAR, which calculates (among other parameters) sun-normalised radiances in the oxygen A-band. There are, in total, 3980 line-by-line calculations per iteration in the optimal estimation scheme, requiring several minutes to retrieve aerosol layer height estimates from a single scene. This limits the yield of the aerosol layer height processor significantly.

The bottleneck is identified to be the number of calculations DISAMAR needs to carry out at every iteration of the Gauss–Newton scheme of the estimation process. As a replacement, this paper proposes using artificial neural networks in the forward model step. Three neural networks are trained for the sun-normalised radiance and the derivative of the reflectance with respect to aerosol layer height and aerosol optical thickness, which are the two state vector elements. As the goal is to replicate and replace DISAMAR, line-by-line forward model calculations from DISAMAR were used to train these neural networks. A total of 500 000 spectra were generated using DISAMAR, and each of the neural network models was trained for a total of 1 million iterations with the mean squared error between the training data output and the neural network output being the cost function to be minimised in the optimisation process.

Over a test data set with 100 000 scenes distinct from the training data set, the neural network models performed well, with errors generally not exceeding 1 %–3 % in the predicted spectra and derivatives. Having been tested for prediction errors in the forward model output spectra, the neural network models were implemented into the aerosol layer height breadboard algorithm and further tested for retrieval accuracy. To that end, experiments with synthetic as well as real data were conducted. The synthetic scenes included 2000 spectra with different model errors in aerosol and surface properties. In these cases, the neural network algorithm showed very good compatibility with the aerosol layer height algorithm, as it was able to replicate the biases satisfactorily.

We evaluated aerosol layer heights retrieved from TROPOMI measurements over southern California on 12 December 2017, when the fire plume extended from land to ocean over a dry and almost cloudless scene. Operational retrievals using both DISAMAR and the neural network forward models showed very similar results, with a few outliers around 500 m for pixels containing low aerosol loads. These biases were outweighed by the upgrade in the computational speed of the retrieval algorithm, as the neural-network-augmented processor achieved a speed-up of 3 orders of magnitude, making the aerosol layer height processor operationally feasible. Having achieved this improvement in its computational performance, the aerosol layer height algorithm is planned to operationally retrieve the product for all possible pixels in each orbit of TROPOMI. Such a boost in processor output allows for better analyses of retrievals and offers the possibility of removing some of the forward model simplifications mentioned in Sect.

Satellite images of the 12 December 2017 Californian fires were derived from the MODIS 1 km Calibrated Radiances product developed by the

SN developed the neural network algorithm, supervised by MdG, JPV, and PFL. Several adjustments to the algorithm were made by MS, who also offered alternative viewpoints on the algorithm, supported the deployment of the algorithm, and helped diagnose the algorithm's performance post-deployment. JdH developed DISAMAR. MtL deployed the algorithm into the operational TROPOMI Level-2 processor.

The authors declare that they have no conflict of interest.

This article is part of the special issue “TROPOMI on Sentinel-5 Precursor: first year in operation (AMT/ACPT inter-journal SI)”. It is not associated with a conference.

This publication contains modified Copernicus Sentinel data. This research is partly funded by the European Space Agency (ESA) within the EU Copernicus programme.

This paper was edited by Jhoon Kim and reviewed by three anonymous referees.