Articles | Volume 16, issue 21

https://doi.org/10.5194/amt-16-5305-2023

Research article

09 Nov 2023

A neural-network-based method for generating synthetic 1.6 µm near-infrared satellite images

Florian Baur, Leonhard Scheck, Christina Stumpf, Christina Köpken-Watts, and Roland Potthast

Abstract

In combination with observations from visible satellite channels, near-infrared channels can provide valuable additional cloud information, e.g. on cloud phase and particle sizes, which is also complementary to the information content of thermal infrared channels. Exploiting near-infrared channels for operational data assimilation and model evaluation requires a sufficiently fast and accurate forward operator. This study presents an extension to the method for fast satellite image synthesis (MFASIS) that allows for simulating reflectances of the 1.6 µm near-infrared channel based on a computationally efficient neural network with the same accuracy that has already been achieved for visible channels. For this purpose, it is important to better represent vertical variations in effective cloud particle radii, as well as mixed-phase clouds and molecular absorption in the idealized profiles used to train the neural network. A new approach employing a two-layer model of water, ice and mixed-phase clouds is described, and the relative importance of the different input parameters characterizing the idealized profiles is analysed. A comprehensive data set sampled from Integrated Forecasting System (IFS) forecasts together with different parameterizations of the effective water and ice particle radii is used for the development and evaluation of the method. Further evaluation uses a month of ICOsahedral Non-hydrostatic development based on version 2.6.1 (ICON-D2) hindcasts with effective radii directly determined by the two-moment microphysics scheme of the model. In all cases, the mean absolute reflectance error achieved is about 0.01 or smaller, which is an order of magnitude smaller than typical differences between reflectance observations and corresponding model values. 
The errors related to the imperfect training of the neural networks present only a small contribution to the total error, and evaluating the networks takes less than a microsecond per column on standard CPUs. The method is also applicable for many other visible and near-infrared channels with weak water vapour sensitivity.

Received: 27 Feb 2023 – Discussion started: 18 Apr 2023 – Revised: 08 Aug 2023 – Accepted: 21 Aug 2023 – Published: 09 Nov 2023

1 Introduction

Over the last decades, satellite observations have become the most important observation type assimilated in numerical weather prediction (NWP) systems. They dominate not only in terms of the total number of assimilated observations, but also with respect to the overall impact on the forecast quality of operational global NWP systems (Bormann et al., 2019; Eyre et al., 2022). The preferred way to assimilate satellite observations from imagers and sounders is the direct assimilation of radiances, which requires a forward operator to generate synthetic radiances from the NWP model state. In contrast to assimilating retrievals, no external a priori information (e.g. from other models or climatologies) is required in the direct assimilation approach, and in general the characterization of errors is also less problematic (Errico et al., 2007). Satellite radiances are increasingly assimilated not only in clear-sky conditions, but also in the presence of clouds and precipitation. This so-called “all-sky” approach is being applied successfully to microwave (MW) observations in different global NWP systems (Geer et al., 2018), and progress is being made towards the direct all-sky assimilation of infrared (IR) observations (Geer et al., 2018, 2019; Li et al., 2022). Similarly, satellite observations are also assimilated in many regional models (Gustafsson et al., 2018) with a particular focus on observations from geostationary imagers providing temperature, moisture and cloud information with high temporal and spatial resolution (see e.g. Otkin and Potthast, 2019; Okamoto, 2017).

Efforts to improve the exploitation of satellite observations that are currently underutilized are ongoing, both in terms of assimilating already operational data under all conditions and over all surfaces and in terms of using channels that are not yet directly assimilated at all (Valmassoi et al., 2022; Hu et al., 2022). Solar satellite channels fall into the latter category, mostly because sufficiently fast and accurate forward operators are missing or have only become available recently. The development of such operators was hampered by the fact that standard radiative transfer (RT) methods for the solar spectral range (with wavelengths λ<4 µm) are computationally very expensive, as they require the detailed modelling of multiple scattering processes, which are much more important than in the thermal part of the spectrum (λ>4 µm). Moreover, 3D RT effects, i.e. effects involving horizontal photon transport, e.g. related to inclined cloud tops, cloud shadows or complex topography (see Marshak and Davis, 2005, for a detailed discussion), can be important for solar channels, especially at high resolutions and for large zenith angles. For visible channels Scheck et al. (2016) developed a method for fast satellite image synthesis (MFASIS), an efficient 1D RT approach based on a strong simplification of the vertical cloud structure and the use of precomputed results stored in compressed lookup tables (LUTs). This LUT-based version of MFASIS has been integrated into the Radiative Transfer for TOVS (RTTOV) satellite forward operator package in v12.2 with subsequent improvements in v12.3 and v13.1 (Saunders et al., 2018, 2020). MFASIS has been used in several model evaluation studies (Heinze et al., 2016; Stevens et al., 2020; Sakradzija et al., 2020; Geiss et al., 2021) as well as data assimilation studies (Schröttle et al., 2020; Scheck et al., 2020). 
The assimilation of visible SEVIRI (Spinning Enhanced Visible and InfraRed Imager) observations at 0.6 µm using RTTOV-MFASIS became operational at DWD (Deutscher Wetterdienst, German Meteorological Service) in March 2023. An extension to MFASIS to account for the most important 3D RT effects in a computationally efficient way is available (Scheck et al., 2018), and recently a faster and more flexible version based on neural networks instead of LUTs has been developed (Scheck, 2021 a) and integrated into RTTOV 13.2.

The cloud information contained in visible channels is complementary to that available from thermal infrared channels. While visible channels provide almost no information that could be used to determine the cloud top height or discriminate frozen clouds from liquid clouds, they contain much more information on the cloud water or cloud ice content as they saturate only for much thicker clouds than thermal channels (Geiss et al., 2021). There is also some dependency of visible radiances on the cloud particle sizes and the surface structure of clouds.

Near-infrared (NIR) channels ($0.75 \leq \lambda \leq 4$ µm) depend on cloud particle sizes and angles in a different way than visible channels and can thus provide additional information that could be very valuable for both model evaluation and data assimilation. The combined information of visible and near-infrared channels has been successfully used for many years to simultaneously retrieve cloud optical thickness and effective particle radii (following Nakajima and King, 1990). Such observations constraining the cloud microphysics are also of special relevance for NWP models employing advanced cloud physics schemes like two-moment schemes that provide prognostic effective cloud particle sizes (see e.g. Seifert and Beheng, 2006). Of particular interest is the 1.6 µm channel available on many satellite imagers because at this wavelength ice absorbs radiation considerably more strongly than water. In combination with a visible channel, for which absorption by both water and ice is very weak, the 1.6 µm channel can thus provide information that is helpful for distinguishing liquid clouds from frozen clouds (but will not in all cases allow for a clear distinction; see e.g. Fig. 4 in Coopman et al., 2019). While information on the cloud phase is also available from thermal infrared channels, using near-infrared channels in addition (Baum et al., 2000) or instead (Nagao and Suzuki, 2021) can improve the reliability of retrievals. Assimilating the 1.6 µm channel could thus be a promising way to reduce cloud-phase errors.

MFASIS can already be applied to NIR channels, and LUTs for 1.6 µm channels of different instruments are available as part of the RTTOV package. However, mainly due to the rather approximate treatment of mixed-phase clouds, the currently employed method is considerably less accurate for this channel. Some corrections included in RTTOV 13.1 allow for avoiding the largest errors, but the accuracy in the 1.6 µm channel is still lower than in visible channels. This study demonstrates how to both improve the accuracy and reduce the computational effort through using a machine-learning-based approach. We focus on generating synthetic images for the 1.6 µm channel of the SEVIRI instrument aboard Meteosat Second Generation (MSG) from global and regional NWP model data. Building on the neural-network-based results for visible channels of Scheck (2021 a), suitable network input parameters to account for the more complex dependency of near-infrared radiances on the atmospheric state will be identified. Networks with these input parameters are then trained and tested on different data sets.

The rest of this study is organized as follows: data and methods are discussed in Sect. 2, suitable network input parameters are derived in Sect. 3, the training of neural networks based on these profiles is discussed in Sect. 4, the full method is evaluated using different data sets in Sect. 5 (also for other solar channels) and conclusions are given in Sect. 6.

2 Data and methods

2.1 Radiative transfer methods

2.1.1 DOM

For reference calculations and the generation of neural network training data, the discrete ordinate method (DOM; see Stamnes et al., 1988) is used. We rely on the implementation of DOM in the RTTOV package (Saunders et al., 2018). The required input data comprise vertical profiles of the cloud water and cloud ice content, including the corresponding effective particle radii, a value for the surface albedo (A), solar and satellite zenith angles (θ₀, θ), and the difference in their azimuth angles (Δϕ). DOM solves the plane-parallel radiative transfer equations and computes the resulting top-of-atmosphere reflectance. In RTTOV, the liquid cloud optical properties used in this process are based on Mie (1908), and for ice clouds the optical properties for the general habit mixture of Baum et al. (2005, 2007) are used. Aerosols are neglected in this study, but clear-sky Rayleigh scattering and molecular absorption are taken into account.

2.1.2 MFASIS

DOM generates accurate 1D RT solutions but is far too slow for operational applications like data assimilation. For this reason, the fast method MFASIS (Scheck et al., 2016) was developed and has subsequently been implemented in RTTOV (beginning with version 12.2; see Saunders et al., 2020). In MFASIS, the cloud top height and details of the vertical cloud structure are not taken into account when computing the reflectance, yet the reflectance errors with respect to the DOM solution remain small. These properties of the input profiles can therefore be considered of minor importance for the resulting reflectance. In MFASIS, the complex vertical profiles from NWP runs are replaced by highly idealized profiles with the same total optical depths and mean effective particle radii. These idealized profiles contain a homogeneous ice cloud above a homogeneous water cloud at fixed heights embedded in a standard atmosphere. Only eight parameters are used to fully characterize the idealized radiative transfer problem: the optical depths and vertically averaged effective particle radii for the water and the ice cloud, three angles to define the sun–satellite geometry, and the surface albedo. Reflectances for many combinations of the parameters are precomputed using DOM and are stored in an 8D lookup table (LUT). The latter is reduced from 8 GB to 21 MB using a lossy compression method. To obtain reflectances for arbitrary input profiles, it is only necessary to compute the input parameters from them and interpolate the reflectance in the LUT at the corresponding location.¹ This process only takes several microseconds and is thus orders of magnitude faster than running DOM. Both the achieved speed and the accuracy are sufficient for assimilation of visible radiance observations in operational applications.
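The reduction of a full profile to the eight MFASIS parameters can be sketched as follows. This is a minimal illustration and not the RTTOV implementation: the names `IdealizedInputs`, `mean_radius` and `idealized_inputs`, the fallback radii for cloud-free columns, and the optical-depth weighting of the mean radii are assumptions made here. The text only states that the idealized profile preserves the total optical depths and the mean effective radii.

```python
from typing import NamedTuple, Sequence


class IdealizedInputs(NamedTuple):
    """Eight parameters of the idealized problem (field names are hypothetical)."""
    tau_w: float   # total water cloud optical depth
    tau_i: float   # total ice cloud optical depth
    r_w: float     # mean effective droplet radius (um)
    r_i: float     # mean effective ice particle radius (um)
    theta0: float  # solar zenith angle (deg)
    theta: float   # satellite zenith angle (deg)
    dphi: float    # azimuth angle difference (deg)
    albedo: float  # surface albedo


def mean_radius(dtau: Sequence[float], radii: Sequence[float], fallback: float) -> float:
    """Optical-depth-weighted mean radius; the weighting is an assumption."""
    total = sum(dtau)
    if total <= 0.0:
        return fallback  # no cloud of this phase, so the radius does not matter
    return sum(t * r for t, r in zip(dtau, radii)) / total


def idealized_inputs(dtau_w, r_w, dtau_i, r_i, theta0, theta, dphi, albedo) -> IdealizedInputs:
    """Collapse per-layer optical depths and radii into the eight parameters."""
    return IdealizedInputs(
        tau_w=sum(dtau_w),
        tau_i=sum(dtau_i),
        r_w=mean_radius(dtau_w, r_w, fallback=10.0),  # hypothetical fallback value
        r_i=mean_radius(dtau_i, r_i, fallback=30.0),  # hypothetical fallback value
        theta0=theta0,
        theta=theta,
        dphi=dphi,
        albedo=albedo,
    )


# Three-layer example column with made-up values (top layer first).
params = idealized_inputs(
    dtau_w=[0.0, 1.0, 3.0], r_w=[10.0, 8.0, 12.0],
    dtau_i=[0.5, 0.0, 0.0], r_i=[30.0, 40.0, 40.0],
    theta0=60.0, theta=30.0, dphi=120.0, albedo=0.2,
)
```

In a LUT-based scheme, a tuple like `params` would then be the location at which the precomputed reflectances are interpolated.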

While the simplification of the vertical profiles in MFASIS causes reflectance errors that are acceptable for data assimilation or model evaluation using visible channels, they remain significantly larger for the 1.6 µm near-infrared channel that is considered in this study for three reasons:

1. The absorption in ice is considerably stronger than in water. Replacing mixed-phase clouds, which are often dominated by liquid water at the top, by an ice cloud above a water cloud therefore causes large errors.
2. The 1.6 µm channel is slightly affected by molecular absorption (due to CO₂, CH₄ and for wider channels like the one on the SEVIRI instrument considered here also water vapour), which means that the air mass between the cloud and satellite will have a stronger influence on the reflectance than in visible channels.² For SEVIRI the water vapour mass will also have some influence. To give an example, for a relatively high column-integrated water vapour content of 50 mm and solar and satellite zenith angles of 60° the reflectance is reduced by about 5 %.
3. The vertical variation in the effective particle radii is not taken into account in the simplified profiles. While this error source alone would still be acceptable (it is indeed also present for visible channels), it contributes significantly to the total error, in addition to the two other problems listed above. The problem is that the effective radii in the uppermost cloud layers, from which photons can escape after single-scattering events, may be different from the effective radii at higher optical depths that contribute to the reflectance by multiple scattering processes. To approximate both the correct scattering angle dependence of the reflectance, which is dominated by single-scattering processes, and the correct angle-averaged reflectance, which is often dominated by multiple scattering processes, at least two different radii are required.
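The dependence of the molecular absorption effect on the viewing geometry can be illustrated with a back-of-the-envelope calculation. The sketch below assumes simple Beer-Lambert attenuation along the two-way slant path in a plane-parallel atmosphere, with all of the absorber above the reflecting cloud. Neither assumption is stated in the text, and band-averaged absorption does not follow this law exactly, so the numbers indicate only the geometric scaling. All function names are hypothetical.

```python
import math


def air_mass_factor(theta0_deg: float, theta_deg: float) -> float:
    """Two-way slant path factor 1/cos(theta0) + 1/cos(theta), plane-parallel."""
    mu0 = math.cos(math.radians(theta0_deg))
    mu = math.cos(math.radians(theta_deg))
    return 1.0 / mu0 + 1.0 / mu


def implied_optical_depth(reduction: float, theta0_deg: float, theta_deg: float) -> float:
    """Vertical absorption optical depth that yields the given relative reduction."""
    return -math.log(1.0 - reduction) / air_mass_factor(theta0_deg, theta_deg)


def relative_reduction(tau_abs: float, theta0_deg: float, theta_deg: float) -> float:
    """Relative reflectance reduction for a vertical absorption optical depth."""
    return 1.0 - math.exp(-tau_abs * air_mass_factor(theta0_deg, theta_deg))


# Geometry of the example in the text: both zenith angles at 60 degrees.
m_60 = air_mass_factor(60.0, 60.0)
# Optical depth consistent with the quoted 5 % reduction (under the assumptions above).
tau_abs = implied_optical_depth(0.05, 60.0, 60.0)
# Same absorber amount, but overhead sun and nadir view.
reduction_overhead = relative_reduction(tau_abs, 0.0, 0.0)
```

Under these assumptions the path factor is 4 for the 60° geometry and 2 for overhead sun and nadir view, so the same absorber amount would reduce the reflectance by roughly 2.5 % instead of 5 %.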

Preliminary solutions to reduce the largest errors were introduced into the MFASIS implementation in RTTOV v13.1. First, ice within or below water clouds is replaced with water clouds of the same optical depth, which reduces the errors for mixed-phase clouds. Second, the computation of the mean effective radius was modified to give more weight to the upper cloud layers for thick clouds. While these corrections succeeded in removing the largest errors, the mean errors are still considerably larger than in visible channels. In this study we present a new neural-network-based approach that is both more accurate and faster.

2.1.3 Neural networks and MFASIS-NN

Artificial neural networks are among the most popular machine learning approaches. They have the advantages that mature, easy-to-use implementations are available and that many CPUs and GPUs now support hardware-accelerated training and evaluation of these networks. A neural-network-based version of MFASIS for visible channels, in the following referred to as MFASIS-NN, has been developed by Scheck (2021 a). While the simplification of vertical profiles in this method is the same as in MFASIS, the LUT is replaced by a deep feed-forward neural network. The input parameters of the network correspond to the dimensions of the LUT. Reflectances for arbitrary albedo values can be computed from the three output parameters approximating reflectances for surface albedo values 0, 0.5 and 1 (see Eq. 6 in Scheck, 2021 a, and the discussion in Sect. 4.1). The study shows that networks with several thousand parameters in four to eight hidden layers can be trained well enough to achieve reflectance errors that are in general smaller than those of the LUT version. The amount of data to be generated with DOM for the training is a factor of 1000 smaller than the 8 GB required for the LUT-based MFASIS. Moreover, using a computationally cheap activation function and a Fortran inference code optimized for small networks, MFASIS-NN is an order of magnitude faster than the LUT-based MFASIS.
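To make "several thousand parameters" concrete, the weights and biases of a fully connected feed-forward network can be counted as in the sketch below. The function name `count_parameters` is hypothetical. The choice of seven inputs is an assumption: it takes the eight LUT dimensions and removes the surface albedo, which is handled through the three outputs. The layer widths of 15, 25 and 32 nodes in 8 hidden layers are the ones considered in this study.

```python
def count_parameters(n_inputs: int, n_hidden_layers: int, n_nodes: int, n_outputs: int) -> int:
    """Number of weights and biases in a fully connected feed-forward network."""
    sizes = [n_inputs] + [n_nodes] * n_hidden_layers + [n_outputs]
    # Each layer contributes a weight matrix (fan_in * fan_out) and one bias per output node.
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))


# Assumed configuration: 7 inputs, 8 hidden layers, 3 outputs (albedo 0, 0.5 and 1).
counts = {n_nodes: count_parameters(7, 8, n_nodes, 3) for n_nodes in (15, 25, 32)}
```

For these assumed configurations the counts range from roughly two thousand to roughly eight thousand parameters.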

As in Scheck (2021 a), deep feed-forward neural networks are used in this study. Networks with 8 hidden layers and 15, 25 or 32 nodes per layer are considered. The networks are initialized with random numbers and trained with the open-source TensorFlow package (Abadi et al., 2015) using standard methods. The mini-batch gradient descent method (with a batch size of 256) and the Adam algorithm (see Chap. 8 in Goodfellow et al., 2016) with a learning rate of $2.5 \times 10^{- 4}$ were utilized for this purpose. About 1.4×10⁷ synthetic training data samples were generated by assuming random numbers for the normalized network input parameters (uniformly distributed in [0,1], which means that the unnormalized, physical variables are uniformly distributed over the ranges given by Table 2) and by computing the corresponding reflectance with DOM. A total of 80 % of the samples were used for the training, and 20 % served as independent validation data. The networks were trained for 4000 epochs. During the training, the updated network weights and biases were stored only if they resulted in a reduced root-mean-square error of the validation data set, an approach known as early stopping. For the evaluation or inference of networks, we employed FORNADO, an optimized Fortran code including tangent linear and adjoint versions. To reduce the computational effort, the “cheap soft unit” (CSU; see Fig. 2 in Scheck, 2021 a), defined as

$$ f_{\mathrm{CSU}}(x) = \begin{cases} -1, & \text{if } x < -2 \\ -1 + 0.25\,(x + 2)^{2}, & \text{if } x \in [-2, 0] \\ x, & \text{if } x > 0, \end{cases} \tag{1} $$

was used as an activation function for the hidden layers. This function is very similar to the well-known exponential-linear unit (ELU), $f_{\mathrm{ELU}}(x) = \min(e^{x} - 1, |x|)$, but does not involve a computationally expensive exponential function, which can also prevent the compiler from using vector instructions. For the output nodes, we used the softplus function, $f_{\mathrm{softplus}}(x) = \ln(1 + e^{x})$, which guarantees that all output values are positive.
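The three activation functions mentioned above can be written down in a few lines. The following is a plain-Python sketch for illustration only and is unrelated to the optimized Fortran code in FORNADO. The branch for $x < -2$ is taken as $-1$ here, the value that makes the CSU continuous at $x = -2$ and matches the limit of the ELU. The ELU is written in its piecewise form, which equals $\min(e^{x} - 1, |x|)$ but avoids overflow for large arguments.

```python
import math


def csu(x: float) -> float:
    """Cheap soft unit: piecewise polynomial, no exponential needed."""
    if x < -2.0:
        return -1.0                            # constant branch (assumed continuous limit)
    if x <= 0.0:
        return -1.0 + 0.25 * (x + 2.0) ** 2    # quadratic branch on [-2, 0]
    return x                                   # linear branch for x > 0


def elu(x: float) -> float:
    """Exponential-linear unit, equal to min(exp(x) - 1, |x|)."""
    return x if x > 0.0 else math.exp(x) - 1.0


def softplus(x: float) -> float:
    """ln(1 + exp(x)) in a numerically stable form; always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Largest difference between CSU and ELU on a grid covering [-6, 3].
grid = [-6.0 + 0.01 * k for k in range(901)]
max_difference = max(abs(csu(x) - elu(x)) for x in grid)
```

The quadratic branch joins the constant and linear branches with matching values and slopes (slope 0 at $x = -2$ and slope 1 at $x = 0$), so the CSU is smooth in the same sense as the ELU while requiring only a multiplication and an addition.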

2.2 NWP SAF profiles

A comprehensive set of profiles available from the Satellite Application Facility for Numerical Weather Prediction (NWP SAF) project is used to tune and evaluate the methods developed for this study. The data set comprises 5000 individual profiles selected from a year (1 September 2013–31 August 2014) of short-range forecasts produced with the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF) using an algorithm which only selects profiles that are sufficiently different in cloud variables from the other selected profiles. The profiles represent realistic seasonal variability, and, as they are spread over the entire globe, global variability is also well represented. About 30 % of the profiles are located over land and about 40 % between the northern and southern tropics. Refer to Eresmaa and McNally (2014) for further information about the data set. It should be noted that the cloud fraction profiles, c(z), were modified for this study. To avoid having to take cloud overlap into account, for which different assumptions exist (see e.g. Scheck et al., 2018) and which is not the focus of this work, the cloud fraction was set to zero for $c < \frac{1}{2}$ and to one for $c \geq \frac{1}{2}$. While this simplification certainly has some impact on the distribution of total optical depths, it should not pose a serious limitation, and it makes reference calculations with DOM much cheaper.³
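The modification of the cloud fraction profiles amounts to a simple thresholding step, sketched below. The function name `binarize_cloud_fraction` and the example values are hypothetical. How the cloud water and ice contents are treated in layers whose fraction is set to zero is not specified in the text and is not covered here.

```python
from typing import List, Sequence


def binarize_cloud_fraction(c_profile: Sequence[float], threshold: float = 0.5) -> List[float]:
    """Set the cloud fraction to 0 below the threshold and to 1 at or above it."""
    return [1.0 if c >= threshold else 0.0 for c in c_profile]


# Made-up cloud fraction profile c(z) on five model levels.
c_original = [0.0, 0.2, 0.5, 0.8, 1.0]
c_binary = binarize_cloud_fraction(c_original)
```

With only fully cloudy or cloud-free layers, each column can be treated as a single plane-parallel profile, so no overlap assumption is needed.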

The NWP SAF profiles do not contain any information on effective cloud particle sizes, which are required for RT calculations. Therefore, parameterizations have to be used. For effective radii of water cloud droplets, the parameterization of Martin et al. (1994) is used, which depends on the liquid water content and a droplet number concentration N_C. Here, we adopt either N_C=100 cm⁻³ or N_C=200 cm⁻³, which are typical values used in NWP models. For effective ice particle sizes we rely either on the parameterization by McFarquhar et al. (2003), which depends only on the ice content, or on the one by Wyser (1998), which depends in addition on the temperature. All of these radius parameterizations can produce unrealistically small radii for low water/ice contents and under certain conditions also radii that are larger than the maximum radii RTTOV accepts. To reduce the impact of these cases, we limit the effective droplet radii to the range [5 µm,25 µm] and the effective ice particle radii to the range [20 µm,60 µm], as in Scheck et al. (2016). With these effective radii and the water/ice contents from the IFS data, extinction coefficient profiles for water and ice cloud layers (β_w(z) and β_i(z)) can be computed using channel-specific conversion factors provided by RTTOV. The vertical integrals of β_w(z) and β_i(z) are the water and ice optical depths τ_w and τ_i. From their distribution (Fig. 1) it is evident that there is a wide variety of water, ice and mixed-phase clouds with optical depths up to several hundreds.
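The limiting of the effective radii and the vertical integration of the extinction coefficients can be sketched as follows. This is an illustration under stated assumptions: the radius parameterizations and the channel-specific conversion factors from RTTOV are not reproduced, the function names are hypothetical, the example profiles are made up, and the integral is approximated by a sum over layers of constant extinction.

```python
from typing import List, Sequence, Tuple

WATER_RADIUS_RANGE_UM: Tuple[float, float] = (5.0, 25.0)   # limits for droplets
ICE_RADIUS_RANGE_UM: Tuple[float, float] = (20.0, 60.0)    # limits for ice particles


def limit_radii(radii_um: Sequence[float], radius_range: Tuple[float, float]) -> List[float]:
    """Clip effective radii to the accepted range."""
    lower, upper = radius_range
    return [max(lower, min(upper, r)) for r in radii_um]


def optical_depth(beta_per_m: Sequence[float], layer_thickness_m: Sequence[float]) -> float:
    """Approximate the vertical integral of beta(z) by a sum over layers."""
    return sum(beta * dz for beta, dz in zip(beta_per_m, layer_thickness_m))


# Made-up three-layer example (top layer first).
r_w_limited = limit_radii([2.0, 12.0, 31.0], WATER_RADIUS_RANGE_UM)
r_i_limited = limit_radii([15.0, 45.0, 80.0], ICE_RADIUS_RANGE_UM)
tau_w = optical_depth([0.0, 0.02, 0.05], [500.0, 200.0, 100.0])
tau_i = optical_depth([0.001, 0.0, 0.0], [500.0, 200.0, 100.0])
```

In this example the water cloud has an optical depth of about 9 and the ice cloud of about 0.5, and radii outside the accepted ranges are moved to the nearest limit.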

Figure 1. Water cloud optical depth τ_w and ice cloud optical depth τ_i for all profiles of the std data set (see Table 1).
