An improved near-real-time precipitation retrieval for Brazil

Pfreundschuh, Simon; Ingemarsson, Ingrid; Eriksson, Patrick; Vila, Daniel A.; Calheiros, Alan J. P.

doi:https://doi.org/10.5194/amt-15-6907-2022

Articles | Volume 15, issue 23

Research article

01 Dec 2022

Abstract

Observations from geostationary satellites can provide spatially continuous coverage at continental scales with high spatial and temporal resolution. Because of this, they are commonly used to complement ground-based precipitation measurements, whose coverage is often more limited.

We present Hydronn, a neural-network-based, near-real-time precipitation retrieval for Brazil based on visible and infrared (Vis–IR) observations from the Advanced Baseline Imager (ABI) on the Geostationary Operational Environmental Satellite 16 (GOES-16). The retrieval, which employs a convolutional neural network to perform Bayesian precipitation retrievals, was developed with the aims of (1) leveraging the full potential of latest-generation geostationary observations and (2) providing probabilistic precipitation estimates with well-calibrated uncertainties. The retrieval is trained using more than 3 years of collocations with combined radar and radiometer retrievals from the Global Precipitation Measurement (GPM) core observatory over South America.

The accuracy of instantaneous precipitation estimates is assessed using a separate year of GPM combined retrievals and compared to retrievals from passive microwave (PMW) sensors and HYDRO, the Vis–IR retrieval that is currently in operational use at the Brazilian Institute for Space Research. Using all available channels of the ABI, Hydronn achieves accuracy close to that of state-of-the-art PMW precipitation retrievals in both precipitation estimation and detection despite the lower information content of the Vis–IR observations.

Hourly, daily, and monthly precipitation accumulations are evaluated against gauge measurements for June and December 2020 and compared to HYDRO, the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) Cloud Classification System (CCS), and the Integrated Multi-satellitE Retrievals for GPM (IMERG). Compared to HYDRO, Hydronn reduces the mean absolute error (MAE) of hourly accumulations by 21 % (22 %) and the mean squared error (MSE) by 44 % (41 %) and increases the correlation by 138 % (312 %) for June (December) 2020. Compared to IMERG, the improvements correspond to 16 % (14 %), 12 % (12 %), and 20 % (56 %), respectively. Furthermore, we show that the probabilistic retrieval is well calibrated against gauge measurements when differences in the distributions of the training data and the gauge measurements are accounted for.

Hydronn has the potential to significantly improve near-real-time precipitation retrievals over Brazil. Furthermore, our results show that precipitation retrievals based on convolutional neural networks (CNNs) that leverage the full range of available observations from latest-generation geostationary satellites can provide instantaneous precipitation estimates with accuracy close to that of state-of-the-art PMW retrievals. The high temporal resolution of the geostationary observations allows Hydronn to provide more accurate precipitation accumulations than any of the tested conventional precipitation retrievals. Hydronn thus clearly shows the potential of deep-learning-based precipitation retrievals to improve precipitation estimates from currently available satellite imagery.

Received: 21 Mar 2022 – Discussion started: 06 Apr 2022 – Revised: 27 Sep 2022 – Accepted: 17 Oct 2022 – Published: 01 Dec 2022

1 Introduction

Timely and highly resolved measurements of precipitation constitute an important source of information for weather forecasting, disaster response and hydrological modeling. These measurements can be provided by dense radar and gauge networks, but their coverage is typically limited in less populated regions. However, even where these measurements are available, they are not necessarily without issues. The ability of rain gauge measurements to truthfully represent spatial precipitation statistics is limited by their extreme localization (Smith et al., 1996). Ground-based precipitation radars are affected by beam blocking as well as measurement errors caused by the varying altitude of the radar beam along its range (Holleman, 2007).

Since satellite observations provide continuous spatial coverage, they are well suited to complement the measurements from gauges and ground radars. Microwave observations generally provide the most direct spaceborne measurements of precipitation because of their sensitivity to emission and scattering from precipitating hydrometeors. Unfortunately, due to their comparably low spatial resolution, these sensors are currently only employed on platforms in low-Earth orbit. Since this limits the width of the satellite swath, a large constellation of sensors on different platforms is required to achieve low revisit times. This is the approach pursued by the Global Precipitation Measurement (GPM; Hou et al., 2014). Nonetheless, the mean revisit time for the passive microwave (PMW) sensors of the GPM constellation still exceeds 1 h in the tropics.

Visible and infrared (Vis–IR) observations from the latest generation of geostationary satellites (Schmit et al., 2005) provide spatial resolutions between 0.5 and 2 km at the sub-satellite point and a temporal resolution of up to 10 min for full-disk observations. The disadvantage of these observations for measuring precipitation is that they are mostly sensitive to the properties of the upper parts of clouds, which are only indirectly related to the precipitation near the surface. Their unrivaled spatial and temporal resolution makes them a valuable source of information for satellite-based precipitation estimates nonetheless.

The operational use of geostationary Vis–IR observations for precipitation retrievals dates back more than 40 years (Scofield and Oliver, 1977), and a large number of different algorithms have been developed over the years (Arkin and Meisner, 1987; Adler and Negri, 1988; Vicente et al., 1998; Sorooshian et al., 2000; Kuligowski, 2002; Scofield and Kuligowski, 2003; Hong et al., 2004; Kuligowski et al., 2016). Due to the aforementioned indirect relationship between observations and precipitation, nearly all of these methods are based on empirical relationships derived from satellite observations collocated with reference data derived from more direct measurement techniques such as ground-based radars. Moreover, operational retrievals often rely on corrections to improve the accuracy of their estimates. The Self-Calibrating Multivariate Precipitation Retrieval (SCaMPR; Kuligowski et al., 2016), for example, is dynamically calibrated using the latest available microwave precipitation estimates. Similarly, Karbalaee et al. (2017) developed a correction for the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) Cloud Classification System (CCS; Hong et al., 2004) based on retrievals from passive-microwave sensors. PERSIANN-CCS is superseded by the PERSIANN-PDIR (Nguyen et al., 2020) algorithm, which, in addition to refining the mathematical formulation of the regression scheme of PERSIANN-CCS, adds a regional correction scheme.

Another example is the HYDRO precipitation retrieval that is currently in operational use at the National Institute for Space Research (INPE) in Brazil, which is based on the Hydroestimator algorithm (Scofield and Kuligowski, 2003). It employs an empirical relationship between the 10.7 µm IR channel and precipitation rates with additional corrections. To adapt it for application over South America, yet another correction was derived by de Siqueira and Vila (2019), which improved the accuracy of precipitation accumulations but not that of instantaneous precipitation rates.

A common shortcoming of all retrieval algorithms discussed above is that they neglect retrieval uncertainties. The retrieval of precipitation rates from Vis–IR observations constitutes an inverse problem that is strongly underconstrained. This is true even for microwave-based retrievals and likely exacerbated by the less direct information content in the Vis–IR observations. The ill-posed character of the retrieval problem leads to significant retrieval uncertainties. Providing probabilistic estimates that quantify these uncertainties would help the characterization of precipitation estimates and thus increase their usefulness.

This study presents Hydronn, a novel near-real-time precipitation retrieval that uses Vis–IR observations from the Geostationary Operational Environmental Satellite 16 (GOES-16) Advanced Baseline Imager (ABI; Schmit et al., 2005) to retrieve precipitation over Brazil. It was designed with two aims: (1) to leverage the full potential of observations from the latest generation of geostationary sensors and (2) to develop a Bayesian precipitation retrieval algorithm that can provide well-calibrated uncertainty estimates.

Pfreundschuh et al. (2018) have shown that when a retrieval is cast as a probabilistic regression problem and solved using a neural network, the obtained results are equivalent to those obtained using traditional Bayesian retrieval methods, given that the a priori distribution matches the distribution of the data used to train the neural network. Neural-network-based probabilistic regression techniques thus provide a powerful and flexible way of combining recent advances in deep learning with the theoretically sound handling of retrieval uncertainties of Bayesian retrieval methods. Hydronn builds on this approach and uses a convolutional neural network (CNN) to predict a binned approximation of the probability density function (PDF) of the marginal posterior distribution of the observed precipitation at each output pixel.
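To illustrate what a binned approximation of the posterior PDF provides, the following minimal sketch derives a posterior mean and posterior quantiles from a piecewise-constant PDF at a single pixel. The function names, the bin edges, and the PDF values are hypothetical and chosen for illustration only; they are not taken from the Hydronn implementation.

```python
import numpy as np


def bin_probabilities(pdf, bin_edges):
    """Convert PDF values (per mm/h) into normalized per-bin probabilities."""
    pdf = np.asarray(pdf, dtype=float)
    widths = np.diff(np.asarray(bin_edges, dtype=float))
    probs = pdf * widths
    return probs / probs.sum()


def posterior_mean(pdf, bin_edges):
    """Expected value of a piecewise-constant PDF (mass placed at bin centers)."""
    edges = np.asarray(bin_edges, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return float(np.sum(bin_probabilities(pdf, edges) * centers))


def posterior_quantile(pdf, bin_edges, tau):
    """Quantile for probability tau, from linear interpolation of the binned CDF."""
    edges = np.asarray(bin_edges, dtype=float)
    cdf = np.concatenate([[0.0], np.cumsum(bin_probabilities(pdf, edges))])
    return float(np.interp(tau, cdf, edges))


# Hypothetical bins (mm/h) and PDF values for one output pixel.
bin_edges = [0.0, 1.0, 2.0, 4.0]
pdf = [0.5, 0.25, 0.125]

mean_rate = posterior_mean(pdf, bin_edges)
median_rate = posterior_quantile(pdf, bin_edges, 0.5)
upper_rate = posterior_quantile(pdf, bin_edges, 0.875)
```

Because the full distribution is available at every pixel, point estimates and uncertainty intervals can be derived from the same network output without retraining.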

The Hydronn retrieval is trained using more than 3 years of collocated observations from the ABI and combined radar–radiometer retrievals from the GPM core observatory (Grecu et al., 2016) over South America. The accuracy of Hydronn's instantaneous precipitation estimates is evaluated using a separate year of GPM combined retrievals and compared to HYDRO and the Goddard Profiling Algorithm (GPROF; Kummerow et al., 2015) applied to PMW observations from the GPM Microwave Imager (GMI). The accuracy of precipitation accumulations is evaluated using gauge measurements from June and December 2020. They are compared with HYDRO and two other commonly used precipitation products: PERSIANN-CCS, which is based on geostationary IR observations only, and the Integrated Multi-satellitE Retrievals for GPM (IMERG; Huffman et al., 2020), which combines observations from microwave sensors, geostationary sensors, and rain gauges.

2 Data

This section introduces the various datasets that are used to train and evaluate the Hydronn retrievals.

2.1 GPM CMB

The GPM dual-frequency precipitation radar (DPR) and GMI Combined Precipitation product (Olson, 2017) combines observations from the DPR and GMI on board the GPM core observatory (Grecu et al., 2016). Although the officially listed short name for the product is GPM_2BCMB (Olson, 2017), we will refer to it as GPM CMB since we consider it more readable. Because of the high sensitivity to precipitating hydrometeors of the active and passive microwave observations, the product provides the most accurate spaceborne precipitation estimates that are currently available. In this study the product is used as reference data to train the Hydronn retrievals and to assess their accuracy for instantaneous precipitation estimates.

2.2 Rain gauge data

The rain gauge measurements that are used in this study were compiled by the National Institute of Meteorology of Brazil and consist of hourly gauge measurements covering the time range May 2000 until May 2021. From these data, the measurements from June and December 2020 are used for the evaluation of Hydronn. Data from 2018 and 2019 are used to derive correction factors for the calibration of the hourly precipitation estimates produced by Hydronn, as will be described in Sect. 3.6.

Figure 1. Overview of the rain gauge data from June and December 2020 used to validate the retrievals. Panel (a) displays their spatial distribution by means of the number of gauges falling into each hexagon. Hexagon-free areas are not covered by any gauges. Panels (b) and (c) show the mean precipitation measured by all gauges falling into each hexagon.

From all available gauge stations only those with a data availability exceeding 90 % during June and December 2020 were selected. Their geographical distribution is displayed together with the mean precipitation in Fig. 1. The gauge density is fairly high on the southeastern coast of Brazil but decreases markedly towards the northwest.
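The station selection described above can be sketched as follows. The sketch assumes that the hourly gauge data are held in an array with one row per station and one column per hour, with missing reports encoded as NaN, and that availability is computed over the whole evaluation period rather than per month. The function name and the array layout are hypothetical.

```python
import numpy as np


def select_stations(hourly_precip, min_availability=0.9):
    """Return a boolean mask of stations whose data availability exceeds the threshold.

    hourly_precip: array of shape (n_stations, n_hours), NaN marks a missing report.
    """
    hourly_precip = np.asarray(hourly_precip, dtype=float)
    availability = np.mean(np.isfinite(hourly_precip), axis=1)
    # Strict comparison: the availability has to exceed 90 %.
    return availability > min_availability


# Three hypothetical stations with 10 hourly reports each.
data = np.zeros((3, 10))
data[1, 0] = np.nan    # 90 % available: not selected
data[2, :5] = np.nan   # 50 % available: not selected
mask = select_stations(data)
```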

The precipitation in June 2020 is limited to the south of the country, small parts of the east coast, and the Amazon basin, although the latter is only sparsely covered by the gauge observations. June is typically the beginning of the dry season in the central part of the country, which is clearly visible in the gauge measurements.

December 2020 saw high precipitation amounts on the southeastern coast of the country extending towards the northwest, which are associated with the South Atlantic convergence zone (SACZ; Satyamurty et al., 1998). Very low precipitation rates are observed in the northeast of the country, which is influenced by large-scale subsidence patterns (de Siqueira and Vila, 2019).

2.3 HYDRO

HYDRO is the currently operational near-real-time precipitation retrieval at the Center for Weather Forecast and Climate Studies/National Institute for Space Research (CPTEC/INPE). It is based on the Hydroestimator (Scofield and Kuligowski, 2003) and thus uses a combination of empirical power-law relationships between 10.7 µm IR brightness temperatures and surface precipitation with correction factors, taking into account model-derived moisture and wind parameters as well as cloud structure. The current version of the retrieval is described in de Siqueira and Vila (2019), which also introduces regional correction factors based on a climatology of surface precipitation rates derived from radar measurements of the Tropical Rainfall Measuring Mission (TRMM; Simpson et al., 1996) and GPM. For this study we use the corrected version of HYDRO proposed in de Siqueira and Vila (2019) with a regional correction for all of Brazil (referred to as HYDROBR in de Siqueira and Vila, 2019).

2.4 GPROF GMI

The Goddard Profiling Algorithm (GPROF; Kummerow et al., 2015) is used to retrieve surface precipitation from the PMW sensors of the GPM constellation. The algorithm is a Bayesian retrieval scheme that is based on a retrieval database principally built up of collocations of GMI observations and the GPM CMB retrievals. Because GMI is a dedicated precipitation sensor and because its retrieval is based on direct collocations with GPM CMB, the GPROF GMI retrieval is considered the most accurate among the GPROF retrievals for the sensors of the GPM constellation. Moreover, since GMI observations can always be collocated with the test data for the Hydronn retrieval, we use GPROF GMI as a baseline to assess the Hydronn retrievals against.

2.5 PERSIANN-CCS

PERSIANN-CCS (Hong et al., 2004) uses 10.7 µm IR observations from geostationary satellites to retrieve precipitation. Input images are first segmented using increasing temperature thresholds in order to identify pixels that correspond to convective activity. These pixels are subsequently assumed to be precipitating and classified using a neural-network-based algorithm. Quantitative precipitation estimates at pixel level are derived from this classification by applying a class-specific power-law relationship that relates the 10.7 µm brightness temperatures to precipitation.

The dataset that is used for the evaluation against Hydronn consists of hourly precipitation rates that are distributed in near-real time through the PERSIANN data portal (UCI CHRS Data Portal, 2022). Although the global CCS dataset is currently being replaced by the updated PERSIANN–Dynamic Infrared Rain Rate (PDIR-Now; Nguyen et al., 2020), we were not able to use it for this study due to parts of the evaluation period being missing from the online archive.

2.6 IMERG

IMERG (Huffman et al., 2020) combines retrievals from passive microwave and IR observations as well as rain gauge measurements to produce global, half-hourly measurements of precipitation. Due to its reliance on a wealth of measurement sources as well as the sophistication of the retrieval pipeline, the product can be considered one of the most robust satellite-based precipitation products that are currently available (Pradhan et al., 2022).

Three different configurations of IMERG products are available: IMERG-Early and IMERG-Late are based solely on satellite observations and available with latencies of 4 and 14 h, respectively. IMERG-Final is adjusted using global gauge measurements but available only after 3.5 months. Although Hydronn has been designed to target near-real-time applications and is thus more similar to IMERG-Early, we use IMERG-Final for our comparison as it constitutes the most elaborate precipitation estimates that are currently available and can thus be considered the state of the art of global quantitative precipitation estimates.

3 Method

This section describes the implementation of Hydronn, the proposed near-real-time precipitation retrieval algorithm for Brazil. It is based on a convolutional neural network (CNN), which is used to predict the a posteriori distribution of instantaneous precipitation. Following this, it is discussed how the probabilistic precipitation estimates can be combined to hourly accumulations, and an a priori adjustment is proposed to account for differences between the training data and the gauge measurements that are used to evaluate the retrieval.

3.1 Training data

The training data for the Hydronn retrieval are generated from collocations of input observations from the GOES-16 ABI and retrieved surface precipitation from GPM CMB over South America. Figure 2 shows the domain over which the training data were extracted (marked as “R1” in Fig. 2). It extends from −85 to −30° E in longitude and −40 to 10° N in latitude. The plot also shows extracted training scenes and corresponding GPM CMB precipitation estimates for 23 September 2019.

Figure 2. GOES-16 true-color composite from 23 September 2019 (generated using the natural_color composite in Satpy; Raspaud et al., 2021). The rectangle R1 marks the domain over South America, which was used for the extraction of training and testing collocations between GOES-16 ABI and GPM CMB. Dashed polygons show the boundaries of the training scenes extracted for this day together with the collocated GPM CMB results. The rectangle R2 marks the secondary domain, which is used as an additional test domain to assess the impact of the spatially limited training domain.

The GOES-16 ABI observations were extracted at their native resolutions. The surface precipitation from GPM CMB was mapped to the 2 km resolution of the ABI's IR channels using nearest-neighbor interpolation. Collocations were extracted for the time range 1 January 2018 until 1 January 2020 and 1 January 2021 until 1 September 2021.
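The nearest-neighbor mapping of the GPM CMB surface precipitation onto the ABI grid can be sketched as below. This is a brute-force illustration under simplifying assumptions: distances are computed in degrees of longitude and latitude rather than in the projected coordinates of the ABI grid, and target pixels farther than a hypothetical `max_dist` from any GPM CMB pixel are marked as missing. The function and parameter names are not taken from the Hydronn code. For realistic scene sizes a spatial index would replace the dense distance matrix.

```python
import numpy as np


def nearest_neighbor_regrid(src_lon, src_lat, src_val, tgt_lon, tgt_lat, max_dist=0.05):
    """Map source values onto target points by nearest-neighbor lookup.

    Target points without a source point within max_dist (degrees) are set to NaN.
    """
    src = np.column_stack([np.ravel(src_lon), np.ravel(src_lat)])
    tgt = np.column_stack([np.ravel(tgt_lon), np.ravel(tgt_lat)])
    # Squared distances between all target and source points: (n_tgt, n_src).
    d2 = ((tgt[:, None, :] - src[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)
    out = np.ravel(src_val)[nearest].astype(float)
    # Mask target pixels that lie outside the source swath.
    out[d2[np.arange(tgt.shape[0]), nearest] > max_dist ** 2] = np.nan
    return out.reshape(np.shape(tgt_lon))


# Two hypothetical GPM CMB pixels and three target pixels along the equator.
src_lon = np.array([0.0, 0.1])
src_lat = np.array([0.0, 0.0])
src_val = np.array([1.0, 2.0])
tgt_lon = np.array([0.01, 0.09, 1.0])
tgt_lat = np.zeros(3)
mapped = nearest_neighbor_regrid(src_lon, src_lat, src_val, tgt_lon, tgt_lat)
```

Nearest-neighbor mapping leaves the reference values themselves unchanged, so the distribution of precipitation rates in the training data is not smoothed by the regridding.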

Collocations from 1 January 2020 until 1 January 2021 were extracted and set aside as test data for assessing the accuracy of the instantaneous precipitation estimates of Hydronn. In addition to this, collocations over an additional region (marked “R2” in Fig. 2) were extracted on days 1, 6, 11, 16, 21, and 26 of each month of the year 2020. These additional test data will be used to investigate the impact of the regional training database on the retrieval accuracy.
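The temporal split between training and test collocations and the sampling of test days for region R2 can be summarized in a few lines. The sketch assumes half-open time intervals, so that the end date of each period is excluded, which the text does not state explicitly. All names are hypothetical.

```python
from datetime import date, datetime

# Periods as given in the text; the end dates are treated as exclusive (assumption).
TRAIN_PERIODS = [
    (datetime(2018, 1, 1), datetime(2020, 1, 1)),
    (datetime(2021, 1, 1), datetime(2021, 9, 1)),
]
TEST_PERIOD = (datetime(2020, 1, 1), datetime(2021, 1, 1))
R2_TEST_DAYS = (1, 6, 11, 16, 21, 26)


def assign_split(time):
    """Return 'train', 'test', or None for a collocation time over region R1."""
    if TEST_PERIOD[0] <= time < TEST_PERIOD[1]:
        return "test"
    if any(start <= time < end for start, end in TRAIN_PERIODS):
        return "train"
    return None


def r2_test_dates(year=2020):
    """Days of the year on which additional test collocations over R2 are extracted."""
    return [date(year, month, day) for month in range(1, 13) for day in R2_TEST_DAYS]


splits = [assign_split(datetime(2019, 6, 1)), assign_split(datetime(2020, 6, 1))]
n_r2_days = len(r2_test_dates())
```

Splitting by calendar period rather than by random sampling keeps scenes from the same weather system out of both sets, which is presumably why a full separate year is held out.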

Figure 3. Distribution of reference precipitation rates. Panel (a) shows the seasonal PDFs of precipitation rates in the training data. Panel (b) shows the PDFs of precipitation rates measured by the gauges over the time period covered by the training data. Grey lines in the background trace the PDFs of the precipitation rates in the training data shown in panel (a).
