Leveraging machine learning for quantitative precipitation estimation from Fengyun-4 geostationary observations and ground meteorological measurements

Li, Xinyan; Yang, Yuanjian; Mi, Jiaqin; Bi, Xueyan; Zhao, You; Huang, Zehao; Liu, Chao; Zong, Lian; Li, Wanju

doi:10.5194/amt-14-7007-2021

Articles | Volume 14, issue 11

https://doi.org/10.5194/amt-14-7007-2021

© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 05 Nov 2021


Download

Final revised paper (published on 05 Nov 2021)
Supplement to the final revised paper
Preprint (discussion started on 20 Jul 2021)
Supplement to the preprint

Interactive discussion

Status: closed

RC1:
'Reviewer's comment on amt-2021-175', Anonymous Referee #2, 10 Aug 2021

The paper “Leveraging machine learning for quantitative precipitation estimation from Fengyun-4 geostationary observations and ground meteorological measurements”, by Li and co-workers, presents a preliminary application of a machine learning technique to retrieve hourly precipitation rates from geostationary VIS-IR data. A Random Forest classifier is applied to multispectral data from AGRI on board the Chinese FY-4 satellite for three 2-day storms that occurred in southern China; calibration and validation of the estimates are performed against hourly automatic weather station data.

The paper is interesting since there is very little published work on FY-4 data. However, I think the present manuscript needs substantial revision before it can be published in AMT. Below are my suggestions to improve the quality of the manuscript.

Introduction

Lines 51 and following: any introduction on multisensor precipitation estimation should mention the international programmes that provide high-quality, high-resolution precipitation products at global or continental scale, such as NASA-GPM or H-SAF. Please complete.

Line 85 and many other parts of the paper mention high-density stations without any quantitative indication of how the density is measured and how “high density” is defined. Please give more quantitative details on the station distribution.

Line 96. In Figure 1 please write the meaning of red areas (NPP_NTL?).

Lines 106 and 108 mention levels: “met the levels for large-scale heavy precipitation” and “met the heavy rain level”. How are these levels defined?

Data

Line 128: very likely, 4 km is the nominal resolution at nadir.

Lines 145-154. ERA5 fields come with significant latency (5 days for the “preliminary daily updates” and 3 months for the “quality-assured data”). Are these latencies compatible with the Authors’ aim to “monitor flood” (line 81)? Moreover, how could “the real-time monitoring and prediction of summer precipitation over East Asia” (lines 383-384) be possible? Please discuss the temporal applicability of the proposed technique.

Lines 155-171. Please improve this description. First, clearly separate the different steps of the algorithm (e.g. with bullet points); then use a different font for the variables defined in the text (mtray, ntray...).

Line 184 and elsewhere. Please, do not use the word “prediction” here and in the whole document to refer to the output of your algorithm, use “estimate”, instead.

Lines 195-200. This is the main weakness of the paper: POD and FAR cannot be used separately to assess the quality of an estimate. Besides an error in the sentence (“optimal value of FAR is 1, and the worst value is 0”; actually, the opposite is true), to measure the capability of the technique to correctly classify wet/dry pixels you need either to comment on the POD and FAR values together (and avoid sentences such as the one on line 17), or to compute synthetic indicators such as the Equitable Threat Score (ETS), the Hanssen and Kuipers discriminant (HK) or the Heidke Skill Score, and repeat the analysis using the values of these indicators as reference.
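To illustrate why POD and FAR are better read together with a synthetic indicator, the following minimal sketch computes all five scores from a 2x2 wet/dry contingency table using their standard textbook definitions. It is not the Authors' code: the function names, the sample layout and the 0.1 mm/h wet/dry threshold are assumptions made only for illustration.

```python
import math


def contingency_table(estimated, observed, threshold=0.1):
    """Count hits, false alarms, misses and correct negatives.

    A sample is 'wet' when its hourly rain rate is >= threshold (mm/h).
    The 0.1 mm/h default is an illustrative assumption, not a value
    taken from the manuscript.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for est, obs in zip(estimated, observed):
        est_wet, obs_wet = est >= threshold, obs >= threshold
        if est_wet and obs_wet:
            hits += 1
        elif est_wet:
            false_alarms += 1
        elif obs_wet:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def _ratio(num, den):
    # Return NaN instead of raising when a score is undefined.
    return num / den if den else math.nan


def skill_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical scores for a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    # Number of hits expected by chance, used by the ETS.
    hits_random = _ratio((a + b) * (a + c), n)
    return {
        "POD": _ratio(a, a + c),                    # best 1, worst 0
        "FAR": _ratio(b, a + b),                    # best 0, worst 1
        "ETS": _ratio(a - hits_random, a + b + c - hits_random),
        "HK": _ratio(a, a + c) - _ratio(b, b + d),  # POD minus POFD
        "HSS": _ratio(2 * (a * d - b * c),
                      (a + c) * (c + d) + (a + b) * (b + d)),
    }
```

For a hypothetical table with 50 hits, 20 false alarms, 10 misses and 120 correct negatives, `skill_scores(50, 20, 10, 120)` gives POD ≈ 0.83 and FAR ≈ 0.29, while ETS ≈ 0.49, HK ≈ 0.69 and HSS ≈ 0.66 each summarise both kinds of error in a single number.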

Results

Line 205, Figure 4: please use a reasonable number of digits in the numbers reported on the panels. Moreover, POD and FAR should not exceed 1.

Lines 214-215. This sentence is speculation not supported by evidence; please justify it better or remove it.

Lines 218-221. To better illustrate this issue, please use the indicators I suggested a few lines above (ETS, HK…).

Line 223. Again, an unsupported sentence; please give evidence or remove it.

Lines 224-230. This paragraph is not clear: if the Authors have the feeling that the dataset is not large enough to carry out a proper training/testing procedure (that was also my feeling at the beginning), why not add some more cases?

Lines 245-246. I do not see how this statement follows. Numerical indicators (POD, FAR, R, ETS….) tell us much more from the quantitative point of view than simple visual comparisons of rain maps. Please use numbers if you want to make quantitative assessments.

Lines 263-264. Absolute and relative errors are not defined in the text.
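For reference, one common way to define these two quantities is sketched below. The manuscript may well use a different convention (for example signed errors, or relative error in percent), so the function names and the choice of unsigned errors here are assumptions, not the Authors' definitions.

```python
def absolute_error(estimate, observation):
    """Unsigned difference between estimated and observed rain (mm)."""
    return abs(estimate - observation)


def relative_error(estimate, observation):
    """Absolute error divided by the observed amount.

    Returns None when the observation is zero, because the ratio is
    undefined for dry samples.
    """
    if observation == 0:
        return None
    return abs(estimate - observation) / abs(observation)
```

With these definitions, an estimate of 12 mm against an observation of 10 mm has an absolute error of 2 mm and a relative error of 0.2 (20 %).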

Lines 291-293. This sentence does not tell anything about the technique accuracy, since POD alone is considered.

In general, the discussion and conclusions have to be rewritten once the new indicators I suggested have been implemented.

Citation: https://doi.org/10.5194/amt-2021-175-RC1
- AC1: 'Reply on RC1', Yuanjian Yang, 06 Sep 2021
  
  Dear Reviewer,
  Thank you for your efforts in reviewing our manuscript. We appreciate receiving these useful comments. They are very constructive, and we have revised our manuscript carefully, following all of the referees’ comments and suggestions. Please find our point-by-point responses in the supplement.
  
  Citation: https://doi.org/10.5194/amt-2021-175-AC1
RC2:
'Comment on amt-2021-175', Anonymous Referee #1, 11 Aug 2021
The title could be revised as: “Leveraging machine learning for quantitative precipitation estimation from Fengyun-4 geostationary observations and ground meteorological measurements”

“Large-scale and high-quality precipitation products derived from satellite remote sensing spectral data have always been a challenging issue in satellite quantitative precipitation estimation (QPE). Moreover, QPE research related to China’s Fengyun-4A (FY-4A) geostationary satellite is still very limited.” could be revised as: “Deriving large-scale and high-quality precipitation products from satellite remote sensing spectral data is always challenging in quantitative precipitation estimation (QPE), and limited studies have been conducted even using China’s latest Fengyun-4A (FY-4A) geostationary satellite.”

Line 156: “We constructed an RF model through the RF data package in R language, and established a relationship model of the satellite spectrum, cloud parameters and precipitation for the inversion and prediction of precipitation” could be changed to “A data-driven regression model was established between the observed precipitation and satellite spectrum as well as cloud parameters using the RF method.”

Figure 1: legend notations should be corrected, e.g., all_station should be automatic station?

Figure 4: the results indicate a significant over-fitting issue in these two prediction models. What are the possible reasons? Also, high precipitation was underestimated; is there any way to address this?

Figure 9: large biases were observed for stations located in mountainous areas. Including a DEM as a predictor might account for such biases.
Citation: https://doi.org/10.5194/amt-2021-175-RC2
- AC2: 'Reply on RC2', Yuanjian Yang, 06 Sep 2021
  
  Dear Reviewer,
  Thank you for your efforts in reviewing our manuscript. We appreciate receiving these useful comments. They are very constructive, and we have revised our manuscript carefully, following all of the referees’ comments and suggestions. Please find our point-by-point responses in the supplement.
  
  Citation: https://doi.org/10.5194/amt-2021-175-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Yuanjian Yang on behalf of the Authors (06 Sep 2021)

ED: Referee Nomination & Report Request started (08 Sep 2021) by Simone Lolli

RR by Anonymous Referee #1 (09 Sep 2021)

ED: Publish as is (01 Oct 2021) by Simone Lolli

AR by Yuanjian Yang on behalf of the Authors (09 Oct 2021)

Short summary

A random forest (RF) model framework for Fengyun-4A (FY-4A) daytime and nighttime quantitative precipitation estimation (QPE) is established using FY-4A multi-band spectral information, cloud parameters, high-density precipitation observations and physical quantities from reanalysis data. The RF model of FY-4A QPE is highly accurate in estimating precipitation at the heavy-rain level or below, which is advantageous for the future quantitative estimation of summer precipitation over East Asia.