Validation of satellite OPEMW precipitation product with ground-based weather radar and rain gauge networks

Introduction
The accurate estimation of rainfall is crucial for many applications, including its short-term assessment and long-term monitoring. A summary of recent activities, ongoing research, and future plans about precipitation measurement and modeling is given by Michaelides et al. (2009) and Tapiador et al. (2012), while comprehensive overviews of precipitation remote sensing from satellite and its applications are given by Levizzani et al. (2007), Kidd and Levizzani (2011), and Kucera et al. (2013). Nowadays, operational precipitation products are routinely delivered through large programs, such as the US National Oceanic and Atmospheric Administration (NOAA) operational hydrological products (Ferraro et al., 2005) or the Satellite Applications Facility on Support to Operational Hydrology and Water Management (H-SAF; Mugnai et al., 2013).

Published by Copernicus Publications on behalf of the European Geosciences Union.

However, rainfall estimation algorithms, validation strategies, and assimilation into numerical weather prediction and hydrological high-resolution models are topics still under investigation, especially over land (e.g., Anagnostou, 2004). Moreover, the likely increase of extreme events due to climate-related forcing brings even more importance to rainfall retrieval as a means for monitoring environmental hazards (e.g., Nunes and Roads, 2007). Furthermore, the assessment of precipitation detection and quantitative estimation from space remains a major issue (e.g., Ebert et al., 2007). Precipitation estimation methods from passive sensors use observations at visible (VIS), infrared (IR), and microwave (MW) frequencies. Among them, the microwave techniques provide the most direct observation of precipitation, as microwave radiation is less affected by cloud droplets and interacts with precipitation-sized hydrometeors. Regarding platforms, geosynchronous (GEO) and low Earth orbit (LEO) satellites offer complementary features in terms of revisit time and spatial resolution. While LEO satellites offer infrequent revisits (only twice a day for a given place at midlatitudes) but high spatial resolution (0.25-1 km for VIS-IR, 10-50 km for MW), GEO satellites ensure frequent revisits (on the order of 15 min) at moderate spatial resolution (1-4 km for VIS-IR, > 100 km for MW). Given this limitation, so far MW sensors have been deployed on LEO satellites only, although GEO MW missions have been proposed (Savage et al., 1995; Tanner et al., 2007). In the last decades, several microwave radiometers aboard LEO satellites have been exploited for rainfall remote sensing (e.g., Special Sensor Microwave Imager, Advanced Microwave Sounding Unit, Tropical Rainfall Measuring Mission Microwave Imager, Advanced Microwave Scanning Radiometer) and numerous retrieval algorithms have been developed (e.g., Spencer et al., 1989; Wilheit et al., 1991; Ferraro and Marks, 1995; Staelin and Chen, 2000; Kummerow et al., 2001; Bennartz et al., 2002; Bauer et al., 2005; Boukabara et al., 2007, 2011; Laviola and Levizzani, 2009). Two processes are used to identify precipitation from MW observations: emission from rain droplets (leading to MW radiation enhancement) and scattering caused by precipitating ice aloft (leading to MW radiation depression). Emission is usually exploited over a cold background (such as ocean, e.g., Wilheit et al., 1991), while scattering must be used over land, where the surface has higher and more variable emissivity (Spencer et al., 1989; Ferraro and Marks, 1995). However, high-frequency MW observations are less sensitive to the surface background and thus can be exploited to retrieve precipitation over land, ocean, and even problematic backgrounds, such as coastlines (Staelin and Chen, 2000; Laviola and Levizzani, 2009).
At the Institute of Methodologies for the Environmental Analysis of the National Research Council of Italy (IMAA-CNR) a number of approaches have been proposed to retrieve cloud and rainfall information from satellite observations (Romano et al., 2007; Ricciardelli et al., 2008, 2010; Di Tomaso et al., 2009). In particular, the Precipitation Estimation at Microwave Frequencies (PEMW) algorithm was developed to infer rain rates from satellite passive microwave observations in the 89 to 190 GHz range (Di Tomaso et al., 2009). The PEMW algorithm relies on satellite observations made by the Advanced Microwave Sounding Unit-B (AMSU-B) or the Microwave Humidity Sounder (MHS) onboard the NOAA satellites and/or the European Polar Satellite MetOp-A. The PEMW performances were tested (Di Tomaso et al., 2009) at relatively high latitudes against the UK NIMROD radar network, and at tropical low latitudes against rain gauges. The rain gauges, belonging to the US Atmospheric Radiation Measurement (ARM) program, are deployed on the island of Nauru in the tropical western Pacific. A total of 6 case studies were used for the validation of the PEMW algorithm against the NIMROD radar observations, while fewer than 30 satellite overpasses were used for the validation against the one-point rain gauge measurements in Nauru. Since then, the operational version of PEMW (OPEMW) has been running at IMAA-CNR in support of numerical hydrometeorology and flood-hazard-alert systems. However, OPEMW has never been validated over the geographical area where its estimates are actually utilized, i.e., the Italian territory. Thus, similarly to what was proposed by Antonelli et al. (2010), this analysis carries out a detailed validation of the OPEMW product by comparing the satellite estimates against the weather radar and rain gauge networks deployed over the Italian territory.
This paper is organized as follows: Sect. 2 describes the data set under consideration, Sect. 3 summarizes the methodology used for data comparison, Sect. 4 reports the results of the data analysis, and Sect. 5 summarizes the quantitative results and draws the final conclusions.

OPEMW
OPEMW is the software package developed at IMAA-CNR that implements and runs the PEMW algorithm operationally. OPEMW has been running at IMAA-CNR since 2010 in support of numerical hydrometeorology and flood-hazard-alert systems. The details of PEMW are described in Di Tomaso et al. (2009). Here we only note that PEMW is a precipitation estimation algorithm exploiting passive MW observations in the frequency range 89 to 190 GHz together with prior knowledge of the nature of precipitation events. Radiative transfer simulations in a variety of different atmospheric scenarios are used to fit an extensive set of regression curves between rain rate and linear combinations of the radiances to be observed. Various atmospheric and surface conditions (land and water) are considered, including extreme events, in order to derive the domain for the regression coefficients. The procedure automatically selects the most appropriate scenario and hence the most suitable coefficients fitting a model between the observations and rain rate. This selection is based on the assumption that, given the real scenario, the regression curves all retrieve similar rain rates. Thus, PEMW is a specific version of a more general approach that optimizes the distance of the parameters of the regression curves with respect to the estimated rain rate. The advantages of PEMW include a good spatial and temporal resolution, little influence from the background surface, high sensitivity to light rain, and a rain rate estimate that is consistent throughout the channels. Potential weaknesses relate to the cross-scan observations and to the instrumental absolute calibration. PEMW was developed to work with observations from AMSU-B, flying aboard the NOAA Polar Operational Environmental Satellites (POES). PEMW was later adapted to work with observations from MHS, flying aboard the most recent NOAA POES satellites and the European Polar System (EPS) MetOp-A belonging to the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). POES and EPS are respectively the US and European contributions to the Initial Joint Polar-Orbiting Operational Satellite System (IJPS). Both AMSU-B and MHS are cross-track, line-scanning microwave radiometers measuring radiances in five channels in the frequency range from 89 to 190 GHz. AMSU-B and MHS exploit similar channels: the center frequencies of the AMSU-B channels are 89, 150, 183 ± 1, 183 ± 3, and 183 ± 7 GHz, while those of the MHS channels are 89, 157, 183 ± 1, 183 ± 3, and 190 GHz. Therefore, the two instruments differ only at channel numbers 2 (150 vs. 157 GHz) and 5 (183 ± 7 vs. 190 GHz); however, the differences are more technical than functional, and the two channel duplets show the same fundamental features. AMSU-B and MHS fly at a nominal altitude of 850 km and they observe the Earth scanning circa ±50° across nadir, taking 90 consecutive fields of view (FOVs) per scan. The five channels are co-registered with 1.1° antenna beam width. At nadir, the footprint corresponds to a circle of diameter approximately 16 km. Due to the cross-track observation geometry, away from nadir the FOVs have an ellipsoidal shape. The FOV axes range from 16 km × 16 km at nadir to 51 km × 25 km at maximum scanning angle (Bennartz, 2000). The first axis refers to the cross-track and the second to the along-track direction. Further details on the instrument features can be found in the NOAA KLM user's guide (NCDC/NOAA, 2008) and the ATOVS Level 1b Product Guide (EUMETSAT, 2010). At the time of writing there are five operational NOAA POES satellites (namely N-15, N-16, N-17, N-18, and N-19) spaced approximately 2-6 h apart and carrying either the AMSU-B (N-15, N-16, and N-17) or the MHS (N-18 and N-19) instruments; in addition, there are two operational EPS satellites, MetOp-A and MetOp-B, both carrying the MHS instrument.
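The growth of the footprint with scan angle can be illustrated with a minimal geometric sketch. It assumes a spherical Earth of radius 6371 km and a maximum scan angle of 48.95° (both assumptions, not values given above), and the function name `footprint_km` is purely illustrative. This simplified geometry reproduces the ∼ 16 km nadir footprint and gives edge-of-scan axes close to, but not identical with, the values quoted from Bennartz (2000).

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth
ALTITUDE_KM = 850.0       # nominal altitude quoted in the text
BEAM_WIDTH_DEG = 1.1      # antenna beam width quoted in the text


def footprint_km(scan_angle_deg):
    """Return (cross_track_km, along_track_km) of the FOV at a scan angle."""
    theta = math.radians(abs(scan_angle_deg))
    beam = math.radians(BEAM_WIDTH_DEG)
    if theta < 1e-12:
        # Nadir view: slant range equals the altitude, footprint is circular.
        slant, zenith = ALTITUDE_KM, 0.0
    else:
        ratio = (EARTH_RADIUS_KM + ALTITUDE_KM) / EARTH_RADIUS_KM
        sin_zenith = ratio * math.sin(theta)
        if sin_zenith >= 1.0:
            raise ValueError("line of sight does not intersect the Earth")
        zenith = math.asin(sin_zenith)   # local zenith angle at the surface
        central = zenith - theta         # Earth central angle
        slant = EARTH_RADIUS_KM * math.sin(central) / math.sin(theta)
    along = slant * beam                 # small-angle approximation
    cross = along / math.cos(zenith)     # stretched by oblique incidence
    return cross, along


if __name__ == "__main__":
    for angle in (0.0, 25.0, 48.95):     # 48.95 deg is an assumed scan edge
        cross, along = footprint_km(angle)
        print(f"{angle:5.2f} deg: {cross:5.1f} km x {along:5.1f} km")
```
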
The AMSU-B and MHS raw data are received in near-real time at IMAA-CNR and processed with the AAPP code (UK Met Office, 2011). The level 1c data are then processed by OPEMW, and the rain rate product is sent to CETEMPS (Centre of Excellence for the integration of remote sensing and modeling for the prediction of severe weather) and also stored in the IMAA-CNR archive. The present version (V4) of OPEMW has been running since May 2011. The data set considered here covers one full year, from July 2011 to June 2012. For this period the three AMSU-B instruments were unavailable due to instrumental failures, and MetOp-B was still in its pre-launch and commissioning phases; thus the following analysis focuses on MHS observations from N-18, N-19, and MetOp-A. An example of the operational surface rain intensity (sri, mm h⁻¹) map produced by OPEMW is shown in Fig. 1. Note that the horizontal resolution of the OPEMW sri product is the same as for MHS (i.e., varying with scan angle). However, the operational graphical output is represented as uniform circles to clearly display the scan line direction.

Weather radar network
Microwave weather radars are considered a fairly established technique for retrieving rain rate fields over large areas from measured reflectivity volumes. In the framework of the national early-warning system for multi-risk management, the Italian Department of Civil Protection (DPC) was appointed to complement the existing weather radar systems in order to increase the coverage of the Italian territory. The resulting Italian national weather radar network is coordinated by the DPC, in collaboration with regional authorities, research centers, the Air Traffic Control service (ENAV), and the Meteorological Service of the Air Force (CNMCA). Once completed, the Italian radar network will include 25 C-band (∼ 5 GHz) radars (including 7 polarimetric systems) and 5 dual-polarized X-band (∼ 10 GHz) radars, deployed throughout the country (Vulpiani et al., 2008a). Currently, the radar network is composed of 20 weather radars: 10 C-band radars belonging to regional authorities (5 of which are polarimetric), 2 C-band radars owned by ENAV, and 6 C-band radars (2 of which are polarimetric) plus 2 X-band polarimetric radars owned directly by the DPC. The DPC collects radar data in near-real time by satellite links to the two national radar primary centers (RPC), one located at the DPC headquarters in Rome and the other at the Centro Internazionale in Monitoraggio Ambientale (CIMA) Research Foundation in Savona. Procedures for mitigating ground clutter, anomalous propagation, and beam blockage effects are applied (Vulpiani et al., 2008b). The radar volumes are then centrally processed to produce the so-called radar network composite (RNC) of products such as the vertical maximum intensity (VMI, dBZ), the constant altitude plan position indicator (CAPPI, dBZ), the surface rain intensity (sri, mm h⁻¹), and the 1 h accumulated surface rain total (srt, mm). The sri product is computed by applying the reflectivity-rainfall (Z-R) relationship proposed by Marshall and Palmer (1948) to the lowest beam map (LBM) product. The latter is the near-ground reflectivity map obtained from the corrected radar volume data using the lowest height reflectivity value in each vertical column. All products are obtained over a grid of 1400 × 1400 km² with spatial resolution of ∼ 1 km and temporal resolution of 15 min; the gridded products are then distributed to the hydrological and meteorological regional services. The RNC sri product, used here, represents the best radar estimate available for the period under analysis. It must be underlined that this product was not adjusted to match rain gauge products, even though statistical techniques are being tested (Marzano et al., 2012). Procedures to improve the quality of the RNC sri product, including attenuation compensation, polarimetric rainfall inversion techniques, and adaptive algorithms to retrieve mean vertical profiles of reflectivity (VPR), are currently under validation at the DPC (Vulpiani et al., 2008a, 2012). An example of the RNC sri product is shown in Fig. 2 for the same time interval as Fig. 1.
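As an illustration of the Z-R conversion mentioned above, the following minimal sketch inverts a power law Z = a R^b. The coefficients a = 200 and b = 1.6 are the values commonly attributed to Marshall and Palmer (1948) and are assumed here; the coefficients actually used in the RNC processing are not stated in this section. The function name is hypothetical.

```python
def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    """Convert reflectivity (dBZ) to rain rate (mm/h) via Z = a * R**b.

    a and b are assumed Marshall-Palmer-type coefficients, not values
    confirmed for the operational radar composite.
    """
    z_linear = 10.0 ** (dbz / 10.0)      # reflectivity factor in mm^6 m^-3
    return (z_linear / a) ** (1.0 / b)


if __name__ == "__main__":
    for dbz in (10.0, 23.0, 30.0, 40.0, 50.0):
        print(f"{dbz:4.1f} dBZ -> {dbz_to_rain_rate(dbz):7.2f} mm/h")
```

With these assumed coefficients, a reflectivity of about 23 dBZ corresponds to roughly 1 mm h⁻¹.
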

Rain gauge network
Several rain gauge networks, belonging to independent regional and national authorities, have been deployed over the Italian territory through the years. DPC was recently appointed to manage the existing rain gauge networks in collaboration with other regional and national authorities. This integrated network is one of the densest in the world, with more than 3000 rain gauges (Vulpiani et al., 2012). Figure 3 shows the distribution of the rain gauges over Italy. The average distance between neighboring rain gauges is less than 10 km. These are tipping-bucket rain gauges with a minimum detectable rain rate of 0.2 mm h⁻¹. A reduced set (∼ 300) is equipped with heating to prevent snow and ice clogging and to measure the water equivalent of frozen precipitation. Rain gauge data acquisition and processing are performed by regional authorities at different temporal intervals, ranging from 5 to 30 min. Typically 65-85 % of the total number of rain gauges are available at the same time. The rain gauge network (RGN) data are collected and centrally processed in near-real time by the DPC. The set of 1 h accumulated rain for each rain gauge of the network represents the RGN surface rain intensity (sri, mm h⁻¹) product. The DPC distributes the RGN sri product through DEWETRA, a Web-based software developed by CIMA Research Foundation on behalf of the DPC.

Fig. 2. An example of the graphical output of RNC (courtesy of DPC). The surface rain intensity (sri) product is color-coded according to the vertical bar (in mm h⁻¹) and layered over the Meteosat Second Generation (MSG) 10.8 µm image (in normalized inverted greyscale). Data obtained from RNC at 01:15 UTC and MSG observations at 01:00 UTC on 5 July 2011 (i.e., within 7 min from data in Fig. 1). Note that here the time displayed at the bottom is in CEST (Central Europe Summer Time).

Figure 3 also shows the RGN sri (color-coded according to the accumulated rain thresholds referring to the last 24 h) for the same 1 h period containing the satellite overpass in Fig. 1 and the radar composite in Fig. 2. The point measurements of rain gauges are upscaled into the satellite FOVs, as will be discussed in Sect. 3.1.

The figure shows 1 h accumulated rain (sri) between 01:00 and 02:00 UTC on 5 July 2011 (i.e., the 1 h period containing both Figs. 1 and 2). Rain gauges are indicated with circles and color-coded as follows: grey → missing data; white → no rain; green → 0 < sri < T1; yellow → T1 < sri < T2; orange → T2 < sri < T3; red → T3 < sri; where Ti (with i = 1-3) represent three thresholds for light, moderate, and intense rainfall, whose values differ depending on the respective catchment (data courtesy of DPC, image obtained using DEWETRA).

Methodology
The OPEMW sri product is validated against the sri products from the radar network composite (RNC) and the rain gauge network (RGN). The data set considered here covers one full year (July 2011-June 2012). Data from the three sources were processed to (i) check data quality, (ii) find space-time colocation, and finally (iii) compute statistical scores.

Space-time colocation
The OPEMW surface rain intensity product is colocated with the ground-based products so that each satellite FOV is associated with the corresponding surface rain intensity values derived from RNC and RGN. The temporal colocation is obtained as follows:
- Each OPEMW sri product is associated with the time of the satellite overpass, since it is a nearly instantaneous observation.
- Then, the procedure searches for an RNC sri product within 8 min before/after the satellite overpass, which is usually found since the RNC sri product is available every 15 min.
- Finally, the procedure searches for the RGN sri product that corresponds to the 1 h time period in which the satellite overpass has occurred.
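The temporal matching steps listed above can be sketched as follows. This is a minimal illustration, not the operational code: the function names are hypothetical, the overpass time in the usage example is invented, and the 1 h gauge window is assumed to be aligned with the clock hour (as suggested by the 01:00-02:00 UTC example of Fig. 3).

```python
from datetime import datetime, timedelta


def nearest_radar_time(overpass, radar_times, max_gap=timedelta(minutes=8)):
    """Return the radar product time closest to the overpass, or None
    if no product lies within max_gap before/after the overpass."""
    candidates = [t for t in radar_times if abs(t - overpass) <= max_gap]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(t - overpass))


def gauge_hour_window(overpass):
    """Return (start, end) of the 1 h accumulation period containing
    the overpass, assuming periods aligned with the clock hour."""
    start = overpass.replace(minute=0, second=0, microsecond=0)
    return start, start + timedelta(hours=1)


if __name__ == "__main__":
    overpass = datetime(2011, 7, 5, 1, 8)  # hypothetical overpass time
    radar = [datetime(2011, 7, 5, 1, 0), datetime(2011, 7, 5, 1, 15)]
    print(nearest_radar_time(overpass, radar))
    print(gauge_hour_window(overpass))
```
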
Units for OPEMW, RNC, and RGN sri products are mm h⁻¹ and thus are comparable. However, OPEMW and RNC sri products correspond to nearly instantaneous observations. In contrast, the RGN sri product is computed as hourly accumulated rainfall. The comparison between these products inherently includes the uncertainty related to rainfall variability within an hour. The spatial colocation is obtained by convolving either the RNC or RGN sri products to the satellite FOVs, taking into account the antenna pattern, assumed to be Gaussian, and the ellipsoidal shape at different viewing angles (Bennartz, 2000). For each FOV, the convolution usually takes 200-1000 RNC pixels and 5-180 rain gauges. Thus, high surface rainfall intensity values (e.g., sri > 50 mm h⁻¹) that are detected locally at a few RNC pixels or RGN sites are typically smoothed by the convolution with the surrounding lower values falling within the same FOVs. Therefore, sri at the scale of the satellite FOV seldom exceeds 20 mm h⁻¹.
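The smoothing effect of the spatial colocation can be illustrated with a minimal sketch of a Gaussian-weighted average over an elliptical footprint. It assumes that the FOV axes are the full widths at half maximum of the antenna pattern and that sample offsets are already expressed in km along the cross-track and along-track directions; the function name and the test field are hypothetical.

```python
import math


def gaussian_fov_average(samples, cross_km, along_km):
    """Gaussian-weighted mean of ground sri samples inside one FOV.

    samples: iterable of (dx_km, dy_km, sri), where dx/dy are offsets
    from the FOV centre in the cross-track/along-track directions.
    cross_km, along_km: FOV axes, assumed to be half-power full widths.
    """
    k = 4.0 * math.log(2.0)  # weight drops to 0.5 at half the axis length
    weight_sum = 0.0
    weighted_sri = 0.0
    for dx, dy, sri in samples:
        w = math.exp(-k * ((dx / cross_km) ** 2 + (dy / along_km) ** 2))
        weight_sum += w
        weighted_sri += w * sri
    return weighted_sri / weight_sum if weight_sum > 0.0 else float("nan")


if __name__ == "__main__":
    # Hypothetical 1 km grid with a single 60 mm/h pixel at the FOV centre.
    field = [(x, y, 60.0 if (x, y) == (0, 0) else 0.0)
             for x in range(-8, 9) for y in range(-8, 9)]
    print(gaussian_fov_average(field, 16.0, 16.0))
```

In this toy example a single intense pixel contributes only a small fraction of its value to the FOV-scale estimate, which is consistent with the smoothing described above.
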

Data quality
The single FOV estimate is in general prone to geolocation and colocation errors. In fact, satellite observations suffer from considerable geolocation errors. Errors of up to two pixels can occur in both the along- and the cross-track direction, leading to considerable geographical misplacement. Moreover, additional geolocation uncertainty stems from the parallax error. This error increases proportionally with increasing observation angle and altitude of the cloud melting layer, and it can contribute to the misplacement of raining areas by up to 10 km (Antonelli et al., 2010). Additional uncertainty is related to the spatial heterogeneity; the so-called beam-filling problem (Kummerow, 1998) refers to the fact that the observed precipitating area does not fill the satellite FOV homogeneously, due to spatial variability at the sub-FOV scale. Finally, sources of inconsistencies between satellite and ground-based observations may be related to erroneous measurements. Therefore, measures for data quality control are applied during the spatial colocation procedure. Geolocation errors are mitigated according to the platform during the production of level 1c data. To minimize the inconsistency related to beam-filling issues, we perform the following data screening with respect to RNC. The procedure discards the FOVs that are only partially covered (less than 3/4 of the total area) by RNC pixels. Furthermore, it considers only the FOVs that contain nearly all (95 %) clear or rainy RNC pixels (i.e., leaving 5 % tolerance with respect to considering completely clear or rainy FOVs only). Similarly, with respect to RGN the procedure discards the FOVs with fewer than 10 rain gauge measurements falling within the FOV area. Furthermore, it considers only the FOVs for which more than 95 % of the associated rain gauges detected either rain or no rain.
Instrument errors may affect all the instruments considered here, i.e., satellite microwave radiometers and ground-based radars and rain gauges. A first quality control (QC) is performed by the data providers, which should ensure proper functioning and calibration. The QC flags provided are used to screen out potentially erroneous data. In addition, we discarded RGN data that appeared suspicious due to telecommunication problems (wrong coordinates or time/date, redundant data, etc.). Measuring errors related to weather conditions (such as wind, frozen precipitation) are not taken into account. In order to prevent unrealistic sri values entering the statistical analysis, the procedure discards RGN sri values higher than 150 mm h⁻¹, assumed to be the upper limit for 1 h accumulated precipitation. The same upper limit is adopted for the instantaneous RNC sri values. In addition, the radar echoes generated by non-meteorological targets are discriminated from weather returns by combining the static clutter map, the Doppler velocity, and the texture of the reflectivity field (Vulpiani et al., 2012). Additionally, a median filter is applied to remove residual clutter echoes. Partial beam-blocking sectors are identified through an empirical visibility map (EVM), which is derived directly from the radar observations. Measures to mitigate the effects of attenuation are not applied to RNC at the current stage. Radar attenuation likely causes a systematic underestimation of sri. The other instrument errors mentioned above are likely to produce random effects depending upon time, location, and weather conditions; therefore we believe their effects are mitigated by the temporal and spatial averaging of the analysis described below.
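The beam-filling screening described above can be sketched as two simple filters. This is a minimal illustration under stated assumptions: radar coverage is approximated by the fraction of expected pixels that carry valid data (rather than by area), the 95 % criterion is applied as "at least 95 % rainy or at least 95 % clear", missing values are represented by `None`, and all function and parameter names are hypothetical.

```python
def _rainy_fraction(values, rain_threshold=0.0):
    """Fraction of valid values exceeding the rain threshold."""
    return sum(1 for v in values if v > rain_threshold) / len(values)


def keep_fov_radar(pixel_sri, expected_pixels,
                   min_coverage=0.75, homogeneity=0.95):
    """Keep a FOV only if radar pixels cover at least min_coverage of it
    and nearly all valid pixels agree on rain or no rain."""
    valid = [v for v in pixel_sri if v is not None]
    if expected_pixels <= 0 or len(valid) / expected_pixels < min_coverage:
        return False
    rainy = _rainy_fraction(valid)
    return rainy >= homogeneity or rainy <= 1.0 - homogeneity


def keep_fov_gauges(gauge_sri, min_gauges=10, homogeneity=0.95):
    """Keep a FOV only if it contains enough gauges and nearly all of
    them agree on rain or no rain."""
    valid = [v for v in gauge_sri if v is not None]
    if len(valid) < min_gauges:
        return False
    rainy = _rainy_fraction(valid)
    return rainy >= homogeneity or rainy <= 1.0 - homogeneity


if __name__ == "__main__":
    print(keep_fov_radar([0.0] * 100, expected_pixels=100))      # clear FOV
    print(keep_fov_radar([0.0] * 50 + [2.0] * 50, 100))          # mixed FOV
    print(keep_fov_gauges([1.2] * 12))                           # rainy FOV
```
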

Statistical scores
The validation of the OPEMW sri product against the ground-based RNC and RGN reference products is performed through the assessment of a number of statistical scores. Here we present both dichotomous and continuous scores, used to assess quantitatively the accuracy of rain detection and estimation, respectively.
The dichotomous scores are used for the assessment of rain detection accuracy. Rain detection is Boolean, as it assumes the value of 0 in the case of no rain and 1 in the case of rain detected (i.e., sri > 0 mm h⁻¹). The dichotomous scores are computed from the contingency table, which reports the number of hit, miss, false alarm, and correct null events of OPEMW vs. RNC and RGN detections (see Appendix A). The dichotomous scores include the accuracy, the frequency bias (FB) score, the probability of detection (POD), the false alarm ratio (FAR), the Heidke skill score (HSS), and finally the equitable threat score (ETS). Equations for these scores are given in Appendix A. The accuracy score indicates the fraction of all the FOVs that have been correctly identified as rainy or non-rainy; however, the high occurrence of non-rainy FOVs strongly influences this accuracy score. Similarly, the HSS indicates the fraction of correctly identified FOVs (as rainy or non-rainy), but after eliminating the fraction correctly identified due to random chance. The ETS indicates the fraction of FOVs correctly identified as rainy after eliminating the fraction due to random chance. The FB score indicates whether there is a tendency to over- or underestimate the area subject to rain (bias score > 1 or < 1, respectively). Finally, the POD quantifies the ability to detect the rainy FOVs only, while the FAR provides a measure of the fraction of FOVs detected as rainy that were actually non-rainy.
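For reference, the dichotomous scores can be computed from the contingency table as in the following minimal sketch. The formulas are the standard textbook definitions and are assumed to match those of Appendix A, which is not reproduced here; the function names and the optional `limit` argument (a detection limit below which values count as no rain) are hypothetical.

```python
def contingency(satellite, reference, limit=0.0):
    """Count (hits, misses, false_alarms, correct_nulls) for paired sri
    values; a value counts as rain if it is > 0 and >= limit."""
    hits = misses = false_alarms = correct_nulls = 0
    for sat, ref in zip(satellite, reference):
        sat_rain = sat > 0.0 and sat >= limit
        ref_rain = ref > 0.0 and ref >= limit
        if sat_rain and ref_rain:
            hits += 1
        elif ref_rain:
            misses += 1
        elif sat_rain:
            false_alarms += 1
        else:
            correct_nulls += 1
    return hits, misses, false_alarms, correct_nulls


def dichotomous_scores(h, m, f, c):
    """Standard scores; assumes no denominator is zero (i.e., at least
    one observed and one estimated rain event, and a mixed table)."""
    n = h + m + f + c
    random_hits = (h + m) * (h + f) / n   # hits expected by chance
    return {
        "accuracy": (h + c) / n,
        "FB": (h + f) / (h + m),
        "POD": h / (h + m),
        "FAR": f / (h + f),
        "HSS": 2.0 * (h * c - f * m)
               / ((h + m) * (m + c) + (h + f) * (f + c)),
        "ETS": (h - random_hits) / (h + m + f - random_hits),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    for name, value in dichotomous_scores(50, 50, 100, 9800).items():
        print(f"{name:8s} {value:.3f}")
```

Note that a data set dominated by correct nulls yields an accuracy close to 1 even when POD and FAR are modest, which is why the chance-corrected HSS and ETS are reported alongside it.
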
The continuous scores, used for the assessment of sri estimation accuracy, are applied to the data set after the "binning" following the approach introduced by Ferraro and Marks (1995). In this approach the reference data (RNC and RGN) are binned in 1 mm h⁻¹ sri intervals and the corresponding satellite estimates are averaged and associated with each bin. The binned analysis is extremely useful, as it minimizes match-up errors between ground and satellite observations and it ensures equal emphasis on the entire range of sri. The data set after the binning is used to compute the mean (AVG), standard deviation (STD), and root-mean-squared (RMS) difference, the correlation coefficient (COR), and the slope (SLP) and intercept (INT) of a linear fit. Finally, the full data set (i.e., before binning) was processed to compute the monthly mean and standard deviation of the RGN-OPEMW and RNC-OPEMW differences. These numbers provide quantitative information on the annual variation of the accuracy of the OPEMW sri estimate with respect to ground references.
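The binning and the continuous scores can be sketched as follows. This is a minimal illustration under stated assumptions: each bin is represented by its centre, differences are taken as satellite minus reference, the standard deviation is the population form, and `min_count` is a hypothetical parameter for discarding sparsely populated bins. None of these choices is confirmed by the text above.

```python
import math
from collections import defaultdict


def binned_means(reference, satellite, bin_width=1.0, min_count=1):
    """Bin reference sri and average the matching satellite estimates.
    Returns a list of (bin_centre, mean_satellite_sri, count)."""
    bins = defaultdict(list)
    for ref, sat in zip(reference, satellite):
        bins[int(math.floor(ref / bin_width))].append(sat)
    result = []
    for index in sorted(bins):
        values = bins[index]
        if len(values) >= min_count:
            centre = (index + 0.5) * bin_width
            result.append((centre, sum(values) / len(values), len(values)))
    return result


def continuous_scores(xs, ys):
    """AVG, STD, RMS of (ys - xs), plus COR, SLP, INT of ys vs. xs.
    Requires at least two points with non-constant xs and ys."""
    n = len(xs)
    diffs = [y - x for x, y in zip(xs, ys)]
    avg = sum(diffs) / n
    std = math.sqrt(sum((d - avg) ** 2 for d in diffs) / n)
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"AVG": avg, "STD": std, "RMS": rms,
            "COR": sxy / math.sqrt(sxx * syy),
            "SLP": slope, "INT": my - slope * mx}


if __name__ == "__main__":
    ref = [0.2, 0.7, 1.5, 1.6, 2.4]   # hypothetical reference sri
    sat = [0.0, 1.0, 2.0, 1.0, 3.0]   # hypothetical satellite sri
    binned = binned_means(ref, sat)
    print(binned)
    print(continuous_scores([b[0] for b in binned], [b[1] for b in binned]))
```
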

Results
The results of the validation of the OPEMW sri product against the ground-based RNC and RGN reference products are presented for the 1 yr data set under consideration (July 2011-June 2012). Figure 4 presents the histograms of OPEMW, RGN, and RNC sri products for the full data set after the colocation procedure. Note that the events with sri larger than 15 mm h⁻¹ have been grouped in the last bin, as their number is quite small (< 0.002 %), especially for OPEMW (< 10). The distributions of the three sri sources look quite similar, except that OPEMW shows more cases at small values (sri ∼ 1 mm h⁻¹) and fewer cases at large values (sri ≥ 15 mm h⁻¹) relative to the other two sources. Note also that RGN data are only available over land, while RNC data are available both over land and ocean (at a ratio of ∼ 1/2). Since precipitation estimates based on MW radiances are inherently better over ocean than over land, due to less uncertainty related to surface emissivity, we expect data over ocean to positively affect the overall OPEMW-RNC comparison.
The following sections report the results of the dichotomous statistical assessment, the continuous statistical assessment, and a spatial and temporal analysis.

Dichotomous statistical assessment
The dichotomous statistical assessment was performed over the whole data set, containing more than 650 000 OPEMW-RGN match-ups and more than 1 600 000 OPEMW-RNC match-ups. We assume a detection limit of 0.5 mm h⁻¹, which means that sri values smaller than this limit are set to zero. The overall results are reported in Table 1, also divided into four seasons: summer (July-September 2011), fall (October-December 2011), winter (January-March 2012), and spring (April-June 2012). Considering all data, the accuracy score shows that OPEMW correctly identifies most of the FOVs as rainy or non-rainy (accuracy of 98 % for both RGN and RNC). However, the accuracy score is heavily influenced by the high occurrence of non-rainy FOVs (96 and 98 % for RGN and RNC, respectively), as previously anticipated. The fraction of correct detection after eliminating the portion due purely to random chance is given by the HSS and ETS scores, respectively considering or not considering the correct null events (no rain). The perfect value for HSS and ETS is 1.0, while here they reach HSS = 0.42 (0.45) and ETS = 0.27 (0.29) with respect to RGN (RNC). The perfect value for the FB score is 1.0, while here it is somewhat larger with respect to both RGN and RNC, indicating that OPEMW has a tendency to slightly overestimate the precipitating areas. As a consequence, the FAR is rather high (64 %), while the POD is within 55 % (60 %) for RGN (RNC). Table 1 also reports the dichotomous scores computed by breaking the data set into the four seasons introduced above, thus providing information on the seasonal behavior of the OPEMW performances. All the scores indicate that performances better than the annual average are found for summer, fall, and spring, while performances are substantially worse than average for the winter season. In particular, the low values for POD and HSS together with the high values for FAR and FB seem to suggest that OPEMW tends to overestimate the precipitating areas during winter.
Note that the quantitative values of the above scores depend upon the assumed sri detection limit. For example, Fig. 5 shows the monthly mean POD as computed using three different detection limits (0.5, 1.0, and 5.0 mm h⁻¹) with respect to either RGN or RNC. Figure 5 confirms the seasonal behavior of OPEMW estimates and the performance degradation during the winter season (with respect to both RGN and RNC). The behavior, however, becomes less evident as the detection limit increases. Finally, considering the whole data set, the POD reaches 60 % (66 %) with an sri threshold of 1 mm h⁻¹, while it reaches 83 % (88 %) with an sri threshold of 5 mm h⁻¹ for RNC (RGN). This demonstrates the increasing OPEMW detection skill as the rainfall becomes more intense.

Continuous statistical assessment
As already mentioned, the continuous statistical assessment was applied to the data set after the "binning" following the approach in Ferraro and Marks (1995).The data sets in Fig. 4 have been processed such that the reference data, either RNC or RGN sri products, are binned in 1 mm h −1 sri intervals, and the corresponding satellite estimates are averaged for each bin and the averaged OPEMW sri value is associated with that bin.We choose 1 mm h −1 bins to be consistent with Di Tomaso et al. ( 2009), and thus produce results that are directly comparable.Larger sri bins or even rainfall classes (e.g., light, moderate, heavy) could have been chosen, though we do not expect the conclusions to be greatly affected.The results of the binning are shown in Fig. 6 in terms of percentage histograms, i.e., the percentage of all data that fell into each bin.Clearly, considering all the matchup data, including non-raining FOVs, a large portion (∼ 60-70 %) falls in the first bin.Conversely, a more balanced histogram is obtained when considering hits only (i.e., raining pixels correctly detected).In any case, the number of satellite estimates falling into each bin decreases drastically with increasing sri values.Accordingly, the binning analysis is less and less reliable as the sri value increases and the number of available satellite estimates decreases.In particular, we limited the analysis to bins with a number of satellite estimates larger than 5, causing the sri range to be bound in the 0-15 mm h −1 range.Figure 6 also shows the standard deviation of OPEMW sri data falling into each bin, either considering all data or hits only.This standard deviation increases with sri for the first few bins and then it becomes nearly constant at ∼ 4 mm h −1 .The results of the binning analysis are shown in Fig. 
7, where we compare OPEMW sri estimates against both RGN and RNC for the entire year under consideration.The results are presented considering all data and hits only, whose distribution and deviation are shown in Fig. 6.The binned scatter plots show a reasonably good correlation between OPEMW and RGN/RNC sri products.However, it is noticeable that OPEMW tends to slightly overestimate lower sri values and, conversely, underestimate larger sri, with a hinge point roughly around 6-7 mm h −1 .Up to 7 mm h −1 OPEMW is well correlated with the ground reference, especially with RGN, while the scatter increases substantially for sri >10 mm h −1 , likely due to the low number of cases (as seen in Fig. 6).Overall, the mean difference is within 1.2-3.3mm h −1 and the STD is within 2.7-3.5 mm h −1 , while the correlation is within 0.8-0.9.The agreement in terms of RMS is better with RNC than RGN.The bias for both RGN and RNC is lower when considering hits only.However, the results above are influenced by larger sri values, which again are less reliable due to much lower statistical significance.Nevertheless, there may be reasons for the OPEMW underestimation at high sri related to the precipitation mechanism.Although the synthetic training of the PEMW algorithm also accounted for extreme scenarios (Di Tomaso et al. 2009), Fig. 7 seems to suggest that it works better for stratiform (relatively lower) rather than for convective (relatively higher) rainfall.The analysis of the influence of precipitation type on algorithm performances shall be the object of future research.
The same analysis, but dividing the data set into the four seasons introduced earlier, is repeated in Figs. 8 and 9 against RGN and RNC, respectively. In Fig. 8 we see that OPEMW agrees quite well with RGN in summer, winter, and spring, showing mean difference within 1.1 mm h⁻¹, STD within 1.9 mm h⁻¹, RMS within 2.1 mm h⁻¹, and correlation greater than 0.9. However, the range of valid sri is limited to less than 10 mm h⁻¹ by the low occurrence of higher sri values. The results for the fall season resemble those for the whole year in Fig. 7. Note that fall is characterized by high occurrence of heavy precipitation over Italy. Orographic precipitation and mesoscale convective systems play an important role due to steep slopes in the vicinity of large coastal areas, often causing localized hailstorms with cluster-organized cells (Ferretti et al., 2013). Similarly, Fig. 9 shows results with respect to RNC. Here differences between the four seasons are less evident. Note, however, that OPEMW tends to overestimate small sri values with respect to RNC (Fig. 9), but not so much with respect to RGN (Fig. 8). We attribute this to the underestimation of the sri field by RNC related to the complex orography of the Italian territory; in fact, complex orography causes, apart from substantial ground clutter, a range-dependent underestimation due to beam divergence and altitude (Marzano et al., 2004). Mitigation measures are currently under testing at DPC, but are not applied to the current version of RNC.

Spatial-temporal assessment
The spatial and temporal distribution of the retrieval uncertainties is also important to characterize the OPEMW performance, especially over a territory with complex orography and large seasonal variability such as Italy. To investigate this, we have divided the geographical area in Figs. 1-3 into a 14° × 14° longitude-latitude grid with 0.1° step and computed for each pixel the mean absolute difference between OPEMW and ground-reference (either RGN or RNC) sri products for each of the four seasons introduced above. The results are shown in Figs. 10 and 11 with RGN and RNC, respectively, as reference. Note that, as anticipated, Fig. 11 shows that the agreement is generally better over ocean than over land. Figures 10 and 11 do not seem to show any particular geographical-seasonal effect, except for an increase in mean absolute difference over the Alps and along the northern Apennines during winter. Several reasons likely contribute to this effect. First, precipitation over the mountains is often snow during winter, which increases the uncertainty of ground-based measurements: tipping bucket rain gauges often get clogged by snow, and even those provided with a heating system can only measure the water equivalent of frozen precipitation, which is affected by substantial measuring errors (e.g., evaporation loss). At the same time, radar quantitative precipitation estimation is degraded in mountainous areas, due to (i) the more complex orography, causing enhanced beam blockage and ground clutter, and (ii) the presence of snow/ice hydrometeors, adding uncertainty to the assumptions concerning particle size, distribution, and phase (Germann et al., 2006). The increased difference may also be caused by the presence of snow on the ground, which is a well-known source of uncertainty for passive microwave estimates of rainfall. As described in Di Tomaso et al. (2009), the PEMW algorithm applies methods to avoid snow on the ground being detected as rainfall, based on the observations from the channels less sensitive to ground emissivity. The effects of these methods were shown by Di Tomaso et al. (2009), who concluded that the number of false alarms is reduced considerably, but not completely set to zero. The above reasons contribute to the increased mean absolute difference over the main mountain ridges, as well as to the relatively larger FAR and smaller POD reported in Table 1 and Fig. 4 during winter.
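The gridded mean absolute difference described above can be sketched as follows. This is an illustration under stated assumptions, not the processing chain used here: the function name `gridded_mean_abs_diff` is hypothetical, and the grid origin (`lon0`, `lat0`) is a free parameter because the text specifies only the 14° × 14° extent and the 0.1° step. Seasonal maps would be obtained by calling the function once per season on the corresponding subset of match-ups.

```python
import numpy as np

def gridded_mean_abs_diff(lon, lat, sat, ref, lon0, lat0, extent=14.0, step=0.1):
    """Mean |satellite - reference| sri per grid cell (NaN where no match-ups).

    lon0, lat0: south-west corner of the grid (assumed, not given in the text).
    Returns an array of shape (n_lat, n_lon) with n = extent / step cells per side.
    """
    n = int(round(extent / step))
    ix = np.floor((np.asarray(lon, dtype=float) - lon0) / step).astype(int)
    iy = np.floor((np.asarray(lat, dtype=float) - lat0) / step).astype(int)
    inside = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
    abs_diff = np.abs(np.asarray(sat, dtype=float) - np.asarray(ref, dtype=float))

    total = np.zeros((n, n))
    count = np.zeros((n, n))
    # Unbuffered accumulation, so repeated cell indices are all summed
    np.add.at(total, (iy[inside], ix[inside]), abs_diff[inside])
    np.add.at(count, (iy[inside], ix[inside]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

# Synthetic match-ups (illustrative only); the grid corner is an arbitrary choice
grid = gridded_mean_abs_diff(
    lon=[6.05, 6.05, 6.15], lat=[36.05, 36.05, 36.05],
    sat=[1.0, 3.0, 2.0], ref=[0.0, 1.0, 2.0],
    lon0=6.0, lat0=36.0,
)
```

Cells without any match-up are returned as NaN rather than zero, so that they are not mistaken for perfect agreement when the map is plotted.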
In Fig. 10 we also notice larger mean absolute difference values over Sicily than for the rest of Italy. This is more evident during winter, but it seems to be present during the other seasons as well. Since there is no hint of this feature in Fig. 11, we attribute it to larger uncertainties affecting the rain gauge network deployed in that region.
Finally, in order to quantify the accuracy of the single FOV estimate and to detect its seasonal features, we used the whole match-up data set (more than 650 000 for OPEMW-RGN and more than 1 600 000 for OPEMW-RNC) and computed the monthly mean difference between ground-reference (either RGN or RNC) and OPEMW sri products. The results are shown in Fig. 12, where the error bars indicate the STD of the monthly mean difference. Figure 12 shows that, with respect to RGN, OPEMW tends to underestimate sri from September to May, while the opposite is the case in June, July, and August, though the monthly mean difference remains within ± 1 mm h⁻¹. Conversely, with respect to RNC, OPEMW seems to overestimate sri throughout the year, with monthly mean difference between −2 and 0 mm h⁻¹. These results are likely influenced by the large amount of relatively low sri dominating the statistics (see Fig. 6), for which OPEMW agrees quite well with RGN but is larger than RNC, as seen already in Figs. 7-9. As already anticipated, this feature is mainly related to the complex orography of the Italian territory. The standard deviation of the monthly mean difference does not seem to show an evident seasonal behavior, with values between 2 and 4 mm h⁻¹ (except for March 2012).
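As a sketch of the monthly statistics discussed above, the snippet below groups match-ups by month and computes the mean and standard deviation of the difference. The sign convention (ground reference minus OPEMW) follows the caption of Fig. 12; the function name `monthly_difference`, the month labels, and the reading of the error bars as the sample standard deviation of the single-FOV differences within each month are assumptions made for illustration.

```python
import numpy as np

def monthly_difference(months, reference, satellite):
    """Mean and sample STD of (reference - satellite) sri for each month label.

    Hypothetical helper: `months` holds one label per match-up, e.g. "2011-07".
    Returns {month: (mean difference, standard deviation)}.
    """
    months = np.asarray(months)
    # Sign convention as in Fig. 12: ground reference minus OPEMW
    diff = np.asarray(reference, dtype=float) - np.asarray(satellite, dtype=float)
    stats = {}
    for m in np.unique(months):
        d = diff[months == m]
        std = d.std(ddof=1) if d.size > 1 else float("nan")
        stats[str(m)] = (d.mean(), std)
    return stats

# Synthetic match-ups (illustrative only)
stats = monthly_difference(
    months=["2011-07", "2011-07", "2012-01", "2012-01"],
    reference=[1.0, 3.0, 2.0, 2.0],
    satellite=[2.0, 2.0, 1.0, 0.0],
)
```

With this convention a positive monthly mean indicates that the satellite product is lower than the ground reference, and a negative one that it is higher.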

Conclusions
One year of surface rain intensity (sri) data produced by the operational procedure OPEMW developed at IMAA-CNR has been validated against ground-based reference sri products from rain gauge (RGN) and weather radar (RNC) networks deployed over the Italian territory. The data set spans from July 2011 until June 2012, exploiting more than 3000 rain gauges and 20 weather radars. Ground-based observations have been temporally and spatially colocated with the satellite observations for a total of more than 650 000 OPEMW-RGN match-ups and more than 1 600 000 OPEMW-RNC match-ups. The distribution of sri shows that OPEMW generates more cases at smaller values (sri ∼ 1 mm h⁻¹) and fewer cases at larger values (sri ≥ 15 mm h⁻¹) relative to the two ground-based references.
The assessment of OPEMW rain detection is performed over the whole data set, showing 98 % accuracy in correctly identifying rainy and non-rainy FOVs. The FB score is larger than unity, indicating that OPEMW has a tendency to slightly overestimate the precipitating areas. Consistent results are obtained against RGN and RNC. As a consequence, the FAR is rather high (64 %), while the POD is 55 % (60 %) with respect to RGN (RNC). Taking RGN as reference, OPEMW shows an increase (with respect to random chance) in the ability to detect rainy and non-rainy FOVs (HSS = 0.42) as well as in the ability to detect rainy FOVs only (ETS = 0.27). Similar results are obtained when taking RNC as reference (HSS = 0.45 and ETS = 0.29). When breaking the data set into seasons, all the dichotomous scores indicate performance better than average in summer, fall, and spring, and substantially worse than average in winter. Low POD, HSS, and ETS values together with high FAR and FB values all seem to suggest that OPEMW tends to overestimate the precipitating areas during winter. These results, including the seasonal trend, are comparable with numbers found in Ebert et al. (2007), though those were obtained for 24 h accumulated rain. It is also noted that the OPEMW detection skill improves with increasing rainfall intensity (POD up to 66 and 88 % for detection limit set to 1 and 5 mm h⁻¹, respectively).
The assessment of OPEMW estimation accuracy demonstrates reasonable agreement with RGN/RNC sri products. However, OPEMW tends to slightly overestimate lower sri values, and conversely to underestimate larger sri, with a hinge point roughly around 6-7 mm h⁻¹. Up to 7 mm h⁻¹ OPEMW is well correlated with the ground reference, especially with RGN; the dispersion increases substantially for sri > 10 mm h⁻¹, likely due to the low number of cases with rainfall higher than 10 mm h⁻¹. Taking RGN (RNC) as reference, the mean difference is 3.3 (2.2) mm h⁻¹, the standard deviation is 3.4 (2.7) mm h⁻¹, and the correlation is 0.8 (0.9). In terms of RMS difference, results improve by 10 % when considering only the hit events. Better agreement is found with RNC rather than RGN; this result is partially due to the smaller differences over ocean, though it is also strongly influenced by the larger and statistically less significant sri values. When breaking the data set into seasons, the estimation accuracy does not show substantial difference from the results above, except that intense rainfall events are largely limited to the fall season. For low to moderate sri values (sri < 8 mm h⁻¹), OPEMW agrees well with RGN but tends to overestimate RNC. The latter result may be explained by the likely RNC sri underestimation due to the combined effect of attenuation and complex orography.
We also investigated the spatial and temporal behavior of the mean absolute difference between OPEMW and ground-based reference sri products. Two geographical-seasonal features are noticed: (i) mean absolute difference larger than average over the Alps and northern Apennines during winter, and (ii) larger mean absolute differences over Sicily than for the rest of Italy with respect to RGN. The first feature is consistent with the scores in Table 1 and the rain detection results above. We attribute it to the combination of larger uncertainty in both satellite estimates (residual spurious effects caused by snow on the ground) and ground-based measurements (complex orography, frozen precipitation) in mountain regions during winter. Conversely, we attribute the feature over Sicily to larger errors affecting the rain gauges deployed in Sicily rather than to inaccurate satellite estimates, though we were not able to retrieve information about possible instrumental differences.
Finally, we investigated the monthly mean difference between OPEMW and ground-based reference sri products. With respect to RGN, the monthly mean difference remains within ± 1 mm h⁻¹ throughout the year. OPEMW underestimates RGN from September to May, while the opposite is the case in June, July, and August. Conversely, OPEMW seems to overestimate RNC throughout the year, with monthly mean difference ranging from 0 to −2 mm h⁻¹. These results are likely influenced by the large amount of relatively low sri dominating the statistics, for which OPEMW agrees quite well with RGN but is larger than RNC. The systematic difference between OPEMW and RNC is mostly attributed to the likely systematic underestimation of sri by RNC caused by radar attenuation issues. The STD of the monthly mean difference does not seem to show an evident seasonal behavior, with values between 2 and 4 mm h⁻¹ for both RGN and RNC.
In conclusion, the validation effort presented here extends the results of Di Tomaso et al. (2009), which validated the PEMW algorithm on a few case studies only, to a full year of operational OPEMW sri products. The rain detection and estimation performance over the Italian territory and four seasons indicates that the OPEMW sri product is suitable for deployment in an integrated system supporting numerical hydrometeorology and flood-hazard-alert systems. However, discrepancies with respect to ground-based references have been identified and discussed. Besides the uncertainty attributed to the ground-based reference observations, we identified the following features for OPEMW: (a) large false alarm ratio and mean absolute error during winter, and (b) considerable underestimation of intense rainfall at FOV scale (sri > 10 mm h⁻¹). These features represent the starting point of our ongoing and future work to improve the overall performance of OPEMW. In fact, solutions to mitigate these features are under study, for example an adaptive screening designed to remove the residual contamination by snow, and additional training giving more weight to extreme rainfall cases.

Definitions of statistical scores
This appendix summarizes the statistical scores used for evaluating surface rain intensity (sri) estimated from satellite by OPEMW with respect to ground-based measurements from rain gauge (RGN) and weather radar (RNC) networks. These include the accuracy, the frequency bias (FB) score, the probability of detection (POD), the false alarm ratio (FAR), the Heidke skill score (HSS), and the equitable threat score (ETS). Equations for the above scores are taken from Ebert et al. (2007) and references therein. Every satellite-RGN (or satellite-RNC) match-up duplet, obtained as described in Sect. 3.1, can be classified as a hit (H, observed rain correctly detected), miss (M, observed rain not detected), false alarm (F, rain detected but not observed), or correct null (N, no rain observed nor detected) event. The sum H + M + F + N is equal to the sample size S. The accuracy score is defined as (H + N) / S, and it indicates the fraction of the total sample that has been correctly identified as rainy or non-rainy. The FB score is defined as (H + F) / (H + M), and it is the ratio of the detected to observed rain areas, thus indicating whether there is a tendency to over- or underestimate the area subject to rain (bias score > 1 or < 1, respectively). The probability of detection, POD = H / (H + M), gives the fraction of rain occurrences that were correctly detected, while the false alarm ratio, FAR = F / (H + F), measures the fraction of rain detections that were actually false alarms. By considering the number of hits that could be expected due purely to random chance, given by He = (H + M) (H + F) / S, the HSS score is defined as HSS = (H + N − He) / (S − He), indicating the fraction of correctly detected FOVs (as rainy or non-rainy) after eliminating the fraction correctly identified due to random chance. Similarly, the ETS is defined as ETS = (H − He) / (H + M + F − He), indicating the fraction of correctly detected FOVs (as rainy), adjusted for the number of hits that could be expected due purely to random chance. ETS is more severe than HSS since it does not take into consideration the correct nulls. The ETS is commonly used as an overall skill measure by the numerical weather prediction community, with accuracy, FB, POD, and FAR providing complementary information on bias, misses, and false alarms.
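For readers who wish to reproduce the scores, the expressions above can be transcribed literally as in the following sketch. The function name `dichotomous_scores` and the example counts are hypothetical and chosen only to give round numbers; they are not results of this study.

```python
def dichotomous_scores(H, M, F, N):
    """Scores from hits (H), misses (M), false alarms (F), and correct nulls (N),
    transcribed from the expressions given in this appendix."""
    S = H + M + F + N                 # sample size
    He = (H + M) * (H + F) / S        # hits expected by random chance
    return {
        "accuracy": (H + N) / S,
        "FB": (H + F) / (H + M),
        "POD": H / (H + M),
        "FAR": F / (H + F),
        "HSS": (H + N - He) / (S - He),
        "ETS": (H - He) / (H + M + F - He),
    }

# Hypothetical contingency counts (illustrative only)
scores = dichotomous_scores(H=30, M=20, F=10, N=40)
```

Note that the widely used form of the Heidke skill score takes the chance term as the expected number of correct detections of both classes, ((H + M)(H + F) + (N + M)(N + F)) / S; substituting that term for He in the HSS line is a one-line change.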

Fig. 1. An example of the graphical output of OPEMW (the MHS FOVs are represented as uniform circles along the scan line). The surface rain intensity (sri) product is color-coded according to the vertical bar (in mm h⁻¹) and layered over the Meteosat Second Generation (MSG) 10.8 µm image (in normalized inverted greyscale). Data obtained from MHS on NOAA N-18 overpass at 01:22 UTC and MSG observations at 01:00 UTC on 5 July 2011.

Fig. 2. An example of the graphical output of RNC (courtesy of DPC). The surface rain intensity (sri) product is color-coded according to the vertical bar (in mm h⁻¹) and layered over the Meteosat Second Generation (MSG) 10.8 µm image (in normalized inverted greyscale). Data obtained from RNC at 01:15 UTC and MSG observations at 01:00 UTC on 5 July 2011 (i.e., within 7 minutes from

Fig. 3. The distribution of more than 3000 rain gauges over Italy. The figure shows 1 h accumulated rain (sri) between 01:00 and 02:00 UTC on 5 July 2011 (i.e., the 1 h period containing both Figs. 1 and 2). Rain gauges are indicated with circles and color-coded as follows: grey → missing data; white → no rain; green

Fig. 6. Top panels: percentage histograms of binned analysis for OPEMW sri against RGN (left) and RNC (right) sri products. Bottom panels: standard deviation of OPEMW sri data falling into each 1 mm h⁻¹ bin of RGN (left) and RNC (right). Blue and red bars indicate all and hits-only data, respectively.

Fig. 7. Scatter plot of binned analysis for the 1 yr period under analysis (July 2011-June 2012). The y axis reports the OPEMW sri product, while the x axis reports RGN sri (left) and RNC sri (right) products. Blue markers indicate results using all data, while red crosses indicate results considering hits only. Main statistics are shown: the number of available bins (N); the mean (AVG), standard deviation (STD), and root-mean-square (RMS) difference; the correlation coefficient (COR); and the slope (SLP) and intercept (INT) of a linear fit. Numbers after the ± sign indicate the 95 % confidence interval. Error bars indicate one standard deviation of OPEMW sri values within each bin.

Fig. 9. Scatter plot of seasonal binned analysis as in Fig. 8, but with respect to RNC. Clockwise from top-left panel: summer, fall, winter, and spring. Markers and statistics are as in Fig. 7. Error bars have been omitted to improve figure readability.

Fig. 10. Maps of seasonal mean absolute difference with respect to RGN. Clockwise from top-left panel: summer (July-August-September 2011), fall (October-November-December 2011), winter (January-February-March 2012), and spring (April-May-June 2012). The vertical color bar is in mm h⁻¹. The black arrow in the lower-left panel indicates Sicily.

Fig. 11. As in Fig. 10 but with respect to RNC. Clockwise from top-left panel: summer, fall, winter, and spring. The vertical color bar is in mm h⁻¹.

Fig. 12. Monthly mean difference (circles) and its standard deviation (error bars) of OPEMW sri product with respect to RGN (blue) and RNC (red) products. Mean difference is computed as RGN (or RNC) minus OPEMW.

Table 1. Results of the dichotomous statistical assessment for OPEMW sri product with respect to RGN (blue) and RNC (red) products.