Commercial microwave links as a tool for operational rainfall monitoring in Northern Italy

Abstract. There is a growing interest in emerging opportunistic sensors for
precipitation, motivated by the need to improve its quantitative
estimates at the ground. The scope of this work is to present a
preliminary assessment of the accuracy of commercial microwave link (CML) retrieved rainfall
rates in Northern Italy. The CML product, obtained by the open-source
RAINLINK software package, is evaluated on different scales (single
link, 5 km×5 km grid, river basin) against the
precipitation products operationally used at Arpae-SIMC, the regional
weather service of Emilia-Romagna, in Northern Italy. The results of
the 15 min single-link validation with nearby rain gauges
show high variability, which can be caused by the complex physiography and precipitation patterns. Known sources of errors
(e.g. the attenuation caused by the wetting of the antennas or random
fluctuations in the baseline) are particularly hard to mitigate in
these conditions without a specific calibration, which has not been
implemented. However, hourly accumulated, spatially interpolated CML
rainfall maps, validated against the established regional
gauge-based reference, show similar performance (R² of 0.46 and
coefficient of variation, CV, of 0.78) to adjusted radar-based gridded precipitation products and
better performance than satellite-based ones. Performance improves when basin-scale
total precipitation amounts are considered (R² of 0.83 and CV of
0.48). Avoiding region-specific calibration therefore does not
preclude the algorithm from working but entails some limitations in probability of detection (POD)
and accuracy. A widespread underestimation is evident at both the grid box scale
(mean error of −0.26) and the basin scale (multiplicative bias of 0.7),
while the number of false alarms is generally low and becomes even lower
as link coverage increases. Taking into account also the delays in the
availability of the data (latency of 0.33 h for CML against
1 h for the adjusted radar and 24 h for the
quality-controlled rain gauges), CMLs appear to be a valuable data source,
particularly from a local operational perspective. Finally,
results show complementary strengths for CMLs and radars, encouraging
joint exploitation.


The whole set of available CMLs is shown in Fig. S1a as pairs of path length (x-axis) and frequency (y-axis). The theoretical sensitivity is calculated for every length-frequency pair by inverting the k-R relationship at a fixed rain rate of 1 mm h⁻¹. Note that the processing steps within the algorithm (especially the Aa threshold) do not allow a direct translation from the theoretical sensitivities to actual instrumental uncertainties or error bands. The theoretical sensitivity field is shown as contour lines of equal sensitivity, with small differences between the two polarizations.
Starting from an initial dataset of 357 duplex CMLs, 4.13 % of the data is rejected because its theoretical sensitivity is lower than 0.1 dB per mm h⁻¹ (red selection in Fig. S1). This involves 15 CMLs (see Fig. S2a). Another 10.9 % is lost during preprocessing, mainly due to missing values ("NA" in the R language) in either Pmin or Pmax. A mitigation technique that exploits the remaining observation of the Pmin-Pmax pair was attempted but did not lead to any consistent result. The median number of CMLs involved here is 37, with around 9 % variation between days and an isolated peak on June 3 (see Fig. S2b). Both low sensitivity and NAs are issues intrinsic to the dataset we received, and their mitigation is out of our control, so we had no alternative but to discard all the affected data. The rejected percentages of the total available data are listed in Table S1.
Lastly, 3.87 % of the data is classified as "outlier" by the outlier filter of the RAINLINK algorithm. The resulting valid dataset is 81.1 % of the initially available one. Outliers affect a median of 14 CMLs per time interval (15 min), but in this case the corresponding variation between days is higher, at 27 % (see Fig. S2c and S2d). Outliers occur mostly on high-frequency links (> 35 GHz) and on the shortest links (< 4 km), with a secondary peak on the longest ones (> 17 km), as shown in Fig. S3. No calibration of the −32.5 dB km⁻¹ h threshold was attempted. Please note that LC, BC, and the other dataset descriptors in Table 1 of the main manuscript are all computed for the valid dataset, not for the total one.
In the southern part of the Po Valley, where our target areas are located, precipitation is well distributed throughout the year, with two peaks in spring and autumn and a relatively dry summer. During spring, a transition season for the Mediterranean area, precipitation is often related to cyclonic development, typical of the colder months, with large frontal bands and moderate rain rates, but also to small-scale convection, triggered in many cases by orography. The PDFs of rain rates in the region during the two months of our study are shown in Fig. S4 (dashed lines) for the individual 15 min rain gauges (left) and the hourly ERG5 product at 5 km × 5 km resolution (right). The same plots also show the PDFs of rain rates computed by RAINLINK (solid lines) for the 15 min single-link estimates (left) and for the hourly estimates interpolated onto the same ERG5 5 km × 5 km grid (right).
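The sensitivity screening applied to the CML dataset (links below 0.1 dB per mm h⁻¹ are rejected) can be illustrated with a minimal sketch. It assumes that the k-R relationship is a power law of the form R = a k^b, so that the path attenuation at 1 mm h⁻¹ is L (1/a)^(1/b). The function names and the coefficients a and b below are placeholders chosen for illustration only; they are not the values used in this work.

```python
def specific_attenuation(rain_rate, a, b):
    """Invert the assumed power law R = a * k**b to get k (dB/km)."""
    return (rain_rate / a) ** (1.0 / b)


def theoretical_sensitivity(length_km, a, b, rain_rate=1.0):
    """Path attenuation (dB) produced by `rain_rate` (mm/h) along the link."""
    return length_km * specific_attenuation(rain_rate, a, b)


def is_sensitive_enough(length_km, a, b, threshold_db=0.1):
    """Keep a link only if 1 mm/h produces at least `threshold_db` of attenuation."""
    return theoretical_sensitivity(length_km, a, b) >= threshold_db


# Placeholder links and coefficients (hypothetical, not taken from the dataset).
links = [
    {"id": "short", "length_km": 0.3, "a": 4.0, "b": 1.0},
    {"id": "long", "length_km": 2.0, "a": 4.0, "b": 1.0},
]
kept = [
    link["id"]
    for link in links
    if is_sensitive_enough(link["length_km"], link["a"], link["b"])
]
print(kept)
```

In practice a and b depend on frequency and polarization, which is why the contour lines in Fig. S1 differ slightly between the two polarizations.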

Characteristics of precipitation in the study areas
The plots in Fig. S4 show that single-CML estimates are rather good for low to moderate rain rates (around 5 mm), while larger discrepancies are found for higher rates. For the interpolated product, interpolation leads to a general underestimation across the whole rain rate range. A characterisation of the events that occurred in the two areas during the two months is shown in Fig. S5. An event is defined as any precipitation episode lasting at least one hour (so as to also capture small-scale thunderstorms) with at least two adjacent wet grid points. Two consecutive dry hours are needed to separate successive events.
Most of the events occupy a very small area (below 10 %, i.e. around 300 km²), and in general the average coverage of the events during their lifetime is below 60 %. Note that the events are very likely not entirely contained in the target areas during their whole lifetime, so we may have only a partial view of them. The distribution of the maximum rain rate per event has two peaks: one at very low rain rates, below 5 mm h⁻¹, probably due to stratiform precipitation, and a smaller one around 13 mm, related to convective systems of moderate intensity. The duration of the events ranges from one hour to one day, and most events produce small rain amounts, below 5 mm on the areal average. This analysis shows that the typical event is small and, with some significant exceptions, characterized by low to moderate rain rates. This is certainly a challenging situation in which to test the RAINLINK algorithm, since low to moderate rain rates make the wetting of the antennas more influential, and small-scale precipitation areas can be poorly resolved by sparse sensors, such as both the CMLs and the rain gauges used for verification.
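The event definition used above can be sketched in a few lines. This is only an illustration under stated assumptions: a grid point is taken as wet above a hypothetical 0.1 mm threshold, adjacency means sharing an edge (4-neighbour), and the separation between events is read as two consecutive hours that do not meet the wet criterion. None of these details are specified in the text.

```python
def has_adjacent_wet_pair(grid, wet_threshold=0.1):
    """True if at least two wet grid points share an edge (assumed adjacency)."""
    rows, cols = len(grid), len(grid[0])
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] < wet_threshold:
                continue
            if j + 1 < cols and grid[i][j + 1] >= wet_threshold:
                return True
            if i + 1 < rows and grid[i + 1][j] >= wet_threshold:
                return True
    return False


def split_events(wet_hours, min_dry_gap=2):
    """Group wet hours into events, returned as (first_hour, last_hour) pairs.

    An event is closed once `min_dry_gap` consecutive dry hours have passed.
    """
    events = []
    start = last_wet = None
    for hour, wet in enumerate(wet_hours):
        if wet:
            if start is None:
                start = hour
            last_wet = hour
        elif start is not None and hour - last_wet >= min_dry_gap:
            events.append((start, last_wet))
            start = last_wet = None
    if start is not None:
        events.append((start, last_wet))
    return events


# Toy sequence of hourly 2x2 rain maps (mm), purely illustrative.
hourly_maps = [
    [[0.0, 1.2], [0.0, 0.8]],  # two wet points stacked vertically: wet hour
    [[0.0, 0.0], [0.0, 0.0]],  # dry hour
    [[0.5, 0.6], [0.0, 0.0]],  # two wet points side by side: wet hour
    [[0.0, 0.0], [0.0, 0.0]],  # dry hour
    [[0.0, 0.0], [0.0, 0.0]],  # dry hour
    [[0.5, 0.0], [0.0, 0.7]],  # wet points touch only diagonally: not counted
    [[0.0, 1.2], [0.0, 0.8]],  # wet hour
]
wet_hours = [has_adjacent_wet_pair(grid) for grid in hourly_maps]
events = split_events(wet_hours)
print(events)
```

A single dry hour inside an episode does not split it, which is why the first three hours of the toy sequence form one event.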

The total accumulation of interpolated products
The rain accumulated over the entire period in every 5 km × 5 km grid box is analysed with maps and scatter plots in Fig. S6. Most of the points show an accumulated depth of around 100 mm over the two months. Most of the CML product shows a 10-50 % underestimation, while two distinct branches gather the highest positive and negative discrepancies. On the map, the points of the two branches are not randomly placed but are grouped in specific zones. The domain boundaries probably affect both the skill of the CML product and the reliability of the reference. Further studies could work on subsampled areas where the rain gauge distribution is most uniform and where CMLs are present not only inside but also outside the interpolation region.

Calibration of Aa and alpha
We performed a sensitivity analysis for Aa and alpha, as recommended by Overeem et al. (2016). However, we consider that the reference data available to us (which are used daily in operational offices) are not ideal for calibration, in terms of quality and of spatial and temporal characteristics. For this reason, we eventually chose not to use the calibrated parameters in the primary validation process. The analysis was carried out on the 27 links with a close-by 15 min rain gauge. The CC surface (Fig. S7, left) shows a clear maximum at alpha = 0.3, Aa = 0.7, while CV (right) reaches no local minimum in the examined domain but has a plateau-like area of good performance, which includes the best match for CC. This analysis suggests an Aa value much smaller than the one chosen by Overeem et al. (2016) and kept for our work, which was 2.3 dB, while the alpha parameter remains almost unchanged from the default value of 0.33. However, the PDFs of the estimated rainfall for the two different Aa values (see Fig. S8) show that the physical representativeness deteriorates with the lower Aa.
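A coupled scan of Aa and alpha of the kind described above could look like the following sketch. Several elements are assumptions made for illustration: CV is taken as the standard deviation of the residuals divided by the mean of the reference, the function `retrieve` is a simplified stand-in for a min/max attenuation retrieval (it is not the RAINLINK code), the coefficients a and b are placeholders, and the samples are synthetic.

```python
from itertools import product
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correlation(est, ref):
    """Pearson correlation coefficient (CC); NaN if either series is constant."""
    me, mr = mean(est), mean(ref)
    cov = sum((e - me) * (r - mr) for e, r in zip(est, ref))
    var_e = sum((e - me) ** 2 for e in est)
    var_r = sum((r - mr) ** 2 for r in ref)
    denom = sqrt(var_e * var_r)
    return cov / denom if denom > 0 else float("nan")


def coefficient_of_variation(est, ref):
    """Assumed CV: standard deviation of the residuals over the reference mean."""
    residuals = [e - r for e, r in zip(est, ref)]
    m = mean(residuals)
    std = sqrt(sum((x - m) ** 2 for x in residuals) / len(residuals))
    return std / mean(ref)


def retrieve(a_min, a_max, length_km, aa, alpha, a=4.0, b=1.0):
    """Hypothetical retrieval: weighted mean of rates from min and max attenuation."""
    def rate(attenuation_db):
        # Specific attenuation (dB/km) after removing the wet-antenna term aa.
        k = max(attenuation_db - aa, 0.0) / length_km
        return a * k ** b
    return alpha * rate(a_max) + (1.0 - alpha) * rate(a_min)


def scan(samples, reference, aa_values, alpha_values):
    """Return {(aa, alpha): (CC, CV)} for every parameter pair."""
    scores = {}
    for aa, alpha in product(aa_values, alpha_values):
        est = [retrieve(lo, hi, length, aa, alpha) for lo, hi, length in samples]
        scores[(aa, alpha)] = (
            correlation(est, reference),
            coefficient_of_variation(est, reference),
        )
    return scores


# Synthetic samples: (minimum attenuation dB, maximum attenuation dB, length km).
samples = [(1.5, 3.0, 2.0), (2.0, 6.0, 4.0), (0.8, 1.2, 1.0), (3.0, 9.0, 3.0)]
# Synthetic "truth" generated with known parameters, so the scan can be checked.
reference = [retrieve(lo, hi, length, aa=1.0, alpha=0.3) for lo, hi, length in samples]
scores = scan(samples, reference, aa_values=[0.5, 1.0, 2.3], alpha_values=[0.2, 0.3, 0.4])
best_by_cv = min(scores, key=lambda pair: scores[pair][1])
print(best_by_cv, scores[best_by_cv])
```

With real gauge data the two loss surfaces need not agree, as noted above for CC and CV, so inspecting both surfaces is more informative than selecting a single optimum.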

Figure S1. Dataset characteristics over the field of the theoretical sensitivity. Data in red are below the 0.1 dB per mm h⁻¹ threshold.

Figure S2. Top left: number of low-sensitivity data rows. Top right: number of rows with NAs. Bottom left: daily numbers of CMLs with outliers. Bottom right: hourly numbers of CMLs with outliers.

Figure S3. Distributions of the hardware characteristics of the CMLs reported as outliers compared with those of the whole dataset.

Figure S4. Comparison of estimated and gauged PDFs at the 15 min single-link (left) and hourly interpolated (right) scales. Sampling is performed on 1 mm bins.

Figure S5. Distributions of summary indicators for precipitation for May and June 2016 in Emilia-Romagna.

Figure S6. Top: total accumulated rain depth over May and June 2016. Bottom: branches of high discrepancy are highlighted on the map.

Figure S7. Sensitivity analyses to two coupled retrieval parameters of the RAINLINK algorithm. Loss functions: CC and CV.

Figure S8. PDFs of the single-CML estimates against the rain gauges for two different values of Aa.

Table S1. Percentages of the data rejected after the different steps of the algorithm.