Hydrometeor classification from polarimetric radar measurements: a clustering approach

: A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number of hydrometeor classes (nopt) that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classification techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not defined a priori, but they are learned from data. The approach is applied to data collected by an X-band polarimetric weather radar during two field campaigns (from which about 50 precipitation events are used in the present study). Seven hydrometeor classes (noptD7) have been found in the data set, and they have been identified as light rain (LR), rain (RN), heavy rain (HR), melting snow (MS), ice crystals/small aggregates (CR), aggregates (AG), and rimed-ice particles (RI). Abstract. A data-driven approach to the classiﬁcation of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a ﬁrst step, the optimal number of hydrometeor classes ( n opt ) that can be reliably identiﬁed from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classiﬁed images. In a second step, the n opt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classiﬁcation techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not deﬁned a priori, but they are learned from data. The approach is applied to data collected by an X-band polarimetric weather radar during two ﬁeld campaigns (from which about 50 precipitation events are used in the present study).Seven hydrometeor classes ( n opt = 7 ) have been found in the data set, and they have been identiﬁed as light rain


Introduction
Hydrometeor classification (HC) from weather radar data refers to a family of techniques and algorithms that retrieve qualitative information about precipitation: the dominant hydrometeor type within a given sampling volume, where the term "dominant" is used to underline that the actual hydrometeor content is usually a mixture. These methods use as input a set of quantitative measurements provided by the radar itself and some additional information from external sources, such as vertical profiles of temperature or estimates of the 0 • C isotherm height.
The classification is conducted on the spatial scale of the radar resolution volume (radar range gate), and its inputs are usually a set of polarimetric variables, such as the radar reflectivity factor at horizontal polarization Z H , differential reflectivity Z DR , the copolar correlation coefficient ρ hv , and the specific differential phase K dp 1 (definitions in Bringi and Chandrasekar, 2001;Berne and Krajewski, 2013).
The most recent HC techniques require polarimetric capabilities. This allows a single instrument, the radar, to acquire multiple simultaneous measurements that are sensitive to distinct characteristics of precipitation. This facilitates the understanding of many microphysical processes (e.g. Seliga and Bringi, 1976;Jameson, 1983;Vivekanandan et al., 1994;Ryzhkov et al., 2005;Bechini et al., 2013;Schneebeli et al., 2013).
Different HC algorithms are used at different frequencies, as in Straka et al. (2000); Liu and Chandrasekar (2000) for S-band, Marzano et al. (2007); Dolan et al. (2013) for Cband, and Dolan and Rutledge (2009) ;Snyder et al. (2010); Marzano et al. (2010) for X-band. This is necessary because the scattering properties of hydrometeors vary with respect to the incident wavelength. Recently, after many years of improvements, HC has become a common product, provided operationally by national meteorological services (e.g. Gourley et al., 2007;Al-Sakka et al., 2013;Chandrasekar et al., 2013).
Most HC methods are based on similar principles: they start by selecting the number and type of hydrometeor classes undergoing classification. Then, through scattering simulations, the theoretical radar observations associated with these hydrometeor classes are reconstructed. Finally, actual observations are associated (labelled) with the appropriate class according to their degree of similarity with the sets of simulations available. This last step is often conducted by means of a fuzzy-logic input-output association (e.g. Dolan and Rutledge, 2009) or by means of Bayesian (Marzano et al., 2010) or neural network (Liu and Chandrasekar, 2000) techniques.
In some cases these relations rely entirely on the simulation framework available (e.g. Dolan and Rutledge, 2009). In other cases, they are instead adapted and modified in order to adequately reproduce actual observations (e.g. Marzano et al., 2007) or according to empirical constraints (e.g. Al-Sakka et al., 2013).
The typical HC techniques mentioned above have become a state-of-the-art approach, stable and robust enough to be implemented operationally. However, it is important to underline that these approaches have some limitations since they rely on strong assumptions. First, the choice of the hydrometeor classes, meaning their content and their number, is mostly subjective. Secondly, the scattering simulations (e.g. Mishchenko et al., 1996), which are usually very accurate for rainfall, are uncertain for ice-phase hydrometeors because of the complex geometries, dielectric properties, and largely unknown size distributions of ice particles (Tyynela et al., 2011). Finally, it is not easy to take into account the accuracy of actual radar measurements when comparing simulations and observations. In the present paper we propose a different approach to HC, in which the classifier is built on actual measured radar data and is not constrained by the output of numerical simulations.
A clustering technique, i.e. a technique that is used to find patterns (groups) in data sets in an unsupervised way (see Jain et al., 1999;Xu and Wunsch, 2005;Von Luxburg, 2007, for a complete overview), is applied to a database of precipitation measurements collected by an X-band dualpolarization Doppler radar. An optimal partition of these data into n opt groups is found as a trade-off between data similarity (of polarimetric observations within each group) and spatial smoothness of the partition. The content of these groups is then interpreted a posteriori, and a hydrometeor class is assigned to each of them.
The paper is structured as follows. Section 2 provides some background on clustering algorithms, and Sect. 3 presents the polarimetric data employed in the study. Section 4 describes the unsupervised part of the classification method, and Sect. 5 is devoted to the identification of the optimal number of clusters in the data set. Section 6 deals with the labelling of the n opt clusters identified, and Sect. 7 presents the summary, discussion, and conclusions.

Background on clustering techniques
The proposed approach to HC is data-driven. The first two necessary steps are therefore to identify groups (clusters) in the available data set and then to select the optimal number of these groups. In this section we provide some background on the clustering methods that will be employed in the following sections.

Hierarchical data clustering
We define all techniques that aim at organizing a given set of objects (observations) in a certain number of groups (clusters) as unsupervised data clustering techniques. The shape (or functional form) of these groups, as well as their number, is unknown a priori (Jain et al., 2000).
We consider here a particular type of clustering technique: agglomerative hierarchical clustering (Ward, 1963, AHC hereafter). AHC is a stepwise approach that is used to group a set of N D objects into n c clusters (n c ≤ N D ) in such a way that objects belonging to the same cluster are more similar to each other than to those belonging to others. The technique is called agglomerative because at a step i This means that, at the initial step (i = 0), individual objects populate the clusters, while at each step two objects (the most similar) are merged, thus reducing the total number of clusters by one. The method is nested, in the sense that, once two samples are grouped in the same cluster, they remain clustered in all the following levels of the hierarchy.
In order to define which objects are the most similar, two criteria need to be defined (Xu and Wunsch, 2005): (i) a metric, i.e. a measure of distance between objects, and (ii) a merging rule. At each step i the pair of objects that are situated at the closest distance (according to a certain merging rule) are merged together.

Distance metric
Let x and y be two objects, or vectors, defined in a ddimensional space. They therefore have d components: A list of common distance metrics used to measure the distance D(x, y) between x and y is provided in Table 1 and it is a good metric to evaluate the similarity between x and y when all the d components have the same order of magnitude. Conversely, the "correlative distance" (see Table 1) is less affected by unbalanced components but might be ill-defined when d is small.

Merging rule
The second concept to be introduced is the merging rule. A merging rule defines the criteria that an object x, or a cluster of objects C I (a group of objects x ∈ C I ), has to satisfy in order to be merged with another cluster C J . In other words, it generalizes the concept of distance between single objects of Table 1 to distances between two clusters, or between a cluster and a single object. Even though many merging rules exist, in this paper we present the weighted pairwise average (WPA) and weighted centroid (WC) rules (Jain and Dubes, 1988): -WPA defines the distance between C I and C J as the average distance between couples of objects belonging to the two clusters, weighted by the number of objects in each subcluster. In this case the definition of distance between clusters, employed as merging rule, is recursive. As an example, given where n K and n L are the number of objects contained in the clusters C K and C L , respectively.
-WC defines the distance between clusters as the distance between the (weighted) centroids of each cluster. The centroid is the centre of mass of a cluster C I . It is computed as the average position of all the subclusters C K ⊂ C I , weighted by the number of objects in each C K . Thus, where x C I is the weighted centroid of cluster C I , defined as All hierarchical cluster methods start with N objects distributed into N clusters, and they end with N objects in one single cluster. The key point of any clustering method is therefore the selection of the optimal intermediate partition, named n opt , between the starting and the ending point. A universally applicable criterion to guide this choice does not exist. This selection is usually performed by taking into account the compactness of the clusters, their relative separability (Halkidi et al., 2002), and the available prior knowledge about the data undergoing clustering (Wilks, 2011).

Data and processing
The present section provides a description of the data employed in the following analysis, and some details about data processing.

Data source
The polarimetric radar data considered here were collected with an X-band dual-polarization Doppler weather radar (MXPol), whose characteristics are summarized in Table 2.
In the present work we employ radar data collected during two field deployments. The first one took place in Davos (CH), in the Swiss Alps, from September 2009 to July 2011. The radar was deployed at 2133 m a.s.l. on a ski slope dominating the valley of Davos, as shown in Fig. 1a. The altitude of the deployment site made it possible, during cold seasons, to collect many observations of ice-phase precipitation when the radar itself was located above the melting layer and therefore did not suffer from liquid-water signal attenuation. Such radar observations represent the main peculiarity of this field campaign (e.g. Schneebeli et al., 2013;Scipion et al., 2013). However, during warm seasons, the melting layer was often higher than the radar site and relevant observations of liquid phase precipitation, both in stratiform and convective cases, were collected as well. The climate of the Davos region is characterized by approximately 130 days of precipitation per year and total yearly accumulations of about 1100 mm. The most intense snowfall events in winter are associated with north-westerly fluxes (Mott et al., 2014). The scanning sequence of the radar, repeated approximately every 5 min, included plan position indicator (PPI) sector scans over the valley of Davos (at elevation angles of 0,2,5,9,14,18,20, and 27 • ), a range height indicator (RHI), and a vertically pointing PPI used for the zeroing of Z DR .
The second field deployment, shown in Fig 1b, took place in the Ardèche region (FR) from September to November 2012, at an altitude of 605 m a.s.l.. This deployment was part of the HYdrological cycle in the Mediterranean EXperiment (HyMeX) experiment (www.hymex.org; Ducrocq et al., 2014;Bousquet et al., 2014). Stratiform and convec-tive Mediterranean precipitation events were sampled during this campaign, with the radar always located below the melting layer. Convective precipitation included vigorous thunderstorms with intense electric activity. In Ardèche, precipitation (in the fall season) is mainly associated with eastwardmoving troughs from the Atlantic region that are at first slowed by the anticyclonic system over Russia and interact with the complex topography of the coastal region in the south of France (Miniscloux et al., 2001;Boudevillain et al., 2011). The scanning sequence of the radar included, in this case, wider (200 • in azimuth) sector scans at elevation angles of 3.5, 4, 6, 9, and 10 • . Additionally, two or three RHIs towards different directions and a vertically pointing PPI were collected during each cycle of 5 min.

Two-dimensional video disdrometer data
The novel hydrometeor classification method proposed in this work is entirely based on radar data. However, in the following sections we will use, for validation purposes, data collected by a two-dimensional video disdrometer (2DVD; see Kruger and Krajewski, 2002). One 2DVD (secondgeneration, "low"-profile version) was deployed during the Davos field campaign at an altitude of 2543 m and at a horizontal distance of 5.2 km from MXPol, as shown in Fig. 1a. The 2DVD is a disdrometer that measures surface precipitation with a sampling area of about 125 cm 2 . It provides the fall velocity as well as a couple of orthogonal two-dimensional views of any object crossing the sampling area, and its measurements have been recently used to perform ground-based hydrometeor classification .

Polarimetric data
The polarimetric variables calculated from the measurements of MXPol and employed in the following analysis are Z H [dBZ], Z DR [dB], K dp [ • km −1 ], and ρ hv [-].
Z H and Z DR are corrected for attenuation (in rain only) using the relations linking K dp , Z H , specific horizontal attenuation α H [dB km −1 ], and differential attenuation α DR [dB km −1 ] according to the method of Testud et al. (2000). The power laws between these variables are parametrized using disdrometer measurements for the data collected in France (locations shown in Fig. 1b) and using simulated realistic drop size distribution fields (Schleiss et al., 2012) for the data collected in Switzerland. The set of observations corresponding to events during which the radar was located above the melting layer were not corrected for attenuation, assuming the attenuation in dry snow to be negligible (Matrosov, 1992).
K dp is estimated from the total differential phase shift dp [ • ] using a method based on Kalman filtering . The algorithm is designed to ensure the independence between K dp estimates and other polarimetric variables and to capture the fine-scale variations of K dp . All the polarimetric variables are censored with a mask of signal-tonoise ratio SNR > 8 dB, and all the radar range gates potentially contaminated by ground clutter are censored as well, by means of a threshold of 0.7 on ρ hv .

Clustering of polarimetric radar data
Hierarchical clustering is applied to radar observations (objects) x, defined in the multidimensional space of the polarimetric variables. Here we present in detail our clustering approach, and we apply it to the database of Sect. 3.

Data preparation
The data object x is a five-dimensional vector defined for each valid radar resolution volume. The components of x are The last component (x[5] = z) is not a polarimetric variable, and it is defined as where z i [m] is the altitude above sea level of the ith resolution volume and z 0 • is the estimated altitude of the 0 • C isotherm, taken as a reference. A positive z refers to a measurement collected at temperature ranges where ice-phase hydrometeors are expected, while a negative one refers to a measurement likely taken in liquid-phase precipitation. This variable is used as prior information for the clustering algorithm in order to take into account the approximate environmental conditions associated with each measurement. The altitude of the 0 • C isotherm z 0 • is approximated by means of the linear interpolation of ground-based temperature measurements collected at a distance ≤ 40 km from the radar location and by assuming a constant lapse rate with altitude. It could also be estimated directly from other sources, such as soundings, numerical models, or radar data directly, when a melting layer is sampled.
The vector x is not yet suitable to undergo cluster analysis. Two issues need to be tackled.
1. The skewed distribution of K dp values. At X-band, K dp ranges approximately from −1 to 15 • km −1 (e.g. Otto and Russchenberg, 2011;Schneebeli and Berne, 2012), but its probability distribution, calculated over a large set of observations, is positively skewed, with typical modal values below 0.5 • km −1 . This issue is tackled by log-transforming K dp values. Before log-transforming we add 1 • km −1 to K dp in order to consider K dp values in the range [−1,15] 3 .
2. Due to the differences in their units, the different radar variable fields contained in x have a typical range of values that differs by several orders of magnitude. For instance, Z H can vary over tens of decibels relative to Z., while Z DR and K dp are smaller by one order of magnitude and ρ hv even by two orders of magnitude. This issue is tackled by means of data standardization (stretching). Even though a classical approach would be to use a z-score transformation, based on mean and standard deviation of a sample of data (e.g. Wilks, 2011), we selected a method based on minimum and maximum boundaries that allows us to preselect physically relevant bounds. The components x[i] * of the standardized data are obtained as is the minimum (maximum) bound allowed for each polarimetric variable. The boundaries employed in the present study are −10 to 60 dBZ for Z H , −1.5 to 5 dB for Z DR , −3 to 3 for the logarithmically transformed K dp , and 0.7 to 1 for ρ hv ( z is considered in the next paragraph). Variations of the order of ±20 % around the proposed boundaries have a negligible impact on the results presented in the following sections, and the most sensitive boundaries are associated with Z H .
z is stretched within a smaller range of variation in the following way: κ is a scaling factor and f ( z) denotes any monotonically increasing functional form that gives continuity to Eq. (9). Gaussian, sigmoid, and logistic functions have been tested and appeared to be equally adequate. The threshold of ±400 m is the (rounded) standard deviation of z 0 • estimates. The reason for a different standardization of z is to reduce the weight of this nonpolarimetric input in the clustering process: this parameter is intended only to flag positive and negative temperatures in a quasi-binary way and not to substitute the information provided by the polarimetric variables (therefore, κ is kept strictly ≤ 1). κ factors ranging between 0.3 and 0.9 lead to similar outputs, and an intermediate value of 0.5 was used.
With the standardization detailed in Eqs. (8) and (9), the radar observations collected at each radar range gate are summarized by the observation vector x * , whose entries are now expressed with a similar order of magnitude.

Subset undergoing clustering analysis
Agglomerative clustering algorithms are generally computationally expensive, because the distances between all samples (and then groups) to be clustered are computed at each step of the hierarchical aggregation chain. Therefore, we opted to define the clusters using a subset of the data and then assign the whole data set to these clusters using a nearestcluster rule (e.g. Volpi et al., 2012).
About 50 precipitation events belonging to the data set of Sect. 3 were manually selected. These events cover the range of precipitation types observed by MXPol during the field campaigns of Davos (CH) and Ardèche (FR), and they are assumed to be a representative sample of midlatitude temperate precipitation.
A subset of data is taken randomly from these 50 precipitation events from PPI scans conducted at elevation angles between 3.5 • and 10 • (free of ground clutter contamination). This amount, consisting of 20 000 observations x * (defined in Eqs. 6, 8, and 9), is used as input to the subsequent cluster analysis. Different seeds of the initial random selection led to the same results, suggesting that the random sampling does not affect the outcome of the clustering technique presented in the next section.

Clustering algorithm: data similarity and spatial smoothness
An AHC is applied to the polarimetric data set of x * objects in order to obtain an optimal partition of the data into a set of clusters. This technique is a trade-off between purely data-driven clustering, as it was described in Sect. 2 (that only looks for similarity in the five-dimensional feature space of x * ), and spatial smoothness of the partition in the physical space. In other words, hydrometeor classes should contain both objects that are similar to each other (data-wise) and that also exhibit spatial consistency, since we assume spatial smoothness of the geographic distribution of precipitation types. Here, and in the following, we will refer to a Euclidean distance metric and WPA merging rule. Of the other possible combinations of the distance metrics and merging rules presented in Sect. 2, similar results were obtained with the correlative distance and WC rule. The method developed in the present paper is sketched in the flow chart of Fig. 2. Panel (a) of the figure is explained step by step in the following sections.

Step1: Fig. 2a1
Initially the 20 000 selected objects populate n c = 20 000 clusters. A first hierarchical aggregation is conducted on the data, until reaching a number of 1 000 clusters in the data set 4 . This step aims at merging the most similar objects before proceeding with more computationally expensive calculations.

Step2: Fig. 2a2
Given the remaining n c = 1000 clusters, referred to as C L (L = 1, . . .n c ), we proceed to the classification of the entire PPI images from which the original 20 000 objects were extracted. Let x * p ∈ C L (L = 1, . . .n c ) be an object taken from one of the PPI images and not belonging to any cluster C L . This object is now classified into one of the n c clusters available, specifically the one related to the minimal distance to the object (according to the given merging rule). We proceed until all the objects of the PPI images are classified into one of the n c clusters available.
At this point we evaluate the spatial smoothness of the partition into n c clusters. Each object x * p has been assigned to a cluster C M (1 < M ≤ n c ). We start by defining a spatial smoothness index (SSI) associated with x * p . This index evaluates the spatial consistency of the classification of an object with respect to the classification of its neighbouring objects: where where n NN (number of nearest neighbours) is the number of nearest objects considered in the construction of SSI and x * i(p) indicates the ith nearest object of x * p . In the present work n NN = 4, and very similar results are obtained for n NN = 2, 4, 8. The identification of the nearest neighbours is performed in polar coordinates, and the distance between objects is the distance between their respective radar resolution volumes. SSI ranges between 0 and 1. If all the n NN objects belong to the cluster C M , then SSI is equal to 1. SSI indices are calculated for each x * p , and they are summarized in a n c × n c matrix M, hereafter called spatial smoothness matrix. The elements M I,J of M are defined as where N I is the total number of objects x * p satisfying the condition x * p ∈ C I . The matrix M is conceptually similar to a confusion matrix, commonly used to evaluate the goodness of categorical classifications (e.g. Wilks, 2011). Diagonal entries M I,I quantify the spatial smoothness of the cluster C I , while the off-diagonal terms M I,J (I = J ) quantify the probability of objects belonging to a cluster C I to be surrounded by objects of the cluster C J .
Analogously to a confusion matrix, the information contained in M can be further summarized by means of quality indices. As an example, Cohen's kappa can be used to evaluate the global spatial smoothness of a partition of the data set into n c clusters. Cohen's kappa is defined as where and N is the total sum (over rows and columns) of all the elements of M. Kappa ranges from −1 to 1 and increases as the level of spatial smoothness increases. Furthermore, it is a robust estimator in the case of unbalanced clusters. In fact, it takes into account the globally observed spatial smoothness (SSO) as well as the contribution occurring by chance, namely S est . Kappa evaluates the global spatial smoothness of a partition, but the smoothness of each cluster C M can be evaluated individually. For this purpose we define the spatial smoothness per cluster (SS M ) index:

Step 3: Fig. 2a3
At this stage, the set of observations is divided into n c clusters, and the spatial smoothness of this partition has been evaluated. A classical hierarchical approach would now proceed by merging the two most similar clusters data-wise, reducing the total number of clusters to n c −1 at each iteration. In our case, we make additional use of the information provided by Eq. (15). Let the cluster C W with the lowest spatial smoothness score be defined as The cluster C W is forced to disappear, and it is merged with the most similar (data-wise) one according to the linkage method and the distance metric selected.
In this way, at each step of the AHC, spatial smoothness is used to identify the cluster that exhibits the highest spatial discontinuity (lowest spatial smoothness), while data similarity is used to merge it with one of the other n c − 1 available clusters. The reader should be aware that different constraints on spatial smoothness could be implemented at this stage, and the constraint used in this work is a specific example. The aggregative algorithm detailed in steps 1-3 recursively repeats step 2 and step 3 until n c = 2.

Selection of the optimal cluster partition
An important step of hierarchical clustering is the selection of the optimal partition (n opt ) of the data set. In the present section we introduce some indices that evaluate the quality of data partitions and that are used to guide the final selection of n opt .

Cluster quality metrics
The spatial quality of each partition of the data set is quantified by means of two indices: 1. Kappa (Foody, 2004), defined in Eq. (12); Kappa quantifies the global degree of spatial smoothness of a given partition.
2. The accuracy spread index (AS), derived from Eq. (15) as follows: This index evaluates the inhomogeneity of the spatial characteristics of a partition into n c clusters. The lower it is, the more homogeneously the n c clusters perform in terms of spatial smoothness. Lower values are therefore associated with better partitions.
Other indices can be employed to evaluate each partition from the point of view of data similarity only. Most of these indices evaluate the scattering inside each cluster with respect to the distance between clusters, and they assign relatively better scores to partitions with compact and wellseparated clusters. In the present work we employ one index of this kind: the SD index (e.g. Halkidi et al., 2002). SD takes into account the average scattering of the clusters (Scat) and the total separation between clusters (Dist). For a partition of the data set into n c clusters, Scat is defined as where the vector σ (C L ) is the total variance of the Lth cluster C L , the vector σ Data is the total variance of the data set, and the ||•|| 2 operator is the 2-norm, defined in Table 1. Note that, in a d-dimensional space, these quantities are vectors and not scalar. The separation between clusters (Dist) is defined as where a is a normalization factor, equal to Dist(n max ), that forces Scat and Dist to be of the same order of magnitude. SD takes lower values for compact (low Scat) and well-separated (low Dist) partitions; therefore, the optimal number of clusters n opt in a database should exhibit a minimum SD. An optimal solution is selected here when n c = n opt = 7 clusters. In fact, we can observe that n c = 7 corresponds to a local minimum for both the SD index and the AS index. When n c = 7 the spatial behaviour of the seven clusters is the most homogeneous (low AS) and the trade-off between compactness and separability of the clusters is optimal (lowest SD). . Tech., 8, 149-170, 2015 www.atmos-meas-tech.net/8/149/2015/

From unlabelled clusters to hydrometeor classes: Fig. 2c
This section is devoted to the interpretation of the output of the clustering algorithm (Fig. 2c) that is still unknown at this step of the method.

Global characteristics of the clusters
The seven clusters corresponding to the optimal partition of our database contain a set of observations (or objects) that have been grouped according to spatial smoothness and data similarity. These clusters exist in the five-dimensional space given by the dimensions of x * , and it is therefore not trivial to illustrate their content. A way to reproduce a partial visualization of the clusters is to display pairs of two-dimensional projections of the objects x while keeping in mind that their original nature is five-dimensional. Some of these projections are displayed in Fig. 4, in which the seven clusters are colourcoded and labelled with a hydrometeor type. Additionally, Table A1 (numerically) and Figs. 5 and 6 (graphically) provide the one-dimensional distribution of polarimetric variables within each cluster. By looking at Fig. 4c, we observe that three clusters contain data collected only where the relative altitude with respect to the local 0 • C, i.e. z, is positive (negative temperatures), three clusters contain data collected always where z is negative (positive temperatures) and one cluster contains mostly data collected where z ≈ 0.
In the following sections, we will proceed by interpreting the clustering results separately for clusters appearing on average at z ≤ 0 and z > 0, and we will assign a hydrometeor class to each of the seven clusters.

Clusters at positive temperatures
Three clusters (red, green, and dark blue) in Fig. 4 are identified at positive temperatures. It is therefore assumed in a first approximation that they are related to liquid-phase precipitation. In order to properly associate each of them to a more specific category, further analysis is performed. At first, the data classified into one of these three categories are extracted from the observations collected in Ardèche (HyMex campaign, Sect. 3) from PPIs taken at elevation angles ranging between 3.5 and 10 • . Then, the rainfall rate R [mm h −1 ] associated with each measurement is computed by means of the following relations (Otto and Russchenberg, 2012): where ζ H = 10 0.1Z H [mm 6 m −3 ] (i.e. the horizontal reflectivity expressed in linear units). The distribution of R stratified for each class is summarized in Table 3. The green cluster is characterized by extremely low rainfall intensity, and therefore it is associated hereafter with a hydrometeor class named light rain (LR). This cluster contains mainly precipitation characterized by small spherical drops. It is worth noting Table 4. Confusion matrix comparing the classification of liquid phase hydrometeor classes ( z < 0) obtained with the clustering method described in the paper and the output of the fuzzy-logic method DR2009, described in Appendix B. The classes of the novel method are light rain (LR), rain (RN), and heavy rain (HR). The elements M i,j of the matrix contain the percentage of liquid phase observations classified in the ith class of the first method and simultaneously in the j th class of the second method. The data are obtained from 100 runs of the clustering algorithm.
Novel method DR2009 LR RN HR Drizzle 59 % 1 % 0 % Rain 7 % 29 % 6 % ( Fig. 5b and c) that LR contains Z DR values lower than 1 dB, with a mode around 0.25 dB, and K dp values always close to 0 • km −1 . LR therefore contains drizzle and the lightest rainfall intensities. The dark blue cluster is characterized by low to intermediate rainfall intensity, and therefore it is associated with a category named rain (RN). Finally, the red cluster contains by far the highest rainfall intensities, and it is hereafter called heavy rain (HR). We also hypothesize that, when hail occurs, it is classified as HR. We base this assumption on the fact that HR includes observations with a low-correlation coefficient ρ hv (Fig. 5d) as well as near-zero or negative Z DR (Fig. 5b). These signatures have been documented in cases where hail was measured by polarimetric weather radars (Al-Sakka et al., 2013).
As an additional test, the classification output of our method is compared with a fuzzy-logic classification scheme based on the parametrization of Dolan and Rutledge (2009), hereafter DR2009 (see Appendix B for the details). DR2009 does not provide three "liquid-phase" hydrometeor classes but only rain and drizzle. The contingency table of Table 4 shows that the HR class of our method is entirely classified as rain by DR2009. The RN class is mainly classified as rain, and LR is almost entirely associated with drizzle. We conclude that results from the proposed method agree well with DR2009 for liquid phase hydrometeor classes. Figure 7 illustrates a case where LR, RN, and HR are classified on the same PPI radar image. This case was collected on 24 September 2012 during the HyMeX campaign, when a high-intensity convective line was approaching the radar location from the west side of the domain (more details about the storm in Bousquet et al., 2014). This resulted in a layer of high values of Z H , Z DR , and K dp . The transition from LR to HR within few kilometres appears qualitatively to be a satisfactory illustration of the incoming front. Figure 7 also shows a map of classification accuracy. This parameter is defined for each observation (valid range gate) as the difference between the distance of the observation with respect to the two closest clusters, and it is normalized by the smaller distance. The classification accuracy is therefore lower in the ar-eas of transition between different hydrometeor types, where the polarimetric signatures change as the dominant hydrometeor type changes.

Cluster around 0 • C
The yellow cluster of Figs. 4 and 5 appears on average around the 0 • C isotherm, and it is interpreted as melting snow (MS). Figure 8 shows an example of classification output, where a melting layer is clearly visible in the polarimetric observations. The MS category can be seen to delimit the transition between ice-phase and liquid-phase hydrometeors. The signatures of this transition can also be seen in Z H , Z DR , and ρ hv . K dp does not exhibit any obvious signature in the regions classified as MS (in agreement with observations documented by Thompson et al., 2014).

Clusters at negative temperatures
The clusters identified at negative temperatures (dark green, pink, and cyan clusters in Fig. 4) should be attributed to ice-phase hydrometeors. To classify these clusters, we proceed as follows: first we examine the behaviour of the polarimetric variables within these three clusters, then we compare the classification with the output of DR2009. Subsequently we compare the classification with qualitative (hydrometeor classification) and quantitative (snowfall intensity) observations provided by a two-dimensional video disdrometer (2DVD) and with the output of a numerical weather prediction model (Consortium for Small-scale Modeling (COSMO)). Figure 6 presents the distribution of the polarimetric variables Z H , Z DR , K dp , and ρ hv , as well as the relative altitude z for the three "ice-phase" clusters. By looking at panel (a), we can observe a clear Z H signature. Z H is the lowest in the cyan cluster (mode ≈ 12 dBZ), it is slightly higher in the pink cluster (mode ≈ 15 dBZ), and it is the highest in the dark-green cluster (mode > 20 dBZ). Higher Z H indicates higher hydrometeor concentration, size, and/or ice density.

Polarimetric signatures
Z DR , shown in panel (b), exhibits a different pattern. The cyan cluster and the pink cluster show some variability in Z DR . Z DR ranges from −0.3 to 2.5 dB (mode 0.8 dB) in the cyan cluster and from −0.3 to 1.6 dB (mode 0.5 dB) in the pink cluster. We interpret this behaviour as the signature of particle shape and orientation variability in the cyan and pink clusters, with the pink cluster containing on average hydrometeors that are more geometrically isotropic. The dark-green cluster behaves differently: Z DR has a clear mode around 0.3 dB, the distribution of Z DR values is narrow (ranging between −0.6 and 1 dB), and it includes many negative values, i.e. prolate particles. ρ hv has a clear signature for the cyan cluster only, characterized by low values that often depart from 1. We interpret this behaviour as an additional effect of the variability of particle shapes within the radar resolution volume.
K dp , shown in panel (c), is lower than 1 • km −1 for the pink cluster and the cyan cluster. The dark-green cluster exhibits instead relatively large values of 2.5 • km −1 . K dp depends on size, concentration, shape, and density of the particles in the radar resolution volume and therefore the dark green cluster contains, on average, more oblate hydrometeors and/or oblate hydrometeors of a larger size and density.
Finally, by looking at panel (e), we observe that the darkgreen cluster is found over a broad range of altitudes (temperatures) and that the cyan cluster generally appears at lower temperatures than the other two.
From this analysis, we observed that the three clusters exhibit distinct polarimetric signatures, which led us to hy-pothesize the following associations. The cyan cluster corresponds to individual crystals and small aggregates (denoted CR): it appears in the coldest areas of precipitation, it shows significant variability of shapes, and low-intensity Z H returns due to the low concentration and small size of the hydrometeors. The pink cluster corresponds to aggregates (AG). Aggregates generate larger Z H returns due to their larger sizes, and they tumble as they fall, thus lowering Z DR . The dark green cluster corresponds to heavily rimed-ice particles (RI). The larger density of rimed particles lead to significant Z H signatures, and the dielectric properties of dense ice (very different with respect to dry crystals and aggregates; Vivekanandan et al., 1994) lead to a response also in K dp . Z DR is low because riming tends to smooth particle shapes, and it shows negative values when conically shaped rimed particles are formed (Evaristo et al., 2013). These hypotheses are discussed in the next sections.   Table 4 but comparing the classification of ice phase hydrometeor classes ( z > 0). The classes of the novel method are crystal (CR), aggregates (AG), and rimed-ice particles (RI). The elements M i,j of the matrix contain the percentage of ice phase observations classified in the ith class of the first method and simultaneously in the j th class of the second method. The data are obtained from 100 runs of the clustering algorithm.

Comparison with DR2009
In Sect. 6.2 we compared the liquid-phase clusters of our method with the output of DR2009. We now perform a similar evaluation, focussing on the ice-phase clusters. As a re-minder, our method provides three ice-phase classes: crystal and small aggregates (CR), aggregates (AG), and rimedice particles (RI). DR2009 instead provides five ice-phase classes: crystals (CR), aggregates (AG), high-density graupel (HDG), low-density graupel (LDG), and vertically aligned ice (VI, which denotes oblate ice crystals aligned vertically because of an electric field). The contingency table between these categories is shown in Table 5. We observe that the methods are in overall good agreement. The CR class is associated mostly with the DR2009 classes of aggregates, crystals, and vertical ice. AG is associated with aggregates, and RI is associated with the two graupel categories of DR2009. The only notable discrepancy between the methods happens for the high-density graupel category of DR2009: this class is evenly distributed among CR, AG, and RI, indicating that there is not a clear match for this hydrometeor type.

Comparison with 2DVD classification output
An additional comparison is conducted with the output of an HC scheme developed for two-dimensional video disdrometers. This method, hereafter called HC2DVD is described in detail in Grazioli et al. (2014). HC2DVD takes as input a set of two-dimensional particles images, collected by a 2DVD, and it provides as output an estimate of the dominant hydrometeor type within time intervals of 60 s. The method does not classify individual particles but populations of hydrometeors. HC2DVD discriminates between eight hydrometeor classes: small particle-like (SP), dendrite-like (D), column-like (C), graupel-like (G), rimed-particle-like (RIM), aggregate-like (AG), melting-snow-like (MS), and rain (R) 5 . The "-like" is added to underline that this algorithm assigns 5 We use here the abbreviations of Grazioli et al. (2014).
a dominant type of hydrometeor to each time step, but there is usually a mixture of different hydrometeors captured by the 2DVD so they do not necessarily exhibit a single pristine shape.
Here we compare HC2DVD with the output of the clustering algorithm for snowfall events collected during the campaign of Davos 2009. The PPI of the lowest elevation not contaminated by clutter was taken at 9 • elevation with a repetition interval of 5 min. This PPI is used for a comparison with HC2DVD.
Before discussing the comparison, it must be kept in mind that (i) the closest radar resolution volume centre was about , K dp [ • km −1 ], and ρ hv [-]. The altitude of the radar is 605 m. Table 6. Confusion matrix comparing the classification of ice phase hydrometeor classes during the measurement campaign of Davos as estimated from the clustering method and from the 2DVD (HC2DVD; Grazioli et al. (2014)), taken as ground reference. The comparison is conducted on three hydrometeor classes: crystal (CR), aggregates (AG), and rimed-ice particles (RI). In this case, the matrix is 3 × 3, with similar classes in rows and columns, and it can be used to evaluate quantitatively the agreement among methods. The overall accuracy of the comparison is 49 %, and Cohen's kappa is 0.23.

HC2DVD
CR AG RI HC2DVD-CR 27.9 % 13.9 % 0.5 % HC2DVD-AG 4.5 % 18.2 % 1.0 % HC2DVD-RI 15.7 % 16.5 % 1.8 % 400 m above the 2DVD and crystal habits can change over this altitude range, and (ii) the sampling times and volumes of the two instruments are different, even though the sampling times overlap.
The comparison is conducted on a subset of about 30 manually selected snowfall events. We excluded any precipitation event with a visible melting layer or positive temperatures at the radar location as well as any event characterized by evident spatial and temporal variabilities on the small scale. Radar resolution volumes within 150 m in horizontal distance from the 2DVD location are compared with the HC2DVD output. A buffer of +2 min is applied in order to match multiple 2DVD observations with a single radar scan.
In order to simplify the comparison, we aggregate some of the categories from HC2DVD as follows. We merge together small particles (SP), dendrites (D), and columns (C) into a single class called "crystals" (HC2DVD-CR). We keep aggregates (AG) in a single class, and we name it HC2DVD-AG. Finally, we merge graupel (G) and rimed particles (RIM) into a "rimed-ice" class (HC2DVD-RI). Table 6 presents the confusion matrix of the comparison between the novel clustering algorithm and HC2DVD. The agreement between CR and HC2DVD-CR is very good, as is the agreement between AG and HC2DVD-AG. Rimedice particles, in contrast, exhibit a good accuracy of detection (when they are detected, their presence is confirmed by HC2DVD), but they are subject to a large number of missed detections. If we consider HC2DVD as ground truth, the overall accuracy of the comparison is 49 % and Cohen's kappa is 0.23.
Similarly, we now compare HC2DVD with the classes of DR2009. We merge crystals (CR) and vertical ice (VI) in a single class called "crystals" (DR2009-CR), we keep aggregates (AG) in a single class (DR2009-AG), and we merge low-density graupel (LDG) and high-density graupel (HDG) into a "rimed-ice class" (DR2009-RI). Table 7 shows the confusion matrix of the comparison between DR2009 and HC2DVD. In this case, the overall accuracy is 38 % and Cohen's kappa is 0.08. We may conclude that, with HC2DVD as a reference, the newly proposed approach outperforms DR2009.
A complete evaluation of the uncertainties related to such comparisons is beyond the scope of the paper, and we suggest that the reader consider these results as largely qualitative.

Comparison with 2DVD in terms of snowfall intensity
Additionally to the qualitative information provided by HC2DVD, the 2DVD observations can be used in a quantitative way to investigate the relation between the content of the three clusters and the intensity of snowfall. Here we quantify the snowfall intensity by means of an equivalent flux (EF), De 3 where t is the temporal resolution at which EF is calculated (1/60 h here), A is the measurement area of the instrument [mm 2 ], N ( t) is the number of particles recorded in t, and De i the equivalent diameter of each snowflake or particle [mm], as defined in Huang et al. (2010). Given the assumptions in the estimation of De, EF can be erroneous in absolute terms, and therefore it is used here to compare the content of the three clusters in relative terms only. Figure 9 shows the distribution of EF as measured by the 2DVD and related to the occurrence of CR, AG, and RI. It can be seen that the snowfall intensity differs among these three. CR exhibits the lowest intensities, AG intermediate ones, and RI the highest intensities. This is the expected behaviour of rimed particles. As riming progresses, the original shape of ice particles becomes imperceptible, the drag decreases, and the particles become more dense (Pruppacher and Klett, 1997). Hence, their fall behaviour is smoother and therefore faster, leading to larger EF. The results presented in this section are not a rigorous validation, but they are in agreement with our initial assumptions. (up to 28 dBZ). CR is instead characterized by low values of Z H and ρ hv (as low as 0.9) and very low values of K dp , between −0.1 and 0.1 • km −1 . In this example, AG exhibits polarimetric signatures that are somewhat intermediate between CR and RI. For illustrative purpose, the situation corresponding to Fig. 10 was simulated using the numerical weather model COSMO (see http://www.cosmo-model.org), operationally used by MeteoSwiss. The model was run at 2 km resolution with forcing from MeteoSwiss reanalysis. As shown in Fig. 11, COSMO predicts the presence of supercooled liquid water (QC) at altitudes between 2.5 and 3 km. Additionally, at altitudes between 2 and 6 km, we observe large quantities of graupel (QG) mixed with snow (QS). Both the presence of supercooled liquid water in the clouds and the explicit presence of graupel are in agreement with the layer of rimed particles RI identified in Fig. 10.

Summary and conclusions
A novel approach to hydrometeor classification from a polarimetric weather radar was presented in this paper. The method was applied to polarimetric data collected by an Xband radar in the Swiss Alps and in the French Prealps. The novel approach was not based on numerical-scattering simulations. The number of hydrometeor classes was not defined a priori, but it was learned from the data, and the content of each hydrometeor class was manually interpreted.
A subset of 20 000 polarimetric observations was randomly extracted from the available data set. A hierarchical clustering algorithm with spatial constraints was applied to the subset in order to merge observations according to both the similarity of polarimetric data and the spatial smoothness of each partition. This means that we made the assumption of smooth spatial transitions between hydrometeor types. Following this strategy, an optimal number of seven clusters was found. Three clusters were found at positive temperatures, and they were interpreted as light rain (LR), rain (RN), and heavy rain (HR). One cluster appeared systemat- ically around 0 • C, and it was associated with melting snow (MS). Finally, three clusters were found at negative temperatures and their polarimetric signatures were interpreted as being of crystals/small aggregates (CR), aggregates (AG), and rimed-ice particles (RI). The content of the clusters agrees well with the outcome of a fuzzy-logic algorithm, denoted as DR2009 (Dolan and Rutledge, 2009). Additionally, the novel approach obtained scores better than DR2009 when compared to a ground-based (video-disdrometer-based) hydrometeor classification scheme, hence suggesting that the new method was better tailored to the observations of the Xband radar employed in this study.
The proposed approach is the first attempt, using unsupervised classification, to move the starting point of a classification algorithm away from scattering simulations conducted over an arbitrarily defined number of hydrometeor classes to the identification of relevant clusters in the data themselves. The initial identification of the clusters is computationally expensive, but this operation is performed only once and the classification of newly collected radar images can be conducted in real time.
Some of the advantages of this approach are that it is immune to possible radar miscalibration and that the datadriven approach ensures that the identified clusters take into account the accuracy of the instrument. Finally, the method is adaptable to other radar systems and can be tuned to include other constraints regarding the spatial smoothness of the partition or temporal consistency. The main limitations of the method are related to the manual interpretation of the content of the clusters. This may not be trivial, especially in the absence of surface precipitation type reports for comparison. Additionally, the method is as representative as the available database is, and the clusters identified are a priori valid only for the instrument employed to collect the data. We nevertheless expect the number and type of clusters to be very similar for other X-band dual-polarization radars of similar sensitivity.
It is interesting to note that the method exploits a simple hypothesis about the spatial smoothness of the hydrometeor types and that this rule is applied only in the initial steps (when the n opt clusters are identified). Future work will be devoted to also extending the constraints involving spatial smoothness to newly classified images or to including physically justified contiguity rules for specific hydrometeor types. In addition, this clustering approach (or some steps of the approach) could be employed as a support to fuzzy-logic-based classification methods to improve or adapt the membership functions according to the clustering outputs in specific data sets. Table A1 provides the relevant statistics of each of the seven clusters identified in this work from a database of X-band radar data. Table A1. Statistics describing the content of the seven clusters identified in Sects. 5 and 6. For each polarimetric variable and for each cluster, we provide the mean value, standard deviation σ , and a set of quantiles (Q1 %, Q5 %, 10 %, Q25 %, Q50 %, Q75 %, Q90 %, Q95 %, Q99 %).

Appendix A: Polarimetric characteristics of the seven clusters
Var. Class

Appendix B: DR2009 algorithm
The algorithm denoted as DR2009 in this paper is based on the work of Dolan and Rutledge (2009), with some adaptations that we will highlight as they appear in this section. In this Appendix, we provide the exact parametrization of the membership functions for the fuzzy-logic scheme, as well as the weights assigned to each polarimetric variable. The input variables of the algorithm are Z H [dBZ], Z DR [dB], K dp [ • km −1 ], ρ hv [-], and z [m], and their weights in the fuzzy-logic scheme are 0.25, 0.25, 0.25, 0.08, and 0.17, respectively. z is the relative altitude with respect to the 0 • C isotherm, as defined in Sec 4.1, and this input is not used in Dolan and Rutledge (2009). The hydrometeor classes available are aggregates (AG), crystals (CR), drizzle (DZ), high-density graupel (HDG), low-density graupel (LDG), rain (R), vertical ice (VI), and wet snow (WS; not present in Dolan and Rutledge, 2009). The membership function employed for all the polarimetric inputs is a membership beta function β, while for z a trapezoidal one is used. β is defined as where x is the considered polarimetric variable, m is the midpoint, a is the width, and b the slope. Table B1 summarizes the values of the parameters for each polarimetric variable and each hydrometeor class. The trapezoidal membership function T employed for z instead takes the form of if x < l 1 ; x−l 1 l 2 −l 1 if l 1 < x ≤ l 2 ; 1 if l 2 < x ≤ r 1 ; where l 1 , l 2 , r 1 , and r 2 define the four vertices of the trapezoid. The values for these parameters are reported in Table B2.