Development and validation of a supervised machine learning radar Doppler spectra peak-finding algorithm

Kalesse, Heike; Vogl, Teresa; Paduraru, Cosmin; Luke, Edward

doi:https://doi.org/10.5194/amt-12-4591-2019

Articles | Volume 12, issue 8

Research article

30 Aug 2019

Abstract

In many types of clouds, multiple hydrometeor populations can be present at the same time and height. Studying the evolution of these different hydrometeors in a time–height perspective can give valuable information on cloud particle composition and microphysical growth processes. However, as a prerequisite, the number of different hydrometeor types in a certain cloud volume needs to be quantified. This can be accomplished using cloud radar Doppler velocity spectra from profiling cloud radars if the different hydrometeor types have sufficiently different terminal fall velocities to produce individual Doppler spectrum peaks. Here we present a newly developed supervised machine learning radar Doppler spectra peak-finding algorithm (named PEAKO). In this approach, three adjustable parameters (spectrum smoothing span, prominence threshold, and minimum peak width at half-height) are varied to obtain the set of parameters which yields the best agreement of user-classified and machine-marked peaks. The algorithm was developed for Ka-band ARM zenith-pointing radar (KAZR) observations obtained in thick snowfall systems during the Atmospheric Radiation Measurement Program (ARM) mobile facility AMF2 deployment at Hyytiälä, Finland, during the Biogenic Aerosols – Effects on Clouds and Climate (BAECC) field campaign. The performance of PEAKO is evaluated by comparing its results to existing Doppler peak-finding algorithms. The new algorithm consistently identifies Doppler spectra peaks and outperforms other algorithms by reducing noise and increasing temporal and height consistency in detected features. In the future, the PEAKO algorithm will be adapted to other cloud radars and other types of clouds consisting of multiple hydrometeors in the same cloud volume.

Received: 08 Feb 2019 – Discussion started: 28 Mar 2019 – Revised: 04 Jul 2019 – Accepted: 10 Jul 2019 – Published: 30 Aug 2019

1 Introduction

Determining cloud composition in terms of hydrometeor populations is a nontrivial task in thick, cold precipitating clouds below 0 °C. In these clouds, supercooled liquid water droplets and solid ice crystals of a variety of shapes and sizes can coexist at temperatures between −40 and 0 °C. Mixed-phase clouds and thick, cold precipitating cloud systems play an important role in the Earth's climate, due to their strong influence on the radiative budget (Tan et al., 2016). Global climate models (GCMs) still have problems in representing mixed-phase clouds, and especially the supercooled liquid fraction (SLF), accurately (Komurcu et al., 2014).

This motivates the need for highly time- and range-resolved observations of the occurrence of different hydrometeor populations and of cloud phase in the vertical column. The first step towards characterizing hydrometeor types is determining the number of different populations within a certain cloud volume. Profiling cloud Doppler radars are well suited for this task for two reasons: (i) they are able to penetrate the complete atmospheric column (except for strongly precipitating deep convective clouds), i.e., also beyond the range where lidar is fully attenuated, and (ii) they can be used as a stand-alone means of inferring the number of different hydrometeor populations and, in certain circumstances, even cloud phase. The latter is possible because liquid cloud droplets and ice particles (and sometimes different ice particle populations) that are present simultaneously within a radar sampling volume are characterized by different terminal fall velocities due to their different particle size distributions and densities (Shupe et al., 2004; Verlinde et al., 2013; Kalesse et al., 2016; Radenz et al., 2019).

Each of these different particle size distributions thus generates a peak in the radar Doppler velocity spectrum (Kollias et al., 2016). However, sub-volume turbulence broadens the cloud Doppler spectra peaks and thus smears out the microphysical signature. Using narrow-beamwidth antennas and optimizing observational strategies with short dwell time and high vertical resolution reduces turbulence-induced spectrum broadening (Kollias et al., 2016). Nevertheless, the observed Doppler spectrum is always a convolution of microphysical and dynamical effects.
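The effect of turbulent broadening on peak separability can be illustrated with a small numerical sketch. It is not taken from the study: the two Gaussian modes, their fall velocities and widths, the 0.02 m/s bin width, and the helper names `broaden` and `count_modes` are all hypothetical choices made for illustration. The sketch convolves a bimodal "microphysical" spectrum with a Gaussian turbulence kernel and counts the local maxima that remain.

```python
import math

def gaussian(x, center, sigma):
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def broaden(spectrum, dv, sigma_turb):
    """Convolve a spectrum with a unit-area Gaussian turbulence kernel."""
    half = math.ceil(4 * sigma_turb / dv)  # truncate the kernel near 4 sigma
    kernel = [gaussian(k * dv, 0.0, sigma_turb) for k in range(-half, half + 1)]
    norm = sum(kernel)
    out = []
    for i in range(len(spectrum)):
        acc = 0.0
        for k, weight in enumerate(kernel, start=-half):
            j = i - k
            if 0 <= j < len(spectrum):  # bins outside the axis count as zero
                acc += spectrum[j] * weight
        out.append(acc / norm)
    return out

def count_modes(spectrum, floor=1e-6):
    """Count strict local maxima above a small fraction of the global maximum."""
    limit = floor * max(spectrum)
    return sum(
        1
        for i in range(1, len(spectrum) - 1)
        if spectrum[i] > limit
        and spectrum[i] > spectrum[i - 1]
        and spectrum[i] > spectrum[i + 1]
    )

# Hypothetical velocity axis (m/s) and two narrow hydrometeor modes.
dv = 0.02
velocity = [-4.0 + dv * i for i in range(301)]
micro = [gaussian(v, -1.2, 0.10) + 0.5 * gaussian(v, -0.4, 0.10) for v in velocity]

weak = broaden(micro, dv, 0.10)    # weak turbulence: modes stay separated
strong = broaden(micro, dv, 0.60)  # strong turbulence: modes merge
```

With these assumed values the weakly broadened spectrum keeps both maxima, whereas the strongly broadened one has a single maximum, which is the merged-peak situation a peak-separation technique has to cope with.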

In order to infer microphysical properties from the radar Doppler spectrum, the peaks have to be separated. Because spectra can be noisy and peaks can be merged, this is a nontrivial task, which has been approached in multiple ways for different cloud types. Shupe et al. (2004) were able to separate observed Doppler velocity spectra into a liquid and an ice spectral mode for a 30 min long altostratus case study. They empirically defined criteria, which were applied by an algorithm to distinguish multiple peaks in the radar Doppler spectra.

The Microscale Active Remote Sensing of Clouds (MicroARSCL) data product (Kollias et al., 2007; Luke et al., 2008) is generated by a post-processing routine applied to Doppler spectra recorded by the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program millimeter wavelength cloud radars. It uses the morphology of the Doppler spectrum to determine shape parameters like skewness and kurtosis for both the primary peak (highest reflectivity) and, if applicable, an additional noise-separated secondary peak (of lower reflectivity). The peak power densities and modal velocities of up to two local maxima (sub-peaks) located within the primary peak are also included. The MicroARSCL product has, for example, been used by Riihimaki et al. (2016) and Oue et al. (2018). The former used it to infer hydrometeor phase in a tropical deep convective system, the latter to study hydrometeor populations in deep precipitating systems in the Arctic. Oue et al. (2018) found multimodal Doppler spectra in a dendritic/planar growth layer as well as in mixed-phase layers. They also highlighted the added value of joint analysis of Doppler spectra and polarimetric variables from scanning cloud radar observations for snow microphysical studies.

Other studies have utilized Doppler spectra analyses to identify cloud microphysical composition and cloud processes operating in Arctic clouds. For instance, four Arctic cloud hydrometeor populations (background ice, cloud, drizzle, and new ice) were successfully classified using continuity of spectral modes in time and height combined with high-spectral-resolution lidar (HSRL) and in situ observations (Verlinde et al., 2013). Analyses of the Biogenic Aerosols – Effects on Clouds and Climate (BAECC) field campaign have also distinguished up to three noise-floor-separated peaks in the recorded Doppler spectra for frontal snow falling through a supercooled water layer (SWL) that produced rimed snowflakes (Kalesse et al., 2016). These respective peaks were then used to track microphysical processes along slanted fall streaks, although this documented case was special due to the separation of peaks by the noise floor (merged peaks are usually observed, motivating the need to develop robust cloud radar Doppler spectrum peak separation techniques). Finally, KAZR observations of liquid-only and mixed-phase clouds at Oliktok Point, Alaska, have been used to identify multiple Doppler peaks using the depth of the local minimum between the main peak and sub-peak as the main separation criterion (Williams et al., 2018).

All these efforts, using somewhat differing approaches, show that there is a need to correctly separate multiple merged peaks in Doppler spectra to aid microphysical understanding of mixed-phase cloud processes as well as to improve hydrometeor classification techniques. In the past, algorithms mimicking the feature detection skill of human experts in analyzing Doppler spectra have been shown to achieve robust results (Cornman et al., 1998), while recent studies highlight the role of machine learning as a tool for hydrometeor classification based on remote-sensing data (e.g., Besic et al., 2016; Praz et al., 2017). This study describes a new algorithm that adopts machine learning tools to classify Doppler spectra peaks in complex mixed-phase cloud scenarios.

2 Data set description

The Biogenic Aerosols – Effects on Clouds and Climate (BAECC; Petäjä et al., 2016) campaign took place at the Station for Measuring Ecosystem-Atmosphere Relations II (SMEAR II) in Hyytiälä, Finland (61°51′ N, 24°17′ E, 150 m above sea level). The ARM program deployed their second ARM mobile facility (AMF2) to Hyytiälä from February to September 2014. Within this time frame, a snowfall experiment (BAECC SNEX) took place as a collaborative effort between DOE ARM, the University of Helsinki, the Finnish Meteorological Institute (FMI), the National Aeronautics and Space Administration (NASA), and Colorado State University. An intensive operation period (IOP) from 1 February to 30 April 2014 was aimed at measuring snowfall microphysics using a comprehensive suite of remote-sensing instruments, complemented by surface-based precipitation observations.

The AMF2 comprises several ground-based remote-sensing instruments, including, among others, a 35 GHz Ka-band ARM zenith-pointing radar (KAZR), a W-, Ka-, and X-band scanning ARM cloud radar (Kollias et al., 2014), a high-spectral-resolution lidar (HSRL), and a micropulse lidar (MPL). Supplementing these measurements, radiosondes were launched four times daily. This study focuses on the Doppler spectra recorded by the KAZR and uses other observations (e.g., ground-based in situ measurements and, where applicable, HSRL) for comparison and validation purposes. The KAZR was operated with a temporal resolution of 2 s, a vertical range gate spacing of 30 m, and a Doppler velocity spectrum resolution (bin width) of 2.37 cm s⁻¹.

3 Methodology

In the following section, the supervised Doppler spectra peak detection algorithm developed in this work is introduced. This description is followed by an introduction of the other Doppler spectra peak-finding algorithms which are compared to the new algorithm.

Figure 1. Example of graphical user interface for peak-marking by hand. For the Doppler spectrum in the center panel, two peaks (red stars) were marked by the user (HK). The surrounding panels display the spatially and temporally neighboring spectra. Data are as follows: KAZR spectra observed at TMP on 16 February 2014, 0.03–0.05 UTC, between 1.0 and 1.2 km height. The red line marks the maximum noise floor determined according to Hildebrand and Sekhon (1974), and the black line the mean of the noise.
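The noise floor lines mentioned in the caption can be approximated with a short sketch of the Hildebrand and Sekhon (1974) idea in one commonly used formulation: sort the spectral powers and keep adding bins to the noise set as long as that set still looks like white noise, i.e., its variance does not exceed its squared mean divided by the number of spectral averages. The function name, the `navg` value, and the toy spectrum below are assumptions for illustration and do not describe the KAZR processing.

```python
def hildebrand_sekhon(spectrum, navg):
    """Return (mean_noise, max_noise, n_noise_bins) for linear spectral powers.

    `navg` is the number of spectral averages (an assumed input here).
    """
    if not spectrum:
        raise ValueError("spectrum must not be empty")
    ordered = sorted(spectrum)
    limit = 1.0 + 1.0 / navg
    total = total_sq = 0.0
    n_noise = 0
    for n, power in enumerate(ordered, start=1):
        trial = total + power
        trial_sq = total_sq + power * power
        # White-noise test, variance <= mean**2 / navg, written without division:
        # n * sum(p^2) <= sum(p)^2 * (1 + 1/navg)
        if n * trial_sq > trial * trial * limit:
            break  # adding this bin would no longer look like white noise
        total, total_sq, n_noise = trial, trial_sq, n
    mean_noise = total / n_noise
    max_noise = ordered[n_noise - 1]  # largest bin still classified as noise
    return mean_noise, max_noise, n_noise

# Toy spectrum: 90 noise bins around 1.0 and 10 much stronger signal bins.
toy = [0.9 if i % 2 else 1.1 for i in range(90)] + [20.0 + i for i in range(10)]
mean_noise, max_noise, n_noise = hildebrand_sekhon(toy, navg=10)
```

In this sketch `max_noise` plays the role of the maximum noise floor and `mean_noise` the role of the mean noise level; only bins above `max_noise` would be treated as signal.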

3.1 PEAKO algorithm description

3.2 Description of other radar Doppler spectra peak-finding algorithms

4.1 Training phase of the PEAKO algorithm

4.2 Testing phase of the algorithm

5.1 Summary of findings and outlook