Identifying Insects, Clouds, and Precipitation using Vertically Pointing Polarimetric Radar Doppler Velocity Spectra

,


Introduction
The vertical structure of non-precipitating clouds plays an important role in the Earth's radiation balance. These clouds absorb longwave radiation emitted from the surface and reflect shortwave solar radiation back into space (Cess et al., 1990). The proportion of these two processes determines whether these clouds act as a net radiation sink or source in the Earth's radiation budget (Ramanathan et al., 1989). Vertically pointing cloud radars have been used for decades to quantify the extent to which non-5 precipitating clouds can be used as inputs to Earth radiation budget studies to understand cloud dynamics and cloud life cycles (Moran et al., 1998;Ackerman and Stokes, 2003).
In addition to measuring cloud properties, cloud radars are sensitive enough to detect individual insects within the radar volume (for overviews, see Riley, 1989, Westbrook et al., 2014Nansen and Elliot, 2016). The field of radar entomology exploits this sensitivity by pointing polarimetric radar beams a few degrees off vertical and rotating the beam 360 degrees in azimuth to 10 estimate insect population and migration direction (Drake et al., 2020). The field of radar meteorology has used polarimetric scanning radar observations to track insect flying direction and altitude outside of clouds (Mueller and Larkin, 1985) and to estimate gust-front motions ahead of convective cells because insects and small pieces of vegetation act as radar reflectors trapped within the strong boundary layer outflow (Klingle el al., 1987). Insects are considered clutter and unwanted signals in vertically pointing cloud radar observations. Two approaches have been used to identify insects in cloud radar observations: polarimeteric signatures 15 and Doppler velocity power spectra signatures. Compared to more spherical hydrometeors (i.e., cloud droplets, raindrops, and ice particles), insects have asymmetrical shapes that produce large cross-polarization power return signals that enable insects to be identified with polarimeteric radar estimates including differential reflectivity and linear depolarization ratio. (Lohmeier et al., 1997;Khandwalla et al., 2001 andMartner and Moran, 2001). Also, large insects will have different radar cross-sections at different radar operating wavelengths due to the resonance or Mie scattering effects enabling insects to be detected in dual-20 wavelength radar observations (Khandwall et al., 2001 andKollias et al., 2002).
Insects produce unique signatures in the Doppler velocity power spectra. An individual insect scatters as a single point target with a returned power confined to a narrow Doppler velocity range and to a single range gate (Bauer-Pfundstein and Görsdorf, 2007). In contrast, clouds and precipitation are composed of hydrometeor distributions containing different size particles with different velocities that are spread over several range gates leading to broader measured Doppler velocity power spectra 25 extending over several range gates (Luke et al., 2008). The difference between insect and hydrometeor Doppler velocity power spectra signatures has been used to distinguish insect and hydrometeor peaks in Doppler velocity spectra (Bauer-Pfundstein and Görsdorf, 2007;Luke et al., 2008). In these studies, multiple peaks are first found in the spectra and then intelligent algorithms (Bauer-Pfundstein and Görsdorf, 2007) or neural network algorithms (Luke et al., 2008) were developed to classify peaks as the result of either insect or hydrometeor scattering. The method presented herein reverses the processing steps by first identifying and 30 then removing insect signatures in the Doppler velocity spectra before estimating spectrum peaks.
Identifying and removing radar scattering from insects and other sources of "atmospheric plankton" (Lhermitte, 1966) has been a known problem in developing operational cloud products (Kollias et al., 2016). The U. S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program merges observations from multiple sensors (including radars, lidars, and ceilometers) to produce an estimate of hydrometeors (i.e., cloud particles, raindrops, ice particles) in the vertical column, called 35 the Active Remote Sensing of CLouds (ARSCL) product (Clothiaux et al., 2000). ARSCL is a high temporal (~4 s) and vertical (~30 m) resolution operational product that primarily uses ceilometer cloud base and radar moments to classify all returns into one of three scattering regimes: hydrometeor-only scattering, clutter-only scattering (due to insects or another non-atmospheric artifact), and scattering due to a mixture of hydrometeors and clutter within the radar pulse volume. An approximate estimate of maximum clutter height is provided to an automated heuristic algorithm developed over two decades of experience producing the 40 ARSCL product at multiple radar sites. The results of the classification are reviewed. On rare occasions, the maximum clutter height is revised and the classification procedure is repeated. reflectivities. The black symbols represent ceilometer-derived cloud base, which is also an input to the ARSCL operational algorithm. Figure 1c (bottom panel) shows the hydrometeor mask produced using the algorithms discussed herein. The apparent fall streaks in the ARSCL hydrometeor-only product below 1.5 km are misclassifications of insect clutter. The misclassification of insect clutter as hydrometeors and the inefficient ARSCL processing steps were some of the reasons why DOE ARM sponsored this work to identify insect clutter with the aim of improving future ARSCL products. 10 15 produced using the method described herein. Black symbols in all panels are ceilometer derived cloud base. Note the hydrometeor misclassification below ceilometer cloud base in (b) motivates the need for improved insect clutter detection.
The method to identify insects and hydrometeors presented herein builds on prior work using polarimetric diversity and Doppler velocity power spectra variability (e.g., Martner and Moran, 2001;Bauer-Pfundstein and Görsdorf, 2007;Luke et al., 2008;Görsdorf et a., 2015). One unique feature of the proposed algorithms is that insect and hydrometeor scattering are identified 5 before identifying significant peaks in the Doppler velocity spectra. This approach complements the methods that first identify multiple peaks and then classify each peak (Bauer-Pfundstein and Görsdorf, 2007;Luke et al., 2008) The observations used in this study and the signatures of insect and hydrometeor scattering are discussed in Sections 2 and 3. Section 4 presents the main concept behind the algorithms developed in this study. Section 5 compares the hydrometeor masks with the Clouds Optically Gridded by Stereo (COGS) product (Romps and Öktem, 2018) derived from stereo cameras. Section 5 also compares the hydrometeor mask 10 cloud bottom with ceilometer derived cloud base. Conclusions and next steps are discussed in Section 6. The online Supplemental Material Section contains images of insect and hydrometeor classifications for forty-seven (47) summer-time days in northern Oklahoma, U.S.A., identified as LASSO cloud simulation events (LASSO, 2020).

Observations
The observations used in this study were collected by the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement 15 (ARM) program at their Southern Great Plains (SGP) central facility located in northern Oklahoma. Vertically pointing Ka-band radar co-polarized (CoPol) and cross-polarized (XPol) Doppler velocity power spectra are processed to identify insects, clouds, and precipitation in the vertical column. Verification of those classifications are based on observations from co-located lidar, ceilometer, Total Sky Imager (TSI), and cloud boundaries contained in the Clouds Optically Gridded by Stereo (COGS) product (Romps and Öktem, 2018). 20

Ka-band ARM Zenith Pointing Radar (KAZR)
The DOE ARM program deploys atmospheric observing systems to characterize the radiative properties of clouds in the atmosphere (Mather and Voyles, 2013 where to define the velocity range of valid , ℎ and , ℎ observations. At SGP, KAZR operates in the general 'GE' and medium 'MD' sensitivity modes to sense clouds at different altitudes with operating parameters during 2018 and 2019 shown in Table 1 ( ARM user facility, 2011aARM user facility, , 2011bWidener et al. 2012). Even though insects are detected in both KAZR operating modes, to simplicity the figures and algorithm descriptions, the results from just the MD mode are presented herein. Since the MD mode transmits a long frequency-modulated pulse, the first resolved range gate is 570 m above the radar. The KAZR 3.05 m diameter Cassegrain parabolic reflector manufactured by Millitech produces a 5 0.2° antenna beamwidth with 57.5 dBi gain, has -27 dB cross polarization isolation, and a membrane radome slopes across the antenna with a dry 2-way loss less than 2 dB (Widener et al., 2012). The MD mode uses a non-linear frequency modulated chirp over a 3967 ns pulse length to produce a 45 m range resolution sampled at 30 m range spacing. At 1 km range, the radar pulse volume is a 3.6 m diameter horizontal disk over a 45 m range to yield a pulse volume of approximately 450 m 3 . To save computer hard disk space, the KAZR CoPol and XPol Doppler velocity power spectra are retained only at range gates with significant power 10 above a noise threshold.

Validation Observations
Two observational datasets are used to validate the derived KAZR insect and hydrometeor classifications: ceilometer cloud base estimates from a Vaisala model CL31 ceilometer (ARM user facility, 2010b; Morris, 2016) and cloud bottom and top estimates from the COGS product (ARM user facility, 2017). The Vaisala ceilometer uses a pulsed InGaAs diode laser at 910 nm wavelength 15 and the vendor supplied algorithm estimates cloud base at 10-m and 16-s resolution when the vertical visibility is less than 100 m (Morris, 2016). The COGS cloud boundaries are derived from three pairs of stereo cameras positioned around the SGP Central Facility and represent cloud boundaries over a cubic domain 6 km to a side (Romps and Öktem, 2018). Due to camera visual occlusion during precipitation, COGS cloud boundaries are only estimated for cases of shallow cumulus clouds, which allow the three cameras to view the vertical extent of each cloud. Likewise, estimates from COGS are only available during daylight hours. 20

Insect, Cloud Droplet, and Precipitation Spectral Characteristics 25
This section discusses the scattering characteristics of insects, atmospheric plankton, clouds, and precipitation as observed in KAZR CoPol and XPol Doppler velocity power spectra. The first subsection discusses characteristics when it is not raining and the radar is observing individual insects or other atmospheric plankton particles scattering as point targets with narrow velocity ranges and shallow cumulus clouds scattering as distributed targets with broader velocity ranges. The variability of return power across the Doppler velocity spectrum, or the spectrum 'texture', is used to distinguish point target insects from distributed target 30 clouds. The second subsection describes the characteristics when individual insects and raindrop or ice particle distributions occur simultaneously in the radar volume. The LDR at each Doppler velocity bin is used to distinguish high LDR insects from low LDR raindrops or ice particles.  (Fig. 2c) have a coherent pattern, but vertical motion ( Fig. 2b) appears more variable. If drizzle or rain were below cloud base, then all three quantities would be coherent with downward motions increasing as reflectivity and spectrum width increase (Williams and Gage, 2009). Thus, it is not raining below cloud base. Above the ceilometer derived cloud base height, there are CoPol reflectivity observations (Fig. 2a), but not as many LDR estimates (Fig. 2d). For example, near minute 20, there is an enhancement of CoPol reflectivity above the ceilometer cloud 5 base and extending above 2 km, yet, there are very few LDR observations in this time-height region. Since LDR requires both CoPol and XPol reflectivity observations, the lack of LDR above cloud base indicates that the XPol channel is not detecting cloud particles. This CoPol vs. XPol sensitivity is illustrated in the bottom panel which shows CoPol reflectivity for all pixels that do not also have a LDR observation. The continuous time-height CoPol reflectivity observations above 1.5 km are cloud features that are easily discernible by eye. Return signals from individual insects appear as speckles up to 4 km in all panels.  Figure 3c shows CoPol Doppler velocity power spectra at 1 and 2 km heights (black and red lines, respectively). The power spectrum at 1 km has more variability between velocity bins compared to the spectrum at 2 km. This variability is because the radar is detecting individual insects within the 300 m 3 field-of-view with each insect moving at its own radial velocity. If an 20 insect is the only insect moving at a particular velocity, the spectrum will have an isolated peak (e.g., near -1.7 m s -1 radial velocity in Fig. 3c). If multiple insects are moving at similar speeds, the spectrum will be broader, yet, will still have variability. For example, between -1 and +3 m s -1 radial velocities, the 1 km height spectrum (black line) is both elevated in magnitude and has more bin-to-bin variability than the spectrum from 2 km (red line). Also, the backscattered power from insects is primarily confined to one range gate with some power leaking into neighboring range gates due to radar signal processing limitations, which produce 25 point enhancements in the spectra profile. Shown in sequential spectra profiles in the Supplemental Material, point enhancements often appear in only one spectra profile and not in neighboring profiles separated 4 seconds apart. The surface wind speed was about 3 m s-1 for this profile and there is not enough information to determine whether the insects are passive tracers advecting with the wind or self-propelling themselves through the 3.6 m diameter by 45 m field-of-view in less than 4 seconds.

Insects and Shallow Cumulus Clouds
In contrast to individual insects, clouds and precipitation are distributed targets filling the radar volume with hundreds or 30 thousands of hydrometeors of different sizes with different radial velocities. Since the number of particles in the hydrometeor size distribution varies gradually over neighboring particle sizes and the hydrometeor spectrum is extended in the velocity dimension due to antenna broadening effects, the return power spectrum has a gradual change over neighboring velocity bins. Thus, the power spectrum from a distribution of many hydrometeors is smoother than the return from a few individual insects. The smoother power spectra at 2 km height shown in Fig. 3c are consistent with a distribution of small cloud droplets moving at different velocities 35 within the radar volume. In addition to smooth power spectra across the velocity dimension, power spectra from cloud droplets are also more continuous in range due to the vertical extent of clouds as can be seen with a continuity of clouds with height in Fig. 3a.   Figure 4 shows time-height cross-sections of KAZR CoPol reflectivity (Fig. 4a) and LDR observations (Fig. 4b) when insects, clouds, and precipitation are observed in the same hour. Observations were collected during 0400 UTC (2300 Local) on 04-April-2019. From minutes 0-to-20, the approximate 1.5 km ceilometer cloud base height (black symbols) is above the insect layer that has LDR values between approximately -5 and -10 dB (see Fig. 4b), while the CoPol reflectivity is continuous in time and height above the ceilometer cloud base height (see Fig. 4a and 4c). At the beginning of the hour, the CoPol reflectivity (Fig. 4a) time-10 height structure indicates a precipitating cloud system between 3 and 5 km that evolves in time with precipitation reaching the lowest resolved height of 0.57 km after minute 20. The LDR shows a similar time-height structure (with reduced vertical depth)

Insects and Precipitation 5
with LDR values ranging between -25 to -20 dB. The LDR enhancement near 2.4 km and after minute 20 is due to a mixture of liquid and frozen particles near the melting layer (Baldini and Gorgucci, 2006). Below the melting layer, the LDR has values near -25 dB that is due to scattering from rain drops. Above the melting layer, scattering from asymmetrical ice particles leads to LDR 15 values near -20 dB (Baldini and Gorgucci, 2006). In contrast to the shallow cumulus cloud droplet observations in Figs. 2 and 3, KAZR has enough sensitivity to detect XPol signal returns from large spherical raindrops and ice particles.  Figure 4c shows the CoPol reflectivity at time-height pixels that do not have a LDR measurement. As with the shallow cloud observations (see Fig. 2e), there are more CoPol observations than LDR observations. The bottom panel (Fig. 4d) shows the QC1 hydrometeor mask produced by the insect-hydrometeor detection algorithm described in Section 4. The events shown in Figs.

2-4 highlight three important attributes of CoPol and LDR measurements:
 LDR measurements detect some, but not all, insect, cloud, and precipitation observations. 5  KAZR LDR measurements do not have the sensitivity to detect shallow non-precipitating liquid clouds.
 Doppler velocity power spectra contain signatures unique to insect scattering and hydrometeor scattering.
The limitation of LDR measurements not detecting all insects detected by CoPol measurements and the benefit of Doppler velocity power spectra having signatures of insects and hydrometeor scattering suggests that Doppler velocity power spectra can be analyzed along with LDR measurements to discriminate between insect and hydrometeor scattering. 10

Anatomy of Insect-Hydrometeor Detection Algorithms
As described previously, the radar returned signal results from scattering from insects (including "atmospheric plankton") and/or hydrometeors (aka, cloud droplets or precipitation sized particles). The insect-hydrometeor detection algorithms described in this section aim to classify each region of the CoPol and LDR Doppler velocity spectra as either insect or hydrometeor scattering. Next, the two CoPol and LDR regional spectral classifications are combined and then filtered to produce masks indicating the occurrence 15 of insect or hydrometeor scattering at every range gate.
( 4 b ) 25 The goal of the CoPol and LDR insect-hydrometeor detection algorithms is to classify insect and hydrometeor scattering contributions at each , ℎ location. Insects and hydrometeors do occur in the same range gate and sometimes at the same velocity (as will be seen in Figs. 6, 8, 9, and 11). These simultaneous insect and hydrometeor classifications will be mitigated by temporal quality control filtering.
The observed KAZR CoPol and XPol spectra profiles (Fig. 3) are the inputs to the insect-hydrometeor algorithms, with 30 the processing steps for both algorithms outlined in Fig. 5. The methodology consists of two parallel algorithms. The CoPol Texture Algorithm classifies insects and hydrometeors based on the CoPol spectra texture, with the understanding that scattering from insects produces more spectrum variability than cloud droplet or raindrop distributions. The LDR algorithm classifies insects and hydrometeors based on the understanding that asymmetric insects produce larger LDR magnitudes than spherical raindrops (when viewed from the bottom), and that the non-precipitating liquid cloud droplets should not produce any signal in the KAZR XPol 35 channel for single scattering processes. Both algorithms produce insect-hydrometeor membership classes for every region of the spectra profile. The membership classes are combined and then reduced to binary insect and hydrometeor masks that have affirmative values for insect or hydrometeor scattering at each range gate. After processing individual spectra profiles, two timeheight continuity quality control (QC) filters are applied to the binary hydrometeor masks to remove outliers. This is based on the understanding that clouds and precipitation are persistent over 10's of seconds and 10's of meters. Details of each algorithm module are described in the following sections.

CoPol Texture Algorithm Branch
This section describes the CoPol Texture Algorithm by applying the processing steps (Boxes 1-4 of Fig. 5) to the observed spectra profile shown in Fig. 3a. An objective noise threshold ℎ is estimated from the CoPol spectra at each height (Hildebrand and 10 Sekhon 1974;Carter et al. 1995). The CoPol spectra with magnitudes greater than ℎ are defined as signal power (see equation 3). The CoPol signal power for the boundary layer spectra in Fig. 3a is shown in Fig. 6a. As discussed before, insect scattering produces delta functions in the power spectra that are broadened in the velocity domain because of hardware limitations (e.g., antenna beamwidth) and signal processing techniques (e.g., FFT processing). A texture parameter where , selects the larger magnitude value between estimates or . To capture both positive and negative changes equally, , ℎ uses the absolute magnitude, then selects the largest difference between the neighbors (i.e., or ). Figure 6b shows the texture , ℎ for the CoPol power spectra shown in Fig. 6a. Note that the small magnitude texture values in the upper heights are due to cloud droplet scattering and larger magnitude texture values in the lower heights are caused by insect scattering. Several features make texture , ℎ well suited for identifying insect produced delta function variability. 5 First, the texture , ℎ is calculated using signal powers expressed in decibel units. Thus, the power difference between neighbors in decibel units is the same as a power ratio, or a percent change, when the power is expressed in linear units (e.g., 10 10 10 ). This implies that fluctuations expressed in decibel units are independent of magnitude, which allows for comparisons of low magnitude cloud observations with larger magnitude insect observations as shown in Fig. 3. Second, a narrow KAZR antenna beamwidth allows the difference between nearest neighbors (i.e., and ) to quantify delta functions. 10 Note that depending on radar hardware and operating parameters, the insect peak may be broader than these observations, and power differences using further neighbors may be necessary in order to identify delta functions (e.g., between and ). With a goal of identifying regions of insect and hydrometeor scattering, a small velocity-height window is moved throughout the , ℎ domain and texture statistics are calculated for each small region. For this KAZR dataset, a velocityheight window of five (5) velocity bins (total width of 0.186 m s -1 ) and three (3) range gates (total depth of 90 m) was large enough to capture regional texture variability. For each small region, maximum texture max , ℎ and standard 20 deviation , ℎ are estimated and assigned to the location , ℎ . Figures 6c and 6d show the regional maximum and standard deviation for the texture shown in Fig. 6b. Note that both quantities are larger at lower altitudes where insect scattering dominates compared to higher altitudes that are dominated by cloud droplet scattering. Interestingly, enhancements in both max texture and STD texture are visible near 1.8 and 2 km indicating that insect scattering is occurring with close proximity to cloud scattering regions. 25 With an objective of separating insect and cloud scattering regions based on CoPol texture statistics, Fig. 7a, 7b, and 7c shows 1-dimensional (1D) and 2-dimensional (2D) probability distribution functions (PDFs) of and max for all profiles in hour 19 of 31-July-2018 and all spectral regions that do not have a LDR measurement. The spectral regions with a LDR measurement are shown in Fig. 7d, 7e, and 7f. The color coding in the 2D plot represents the percent occurrence relative to the cell with maximum number of observations. The 1D PDFs produced from the observations are shown 5 with black curves in Figs. 7a and 7c using 953,136 samples, each representing a small spectral region, distributed into two populations. The peak with smaller and smaller max is due to cloud particle scattering. The peak with larger texture attributes is caused by insect scattering. The contour lines in Fig. 7b represent 90%, 75%, 63% and 50% occurrence of 2D Gaussian functions estimated for both populations. The red lines in Figs. 7a and 7c are 1D Gaussian function fits to the observations. Better fits were obtained using Generalized Gaussian functions that accounted for skewness in the observed 10 distributions. However, these better fits did not yield better classifications, as better classifications depend on the samples between the two peaks and not on the outer tails of the distributions that determined the distribution skewness.

20
The observations with LDR in Fig. 7d, e, and f show only one distribution corresponding to insect scattering. The functional fits are Generalized Gaussian distributions and capture skewness in the distributions. Note the similarities between the fitted parameters for the insect populations with and without LDR measurements. Both distributions have similar means and standard deviations (i.e., near 10 dB mean and 2.7 dB standard deviations). Also note that the insect distribution in Fig. 7e extends toward the origin and overlaps with the cloud population shown in Fig. 7b. This overlap causes difficulty in using a simple threshold 25 to classify hydrometeor from insect observations. This difficulty was noticed in Luke et al. (2008). One way to improve the classification is to use a threshold that is orthogonal to the observed distributions. The insect and hydrometeor 2D Gaussian functional fits shown in Fig. 7b and 7e have correlation coefficients greater than 0.9 and indicate the distributions are close to a 1to-1 slope. After creating a line between the hydrometeor and insect distributions, an orthogonal threshold can be constructed. and 2019 (see Appendix A for details). The analysis presented in Appendix A suggests that the orthogonal threshold has a true positive rate of about 90% for both hydrometeor and insect scattering observations. Due to the distribution overlap, a single threshold methodology will not reach 100% true positive rate and additional classification or filtering will be necessary. One way to improve the classifications due to distribution overlap or inaccurate thresholds is to apply continuity filters to remove random or ephemeral samples due to misclassifications as discussed in Section 4.4. Applying the orthogonal CoPol texture threshold to the 5 example profile from 19:19:02 UTC, Fig. 8a shows the insect (blue shading) and hydrometeor (red shading) texture membership classes. Also in Fig. 8 are the LDR insect-hydrometeor classes; the combined classes; and the profile mask; all of which are discussed in the next section.

LDR Algorithm Branch
This section describes the processing steps of the LDR Algorithm (Boxes 5-8 of Fig. 5). In Box 5, an objective noise threshold ℎ is estimated from the XPol spectra at each height (Hildebrand and Sekhon 1974;Carter et al. 1995). The XPol spectra with magnitudes greater than ℎ are defined as signal power (see equation 3). Box 6 calculates the linear depolarization ratio spectra using equation (1). The CoPol and XPol spectra profiles at 04:48:17 UTC from the precipitation event on 4-April-2019 5 shown in Fig. 4 are shown in Fig. 9. The top row of Fig. 9 (Fig. 9a-d) shows CoPol observations and CoPol texture statistics used in the CoPol texture algorithm. Figures 9e and 9f show XPol and LDR spectra profiles. To estimate regional scattering properties, the same 5x3 velocity-height window used in the texture algorithm is used to calculate regional LDR statistics throughout the , ℎ spectra profile (Box 7 of Fig. 5). Figures 9g and 9h show the , ℎ and , ℎ estimates and suggest that insects are present below 1 km with near zero vertical velocity and falling hydrometeors are present above 3 km. With an objective of separating insect and hydrometeor scattering regions based on LDR statistics, Fig. 10 shows 2D and 1D PDFs of the LDR statistics estimated for all observations below 1.5 km (to avoid too many hydrometeor observations that would prevent any insects from appearing in Fig. 10) for hour 04 on 04-April-2019. Figure 10 contains over 1 million LDR statistic samples each calculated over a separate 5x3 spectral region. The distribution near , ℎ 8 dB is due to insect 20 scattering and the distribution near , ℎ 20 dB is due to hydrometeor scattering. A threshold of 15 dB clearly separates the two distributions and is indicated with a dashed line in Fig. 10b, which is consistent with estimates from Matrosov (1991) and Reinking et al. (1997).

Combining Co-Pol Texture and LDR Algorithm Classifications
After performing the CoPol texture and LDR algorithms, the binary insect and hydrometeor spectral classifications from both algorithms are combined and then filtered (e.g., see Figs. 8a and 8b, and Figs. 11a and 11b). Initially, the combined spectral 5 classification is the texture classification because the LDR classification will always have fewer valid observations than the CoPol observations. To incorporate the LDR classification, the combined classification is changed only if the LDR algorithm produced a hydrometeor class when the texture classification was set to insect class. This logic places more emphasis on identifying hydrometeors than insects.
One of the physical attributes of hydrometeor scattering is that the Doppler velocity spectra span multiple continuous 10 velocity bins and over several range gates. Accordingly, isolated hydrometeor pixels in the combined spectral classification are removed by requiring at least 7 continuous hydrometeor pixels in the velocity dimension. All hydrometeor pixels not satisfying this constraint are set to the insect scattering class. The filtered memberships for the two example profiles are shown in Figs. 8c for the warm liquid cloud event on 31-July-2018 and Fig. 11c for the precipitation event on 4-April-2019. The red and blue shading corresponds to hydrometeor and insect scattering classes, respectively.
The final processing step is to reduce the filtered membership classes into binary masks indicating the presence of insect or hydrometeor scattering at each range gate (Box 10 of Fig. 5). The insect and hydrometeor masks are set to unity if that filtered membership class exists for that range gate ℎ . In the case when both insect and hydrometeor scattering are detected at the same 5 range gate, the hydrometeor mask is set to unity and the insect mask is set to zero. This logic places more emphasis on identifying robust hydrometeor masks and defining masks resulting from either insect or hydrometeor scattering at each range gate. Figures   8d and 11d show the binary insect and hydrometeor masks for the two example profiles. Both masks are saved in output data files and have the variable names insect_mask_raw and hydro_mask_raw (Boxes 11 and 12 of Fig. 5). The suffix raw designates that these masks were estimated from individual profiles and without any temporal information from neighboring profiles, which is 10 discussed in Section 4.4.
In addition to the binary insect mask, an insect activity index is generated by counting the number of insect scattering velocity bins at each height. This insect index ℎ is defined as where , ℎ is the insect spectral classification and has a value of either 0 or 1. This insect index is not an estimate of 15 the insect number concentration because the magnitude of the insect scattering is not being taken into account. The authors hypothesize that the insect index should be related to insect number density, as the breadth of insect velocities should increase as the number of insects increases. The insect index is available in the output data files with the variable name insect_index_raw. Figure 12 shows the time-height cross-section of observed CoPol KAZR reflectivity (Fig. 12a), the raw hydrometeor mask (Fig.  20 12b), a time-height filtered hydrometeor mask (Fig. 12c), and the insect index (Fig. 12d) for hour 19 on 31-July-2018. This is the same event shown in Figs. 1 and 2, except with the vertical axis limited to 3 km height. The ceilometer cloud base height is shown in each panel with black dots. The blue and red plus symbols are cloud top and base determined from the COGS stereo camera system, which is discussed in more detail in Section 5. The hydrometeor mask in Fig. 12b is the raw mask produced from each spectra profile. These raw hydrometeor masks contain random misclassified pixels of hydrometeors below the ceilometer cloud 25 base height. Most of these false positive hydrometeor mask pixels are removed by sequentially applying two time-height quality control filters.

5
The first quality control filter, named QC1 (shown in Fig. 12c), removes temporal outliers by applying a 3-member temporal continuity filter, which retains all three values if three consecutive values are present. The QC1 filter also inserts up to three consecutive hydrometeor mask pixels in vertical profiles to fill small gaps in the raw hydrometeor mask. The second quality control filter, named QC2 (not shown), applies a low-pass filter to the QC1 filtered mask by moving a 3x3 time-height (approximately 12 s by 90 m) continuity constraint throughout the domain to robustly identify hydrometeors that are persistent in time and height. Both the QC1 and QC2 filtered hydrometeor masks are available in the output data files with variable names hydro_mask_qc1 and hydro_mask_qc2. Figure 12d shows the insect index and estimates the number of velocity bins in the spectra that contained insect scattering. The color scale is logarithmic with maximum value 256 representing the number of velocity bins in the spectra. The QC1 hydrometeor mask is plotted for the 4-April-2019 precipitation event in Fig. 4d. The mask identifies the 5 shallow clouds near 1.5 km from about 2-to-13 minutes and the precipitating anvil at the beginning of the hour between 3-to-4 km that descends to the lowest range gate just after minute 20. The hydrometeor mask below 1.5 km starting at about minute 21 and continues until the end of the hour except for a shallow gap between minutes 50-to-55 is due to precipitation at these lower heights as indicated in Fig. 11c. There is strong agreement between the ceilometer cloud base height estimates and the hydrometeor mask before minute 20. After this time, the hydrometeor mask identifies raindrops, while the ceilometer is identifying cloud base. COGS 10 measurements are unavailable for comparison purposes during this event because COGS is an optical system requiring daylight.

Comparing Cloud Mask with Independent Measurements
Figures 4 and 12 show significant agreement between ceilometer cloud base estimates and retrieved QC1 hydrometeor masks. Figure 12 also shows agreement between COGS cloud base and top estimates with the QC1 hydrometeor mask. In comparing the three products, the KAZR hydrometeor masks and ceilometer cloud base estimates appear as discrete cloud events. Conversely, 15 the COGS estimates appear continuous in time, as if COGS is detecting a persistent cloud layer. This difference is because KAZR and ceilometer are 'soda-straw' observations and COGS is a 6-km x 6-km domain-averaged product produced from three pairs of stereo cameras positioned around the radar and ceilometer (Romps and Öktem, 2018). Figure 12c shows that when the radar and ceilometer both detect clouds, COGS also had a similar cloud base height estimate. The ceilometer and radar cloud bases also showed consistency even at the cloud edges (see near minute 35). Regarding cloud top estimates, COGS estimates are higher than 20 the radar because COGS is a domain average. The online Supplemental Material section contains images of QC1 hydrometeor mask, ceilometer, and COGS retrievals for forty-seven (47) days corresponding to 2018 and 2019 LASSO shallow cloud events (LASSO, 2020). The COGS product is available only for shallow cumuliform clouds and only during daylight hours. 10, Fig. 13b shows the 2D distribution of height differences with the line graphs showing 1D PDFs. Over 70% of the 12,141 simultaneous profiles had height differences within +/-100 m, which represents +/-3 thirty-meter radar range gates. There is a small skewness to the height difference PDF (Fig. 13a) that is consistent with the ceilometer detecting clouds before the radar detects hydrometeors. Also, during the few precipitation events, the hydrometeor mask bottom was significantly lower than the ceilometer cloud base as the hydrometeor mask detects falling raindrops far below the ceilometer detected cloud base. 30 occurrences. Artifact at negative height differences and low ceilometer cloud base is due to radar first range gate at 570 m.

Conclusions
In addition to detecting cloud particles, vertically pointing cloud radars are sensitive enough to detect individual insects. If insect 10 contamination is not identified and removed, then radar derived cloud properties will be incorrect and will not help with validating cloud resolving models or climate simulations. This study used polarimetric radar observations to develop two insect-hydrometeor detection algorithms. The two algorithms use different radar scattering principles to identify small velocity-height regions in the Doppler velocity power spectra profile as resulting from either insect or hydrometeor scattering. The results of both algorithms are combined and then filtered to produce single value insect and hydrometeor masks at each range gate. The backscattered power 15 from hydrometeors and insects is larger in the CoPol channel than the XPol channel leading to negative LDR values. This difference in sensitivity leads to this study finding that KAZR XPol spectra observations observed fewer insects than KAZR CoPol observations. This implies that using just a polarimetric signal processing method to identify insects will not identify all insect clutter affecting CoPol observations and that insect clutter mitigation methods must use CoPol observations to identify all insect clutter in the CoPol channel. 20 One algorithm uses the texture of CoPol Doppler velocity power spectra to identify small velocity-height regions of spectra attributed to insect or hydrometeor scattering. Since insects are individual point targets, their radar power return is confined to narrow intervals of Doppler velocity and range gates, on the order of 1-to-3 velocity bins (0.04 to 0.12 m s -1 ) and 1-to-3 range gates (30 to 90 m). In contrast, cloud particles and raindrops occur in distributions that extend over many velocity bins and several range gates, on the order of 5-to-150 velocity bins (0.2 to 7 m s -1 ) and 3-to-150 range gates (90 to 4500 m). The CoPol and XPol 25 Doppler velocity power spectra from insect scattering have large variability, or texture, while scattering from cloud particles and raindrops produce smoother, less variable, spectra. The CoPol texture algorithm uses the texture information to identify small regions of insect and hydrometeor scattering. The CoPol texture algorithm can be applied to any cloud radar system collecting Doppler velocity power spectra and does not require a cross-polarization channel.
The other algorithm uses the linear depolarization ratio (LDR) at each point in the Doppler velocity power spectra to 5 identify regions of scattering due to spherical raindrops, asymmetric ice particles, or asymmetric insects. Unlike previous studies, this work uses the LDR at each spectra bin. After identifying small velocity-height regions of insect and hydrometeor scattering in both algorithms, the spectra classifications are combined and then filtered to account for continuity in the Doppler velocity and vertical range dimensions. The filtered spectra classifications are reduced to binary affirmative insect and hydrometeor masks with a single value at each range gate. An insect activity index is estimated at each range gate by counting the number Doppler velocity 10 spectra bins with insect scattering. Future studies will use insect activity and vertical air motion estimates to explore whether insects are passive tracers or actively propelling themselves through the atmosphere. Often, insects occur at the same height as clouds and during the onset of precipitation. While these are interesting phenomena, the focus of this work is producing robust hydrometeor masks to help identify cloud boundaries, which can be used, for example, to study the evolution of shallow cumulus clouds in the planetary boundary layer (Gustafson et al., 2017). Using over 12,000 simultaneous ceilometer and radar profiles, it 15 was found that over 70% of the hydrometeor mask column bottoms were within +/-100 m of the ceilometer cloud base (i.e., +/-3 thirty-meter radar range gates). The hydrometeor mask column bottom was slightly higher than the ceilometer cloud base. This is to be expected, as the ceilometer detects cloud particles at lower heights than the radar detecting hydrometeors within the cloud.
The online Supplemental Material includes sample images of observed KAZR reflectivity, retrieved hydrometeor masks, and verification observations from ceilometer and COGS. The processing described herein was applied to KAZR observations for 20 April-October in 2018 and 2019 summer seasons at the Southern Great Plains (SGP) facility. The insect and hydrometeor masks for these months are available online on the DOE ARM Archive.

Appendix A
This appendix uses a large, hand edited, dataset to first explore the representativeness of spectral region texture distributions and then to develop an orthogonal threshold to classify hydrometeors from insects in the 2D texture distributions. Over 75 hours of KAZR observations from 2018 and 2019 at SGP were manually inspected to contain only hydrometeors or insect scattering observations. Figure A1 illustrates examples of manual classified hydrometeor and insect boundaries shown as gray shaded areas 5 superimposed onto KAZR CoPol Reflectivity. The top panel (Fig. A1a) shows hydrometeor boundaries from 4 to 15 km for hour 12 UTC on 3-August-2019. There are no insects detected above 4 km and all observations are due to hydrometeor scattering. Figure A1b shows insect boundaries from the lowest resolved range gate to 3 km for hour 09 UTC on 06-June-2019. The ceilometer derived cloud base indicates clouds are detected above 3.5 km between minutes 14 and 37. In this 'truth' dataset, there were over 15 hours of hydrometeor profiles and over 60 hours of insect profiles. The manually classified dataset contained over 47 million 10 CoPol and over 20 million LDR spectral regions with 5 velocity bins and 3 range gates (i.e., 0.186 m s -1 by 90 m). Note that the correlations of the General Gaussian functional fits are estimated separately for both datasets and both estimates are greater than 0.9 indicating that the individual distributions have nearly 1-to-1 slopes.
To construct an orthogonal classification threshold to divide observations into either hydrometeor or insect scattering classes, a line is first constructed between the two distributions and then an orthogonal slope is estimated from that original line.
The slope between the two distributions is estimated from the Gaussian distribution mean values using 5 ( A 1 ) where the numerator is the change in and the denominator is the change in max . The equation of the line between the two distributions is written in Fig. A2b and A2e and is shown with the solid black line. The asterisks indicate the distribution mean locations, specifically, , and , . Orthogonal thresholds will have a slope given by 20 .
( A 2 ) Using the threshold slope in (A2), many orthogonal threshold lines were constructed and the two truth datasets were classified using each threshold. The goodness of classification was determined using a receiver operating characteristic ( Figure A3 shows the TPRs for both datasets as the parallel thresholds moved graphically from left-to-right in Fig. A2b and A2e with increasing max . Note that starts at a low value and increases as the threshold moves to larger max . The has an opposite behavior. The intersection of indicates the same true positive rate for both datasets. The intersection has a true positive rate over 0.9 and occurs when max is equal to 4.8.  Data availability. Original raw KAZR spectra are available on the DOE ARM archive. After release of the manuscript, fourteen months (April-October 2018 and 2019) of insect and hydrometeor mask data files at Southern Great Plains (SGP) Central Facility will be available at the DOE ARM archive as a Principle Investigator (PI) Product (Williams, 2021).
Source code availability. The code used to generate the insect, cloud, precipitation, and hydrometeor masks stored on the DOE 5 ARM archive is available upon request from the lead author. With this source code, users can repeat the analysis presented in this study and develop improved insect-cloud and insect-precipitation detection algorithms for their vertically pointing radar observations. Supplemental Material. Selected images of observed KAZR reflectivity, retrieved hydrometeor masks, and verification 10 observations from ceilometer and COGS are available in the online Supplemental Material.
Author contribution. CRW: insect and hydrometeor mask code development; KLJ: testing and evaluation of products and verification; SEG and JCH: Evaluation of products against collocated observations; RO and DMR: processing of COGS product.