Calibration of a 35 GHz airborne cloud radar: lessons learned and intercomparisons with 94 GHz cloud radars

Author contributions. FE and MBP conceived the concept for internal calibration. FE, MBP, MH and LH performed calibration and the measurements. … developed the presented methods and carried out the … MBP and MH contributed to the interpretation of the results. JD provided the single scattering results and RASTA measurements. FE took the lead in writing the paper. All authors provided feedback on the paper.

Abstract. This study gives a summary of lessons learned during the absolute calibration of the airborne, high-power Ka-band cloud radar HAMP MIRA on board the German research aircraft HALO. The first part covers the internal calibration of the instrument where individual instrument components are characterized in the laboratory. In the second part, the internal calibration is validated with external reference sources like the ocean surface backscatter and different air- and spaceborne cloud radar instruments.
A key component of this work was the characterization of the spectral response and the transfer function of the receiver. Over a wide dynamic range of 70 dB, the receiver response turned out to be very linear (residual 0.05 dB). Using different attenuator settings, it covers a wide input range from −105 to −5 dBm. This characterization gave valuable new insights into the receiver sensitivity and additional attenuations, which led to a major improvement of the absolute calibration. The comparison of the measured and the previously estimated total receiver noise power (−95.3 vs. −98.2 dBm) revealed an underestimation of 2.9 dB. This underestimation could be traced back to a larger receiver noise bandwidth of 7.5 MHz (instead of 5 MHz) and a noise figure that was 1.1 dB higher than assumed. Measurements confirmed the previously assumed antenna gain (50.0 dBi) with no obvious asymmetries or increased side lobes. The calibration used for previous campaigns, however, did not account for a 1.5 dB two-way attenuation by additional waveguides in the airplane installation. Laboratory measurements also revealed a 2 dB higher two-way attenuation by the belly pod caused by small deviations during manufacturing. In total, effective reflectivities measured during previous campaigns had to be corrected by +7.6 dB.
To validate this internal calibration, the well-defined ocean surface backscatter was used as a calibration reference. With the new absolute calibration, the ocean surface backscatter measured by HAMP MIRA agrees very well (< 1 dB) with modeled values and values measured by the GPM satellite. As a further cross-check, flight experiments over Europe and the tropical North Atlantic were conducted. To that end, a joint flight of HALO and the French Falcon 20 aircraft, which was equipped with the RASTA cloud radar at 94 GHz, and an underflight of the spaceborne CloudSat at 94 GHz were performed. The intercomparison revealed lower reflectivities (−1.4 dB) for RASTA but slightly higher reflectivities (+1.0 dB) for CloudSat. With HAMP MIRA effective reflectivities lying between those of RASTA and CloudSat and the good agreement with GPM, the accuracy of the absolute calibration is estimated to be around 1 dB.

Introduction
In recent years, the deployment of cloud profiling microwave radars on the ground, on aircraft as well as on satellites, like CloudSat (Stephens et al., 2002) or the upcoming EarthCARE satellite mission (Illingworth et al., 2014), has greatly advanced our scientific knowledge of cloud microphysics. Nevertheless, large discrepancies in retrieved cloud microphysics (Zhao et al., 2012; Stubenrauch et al., 2013) contribute to uncertainties in the understanding of the role of clouds for the climate system (Boucher et al., 2013). An important aspect for enabling accurate microphysical retrievals based on cloud radar data is the proper calibration of the systems.
Published by Copernicus Publications on behalf of the European Geosciences Union.
However, the absolute calibration of an airborne millimeter-wave cloud radar can be a challenging task. Its initial calibration demands detailed knowledge of cloud radar technology and the availability of suitable measurement devices. During cloud radar operation, system parameters of the transmitter and receiver can drift due to changing ambient temperature, pressure and aging system components. The validation of the absolute calibration with external sources is furthermore complicated for downward-looking installations on an aircraft. The inability of most airborne and many ground-based radars to point their line of sight to an external reference source makes it difficult or even impossible to calibrate the overall system with an external reference in a laboratory.
Typically, a budget approach is used for the absolute calibration of airborne cloud radar instruments. First, the instrument components like transmitter, receiver, waveguides, antenna and radome are characterized individually in the laboratory. During in-flight measurements, variable component parameters are then monitored and corrected for drifts using the laboratory characterization. Subsequently, all gains and losses are combined into an overall instrument calibration.
In order to meet the required absolute accuracy and to follow good scientific practice, an external in-flight calibration becomes indispensable to check the internal calibration for systematic errors. For weather radars, the well-defined reflectivity of calibration spheres on tethered balloons or erected trihedral corner reflectors has been a reliable external reference for years (Atlas, 2002). In more recent years, this technique has been extended to scanning, ground-based millimeter-wave radars (Vega et al., 2012; Chandrasekar et al., 2015). From the airborne perspective, on the other hand, the direct fly-over and the subsequent removal of additional background clutter are difficult to reproduce (Z. Li et al., 2005). Driven by this challenge, many studies have been conducted to characterize the reflectivity of the ocean surface using microwave scatterometer-radiometer systems in the X and Ka bands (Valenzuela, 1978; Masuko et al., 1986). As one of the first, Caylor (1994) introduced the ocean surface backscatter technique to cross-check the internal calibration of the NASA ER-2 Doppler radar (EDOP; Heymsfield et al., 1996). In an important next step, L. Li et al. (2005) combined this technique with analytical models of the ocean surface backscatter. In their work, they used circle and roll maneuvers to sample the ocean surface backscatter for different incidence angles with the Cloud Radar System (CRS; Li et al., 2004), a 94 GHz (W band) cloud radar on board the NASA ER-2 high-altitude aircraft. In this context, they proposed to point the instrument 10° off-nadir, an angle for which multiple studies found a very constant ocean surface backscatter (Durden et al., 1994; Z. Li et al., 2005; Tanelli et al., 2006). For this incidence angle, these studies confirmed the ocean surface backscatter to be relatively insensitive to changes in wind speed and wind direction.
Subsequent studies followed suit, applying the same technique to other airborne cloud radar instruments: the Japanese W-band Super Polarimetric Ice Crystal Detection and Explication Radar (SPIDER; Horie et al., 2000) on board the NICT Gulfstream II by Horie et al. (2004), the Ku/Ka-band Airborne Second Generation Precipitation Radar (APR-2; Sadowy et al., 2003) on board the NASA P-3 aircraft by Tanelli et al. (2006) and the W-band cloud radar (RASTA; Protat et al., 2004) on board the SAFIRE Falcon 20 by Bouniol et al. (2008).
Encouraged by these airborne studies, this in-flight calibration technique has also been proposed and successfully applied to the spaceborne CloudSat instrument (Stephens et al., 2002; Tanelli et al., 2008). Based on this success, Horie and Takahashi (2010) proposed the same technique with a whole 10° across-track sweep for the next spaceborne cloud radar, the 94 GHz Doppler Cloud Profiling Radar (CPR) on board EarthCARE (Illingworth et al., 2014).
With CloudSat as a long-term cloud radar in space, direct comparisons of radar reflectivity from ground- and airborne instruments became possible (Bouniol et al., 2008; Protat et al., 2009). While the first studies still assessed the stability of the spaceborne instrument, subsequent studies turned this around by using CloudSat as a Global Radar Calibrator for ground-based or airborne radars (Protat et al., 2010).
This work will focus on the internal and external calibration of the MIRA cloud radar (Mech et al., 2014) on board the German High Altitude and Long Range Research Aircraft (HALO), adopting the ocean surface backscattering technique described by Z. Li et al. (2005). In the first part, the preflight laboratory characterization of each system component will be described. This includes antenna gain, component attenuation and receiver sensitivity. In a budget approach, these system parameters are then used in combination with in-flight monitored transmission and receiver noise power levels to form the internal calibration. The second part will then compare the internal calibration with external reference sources in flight. As external reference sources, measurements of the ocean surface as well as intercomparisons with other air- and spaceborne cloud radar instruments will be used.
This paper is organized as follows: after some considerations about required radar accuracies shown in Sect. 1, Sect. 2 introduces the cloud radar instrument and its specifications on board the HALO research aircraft. Section 3 recalls the radar equation and introduces the concept of using the ocean surface backscatter for radar calibration. The characterization and calibration of the single system components, including waveguides, antenna and belly pod, is described in Sect. 3.1. Subsequently, the overall calibration of the radar receiver is explained in Sect. 3.2. Here, a central innovation of this work is the determination of the receiver sensitivity (Sect. 3.3 and 3.4). In the second part of the paper, the budget calibration is validated by using predicted and measured ocean surface backscatter (Sect. 4.3). In addition, the calibration and system performance for joint flight legs is compared to W-band cloud radars such as the airborne cloud radar RASTA (Sect. 5.2) and the spaceborne cloud radar CloudSat (Sect. 5.3).
Atmos. Meas. Tech., 12, 1815-1839, 2019 www.atmos-meas-tech.net/12/1815/2019/

Accuracy considerations
In order to provide scientifically sound interpretations of cloud radar measurements, a well-calibrated instrument with known sensitivity is indispensable. Many spaceborne (Delanoë and Hogan, 2008; Deng et al., 2010) or ground-based (Donovan et al., 2000) techniques to retrieve cloud microphysics using millimeter-wave radar measurements require a well-calibrated instrument. In the case of the CloudSat instrument, the calibration uncertainty was specified to be ±2 dB or better (Stephens et al., 2002). This requirement for absolute calibration imposed by retrievals of cloud microphysics is further explained in Fig. 1. Under the simplest assumption of small, mono-disperse cloud water droplets, the iso-lines in Fig. 1 represent all combinations of cloud droplet effective radius and liquid water content with a radar reflectivity of −20 dBZ. An increasing retrieval ambiguity, caused by an assumed instrument calibration uncertainty, is illustrated by the shaded areas with ±1 dB (green), ±3 dB (yellow) and ±8 dB (red). To constrain the retrieval space considerably within synergistic radar-lidar retrievals like Cloudnet (Illingworth et al., 2007) or Varcloud (Delanoë and Hogan, 2008), the absolute calibration uncertainty has to be significantly smaller than the natural variability of clouds. Since a reflectivity bias of 8 dB would bias the droplet size by a factor of 2 and the water content by even an order of magnitude, the absolute calibration uncertainty should be 3 dB or lower. For a systematic 1 dB calibration offset, Protat et al. (2016) still found ice water content biases of +19 % and −16 % in their radar-only retrieval. Since HAMP MIRA data are used in retrievals of cloud microphysics, the target accuracy will be set to 1 dB.
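The iso-line argument can be illustrated with a small sketch. It assumes Rayleigh scattering by mono-disperse liquid droplets, so that Z = N·D⁶ and the effective radius equals the droplet radius; the function names and the water density of 1000 kg m⁻³ are assumptions made for this illustration and are not taken from the paper.

```python
import math

WATER_DENSITY_G_M3 = 1.0e6  # assumed density of liquid water (1000 kg m^-3) in g m^-3


def reflectivity_dbz(radius_um: float, lwc_g_m3: float) -> float:
    """Rayleigh reflectivity (dBZ) of mono-disperse droplets.

    Z = N * D^6 with D in mm and N in m^-3; N follows from the liquid water
    content and the mass of a single droplet.
    """
    diameter_m = 2.0 * radius_um * 1.0e-6
    droplet_mass_g = WATER_DENSITY_G_M3 * math.pi / 6.0 * diameter_m ** 3
    number_conc_m3 = lwc_g_m3 / droplet_mass_g
    z_linear = number_conc_m3 * (diameter_m * 1.0e3) ** 6  # mm^6 m^-3
    return 10.0 * math.log10(z_linear)


def lwc_on_isoline(radius_um: float, z_dbz: float) -> float:
    """Liquid water content (g m^-3) that gives z_dbz for a given droplet radius."""
    # Z is proportional to LWC at fixed radius, so scale from LWC = 1 g m^-3.
    z_ref = reflectivity_dbz(radius_um, 1.0)
    return 10.0 ** ((z_dbz - z_ref) / 10.0)


def radius_bias_factor(calibration_offset_db: float) -> float:
    """Factor by which the radius changes at fixed LWC, since Z ~ LWC * r^3."""
    return 10.0 ** (calibration_offset_db / 30.0)
```

Under these assumptions, an 8 dB offset changes the droplet radius at fixed water content by a factor close to 2, and the water content at fixed radius by a factor of about 6, which is in line with the magnitudes quoted above.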
An accurate absolute calibration is further motivated by recent studies (Protat et al., 2009; Hennemuth et al., 2008; Maahn and Kollias, 2012; Ewald et al., 2015; Lonitz et al., 2015; Myagkov et al., 2016; Acquistapace et al., 2017), which used the radar reflectivity provided by almost identical ground-based versions of the same instrument. The installation of the MIRA instrument on many ground-based cloud profiling sites within ACTRIS (Aerosols, Clouds and Trace gases Research InfraStructure Network; http://www.actris.eu, last access: 10 March 2019) and in the framework of Cloudnet is a further incentive for an external calibration study.
The need for an external calibration is further underlined by several studies which already found evidence of an offset in radar reflectivity when comparing different cloud radar instruments. In a direct comparison with the W-band (94 GHz) ARM Cloud Radar (WACR), Handwerker and Miller (2008) found reflectivities around 3 dB smaller for the Karlsruhe Institute of Technology MIRA, contradicting the reflectivity-reducing effect of a higher gaseous attenuation and stronger Mie scattering at 94 GHz. Protat et al. (2009) could reproduce this discrepancy in a comparison with CloudSat, where they found a clear systematic shift of the mean vertical profile by 2 dB between CloudSat and the Lindenberg MIRA (CloudSat showing higher values than the Lindenberg radar).

The 35 GHz cloud radar on HALO
The cloud radar on HALO is a pulsed Ka-band, polarimetric Doppler millimeter-wavelength radar which is based on prototypes developed and described by Bormotov et al. (2000) and Vavriv et al. (2004). The current system was manufactured and provided by Metek (Meteorologische Messtechnik GmbH, Elmshorn, Germany). The system design and its data processing, including an updated moment estimation and a target classification by Bauer-Pfundstein and Görsdorf (2007), were described in detail by Görsdorf et al. (2015). The millimeter radar is part of the HALO Microwave Package (HAMP) and is subsequently abbreviated as HAMP MIRA. Its standard installation in the belly pod section of HALO with its fixed nadir-pointing 1 m diameter Cassegrain antenna is described in detail by Mech et al. (2014). Its transmitter is a high-power magnetron operating at 35.5 GHz with a peak power $P_t$ of 27 kW, with a pulse repetition frequency $f_p$ between 5 and 10 kHz and a pulse width $\tau_p$ between 100 and 400 ns. The large antenna and the high peak power can yield an exceptionally good sensitivity of −47 dBZ for the ground-based operation (5 km distance, 1 s averaging and a range resolution of 30 m). In the current airborne configuration, the sensitivity is reduced to −39.8 dBZ by various circumstances which will be addressed in this paper. The broadening of the Doppler spectrum due to the beam width can reduce this sensitivity further by 9 dB, as discussed in Mech et al. (2014). Table 1 lists the technical specifications as characterized in this work. Boldface indicates the operational configuration.
Most of the parameters in Table 1 play a role in the absolute calibration of the cloud radar instrument. For this reason, this section will briefly recapitulate the conversion from receiver signal power to the commonly used equivalent radar reflectivity factor $Z_e$. When the radar reflectivity $\eta$ of a target is known, e.g., in modeling studies, its equivalent radar reflectivity factor is given by

$$ Z_e = \frac{\eta\,\lambda^4}{|K|^2\,\pi^5}, \tag{1} $$

where $|K|^2 = 0.93$ is the dielectric factor for water and $\lambda$ the radar wavelength. For brevity, the equivalent radar reflectivity factor $Z_e$ is referred to as "effective reflectivity" in this paper. Following the derivation of the meteorological form of the radar equation by Doviak and Zrnić (2006), the effective reflectivity $Z_e$ (mm$^6$ m$^{-3}$) can be calculated from the received signal power $P_r$ (W) by

$$ Z_e = R_c\,P_r\,r^2\,L_\mathrm{atm}^2, \tag{2} $$

where $r$ is the range between antenna and target, $L_\mathrm{atm}$ is the one-way path integrated attenuation, and $R_c$ is a constant which describes all relevant system parameters. Assuming a circularly symmetric Gaussian antenna pattern, this radar constant $R_c$ contains the wavelength $\lambda$ (m), pulse width $\tau_p$ (s) and peak transmit power $P_t$ in milliwatts, the peak antenna gain $G_a$, and the antenna half-power beamwidth $\phi$. Additionally, it accounts for all attenuations $L_\mathrm{sys}$ occurring in system components, e.g., in transmitter ($L_\mathrm{tx}$) and receiver ($L_\mathrm{rx}$) waveguides, due to the belly pod radome ($L_\mathrm{bp}^2$) and due to the finite receiver bandwidth ($L_\mathrm{fb}$):

$$ R_c = \frac{1024\,\ln 2\;\lambda^2\,10^{18}\,L_\mathrm{sys}}{P_t\,G_a^2\,c\,\tau_p\,\pi^3\,\phi^2\,|K|^2}, \tag{3} $$

with $c$ being the speed of light. Usually, antenna parameters ($G_a$, $\phi$) and system losses ($L_\mathrm{sys} = L_\mathrm{tx}\,L_\mathrm{bp}^2\,L_\mathrm{rx}$) have to be determined only once for each system modification. In contrast, transmitter and receiver parameters have to be monitored continuously. In addition, a thorough characterization of the receiver sensitivity is essential for the absolute accuracy of the instrument.
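As a minimal sketch, the radar constant of Eq. (3) and the conversion from received power to effective reflectivity can be evaluated as follows. The function names are hypothetical. The sketch assumes that the beamwidth enters in radians, that gains and losses are given in dB and converted to linear ratios, and that received and transmitted power are expressed in the same unit (milliwatts, so that the received power can be passed in dBm). These are assumptions of the example, not statements about the operational processing.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m s^-1


def db_to_linear(value_db: float) -> float:
    """Convert a power ratio from dB to linear units."""
    return 10.0 ** (value_db / 10.0)


def radar_constant(freq_hz, pulse_width_s, peak_power_mw, gain_dbi,
                   beamwidth_deg, system_loss_db=0.0, k_squared=0.93):
    """Radar constant R_c as in Eq. (3), with gain and loss given in dB."""
    wavelength_m = SPEED_OF_LIGHT / freq_hz
    gain = db_to_linear(gain_dbi)
    loss = db_to_linear(system_loss_db)
    beamwidth_rad = math.radians(beamwidth_deg)  # assumption: phi in radians
    numerator = 1024.0 * math.log(2.0) * wavelength_m ** 2 * 1.0e18 * loss
    denominator = (peak_power_mw * gain ** 2 * SPEED_OF_LIGHT * pulse_width_s
                   * math.pi ** 3 * beamwidth_rad ** 2 * k_squared)
    return numerator / denominator


def effective_reflectivity_dbz(received_power_dbm, range_m, r_c,
                               one_way_atm_loss_db=0.0):
    """Z_e = R_c * P_r * r^2 * L_atm^2, written in logarithmic form."""
    return (10.0 * math.log10(r_c) + received_power_dbm
            + 20.0 * math.log10(range_m) + 2.0 * one_way_atm_loss_db)


# Values quoted in the text: 35.5 GHz, 200 ns pulse, 27 kW peak power,
# 50.0 dBi antenna gain and 0.56 degree beamwidth; system losses left at 0 dB.
r_c = radar_constant(35.5e9, 200e-9, 27.0e6, 50.0, 0.56)
```

Because all losses enter $R_c$ as a simple factor, any attenuation that is missing from the budget translates one-to-one (in dB) into an underestimated effective reflectivity.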

Internal calibration
This section will discuss the internal calibration of the radar instrument and its characterization in the laboratory. The following section will then compare this budget approach in flight with an external reference source.
The monitoring of the system-specific parameters and the subsequent estimation of effective reflectivity are described in detail by Görsdorf et al. (2015). The internal calibration (budget calibration) strategy for HAMP MIRA is therefore only briefly summarized here. In case of a deviation, previously assumed and used parameters will be given and referred to as initial calibration for traceability of past radar measurements.
3.1 Antenna, radome and waveguides

- Antenna. The gain $G_a = 50.0$ dBi and the beam pattern (−3 dB beamwidth $\phi = 0.56°$) were determined by the manufacturer following the procedure described by Myagkov et al. (2015). For this purpose, the 1 m diameter Cassegrain antenna was installed on a pedestal to scan its pattern using a tower 400 m away. The antenna pattern showed no obvious asymmetries or increased side lobes (side lobe level: −22 dB). Its characterization revealed no significant differences in comparison with the initially estimated parameters ($G_a = 49.75$ dBi, $\phi = 0.6°$).

- Radome. The epoxy quartz radome in the belly pod was designed with a thickness of 4.53 mm to limit the one-way attenuation to around 0.5 dB. Deviations during manufacturing increased the thickness to 4.84 mm, with a one-way attenuation of around 1.5 dB. Laboratory measurements confirmed this 2.0 dB (2 × 1.0 dB) higher two-way attenuation compared to the initially used value for the radome attenuation. A detailed analysis of this deviation can be found in Appendix B.

- Waveguides. The initially used calibration did not account for the losses caused by the longer waveguides in the airplane installation. The transmitter and receiver waveguides each have a length of 1.15 m. With a specified attenuation of 0.65 dB m$^{-1}$, the two-way attenuation by waveguides is thus 1.5 dB.
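The arithmetic behind the radome and waveguide items can be written down in a few lines. This is only a bookkeeping sketch with hypothetical function names; it uses the numbers quoted above and does not include the receiver-related corrections discussed later.

```python
def two_way_waveguide_loss_db(length_m: float, attenuation_db_per_m: float) -> float:
    """Two-way loss of transmit and receive waveguides of equal length."""
    return 2.0 * length_m * attenuation_db_per_m


def two_way_radome_excess_db(actual_one_way_db: float, assumed_one_way_db: float) -> float:
    """Additional two-way radome loss relative to the initially assumed value."""
    return 2.0 * (actual_one_way_db - assumed_one_way_db)


waveguide_db = two_way_waveguide_loss_db(1.15, 0.65)   # lengths and attenuation from the text
radome_excess_db = two_way_radome_excess_db(1.5, 0.5)  # one-way attenuations from the text
unaccounted_db = waveguide_db + radome_excess_db       # missing from the initial budget
```

Both terms are losses that the initial calibration did not include, so they enter the correction of past reflectivities with a positive sign.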
3.2 Transmitted and received signal power

- Transmitter peak power $P_t$. Due to strong variations in ambient temperatures in the cabin, in-flight thermistor measurements proved to be unreliable. For this reason, thermally controlled measurements of $P_t$ were conducted on the ground, which were correlated with measured magnetron currents $I_m$. The relationship between both parameters then allowed $P_t$ to be derived from in-flight measurements of $I_m$. A detailed analysis of this relationship can be found in Appendix A.

- Finite receiver bandwidth loss $L_\mathrm{fb}$. The loss caused by a finite receiver bandwidth was discussed in detail by Doviak and Zrnić (1979). For a Gaussian receiver response, the finite receiver bandwidth loss $L_\mathrm{fb}$ can be estimated from the 6 dB filter bandwidth $B_6$ of the receiver and the duration $\tau_p$ of the pulse (Eq. 4). During the initial calibration, no correction of the finite receiver bandwidth loss was applied.
- Signal-to-noise ratio (SNR). For each sampled range, MIRA's digital receiver converts phase shifts of consecutive pulse trains (e.g., $N_P = 256$ pulses) into power spectra of Doppler velocities $v_i$ by a real-time fast Fourier transform (FFT). First, spectral densities $s_j(v_i)$ of multiple power spectra are averaged (e.g., $N_S = 20$ spectra) to enhance the signal-to-noise ratio. Subsequently, the averaged spectral densities $\bar{s}(v_i)$ in individual velocity bins are summed to yield a total received signal $S$ in each gate:

$$ S = \sum_i \bar{s}(v_i), \qquad \bar{s}(v_i) = \frac{1}{N_S}\sum_{j=1}^{N_S} s_j(v_i). \tag{5} $$

The receiver chain has no separate absolute power meter circuit. At the end of each pulse cycle, the receiver is switched to internal reference gates by a pin diode in front of the first amplifier. These two last gates are called the receiver noise gate and the calibration gate.

To obtain the received backscattered signal $S_r$ in atmospheric gates, one has to subtract the signal received in the noise gate $S_\mathrm{ng}$ from the total received signal $S$ since it contains both signal and noise:

$$ S_r = S - S_\mathrm{ng}. \tag{6} $$

A signal-to-noise ratio is then calculated by dividing the received backscattered signal $S_r$ in each atmospheric gate by the signal $S_\mathrm{ng}$ measured in the noise gate:

$$ \mathrm{SNR} = \frac{S_r}{S_\mathrm{ng}}. \tag{7} $$

The relative power of the calibration gate to the receiver noise gate is furthermore used to monitor the receiver sensitivity (for details see Sect. 3.3). The main advantage of this method is the simultaneous monitoring of the relative receiver sensitivity using the same circuitry that is used for atmospheric measurements. Furthermore, the determination of the receiver noise in a separate noise gate can prevent biases in SNR when the noise floor in atmospheric gates is obscured by aircraft motion or strong signals, both leading to a broadened Doppler spectrum.
Following Riddle et al. (2012), the minimum signal-to-noise ratio $\mathrm{SNR}_\mathrm{min}$ can be calculated in terms of $N_P$ and $N_S$ if the backscattered signal power is contained in a single Doppler velocity bin:

$$ \mathrm{SNR}_\mathrm{min} = \frac{Q}{N_P\,\sqrt{N_S}}. \tag{8} $$

Here, $Q = 7$ is a threshold factor between the received signal and the standard deviation of the noise signal.

In the absence of any turbulence- or motion-induced Doppler shift, the operational configuration yields an $\mathrm{SNR}_\mathrm{min}$ of −22.1 dB. As discussed in Mech et al. (2014), this minimum SNR can be larger by 9 dB due to a motion-induced broadening of the Doppler spectrum in the airborne configuration.
- Received signal power $P_r$. The SNR response of the receiver to an input power $P_r$ is described by a receiver transfer function $\mathrm{SNR} = T(P_r)$. When $T$ is known, an unknown received signal power $P_r$ can be derived from a measured SNR by the inversion $T^{-1}$:

$$ P_r = T^{-1}(\mathrm{SNR}). \tag{9} $$

- Receiver sensitivity $P_n$. For a linear receiver, $T^{-1}$ can be approximated by a signal-independent receiver sensitivity $P_n$, which translates a measured SNR to an absolute signal power $P_r$ in dBm:

$$ P_r\,(\mathrm{dBm}) = \mathrm{SNR}\,(\mathrm{dB}) + P_n\,(\mathrm{dBm}). \tag{10} $$

More specifically, $P_n$ can be interpreted as an overall receiver noise power and is thus equal to the power of the smallest measurable white signal. It includes the inherent thermal noise within the receiver response, the overall noise figure of the receiver and mixer circuitry and all losses occurring between ADC and receiver input.

Figure 2. Total received signal $S$ in digital numbers as a function of gate number with external noise source switched on ($S^*_\mathrm{on}$, red) and switched off ($S^*_\mathrm{off}$, green). The two last gates monitor the signals $S_\mathrm{ng}$ and $S_\mathrm{cal}$, which correspond to the receiver noise and the internal calibration source. The factors $c_1^*$, $c_2$ and $c_3^*$ correct the estimated noise power $P_n^\dagger$ to reflect the actual receiver sensitivity $P_n$. Signal levels obtained only during the calibration with the external noise source are marked with an asterisk.
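The processing chain described in this subsection (averaging and summing spectra, subtracting the noise-gate signal, forming the SNR and converting it to an absolute power) can be sketched in a few functions. The function names are hypothetical. The expression used for the minimum SNR, Q / (N_P · √N_S), is an assumed form that reproduces the −22.1 dB quoted above for Q = 7, N_P = 256 and N_S = 20.

```python
import math


def total_signal(spectra):
    """Average several power spectra bin by bin, then sum over velocity bins."""
    n_spectra = len(spectra)
    n_bins = len(spectra[0])
    averaged = [sum(spec[i] for spec in spectra) / n_spectra for i in range(n_bins)]
    return sum(averaged)


def snr_linear(total_gate_signal, noise_gate_signal):
    """Subtract the noise-gate signal from the total signal and normalize by it."""
    backscattered = total_gate_signal - noise_gate_signal
    return backscattered / noise_gate_signal


def snr_min_db(n_pulses, n_spectra, q=7.0):
    """Minimum detectable SNR (dB) for a signal confined to one Doppler bin."""
    return 10.0 * math.log10(q / (n_pulses * math.sqrt(n_spectra)))


def received_power_dbm(snr_db, receiver_sensitivity_dbm):
    """Linear-receiver approximation: absolute power = SNR (dB) + P_n (dBm)."""
    return snr_db + receiver_sensitivity_dbm
```

For example, `snr_min_db(256, 20)` gives the detection threshold for the operational configuration, and `received_power_dbm` shows why any error in the receiver sensitivity propagates directly into the derived signal power.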

Estimated receiver sensitivity
Prior to this work, no rigorous determination of the receiver transfer function $T$ was performed. During the initial calibration, the receiver sensitivity $P_n$ was instead estimated using the inherent thermal noise and the receiver's own noise characteristic.
Generated by thermal electrons, the inherent thermal noise $P_{kTB}$ received by a matched receiver can be derived using Boltzmann's constant $k_B$, temperature $T_0$ and the noise bandwidth $B_n$ of the receiver. Additional noise power is introduced by the electronic circuitry itself, which is considered by the receiver noise factor $F_n$. The noise factor $F_n$ expressed in decibels (dB) is called noise figure NF. Combined, $P_{kTB}$ and $F_n$ yield the total inherent noise power $P_n^\dagger$:

$$ P_n^\dagger = P_{kTB}\,F_n = k_B\,T_0\,B_n\,F_n. \tag{12} $$

Using a calibrated external noise source with known excess noise ratio (ENR), $F_n$ was determined in the laboratory. In flight, $F_n$ is monitored using the calibration and the noise gate.
In the following, measurements obtained with the calibrated external noise source in the laboratory are marked with an asterisk. Signal levels measured in flight as well as during the calibration are marked without an asterisk. Figure 2 shows the external noise source measurements, where the received signal $S$ is plotted as a function of the gate number. While connected to the receiver input, the external noise source is switched on and off with signals $S^*_\mathrm{on}(r)$ (red) and $S^*_\mathrm{off}(r)$ (green) measured in atmospheric gates. Corresponding to this, $S^*_\mathrm{cal}$ and $S^*_\mathrm{ng}$ are the signals measured in the two last gates, namely the calibration and the receiver noise gates. The in-flight signals in these two reference gates are denoted by $S_\mathrm{cal}$ and $S_\mathrm{ng}$.
Using the so-called Y-factor method (Agilent Technologies, 2004), the averaged noise floor ratio $Y$ in atmospheric gates between the external noise source being switched on and off and the ENR of the external noise source are used to determine the noise factor $F_n$ created by the receiver components:

$$ Y = \frac{\overline{S^*_\mathrm{on}(r)}}{\overline{S^*_\mathrm{off}(r)}}, \qquad F_n = \frac{\mathrm{ENR}}{Y - 1}. $$

Here, the averaging $\overline{S^*_\mathrm{on}(r)}$ of atmospheric gates is done for gate numbers larger than 10 to exclude the attenuation caused by the transmit-receive switch immediately after the magnetron pulse. During the initial calibration with the external noise source, a noise figure NF = 8.8 dB was determined.
Summarizing the above considerations, an overall receiver sensitivity $P_n^\dagger$ is estimated using an assumed receiver noise bandwidth of 5 MHz, a receiver temperature of 290 K and a noise figure of NF = 8.8 dB. According to Eq. (12), the estimated receiver sensitivity used in the initial calibration is $P_n^\dagger = -98.2$ dBm.

The measurements with the external noise source are furthermore exploited to correct for various effects which cause deviations between the inherent noise power $P_n^\dagger$ in the noise gate and the actual receiver sensitivity $P_n$. Three correction factors, $c_1^*$, $c_2$ and $c_3^*$, are used for this purpose (Fig. 2):

- First, $c_1^*$ accounts for the attenuation in atmospheric gates which is caused by the transmit-receive switch immediately after the magnetron pulse. As evident in Fig. 2, $c_1^*$ is only significant in the first eight range gates (= 240 m) and rapidly converges to 0 dB in the remaining atmospheric and reference gates.

- Secondly, the correction factor $c_2$ is used to monitor and correct in-flight drifts of the receiver sensitivity. To this end, the ratio $S^*_\mathrm{cal}/S^*_\mathrm{ng}$ measured during calibration between calibration and noise gate is compared to the ratio $S_\mathrm{cal}/S_\mathrm{ng}$ during flight. In the course of one flight of several hours, $c_2$ varies only slightly by ±0.5 dB. Continuous observation of $c_2$ should be performed to keep track of the receiver sensitivity.

- A further factor accounts for the fact that the noise level measured in the noise gate is lower than the total system noise with matched load because the low-noise amplifier is not matched during the noise gate measurement. The $\mathrm{SNR}^\dagger$ determined with the noise gate level therefore overestimates the actual SNR in atmospheric gates. In Fig. 2, this offset is called $c_3^*$. Its value is determined by comparing the signal $S^*_\mathrm{ng}$ in the noise gate with the signal $S^*_\mathrm{off}$ in atmospheric gates, while the external noise source is switched off. This offset between noise gate and total system noise remains very stable with $c_3^* = -0.83$ dB. Since $c_3^*$ exists only in earlier MIRA-35 systems (without MicroBlaze processor, e.g. KIT, UFS, HALO and Lindenberg), most MIRA-35 operators do not have to address this issue.
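The estimate of the receiver sensitivity from thermal noise and noise figure, together with the Y-factor step, can be sketched as follows. The function names are hypothetical. The Y-factor relation F = ENR / (Y − 1) is the standard textbook form with all ratios in linear units and a reference temperature of 290 K; it is assumed here to be the variant applied, and the correction factors discussed above are deliberately left out.

```python
import math

BOLTZMANN = 1.380649e-23  # J K^-1


def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power k_B * T_0 * B_n, expressed in dBm."""
    return 10.0 * math.log10(BOLTZMANN * temperature_k * bandwidth_hz * 1.0e3)


def noise_figure_db(enr_db, y_db):
    """Y-factor method: F_n = ENR / (Y - 1), evaluated in linear units."""
    enr = 10.0 ** (enr_db / 10.0)
    y = 10.0 ** (y_db / 10.0)
    return 10.0 * math.log10(enr / (y - 1.0))


def estimated_sensitivity_dbm(bandwidth_hz, noise_figure, temperature_k=290.0):
    """Estimated receiver sensitivity: thermal noise (dBm) plus noise figure (dB)."""
    return thermal_noise_dbm(bandwidth_hz, temperature_k) + noise_figure


# Initial calibration as described in the text: 5 MHz, 290 K, NF = 8.8 dB
initial_estimate_dbm = estimated_sensitivity_dbm(5.0e6, 8.8)
```

Evaluating the same function with a wider noise bandwidth or a higher noise figure shows directly how both quantities raise the noise power that the SNR is referenced to.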

Measured receiver sensitivity
A key component of this work was to replace the estimated receiver sensitivity $P_n^\dagger$ with an actual measured value $P_n$. While $P_n^\dagger$ was calculated using an assumed receiver noise bandwidth and the receiver noise factor, $P_n$ is now measured directly using a calibrated signal generator with adjustable power and frequency output. By varying the power at the receiver input, $P_n$ is found as the noise-equivalent signal when SNR = 0 dB. In addition, the receiver response and its bandwidth are determined by varying the frequency of the signal generator. Both measurements are then used to evaluate and check $F_n$ according to Eq. (12). This is done for two different matched filter lengths ($\tau_r = 100$ ns, $\tau_r = 200$ ns) to characterize the dependence of $B_n$ and $F_n$ on $\tau_r$.
To this end, an analog continuous wave signal generator E8257D from Agilent Technologies was used to determine the receiver's spectral response and its power transfer function $T$. The signal generator was connected to the antenna port of the radar receiver and tuned to 35.5 GHz, the central frequency of the local oscillator. For the characterization, the radar receiver was set into standard airborne operation mode. In this mode, 256 samples are averaged coherently into power spectra by FFT. Subsequently, 20 power spectra are averaged to obtain a smoothed power spectrum for each second.

Receiver bandwidth
To determine the spectral response, the frequency sweep mode of the signal generator with a fixed signal amplitude was used within a region of 35 500 ± 20 MHz. Figure 3 shows measured signal-to-noise ratios as a function of the frequency offset from the center frequency at 35.5 GHz, for the shorter matched filter length ($\tau_r = 100$ ns) on the left and for the longer matched filter length ($\tau_r = 200$ ns) on the right. The spectral response of the receiver for both matched filters (black lines) approaches a Gaussian fit (crosses). To estimate the finite receiver bandwidth loss $L_\mathrm{fb}$ using Eq. (4), the 6 dB filter bandwidth (two-sided arrow) is determined directly from the receiver response with $B_6 = 9.8$ MHz for $\tau_r = 200$ ns and $B_6 = 17.2$ MHz for $\tau_r = 100$ ns.
In the following, the equivalent noise bandwidth (ENBW) concept is used to determine the receiver noise bandwidth $B_n$ which is needed to calculate $P_n$. In short, the ENBW is the bandwidth of a rectangular filter with the same received power as the actual receiver. Illustrated by the green and blue hatched rectangles in Fig. 3, the measured ENBW is $B_{n,200} = 7.5$ MHz for the longer matched filter length and $B_{n,100} = 13.5$ MHz for the shorter matched filter length. In contrast, the red-hatched rectangles show the estimated 5 MHz (and 10 MHz) receiver noise bandwidth using $1/\tau_r$. The discrepancy between the measured and the estimated noise bandwidth could be traced back to an additional window function which was applied unintentionally to IQ data within the digital signal processor. This issue led to slightly more thermal noise power $P_{kTB}$. For the operationally used matched filter ($\tau_r = 200$ ns), the offset between estimated and actual thermal noise power (−106.9 vs. −105.2 dBm) led to a 1.8 dB underestimation of $Z_e$. Future measurements will not include this bias since this issue was found and fixed.
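The ENBW concept can be illustrated numerically. The sketch below integrates a sampled power response with the trapezoidal rule and divides by its peak. Since the measured response is not available here, a synthetic, purely Gaussian response with the quoted 6 dB bandwidth of 9.8 MHz stands in for it; the function names, the sweep step of 10 kHz and the Gaussian shape are assumptions of this example.

```python
import math


def equivalent_noise_bandwidth(freqs_hz, power_response):
    """ENBW: integral of the power response divided by its peak (trapezoidal rule)."""
    area = 0.0
    for k in range(1, len(freqs_hz)):
        df = freqs_hz[k] - freqs_hz[k - 1]
        area += 0.5 * (power_response[k] + power_response[k - 1]) * df
    return area / max(power_response)


def gaussian_power_response(freq_offset_hz, b6_hz):
    """Gaussian power response that is 6 dB down at +/- b6_hz / 2."""
    return math.exp(-8.0 * math.log(2.0) * (freq_offset_hz / b6_hz) ** 2)


# Synthetic sweep of +/- 20 MHz around the center frequency in 10 kHz steps
offsets = [(-20.0 + 0.01 * k) * 1.0e6 for k in range(4001)]
response = [gaussian_power_response(f, 9.8e6) for f in offsets]
enbw_hz = equivalent_noise_bandwidth(offsets, response)

# Extra thermal noise from the measured (7.5 MHz) vs. the assumed (5 MHz) bandwidth
noise_increase_db = 10.0 * math.log10(7.5e6 / 5.0e6)
```

For the idealized Gaussian shape the ENBW comes out at roughly 0.75 times the 6 dB bandwidth, i.e. about 7.4 MHz, which is close to but not identical with the measured 7.5 MHz, as expected for a response that is only approximately Gaussian. The last line reproduces the roughly 1.8 dB difference in thermal noise power between the assumed and the measured bandwidth.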

Receiver transfer function
Next, the amplitude ramp mode of the signal generator was used to determine the transfer function P r = T (SNR) of the receiver. The receiver transfer function references absolute signal powers at the antenna port with corresponding SNR values measured by the receiver. Moreover, the linearity and cut-offs of the receiver can be assessed on the basis of the transfer function. For this measurement, the frequency of the signal generator was set to 35.5 GHz, while the output power of the generator was increased steadily from −110 to 10 dBm. This was done in steps of 1 dB while averaging over 10 power spectra. In order to test the linearity and the saturation behavior of the receiver for strong signals, these measurements were repeated with an internal attenuator set to 15 and 30 dB. For τ r = 200 ns, Fig. 4a shows the measured receiver transfer functions for the three attenuator settings of 0 dB (black), 15 dB (green) and 30 dB (red). For measurements with an activated attenuator, SNR values have been corrected by +15 dB (respectively +30 dB) to compare the transfer functions to the one with 0 dB attenuation. The overlap of the different transfer functions between input powers of −70 and −30 dBm in Fig. 4a confirms the specified attenuator values of 15 and 30 dB. Furthermore, no saturation by additional receiver components (e.g., mixers or filters) can be detected up to an input power of −5 dBm. This allows the dynamic range to be shifted by using the attenuator to measure higher input powers (which would otherwise be saturated) without losing the absolute calibration. This feature is essential for the evaluation of very strong signals like the ground return.
Subsequently, a linear regression to the results without an attenuator was performed between input powers of −70 and −40 dBm, which is shown in Fig. 4b.
With a slope m of 1.0009 (±0.0006) and a residual of 0.054 dB, the receiver behaved very linearly for this input power region. Similar values were obtained with the attenuator activated: a slope of 0.9980 (±0.0005) and a residual of 0.024 dB for an attenuation of 15 dB, and a slope of 0.9884 (±0.0013) and a residual of 0.1 dB for an attenuation of 30 dB.
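As an illustration of how slope and residual of such a transfer function can be obtained, the following sketch performs an ordinary least-squares fit in decibel space. The data are synthetic and noise-free, the residual is taken to be the root-mean-square deviation from the fit (an assumption, since the text does not define it), and the function name is hypothetical.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept, RMS residual."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rms = math.sqrt(sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y)) / n)
    return slope, intercept, rms

# Synthetic illustration only: an ideal receiver with P_n = -95.3 dBm,
# sampled in 1 dB steps between -70 and -40 dBm input power.
p_in_dbm = [float(p) for p in range(-70, -39)]
snr_db = [p + 95.3 for p in p_in_dbm]

slope, intercept, rms_db = linear_fit(p_in_dbm, snr_db)
print(f"slope = {slope:.4f}, intercept = {intercept:.1f} dB, residual = {rms_db:.3f} dB")
```

With real measurements, a slope close to 1 and a small residual indicate that a change of 1 dB at the antenna port is reproduced as a change of 1 dB in the measured SNR.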
As discussed before, the setting with the shorter matched filter length collects more thermal noise due to the larger receiver bandwidth. In a final step, this top-down approach to obtain P n for different τ r can be used to determine F n and check for its dependence on τ r . By solving Eq. (12) for F n and inserting the measured bandwidths B 100 and B 200 we obtain the following:

$$F_n = -92.7\,\mathrm{dBm} + 102.6\,\mathrm{dBm} = 9.9\,\mathrm{dB} \quad (\tau_r = 100\,\mathrm{ns}),$$

$$F_n = -95.3\,\mathrm{dBm} + 105.2\,\mathrm{dBm} = 9.9\,\mathrm{dB} \quad (\tau_r = 200\,\mathrm{ns}).$$
Remarkably, F n shows no dependence on τ r but turns out to be larger than previously estimated by 1.1 dB. This previous underestimation of F n led to a 1.1 dB underestimation of Z e . Now, all system parameters are known to estimate the radar sensitivity at a particular range. Following Doviak and Zrnić (2006), the minimum detectable effective reflectivity Z min (r) at a particular range can be calculated using Eq. (2) in decibels:

$$Z_{\min}(r) = \mathrm{MDS} + 20\log_{10} r + \log_{10} R_c .$$
Here, the minimum detectable signal (MDS) in dBm is the smallest detectable receiver power P r , which is given by P n SNR min using Eq. (11). In the operational configuration (Q = 7, N P = 256, N S = 20), the MDS is −117.4 dBm, since SNR min = −22.1 dB and P n = −95.3 dBm. The parameters listed in Table 1 yield a range-independent radar constant of R c = 3.9 dB. Using the MDS and R c in Eq. (26), the minimum detectable effective reflectivity at a range of 5 km is Z min (5 km) = −39.8 dBZ.
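The noise budget above can be retraced with a few lines of arithmetic. The sketch below is a hedged illustration: the reference temperature of 290 K is an assumption, and the expression used for SNR min , Q/(N P √N S ), is an assumed form that happens to reproduce the quoted −22.1 dB; Eq. (11) itself is not repeated in this section. All function names are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23

def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power kTB in dBm (290 K reference temperature is an assumption)."""
    return 10.0 * math.log10(BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz / 1e-3)

def snr_min_db(q, n_fft, n_spectra):
    """Assumed form of the detection threshold: Q / (N_P * sqrt(N_S)), in dB."""
    return 10.0 * math.log10(q / (n_fft * math.sqrt(n_spectra)))

p_n_dbm = -95.3                                   # measured receiver noise power (tau_r = 200 ns)
p_ktb_dbm = thermal_noise_dbm(7.5e6)              # measured noise bandwidth B_n,200
noise_figure_db = p_n_dbm - p_ktb_dbm             # F_n in dB
mds_dbm = p_n_dbm + snr_min_db(7, 256, 20)        # minimum detectable signal

print(f"P_kTB = {p_ktb_dbm:.1f} dBm")
print(f"F_n   = {noise_figure_db:.1f} dB")
print(f"MDS   = {mds_dbm:.1f} dBm")
```

Under these assumptions the values agree with the quoted P kTB , F n and MDS to within about 0.1 dB.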

Overall calibration budget
Comparing the measured P n = −95.3 dBm to the estimated P † n = −98.2 dBm for τ r = 200 ns, the combination of bandwidth bias (1.8 dB) and larger noise figure (1.1 dB) caused a 2.9 dB underestimation of Z e . Together with the previously disregarded 2.0 dB higher two-way attenuation by the radome, the 1.5 dB higher two-way attenuation by the waveguides and the finite receiver bandwidth loss L fb of 1.2 dB, effective reflectivities derived with the initial calibration have to be corrected by +7.6 dB. Table 2 summarizes and breaks down all offsets found in this work.

The following section will now test the absolute calibration using an external reference target. As already mentioned in the introduction, the ocean surface has been used as a calibration standard for air- and spaceborne radar instruments. In their studies, Barrick et al. (1974) and Valenzuela (1978) reviewed and harmonized theories to describe the interaction of electromagnetic waves with the ocean surface. They showed that the normalized radar cross section σ 0 of the ocean surface at small incidence angles (< 15°) can be described by quasi-specular scattering theory. At larger incidence angles (> 15°), Bragg scattering at capillary waves becomes dominant, which complicates and enhances the backscattering of microwaves by ocean waves.
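Returning briefly to the calibration budget, the individual offsets quoted there can be summed in a few lines as a cross-check. The dictionary keys are descriptive labels chosen for this sketch, not names from Table 2.

```python
# Offsets in dB as quoted in the calibration budget; each one raises Z_e
# relative to the initial calibration.
offsets_db = {
    "receiver noise bandwidth": 1.8,
    "receiver noise figure": 1.1,
    "radome two-way attenuation": 2.0,
    "waveguide two-way attenuation": 1.5,
    "finite receiver bandwidth loss": 1.2,
}

receiver_noise_db = offsets_db["receiver noise bandwidth"] + offsets_db["receiver noise figure"]
total_db = sum(offsets_db.values())

print(f"receiver noise power offset: {receiver_noise_db:.1f} dB")
print(f"total correction:            {total_db:.1f} dB")
```

The two sums reproduce the 2.9 dB receiver noise offset and the overall +7.6 dB correction.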

Modeling the normalized radar cross section of the ocean surface
At the scales of millimeter waves and for small incidence angles θ , the ocean surface slope distribution is assumed to be Gaussian and isotropic, where the surface mean square slope s(v) is a sole function of the wind speed v and independent of wind direction. For backscatter from ocean surface facets that are aligned normal to the incident waves (Plant, 2002), the normalized radar cross section σ 0 can be described as a function of ocean surface wind speed v and beam incidence angle θ (Valenzuela, 1978; Brown, 1990; Z. Li et al., 2005). For the ocean surface facets at normal incidence, the reflection of microwaves is described by an effective Fresnel reflection coefficient Γ e (0, λ) = C e [n(λ) − 1]/[n(λ) + 1]. In this study, the complex refractive index n(λ = 8.8 mm) = 5.565 + 2.870i for seawater at 25 °C is used following the model by Klein and Swift (1977). As with other models (Ray, 1972; Meissner and Wentz, 2004), the impact of salinity on σ 0 is negligible, while the influence of the ocean surface temperature on σ 0 stays below Δσ 0 = 0.5 dB between 5 and 30 °C. Since specular reflection is only valid in the absence of surface roughness, various studies (Wu, 1990; Jackson et al., 1992; Freilich and Vanhoff, 2003) included a correction factor C e to describe the reflection of microwaves on wind-roughened water facets. While C e has been well characterized for the Ku band (Apel, 1994; Freilich and Vanhoff, 2003) and W band (Horie et al., 2004), experimental results valid for the Ka band are scarce (Nouguier et al., 2016). Tanelli et al. (2006) used simultaneous measurements of σ 0 in the Ku and Ka bands to determine |Γ e (0, λ = 8.8 mm)|² = 0.455 for the Ka band, which corresponds to a correction factor C e of 0.90. However, there is an ongoing discussion about the influence of radar wavelength or wind speed on C e (Jackson et al., 1992; Tanelli et al., 2008). Chen et al. (2000) explain this disagreement with the different surface mean square slope statistics used in these studies, which do not include ocean surface roughness at the millimeter scale. To include this uncertainty in this study, the correction factor C e has been varied between 0.85 and 0.95, while the simple model (CM) for non-slick ocean surfaces by Cox and Munk (1954) was used for s(v). In their model, the surface mean square slope s(v) scales linearly with wind speed v, describing a smooth ocean surface including gravity and capillary waves.
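The model chain described above (Fresnel coefficient, correction factor, mean square slope, quasi-specular backscatter) can be sketched as follows. Two parts are assumptions and may differ in detail from the equations of this study: the quasi-specular expression is written in the form commonly given in the cited literature, and the Cox and Munk coefficients (0.003 and 5.12 × 10⁻³ s m⁻¹) are the commonly quoted clean-surface values. Function names are hypothetical.

```python
import math

N_SEAWATER = complex(5.565, 2.870)   # refractive index at 8.8 mm and 25 degC (from the text)

def fresnel_power(n, c_e):
    """|Gamma_e(0)|^2 with Gamma_e = C_e * (n - 1) / (n + 1)."""
    return abs(c_e * (n - 1) / (n + 1)) ** 2

def mean_square_slope(wind_speed_ms):
    """Cox and Munk (1954) linear model; coefficients are assumed, commonly quoted values."""
    return 0.003 + 5.12e-3 * wind_speed_ms

def sigma0_db(theta_deg, wind_speed_ms, c_e=0.90):
    """Quasi-specular sigma_0 in dB (assumed standard form of the model)."""
    theta = math.radians(theta_deg)
    mss = mean_square_slope(wind_speed_ms)
    sigma0 = (fresnel_power(N_SEAWATER, c_e) / (mss * math.cos(theta) ** 4)
              * math.exp(-math.tan(theta) ** 2 / mss))
    return 10.0 * math.log10(sigma0)

print(f"|Gamma_e|^2 for C_e = 0.90: {fresnel_power(N_SEAWATER, 0.90):.3f}")
for theta_deg in (0.0, 10.0, 20.0):
    low, high = sigma0_db(theta_deg, 5.7, 0.85), sigma0_db(theta_deg, 5.7, 0.95)
    print(f"theta = {theta_deg:4.1f} deg: sigma0 between {low:.1f} and {high:.1f} dB")
```

With C e = 0.90 the sketch yields |Γ e |² close to the 0.455 reported by Tanelli et al. (2006), and varying C e between 0.85 and 0.95 shifts σ 0 by roughly ±0.5 dB independent of incidence angle.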

Measuring the normalized radar cross section of the ocean surface
The ocean surface backscatter is also measured by the Global Precipitation Measurement (GPM; Hou et al., 2013) platform which carries a Ku- and Ka-band dual-frequency precipitation radar (DPR). For this study, σ * 0 from GPM is used as an independent source to support the calculated σ 0 from the model. Operating at 35.5 GHz, the KaPR scans the surface backscatter with its 0.7° beamwidth phased array antenna, resulting in a 120 km swath of 5 km × 5 km footprints. The measured ocean surface backscatter by GPM is operationally used to retrieve surface wind conditions and path-integrated attenuation of the radar beam. In the following, the σ * 0 corrected for gaseous attenuation from GPM was used, which corresponds to the co-localized matched swath of the KaPR.
During the second Next-Generation Remote Sensing for Validation Studies (NARVAL2) in June-August 2016, HAMP MIRA was deployed on HALO. The campaign was focused on the remote sensing of organized convection over the tropical North Atlantic Ocean in the vicinity of Barbados. Another campaign objective was the integration and validation of the new remote sensing instruments on board the HALO aircraft. For the HAMP MIRA cloud radar, multiple roll and circle maneuvers at different incidence angles were included in research flights to implement the well-established calibration technique to measure the normalized radar cross section of the ocean surface at different incidence angles.
During NARVAL2, HAMP MIRA was installed in the belly pod section of HALO and aligned in a fixed nadir-pointing configuration with respect to the airframe. The incidence angle is therefore controlled by pitch-and-roll maneuvers of the aircraft. The aircraft position and attitude are provided at a 10 Hz rate by the BAsic HALO Measurement And Sensor System (BAHAMAS; Krautstrunk and Giez, 2012). Pitch, roll and yaw angles are provided with an accuracy of 0.05°, while the absolute uncertainty can be up to 0.1°. Additional incidence angle uncertainty is caused by uncertainties in the alignment of the radar antenna. Following the approach of Haimov and Rodi (2013), the apparent Doppler velocity of the ground was used to determine the antenna beam-pointing vector. With this technique, the offsets from nadir with respect to the airframe were determined to be 0.5° to the left in the roll direction and 0.05° forward in the pitch direction.
During calibration patterns, HALO flew at 9.7 km altitude with a ground speed of 180 to 200 m s −1 . The pulse repetition frequency was kept at 6 kHz with a pulse length of τ p = 200 ns. For the purpose of calibration, the data processing and averaging was set to 1 Hz, being the standard campaign setting with Doppler spectra averaged from 20 FFTs, which each contain 256 pulses. As a consequence of this configuration, the ocean surface backscatter at nadir was sampled in gates measuring approx. 100 m in the horizontal and 30 m in the vertical. With this gate geometry, a uniform beam-filling of the ocean surface is ensured for incidence angles below 20°.
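The gate geometry quoted above follows from a few basic relations, sketched below. Reading the horizontal extent as the −3 dB beam footprint at the surface is one plausible interpretation and an assumption of this sketch; the beamwidth of 0.56° is taken from the antenna characterization reported in this study. Function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0

def range_gate_m(pulse_length_s):
    """Range resolution of a pulsed radar: c * tau / 2."""
    return SPEED_OF_LIGHT_M_S * pulse_length_s / 2.0

def beam_footprint_m(altitude_m, beamwidth_deg):
    """Diameter of the -3 dB beam footprint at nadir (assumed reading of 'horizontal')."""
    return 2.0 * altitude_m * math.tan(math.radians(beamwidth_deg) / 2.0)

def dwell_time_s(n_fft, n_spectra, prf_hz):
    """Time needed to collect all pulses that enter one averaged spectrum."""
    return n_fft * n_spectra / prf_hz

gate_m = range_gate_m(200e-9)                 # tau_p = 200 ns
footprint_m = beam_footprint_m(9700.0, 0.56)  # 9.7 km altitude, 0.56 deg beamwidth
dwell_s = dwell_time_s(256, 20, 6000.0)       # 20 FFTs of 256 pulses at 6 kHz

print(f"vertical gate length: {gate_m:.1f} m")
print(f"beam footprint:       {footprint_m:.1f} m")
print(f"dwell time:           {dwell_s:.3f} s")
```

The results, about 30 m in range and about 95 m across the beam, are consistent with the stated gate dimensions, and the 5120 pulses fit into the 1 s averaging interval.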
In the current configuration, the point target spread function of the matched receiver is under-sampled since the sampling is matched to the gate length. Thus, the maximum of the ocean backscatter can become underestimated when the surface is located between two gates. At nadir incidence, negative biases in σ 0 of up to 3-4 dB were observed in earlier measurement campaigns when the gate spacing was equal to or larger than the pulse length (Caylor et al., 1997). For this reason, the received powers from the range gates below and above were added to the received power of the strongest surface echo. By adding the power from only three gates, Caylor et al. (1997) could reduce the uncertainty in σ 0 to 1 dB and exclude the contribution by antenna side-lobes from larger ranges.
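The three-gate summation can be sketched as follows. The gate powers are invented illustration values in linear units, and the function name is hypothetical; the operational processing may handle edge cases differently.

```python
import math

def surface_power_three_gates(power_lin):
    """Sum the strongest gate and its two neighbors (linear power units)."""
    peak = max(range(len(power_lin)), key=power_lin.__getitem__)
    first = max(peak - 1, 0)
    last = min(peak + 1, len(power_lin) - 1)
    return sum(power_lin[first:last + 1])

# Illustration: the surface lies between two gates, so the echo power is
# split almost evenly and the single strongest gate sees only about half of it.
gates_lin = [0.01, 0.5, 0.5, 0.01]

peak_only = max(gates_lin)
summed = surface_power_three_gates(gates_lin)
recovered_db = 10.0 * math.log10(summed / peak_only)

print(f"power recovered by the summation: {recovered_db:.1f} dB")
```

In this worst-case split, the summation recovers about 3 dB relative to the strongest gate alone, which is the order of magnitude of the bias reported by Caylor et al. (1997).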
Furthermore, the backscattered signal was corrected for gaseous attenuation by oxygen and water vapor considered in the loss factor L atm . While the two-way attenuation by oxygen and water vapor is normally almost negligible in the Ka band, it has to be considered in subtropical regions with high humidity and temperature near the surface. To this end, the gaseous absorption model for millimeter waves by Rosenkranz (1998) was applied to soundings from dropsondes, which were launched from HALO during the calibration maneuvers.
Following L. , the measured normalized cross section σ * 0 of the ocean surface can be calculated from measured signal-to-noise ratios:

$$\sigma_0^{*} = \frac{c\,\pi^5\,\tau_p\,R_c\,r^2\,L_{\mathrm{atm}}^2}{2\,\lambda^4\,10^{18}}\,P_n\,\mathrm{SNR}.$$

Here, the receiver power P r was replaced with P n SNR (Eq. 10) to include the overall receiver sensitivity P n in the formulation of σ * 0 . As in Eq. (2), R c is the radar constant (with |K|² = 1), which includes the transmitter power P t , transmitting and receiving waveguide loss L tx and L rx , attenuation by the belly pod L bp and the antenna gain G a . Together with P n , the combination of these system parameters is checked in the following section, when the measured σ * 0 is compared to the modeled σ 0 . For the following analysis, σ * 0 was obtained with Eq. (30) using 1 Hz averaged SNR from HAMP MIRA.

Comparison of measurements and model
The HAMP MIRA calibration maneuver during NARVAL2 was included in research flight RF03 on 12 August 2016. The flight took place 700 km east of Barbados in a region of a relatively pronounced dry intrusion with light winds and very little cloudiness. Figure 5 shows the flight track in orange with a true-color image taken during that time by the geostationary SEVIRI instrument. The superimposed color map shows σ * 0 from GPM in the vicinity of the operating area for that day. Here, the satellite nadir is located in the center of each track, with incidence angles θ > 0 left and right towards the edges of the swath. σ * 0 appears spatially quite homogeneous where the ocean surface is only covered by small marine cumulus clouds. The first waypoint was chosen to be collocated with a meteorological buoy (14.559° N, 53.073° W, NDBC 41040) to obtain accurate wind speed and direction at the level of the ocean surface as well as wave heights measured by the buoy. At 12:50 UTC, the buoy measured a wind speed of 5.7 m s −1 from 98° with a mean wave height of 1 m and mean wave direction of 69°. A detailed overview of the flight path during the calibration maneuver is shown in Fig. 6, where the beam incidence angle θ is shown by the color map. At 12:40 UTC, the aircraft executed a set of ±20° roll maneuvers to sample σ * 0 in the cross-wind direction. At 12:44 UTC, the aircraft entered a right-hand turn with a constant roll angle of 10°, the incidence angle for which σ * 0 becomes insensitive to surface wind conditions and models. After a full turn at 12:58 UTC, another set of ±20° roll maneuvers was executed to sample σ * 0 in the along-wind direction. The dropsonde was launched around 13:06 UTC at 12.98° N and 52.78° W. A two-way attenuation by water vapor and oxygen absorption L 2 atm of 0.78 dB was calculated using the dropsonde sounding.
With an approximate distance of 700 km, the GPM measurement closest to the calibration area was made at 10:46:29 UTC at 13.67° N and 59.53° W. To obtain a representative σ * 0 measurement from GPM, the swath data were averaged along-track for 10 s.
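The quoted separation between the GPM measurement and the calibration area can be checked with a great-circle calculation. The sketch below uses the buoy position as a stand-in for the calibration area and a spherical Earth with a mean radius of 6371 km; both are simplifying assumptions, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Buoy NDBC 41040 (first waypoint) and the closest GPM measurement
distance_km = haversine_km(14.559, -53.073, 13.67, -59.53)
print(f"separation: {distance_km:.0f} km")
```

The result is close to the approximate 700 km stated in the text.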
The measurement of σ * 0 during the across-wind roll maneuver is shown in Fig. 7a. The blue circles mark the corresponding GPM measurements. For the HAMP MIRA data, σ * 0 was calculated using the old, estimated calibration (red dots) and the new, measured calibration (green dots). In order to assess the agreement of σ * 0 with σ 0 , the CM model for σ 0 (Eq. 28) was first calculated using the measured wind speed from the buoy. These values are shown by the black line in Fig. 7a, together with the uncertainty in σ 0 due to the uncertainty in C e (0.85 . . . 0.95). Both modeled and measured σ 0 show the exponential falloff with θ corresponding to the smaller mean square slope of the ocean surface with increasing θ . In a second step, the CM model was fitted to σ * 0 from the old (red line) and new (green line) calibration to obtain the wind speed v. Here, a potential calibration offset Δσ 0 was considered as a second fitting parameter. The following analysis is valid for the turn maneuver; differences between across-wind roll, turn and along-wind roll maneuver are discussed with Fig. 7b in the following paragraph. For old and new calibration, the fitted wind speed of 5.71 m s −1 agrees very well with the actual measured wind speed of 5.7 m s −1 . While σ * 0 for the old calibration shows a strong underestimation of σ 0 by Δσ 0 = −7.8 dB, the fit for the new calibration only marginally underestimates σ 0 with Δσ 0 = −0.2 dB, well within the uncertainty of σ 0 . Thus, the initial calibration yields 7.6 dB smaller values for σ * 0 when compared to the new calibration that is in good agreement with the modeled values. This observed difference also matches precisely with the 7.6 dB difference determined during the absolute calibration in Sect. 3. Furthermore, the accuracy of the new absolute calibration is supported by the GPM measurements in the vicinity.
With an increasing offset Δσ 0 from −0.1 dB to −1 dB towards smaller incidence angles, GPM measured only slightly larger values within its 9° co-localized matched swath compared to the new absolute calibration. Here, the small, increasing offset Δσ 0 with decreasing θ suggests a slightly lower wind speed at the GPM footprint, with more ocean surface facets pointing into the backscatter direction. The much better agreement of the new absolute calibration with GPM is a further demonstration of its validity.
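The two-parameter fit described above (wind speed plus a calibration offset in dB) can be sketched with a simple grid search. This is not the fitting routine used in this study: the quasi-specular expression and the Cox and Munk coefficients are assumed standard forms, the observations are synthetic values generated from the model itself, and all names are hypothetical.

```python
import math

GAMMA_E_SQUARED = 0.455  # |Gamma_e|^2 for the Ka band, as quoted from Tanelli et al. (2006)

def sigma0_model_db(theta_deg, wind_speed_ms):
    """Assumed quasi-specular model with a Cox-Munk-type mean square slope."""
    theta = math.radians(theta_deg)
    mss = 0.003 + 5.12e-3 * wind_speed_ms  # assumed coefficients
    sigma0 = GAMMA_E_SQUARED / (mss * math.cos(theta) ** 4) * math.exp(-math.tan(theta) ** 2 / mss)
    return 10.0 * math.log10(sigma0)

def fit_wind_and_offset(thetas_deg, observed_db, wind_grid_ms):
    """Grid search over wind speed; the best dB offset for each speed is the mean difference."""
    best = None
    for v in wind_grid_ms:
        diffs = [obs - sigma0_model_db(t, v) for t, obs in zip(thetas_deg, observed_db)]
        offset = sum(diffs) / len(diffs)
        sse = sum((d - offset) ** 2 for d in diffs)
        if best is None or sse < best[0]:
            best = (sse, v, offset)
    return best[1], best[2]

# Synthetic observations: model at 5.7 m/s with a -7.8 dB calibration offset
thetas_deg = [float(t) for t in range(0, 21, 2)]
observed_db = [sigma0_model_db(t, 5.7) - 7.8 for t in thetas_deg]
wind_grid_ms = [2.0 + 0.1 * i for i in range(81)]

fitted_wind_ms, fitted_offset_db = fit_wind_and_offset(thetas_deg, observed_db, wind_grid_ms)
print(f"fitted wind speed: {fitted_wind_ms:.1f} m/s, offset: {fitted_offset_db:.1f} dB")
```

The wind speed is constrained by the shape of the falloff with incidence angle, while the offset only shifts the curve, which is why both parameters can be retrieved from a single roll or turn maneuver.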
Extending this discussion, the dependence of σ * 0 on wind direction is tested in the following. To this end, Fig. 7b shows σ * 0 measured during the across-wind (green) and along-wind (red) roll maneuver as well as during the turn. As in Fig. 7a, the CM model for the actual measured wind speed of 5.7 m s −1 is fitted to the 1 Hz averaged σ * 0 measured during the three flight patterns. While the across-wind results are slightly below the values of σ 0 predicted by the wind speed of the buoy by Δσ 0 = −0.5 dB, the along-wind results underestimate σ 0 by Δσ 0 = −0.8 dB. In comparison, the fit to the measurements in the turn showed the smallest offset Δσ 0 = −0.2 dB. The inset in Fig. 7c gives a closer look at the scatter of σ 0 during the turn maneuver, with a standard deviation of 0.8 dB. Here, the slightly higher values were measured in the downwind section of the turn, an observation that is in line with measurements by Tanelli et al. (2006). In addition, this scatter is further caused by the under-sampled point target spread function of the ocean surface with a remaining uncertainty of 1 dB. Due to these two effects, the measured σ * 0 will be associated with an uncertainty of 1 dB. To put a possible directional dependence in perspective to the effect of different wind speeds, modeled σ 0 are plotted with their uncertainty for wind speeds of 2 m s −1 (dashed line), 8 m s −1 (dashed-dotted line) and the actual 5.7 m s −1 (solid line). In summary, measured σ * 0 for the new calibration agree with modeled as well as independently measured values within their uncertainty estimates.

Intercomparison with RASTA and CloudSat
The following section will validate the preceding external calibration. To that end, we conducted common flight legs with W-band cloud radars, like the airborne RASTA and the spaceborne CloudSat. First, possible differences between effective reflectivities at 35 and 94 GHz are explored on the basis of a numerical study.

Model study of Z e at 35 and 94 GHz
In contrast to water cloud droplets, ice crystals have various shapes and sizes. With increasing maximum diameter D max , ice crystals become more complex and their effective density decreases (Heymsfield et al., 2010). For this study, we use the "composite" mass-size relationship from Heymsfield et al. (2010) (Eq. 10 in their paper) to describe the connection between the maximum ice crystal diameter D max and its equivalent melted diameter D eq . This relationship combines data sets of six in situ measurement campaigns in a variety of ice cloud types. We assumed horizontally aligned oblate spheroids with an aspect ratio of 0.6, composed of a mixture of air and ice which follows the given mass-size relationship. As a function of the equivalent melted diameter D eq , the ice fraction of this mixture is shown as a green line in Fig. 8. Furthermore, a realistic and well-tested particle size distribution (PSD) is used. Since PSDs are known to be highly variable (Intrieri et al., 1993), we choose the normalized PSD approach by Delanoë et al. (2005), which is based on an extensive database of airborne in situ measurements. Figure 8 shows this PSD as a function of D eq for different effective ice crystal radii r eff . Following Delanoë et al. (2014), the effective radius is derived using the method of Foot (1988), in which it increases proportionally to the ratio of mass to projected area. The mass-size relationship and PSD are also a key component of the synergistic radar-lidar retrieval DARDAR, which is designated for the EarthCARE mission. In the following, "Rayleigh scattering only" will be compared to Mie scattering and T-Matrix scattering theory. Mie theory is applied assuming homogeneous ice-air spheres, while the T-Matrix calculations are done for the spheroids of the same mass and area as the ice-air spheres.
The model results for a single ice crystal are shown in Fig. 9a. Here, the effective reflectivities at 35 GHz (green) and 94 GHz (red) are shown as a function of equivalent melted diameter D eq according to Rayleigh (blue), Mie (solid) and T-Matrix (dashed) theory. While the effective reflectivity derived with Rayleigh theory increases with the square of the particle mass, Z e starts to deviate for Mie and T-Matrix theory for D eq > 400 µm at 94 GHz and for D eq > 800 µm at 35 GHz. For D eq larger than 600 µm (1200 µm), effective reflectivity for single ice spheroids decreases again for 94 GHz (resp. 35 GHz) due to Mie resonances. The reader is advised that the results in Fig. 9a probably underestimate the backscatter for snowflakes at larger D eq (Tyynelä et al., 2011).

Figure 8. The ice microphysical model used during the effective reflectivity study. The particle size distribution (Delanoë et al., 2005) and the mass-size relationship (green curve) (Heymsfield et al., 2010) are based on an extensive database of airborne in situ measurements.
In a next step, this result is integrated using the normalized PSDs for different effective radii. The results for a fixed ice water content of 1 g m −3 and variable effective ice crystal radius are shown in Fig. 9b.
Lower effective reflectivity values are almost identical, while larger effective reflectivities at 94 GHz are below the values at 35 GHz. For these realistic PSDs, effective reflectivities deviate from Rayleigh theory for effective radii larger than 80 µm at 94 GHz and 120 µm at 35 GHz. In Fig. 9b, the non-Rayleigh scattering effects become apparent at much smaller values of r eff compared to values of D eq in Fig. 9a. This is only an apparent contradiction, since r eff increases proportionally to the ratio of mass to projected area and thereby much slower than D eq . Finally, the PSD-integrated results from Fig. 9b are used to constrain co-located Z e measurements at 35 and 94 GHz to physically plausible values. In Fig. 10, modeled effective reflectivities at 94 GHz are plotted against reflectivities at 35 GHz. The blue lines show the Rayleigh result, the solid lines show results according to Mie theory and the dashed lines show results for spheroids which were obtained from T-Matrix theory. Figure 10a shows results for mono-disperse ice particles with increasing D eq , while Fig. 10b shows results for whole ice crystal distributions with increasing r eff and a fixed ice water content of 1 g m −3 . Obviously, Z e values measured at 94 GHz should always be equal to or smaller than Z e values measured at 35 GHz. While Z e can also be much smaller due to the combination of non-Rayleigh scattering and higher attenuation at 94 GHz, ice clouds with smaller ice crystals and thus smaller effective reflectivity should exhibit quite similar Z e values at 94 and 35 GHz.

Figure 9. Modeled effective reflectivities at 35 GHz (green) and 94 GHz (red) as a function of equivalent melted diameter D eq according to Rayleigh (blue), Mie (solid) and T-Matrix (dashed) theory. (a) Results for mono-disperse, horizontally aligned oblate spheroids with an aspect ratio of 0.6, composed of a mixture of air and ice according to Fig. 8. (b) Results for particle size distributions (shown in Fig. 8) of these spheroids with a fixed ice water content of 1 g m −3 .

Figure 10. Comparison between modeled effective reflectivities at 94 GHz against effective reflectivities at 35 GHz according to Rayleigh (blue), Mie (solid) and T-Matrix (dashed) theory. (a) Z e for mono-disperse ice crystals (soft spheroids), (b) Z e for the whole distribution shown in Fig. 8. Overall, the results for both wavelengths are almost identical for small particle sizes and thus small Z e , while Z e at 94 GHz is smaller than at 35 GHz for larger particles and thus larger Z e values.

RASTA
The calibration of the 94 GHz Doppler cloud radar named RASTA on board the French Falcon 20 was performed by Protat et al. (2009) with an absolute accuracy of 1 dB by using the ocean surface backscatter. In intercomparisons, they found that CloudSat measured about 1 dB higher reflectivities compared to RASTA. A coordinated flight with the French Falcon equipped with the 94 GHz radar system RASTA and the HALO equipped with the 35 GHz radar system was performed over southern France and northern Spain on 19 December 2013 between 11:00 and 11:15 UTC. Both aircraft flew in close separation of less than 5 min. During that leg, HALO was flying at an altitude of 13 km and passed the slower-flying French Falcon at an altitude of 10 km. The SEVIRI satellite image indicated a stratiform cloud in the measurement area (Fig. 11).
The radar measurements showed a two-layer cloud structure (Fig. 12) with a lower cloud in the first half of the measurement reaching from ground to about 4 km height and an overlying cloud layer, present during the whole co-located flight, with a cloud base between about 4.5 and 6 km height and a homogeneous cloud top at about 10.5 km in altitude. Thus, this coordinated flight provides an optimal measurement situation for a radar intercomparison. Due to the close separation of the aircraft, many cloud features can be found in both measurements at the same place. At first sight, the HAMP MIRA instrument appears to show more variability within the cloud layer. Also, small-scale cloud structures are visible in the measurements made between 11:08 and 11:12 UTC. These cloud structures are not visible in the cross section of the RASTA measurements. At first glance, the HAMP MIRA at 35 GHz is more sensitive, especially to low-lying water clouds. While the effective reflectivities of the high cirrus cloud layer are quite similar, differences become visible in precipitating clouds, but also in non-precipitating water clouds after 11:07 UTC.
As in Sect. 4.3, Z e was corrected for gaseous attenuation by oxygen and water vapor using the model of Rosenkranz (1998). Profiles of pressure, temperature and humidity were taken from the ECMWF Integrated Forecasting System (IFS) model. At first glance, Z e from both instruments looks quite similar in the cirrus cloud layer when using the new calibration of HAMP MIRA. In precipitating clouds at lower altitudes, however, differences in Z e become visible. As discussed in the previous model study, this can be explained by the difference in wavelength. With increasing ice crystal size, the transition from the Rayleigh scattering regime (Z ∝ D 6 ) towards the Mie scattering regime (Z ∝ D 2 ) first occurs at 94 GHz. The difference ΔZ between 94 and 35 GHz increases with increasing Z e due to larger ice crystals and higher attenuation at lower altitudes. In the following intercomparison, cloud parts below 4 km were thus discarded (hatched line in Fig. 12) to exclude effects caused by different attenuation or scattering regimes. Common coordinates were used as reference points to obtain reflectivity pairs from both instruments. Figure 13a gives a closer look: the airborne RASTA and HAMP MIRA measurements of Z e are compared against each other as in the model study shown in Fig. 10. On average, the linear regression reveals lower reflectivities (−1.4 dB) for RASTA. While slightly outside their calibration uncertainties, the agreement is still quite good with the slight time shift and wavelength difference in mind.

CloudSat
In recent years, CloudSat has been established as a reference source to compare the calibration of different ground- and airborne cloud radars (Protat et al., 2010). Due to the stability of its absolute calibration and its global coverage, CloudSat has tied the different cloud radar systems more closely together. For this reason, the spaceborne CloudSat is used in this last comparison. For intercomparison, a CloudSat underflight performed over the subtropical North Atlantic Ocean east of Barbados on 17 August 2016 between 16:54 and 17:22 UTC (Fig. 14) is used. Due to the different conventions for the dielectric factor (|K|² = 0.75 for CloudSat, |K|² = 0.93 for HAMP MIRA) the effective reflectivity from CloudSat first had to be converted for |K|² = 0.93 using Eq. (1). Using a nearby dropsonde sounding, the two-way attenuation by water vapor and oxygen absorption was calculated at 35 and 94 GHz and used to correct Z e . For this underflight, Fig. 14a shows a corresponding image of the scene at a wavelength of 645 nm which was acquired by the wide-field camera (Pitts, 2007) on CALIPSO. HALO flew aligned with the CloudSat footprint for over 450 km. During this flight, all instrument settings were identical to the calibration flight (f p = 6 kHz, τ = 200 ns, 1 Hz), with footprints measuring approximately 100 m in the horizontal and 30 m in the vertical.
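The conversion between the two dielectric-factor conventions mentioned above amounts to a constant offset in decibels. The sketch below assumes the usual inverse proportionality between Z e and |K|² for a given received power; Eq. (1) is not repeated in this section, so this form and the function name are assumptions.

```python
import math

def convert_ze_dbz(ze_dbz, k2_from, k2_to):
    """Re-reference an effective reflectivity to another dielectric factor |K|^2.

    Assumes Z_e is inversely proportional to |K|^2 for the same received power.
    """
    return ze_dbz + 10.0 * math.log10(k2_from / k2_to)

# CloudSat convention (0.75) to the HAMP MIRA convention (0.93)
offset_db = convert_ze_dbz(0.0, 0.75, 0.93)
print(f"offset applied to CloudSat Z_e: {offset_db:.2f} dB")
```

Under this assumption, CloudSat reflectivities are lowered by a little less than 1 dB before they are compared with HAMP MIRA.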
At the beginning of the underpass flight, HALO was still climbing through the cirrus layer. Coinciding with the CloudSat overpass at 17:04 UTC, the aircraft then reached the top of the cirrus layer. The overall measurement scene is characterized by inhomogeneous cirrus cloud structures with a contribution of a few low clouds. The first part is dominated by an extended cirrus layer. As this cirrus layer becomes thinner, the second part is composed of broken and thinner cirrus clouds and shallow, convective marine boundary layer clouds. The cirrus layer and the lower precipitating clouds are clearly visible from both platforms. Strong effective reflectivity gradients are more blurred in the CloudSat measurement due to the coarser horizontal (1700 vs. 200 m) and vertical (500 vs. 30 m) resolution. For this reason, cloud edges as well as internal cloud structures are better resolved in the HAMP MIRA measurements. At cloud edges, this resolution-induced blurring leads to larger reflectivities while it reduces the maximum reflectivities found inside clouds. For the direct comparison in Fig. 13b, cloud parts below 4 km are again discarded (hatched line in Fig. 14) to exclude effects caused by different attenuation or scattering regimes.
Again, common coordinates were used as reference points to obtain reflectivity pairs from both instruments. Since the scene is dominated by cirrus, the values for Z e are generally lower than in the RASTA-MIRA comparison. In contrast to the RASTA comparison, the linear regression reveals slightly larger reflectivities (+1.0 dB) for CloudSat. In comparison to Fig. 13a, the scatter between air- and spaceborne platforms is significantly larger due to the different spatial resolutions and instrument footprints. The small bias between CloudSat and HAMP MIRA is, however, within the calibration uncertainties of both instruments. The fact that the effective reflectivity measured by HAMP MIRA is in between RASTA and CloudSat serves as further validation of the new absolute calibration.

Conclusions
In this study, we have characterized the absolute calibration of the microwave cloud radar HAMP MIRA, which is in-stalled in the belly pod section of the German research aircraft HALO in a fixed nadir-pointing configuration. In the first step, the respective instrument components were characterized in the laboratory to obtain an internal calibration of the instrument. Our study confirmed the previously assumed antenna gain and the linearity of the receiver: -The antenna gain G a = 50.0 dBi and beam pattern (−3 dB beamwidth φ = 0.56 • ) showed no obvious asymmetries or increased side lobes.
-With three attenuator settings (0, 15 and 30 dB), the radar receiver behaved very linearly (m = 1.0009 and residual 0.054 dB) in a wide dynamic range of 70 dB from −105 to −5 dBm.
-No further saturation by additional receiver components (e.g., mixers or filters) could be detected up to an input power of −5 dBm. This allows the dynamic range to be shifted by using the attenuator to measure higher input powers (which would otherwise be saturated) without losing the absolute calibration.
A key component of this work was the characterization of the spectral response of the radar receiver and its power transfer function T using an analog continuous wave signal generator. This characterization gave valuable new insights into the receiver noise power and thus the receiver sensitivity. In the course of this study, the following major improvements to the instrument calibration were made:
-The comparison of the measured and the previously estimated total receiver noise power (−95.3 vs. −98.2 dBm) revealed an underestimation of 2.9 dB.
-This underestimation of P n could be traced back to two different origins within the radar receiver:
-Spectral response measurements of the receiver unveiled a larger receiver noise bandwidth of 7.5 MHz, compared to the 5.0 MHz expected from the matched filter used (τ r = 200 ns). This issue could be traced back to an additional window function which was applied unintentionally to IQ data within the digital signal processor. The larger receiver noise bandwidth led to a somewhat higher thermal noise power P kTB (−105.2 vs. −106.9 dBm) than initially assumed.
-The noise figure NF, describing the additional noise created by the receiver itself, turned out to be 1.1 dB larger than previously estimated, but showed no dependence on τ r .
-The combination of a larger spectral response (1.8 dB) and higher noise figure (1.1 dB) caused the 2.9 dB underestimation of the inherent noise power P n . This, in turn, led to a 2.9 dB underestimation of Z e .
-Furthermore, no correction of the finite receiver bandwidth loss was applied to previous data sets of HAMP MIRA. Using the spectral response measurements, this study can now give an estimate for the finite receiver bandwidth loss L fb of 1.2 dB.
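The noise power budget listed above can be retraced with a short calculation. The following sketch assumes a reference temperature of 290 K for the thermal noise, which is not stated in the text, and takes the bandwidths and the noise figure offset from the list. The variable and function names are illustrative.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K
T0 = 290.0  # K, assumed reference temperature (not given in the text)


def thermal_noise_dbm(bandwidth_hz, temperature_k=T0):
    """Thermal noise power k*T*B expressed in dBm."""
    return 10.0 * math.log10(BOLTZMANN * temperature_k * bandwidth_hz / 1e-3)


b_expected_hz = 5.0e6   # noise bandwidth expected from the matched filter
b_measured_hz = 7.5e6   # noise bandwidth from the spectral response measurement

# Contribution of the larger noise bandwidth to the noise power
bandwidth_term_db = 10.0 * math.log10(b_measured_hz / b_expected_hz)

# Increase of the noise figure relative to the previous estimate
noise_figure_term_db = 1.1

total_db = bandwidth_term_db + noise_figure_term_db

print(f"P_kTB at 5.0 MHz: {thermal_noise_dbm(b_expected_hz):.1f} dBm")
print(f"P_kTB at 7.5 MHz: {thermal_noise_dbm(b_measured_hz):.1f} dBm")
print(f"bandwidth term:   {bandwidth_term_db:.1f} dB")
print(f"total offset:     {total_db:.1f} dB")
```

With the assumed 290 K, the thermal noise powers agree with the values quoted above to within about 0.1 dB, and the two terms add up to the 2.9 dB underestimation of P n .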
In addition, our study re-evaluated the previously assumed attenuation by the belly pod and additional waveguides with the following measurements:
-The epoxy quartz radome in the belly pod was designed with a thickness of 4.53 mm to limit the one-way attenuation to around 0.5 dB. Deviations during manufacturing increased the radome thickness from the planned 4.53 to 4.84 mm. This increased the one-way attenuation from the initially assumed value of 0.5 to 1.5 dB. The higher radome attenuation is now also confirmed by laboratory measurements.
-The initially used calibration did not account for the losses caused by the longer waveguides in the airplane installation. With an additional length of 1.15 m and a specified attenuation of 0.65 dB m⁻¹, the two-way attenuation by additional waveguides is 1.5 dB.
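The arithmetic behind these two attenuation terms can be written out as a brief sketch. It assumes that the signal passes the radome and the additional waveguide once on transmit and once on receive, so that each one-way loss is doubled. The variable names are illustrative.

```python
def two_way_loss_db(one_way_db):
    """Two-way loss for a path that is passed on transmit and on receive."""
    return 2.0 * one_way_db


# Additional waveguide in the aircraft installation
waveguide_length_m = 1.15
waveguide_loss_db_per_m = 0.65
waveguide_two_way_db = two_way_loss_db(
    waveguide_length_m * waveguide_loss_db_per_m
)

# Radome: initially assumed vs. measured one-way attenuation
radome_one_way_assumed_db = 0.5
radome_one_way_measured_db = 1.5
radome_two_way_increase_db = two_way_loss_db(
    radome_one_way_measured_db - radome_one_way_assumed_db
)

print(f"two-way waveguide loss:       {waveguide_two_way_db:.2f} dB")
print(f"two-way radome loss increase: {radome_two_way_increase_db:.2f} dB")
```

The waveguide term evaluates to 1.495 dB, which rounds to the 1.5 dB stated above.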
Subsequently, this component calibration was validated by using the ocean surface backscatter as a reference with known reflectivity. To this end, controlled roll maneuvers were flown during the NARVAL2 campaign in the vicinity of Barbados to sample the angular dependence of the ocean surface backscatter. The comparison with modeled backscatter values using the Cox-Munk model for non-slick ocean surfaces and measured values from the GPM satellite confirmed the internal calibration to within ±0.5 dB.
In a second intercomparison study, the absolute accuracy of the internal calibration was further scrutinized during common flight legs with the airborne 94 GHz cloud radar RASTA and the spaceborne 94 GHz cloud radar CloudSat. To assess the influence of different radar wavelengths on this comparison, we first conducted a model study of effective reflectivities at 35 and 94 GHz. Using realistic ice particle size distributions, T-Matrix calculations for spheroids show almost identical effective reflectivities at 35 and 94 GHz for effective radii smaller than 50 µm. Larger ice crystals and higher attenuation generally lead to a smaller reflectivity at 94 GHz. In this context, the intercomparison showed good agreement between the HAMP MIRA at 35 and the RASTA at 94 GHz, with slightly lower reflectivities (−1.3 dB) for RASTA. The intercomparison with CloudSat showed slightly higher (+1.2 dB) reflectivities for CloudSat. These higher reflectivities were mostly found at cloud edges, where the coarser spatial resolution of CloudSat can blur higher reflectivities into regions where the reflectivity is below the sensitivity of CloudSat. The intercomparison studies showed that the absolute calibration uncertainty is now well below the initially required accuracy of 3 dB and even comes close to the target accuracy of 1 dB.
In conclusion, the following procedures and techniques turned out to be essential for the absolute calibration of HAMP MIRA and should become state of the art:
1. The simultaneous characterization of the spectral response of the radar receiver and its power transfer function T turned out to be very valuable to cross-check the receiver sensitivity P n .
2. While P n was previously estimated using an assumed receiver noise bandwidth B and a measured receiver noise factor F n , it is now measured directly using a calibrated signal generator with adjustable power output.
3. The spectral response of the receiver should be characterized to measure the receiver noise bandwidth B. A characterized spectral response is essential to calculate the finite receiver bandwidth loss L fb . It can also be used to calculate the receiver sensitivity P n .
4. The direct measurement of P n and the calculated value can then be used to evaluate and check the receiver noise factor F n . This should be done for two different matched filter lengths to characterize the dependence of B n and F n on τ r .
5. Discrepancies between the component-wise calculation of P n and the direct measurement can help to find additional noise sources or attenuation within the radar receiver.
6. Validate the budget approach of the internal calibration in intercomparison with external sources like sea surface data or different instruments.
The lessons learned in the course of this study helped us to better understand our instrument and increased the confidence in its absolute calibration. Subsequent studies for similar cloud radar instruments should consider the following prerequisites and guidelines:
-Knowledge about existing calibration offsets grows gradually. It is advisable to refrain from incremental updates of prior data sets. To mitigate confusion with different calibration offsets, a new calibration should only be applied to prior or current measurements when the internal calibration is in agreement with external reference sources.
-Initially, the main focus should be on the antenna gain (including the radome) and the receiver sensitivity. The measured antenna gain and the radome attenuation should furthermore be cross-checked with calculated values.
-For the characterization of the spectral response and receiver noise power, access to unprocessed Doppler spectra is advantageous to check the calculation of SNR independently.
-The sole intercomparison of two cloud radars is a necessary but not sufficient step towards an absolute calibration. An apparent agreement can lead to a false sense of accuracy since common misconceptions and assumptions remain hidden and can thus propagate from instrument to instrument. Discrepancies in intercomparisons should always trigger a re-evaluation of the internal calibration.
-While a sole internal calibration can help to get a better understanding of the instrument performance, it has to be validated with external reference sources. It is the combination of internal calibration and external validation which establishes trust in the absolute calibration.
Konow et al. (2019), https://doi.org/10.5194/essd-2018-116. All data sets created during the internal calibration are provided upon request.
Appendix A: Temperature dependence of I mg and P t
In-flight thermistor measurements of the average transmit power P t proved to be unreliable due to strong variations in ambient temperatures in the cabin. For this reason, thermally controlled measurements of P t were conducted at ground level and correlated with measured magnetron anode currents I mg . The relationship between both parameters then allowed P t to be derived from in-flight measurements of I mg . For this analysis, we operated the HAMP MIRA instrument within a trailer during a 27-day ground-based test campaign in 2016. We observed the magnetron temperature T mg and the magnetron anode current I mg for two different anode voltages (15.4 and 15.5 kV). The average transmit power P t was measured every 30 min using a calibrated thermistor. Fig. A1a shows the relationship between magnetron temperature T mg and magnetron anode current I mg . The relationship between magnetron anode current I mg and average transmit power P t is given in Fig. A1b. Depending on the anode voltage, P t increased linearly with I mg and varied by 0.2 dB within the operational range of T mg between 25 and 35 °C.
Figure A1. Analysis of the relationship between magnetron temperature T mg , magnetron anode current I mg and average transmit power P t during a 27-day ground-based test campaign in 2016 for two different anode voltages (15.4 and 15.5 kV). (a) Relationship between magnetron temperature T mg and magnetron anode current I mg . (b) Relationship between magnetron anode current I mg and average transmit power P t .

Appendix B: Characterization of the belly pod transmission
The epoxy quartz radome in the belly pod was designed with a thickness d of 4.53 mm to limit the one-way attenuation to around 0.5 dB. The thickness was chosen to cancel out reflections on the front and back side of the radome. However, laboratory measurements of the finished radome found a one-way attenuation of around 1.5 dB for the transmit frequency 35.5 GHz of HAMP MIRA. The spectral transmission of the radome between 26 and 40 GHz is shown in Fig. B1a. The red line shows the initially assumed spectral transmission using a relative permittivity ε r = 3.44, a dielectric loss tangent tan δ = 0.0015 and a radome thickness d = 4.53 mm. Our measurements (black crosses), however, can be better explained by the green line in Fig. B1a, which shows the spectral transmission for a relative permittivity ε r = 3.80, a dielectric loss tangent tan δ = 0.0017 and a radome thickness d = 4.84 mm. In Fig. B1b, the transmission for both sets of material properties is compared as a function of the radome thickness d. The oscillating transmission can be explained by the cancellation of reflections at every half-wavelength. Since the wavelength is shorter within the radome material (ε r = 3.80), this happens every d opt = λ/(2 √ε r ) = 2.2 mm at 35.5 GHz. The combination of a higher relative permittivity and the slight increase in radome thickness can explain the 1 dB increase in radome attenuation.
Figure B1. Analysis of the belly pod radome transmission of HALO. (a) Radome transmission between 26 and 40 GHz for the initially assumed material properties (red line, ε r = 3.44, tan δ = 0.0015, d = 4.53 mm) and the measured ones (green line, ε r = 3.80, tan δ = 0.0017, d = 4.84 mm). (b) Radome transmission for the two material properties (red: assumed, green: measured) as a function of the radome thickness d. An increase in relative permittivity and thickness (Δd = 0.31 mm) increased the radome attenuation by 1 dB.
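The half-wavelength argument above can be illustrated with an idealized model. The sketch below assumes a single homogeneous dielectric slab in air at normal incidence and uses the standard thin-film (Airy) transmission formula. This is a simplification chosen for illustration, not necessarily the model behind Fig. B1, so its absolute values should not be expected to reproduce the laboratory measurements. All function names are hypothetical.

```python
import cmath
import math

C_MM_PER_S = 299_792_458e3  # speed of light in mm/s


def half_wave_thickness_mm(freq_hz, eps_r):
    """Thickness step d_opt = lambda / (2 sqrt(eps_r)) between transmission maxima."""
    return C_MM_PER_S / freq_hz / (2.0 * math.sqrt(eps_r))


def slab_transmission_db(freq_hz, eps_r, tan_delta, d_mm):
    """One-way power transmission (dB) of a homogeneous slab in air.

    Assumes normal incidence and a single layer (idealized sketch).
    """
    n = cmath.sqrt(eps_r * (1.0 - 1j * tan_delta))  # complex refractive index
    k0 = 2.0 * math.pi * freq_hz / C_MM_PER_S       # free-space wavenumber, rad/mm
    r = (1.0 - n) / (1.0 + n)                       # air-to-slab reflection coefficient
    phase = cmath.exp(-1j * k0 * n * d_mm)          # one pass through the slab
    t = (1.0 - r**2) * phase / (1.0 - r**2 * phase**2)
    return 20.0 * math.log10(abs(t))


FREQ_HZ = 35.5e9

d_opt = half_wave_thickness_mm(FREQ_HZ, 3.80)
print(f"d_opt for eps_r = 3.80: {d_opt:.1f} mm")

designed = slab_transmission_db(FREQ_HZ, 3.44, 0.0015, 4.53)
finished = slab_transmission_db(FREQ_HZ, 3.80, 0.0017, 4.84)
print(f"idealized transmission, assumed properties:  {designed:.2f} dB")
print(f"idealized transmission, measured properties: {finished:.2f} dB")
```

In this model, a lossless slab whose thickness is a multiple of d opt transmits without reflection loss. A slab that is detuned from this condition, as with the higher permittivity and larger thickness, transmits less.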