Testing the efficacy of atmospheric boundary layer height detection algorithms using uncrewed aircraft system data from MOSAiC

During the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition, meteorological conditions over the lowest 1 km of the atmosphere were sampled with the DataHawk2 (DH2) fixed 15 wing uncrewed aircraft system (UAS). Of particular interest is the atmospheric boundary layer (ABL) height, as ABL structure can be closely coupled to cloud properties, surface fluxes, and the atmospheric radiation budget. The high temporal resolution of the UAS observations allows us to subjectively identify ABL height for 65 out of the total 89 flights conducted over the central Arctic Ocean between 23 March and 26 July 2020 by visually analyzing profiles of virtual potential temperature, humidity, and bulk Richardson number. Comparing this subjective ABL 20 height with the ABL heights identified by various previously published objective methods allows us to determine which objective methods are most successful at accurately identifying ABL height in the central Arctic environment. The objective methods we use are the Liu-Liang, Heffter, virtual potential temperature gradient maximum, and bulk Richardson number methods. In the process of testing these objective methods on the DH2 data, numerical thresholds were adapted to work best for the UAS-based sampling. To determine if conclusions are robust across 25 different measurement platforms, the subjective and objective ABL height determination processes were repeated using the radiosonde profile closest in time to each DH2 flight. For both the DH2 and radiosonde data, it is determined that the bulk Richardson number method is the most successful at identifying ABL height, while the Liu-Liang method is least successful.


Introduction
The transfer of energy between the Earth's surface and the overlying atmosphere, particularly at high latitudes, remains an area of substantial uncertainty in our understanding of the global climate system (de Boer et al., 2012;Tjernström et al., 2012;Karlsson and Svensson, 2013). The consequences of this uncertainty are significant, with global climate model projections of present-day sea ice demonstrated to fall short of simulating the observed rate of change (Stroeve 35 et al., 2007;Stroeve et al., 2012). The thermodynamic structure of the lower atmosphere plays a central role in regulating cloud lifecycle and radiative transfer, and their influence on atmospheric energy transport (Tjernström et al., 2004;Karlsson and Svensson, 2013;Brooks et al., 2017). Significant insight can be gained by measurements https://doi.org/10.5194/amt-2021-383 Preprint. Discussion started: 3 January 2022 c Author(s) 2022. CC BY 4.0 License. collected over the central Arctic Ocean ice pack, focused on the structure of the lower atmosphere, its spatial and temporal variability, the intensity of turbulent energy fluxes, and its connection to surface features. To provide such 40 measurements, uncrewed aircraft were deployed in the lower atmosphere during legs 3 (March through May 2020) and 4 (June through August 2020) of MOSAiC (Multidisciplinary drifting Observatory for the Study of Arctic Climate; Shupe et al. 2020), a year-long expedition that took place from October 2019 to September 2020 in which the icebreaker RV Polarstern (Alfred-Wegener-Institut Helmholtz-Zentrum für Polar-und Meeresforschung, 2017) was frozen into the central Arctic ocean sea ice pack and allowed to passively drift across the central Arctic for an 45 entire year (Fig. 1). Additional information on measurements taken of the atmosphere and sea ice during MOSAiC can be found at Shupe et al. (submitted) and Nicolaus et al. (submitted) respectively.
One component of the structure of the lower Arctic atmosphere is the atmospheric boundary layer (ABL) height. The ABL is the turbulent lowest part of the atmosphere that is directly influenced by the earth's surface (Stull, 1988; 50 Marsik et al., 1995). In the central Arctic, the ABL is mostly impacted by interactions between the atmosphere and sea ice surface features, including the generation of turbulence through surface energy fluxes emitted from open water regions such as leads (Lüpkes et al., 2008), the horizontal advection of airmasses from lower latitudes (Brooks et al., 2017), radiative mixing forced by cloud cover (Tjernström et al., 2004), or mechanical generation of turbulence by sea ice features (Andreas et al., 2010) or oceanic waves (Jenkins et al., 2012). Solar heating of the earth's surface and 55 the subsequent formation of buoyant thermals, which is a dominant forcing of the ABL in most parts of the planet (Marsik et al., 1995), does not often play a role in the central Arctic due to the relatively reflective surfaces found there. To understand the influence of the surface on other atmospheric features such as clouds and their influence on radiative transfer in the lower atmosphere, low level jets, and temperature inversions, it is important to identify the ABL height. 60 The depth of the ABL has been previously defined using a variety of approaches that involve visualizing the profiles of different variables. Table 1 lists the variables that have previously been used to identify ABL height, as well as the associated literature that references use of that variable. Each of these variable profiles typically exhibits a distinct change in structure at the top of the ABL, which is why they are used to identify ABL height.

Table 1 goes here
Of these methods, some of the most widely used ones, and the ones applied in the current analysis of a central Arctic dataset to determine ABL height are the ones that involve analysis of virtual potential temperature (θv), vertical 70 gradient of virtual potential temperature (dθv/dz), humidity, bulk Richardson number (Rib), and wind speed profiles.
The current focus is on these variables because the physical basis for each one as an indication of ABL height is relevant for the Arctic atmosphere. Specifically, θv helps identify the entrainment zone above the ABL, humidity begins to either decrease or increase more above the ABL (Dai et al., 2014), Rib helps identify where turbulence (caused by buoyancy in a convective ABL (Stull, 1988) and by strong wind shear or surface roughness in a stable or 75 https://doi.org/10.5194/amt-2021-383 Preprint. Discussion started: 3 January 2022 c Author(s) 2022. CC BY 4.0 License. neutral ABL (Grachev et al., 2005)) ceases above the ABL, and wind speed helps identify the top of the ABL when it is capped by a low-level jet (Stull, 1988).
High resolution data collected by the DataHawk2 uncrewed aircraft system (UAS) allows for determination of ABL heights with high accuracy through visual analysis of profiles of θv, humidity, and Rib. However, visually determining 80 ABL height case-by-case is time consuming for processing a large dataset. Therefore, the UAS-derived dataset is leveraged to compare subjectively determined ABL heights with those identified through previously published objective and automated methods. This evaluation is completed to identify objective methods that can accurately diagnose ABL height across a larger dataset of central Arctic atmospheric conditions.

85
To subjectively identify the ABL height of each atmospheric profile from DH2 data, the stability regime of the ABL (stable, neutral, or convective) is categorized and ABL heights are visually identified through combined evaluation of θv, humidity (both relative humidity (RH) and mixing ratio), and Rib profiles. Objective analyses of ABL heights are derived through the application of previously published methods, including those in Liang (2010), Heffter (1980), Dai et al. (2014), and Sivaraman et al. (2013), adapted to best suit the DH2 profiles examined. Then, statistical 90 comparisons of the objective ABL heights and the subjective values are conducted to evaluate how well each method identifies the correct ABL height. Next, the objective methods are applied in their adapted form to radiosonde profiles nearest in time to each DH2 flight to determine if these methods are robust across different measurement platforms for central Arctic conditions. Finally, discussion is included on the features that do or do not lend themselves to accurate identification of the ABL height by the objective methods, and findings are summarized to support future 95 studies seeking to identify ABL height quickly, objectively, and accurately across large atmospheric datasets collected in the central Arctic.

The DataHawk2
100 Data presented in this paper were obtained between 23 March and 26 July 2020 using the University of Colorado  (Fig. 1). Throughout this time period, the MOSAiC floe evolved from snow-covered rigid ice situated in the high Arctic to being covered with melt ponds and leads close to the sea ice edge. The surface atmospheric temperatures 105 also transitioned from nearly -35° C at the beginning of leg 3 to hovering around 0° C throughout the entirety of leg 4. frequency (800 Hz) information on temperature and air speed, multiple sensors for temperature and relative humidity (Vaisala RSS-421 measuring at 5 Hz and SHT-85 measuring at 100 Hz), and up-and downward looking thermopile sensors to provide infrared brightness temperatures of the sky and surface. Air pressure is measured at 5 Hz by the 115 Vaisala RSS-421 sensor. Measurements of attitude, airspeed, ground speed, and altitude support the derivation of high-frequency (10 Hz) 3D wind estimates. Combined, these sensors provide a comprehensive picture of atmospheric thermodynamic and kinematic state along with some context on the surface and sky condition under which these measurements were obtained. Table 2 lists the resolution, repeatability (standard deviation of difference between two successive repeated calibrations), and response time for the Vaisala RSS-421 sensor. Uncertainty in the wind speed 120 estimation is not provided, as determining this is still in progress.

Table 2 goes here
Measurements collected by the DH2 are logged at different frequencies, requiring the implementation of a time 125 alignment process to assure that the time index for each datapoint of each variable is consistent with all other measurements. Additionally, the wind measurements have been filtered to remove the impact of the angle and ground speed of the aircraft to provide estimates of the true wind speed (de Boer et al., in progress). Data collected by the DH2 during MOSAiC are available for public download through the NSF Arctic Data Center at *insert DOI when available* (Jozef et al., 2021).

130
During MOSAiC, DH2 flights were conducted whenever flight weather criteria were met and when the team was able to access the ice alongside the Polarstern. The weather criteria include wind speeds with a sustained average below 10 m s -1 , and gusts below 14 m s -1 , as well as sufficient visibility to maintain visual contact with the aircraft at all times during flight. In addition, the DH2 flights required coordination with other MOSAiC activities, especially those 135 impacting air space over the MOSAiC floe, including manned helicopter flights and other UAS and tethersonde operations. Since the DH2 flights were limited to days in which these weather and airspace criteria were met, the profiles sampled during the campaign only represent those that occurred under a subset of all conditions observed during legs 3 and 4 of MOSAiC.

140
The most common flight pattern conducted with the DH2, and the flight pattern from which data for this analysis were acquired, was a profiling flight in which the plane flew a spiral ascent and descent pattern, with a radius of 75-100 m between the surface and 1 km altitude (or cloud base, if lower than 1 km), with the aircraft ascending and descending at a rate of 2 m s -1 and flying at an airspeed of 14-18 m s -1 . Each profiling flight lasted an average of 30-minutes, with some shorter flights when the air temperature was at its coldest (~-35 °C) near the beginning of leg 3, and some longer 145 flights when the air temperature was much warmer (~0 °C) during leg 4. Throughout our measurement period, 89 flights were conducted with the DH2. In the present study, 65 of these flights are found to have clearly identifiable ABL heights within the altitude range sampled. The remaining flights sampled only the lowest portion of the https://doi.org/10.5194/amt-2021-383 Preprint. Discussion started: 3 January 2022 c Author(s) 2022. CC BY 4.0 License. atmosphere due to cloud cover or other environmental conditions and therefore did not observe the full depth of the ABL.

150
For each of these 65 DH2 flights, the ABL height is visually determined from the θv, humidity, and Rib profiles; the former two of which have been created by averaging each variable in 1 m bins throughout the flight, excluding the first 5 seconds of flight, as the initial measurements after takeoff may be faulty due to hysteresis associated with the sensor sitting still at the surface before launch. Additionally, the ABL heights are objectively identified using the four 155 published methods (Liu-Liang, Heffter, virtual potential temperature gradient maximum method, and the bulk Richardson number method). For the remainder of the manuscript, ABL heights determined from visual identification are referred to as the "subjective" ABL heights and the ABL heights determined by the published methods are referred to as the "objective" ABL heights. Some of the methods for both subjectively and objectively identifying ABL height differ depending on the stability regime, so the sampled regime is first identified for each DH2 flight. Once the regime 160 is defined, we apply the appropriate criteria below to subjectively identify the ABL height for each case and compare this to the ABL height identified by each of the published objective methods.
Lastly, profiles of potential temperature, humidity, and wind speed from the balloon-borne radiosondes that were launched at least 4 times per day from the deck of the Polarstern (Maturilli et al., 2021) are leveraged to determine 165 if the objective methods used to identify ABL height from the UAS data are robust across platforms. To do this, radiosonde profiles with launch times closest to DH2 flight times are used, repeating the same processes for subjective and objective ABL height identification and comparison.

170
The three possible stability regimes considered include a convective boundary layer (CBL), stable boundary layer (SBL), and neutral boundary layer (NBL; Liu and Liang, 2010). A CBL forms when convective thermals create positive buoyancy (Liu and Liang, 2010) and an air parcel at the surface rises adiabatically until becoming neutrally buoyant. In a CBL, θv near the surface is greater than that of the overlying ABL (Stull, 1988). A SBL forms when there is a deficit of radiation at the surface, or when warmer air is advected over a cooler surface. In a SBL, θv increases 175 with altitude, and can range from being nearly well-mixed with moderate turbulence to nearly laminar (Stull, 1988), though some amount of turbulence is always present. A NBL occurs when there are near-neutral conditions in the ABL (Sivaraman et al., 2013), meaning requirements are lacking for either a CBL or SBL, air at the surface is neutrally buoyant, and θv at the surface is approximately the same value as that of the overlying remainder of the ABL (Stull, 1988).

180
Stability regimes are identified by comparing θv (calculated using RSS-421 temperature, pressure, and RH) between the lowest altitude sampled by the DH2 ( ; typically ~5m) and 40 m above using equations 1-3 below adapted from Liu and Liang (2010 In these equations, δs is a stability threshold that represents the minimum θv increase/decrease with altitude near the 190 surface necessary for the ABL to qualify as a SBL/CBL. If this minimum is not reached in either direction, the ABL is identified as a NBL (Liu and Liang, 2010). In an idealized case, δs would be zero. However, in practice it must be specified as a small positive number, and this number depends on the surface characteristics. For profiles over ocean/ice, this number has been defined to be 0.2 K (Liu and Liang, 2010).

195
While Liu and Liang (2010) compare θv between pressure levels that equate to approximately 40 and 160 m in the conditions we sampled, this range was found to be inadequate for differentiating between a SBL, NBL or CBL. In the current study, ABL tops were observed to range in height between 38-287 m, with an average of 104 m. This means that the 40 -160 m altitude range used by Liu and Liang (2010) would include all of the boundary layer and a portion of the overlaying free atmosphere for most cases. This would result in the incorrect identification of most ABLs as 200 SBLs, since the overlaying free atmosphere will always have a greater θv than that of the ABL. In general, considering the θv change below ~45 m more accurately reflects the stability regime of the Arctic ABL, since the ABL is often much shallower than that over land or at midlatitudes (Esau and Sorokina, 2010).
Once the stability regime (CBL, NBL, or SBL) is identified, criteria based on the θv, humidity, and Rib profiles are 205 applied to subjectively determine the boundary layer height. For the current dataset, 31 SBL cases, 32 NBL cases, and 2 CBL cases were identified. This is in agreement with Brooks et al. (2017), which states that the ABL over sea ice in the summer is typically near-neutral or weakly stable, and our data suggests this is also true in the late winter to spring season.

Subjective identification of atmospheric boundary layer height
To subjectively identify ABL height, the θv profile is first analyzed, as the θv profile changes structure above the ABL (Stull ,1988). For a CBL and NBL, above the ABL, θv changes from decreasing or constant with height, to increasing with height, marking the entrainment zone (Stull, 1988). The structure of a SBL, however, can vary a lot more (Mayer et al., 2012;Steeneveld et al., 2007;Zilitinkevich and Baklanov, 2002). In an ideal SBL case, the θv inversion is at its 215 strongest near the surface and transitions to the free atmosphere (nearly constant or gradually increasing θv with altitude) above the SBL, with no entrainment zone (Stull, 1988). The ABL height is then identified as the altitude of the shift from the surface based θv inversion to the free atmosphere (Stull, 1988). In reality, the structure of a SBL is often not that simple, and the height of a SBL can be difficult to identify based on θv alone (Stull, 1988;Zhang et al., 2014). SBLs in our dataset often included a weaker surface based θv inversion capped by a layer of enhanced stability 220 (stronger θv inversion), reminiscent of an entrainment zone. ABLs with this structure form as the near-surface atmosphere fluctuates between weakly stable and near-neutral (Brooks et al., 2017). In more difficult cases such as https://doi.org/10.5194/amt-2021-383 Preprint. Discussion started: 3 January 2022 c Author(s) 2022. CC BY 4.0 License.
these, the top of the SBL can be better determined by supplementing the θv profile with the RH and mixing ratio profiles, which usually have an obvious transition at the top of the ABL (Dai et al., 2014), as well as the Rib profile (Zhang et al., 2014).

225
Bulk Richardson number is the ratio between buoyantly (from thermals) and mechanically (from wind shear) produced turbulence (Sivaraman et al., 2013). The buoyancy term can also indicate that stability is suppressing turbulence.
Therefore, Rib can help to identify the top of the ABL under the assumption that turbulence ceases above of the ABL (Stull, 1988), and thus, Rib will exceed a critical value (typically 0.25 ;Stull, 1988) at the top of the ABL (Seibert et 230 al., 2000). Rib is calculated at altitude, z, using the following equation from Stull (1988): where g is acceleration due to gravity, θ ̅̅̅ is mean virtual potential temperature over the altitude bin being considered, 235 z is altitude, u is zonal wind, v is meridional wind, and ∆ represents the difference over the altitude bin used to calculate Rib throughout the profile. The only way that Rib can be negative is if the value for ∆θ is negative, indicating a convective atmosphere with buoyancy-driven generation of the turbulence.
Rib values below the critical threshold (often taken as ~0.25) indicate a turbulent atmosphere, while values above this 240 threshold indicate that an already laminar layer will not become turbulent, as static stability is strong enough to suppress mechanically generated turbulence. While the critical value of 0.25 is not always the exact number that corresponds to the transition between turbulent and laminar conditions (different studies have used critical Richardson numbers ranging from as low as 0.15 to as high as 7.2 in coarse resolution models; Dai et al., 2014), low Rib is generally expected in the ABL, and high Rib is expected above the ABL. By examining Rib profiles for our DH2 245 flights, this transition from values near or below 0.25 to well above 0.25 can be identified to mark the top of the ABL.
Rib profiles are created by calculating Rib over 30 m bins, at 5 m resolution.

Subjective atmospheric boundary layer height for a CBL
To determine the ABL height of a CBL, we identify the bottom of the entrainment zone. From the surface, the θv of a 250 CBL decreases with altitude (Stull, 1988). Therefore, to identify the height of a CBL, we identify the altitude at which the potential temperature has returned to its surface value and then increases with altitude. This will not be the first altitude at which the θv increases with altitude. Rather, this will be the bottom of a layer of enhanced stability (stronger θv inversion) above. For one of the CBL cases we sampled, this altitude is also accompanied by a large increase in Rib above this altitude, and a shift from increasing to decreasing RH with height ( Fig. 2a). 255

Subjective atmospheric boundary layer height for a NBL
To determine the height of a NBL, the θv profile is first referenced, looking for the altitude at which the θv shifts from being approximately constant with altitude to increasing with altitude. This marks the shift from the mixed layer, or https://doi.org/10.5194/amt-2021-383 Preprint. Discussion started: 3 January 2022 c Author(s) 2022. CC BY 4.0 License.
ABL, to the entrainment zone above (Stull, 1988). Figure 2b shows an example of a profile with a single shift in θv 260 slope around 110 m altitude. If there is more than one shift in slope of θv within the same θv inversion, a kink (abrupt slope shift) in the RH and/or the mixing ratio profiles can be used to identify which θv slope shift indicates the top of the ABL. Typically, this kink indicates a shift from RH that is increasing or constant with altitude, to decreasing with altitude, however a few cases feature a humidity inversion (ex: Fig. 2c), meaning the kink indicates a shift from RH increasing with altitude to RH increasing even more with altitude. This spot on the profile is usually accompanied by 265 a large increase in Rib above this altitude. Figure 2c shows an example of a profile with multiple shifts in θv slope, where the shift associated with a kink in the RH and mixing ratio profiles around 100 m is selected.
The third possibility is a rarer NBL case in which there is no clear elevated θv inversion layer above the layer of constant θv with altitude. Rather, θv slowly shifts to a steady increase with altitude throughout the profile. In this case,

270
there is no clear indication from the θv profile as to where the top of the ABL is, so the determination is made entirely on the humidity and Rib profiles. Once a shift is identified where Rib goes from consistently being very small to very large that coincides with a kink in the humidity profiles, the θv profile is reanalyzed for a visually detectable (as slight as it may be) shift in slope near the altitude indicated by the Rib and humidity profiles, and we identify this as the ABL top ( Fig. 2d).

Subjective atmospheric boundary layer height for a SBL
Determining the height of a SBL is more difficult, since the low-level θv inversion extends to the surface. However, the DH2 data from MOSAiC include very few cases in which the surface based θv inversions are at their strongest right from the surface. Instead, the static stability strengthens some distance above the surface in almost every stable 280 case, likely because of surface-drag induced turbulence close to the surface. Additionally, many SBL cases are close to meeting the criteria to be a NBL. Therefore, similar criteria to those used to determine NBL height are applied for evaluating SBL height. This includes identification of the altitude at which stability is enhanced and θv increases more with altitude than it did in the layer below. This altitude usually corresponds to a kink in the humidity profiles and a shift in the Rib profile from near 0.25, to well above 0.25. Figure 2e shows a representative example where this 285 approach is applied.
For cases that do not include an obvious stability transition in the θv profile, more attention is given to the humidity and/or Rib profiles. In these cases, a slope shift in the humidity profiles is identified first, after which the θv profile is evaluated for any slight variability to identify the top of the ABL (Fig. 2f). The humidity profiles are more helpful 290 than the Rib profile in many SBL cases, as the Rib profile usually does not include any clear shift at the top of a SBL.
However, there are some limited cases in which Rib shifts from below to above critical at the top of the SBL, and in those cases this information is helpful in determining the correct height for the top of ABL.

Objective identification of atmospheric boundary layer height
After subjectively identifying the ABL height for each profile using visual techniques, several published methods for objectively determining ABL height are also applied and evaluated. Through this analysis, the usefulness of each objective method for evaluating ABL height in the central Arctic is assessed. The objective methods that are 300 implemented include the Liu and Liang (2010), the Heffter (1980), the virtual potential temperature gradient maximum (TGRDM; Dai et al., 2014), and the bulk Richardson number (Rib) methods (Sivaraman et al., 2013).
Each of these methods relies on profiles of either dθv/dz or Rib, some in combination with the θv and/or wind speed profiles. As for the Rib profiles, dθv/dz profiles are created by calculating dθv/dz over 30 m bins, at 5 m resolution. 305

Liu-Liang method
The Liu-Liang method is applied in slightly different ways depending on whether the profile includes a CBL, SBL, or NBL. Each off these different implementations are described in the following subsections. 310

Liu-Liang method for a CBL
For a CBL, the ABL height is determined as the height at which an air parcel rising adiabatically from the surface first becomes neutrally buoyant (Stull, 1988). To implement the Liu-Liang method for this type of profile, the lowest altitude, j, in which the following criterion is met is identified: where is the lowest altitude sampled by the DH2 and δu is the θv difference that represents the minimum strength of the θv inversion necessary to indicate the transition from the ABL to the entrainment zone above. δu is defined to be 0.1 K for atmosphere above an ocean/ice surface (Liu and Liang, 2010). In essence, this equation seeks to find the 320 lowest altitude at which θv exceeds its surface value by 0.1K.
Next, the lowest altitude, k, above j, where the virtual potential temperature gradient (θ̇k) is greater than θ̇r according to equation 6 below from Liu and Liang (2010) is identified: where θ̇r is the minimum strength for the overlying θv inversion necessary to identify the top of the ABL, and is specified to be 0.05 K/100 m over ocean/ice by Liu and Liang (2010). The altitude determined by this method is then considered the ABL height for a CBL. The Liu-Liang method applied to a CBL case is shown in Fig. 3a. The Liu-Liang method for determining the height of a NBL states that the height of the NBL is the altitude at which dθv/dz first exceeds 0.05 K/100 m. However, this threshold was found to be inappropriate for the current dataset, so instead a threshold of 2.5 K/100 m is applied. The first altitude at which dθv/dz exceeds 2.5 K/100 m is identified 335 and defined to be the ABL height. Figure 3b shows an example of this adapted Liu-Liang method applied to a NBL case using the threshold of 2.5 K/100m, as well as what the Liu-Liang ABL height would have been using the original threshold of 0.05 K/100m, with a dotted line.

340
There is no recognized equation to determine SBL top height accurately without observations supporting the derivation of turbulent kinetic energy profiles (Stull, 1988;Siebert et al., 2000). However, as a SBL can either be minimally turbulent due to lack of buoyancy below the ABL or more turbulent due to the presence of wind shear (Stull, 1988), Liu and Liang (2010) search for potential ABL heights associated with each scenario. Thus, for a SBL, ABL height is defined as either the top of the bulk stable (θv inversion) layer starting from the ground, or the height of the low-level 345 jet (LLJ) maximum if present, whichever is lower (Liu and Liang, 2010).
To find the height of a SBL defined by a lack of buoyancy, a local minimum in dθv/dz is identified (Liu and Liang, 2010). To be identified as the ABL height, the altitude, k, of this local dθv/dz minimum must meet one of the following two conditions outlined in Liu and Liang (2010): or [ θ̇k +40m < θ̇r and θ̇k +80m < θ̇r ]

355
where θ̇k = dθv/dz at altitude k, δ̇ = 4 K/100 m, and θ̇r is 0.05 K/100 m over ocean/ice (Liu and Liang, 2010). Under the first condition, the ABL top is positioned at the altitude where there is sufficient diminishment of the surface θv inversion. If no point in the profile meets this first criterion, the second criterion is tested, which attempts to identify the first local minimum at altitude, k, in which dθv/dz both 40 and 80 m above k are less than θ̇r. This indicates that the θv inversion has consistently diminished above altitude, k.

360
Lastly, LLJ presence is identified by searching for wind speeds reaching a maximum that is at least 2 m s -1 stronger than the layers above and below (Stull, 1988;Liu and Liang, 2010). If the altitude of the LLJ is below the ABL height determined by one of the above two conditions, the ABL height is adjusted to the LLJ height (example shown in Fig.   3d), If the LLJ is above the previously determined ABL height, then the altitude given by the above conditions is 365 preserved as the ABL height (example shown in Fig. 3c; Liu and Liang, 2010).

Heffter method
The Heffter method uses θv difference across a θv inversion as an indication of ABL height (Sivaraman et al., 2013).
For a CBL or NBL, this method is meant to determine the altitude of the elevated θv inversion marking the entrainment zone between the mixed layer and free atmosphere (Pesenson, 2003). For a SBL, this method determines where the change in strength of the surface θv inversion marks the transition from the ABL to residual layer or free atmosphere 375 above (Stull, 1988).

385
where θ is θv at the top of the θv inversion and θ is θv at the bottom of the θv inversion (Heffter, 1980;Pesenson, 2003).
To apply this method, the bottom and top of a θv inversion are first identified by determining where dθv/dz first goes above 0.5 K/100 m (for the bottom), and then again goes below 0.5 K/100 m at an altitude above the θv inversion base 390 (for the top) (Sivaraman et al., 2013). For some SBL cases, the θv inversion extends to the surface, so dθv/dz at the lowest altitude measured by the DH2 is already greater than 0.5 K/100 m. In these cases, the bottom of the dθv/dz profile is identified as the θv inversion bottom. For other cases, the θv inversion starts near the top of the profile, but dθv/dz does not go below 0.5 K/100 m again within the altitude range sampled. In these cases, θv at the inversion base and at the top of the profile are compared to see if the second criterion is met.

395
Once all θv inversions are identified, the lowest layer in which θv at the top of the inversion is at least 2 K greater than at the bottom of the inversion is chosen (Sivaraman et al., 2013). Within this θv inversion, the altitude at which θv first becomes more than 2 K greater than θv at the bottom of the θv inversion is labelled as the ABL height (Marsik et al., 1995;Delle Monache et al., 2004;Snyder and Strawbridge, 2004;Sivaraman et al., 2013).

400
This method can be applied regardless of stability regime to find ABL height (Sivaraman et al., 2013). The final dθv/dz-based method used to find the ABL height is the virtual potential temperature gradient maximum (TGRDM) method (Dai et al., 2014). Since the ABL is typically capped by a well-defined θv inversion layer (Stull, 1988), even in a weakly stable case, we expect to see a peak in the dθv/dz profile at this point. By finding the maximum in the dθv/dz profile, the altitude at which the θv inversion is at its strongest and weakens above is identified.

415
To apply this method, peaks in the dθv/dz profile where dθv/dz is at least 1.75 K/100 m greater than a minimum dθv/dz above are identified. This helps to identify robust peaks, eliminating false positives resulting from small variability in the profile. It is not required that dθv/dz at the peak be 1.75 K/100 m greater than a minimum dθv/dz below, otherwise an ABL height near the surface when there is a strong surface-based θv inversion will be missed. The ABL height is set to the altitude of this lowest peak. Figure 5 below shows an example of the TGRDM method applied to a case for 420 each stability regime.  (Stull, 1988) being the most widely accepted value, though a value of 0.5 is also often used (Sivaraman et al., 2013;Zhang et al., 2014).
To determine a viable critical value for the DH2 data, a comparison between ABL heights determined from a range of critical values (we used Ribc = 0.25, 0.5, 0.75, 1.0, 1.25, and 1.5) and the subjective ABL heights was conducted.

430
In identifying the ABL height from these different Ribc values, the level above which Rib was consistently greater than the critical value was used to indicate where turbulence was no longer able to form in a laminar atmosphere. For this dataset, four consecutive datapoints (20 m) were required to be above the critical value. The altitude of the lowest of these four consecutive points is identified as the ABL height.

435
To determine which critical value consistently gives the best estimate of ABL height, the mean and maximum absolute difference between the ABL height given by each critical value and the subjective ABL height were calculated. Based on this evaluation, the critical value of 0.5 gives the lowest mean absolute difference, followed by 0.75. These two critical values also give the lowest maximum in the absolute difference. Therefore, further ABL heights presented using the Rib method are calculated with critical values of 0.5 and 0.75. Figure 6 below shows an example of the Rib 440 method applied to a case for each stability regime. Given that only DH2 cases in which the ABL top was clearly identifiable subjectively were used in the current 445 study, then if no altitude is found to satisfy the criteria for identifying ABL height using one the above objective methods, then that objective method does not work to find the ABL height for that case.

Applying the objective methods to radiosonde profiles
As discussed above, some of the objective methods used in this study had to be modified from their original 450 descriptions to work with the Arctic UAS data. This is in part because previous implementations involved analysis of radiosonde profiles and in mid-latitude locations. To determine if these modifications are robust across sensing platforms, they are additionally applied to identify ABL height using radiosonde data from the MOSAiC expedition (Maturilli et al., 2021). To do this, data from radiosondes launched within 3 hours of each DH2 flight were used to repeat the processes outlined in section 2.3 for subjectively identifying ABL height using the θv, humidity, and Rib 455 profiles.
For each radiosonde profile, the θv (calculated from temperature, pressure, and humidity measurements) and wind speed data were used to create the same profiles that were calculated using the DH2 data, and the same routines for objectively identifying the ABL height from each method were applied (Liu-Liang, Heffter, TGRDM, and Rib). Before 460 applying the objective methods, data below 23 m altitude were removed, as the lowest part of the radiosonde profiles were found to show inaccurately warm temperatures for several cases (Maturilli et al., 2021). Additionally, in some cases, the radiosonde data showed anomalously warm measurements some distance above 23 m, which is assumed to be the result of the balloon passing through the Polarstern's exhaust plume. These measurements were adjusted by interpolating the temperature between the closest good measurements above and below where the radiosonde was 465 presumably in the ship's plume. Since applying these adjustment means that radiosonde data near the surface are not available, the stability regime for each profile is calculated by comparing θv between the lowest radiosonde measurement and 30 m above, or the ABL height if lower, to the appropriate threshold value, δs, that is equal to (0.  differ by 1-101 m. Generally, the closer in time that the DH2 and radiosonde were launched, the closer the subjective ABL height and objective ABL heights from the DH2 and radiosonde are to each other. Figure 8 shows the percent difference between DH2 and radiosonde subjective ABLs (top panel), as well as the percent difference between the DH2 and radiosonde objective ABLs for each method (bottom panel) as a function of time difference in minutes 485 between the DH2 and radiosonde launch. The best fit linear regression for each method shows that as time between the DH2 and radiosonde launch increases, the differences in ABL height increase as well. However, the increase in percent difference between subjective ABL height from the DH2 and radiosonde as time between the launches increases is not significant at the 5% significance level (probability value of 0.34). Therefore, we are confident that the ABL height does not significantly change for DH2 and radiosonde launches up to 3.16 hours apart.

495
To determine how well the different objective methods worked, ABL heights identified by each objective method were compared to the subjective ABL heights. Figure 9 shows scatter plots comparing the objective to the subjective ABL height in each case, along with the associated best fit linear regression, coefficient of determination (R 2 ), slope value, and probability value (p-value) resulting from a paired two sample T-test. In some instances, there are two DH2 flights in closest time proximity to the same radiosonde launch, so in this case, the results from that radiosonde profile 500 are plotted only once.
The R 2 value demonstrates a relationship between the objective and subjective ABL heights by explaining how much of the variation in the objective ABL height can be explained by the difference in subjective ABL height, with higher R 2 values indicating a stronger relationship. Slope values are also included to help evaluate the level of correspondence 505 between the subjective and objective ABL heights. Additionally, looking at the intercept combined with the slope value tells us whether the objective method tends to over-or underestimate ABL height. The line of best fit is included in dark blue (red) on Fig. 9a-e, which reflects the slope of the DH2 (radiosonde) datapoints written on each plot, and a line with slope of 1 and y-intercept of 0 is included in dashed black for reference. Lastly, the p-value tells us whether the relationship between subjective and objective ABL height is significant at the 5% significance level (a p-value less 510 than 0.05 indicates significance and vice versa).
Based on the DH2 data in these scatter plots, the method that gives the greatest R 2 value is the Rib method with critical value of 0. 5 (R 2 of 0.667, Fig. 9d), followed by the Rib method with critical value of 0.75 (R 2 of 0.537, Fig. 9e). These are followed closely by the Heffter method (R 2 of 0.483, Fig. 9b). The TGRDM method has the fourth highest R 2 515 value (R 2 of 0.316, Fig. 9c). The only objective method with a very low R 2 value is the Liu-Liang method (R 2 of 0.0898, Fig. 9a The radiosonde data gives a slightly different conclusion. Here, the method that gives the greatest R 2 value is the Heffter method (R 2 of 0.548, Fig. 9b), followed by the Rib method with critical value of 0.5 (R 2 of 0.392, Fig. 9d).

530
The Rib method with critical value of 0.75 and the TGRDM method have lower R 2 values (0.182 and 0.209 and Fig.   9e and 9c, respectively). As was the case for the DH2 data, the only objective method with a very low R 2 number is the Liu-Liang method (R 2 of 0.0128, Fig. 9a), which is also echoed by a slope value far from 1, of 0.25. The slope values for the rest of the methods are not as close to 1 as they are for the DH2 data, but they are all within 0.5 of 1.
The method with slope value closest to 1, and only method with slope greater than 1, is the Heffter method at 1.07, 535 indicating that this method tends to overestimate ABL height when applied to the radiosonde data used in the current study. The rest of the methods have a slope of less than 1 and positive intercept, indicating that they tend to overestimate ABL height for small ABL height, but underestimate it for large ABL height when applied to the radiosonde data used in the current study. Lastly, the p-values follow the same order as the R 2 values, with the lowest p-value found for the Heffter method (indicating the highest significance) and the highest p-value for the Liu-Liang 540 method (indicating the lowest significance). Unlike the DH2 results, for the radiosonde, the p-values for all relationships compared to the 5% significance level show that the relationship between subjective and objective ABL height is significant for every method except the Liu-Liang method, in which the p-value is greater than 0.05.
Lastly, Fig. 9f compares the subjective ABL height from the radiosondes to the subjective ABL height from the DH2.

545
The line of best fit and associated R 2 , slope value, and p-value are shown in purple, alongside a dashed black line of slope 1 and zero intercept for reference. The high R 2 value (0.779) indicates a strong correlation between the two subjective ABL heights, which demonstrates that the ABL height usually did not change much between the DH2 and radiosonde launches in each case. Interestingly, there is enhanced deviation from the line of best fit for lower ABL heights, and better agreement for higher ABL heights, indicating that the ABL height usually varied more between 550 the DH2 and radiosonde launches for shallow ABLs. The very low p-value of 2.68e-22 demonstrates the high significance in the relationship between ABL heights from the DH2 and radiosondes.

Figure 9 goes here
In addition to evaluating general agreement between subjective and objective ABL heights for each method, additional analysis was completed to assess the frequency of objective ABL height determination within a certain percent difference from the subjective ABL height. To do this, an absolute percent difference between the objective and subjective ABL in each case and for each method was determined. These results are included in Fig. 10a for the DH2 profiles, and in Fig. 10b for the radiosonde profiles. For example, just over 30% of ABL heights predicted by the Liu-

560
Liang method were within 10% of the subjective ABL height for the DH2 data. Figure 10a shows that, for the DH2 profiles, the Rib method with critical value of 0.75 predicts the highest percent of cases to be within 10% of the subjective ABL height, followed by the Rib method with critical value of 0.5.
Interestingly, the Liu-Liang method predicts the third highest percent of cases to be within 10% of the subjective ABL 565 height. However, the Liu-Liang method falls behind other methods as the percent difference range is increased above 20%. Additionally, the Liu-Liang method has the highest percent of cases in which no ABL height is found at all for the DH2 profiles. This trend indicates that, while the Liu-Liang method sometimes works to find an ABL height close to the subjective ABL height, it also fails to find an ABL height close to the subjective ABL height, or to find any ABL height, in many cases. Another important finding is that the Rib method using either critical value never fails to 570 find an ABL height and the percent of cases within each percent difference range is greater for the Rib method than that for all other methods.
The information presented in the bar graph for the radiosonde profiles (Fig. 10b) leads to a similar conclusion. As for the DH2 profiles, the Rib method predicts the highest percent of cases to be within 10% of the subjective ABL height 575 (but for this platform, the critical value of 0.5 does best) and the Liu-Liang method predicts the third highest percent of cases to be within 10% of the subjective ABL height. Additionally, as for the DH2 profiles, the Liu-Liang method performs more poorly as the percent difference range is increased. The Liu-Liang method also has the highest percent of cases in which no ABL height is found at all, followed by the Heffter and TGRDM methods, which was also true for the DH2 data. As for the DH2, there are no radiosonde cases in which the Rib method with either critical value 580 finds no ABL height. The main difference between Fig. 10b of the radiosonde data and Fig. 10a of the DH2 data is that, while the Rib method with critical value of 0.75 applied to the DH2 data was always more successful than that with critical value of 0.5 for percent difference ranges below 70%, for the radiosonde data, the Rib method with critical value of 0.5 proves to always be more successful than that with critical value of 0.75.

Figure 10 goes here
After comparing ABL height from the different objective methods to the subjective ABL height for both the DH2 and the radiosondes (Fig. 9 -10), it is found that, with the exception of the Liu-Liang method, all other methods generally provide a reasonable estimate of ABL height for both datasets, with the Rib method being most favorable. Additionally, 590 the efficacy of each method is similar for the DH2 and the radiosonde data, as is indicated by similar patterns in the scatter plots (Fig. 9) and bar plots (Fig. 10).

When the objective methods fail
While the Liu-Liang method sometimes works well, it is not consistent enough to be reliable across a wide range of 595 different profile structures. Specifically, this method often struggles to accurately predict ABL height of SBLs because the dθv/dz criteria are not met anywhere in the profile, usually because a weak θv inversion persists throughout the whole dθv/dz profile, meaning that the method reverts to using the LLJ height as the ABL height. However, the LLJ is usually above the ABL, so this predicts the ABL height to be too high (example: Supplementary Fig. S2 on 24 March at 12:09 UTC). If there is no LLJ in the profile and a weak θv inversion persists throughout the whole dθv/dz 600 profile, then the Liu-Liang method will not identify any ABL height (example: Supplementary Fig. S29 on 30 April at 14:07 UTC). In most NBL cases, the Liu-Liang method is very successful, but it fails when there is a gradual transition between the ABL and the entrainment zone, as opposed to a sharp change in θv slope at the top of the ABL (example: Supplementary Fig. S50 on 17 July at 13:30 UTC).

605
Any of the other objective methods would be a good choice for objectively determining ABL height for a dataset similar to the DH2 and radiosonde datasets (high resolution profiles in the central Arctic environment). However, each method still struggles in some situations, which leads them to either over-or underestimate ABL height. The Heffter method sometimes estimates an ABL height above what is identified as the subjective ABL height when there is a neutral ABL with constant θv with altitude, capped by a weak θv inversion (minimal θv change with altitude). Because 610 the Heffter method identifies the ABL height as the point where θv is 2 K warmer than θv at the bottom of the θv inversion then, in such a scenario this method would identify the ABL as part way through the entrainment zone, while the subjective criteria identify the top of the ABL to be the bottom of the entrainment zone (example: Supplementary   Fig. S52 on 18 July 13:10 UTC). The Heffter method also struggles to identify the ABL height of a SBL where the ABL may extend beyond the altitude at which θv is 2 K warmer than θv at the surface (example: Supplementary Fig.   615 S1 on 23 March at 13:52 UTC) or may not extend all the way to the altitude at which θv is 2 K warmer than θv at the surface (example: Supplementary Fig. S38 on 21 June at 13:13 UTC).
The TGRDM method sometimes struggles to identify the correct ABL height, since it identifies the strongest point of the θv inversion as the ABL height. While this often occurs at the altitude identified as the subjective ABL height, it 620 also sometimes occurs part way through the entrainment zone, resulting in the ABL height estimation by the TGRDM method to be too high (example: Supplementary Fig. S60 on 22 July at 7:37 UTC). This approach also fails in cases of strong surface-based inversions, where the largest θv gradient would be at the surface but the ABL would extend some distance above the surface, as a result of mechanically generated turbulence, despite the strong static stability near the surface. While the DH2 dataset did not have any cases in which the surface-based inversion was strongest 625 right at the surface, this problem did arise in some radiosonde profiles (example: Supplementary Fig. S6 on 7 April) due to the elimination of the near-surface datapoints. It is possible that similar situations may arise when analyzing other datasets, especially in the middle of wintertime in the Arctic.

640
Overall, the objective methods are more likely to agree with each other as well as with the subjective ABL for cases with more simplistic structures, such as those with strong potential temperature inversions with a base at or just below the top of the ABL, those with LLJ core altitude at or just above the top of the ABL, and those with consistently and somewhat gradually increasing θv with altitude above the entrainment zone.

Summary and conclusions
By comparing subjective ABL heights identified visually in θv, humidity (both RH and mixing ratio), and Rib profiles to objectively determined ABL heights, the performance of several different published methods, including the Liu-Liang, Heffter, TGRDM, and Rib methods, are evaluated across 65 DH2 profiles. When comparing objective to 650 subjective ABL height for each DH2 case, the method that appears to be most successful (combination of high R 2 value, low p-value, and slope close to 1) is the Rib method with either critical value (Fig. 9). When calculating the percent of DH2 cases in which the objective ABL height is within certain percent difference ranges from the subjective ABL height, the Rib method with a critical value of either 0.5 or 0.75 is also most successful (Fig. 10). The Heffter and TGRDM methods also produce reasonable results according to Fig. 9 and 10. The only objective method that 655 consistently fails at accurately identifying ABL height is the Liu-Liang method.
In the process of applying these different objective methods to the DH2 data, some threshold and qualifying values were modified to be better applicable to the UAS dataset. While these adjustments were made to best suit the 65 DH2 profiles analyzed in this study which occurred between March and July of 2020, these adjustments should yield better 660 results for identifying ABL height during any season and location in the central Arctic. We hypothesize this because the ABL structures sampled by the DH2 in the current study were diverse and encompass the variety of ABL structures commonly observed in the central Arctic (which are typically shallow and either stable or neutral) throughout the entire year. Additionally, since the locations of the DH2 flights in this study range from deep in the sea ice pack to near the sea ice edge, we are confident that the adjustments made will be applicable for identifying ABL height in 665 either environment. Testing these adjustments outside of the 65 DH2 flights, the modified techniques were also applied to the radiosonde profiles closest in time to each DH2 flight, to determine if the methods work similarly on data from another sensing platform. To test this, the processes of subjective and objective ABL height identification described for the DH2 670 profiles were repeated on the radiosonde profile closest in time to each DH2 flight. Radiosonde profiles closest in time proximity to the DH2 flights were used under the assumption that the ABL structure would be minimally changed between the launch of the two platforms (supported by Fig. 8), and thus applying the methods of subjective and objective ABL height detection would lead to a similar conclusion. For the radiosonde data, the Heffter and Rib methods prove most successful for the radiosondes in terms of having a high R 2 value, low p-value, and slope closest 675 to 1 when compared to the other objective methods (Fig. 9). Additionally, the Rib method also proves most successful when looking at the percent of cases in which the objective ABL height was within different percent difference ranges for the radiosondes, as it did for the DH2 (Fig. 10). Once again, the only method that consistently provided unfavorable results is the Liu-Liang method. These similar conclusions demonstrates that the adapted objective methods are indeed robust across platforms.

680
These findings show that no one method works well 100% of the time. Given this, the best way to accurately identify ABL height across a variety of conditions in the Arctic atmosphere is to visually analyze the θv, humidity, and Rib profiles for each case individually. However, in the case of large datasets that require automated processing techniques, the current study reveals that the Rib, Heffter, or TGRDM methods are most suitable for such a task, with the preferred 685 method being the Rib method with critical value of 0.5. The Liu-Liang method does not provide consistent results in accurately identifying Arctic ABL heights in many cases, especially SBLs ( Fig. 9 -10). The most common occurrence of failure of the objective methods exists for NBLs capped by a weak θv inversion, so that a clear θv slope change between the ABL and entrainment zone is difficult to find. In such cases, the Rib method was found to be most reliable for identifying the ABL height.

690
The methods and results of this study for stability regime and ABL height identification are currently being applied to the entire year of radiosonde data collected during the MOSAiC expedition (October 2019 -September 2020) to create a data product containing year-long statistics on ABL characteristics in the central Arctic. Additional metrics, such as low-level jet height and speed and temperature inversion depth and strength will be included in this product for 695 eventual publication. Additional value from the DH2 data and methods used in the current study comes from the uniqueness of the location and timing of the profiles collected, so these data provide a unique opportunity to evaluate any additional ABL height detection schemes that were not addressed in this study, or that have yet to be developed.  Table 1: List of variables previously used to identify ABL height, as well as the associated literature in which each variable was referenced.

875
The right panel in each case shows the dθv/dz profile, and the ABL height identified by the Liu-Liang method is marked with a green line and the altitude written below the legend. This green line is also included on the θv profile. For the two SBL cases (bottom), a middle panel showing the wind speed profile is included, indicating the location and altitude of the LLJ, if one is present. (a) For the CBL case, a thin black line at dθv/dz = 0.05 K/100 m is plotted on the right panel. (b) For the NBL case, a thin black line at dθv/dz = 2.5 K/100 m is plotted on the right panel.

880
Additionally, a dashed black line using the original Liu-Liang threshold value of dθv/dz = 0.05 K/100 m is plotted on the right panel, as well as a dashed green line at the ABL height associated with using this threshold on both panels. (c) For a SBL case, the ABL found by the Liu-Liang methods is determined using the first buoyancy forcing criteria.
(d) For another SBL case, the ABL found by the Liu-Liang methods is determined as the altitude of the LLJ. For both SBL cases, a thin black line at dθv/dz = 0.05 K/100 m is plotted on the right panel.