Assessment of Mixed-Layer Height Estimation from Single-wavelength Ceilometer Profiles

Differing boundary/mixed-layer height measurement methods were assessed in moderately polluted and clean environments, with a focus on the Vaisala CL51 ceilometer. This intercomparison was performed as part of ongoing measurements at the Chemistry And Physics of the Atmospheric Boundary Layer Experiment (CAPABLE) site in Hampton, Virginia and during the 2014 Deriving Information on Surface Conditions from Column and Vertically Resolved Observations Relevant to Air Quality (DISCOVER-AQ) field campaign that took place in and around Denver, Colorado. We analyzed CL51 data that were collected via two different methods (i.e., via the BLView software, which applied correction factors, and simple terminal emulation logging) to determine the impact of data collection methodology. Further, we evaluated the STRucture of the ATmosphere (STRAT) algorithm as an open-source alternative to BLView (note that the current work presents an evaluation of the BLView and STRAT algorithms and does not intend to act as a validation of either). Filtering criteria were defined according to the change in mixed-layer height (MLH) distributions for each instrument and algorithm and were applied throughout the analysis to remove high-frequency fluctuations from the MLH retrievals. Of primary interest was determining how the different data-collection methodologies and algorithms compare to each other and to radiosonde-derived boundary-layer heights when deployed as part of a larger instrument network.
We determined that data-collection methodology is not as important as the processing algorithm and that much of the algorithm differences might be driven by impacts of local meteorology and precipitation events that pose algorithm difficulties. The results of this study show that a common processing algorithm is necessary for LIght


Introduction
The atmospheric boundary layer (ABL) is the lowermost portion of the troposphere that is directly influenced by the Earth's surface and responds to surface forcing of heat, moisture, pollutant emissions, and momentum on a timescale of 1 h or less (Stull, 1988). The ABL can be defined by a number of criteria depending on the particular interest (e.g., thermodynamic boundary layer, chemical boundary layer (CBL), aerosol mixed layer). The ABL is typically defined by thermodynamic data (i.e., potential temperature and/or skew-T plot) obtained from meteorological sondes. While meteorological sondes have excellent vertical resolution, their temporal resolution is generally poor, ongoing regular sonde launches are labor intensive, and coverage is limited. Conversely, mixed-layer heights (MLHs), as calculated from backscatter light detection and ranging (lidar) instruments, provide both excellent vertical and temporal resolution. Typical analysis of lidar data involves identification of gradients within the aerosol profile (Brooks, 2003), which are generally considered to be markers for the MLH. With respect to air quality, the top of the ABL often acts like a lid on the lowest layer of the atmosphere and temporarily traps the majority of near-surface anthropogenic and biogenic emissions. As a result, the vertical distribution of ambient air pollutants and associated precursors within the ABL and lower troposphere is strongly influenced by the height of and vertical mixing within the ABL.
ABL variability complicates quantitative determination of surface trace-gas levels from a remote-sensing platform (Coen et al., 2014; Herman et al., 2009; Knepp et al., 2015; Lamsal et al., 2008, 2014; Petritoli et al., 2004; Piters et al., 2012). Therefore, properly accounting for ABL variability from a continuous measurement system such as lidar will provide invaluable information to policy, health, modeling, and remote-sensing communities for applications sensitive to the vertical profiles of tracers (Compton et al., 2013; Martin, 2008; Scarino et al., 2014). In 2009, the United States National Research Council highlighted ABL height as a high-priority observation needed to improve mesoscale predictions of air quality, short-range severe-weather forecasting, and regional climate modeling (NRC, 2009). More recently, the National Plan for Civil Earth Observation called for improved observation density and sampling of the boundary layer (NSTC, 2014). In 2015, as part of the revisions to the ozone (O3) National Ambient Air Quality Standards, the US Environmental Protection Agency (EPA) finalized a new requirement under the Photochemical Assessment Monitoring Stations (PAMS) program for the collection of continuous MLH observations. By 2019, the PAMS program will have involved the implementation of approximately 50 air-quality sites in the United States that provide continuous MLH. Kotthaus et al. (2016) showed that intercomparison of ceilometer data is not a straightforward endeavor. An intercomparison of ceilometer instrumentation was carried out in support of upcoming PAMS monitoring requirements. Results from an intercomparison of three backscatter lidar instruments from the 2014 DISCOVER-AQ field campaign in Colorado (low aerosol load) and the Chemistry and Physics of the Atmospheric Boundary Layer Experiment (CAPABLE) site at NASA's Langley Research Center (LaRC; moderate aerosol load) in Hampton, Virginia are presented herein.

CL51
The Vaisala (Vantaa, Finland) CL51 ceilometer is a single-wavelength (eye-safe Class 1M InGaAs diode laser emitting at 910 ± 10 nm, pulsed at 6.5 kHz with a 110 ns pulse width with average pulse power of 19.5 mW, and an avalanche photodiode detector centered at 915 nm), single-lens, lidar system originally designed to report cloud-base heights and visibility. More recently, ceilometers have been used to estimate MLHs (Emeis and Schäfer, 2006; Emeis et al., 2008a, b; Haeffelin et al., 2012; Morille et al., 2007; Schäfer et al., 2012, 2013; Schween et al., 2014; Sokol et al., 2014; Wiegner et al., 2014). These ceilometers have a 10 m vertical resolution (with 10 m overlap) up to a maximum altitude of 15.4 km (precision of ± the greater of 1 % or 5 m; all altitudes are with respect to ground level) and up to 2 s temporal resolution (depending on the control software), though profiles are generally averaged over 16-36 s to improve the signal-to-noise ratio (see Sect. 3.1 for more details). An example backscatter plot that includes increased signal at 3 km due to transport of smoke from a Canadian forest fire is presented in Fig. 1.
The CL51 was designed to operate continuously, regardless of meteorological conditions, in an autonomous manner with minimal user support. Due to the emission wavelength's proximity to the near-infrared water vapor bands, ceilometers operating at the stated wavelengths experience water vapor interference, thereby lessening their utility in retrieval of aerosol optical properties. However, the interference on aerosol profile and MLH estimation is negligible (Wiegner et al., 2014).
Two CL51s were deployed as part of the 2014 DISCOVER-AQ mission in Colorado (Golden and Erie, Colorado). Before and after deployment, these ceilometers were set up to continually collect data at the CAPABLE site and the EPA Ambient Air Innovative Research Site (AIRS) in Durham, North Carolina. The ceilometers were collocated with meteorological sonde (met-sonde) launch sites during the DISCOVER-AQ campaign and at the CAPABLE site, allowing a direct intercomparison of the sonde and lidar ABL/MLH methodologies. Furthermore, during the DISCOVER-AQ campaign the ceilometers were collocated with other lidar instruments. Intercomparisons are presented in Sect. 5.
Atmos. Meas. Tech., 10, 3963-3983, 2017, www.atmos-meas-tech.net/10/3963/2017/

Full-profile collection
The Vaisala standard MLH retrieval is based on a proprietary wavelet/gradient technique built into the logging/analysis software BLView. The BLView software provides not only logging and data analysis (e.g., MLH and cloud-height estimates) but also archiving capability. While the CL51 reports backscatter up to 15.4 km, BLView truncates the data collection at 4.5 km, precluding the ability to monitor upper-troposphere/lower-stratosphere transport of aerosol, smoke, or ash from major events. Therefore, a full-profile collection method that can run side by side with the standard data-collection software was developed and implemented. Data transmission from the ceilometer to the logging computer was achieved by splitting an RS-232 connection into two ports on the logging computer: one port logging to BLView and the other logging to a custom script (e.g., as written in Python). The primary drawback of using a secondary script to log the full profile (as opposed to logging in BLView) is the inability to apply proprietary calibration coefficients that are built into the BLView software to the logged data. However, as shown in subsequent sections, this impacts neither the MLH estimates nor the general profile shape substantially.

Micropulse lidar
Elastic lidar observations were performed using a Sigma Space (Lanham, Maryland) Micropulse lidar (MPL), previously described by Spinhirne (1993) and Welton et al. (2000). Briefly, the MPL transmitter consists of an eye-safe Nd:YLF laser emitting at 527 nm and pulsed at 2.5 kHz with a pulse power of 6-10 µJ. It has a software-programmable vertical resolution, with possible values of 15, 30, and 75 m (up to 25 km), and temporal resolutions ranging from 1 s to 15 min. The receiver consists of a 178 mm telescope that collects the backscattered light, which is then focused onto a photon-counting silicon avalanche photodiode (APD). The APD output is recorded by a field-programmable gate array data system that enables display and storage of range-dependent average count rates on a laptop computer. The raw data are converted to aerosol attenuated backscatter, correcting for instrumental factors such as detector dead time, geometrical overlap, background subtraction, and range-squared normalization. Recorded lidar profiles have temporal and vertical resolutions of 1 min and 30 m, as set by the University of Maryland Baltimore County (UMBC) team for the DISCOVER-AQ campaign. The MPL is used for continuous recording of aerosol profiles and optical properties and for calculating MLH values.

Meteorological sondes and ozonesondes
A meteorological sonde (herein referred to as sonde/radiosonde) is the conventional method for measuring temperature, pressure, and humidity throughout the atmosphere, and for characterizing the ABL. Radiosondes were used to identify steep gradients within the potential temperature (theta) profile (Fig. 2a) as identified by the Heffter criteria shown in Eqs. (1) and (2), where θ is potential temperature in Kelvin, Z is altitude in meters, and the subscripts top and base refer to the potential temperature at the top and bottom of the proposed inversion layer, as described by Heffter (1980) and Marsik et al. (1995). This thermodynamic ABL is a product of atmospheric turbulent kinetic energy and lapse rate. Similar gradients can be seen in chemical and aerosol profiles as well (Fig. 2b and c). For the current study, radiosondes from International Met Systems (iMet; Grand Rapids, Michigan) and ozonesondes from Droplet Measurement Technologies (DMT, now En-Sci; Boulder, Colorado) were used. iMet sondes require no preparation and were used as received from the manufacturer, while ozonesondes were conditioned according to the procedure defined by the World Meteorological Organization recommendations (Smit, 2013).
Results of numerous analyses have been published to illustrate differences between the various chemical and meteorological sensors and to show how differing meteorological sensors influence secondary chemical measurements such as ozone (Deshler et al., 2008; Dirksen et al., 2014; Johnson et al., 2002; Miloshevich et al., 2004; Nash et al., 2006, 2011; Smit, 2013; Stauffer et al., 2014). While these influences can impact the derived CBL, the ABL and MLH remain unperturbed. Therefore, the remainder of the current work focuses on the MLH and ABL, with CBL variability regarded as outside the current scope.
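The Heffter test described above can be sketched in a few lines of Python. The equations are not reproduced in this text, so the thresholds used here (a lapse rate of at least 0.005 K m−1 within the layer and a potential-temperature increase of at least 2 K across it) are the commonly quoted form of the criteria and should be read as assumptions, as should the convention of reporting the lowest level that is 2 K warmer than the layer base. The function name and arguments are hypothetical.

```python
def heffter_abl_height(z_m, theta_k, min_lapse=0.005, min_jump=2.0):
    """Return an ABL height (m) from a potential-temperature profile, or None.

    z_m and theta_k are equal-length sequences ordered by increasing altitude.
    min_lapse (K/m) and min_jump (K) are assumed threshold values.
    """
    n = len(z_m)
    i = 0
    while i < n - 1:
        lapse = (theta_k[i + 1] - theta_k[i]) / (z_m[i + 1] - z_m[i])
        if lapse < min_lapse:
            i += 1  # not stable enough to start an inversion layer here
            continue
        # Extend the candidate layer upward while the lapse rate stays above threshold.
        base, top = i, i + 1
        while top < n - 1:
            nxt = (theta_k[top + 1] - theta_k[top]) / (z_m[top + 1] - z_m[top])
            if nxt < min_lapse:
                break
            top += 1
        if theta_k[top] - theta_k[base] >= min_jump:
            # Report the lowest level that is min_jump warmer than the layer base.
            for k in range(base, top + 1):
                if theta_k[k] - theta_k[base] >= min_jump:
                    return z_m[k]
        i = top  # layer too weak; resume the search above it
    return None
```

Returning None when no layer qualifies keeps weak or absent inversions from being reported as an ABL height.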

BLView
BLView makes use of variable time and altitude averaging when calculating the MLH. Typical averaging time ranges from 14 min at night to 52 min during clear-sky, daytime conditions and is automatically adjusted within the software according to signal-to-noise ratio. Altitude averaging varies with altitude and ranges from 80 m near the surface to 360 m above 1.5 km. Further, BLView selectively removes false-positive MLH identifications by requiring a minimum number of similar MLH values (±140 m) within the last several minutes and has the ability to discriminate between MLH inversions and changes in backscatter intensity induced by clouds, precipitation, and fog.
Advantages of the BLView software are the standardization of retrieval parameters and a user interface that provides flexibility in setting user-specified sensitivities. These come at the cost of a database system that makes access to raw data difficult and makes it impossible to batch process archived data, posing a severe limitation on reprocessing data sets with a long record history.

STRAT
The STRucture of the ATmosphere (STRAT v1.04) algorithm was developed under a GNU General Public License to analyze aerosol vertical profiles as measured by lidar and to estimate cloud heights and aerosol MLHs from a variety of lidar instruments. It is currently in use by the European Aerosol Research Lidar NETwork (EARLINET) (Haeffelin et al., 2012; Hirsikko et al., 2014; Morille et al., 2007; Pappalardo et al., 2014). STRAT uses a covariance wavelet technique (CWT), the full details of which can be found in Morille et al. (2007) and Haeffelin et al. (2012). STRAT can be run exclusively in MATLAB or in a combination of MATLAB and Python. Due to its widespread use throughout the European network, it is considered here to be a viable open-source alternative to BLView.
While BLView provides limited user control of the retrieval process, which is beneficial with regard to standardizing the retrieval process across a network, STRAT provides a significantly greater amount of user control. Such control is desirable since retrieval parameters in a heavily polluted region will likely be different from those in a clean environment. Further, STRAT is provided as raw scripts as opposed to BLView's compiled executable, making STRAT platform independent and highly user configurable. STRAT can also run batch jobs, which is useful when reprocessing data from instruments that have a long record history.
The STRAT algorithm implements a user-defined normally distributed weighting function in both the temporal and vertical domains to smooth the data, similarly to BLView. In the current study, the STRAT averaging time and vertical resolution were set to match the BLView settings as much as possible for intercomparison. An analysis of how well the two MLH algorithms agree is presented below.
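As an illustration of what a normally distributed weighting function does to a backscatter series, the following minimal Python sketch smooths a one-dimensional sequence with Gaussian weights. It is not the STRAT implementation: the function name, the truncation of the kernel at three standard deviations, and the renormalization at the edges are assumptions made for this example. Applying it once along time and once along range would approximate smoothing in both domains.

```python
import math


def gaussian_smooth(values, sigma_bins):
    """Smooth a 1-D sequence with Gaussian weights (sigma given in bins)."""
    half = int(math.ceil(3 * sigma_bins))  # truncate the kernel at 3 sigma (assumed)
    smoothed = []
    for i in range(len(values)):
        num = 0.0
        den = 0.0
        # Use only the bins that exist, so the weights are renormalized at the edges.
        for k in range(max(0, i - half), min(len(values), i + half + 1)):
            weight = math.exp(-0.5 * ((k - i) / sigma_bins) ** 2)
            num += weight * values[k]
            den += weight
        smoothed.append(num / den)
    return smoothed
```

A larger sigma_bins trades resolution for noise suppression, which is the same trade-off behind the variable averaging windows described for BLView.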

UMBC algorithm
The UMBC algorithm was developed independently for estimating MLHs from lidar backscatter profiles using a CWT similar to STRAT. The STRAT software was designed specifically for single-channel lidars (primarily ceilometers) and is not readily customizable to other lidar systems, such as the MPL. The UMBC algorithm was designed to be more flexible than STRAT in that regard and uses a CWT to identify the sharp gradient changes indicative of the MLH (Davis et al., 2000; Brooks, 2003). A detailed description of the UMBC algorithm has been published in Compton et al. (2013).
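The core idea of a covariance wavelet technique can be sketched with a Haar wavelet, following the general form described by Brooks (2003): the profile is multiplied by +1 in the half-window below a candidate height and by −1 in the half-window above it, and the height that maximizes the result marks the sharpest decrease in backscatter. The Python below is a minimal sketch under stated assumptions (uniform range grid, a single fixed dilation, no cloud or noise screening) and is neither the STRAT nor the UMBC code; the function names are hypothetical.

```python
def haar_covariance(z, profile, dilation, center):
    """Covariance of a backscatter profile with a Haar wavelet at one height."""
    dz = z[1] - z[0]  # assumes a uniform range grid
    total = 0.0
    for zi, fi in zip(z, profile):
        if center - dilation / 2 <= zi <= center:
            total += fi * dz  # +1 lobe below the candidate height
        elif center < zi <= center + dilation / 2:
            total -= fi * dz  # -1 lobe above the candidate height
    return total / dilation


def estimate_mlh(z, profile, dilation):
    """Return the height at which the Haar covariance is largest."""
    # Only evaluate heights where the full wavelet fits inside the profile.
    candidates = [b for b in z if z[0] + dilation / 2 <= b <= z[-1] - dilation / 2]
    return max(candidates, key=lambda b: haar_covariance(z, profile, dilation, b))
```

The dilation sets the depth of the transition the wavelet responds to; operational algorithms typically choose or scan it rather than fixing it as done here.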

CAPABLE site
The CAPABLE site was established at LaRC, in the greater Hampton Roads region (a group of cities in coastal Virginia, also known as Tidewater Virginia: Virginia Beach, Norfolk, Chesapeake, Newport News, Hampton, Portsmouth, Suffolk, Poquoson, Williamsburg), to continuously monitor air-quality and meteorological parameters to bridge the gap between satellite observations and ground conditions (i.e., where pollutants directly impact living organisms), improve applicability of satellite data to the air-quality user community, and act as a long-term satellite validation site. CAPABLE has a suite of in situ and remote-sensing instruments, including a CL51 ceilometer and sounding station. These instruments allow thorough sampling of the atmosphere to provide valuable in situ and profile information within the lower troposphere in a highly complex (due to bay-breeze events; see Martins et al., 2012) and moderately polluted (NOx, SO2, aerosols) environment, yielding valuable satellite ground-truthing and model a priori estimates.
CAPABLE (37.103° N, 76.387° W, 5 m a.s.l.) is located on a peninsula between the James River to the southwest, the Chesapeake Bay to the north, and the Atlantic Ocean to the east. The Hampton Roads region can be described as moderately polluted. Aerosol statistics (PM2.5 and aerosol optical thickness (AOT) recorded by a sun photometer within the AERosol Robotic NETwork (AERONET; Holben et al., 1998)) are presented in Table 1. The data show that AOT loads at CAPABLE are significantly higher than at the corresponding Colorado sites, particularly in the lower size distributions (i.e., lower wavelengths in Table 1).
As part of FRAPPE, the University of Wisconsin's (UW) Space Science and Engineering Center trailer, which housed a high spectral resolution lidar and from which regular sonde launches were performed, was stationed at the site. The UW trailer temporarily housed a CL51 during the mission. Due to the proximity of the UW trailer, both ceilometers experienced the same chemical, aerosol, and meteorological conditions.

Golden, Colorado
CL51 data were collected at the Golden, Colorado site (39.750° N, 105.183° W, 1850 m a.s.l.) (considered to be a clean environment compared to CAPABLE; see Table 1) from 14 July to 12 August 2014 as part of the DISCOVER-AQ field mission. The Golden site was located next to the National Renewable Energy Laboratory (NREL) on Table Mountain mesa (a flat-topped geographic structure). Due to the site's elevation on the mesa and its limited emissions sources, conditions at the Golden site were generally clean from an aerosol perspective and did not typically experience a well-developed ABL/ML. This is demonstrated in Fig. 3 by the lack of structure in the diurnal MLH profile. While both the BAO and CAPABLE sites demonstrate the expected nocturnal low and daytime high MLHs, the Golden diurnal variability is not as well defined, consistent with ABL development in mountainous terrain (Banta, 1984; Tripoli and Cotton, 1989; Bossert et al., 1989; Bossert and Cotton, 1994).
The Golden site housed the US EPA trailer, the LaRC ozone lidar, MPL and LEOSPHERE ALS-450 lidar operated by UMBC, a SOnic Detection and Ranging (SODAR) instrument operated by Millersville University (MU), and regular met-sonde launches from the MU group.

Analysis
Lidar data collected during the DISCOVER-AQ campaign had sampling times that ranged from 36 to 60 s, while sonde-profile data had average measurement times of 1 s. Due to the nature of sounding data sets, sonde-based ABLs were not averaged to 5 min resolution. To harmonize lidar data sets to a common time frame, the data were averaged to 5 min resolution unless otherwise specified. Further, it is well known that the atmosphere changes throughout the day due to surface heating, etc. (hence, driving ABL variability). Therefore, some of the analyses were broken into 4 h segments to remove biases caused by time-of-day influences. Since the primary objective of this assessment was to understand how the CL51 MLH compared with other instruments and methods, all analytical results are presented in relation to the CL51.
The analysis was performed using several ceilometer MLH products for a thorough comparison of instruments (CL51, MPL, and met-sondes), collection methods (allowing BLView to collect profile data with application of calibration factors vs. logging raw data with a custom Python script), and data-processing algorithms (BLView vs. STRAT and custom MLH scripts from UMBC). Assessment of data-acquisition methodology is presented first, followed by a comparison of MLH retrieval algorithms applied to data collected by a single instrument, and then a comparison of the various instrumentation. As MLH variability follows a distinct diurnal cycle as shown in Fig. 3, all dates and times are presented in local standard time.

Data acquisition
Data-acquisition methods were analyzed to determine whether the CL51 data-logging methodology influenced the MLH estimate. As described above, CL51 profile data were logged using two methodologies: BLView and a custom Python routine. The BLView software has the advantage of applying the ceilometer's calibration factors and preconditioning the profiles (here referred to as BLView; note, however, that this refers to the backscatter profile that is logged by BLView and not the BLView-calculated MLH), while the Python script logged the raw incoming data stream up to the full profile (FP) height (i.e., 15.4 km). The question was whether application of the lidar calibration factor influences the MLH estimate. This question is addressed in Sect. 5.1.2, but first, viable filtering criteria to remove spurious MLH fluctuations from the data set were developed prior to analysis, as discussed in Sect. 5.1.1.

Filtering criteria
Regardless of the data-acquisition method (i.e., BLView or Python), pragmatic data-selection criteria were needed for quality control. Since ABL and MLH variations occur in a generally smooth manner, it is expected that the variance within a short time interval will be minimal, and that any larger variance is indicative of other events (e.g., precipitation, frontal systems, window contamination). Therefore, cutoff criteria for implementing data filtering were identified. This portion of the analysis was conducted first, because application of these cutoff criteria will influence the data acquisition comparison (i.e., BLView-corrected data vs. raw data collected via the Python script).
Despite the atmosphere's smooth variation in ABL and MLH, these parameters do change substantially over long periods of time (e.g., an hour or day), with SDs significantly increasing over the longer time periods and during rapid transition events. Therefore, the current analysis was performed on short-time-series data (i.e., MLH resampled to 5 min resolution) to eliminate bias caused by natural low-frequency changes. Figure 4 shows a series of percentile plots for data collected at LaRC (N > 30E5), where the SD of MLH was calculated over 5 min intervals and subsequently averaged to provide mean SD every 4 h. This figure elucidates the variability of the MLH SD for both collection methods and algorithms. Except for the afternoon period (12:00-19:00, local time) when the variability is slightly increased, 85 % of the data fall within one SD (≈ 0.20 km) regardless of time of day. Therefore, data with a 5 min SD greater than 0.20 km were removed from subsequent analysis (labeled "filtered"). Data with a relative SD greater than or equal to 20 % were also removed. Implementation of these filter criteria removed up to 10 % of the data at each site. This filtering method is further supported by observing the variability in the BLView and Python-collected data sets (both processed in STRAT) in relation to backscatter curtains (Fig. 5), where it is observed that much of the difference between the BLView and Python-collected data occurs during times of high variability or precipitation (e.g., 19:00-24:00 in Fig. 5). During such events, neither collection method is expected to provide valid MLH estimates; rather, to overcome such discrepancies, if possible, the MLH algorithms must be adjusted accordingly.
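The two cutoff criteria can be expressed compactly. The Python sketch below groups MLH estimates into 5 min blocks, computes the block mean and SD, and drops blocks whose SD exceeds 0.20 km or whose relative SD is 20 % or more. The function and argument names are hypothetical, and the use of the population SD is an assumption, since the estimator is not specified in the text.

```python
from statistics import mean, pstdev


def filter_blocks(times_s, mlh_km, block_s=300, max_sd_km=0.20, max_rel_sd=0.20):
    """Average MLH into fixed blocks and drop blocks that are too variable.

    Returns a list of (block_start_s, mean_mlh_km) for the blocks that pass.
    """
    blocks = {}
    for t, h in zip(times_s, mlh_km):
        blocks.setdefault(int(t // block_s) * block_s, []).append(h)

    kept = []
    for start in sorted(blocks):
        values = blocks[start]
        avg = mean(values)
        sd = pstdev(values)  # population SD; the choice of estimator is an assumption
        if sd > max_sd_km:
            continue  # absolute criterion: SD greater than 0.20 km
        if avg > 0 and sd / avg >= max_rel_sd:
            continue  # relative criterion: SD of 20 % or more of the block mean
        kept.append((start, avg))
    return kept
```

The relative criterion matters mostly for shallow layers, where an SD well under 0.20 km can still be a large fraction of the MLH itself.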

Collection method dependence
To determine whether the data-collection method influenced MLH estimates, both BLView and Python-collected backscatter profiles were processed with a common algorithm (STRAT) using identical input configuration files. Both the BLView and FP profiles were processed using the STRAT algorithm as described in Sect. 3.2, followed by a 5 min block average.
The data were replotted as correlation plots with the z axis being representative of the immediate data density (a dimensionless value that has been scaled to 1). The data density was calculated by implementing a Gaussian-based kernel-density estimation (Scott, 1992; Silverman, 1986) as supplied in Python's scipy.stats.kde module, represented mathematically in Eqs. (3)-(5), where X is the 2 × n array of the x and y vectors (i.e., flattened and stacked atop one another), n represents the number of points within each data set (assuming data sets are of equal length), f is the Scott's factor (n^(−1/(d+4))), d is the number of independent data sets analyzed, and Eq. (5) is evaluated over the range 1 to n. As these density values are used as weights in subsequent calculations, the output vector is labeled w here. It is observed that the majority of MLH estimates fall along the 1 : 1 line (center column in Fig. 6), though there is significant scatter along both axes.
Figure 6 was divided into 4 h blocks to identify any time-of-day dependence. The figure shows that most of the data continued to fall along the 1 : 1 line regardless of time of day, as indicated in the CAPABLE and BAO Tower density plots. The Golden site displays some disruption in the 16:00-19:59 panel, but the source of this discrepancy is currently unknown. It has become clear, however, that the meteorology at the Golden site is different from that observed at CAPABLE and BAO Tower. It is suggested that this difference is primarily driven by orographic perturbations as well as the Golden site's location atop a mesa, both of which can inhibit formation of stable ABL and ML (Bossert et al., 1989; Bossert and Cotton, 1994; Tripoli and Cotton, 1989).
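To make the density calculation concrete, the following dependency-free Python sketch evaluates a two-dimensional Gaussian kernel-density estimate at each data point and scales the result so that the largest value is 1. It mirrors the general approach of SciPy's Gaussian KDE (kernel covariance equal to the sample covariance multiplied by the square of Scott's factor) but is written out by hand for illustration; the function name is hypothetical, and the exact form of Eqs. (3)-(5) is not reproduced in this text.

```python
import math


def density_weights(x, y):
    """Gaussian kernel-density estimate at each (x, y) point, scaled to a maximum of 1."""
    n = len(x)
    d = 2                          # two variables: x and y
    f = n ** (-1.0 / (d + 4))      # Scott's factor
    mx, my = sum(x) / n, sum(y) / n
    # Kernel covariance: sample covariance of the data scaled by f**2.
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1) * f ** 2
    syy = sum((b - my) ** 2 for b in y) / (n - 1) * f ** 2
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1) * f ** 2
    det = sxx * syy - sxy ** 2     # zero if the points are perfectly collinear
    norm = 2.0 * math.pi * math.sqrt(det) * n
    dens = []
    for xi, yi in zip(x, y):
        total = 0.0
        for xj, yj in zip(x, y):
            dx, dy = xi - xj, yi - yj
            # Quadratic form using the inverse of the 2 x 2 kernel covariance.
            q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
            total += math.exp(-0.5 * q)
        dens.append(total / norm)
    peak = max(dens)
    return [v / peak for v in dens]
```

The double loop costs n² kernel evaluations, so for data sets of the size analyzed here a library routine would be the practical choice; the sketch is only meant to show what the weights w represent.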
For regulatory and modeling applications, 1 h averages are standard, requiring the data to be averaged down to 1 h resolution. The impact of the filtering criteria and resampling to 1 h resolution throughout the day can be seen in Fig. 7. Note that the density of data around the 1 : 1 line is readily apparent in Fig. 7; therefore the z axis has been converted to relative SD to show the relative variability within each 1 h time block, after application of filtering criteria. The intention is to provide some understanding of how much the MLH will change within the model and regulatory applications' time frame. Table 2 presents statistics on the aggregate analysis. While the aggregate coefficients of correlation and line-of-best-fit (LOBF) equations do not change substantially after resampling to 1 h blocks, the scatter is dramatically reduced. This is likely due to the scatter being evenly distributed around the 1 : 1 line and the majority of data points falling along the 1 : 1 line, as observed in the data-density panels of Fig. 7.

Table 2. Summary of aggregate statistics for the Python-collected (FP)/STRAT-processed and the BLView-collected (BLV)/STRAT-processed MLH estimates (y and x, respectively). Data were resampled to 5 min resolution followed by application of filtering criteria to both data sets (lines labeled 1 h present statistics after data were filtered and subsequently resampled by a 1 h block average). Values in parentheses indicate percent of the difference value with respect to the BLView-derived MLH.

It can be concluded from the current analysis that the majority of variability was driven by local atmospheric fluctuations and events that cannot be readily accounted for within the algorithms. In addition, no significant difference is observed between the BLView- and Python-collected data sets on the timescales relevant to model inputs and atmospheric variations when processed on a common algorithm. Findings presented in Sect. 5.1.3 further support this conclusion.

Figure 7. Same data set as in Fig. 6, but with the data resampled to 1 h means after application of filtering criteria. Due to the sparseness of the data compared to Fig. 6, there is no need to present data density on the z axis. Here, the z axis represents relative SD to show the relative variability within each 1 h block after filtering. (Axis label: Logged in BLView, km a.g.l.)

MLH algorithm dependence
In the previous section, the data collection method (i.e., Python vs. BLView) was shown to have little impact on the derived MLH values when the two data sets were processed using a common algorithm (STRAT). The question remains of how the two data sets compare when processed with different algorithms. To answer this question, data collected with the Python script were processed using the STRAT algorithm and were compared with data collected and processed with BLView.
Figure 8 presents scatter plots similar to those in Fig. 6, but with data collected and processed using the two different methods. Most data continued to fall along the 1 : 1 line, as shown in the density plots, and much of the scatter is caused by short-term variability. However, in contrast to Fig. 6, the scatter is neither as evenly distributed nor as tightly grouped around the 1 : 1 line. The STRAT-derived MLHs were generally lower than those calculated in BLView (given by the slopes) at all sites, while the aggregate mean difference shows the opposite for the Colorado sites (Table 3), which is likely driven by outliers.
The agreement between the two data sets is less than when a common algorithm was employed (Table 3). Despite the increased scatter, a significant subset of data remains along the 1 : 1 line. As a test for how well the data fit the 1 : 1 line, the R and LOBF values were recalculated using Eq. (5) with weights applied according to data density. Therefore, points that had a greater number of surrounding data points received more weight, while more isolated points received less weight. Weighted coefficients of correlation were calculated using Eq. (6), where variables with a w subscript indicate weighted means. Weighted regressions were performed by simultaneously solving the modified normal equations of regression shown in Eqs. (7) and (8) with weighting factors applied.
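A weighted correlation coefficient and a weighted line of best fit of the kind described above can be computed from weighted means, variances, and covariance. The Python sketch below uses the standard weighted-least-squares solution of the normal equations; it is offered as an illustration consistent with the description, not as a transcription of Eqs. (6)-(8), which are not reproduced in this text. The function name is hypothetical, and w stands for the density-derived weights.

```python
import math


def weighted_fit(x, y, w):
    """Return (slope, intercept, r) for a weighted straight-line fit of y on x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    cxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    cyy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    cxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    slope = cxy / cxx               # solution of the weighted normal equations
    intercept = my - slope * mx
    r = cxy / math.sqrt(cxx * cyy)  # weighted coefficient of correlation
    return slope, intercept, r
```

Because isolated points carry small weights, a few outliers far from the 1 : 1 line change the weighted slope and R much less than they change the unweighted values.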
These weighted statistics are not included to suggest that the agreement has actually improved (R), nor do they suggest improved predictability (LOBF). Rather, the improved R values and slopes reflect the degree to which the data are predominantly distributed around the 1 : 1 line to the exclusion of other regions. As an example, the improvement in the Golden regressions, despite weighting, is notably less than at the other two sites. This is likely due to more spread in the data, which mitigates the influence of the points along the 1 : 1 line in the regression analyses. Therefore, the preponderance of the data collected at the CAPABLE and BAO Tower sites falls nearer the 1 : 1 line when processed using the different algorithms compared to the data collected at the Golden site. Further, despite most data falling nearer the 1 : 1 line for these two sites, influences remain that neither the STRAT configuration nor the current filter methodology can account for, which likely drives the poor correlation in contrast to Table 2. This is possibly a product of how the differing algorithms handle atmospheric interferential events (e.g., precipitation, fog). The application of a filtering methodology to account for and remove these events will be the subject of future study.
Finally, the analysis was repeated by using STRAT to process backscatter data collected by BLView for comparison with the BLView-collected/processed product. As concluded in Sect. 5.1.2, the data collection method had little influence on the MLH estimation when both data sets were processed using a common algorithm (STRAT). Based on that conclusion, it would be expected that the current comparison would be similar to the previous comparison as summarized in Table 3. This is, in fact, what was observed. The aggregate statistics for the BLView-collected, STRAT-processed vs. BLView-collected/processed intercomparison are presented in Table 4, wherein we see similarity with Table 3. These findings further support the conclusion that data collection methods (including application of calibration factors) play much less of a role in identifying a qualitative gradient within the profile than the choice of MLH algorithm. Indeed, it can be concluded that choice and configuration of the algorithm are critical and that, for network intercomparisons, all networked lidar systems should have their data processed by a common algorithm.

Sonde intercomparison
Meteorological soundings have been a staple for profiling the atmosphere and deriving ABL heights for decades. These ABL heights are typically derived using potential temperature (e.g., using the Heffter criteria) or through analyzing skew-T, log-P plots that implement potential temperature, both of which are different from the gradient-based MLH algorithms implemented here. As ABL data are typically used in chemical transport models, it is necessary to determine how these MLH data compare to the sonde-derived ABL data collected at the three measurement locations.
Intercomparison of sonde-based ABL and ceilometer-based MLH can be complicated due to the fundamentally different nature of the two observations. Sondes provide a direct measurement of the atmosphere, while ceilometers provide an indirect (i.e., remotely sensed) measurement. Therefore, care must be taken when comparing the two sets of observations. Further, the aerosol profile can be impacted by aerosol layers transported aloft, thereby offsetting the MLH estimate.

Table 3. Summary of statistics for the Python-collected/STRAT-processed and the BLView-collected/BLView-processed MLH estimates. Values in parentheses indicate percent of the difference value with respect to the BLView-derived MLH, and the w subscript indicates a weighting function was applied. Data were resampled to 5 min resolution followed by application of filtering criteria to both data sets (lines labeled 1 h present statistics after data were filtered and subsequently resampled by a 1 h block average).
Since the sondes capture an ephemeral snapshot of the atmosphere's current conditions and traverse several kilometers in the horizontal direction due to winds, the ceilometer data were averaged over 30 min for comparison. Additionally, each measurement can be impacted by atmospheric phenomena that affect the two measurements in different ways and, in turn, affect their comparison. Met-sondes can be impacted by local updrafts and downdrafts, resulting in ABL estimates that are higher or lower than the time- or space-averaged MLHs. The response time of the sensors is less than 1 s, thereby minimizing offset in vertical structure. The CL51 MLH is calculated based on identification of a sufficiently steep, vertically averaged backscatter gradient, so if there are additional aerosol layers just above the MLH, the contrast between the aerosol layers might not be strong enough for the CL51 to identify each layer or the correct altitude of the MLH.
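The 30 min averaging of ceilometer MLH around a sonde launch can be sketched as follows. This is an illustration under stated assumptions rather than the processing code used in the study: the function name `mlh_near_launch`, the window centered on launch time, the use of the sample standard deviation as the spread, and all sample values are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def mlh_near_launch(times, mlh, launch, window=timedelta(minutes=30)):
    """Mean, sample SD, and count of MLH retrievals in a window centered on launch.

    Retrievals flagged as missing (None) are skipped. Returns None when no
    retrieval falls inside the window.
    """
    half = window / 2
    values = [
        h for t, h in zip(times, mlh)
        if h is not None and abs(t - launch) <= half
    ]
    if not values:
        return None
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread, len(values)


# Hypothetical 5 min MLH retrievals (m) around a hypothetical 12:00 launch.
start = datetime(2014, 7, 20, 11, 30)
times = [start + timedelta(minutes=5 * i) for i in range(13)]
mlh = [1000.0 + 10.0 * i for i in range(13)]
result = mlh_near_launch(times, mlh, launch=datetime(2014, 7, 20, 12, 0))
```

The returned spread plays the role of the error bar on each ceilometer point in a sonde comparison plot, and the count shows how many retrievals contributed to the average.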
Correlation plots for the CL51 MLH calculated via BLView compared to sonde ABL are shown in Fig. 9a-c with statistics summarized in Table 5. For all coincidence times, the CAPABLE site showed the best correlations between the CL51 and sondes. The correlation for the CL51 vs. all the sondes (N = 25) at the CAPABLE site was R = 0.79, with a similar correlation R = 0.82 (N = 22) when the filtering criteria were implemented. For daytime data, the CAPABLE site contained two early morning sondes (before 10:00 local time), with all other sondes launched between 10:00 and 16:00 local time. By late morning, ≈ 10:00 local time, the vertical dispersion of aerosols due to turbulent mixing likely resulted in a well-mixed boundary layer, so the ABL and MLH coincide in elevation, which is evident in Fig. 9a, where many of the data points fall close to the 1 : 1 line. Met-sonde data collected at the BAO Tower site showed lower correlations than the CAPABLE site (unfiltered R = 0.63, N = 16; filtered R = 0.58, N = 14), while the Golden site correlations (unfiltered R = −0.28, N = 12) appear to be strongly impacted by two morning sonde launches, which occurred during a transition period when the boundary layer was experiencing rapid growth. Upon applying the filtering criteria, the two early morning data points were removed, resulting in a much improved correlation (filtered R = 0.74, N = 10) for the Golden site. These results indicate that the CL51 might have difficulty capturing an accurate MLH during rapidly changing conditions, such as during early morning and late evening transition periods in a clean atmosphere.

Table 4. Summary of statistics for the BLView-collected/STRAT-processed and the BLView-collected/BLView-processed MLH estimates. Values in parentheses indicate percent of the difference value with respect to the BLView-derived MLH, and the w subscript indicates a weighting function was applied. Data were resampled to 5 min resolution followed by application of filtering criteria to both data sets (lines labeled 1 h present statistics after data were filtered and subsequently resampled by a 1 h block average). Herein, the comparison is limited strictly to the MLH algorithms.
It is somewhat surprising that the filtered correlation for the Golden site is better than the filtered result for the BAO Tower site, given that the BAO Tower site is situated farther to the east of the Rocky Mountains, at the start of the High Plains, which are less influenced by very local geographic perturbations, and that a similar relationship is not observed in the CL51 intercomparisons (Tables 2, 3, and 5). As a check of the met-sonde potential temperature profiles, the potential temperature data from the NASA P-3B aircraft spirals conducted over the Golden and Erie sites are shown in Figs. 12 and 13. These spirals are coincident with the launch of the met-sondes from the sites. The coincident CL51 backscatter profiles are also plotted in Figs. 12 and 13. The agreement between the radiosonde and P-3B aircraft profiles is good, indicating that the potential temperature within the aircraft spiral radius is consistent with that of the radiosonde. These figures show agreement between the potential temperature ABL and CL51 MLH by identifying the same first major gradient in the MLH data on certain days.
The STRAT-derived intercomparison with sonde ABL is presented in Fig. 9d-f, where it is observed that the agreement is significantly weaker than when BLView was used to calculate MLH. This disparity is caused by spurious MLH values from STRAT that are observed under two conditions: (1) during heavy cloud cover and precipitation events, STRAT sometimes falsely identified the cloud deck as the MLH and completely ignored the MLH gradient 1-2 km below the cloud; (2) STRAT failed to identify a valid MLH when the atmosphere was exceptionally clean, and instead identified a stronger, spurious gradient 2-4 km up. An example of the first type is presented in Fig. 10, where STRAT switches from properly identifying the MLH at ≈ 0.5 km to identifying the cloud deck (≈ 2.4 km) as the MLH starting around 12:00 local time, and an example of the second type is shown in Fig. 11. A corresponding shift was not observed in the BLView-derived MLH for the same day, indicating that BLView has been trained to recognize these spurious events and ignore them. After removing these "false" MLH values, the coefficient of correlation between STRAT-derived MLH and sonde ABL (pre-filtering) improved for all sites to 0.82, 0.79, and 0.70 for LaRC, BAO Tower, and Golden, respectively. The results of these correlations are encouraging and are indicative of the importance of properly training the STRAT algorithm to identify and exclude these false-positive events. The downside is that, despite the better correlation (after removing spurious events), the variance of the STRAT MLH values is larger than that of BLView, indicating that the definition of MLH filter criteria depends on the algorithm in use. However, the positive aspect is that the STRAT algorithm, being open source with the source code available, can, in theory, be modified by end users to identify and account for these spurious events.
Overall, all three sites show good correlation between the CL51 and met-sonde data, with MLH and ABL estimates from the sondes being, on average, higher than the CL51 MLHs (200 m (13 %), 390 m (15 %), −240 m (9 %) for CAPABLE, BAO Tower, and Golden, respectively) as indicated in the linear regression lines plotted in Fig. 9, with the exception being the unfiltered results for Golden.

MPL intercomparison
The MPL instrument was collocated with the CL51 stationed at the NREL site in Golden, Colorado. Being a lidar instrument, it profiles the atmosphere similarly to the CL51, with the major difference being their hardware. The two instruments emit different wavelengths (CL51: 910 nm, MPL: 532 nm), causing the instruments to differ in sensitivity with respect to particle size and geometry. Therefore, it is feasible that the two instruments observed "different" atmospheres in a quantitative manner (e.g., AOT). However, if the ML is well mixed, then the general particle distribution and gradient will be the same, making the two intercomparable. Figures 14 and 15 show that the agreement between the two instruments and algorithms (BLView, STRAT for CL51 profiles and UMBC algorithm-processing MPL profiles) is poor, even though a significant subset of data fall along the 1 : 1 line, as indicated by data density (z axis). The low correlation is partly driven by the invariability in one instrument compared to the other at lower MLH values (≤ 500 m). Removal of MLH below 500 m improved the coefficients of correlation for the 5 min averaged data to 0.467, 0.489, and 0.469 for BLView-derived MLH values (Fig. 14a-c) and 0.433, 0.471, and 0.368 for STRAT-derived MLH values (Fig. 15a-c). Similarly to the algorithm comparison, much of the variability between the two instruments and algorithms occurs during events that inhibit a reliable estimation of MLH (e.g., fog, precipitation), as seen in Fig. 16.
The statistical techniques most commonly used for comparing two data sets depend on two key assumptions: that the data are normally distributed and homoscedastic. The CL51 and MPL MLH 5 min averaged data sets were confirmed to be nonnormal via the Kolmogorov-Smirnov test and passed Levene's test for homoscedasticity (p value 0.39). Therefore, similarity between the two corresponding probability distributions was determined using the two-sample Kolmogorov-Smirnov test. It was determined that the 5 min averaged MPL and CL51 data sets were statistically different (p < 0.01), regardless of filtering and averaging. However, when considering 1 h averaged data that were filtered to remove data with large relative SDs (≥ 0.20) and MLH ≤ 0.5 km, the two data sets were statistically indistinguishable (p > 0.8). While we cannot account for the bias induced by these low-altitude MLH values, it is quite clear that they significantly influence the intercomparison. Given that this is the first intercomparison of these two instruments and algorithms, it is not surprising that a significant difference was identified in this regime.
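The hourly filter and the two-sample Kolmogorov-Smirnov comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the function names are hypothetical, and the p value uses the standard large-sample series, which may differ from the (unstated) procedure used in the study. In practice a library routine such as `scipy.stats.ks_2samp` would normally be preferred.

```python
import math


def keep_hourly(mean_km, sd_km):
    """Keep a 1 h average only if MLH > 0.5 km and relative SD < 0.20."""
    return mean_km > 0.5 and (sd_km / mean_km) < 0.20


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        # Step both empirical CDFs past the current value (handles ties).
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_p_value(d, n, m):
    """Asymptotic p value: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2).

    Assumption: large-sample approximation; it is unreliable for small samples.
    """
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.1:
        return 1.0  # the series converges slowly here; Q is effectively 1
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical hourly (mean, SD) pairs in km for the two instruments.
cl51_hourly = [(0.4, 0.05), (1.2, 0.10), (1.6, 0.50), (1.9, 0.15)]
mpl_hourly = [(0.3, 0.02), (1.3, 0.12), (1.7, 0.20), (2.0, 0.10)]
cl51 = [mu for mu, sd in cl51_hourly if keep_hourly(mu, sd)]
mpl = [mu for mu, sd in mpl_hourly if keep_hourly(mu, sd)]
d = ks_statistic(cl51, mpl)
p = ks_p_value(d, len(cl51), len(mpl))
```

A large p value means the test cannot distinguish the two distributions; it does not by itself show that the two instruments agree point by point.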

Conclusions
A CL51-focused intercomparison of different ABL/MLH methodologies was performed at three different sites that experience different meteorological, aerosol, and emission conditions. The CL51 MLH results were compared with ABL from radiosondes at all three locations as well as an MPL at the Golden, Colorado site.
Two collection methods and processing algorithms were tested for the CL51 MLH calculation. We demonstrated that the data-collection method played an insignificant role in MLH estimation when the data sets were processed using a common algorithm. Furthermore, the choice of processing algorithm played a significant role in MLH estimation. Therefore, we recommend that, for ceilometer and lidar networks, a common MLH processing algorithm be employed. Agreement between the different algorithm products might be dictated, to a large degree, by local atmospheric fluctuations and interferential events (e.g., fog), which should be a topic for future investigation.

Figure 1. Backscatter curtain plot collected on 10 June 2015 when smoke from a Canadian forest fire was transported over the CAPABLE site. The smoke is observed by increased backscatter in the 2500-4000 m range.

Figure 3. Diurnal variability of the MLH at the three sites. Data were resampled to 5 min averages and filtered.

Figure 4. Percentiles for MLH SD throughout the day from the CAPABLE site. Data in panel (a) were collected and processed in BLView, data in panel (b) were collected with the Python script and processed in STRAT, and data in panel (c) were collected in BLView and processed in STRAT. It is observed that variability was greatest during the afternoon regardless of collection method or processing algorithm.

Figure 6. Correlation plots for data collected at the three sites under study. Data density is presented to better understand the distribution within the scatter plots. Data were averaged to 5 min resolution, without application of filtering criteria.

Figure 8. Correlation plots for data collected at the three sites under study. At all sites the data were collected by or processed in Python/STRAT and BLView/BLView. Plots show the data density to better understand the distribution within the scatter plots. Data were averaged to 5 min resolution, without application of filtering criteria.

Figure 9. Correlation plots for CL51 MLH and sonde-derived ABL estimates. Black error bars represent the spread in unfiltered data, while the red error bars represent the filtered data set. MLH values (30 min average, centered on sonde-launch time) were calculated in BLView (panels a-c) and STRAT (panels d-f). Error bars indicate SD of the CL51-derived MLH within the 30 min period.

Figure 11. Example plot in which STRAT fails to identify a reasonable MLH due to unusually clean conditions.

Figure 12. Potential temperature and CL51 backscatter profiles collected at the BAO Tower site. Horizontal lines indicate MLH as determined by BLView.

Figure 13. Potential temperature and CL51 backscatter profiles collected at the Golden NREL site. Horizontal lines indicate MLH as determined by BLView.

Table 1. Aerosol optical thickness statistics at the three sites under study. Here, Q1, Q2, and Q3 represent the 25th, 50th, and 75th percentiles. Data have been filtered to show only data collected during the DISCOVER-AQ 2014 field campaign period (July-August 2014).

T. N. Knepp et al.: Assessment of ceilometer MLH

4.2 DISCOVER-AQ and FRAPPE sites
From 2011 to 2014 the National Aeronautics and Space Administration (NASA) conducted the Deriving Information on Surface Conditions from Column and Vertically Resolved Observations Relevant to Air Quality (DISCOVER-AQ) Earth Venture Suborbital Mission with four field deployments. A primary objective of DISCOVER-AQ was to investigate the ability of satellite remote sensing to inform surface air quality. Since the ABL limits vertical exchange of primary pollutants and directly influences near-surface pollutant concentrations, the ABL height directly influences air quality and chemistry. Therefore, measurements during these missions focused on the vertical distribution of trace gases and aerosols within the ABL and lower troposphere as well as the diurnal variability of these distributions in conjunction with the ABL. The final DISCOVER-AQ field mission was conducted over Denver and the Front Range region of Colorado in July and August 2014, and was conducted jointly with the Front Range Air Pollution and Photochemistry Experiment (FRAPPE).

4.2.1 Erie, Colorado and BAO Tower
Data were collected at the Erie, Colorado site (40.045° N, 105.005° W, 1500 m a.s.l.), which is considered to be a clean environment compared to CAPABLE (see Table 1), from 14 July to 12 August 2014 as part of the DISCOVER-AQ field mission. The Erie site (rural community surrounded by agricultural activity) was located at NOAA's Earth System Research Laboratory's (ESRL) Boulder Atmospheric Observatory (BAO) and served as a combined DISCOVER-AQ/FRAPPE ground site. The site is often referred to as BAO Tower because of the site's primary feature: a 300 m tower. BAO Tower provided a unique profiling ability for in situ samplers by mounting them on the tower for static sampling or on the carriage to collect "active" profiles.

Table 5. Summary of statistics for the CL51/sonde MLH/ABL intercomparison, corresponding to Fig. 9. Numbers in parentheses indicate sample size. Composite statistics were generated by looking at all sites as a single data set. In this table only, the filtering method for the STRAT-based MLH is based on visual identification of false MLH values due to clouds/precipitation events and unusually clean atmospheres as described in text.