Interactive comment on “Looking through the haze: evaluating the CALIPSO level 2 aerosol optical depth using airborne high spectral resolution lidar data”

Abstract. The Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument onboard the Cloud–Aerosol Lidar and Pathfinder Satellite Observations (CALIPSO) spacecraft has provided over 8 yr of nearly continuous vertical profiling of Earth's atmosphere. In this paper we investigate the V3.01 and V3.02 CALIOP 532 nm aerosol layer optical depth (AOD) product (i.e., the AOD of individual layers) and the column AOD product (i.e., the sum AOD of the complete column) using an extensive database of coincident measurements. The CALIOP AOD measurements and AOD uncertainty estimates are compared with collocated AOD measurements collected with the NASA High Spectral Resolution Lidar (HSRL) in the North American and Caribbean regions. In addition, the CALIOP aerosol lidar ratios are investigated using the HSRL measurements. In general, compared with the HSRL values, the CALIOP layer AOD are biased high by less than 50 % for AOD [...] ±0.05 ± 0.05 · (HSRL layer AOD) during the daytime. Furthermore, the CALIOP layer AOD error is found to correlate with aerosol loading as well as aerosol subtype, with the AODs in marine and dust layers agreeing most closely with the HSRL values. The lidar ratios used by CALIOP for polluted dust, polluted continental, and biomass burning layers are larger than the values measured by the HSRL in the CALIOP layers, and therefore the AODs for these types retrieved by CALIOP were generally too large. We estimated the CALIOP column AOD error can be expressed as ±0.05 ± 0.07 · (HSRL column AOD) at night and ±0.08 ± 0.1 · (HSRL column AOD) during the daytime. Multiple sources of error contribute to both positive and negative errors in the CALIOP column AOD, including multiple layers in the column of different aerosol types, lidar ratio errors, cloud misclassification, and undetected aerosol layers. The undetected layers were further investigated and we found that the layer detection algorithm works well at night, although undetected aerosols in the free troposphere introduce a mean underestimate of 0.02 in the column AOD in the data set examined. The decreased signal-to-noise ratio (SNR) during the daytime led to poorer performance of the layer detection. This caused the daytime CALIOP column AOD to be less accurate than during the nighttime, because CALIOP frequently does not detect optically thin aerosol layers with AOD < 0.1. Given that the median vertical extent of aerosol detected within any column was 1.6 km during the nighttime and 1.5 km during the daytime, we can estimate the minimum extinction detection threshold to be 0.012 km⁻¹ at night and 0.067 km⁻¹ during the daytime in a layer median sense. This extensive validation of level 2 CALIOP AOD products extends previous validation studies to nighttime lighting conditions and provides independent measurements of the lidar ratio, thus allowing the assessment of the effect on the CALIOP AOD of using inappropriate lidar ratio values in the extinction retrieval.

Introduction
The role tropospheric aerosols play in Earth's climate forcing is complex. The direct effect of scattering of incoming solar radiation by aerosols is well understood; however, the indirect effect of aerosols is less so (Quaas et al., 2009; Lohmann and Feichter, 2005). Aerosols and their optical properties vary greatly over space and time, and satellite remote-sensing observations are the only practical way to map out global distributions of aerosol optical properties pertinent to assessing the aerosol radiative forcing effect (Kaufman et al., 2002). Typically, passive spaceborne sensors retrieve the total column aerosol optical depth (AOD), a measure of light attenuation as it is transmitted through the atmosphere. AOD is directly related to the direct and indirect effects (Yu et al., 2006); therefore, providing an accurate measurement from remote sensing is vital in assessing the radiative forcing budget.
The spatial and temporal coverage from the passive sensors does not completely characterize a scene because these sensors typically provide little, if any, knowledge of the vertical distribution of aerosols in the atmosphere. Kaufman et al. (2002) suggested that the application of lidars is a vital component to the study of the vertical distributions of aerosols and clouds. In recent years, space-based lidars have been used to efficiently measure aerosol vertical profiles with global coverage. The Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument (Winker et al., 2007) was launched in 2006 on the Cloud-Aerosol Lidar and Pathfinder Satellite Observations (CALIPSO) spacecraft, and has now provided over 8 yr of nearly continuous global measurements of aerosols and clouds with high vertical and spatial resolution. The vertical distribution of aerosols, provided by lidar, is not only important for radiative forcing (e.g., Satheesh, 2002), but also for other applications including air quality studies (e.g., Al-Saadi et al., 2005; Engel-Cox et al., 2006) and model validation (Dirksen et al., 2009; Koffi et al., 2012).
As with any satellite sensor, validation of the CALIOP data products is critical to appropriate use of the data: random errors and known systematic errors must be taken into account when interpreting the products. Several studies have investigated the CALIOP level 1 products (McGill et al., 2007; Kim et al., 2008; Mamouri et al., 2009; Mona et al., 2009; Pappalardo et al., 2010; Rogers et al., 2011). Pappalardo et al. (2010) and Kacenelenbogen et al. (2011) also validated the CALIOP level 2 aerosol backscatter and extinction profiles, showing promising results. Both Kittaka et al. (2011) and Redemann et al. (2012) demonstrated strategies for comparing the CALIOP AOD product to passive spaceborne measurements and concluded that CALIOP can quantitatively retrieve extinction on a climate scale and likely for many local-scale events. Omar et al. (2013) assessed CALIOP AOD accuracies using the well-established AERONET (AErosol RObotic NETwork) measurements and, after applying a more stringent cloud screening to the AERONET data set, found the mean difference to be ∼ 25 % (AERONET higher) for AOD less than unity. Schuster et al. (2012) found CALIOP to agree with AERONET to within 13 %, with better agreement (within 3 %) if dust is excluded from the analysis. Kim et al. (2014) used MODIS (Moderate Resolution Imaging Spectroradiometer) to evaluate CALIOP, finding CALIOP to be 63 % lower than MODIS. However, one limitation common to all previous CALIOP AOD investigations is that the comparisons used only total column AOD measured during daytime, when the CALIOP signal-to-noise ratio (SNR) is the lowest. Spatial mismatch between the CALIOP footprints and the AERONET sites also contributes to these differences. Lastly, Kacenelenbogen et al. (2014) used the NASA High Spectral Resolution Lidar (HSRL) data to study CALIOP AOD only over clouds. This study, on the other hand, assesses both the CALIOP layer AOD and the total column AOD, and also includes nighttime measurements. In addition, this study assesses one of the key sources of error in these AOD measurements, the lidar ratio, using the rich data set of direct measurements of lidar ratio acquired with the NASA Langley Research Center (LaRC) airborne HSRL.
An extensive validation of the CALIOP level 1 532 nm total attenuated backscatter calibration (Rogers et al., 2011) demonstrates that the LaRC HSRL is an ideal validation instrument and highlights the strength in validating a spaceborne sensor with an extensive, systematic series of aircraft flights. Between 2006 and 2012, the HSRL flew more than 1000 h on over 18 field experiments on NASA LaRC King Air aircraft over a wide seasonal, temporal, and geographic range. Several of these field experiments were either focused on CALIOP validation or included CALIOP validation flights, resulting in a total of 106 CALIOP underflights as of the end of 2011. These 106 flights offer a large data set to validate the CALIOP data products over varying aerosol types and scenes. This data set is unique because, unlike ground-based lidars, the airborne lidar can be flown along the same ground track as CALIPSO, resulting in virtually no spatial offset between the HSRL and CALIOP measurements and many independent validation comparisons along each flight track. This study utilizes the extensive data set of HSRL lidar observations to investigate the 532 nm CALIOP layer and column AOD, where the layer AOD is the AOD of each individual layer detected by CALIOP, and the column AOD represents the AOD of the entire atmospheric column.
The close spatial and temporal coincidence and similar downlooking viewing geometry shared by HSRL and CALIOP are strengths of this validation effort. This study focuses specifically on areas that the passive validation studies could not examine directly: it addresses the question of when and why the CALIOP layer AOD is or is not representative of the true layer AOD, and it validates both day and night measurements. Once the limitations of the CALIOP 532 nm AOD are better understood, the next step will be to apply this validation strategy to the aerosol profile product (further discussed in Sect. 2.1) and the vertical distribution of extinction as well as the 1064 nm channels, all of which are the subjects of future publications using this data set.
The instruments, data collocation, and an example case study are presented in Sect. 2. We compare the CALIOP layer and column AODs with those of HSRL in Sect. 3. The CALIOP uncertainty for each layer is also investigated. We discuss the impact of CALIOP layer detection and 532 nm lidar ratio selection on the CALIOP AOD in Sect. 4 and summarize the results of the study in Sect. 5.

CALIOP
The CALIOP instrument is a two-wavelength, polarization-sensitive elastic backscatter lidar that has provided over 8 yr of global aerosol and cloud profile measurements (Winker et al., 2010). The CALIOP instrument and its initial performance assessment are described in Winker et al. (2007) and Hunt et al. (2009). The level 1 total attenuated backscatter profiles, β′(z), are calibrated and geolocated on a uniform altitude grid (Powell et al., 2009) and are used to derive level 2 aerosol and cloud products through a comprehensive collection of fully automated data processing algorithms (Winker et al., 2009). The level 2 products are reported both as layer products and as profile products. In this study the version 3.01 level 2 aerosol layer product is examined (5 km minimum resolution). Specifically, the layer AOD (Feature_Optical_Depth_532) and the column AOD (Column_Optical_Depth_Aerosols_532), their corresponding uncertainties (Feature_Optical_Depth_Uncertainty_532, Column_Optical_Depth_Aerosols_Uncertainty_532), and lidar ratio parameter (Final_532_Lidar_Ratio) are examined in detail.
The CALIOP algorithms relevant to this study start with layer detection, accomplished with the selective, iterated boundary location (SIBYL) scheme, which identifies the cloud and aerosol layer heights (Vaughan et al., 2009). The cloud-aerosol discrimination (CAD) routine separates clouds and aerosols (Liu et al., 2008), which are then further separated into types based on their observed integrated attenuated backscatter (IAB), attenuated depolarization, layer height, and surface type (Omar et al., 2009; Hu et al., 2009). Each aerosol subtype (i.e., marine, clean continental, dust, polluted dust, polluted continental, or biomass burning; see Omar et al., 2009) is characterized by an extinction-to-backscatter ratio (also referred to as the lidar ratio or $S_\mathrm{a}$) that represents the ratio of the aerosol extinction to the aerosol backscatter. The lidar ratio associated with each type was determined based on extensive analyses of AERONET observations (e.g., Omar et al., 2005), measurements of size distributions and index of refraction, and modeled results. The layer boundary, typing, and subtyping information are reported in the CALIOP vertical feature mask (VFM).

The CALIOP extinction retrieval scheme, the hybrid extinction retrieval algorithm (HERA), applies one of two different techniques to the SIBYL-defined layers in order to retrieve aerosol extinction profiles and AOD (Young and Vaughan, 2009). The choice between the two techniques is based on the spatial distribution of clouds and aerosols in a given region. Constrained solutions for the elastic backscatter lidar equation are possible for those lofted layers where a direct estimate of the layer two-way transmittance can be obtained from the ratio of the attenuated backscatter in the clear air regions above and below the layer and then related to AOD (e.g., Young, 1995). In this case the estimated AOD is used as a constraint that enables the retrieval of the layer mean lidar ratio. When constrained retrievals are not feasible, the CALIOP scheme reverts to an unconstrained retrieval that derives extinction and AOD using a prescribed value for the lidar ratio that is defined by the aerosol type as described above. The overwhelming majority of measurements examined in this study were obtained in the lower troposphere with layers in contact with the surface and therefore used unconstrained solutions.
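The difference between the constrained and unconstrained branches can be illustrated with the textbook single-layer relation between the layer-integrated attenuated particulate backscatter γ′, the layer lidar ratio S, and the layer two-way transmittance T², namely γ′ = (1 − T²)/(2S). The Python sketch below is not the HERA implementation: it assumes single scattering, a lidar ratio that is constant within the layer, and a particulate signal from which the molecular contribution has already been removed. The function names and the numbers in the usage lines are illustrative assumptions.

```python
import math
from typing import Tuple


def constrained_retrieval(t2_layer: float, gamma_prime: float) -> Tuple[float, float]:
    """Constrained case: the two-way transmittance T^2 is measured from the
    clear air above and below a lofted layer, so the AOD follows directly
    and the layer-mean lidar ratio (sr) is retrieved as a by-product."""
    aod = -0.5 * math.log(t2_layer)
    lidar_ratio = (1.0 - t2_layer) / (2.0 * gamma_prime)
    return aod, lidar_ratio


def unconstrained_aod(gamma_prime: float, lidar_ratio: float) -> float:
    """Unconstrained case: no transmittance measurement is available, so a
    prescribed lidar ratio (sr) is used to infer the AOD from gamma' (sr^-1)."""
    arg = 1.0 - 2.0 * lidar_ratio * gamma_prime
    if arg <= 0.0:
        raise ValueError("prescribed lidar ratio is too large for this gamma'")
    return -0.5 * math.log(arg)


# Illustrative layer: true AOD 0.2 and true lidar ratio 43 sr (assumed values).
true_aod, true_s = 0.2, 43.0
t2 = math.exp(-2.0 * true_aod)
gamma = (1.0 - t2) / (2.0 * true_s)

aod_c, s_c = constrained_retrieval(t2, gamma)
aod_u = unconstrained_aod(gamma, true_s)
print(f"constrained: AOD={aod_c:.3f}, S={s_c:.1f} sr; unconstrained: AOD={aod_u:.3f}")
```

In this idealized setting both branches recover the same AOD when the prescribed lidar ratio equals the true one; the unconstrained branch is the one exposed to lidar ratio errors.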
Estimated uncertainties in the modeled lidar ratio, lying between 30 and 50 % depending on aerosol type, are reported in the metadata of the CALIOP aerosol data products and are used to estimate extinction uncertainties. Also note that an error in the lidar ratio for the topmost layer of a multi-layer column will propagate errors to lower layers in the column. A comprehensive error analysis for the HERA algorithm and the method for computing the extinction uncertainty estimates included in the level 2 product are given in Young et al. (2013).

HSRL
The NASA Langley airborne HSRL separately measures the aerosol and molecular lidar returns via the HSRL technique (e.g., Shipley et al., 1983; Piironen and Eloranta, 1994) at 532 nm, thus providing accurate and independent measurements of the vertical profiles of aerosol backscatter and extinction. Aerosol backscatter and extinction are retrieved at 1064 nm using standard techniques (Fernald, 1984; Fernald et al., 1972). The HSRL instrument is polarization sensitive at both wavelengths. Critical to this study, the HSRL provides a direct measurement of 532 nm aerosol extinction and optical depth from the attenuation in the molecular channel from which the aerosol backscattering signal is filtered, with 60 s temporal resolution (i.e., ∼ 6 km along track), after first removing any cloudy profiles from the 60 s averaging window. Also, the independent measurements of aerosol extinction and backscatter allow direct retrievals of the 532 nm lidar ratio profiles, via the ratio of the two measured profiles. The random uncertainty in the lidar ratio for typical aerosol loading (AOD ∼ 0.2) is 9 % (Hair et al., 2008), and the 532 nm AOD values agree with established extinction and AOD measurements to within 6 % (Rogers et al., 2009). Unlike CALIOP, the HSRL instrument does not rely on any layer detection to calculate science data products. The HSRL instrument, data products, and uncertainty are further described by Hair et al. (2008).
To date the HSRL instrument has flown 106 successful validation underflights of the CALIPSO satellite. Subsets of the data set used here are described by Rogers et al. (2011) and Burton et al. (2010, 2013). The HSRL measurements have several aspects that make them ideal for validation of the CALIOP level 2 products. First, these flights cover a large geographic and seasonal range and sample a wide variety of aerosol types (although it must be noted that the flights have thus far been confined largely to North America and the Caribbean, and thus do not represent a global validation). Second, the HSRL technique provides a direct, calibrated, and validated measurement of AOD. For example, Ansmann (2006) found that the Klett solutions (Klett, 1981) of extinction and backscatter from ground-based and space-based elastic backscatter lidar could differ as much as 20 % in cases where the lidar ratio increases with height, suggesting that a Raman lidar or HSRL is critical to an accurate validation of CALIOP. One advantage of the HSRL technique is that it is largely unaffected by the solar background at 532 nm, while a Raman lidar at 532 nm has a significantly lower SNR during the daytime. Third, the HSRL is on a mobile platform that can follow the CALIPSO tracks, eliminating sampling mismatches inherent to validation with ground-based instruments, e.g., when comparing a spatial average from CALIOP with a temporal average from the ground-based sensor. In substituting temporal for spatial averaging, discrepancies can be induced by terrain and meteorology as noted by Pappalardo et al. (2010). However, the exact temporal coincidence between HSRL and CALIPSO is an instantaneous moment, and thus differences in platform speeds require that HSRL cover the same track as CALIPSO in a matter of hours instead of minutes. Careful experimental design and flight planning can nevertheless ensure that the effect of this temporal mismatch on AOD is minimal. This is further discussed and demonstrated in Sect. 2.5. Finally, the airborne HSRL has the advantage that there is typically very little aerosol loading within the region of incomplete overlap between the lidar receiver and transmitter (∼ 7.5–9 km above mean sea level for a typical flight), in contrast with ground-based lidars that must normally deal with the largest aerosol amounts located within the incomplete overlap region, where the uncertainties can be large (Wandinger and Ansmann, 2002).
The large data set of 106 flights used in this study is plotted in Fig. 1 and tabulated in Table 1, updated from Rogers et al. (2011). HSRL has acquired CALIOP validation data in conjunction with numerous field studies (see Table 1). In addition, flights not associated with a specific mission were occasionally conducted during transit flights to or from NASA LaRC and other destinations (denoted by Other in Table 1).

AOD analysis
AOD is defined for a layer between altitudes $z_\mathrm{top}$ and $z_\mathrm{base}$ to be the vertical integration of the extinction coefficient profile, α(z):

$$\mathrm{AOD} = \int_{z_\mathrm{base}}^{z_\mathrm{top}} \alpha(z)\,\mathrm{d}z. \qquad (1)$$

The CALIOP level 2 aerosol layer product reports AOD only over the vertical extent of detected layers (Young and Vaughan, 2009). In this paper, we will refer to this as the layer AOD, the AOD of a layer between $z_\mathrm{top}$ and $z_\mathrm{base}$ in Eq. (1), with $z_\mathrm{top}$ and $z_\mathrm{base}$ reported by the CALIOP layer product. The HSRL layer AOD is calculated between the same CALIOP layer top and bottom altitude boundaries after collocation and averaging of the HSRL data to the 5 km CALIOP layer product resolution.
Passive satellite measurements can usually only infer the column AOD; moreover, most do not have the vertical information described in Eq. (1). The column AOD is determined by setting the $z_\mathrm{top}$ altitude in Eq. (1) to the spacecraft altitude and the $z_\mathrm{base}$ altitude to the ground. In reality the HSRL does not sample the whole atmosphere, because the aircraft typically flies at 9 km; therefore, the HSRL column AOD is only measured from the ground up to ∼ 7.5 km (Rogers et al., 2009). Recognizing the possibility of aerosol above the aircraft altitude, we have screened for any CALIOP-detected layers above the HSRL; in the cases used here the HSRL measurement is therefore representative of the entire column detected by CALIOP. In addition, the background aerosol loading in the stratosphere is typically small (i.e., approximately in the 0.003 to 0.01 range) for the Northern Hemisphere during this study (Vernier et al., 2011; Bourassa et al., 2012), which is negligible for the purposes of this study.
For CALIOP, the column AOD involves simply integrating the aerosol extinction coefficients from all aerosol layers detected in a given location. Note that this can be different from the HSRL column AOD, which is not dependent on any layer detection. In order to investigate potential errors in CALIOP AOD due to undetected layers, a similar quantity must be defined for HSRL. To compute the HSRL layer-summed AOD, HSRL extinction coefficients are integrated only over the vertical extent of all of the layers identified by the SIBYL algorithm after the collocation.
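The three AOD quantities defined above (layer, layer-summed, and column) differ only in their integration limits. As a minimal sketch, assuming an extinction profile sampled on a discrete altitude grid and a simple trapezoidal rule restricted to grid points inside each layer (both are assumptions of this example, not a description of the processing used in the study), they could be computed as follows. The profile values and the layer boundaries are made up for illustration.

```python
from typing import List, Sequence, Tuple


def layer_aod(z_km: Sequence[float], ext_per_km: Sequence[float],
              z_base: float, z_top: float) -> float:
    """Trapezoidal integral of extinction (km^-1) over grid points that lie
    between z_base and z_top (km), i.e. the layer AOD of Eq. (1)."""
    pts = sorted((z, a) for z, a in zip(z_km, ext_per_km) if z_base <= z <= z_top)
    return sum(0.5 * (a0 + a1) * (z1 - z0)
               for (z0, a0), (z1, a1) in zip(pts, pts[1:]))


def layer_summed_aod(z_km: Sequence[float], ext_per_km: Sequence[float],
                     layers: List[Tuple[float, float]]) -> float:
    """Sum of layer AODs over the (base, top) boundaries of detected layers."""
    return sum(layer_aod(z_km, ext_per_km, base, top) for base, top in layers)


# Hypothetical HSRL-like profile: a boundary layer below 1.5 km and a
# tenuous, undetected residual above it.
z = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ext = [0.10, 0.10, 0.10, 0.10, 0.02, 0.02, 0.02]
detected_layers = [(0.0, 1.5)]          # boundaries as a layer detector might report

column = layer_aod(z, ext, z[0], z[-1])  # full measured column
summed = layer_summed_aod(z, ext, detected_layers)
undetected = column - summed             # AOD outside the detected layers
print(f"column={column:.3f}, layer-summed={summed:.3f}, undetected={undetected:.3f}")
```

The difference between the column and layer-summed values isolates the contribution of aerosol that lies outside the detected layers.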
Finally, we define a quantity more native to CALIOP, the layer IAB, defined for a layer at altitudes $z_\mathrm{top}$ to $z_\mathrm{base}$ to be the vertical integration of the level 1 total attenuated backscatter coefficient profile, β′(z), after correcting for molecular transmission, $T^2_\mathrm{mol}(z)$, and subtracting the attenuated molecular backscatter, $T^2_\mathrm{mol}(z)\,\beta_\mathrm{mol}(z)$:

$$\mathrm{IAB} = \int_{z_\mathrm{base}}^{z_\mathrm{top}} \frac{\beta'(z) - T^2_\mathrm{mol}(z)\,\beta_\mathrm{mol}(z)}{T^2_\mathrm{mol}(z)} \cdot \mathrm{d}z. \qquad (2)$$

Note that this is slightly different from the definition in the CALIOP algorithm theoretical basis document (ATBD; Vaughan et al., 2005) due to the difficulty in estimating the aerosol transmittance for the HSRL product in the same manner as described in the CALIOP ATBD. The IAB is therefore calculated directly for both CALIOP and HSRL in the same manner, Eq. (2). In this process, the HSRL attenuated backscatter profiles are first scaled to the CALIOP calibration altitude by correcting for molecular and ozone attenuation between this altitude and the lower HSRL calibration altitude as performed in Rogers et al. (2011).
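A discrete version of the layer IAB can be sketched in Python as below. The integrand follows one reading of the verbal definition given here (subtract the attenuated molecular backscatter, then divide by the molecular two-way transmittance); that reading, the trapezoidal rule, and the synthetic profile are assumptions of this example. In the synthetic profile the aerosol attenuation is ignored, so the IAB should reduce to the integrated particulate backscatter.

```python
from typing import Sequence


def layer_iab(z_km: Sequence[float], beta_att: Sequence[float],
              beta_mol: Sequence[float], t2_mol: Sequence[float],
              z_base: float, z_top: float) -> float:
    """Integrate (beta' - T2_mol * beta_mol) / T2_mol between z_base and z_top.

    beta_att : total attenuated backscatter beta'(z), km^-1 sr^-1
    beta_mol : molecular backscatter, km^-1 sr^-1
    t2_mol   : molecular two-way transmittance (dimensionless)
    """
    pts = sorted(
        (z, (b - t2 * bm) / t2)
        for z, b, bm, t2 in zip(z_km, beta_att, beta_mol, t2_mol)
        if z_base <= z <= z_top
    )
    return sum(0.5 * (f0 + f1) * (z1 - z0)
               for (z0, f0), (z1, f1) in zip(pts, pts[1:]))


# Synthetic example: constant particulate backscatter of 0.002 km^-1 sr^-1
# between 0.5 and 1.5 km, aerosol attenuation neglected (illustrative only).
z = [0.0, 0.5, 1.0, 1.5, 2.0]
beta_mol = [0.00150, 0.00143, 0.00136, 0.00130, 0.00124]
t2_mol = [0.90, 0.92, 0.94, 0.96, 0.98]
beta_p = [0.0, 0.002, 0.002, 0.002, 0.0]
beta_att = [(bm + bp) * t2 for bm, bp, t2 in zip(beta_mol, beta_p, t2_mol)]

iab_layer = layer_iab(z, beta_att, beta_mol, t2_mol, 0.5, 1.5)
print(f"layer IAB = {iab_layer:.4f} sr^-1")
```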
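This paper evaluates the CALIOP AOD by considering the HSRL AOD to represent the true value. As such, we define the AOD bias to be

$$\mathrm{bias} = \mathrm{AOD}_\mathrm{CALIOP} - \mathrm{AOD}_\mathrm{HSRL}, \qquad (3)$$

which can also be expressed as a relative bias (fractional percent):

$$\mathrm{relative\ bias} = \frac{\mathrm{AOD}_\mathrm{CALIOP} - \mathrm{AOD}_\mathrm{HSRL}}{\mathrm{AOD}_\mathrm{HSRL}} \qquad (4)$$

and then multiplied by 100 % to obtain a percentage.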
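The two bias metrics translate directly into code. The short Python sketch below treats the HSRL AOD as the reference, as stated above; the function names and the sample AOD pair are hypothetical and serve only to show the sign convention (positive means CALIOP is higher).

```python
def aod_bias(aod_caliop: float, aod_hsrl: float) -> float:
    """Absolute bias: CALIOP minus HSRL."""
    return aod_caliop - aod_hsrl


def relative_bias_percent(aod_caliop: float, aod_hsrl: float) -> float:
    """Relative bias in percent, normalized by the HSRL (reference) AOD."""
    if aod_hsrl == 0.0:
        raise ValueError("relative bias is undefined for a zero reference AOD")
    return 100.0 * (aod_caliop - aod_hsrl) / aod_hsrl


# Hypothetical pair of collocated AOD values.
print(aod_bias(0.33, 0.20), relative_bias_percent(0.33, 0.20))
```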

Data collocation and data screening
For this comparison, the HSRL AOD values were averaged to the 5 km latitude and longitude grid defined in the CALIOP layer products. The HSRL temporal averaging applied to AOD is typically 60 s; therefore, depending slightly on the aircraft speed, each 5–6 km HSRL data point is unique. Typically the differences in the flight track flown by HSRL and the actual track of CALIOP were small, less than a few kilometers in longitude, and are not thought to induce systematic differences in AOD (Anderson et al., 2003; Shinozuka and Redemann, 2011). The HSRL flight plans were normally aimed specifically at having the CALIPSO overpass occur during the HSRL underflight if possible (Rogers et al., 2011). The impact of the temporal separation on the AOD comparison is discussed further in Sect. 2.5.
For CALIOP data products the quality flags are extremely important and should be used accordingly. In this study only the CALIOP aerosol layers with the highest quality data were examined. The requirements for these layers are as follows. First, only layers with a CAD score less than −20 were considered (CAD_Score < −20) following Winker et al. (2013). Second, any CALIOP 5 km profile containing a nonzero cloud optical depth or an HSRL-detected cloud was excluded from the comparison. Clouds (observed by either HSRL or CALIOP) are highly variable in time and space, so this criterion ensured that only aerosol fields were examined. Furthermore, errors in the retrieval of the extinction profiles in high clouds (above the HSRL altitude and observed only by CALIOP) could lead to an incorrect attenuation estimate and incorrect scaling (calibration) of the attenuated backscatter profile that would propagate down to the underlying aerosol layers (Young et al., 2013). Similarly, data from scenes with CALIOP-detected aerosol layers above the HSRL altitude were not considered in this study. Lastly, the study was limited to layers with an AOD < 0.5 due to the relatively small number of samples collected in high AOD cases. The scarcity of higher AOD points was primarily due to the geographical and seasonal sampling, but this criterion also has the beneficial effect of removing contamination from clouds that were misclassified as aerosols.
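The screening criteria listed above amount to a simple conjunction of tests on each collocated layer. The Python sketch below expresses them with a hypothetical record type; the field names are inventions of this example and are not CALIOP or HSRL product field names, and the sample records are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LayerSample:
    """One collocated aerosol layer (hypothetical record layout)."""
    cad_score: float                   # CALIOP cloud-aerosol discrimination score
    caliop_cloud_od: float             # cloud optical depth in the 5 km profile
    hsrl_cloud_detected: bool          # cloud flagged by HSRL in the same profile
    caliop_layer_above_aircraft: bool  # CALIOP layer above the HSRL altitude
    layer_aod: float                   # layer AOD used for the AOD < 0.5 limit


def passes_screening(s: LayerSample) -> bool:
    """Apply the screening rules described in the text to one layer."""
    return (
        s.cad_score < -20
        and s.caliop_cloud_od == 0.0
        and not s.hsrl_cloud_detected
        and not s.caliop_layer_above_aircraft
        and s.layer_aod < 0.5
    )


samples: List[LayerSample] = [
    LayerSample(-95.0, 0.0, False, False, 0.12),   # kept
    LayerSample(-10.0, 0.0, False, False, 0.12),   # low-confidence CAD score
    LayerSample(-95.0, 0.3, False, False, 0.12),   # cloud in the CALIOP profile
    LayerSample(-95.0, 0.0, True, False, 0.12),    # cloud seen by HSRL
    LayerSample(-95.0, 0.0, False, True, 0.12),    # layer above the aircraft
    LayerSample(-95.0, 0.0, False, False, 0.80),   # AOD above the 0.5 limit
]
kept = [s for s in samples if passes_screening(s)]
print(f"{len(kept)} of {len(samples)} layers pass the screening")
```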
As previously noted, CALIOP retrieves aerosol level 2 data products only where layers were detected and subsequently identified as aerosols. To ensure the identification of both dense and very tenuous layers, the SIBYL algorithm incorporates an iterated, multi-resolution averaging scheme that detects aerosol layers at horizontal resolutions of 5, 20, and 80 km (Vaughan et al., 2009). The horizontal averaging required to detect the aerosol layers used in this study was a fairly representative distribution of the CALIOP data set. Some 989 layers were found along the HSRL-CALIPSO track during the nighttime, the majority of which were detected at a 5 km horizontal averaging resolution. Some 641 daytime layers were found, and these were predominantly detected at the 20 km resolution. The presence of strong solar background signals during daytime substantially reduces the CALIOP daytime SNR relative to nighttime measurements. As a consequence, more horizontal averaging is typically required during daytime than at night for the detection of aerosol layers of equal backscatter intensity. Table 2 lists the number of unique layers for each averaging scale that were considered in this study. It is important to note that even though the CALIOP layer product reports all layers at the 5 km resolution, we treated coarser layers as one unique layer (i.e., a layer detected at an 80 km horizontal averaging scale was not considered as sixteen 5 km layers). Table 2 also highlights the statistics of the SIBYL layer detection at night relative to the daytime. There were more aerosol layers and they were detected with less horizontal averaging at night relative to day, despite the fact that HSRL spent ∼ 45 h on track during the nighttime and ∼ 100 h on track during the daytime; however, counting layers is perhaps not the best measure of SIBYL's efficacy.
The layers studied in this data set were also typically low in the atmosphere, with the mean layer top altitude less than 3 km for over 95 % of the layers. This also limited the thickness of layers in this study, with approximately 95 % of the layer thicknesses being smaller than 2.5 km.

Anderson et al. (2003) conducted an excellent study on the mesoscale variation of the column AOD using an autocorrelation method and concluded that aerosol layers can be considered homogeneous and coherent for time and space scales less than 10 h and 200 km. The HSRL data set along the CALIPSO track provides a unique opportunity to investigate spatiotemporal variations in AOD. On 43 of the 106 CALIPSO flights described here, the HSRL made multiple passes of the CALIPSO track where the HSRL flew along the CALIPSO track and then doubled back on the same track on the return to base. This multiple pass information allows a direct determination of the AOD temporal variability over locations sampled twice instead of the time-lagged autocorrelation method used by Anderson et al. (2003).

Temporal evolution of aerosol features
The HSRL column AOD values were matched in latitude and longitude for each out and back track and the temporal difference for each AOD pair was recorded. Figure 2 plots each AOD pair in a scatterplot where the color bar indicates the temporal delay between any two measurements. Although the data are sparser at larger AOD values (i.e., 0.3), the relative error of HSRL AOD comparing out and back legs for any loading is within 16 % for these observations. Binning the AOD data in Fig. 2 by temporal separation (into 15 min bins), we found that for any temporal separation up to 1.5 h, the AOD in each time bin remained well correlated (r² > 0.9). The flight duration of the King Air aircraft is about 4 to 5 h, so no time difference larger than ∼ 1.5 h can be examined.
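The binning step can be sketched as follows. This Python example groups out-and-back AOD pairs into 15 min bins of temporal separation and computes the squared Pearson correlation in each bin; the data layout, the minimum of three pairs per bin, and the sample pairs are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def r_squared(x: List[float], y: List[float]) -> float:
    """Squared Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy * sxy / (sxx * syy)


def r2_by_time_bin(pairs: Iterable[Tuple[float, float, float]],
                   bin_minutes: float = 15.0,
                   min_pairs: int = 3) -> Dict[float, float]:
    """Map the lower edge of each time bin (minutes) to r^2 of its AOD pairs.

    Each pair is (AOD on outbound leg, AOD on return leg, separation in min).
    """
    bins: Dict[int, List[Tuple[float, float]]] = defaultdict(list)
    for aod_out, aod_back, dt_min in pairs:
        bins[int(dt_min // bin_minutes)].append((aod_out, aod_back))
    result: Dict[float, float] = {}
    for k in sorted(bins):
        if len(bins[k]) >= min_pairs:
            out, back = zip(*bins[k])
            result[k * bin_minutes] = r_squared(list(out), list(back))
    return result


# Made-up pairs: a coherent field at short lags, a less coherent one later.
pairs = [
    (0.10, 0.11, 5.0), (0.20, 0.21, 10.0), (0.30, 0.31, 12.0),
    (0.10, 0.10, 20.0), (0.20, 0.25, 25.0), (0.30, 0.20, 29.0),
]
r2 = r2_by_time_bin(pairs)
for lower_edge, value in r2.items():
    print(f"{lower_edge:5.0f} min bin: r^2 = {value:.2f}")
```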
This result agrees with the Anderson et al. (2003) results as well as a similar study by Shinozuka and Redemann (2011), who found that in the absence of plumes aerosols remained well correlated (r > 0.9) for spatial extents of approximately 35 km (a typical boundary layer advection velocity of 20 km h⁻¹ translates to 1.5 h). These studies all indicate that the temporal mismatch between the HSRL measurements and CALIOP overpass should have a negligible effect on the AOD comparison presented in this study since the HSRL flights were typically matched well with the CALIPSO overpass within this time frame.

Sample nighttime case: 7 February 2009
On 7 February 2009 the HSRL acquired data along a nighttime CALIPSO track over North Carolina, Virginia, and Maryland (Fig. 3d). This example of a typical HSRL CALIOP validation comparison is highlighted because of the lack of cirrus above HSRL and the excellent calibration of the CALIOP 532 nm total attenuated backscatter product. The mean attenuated backscatter profiles from CALIOP and HSRL are plotted in Fig. 3e, showing a 3–7 km clean air bias of only 0.7 % ± 3 % (CALIOP higher), calculated from Eq. (11) of Rogers et al. (2011). Figure 3a and c, respectively, show the complete scene of 532 nm attenuated backscatter profiles acquired by HSRL and CALIOP. Both HSRL and CALIOP observed a residual aerosol layer extending up to 1.5 km with generally higher attenuated backscattering toward the southern end of the track. In Fig. 3a, b, c the CALIOP point of closest approach (CPA) to HSRL is indicated by the vertical white line near the latitude 36.4° N. The CALIOP level 2 VFM shows the majority of detected layers were classified as aerosol, and the aerosol subtype scene (Fig. 3b) shows a mixture of mostly polluted continental and polluted dust.
The HSRL AODs were collocated to the CALIOP layer product as described in Sect. 2.4 and plotted in Fig. 4 for this scene. The layer mean lidar ratios measured by HSRL and selected by CALIOP are respectively shown in Fig. 4a and b, with the corresponding layer AOD scenes in Fig. 4d and e. The HSRL layer lidar ratio time series (Fig. 4b) indicates that the lidar ratio measured in the PBL is in the 40–50 sr range (the median HSRL lidar ratio for the time series was 43 sr), while the standard, modeled values for the CALIOP lidar ratios (Fig. 4a) were generally 55 to 70 sr, an error of 10–45 %. A high bias was also noted in the CALIOP layer AOD values compared to the HSRL values over the entire time series. A comparison of the HSRL and CALIOP column AODs is plotted in Fig. 4f, showing the cumulative effect of the layers, which result in column AODs that were larger by 45 % on the northern end and up to 65 % on the southern end.
Figure 3e shows that the CALIOP attenuated backscatter was well calibrated. This is further corroborated by the agreement of the HSRL and CALIOP column IAB in Fig. 4c, showing no bias in the CALIOP IAB relative to that of HSRL. In view of the good agreement in IAB and similar spatiotemporal pattern of increased aerosol loading towards the southern end of the track, any small spatial or temporal mismatch is unlikely to explain the bias. In this case the only factor contributing to CALIOP's retrieved AODs being 65 % larger than those measured by the HSRL is the 45 % larger lidar ratio used by CALIOP. When the CALIOP level 1 data were reanalyzed using the standard CALIOP algorithms but with the HSRL median lidar ratio for the scene, 43 sr, the AOD disparity drops to nearly zero over the entire track, with the exception of above 38° N where the HSRL lidar ratio is larger than 43 sr (Fig. 4f). The implication is that the HERA algorithm performs extremely well if it is given the correct lidar ratio in a low noise situation with a good calibration. This case study is useful for understanding such discrepancies. This type of analysis is now applied to the entire collocated HSRL data set to generate a statistically representative analysis of the CALIOP AOD.
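That a lidar ratio error can produce an even larger AOD error follows from the nonlinearity of the backscatter-to-AOD inversion. The toy Python calculation below is not a reanalysis of this case and does not use the HERA algorithm: it assumes a single layer, single scattering, a layer-constant lidar ratio, and the relation γ′ = (1 − exp(−2·AOD))/(2S) between the layer-integrated attenuated particulate backscatter γ′ and the lidar ratio S. The true AOD values are hypothetical; only the 43 sr reference and the 45 % overestimate echo the numbers quoted above.

```python
import math


def retrieved_aod(true_aod: float, true_s: float, assumed_s: float) -> float:
    """AOD retrieved with an assumed lidar ratio from the integrated
    attenuated backscatter produced by the true AOD and true lidar ratio."""
    gamma_prime = (1.0 - math.exp(-2.0 * true_aod)) / (2.0 * true_s)
    arg = 1.0 - 2.0 * assumed_s * gamma_prime
    if arg <= 0.0:
        raise ValueError("assumed lidar ratio admits no solution")
    return -0.5 * math.log(arg)


true_s = 43.0                 # sr, HSRL median for the scene
assumed_s = 1.45 * true_s     # a lidar ratio 45 % too large

rel_errors = {}
for true_aod in (0.05, 0.10, 0.20):   # hypothetical layer AODs
    aod = retrieved_aod(true_aod, true_s, assumed_s)
    rel_errors[true_aod] = (aod - true_aod) / true_aod
    print(f"true AOD {true_aod:.2f}: retrieved {aod:.3f} "
          f"({100.0 * rel_errors[true_aod]:.0f} % high)")
```

Under these assumptions the relative AOD error always exceeds the 45 % lidar ratio error and grows with aerosol loading.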

Results
The entire database of collocated HSRL and CALIOP layer measurements was analyzed as described in Sect. 2 and is summarized in Fig. 5. Because the CALIOP SNR for tropospheric aerosols is significantly lower during daytime, the daytime and nighttime comparisons are presented separately, allowing an assessment of the CALIOP algorithms in both conditions.
In this figure, the dashed lines represent a confidence envelope that encompasses two-thirds of the data points around the one-to-one line (solid); the envelope parameters are tabulated in Table 3 for clarity. The envelope is determined by finding the fraction of points that satisfy the following:
$$\Delta\mathrm{AOD} = \mathrm{AOD}_{\mathrm{CALIOP}} - \mathrm{AOD}_{\mathrm{HSRL}} = \pm\,\mathrm{error}_{\mathrm{absolute}} \pm \mathrm{error}_{\mathrm{relative}} \cdot \mathrm{AOD}_{\mathrm{HSRL}}. \quad (5)$$
These two error parameters were determined as follows: the relative error was set to 5 % and the absolute error was minimized such that slightly fewer than 68 % of the points fell in the envelope; the relative term was then increased until 68 % of the data were enveloped. These envelopes are somewhat subjective in that they are manually determined and any number of absolute and relative combinations can be chosen to satisfy the criterion. However, they are useful in that they provide a rough error estimate, as discussed in Kahn et al. (2011) as well as Remer et al. (2008), which describe an envelope that encompasses two-thirds of the data points. Table 3 also tabulates the percentage of HSRL data points that fall within the estimated CALIOP AOD uncertainty reported in the CALIOP data products, in order to evaluate the correctness of the estimated CALIOP uncertainties.
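The envelope search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the step sizes, and the synthetic data are hypothetical, and reading "minimized such that slightly fewer than 68 %" as "the largest absolute term on a coarse grid that keeps the coverage just under the target" is only one possible interpretation of the procedure.

```python
import random


def fraction_in_envelope(caliop_aod, hsrl_aod, abs_err, rel_err):
    """Fraction of points with |AOD_CALIOP - AOD_HSRL| <= abs_err + rel_err * AOD_HSRL."""
    inside = sum(
        1
        for c, h in zip(caliop_aod, hsrl_aod)
        if abs(c - h) <= abs_err + rel_err * h
    )
    return inside / len(hsrl_aod)


def fit_envelope(caliop_aod, hsrl_aod, target=0.68, rel_start=0.05,
                 abs_step=0.001, rel_step=0.01, max_steps=10000):
    """Two-step search: fix the relative term, grow the absolute term, then widen the relative term."""
    # Step 1 (assumed reading): grow the absolute term while the coverage stays below the target.
    abs_err = 0.0
    for _ in range(max_steps):
        if fraction_in_envelope(caliop_aod, hsrl_aod, abs_err + abs_step, rel_start) >= target:
            break
        abs_err += abs_step
    # Step 2: increase the relative term until the target coverage is reached.
    rel_err = rel_start
    for _ in range(max_steps):
        if fraction_in_envelope(caliop_aod, hsrl_aod, abs_err, rel_err) >= target:
            break
        rel_err += rel_step
    return abs_err, rel_err


# Synthetic demonstration data only (not measurements from this study).
rng = random.Random(0)
hsrl = [rng.uniform(0.0, 0.5) for _ in range(2000)]
caliop = [h + rng.gauss(0.0, 0.03 + 0.1 * h) for h in hsrl]

abs_err, rel_err = fit_envelope(caliop, hsrl)
print(f"envelope: +/-{abs_err:.3f} +/- {rel_err:.2f} * AOD_HSRL")
```

Because many (absolute, relative) pairs satisfy the same coverage criterion, a different starting value for the relative term would return a different but equally valid envelope, which is the subjectivity noted in the text.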
Figure 5a compares the HSRL total column AOD and the CALIOP column AOD for nighttime conditions. CALIOP's nighttime column AOD is lower than the HSRL's for AODs less than 0.1, and is biased as much as 50 % high for AODs greater than 0.2. The daytime column AODs (Fig. 5d) show more scatter than the nighttime values as well as fewer low values (AOD < 0.05). These attributes are due to the higher solar background during the day, which increases both the noise and the detection limit of aerosols. CALIOP's daytime column AODs are greater than HSRL's for AODs less than 0.1. For each column AOD estimate the CALIOP data products also report uncertainty estimates at the 1 SD level. Given a Gaussian uncertainty distribution, the CALIOP AOD uncertainty range (i.e., AOD ± uncertainty) should thus encompass ∼ 67 % of the HSRL measured AOD values. Instead, the current CALIOP uncertainty estimates are somewhat optimistic (i.e., too low), with the CALIOP uncertainty ranges encompassing only ∼ 50 % of the nighttime HSRL column measurements and only ∼ 40 % of the daytime measurements. Using Eq. (5) to capture at least 68 % of the data, we also estimate the error to be ±0.05 ± 0.07 · AOD for the nighttime and ±0.08 ± 0.1 · AOD for the daytime (see Table 3).
Figure 5b and e show the comparison of the HSRL layer AOD and the CALIOP layer AOD for nighttime and daytime conditions, respectively. The number of unique column AOD points differs from the number of unique layer AOD points. Vertically, there can be multiple layers represented in a single column, and horizontally each layer usually spans multiple columns (e.g., a layer identified at 80 km resolution spans sixteen 5 km columns). The result is more unique column AOD points than layer AOD points. In the nighttime, a slightly high bias in CALIOP's layer AODs is observed for lower AODs (< 0.1), in contrast to the column AODs in this range. This can be explained by a combination of two factors: a lidar ratio selection that is too large, causing an overestimate of layer AOD, combined with some layers that are not detected by CALIOP, leading to an underestimate in the column AOD. Both of these factors are discussed in the next section.
The daytime layer AODs also show that CALIOP slightly overestimates the layer AOD, by ∼ 0.025. The 1 SD layer AOD uncertainty estimate specified in the CALIOP data products encompassed 57 % of the nighttime HSRL layer AODs and 51 % of the daytime values. Our estimate of the error was lower for the layer AOD than for the column AOD, with 1 SD encompassed by ±0.035 ± 0.05 · AOD for the nighttime and ±0.05 ± 0.05 · AOD for the daytime. Figure 5c and f are similar to Fig. 5b and e except that they represent only the topmost detected layers, thereby removing data points that can be contaminated by poor solutions in overlying layers. Many of the points with a high CALIOP bias are due to errors in overlying layers, although in both day and nighttime the topmost layer itself has a slightly high AOD bias. For the topmost layers, 59 % of nighttime and 52 % of daytime HSRL AOD values differed from the corresponding CALIOP values by less than the 1 SD uncertainty quoted in CALIOP's AOD data products. This agreement was better than that found for the column and all-layer cases.
Standard regressions on the data sets of Fig. 5 (not shown) are largely driven by the outliers and by nonlinearities in the process of deriving AOD, and could be misleading for the overall trend (Wilks, 1995). Similarly, given the large range of AOD measurements, overall bias values are not equally representative of all AOD regimes. Instead, Fig. 6 shows the median layer AOD bias (CALIOP layer AOD − HSRL layer AOD) as a function of the CALIOP layer AOD. The AOD data are accumulated in bins of 0.05 in width, centered at each point. The boxes are the 25th and 75th percentiles and the whiskers are the minimum and maximum bias for both day and nighttime. Figure 6 shows a positive bias at all AODs for both day and night (CALIOP layer AOD is higher). The dash-dot line on the top panel plots the median bias of each bin as a fraction of the bin AOD and is less than 50 % for AODs less than 0.3. The fractional bias is larger for higher AODs, although we also note that 90 % of the AOD values are less than 0.25.
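The binned statistics described for Fig. 6 can be sketched as below. This is a hedged illustration, not the processing used for the figure: the function name, the minimum bin population, and the bin alignment (edges at multiples of the bin width) are assumptions, and percentile conventions differ slightly between tools.

```python
from statistics import median, quantiles


def binned_bias(caliop_aod, hsrl_aod, bin_width=0.05, min_count=4):
    """Median, quartiles, and range of (CALIOP - HSRL) layer AOD bias per CALIOP AOD bin."""
    bins = {}
    for c, h in zip(caliop_aod, hsrl_aod):
        # Bin on the CALIOP layer AOD; edges at multiples of bin_width (assumed alignment).
        bins.setdefault(int(c // bin_width), []).append(c - h)

    rows = []
    for index in sorted(bins):
        bias = bins[index]
        if len(bias) < min_count:
            continue  # skip sparsely populated bins (hypothetical threshold)
        q25, _, q75 = quantiles(bias, n=4)
        center = (index + 0.5) * bin_width
        med = median(bias)
        rows.append({
            "bin_center": center,
            "median_bias": med,
            "fractional_bias": med / center,  # median bias as a fraction of the bin AOD
            "q25": q25,
            "q75": q75,
            "min": min(bias),
            "max": max(bias),
            "count": len(bias),
        })
    return rows


# Toy values for illustration only.
caliop = [0.02, 0.03, 0.04, 0.01, 0.12, 0.13, 0.11, 0.14]
hsrl = [0.01, 0.02, 0.02, 0.01, 0.10, 0.10, 0.10, 0.10]
for row in binned_bias(caliop, hsrl):
    print(f"{row['bin_center']:.3f}  median bias {row['median_bias']:.3f}")
```

Using the median per bin rather than a regression over all points limits the influence of outliers, which is the motivation given in the text.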

Discussion and contributions
The CALIOP algorithms are complex and nonlinear, so many factors can potentially cause biases in the CALIOP AOD. Some parameters, such as the 532 nm attenuated backscatter calibration, have been extensively validated (Rogers et al., 2011). Calibration errors generally introduce relatively small errors in the lidar-derived aerosol backscattering and extinction profiles, especially for the relatively low optical depths that are typically measured within aerosol layers (i.e., as opposed to within clouds), although larger errors can occur due to calibration at higher optical depths. Also, as already noted above, errors in the retrieval of the AODs of upper layers are propagated downward as calibration errors in the lower layers, and these can be appreciable (Young et al., 2013). Another possible influence on CALIOP optical depth retrievals is the presence of multiple scattering due to the large receiver footprint. However, multiple scattering effects in aerosol layers are generally thought to be small for CALIOP, especially when compared to the magnitude of the lidar ratio uncertainties (Winker, 2003; Winker et al., 2009). Lastly, errors could be introduced by cloud contamination in the CALIOP aerosol products; these are not investigated in this study due to the highly variable nature of clouds. In this study we investigated several layer quantities that could potentially explain systematic errors in the CALIOP AOD, such as the layer IAB, the amount of IAB above the layer, the layer altitude, and the layer thickness; however, no correlations were found. Almost all of the systematic error observed in this study can be explained by lidar ratio selection errors and undetected layers. This section explores and quantifies the CALIOP error due to the lidar ratio for each aerosol type and quantifies the impact of undetected layers on the CALIOP column AOD.

Layer detection
Because the SIBYL detection algorithm is a single routine that is applied to the full dynamic range of both aerosol and cloud backscatter intensities, it can sometimes fail to identify tenuous aerosol layers that might be detected by a specifically targeted algorithm. Vaughan et al. (2009) describe the detection of cloud and aerosol layers in detail. Layer detection is performed by analysis of the attenuated backscatter profiles using a threshold that is set for each profile, depending on the signal SNR. For a given layer optical depth, success in layer detection depends on layer depth and lidar ratio, as well as the level of background illumination. McGill et al. (2007) found that CALIOP could successfully identify high, thin cirrus layers with optical depths as low as 0.01 in the daytime. The aerosol layers targeted by HSRL, however, are more weakly scattering for the same optical depth (i.e., they have higher lidar ratios), typically have larger vertical depth (i.e., they are more spatially diffuse), and have much lower contrast with the molecular background simply because they lie lower in the atmosphere than cirrus clouds. Also, because the CALIOP layer detection algorithm necessarily uses backscatter contrast (rather than extinction), aerosol layers, with their generally higher lidar ratios and hence lower backscatter for the same optical depth, will be less easily detected than optically thin cirrus layers. CALIOP can fail to detect aerosol layers when the aerosol backscatter is too small relative to the profile SNR. Aerosol detection failures by CALIOP were often noted during the spring portion of the ARCTAS field campaign, when the HSRL was based out of Barrow, Alaska. During the 12 daytime CALIOP-targeted HSRL flights of the spring ARCTAS field campaign, almost no aerosol layers were detected by CALIOP except on the occasion when strong smoke advected over the track. An example of HSRL data from ARCTAS is highlighted in Burton et al. (2012). During ARCTAS, the HSRL typically encountered diffuse, weakly scattering aerosol layers extending up from the ground to the aircraft altitude, without the distinct boundary layer typically found in other locations. While these layers sometimes had significant AOD, the combination of the small aerosol backscatter coefficients and high daytime noise (due to high surface albedo) resulted in CALIOP backscatter profiles in which the aerosol signal remained below CALIOP's detection threshold. This was the case for the daytime ARCTAS mission flights: although they represent about 20 % of the flight data in this study, they contribute less than 1.4 % of the aerosol layers compared.
To assess only the effects of CALIOP's layer detection, the HSRL column AOD and HSRL layer-summed AOD along the CALIOP track are compared in Fig. 7 for both day and night lighting conditions. As the layer-summed AODs for the HSRL were only calculated within the boundaries of layers detected by CALIOP, a comparison of the HSRL layer-summed values with the HSRL column values unambiguously identifies those cases where the CALIOP layer detection scheme failed to correctly identify the full extent of the aerosol layer(s).
In the nighttime (Fig. 7a) the CALIOP layer detection algorithm has more skill at detecting layers than during the daytime (Fig. 7b), owing to the higher SNR. Several interesting observations can be made from the nighttime panel. First, the relationship between the layer-summed AOD and the true column AOD is very linear across a wide range of AOD, with a constant offset of about 0.02 that is attributed to tropospheric clean air that often contains tenuous aerosol layers. This is consistent with Vaughan et al. (2010), who found in 1 yr of HSRL data from 2006 to 2007 that, in the cleanest regions of the free troposphere, about 5 % of the clean-air scattering is actually due to background aerosol. Assuming the clean-air aerosol scattering to be 5 % as noted above and a lidar ratio of 40 sr, we estimate the clean-air AOD from 2 to 7 km to be 0.0096. Similarly, a 10 % background aerosol scattering will cause an AOD of 0.018. The attenuation due to these optically thin aerosol layers in the free troposphere is expected to cause a slight underestimate of the HSRL layer-summed AOD and therefore of the CALIOP column AOD. Looking further at Fig. 7a, the two regions where CALIOP layer detection fails are at the extremes of low and high AOD. Errors at the low column AOD extreme can be expected, because as layers become optically thinner they usually have lower backscatter and are difficult to detect.
The spread at the highest AODs in Fig. 7a (nighttime) comes primarily from one flight on 23 June 2006. This day had excellent agreement between the HSRL and CALIOP attenuated backscatter, although it was a complex case in terms of vertical structure. The CALIOP feature detection reports as many as five aerosol layers in several of the 5 km segments along this track. In addition, the aerosol loading was heavy: the HSRL AOD was between 0.4 and 0.75 along this track (Fig. 9 in Rogers et al. (2011) shows a line plot of this case). Due to the complex aerosol structure in this scene and the attenuation caused by the higher aerosol loading, the SIBYL algorithm failed to detect one of the higher AOD layers. This missing layer case represents less than 1 % of the data set; however, it is certainly an example of where the CALIOP column AOD does not represent the entire column.
The daytime portion of Fig. 7b shows nearly the same low bias (∼ 0.02) of the layer-summed AOD relative to the column AOD for AODs less than 0.1 that was evident in the nighttime. In addition, Fig. 7b shows that the HSRL layer-summed AOD frequently underestimates the true HSRL column AOD during the daytime. The poorer performance in the daytime compared to the nighttime is due to the lower SNR and implies that detection errors alone can cause the CALIOP column AOD to be under-reported in the daytime.
It is worth noting that previously published works comparing CALIOP AOD estimates to collocated measurements by passive sensors, such as AERONET (Schuster et al., 2012; Omar et al., 2013) and MODIS (Kittaka et al., 2011; Redemann et al., 2012), consistently find a similar low bias in the CALIOP results. Because these passive sensors derive AOD from measurements of direct or reflected sunlight, the comparisons are always done during daylight hours, when the CALIOP layer detection results are most likely to be degraded by solar background noise. As an initial estimate of the magnitude of the biases introduced by CALIOP detection failures, we examine the detected AOD fraction (i.e., the ratio of the HSRL layer-summed AOD to the HSRL column AOD) for daytime measurements only. As seen in Fig. 8a, for approximately one-quarter of the cases, CALIOP failed to detect the aerosol that was responsible for at least 50 % of the column AOD. Of these cases of 50 % or more detection failure, almost two-thirds had column optical depths of 0.2 or less (equivalent to an extinction coefficient of ∼ 0.0286 km⁻¹, which for a lidar ratio of 50 sr equates to a backscatter coefficient of 5.714 × 10⁻⁴ km⁻¹ sr⁻¹).
The above analysis does not consider cases where HSRL measured a column AOD but zero layers were detected by CALIOP. To address this, Fig. 9 summarizes layer AOD statistics from all 106 HSRL flights coincident with CALIOP. The magenta line (HSRL all) summarizes all HSRL column AOD measurements along the collocated CALIOP 5 km defined aerosol layer grid, whether or not CALIOP reported a valid column AOD. The blue and red lines represent the HSRL (HSRL match) and CALIOP column AOD only in regions where CALIOP reports a valid column AOD. In the nighttime, shown in Fig. 9a, the HSRL all and HSRL match distributions are similar; when CALIOP detects a layer at night, it usually detects most of the AOD in the column. Even though there are few points above an AOD of 0.3 at night, CALIOP reports a total of 144 columns with AOD between 0.3 and 0.5 while HSRL reports 61 such columns, showing a general overestimation of high column AOD values by CALIOP at night. In the daytime, the HSRL all and HSRL match distributions do not match, indicating that there are columns in which CALIOP fails to detect any aerosol even though the HSRL measured an AOD. While this can occur for AODs as large as 0.2, the majority of these missed columns are below an AOD of 0.1, where roughly half of the layers are missed by CALIOP. A similar phenomenon is seen at night, where the rate of detection failures increases for layers with AOD less than about 0.02.
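The detected AOD fraction used above is a simple ratio, and a minimal sketch of it follows. The function names and sample values are hypothetical; only the definition (HSRL layer-summed AOD divided by HSRL column AOD) and the 50 % criterion come from the text.

```python
def detected_fraction(layer_summed_aod, column_aod):
    """Ratio of HSRL AOD inside CALIOP-detected layers to the full HSRL column AOD."""
    return [
        ls / col
        for ls, col in zip(layer_summed_aod, column_aod)
        if col > 0.0  # skip columns with no measurable AOD
    ]


def share_missing_at_least_half(fractions):
    """Share of cases in which at least 50 % of the column AOD was not detected."""
    return sum(1 for f in fractions if f <= 0.5) / len(fractions)


# Toy values for illustration only (not data from this study).
layer_summed = [0.05, 0.10, 0.02, 0.20]
column = [0.20, 0.12, 0.10, 0.21]

fractions = detected_fraction(layer_summed, column)
print(share_missing_at_least_half(fractions))
```

Because both quantities come from the HSRL, the ratio isolates layer detection from lidar ratio and calibration effects, as the text notes for Fig. 7.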
Lastly, it makes more sense to describe the CALIOP detection limits in terms of backscatter or extinction than AOD, because the vertical distribution of AOD is a better metric for assessing CALIOP's detection scheme than a single AOD value. This is the subject of a future paper using the CALIOP aerosol profile products; however, given that the median column thickness was found to be 1.6 km during the nighttime and 1.5 km during the daytime, we can estimate the minimum extinction detection threshold, in a layer median sense, to be 0.012 km⁻¹ at night and 0.067 km⁻¹ during the daytime, using the minimum AOD values of 0.02 and 0.1 established above. These values are also consistent with Fig. 1 of Winker et al. (2013).
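The threshold arithmetic above amounts to dividing an AOD by a layer thickness, and can be checked with the short sketch below. The function names are hypothetical. The 7 km depth in the last example is an inference: it is the depth that reproduces the 0.0286 km⁻¹ value quoted earlier for a column AOD of 0.2, and it is not stated explicitly in the text.

```python
def mean_extinction(aod, thickness_km):
    """Layer-mean extinction coefficient (km^-1) for an AOD spread over thickness_km."""
    return aod / thickness_km


def backscatter_from_extinction(extinction_per_km, lidar_ratio_sr):
    """Backscatter coefficient (km^-1 sr^-1) from extinction and the lidar ratio."""
    return extinction_per_km / lidar_ratio_sr


# Minimum detectable AOD and median column thickness quoted in the text.
night = mean_extinction(0.02, 1.6)   # quoted as 0.012 km^-1
day = mean_extinction(0.10, 1.5)     # quoted as 0.067 km^-1

# Column AOD of 0.2 over an inferred 7 km depth, lidar ratio of 50 sr.
column_extinction = mean_extinction(0.2, 7.0)
column_backscatter = backscatter_from_extinction(column_extinction, 50.0)

print(f"night {night:.4f} km^-1, day {day:.4f} km^-1")
print(f"column {column_extinction:.4f} km^-1, {column_backscatter:.3e} km^-1 sr^-1")
```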

Lidar ratio effects
In the absence of an independent estimate of AOD, extinction retrievals for elastic backscatter lidar measurements such as those made by CALIOP rely on the a priori specification of a type-dependent lidar ratio (Young and Vaughan, 2009). Because solutions to the lidar equation can be very sensitive to the value of the lidar ratio used, numerous researchers have investigated the role played by lidar ratio selection in evaluating the discrepancies between CALIOP AOD estimates and estimates derived from other sensors (e.g., Kacenelenbogen et al., 2011; Kittaka et al., 2011; Oo and Holz, 2011; Redemann et al., 2012; Schuster et al., 2013). Furthermore, because lidar ratios can have a large range of values, even within a single aerosol subtype, they can easily become the largest source of error in the CALIOP retrievals in this study. In addition to the variability in the lidar ratio for each subtype, the misclassification of an aerosol subtype can also result in a large lidar ratio error, potentially introducing a significant bias in the AOD. Discrepancies between the layer AOD measured by HSRL and the layer AOD retrieved by CALIOP tend to track the differences between the lidar ratios measured by HSRL and specified by CALIOP, as shown in Fig. 10; the layer AOD is used because it avoids error from undetected aerosol. Zero bias in the CALIOP AOD corresponds to nearly zero bias in lidar ratio. Similarly, positive (negative) lidar ratio biases most frequently correspond to positive (negative) AOD biases.
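The sensitivity of the retrieved AOD to the assumed lidar ratio can be illustrated with a deliberately simplified model. The sketch below is not the HERA algorithm. It assumes a single aerosol layer with a constant lidar ratio and neglects molecular scattering and multiple scattering, so that the layer integrated attenuated backscatter is IAB = (1 − exp(−2·AOD)) / (2·S), which can be inverted analytically. The function names are hypothetical. The 43 sr value and the 45 % lidar ratio error are taken from the case study discussed earlier.

```python
import math


def integrated_attenuated_backscatter(aod, lidar_ratio_sr):
    """Layer IAB (sr^-1) for an aerosol-only layer with a constant lidar ratio."""
    return (1.0 - math.exp(-2.0 * aod)) / (2.0 * lidar_ratio_sr)


def retrieved_aod(iab, lidar_ratio_sr):
    """Invert the same relation for AOD, given an assumed lidar ratio."""
    arg = 1.0 - 2.0 * lidar_ratio_sr * iab
    if arg <= 0.0:
        raise ValueError("assumed lidar ratio is too large for this IAB")
    return -0.5 * math.log(arg)


def aod_overestimate(true_aod, true_lidar_ratio_sr, assumed_lidar_ratio_sr):
    """Fractional AOD error when the retrieval uses the wrong lidar ratio."""
    iab = integrated_attenuated_backscatter(true_aod, true_lidar_ratio_sr)
    return retrieved_aod(iab, assumed_lidar_ratio_sr) / true_aod - 1.0


for tau in (0.05, 0.1, 0.3):
    error = aod_overestimate(tau, 43.0, 43.0 * 1.45)
    print(f"true AOD {tau:.2f}: retrieved AOD is {100.0 * error:.0f} % too high")
```

In this simplified model the fractional AOD error is always larger than the fractional lidar ratio error and grows with AOD, which is the qualitative behavior discussed in this section. The exact magnitudes from a full retrieval that includes molecular scattering will differ.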
The CALIOP aerosol types, their characteristic lidar ratios, and the estimated 1 SD uncertainties used in the version 3 CALIOP retrievals are tabulated in Table 4 (Young et al., 2013). Table 4 also reports the means and SDs of the lidar ratios measured by the HSRL in the layers identified by CALIOP. The mean HSRL values were almost identical to the median HSRL values except in the case of marine, which had a median value of 23 sr versus a mean of 26 sr. Note that the HSRL lidar ratios in the table are not intended to represent any given type, but rather are the lidar ratios corresponding to the aerosol masses classified by CALIOP and measured by HSRL within the geographic domain of this study. The spread of HSRL values shown in Table 4 highlights the fact that attempting to characterize any aerosol subtype with a single lidar ratio presents a difficulty for any lidar instrument making global aerosol measurements, such as CALIOP. This study does, however, provide statistics on the variation of the lidar ratio that may help make CALIOP AOD uncertainty estimates give a better indication of the likely error in the AOD product. The CALIOP types of biomass burning and polluted continental, which share the same 532 nm lidar ratio and nearly the same estimated lidar ratio uncertainty, are grouped together in the HSRL analysis for reasons that are discussed below. The CALIOP and HSRL lidar ratios agree at some point in their uncertainty envelope for all types, but CALIOP's values are closer to the HSRL average for certain aerosol types, such as marine and dust, than for other types. The modeled CALIOP lidar ratio tends to be higher than the HSRL average in polluted dust and polluted continental-biomass burning layers and lower in clean continental layers.
Recent studies suggest multiple possibilities to explain the discrepancies in Table 4. Either CALIOP incorrectly classifies the subtype (e.g., Burton et al., 2013) or the lidar ratio is not adequately represented by the CALIOP aerosol model (e.g., Schuster et al., 2012). More specifically, there may be a mixture of aerosol species present that either is not modeled in the CALIOP algorithm (e.g., pollution mixed with marine or dust mixed with marine), does not correspond with the mixture type chosen (e.g., pollution cases misidentified as polluted dust), or produces a distribution of lidar ratios too variable for the single modeled lidar ratio to capture adequately. All of these factors will introduce systematic errors into the CALIOP AOD estimates. Burton et al. (2013) provide a comprehensive assessment of the CALIOP aerosol subtypes in the context of the HSRL measurements. However, because HSRL identifies a different set of aerosol classes (Burton et al., 2012), there is no one-to-one correspondence between the CALIOP and HSRL typing schemes. In the next sections we discuss the impact of the subtype-defined lidar ratio on the CALIOP AOD estimates for each aerosol type. Figures 11 and 12, respectively, show the daytime and nighttime lidar ratio distributions, layer AOD bias (as in Fig. 10), and AOD scatterplots (as in Fig. 5b, e) for each aerosol subtype (columns). These figures are the basis for the discussion in the next subsections. Since a CALIOP column may comprise several layers with different types, but a CALIOP layer must be of a single type, the next sections necessarily use only the layer AOD. Lastly, we stress that, because the HSRL measurements have limited geographical coverage and are not global, the interpretation of these results is not intended to prescribe new lidar ratios for the CALIOP types.

Clean marine
Many of the layers identified by CALIOP as marine aerosol came from the two Caribbean deployments, when the HSRL made measurements far offshore, with a few cases off the mid-Atlantic coast region affected by urban outflow. The HSRL analysis in Figs. 11 and 12 shows that the CALIOP marine aerosol type lidar ratio was similar to the peak of the HSRL lidar ratio distribution (more so for daytime data than at night) and that there was generally little bias in the CALIOP AOD. Some larger AOD values (AOD > 0.1) were noted by HSRL at night and underestimated by CALIOP. This underestimation arises from those cases where CALIOP misidentifies the aerosol type as marine while HSRL reports a lidar ratio larger than 40 sr, from which we can infer the influence of other aerosol types. In these cases, the lidar ratio used by CALIOP is much lower than the value measured by HSRL, resulting in a low bias in the CALIOP AOD.
The reason for the misidentification is that the CALIOP surface type (land/ocean) influences the aerosol typing decision. Any surface-attached layer over the ocean with low depolarization (< 0.05), or any surface-attached layer over the ocean with depolarization < 0.075 and IAB greater than 0.01, is identified as marine by CALIOP. Many other aerosol types, such as pollution or biomass burning, can have similar signatures. Oo and Holz (2011) found that aerosol size information could improve the classification of marine aerosol and suggested that using the CALIOP IAB color ratio could provide such an improvement. Simple outflow of these aerosol types over the ocean from a continental region can often result in a misclassification as marine by CALIOP (Schuster et al., 2012). Outside of these outflow regions, in the majority of the marine layers detected in this study, we found good agreement between the HSRL and CALIOP lidar ratios and AODs.
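The marine branch of the typing decision, as described in the paragraph above, can be written as a small predicate. This is a sketch of that single rule only, with hypothetical function and argument names. It is not the full CALIOP subtyping algorithm, which involves additional tests described in Omar et al. (2009).

```python
def is_marine(surface_attached, over_ocean, depolarization, iab):
    """Marine test as described in the text: surface-attached layers over the ocean only."""
    if not (surface_attached and over_ocean):
        return False
    low_depolarization = depolarization < 0.05
    moderate_depolarization_strong_signal = depolarization < 0.075 and iab > 0.01
    return low_depolarization or moderate_depolarization_strong_signal


# A low-depolarization pollution layer advected offshore passes the same test,
# which illustrates how continental outflow can be labeled as marine.
print(is_marine(surface_attached=True, over_ocean=True, depolarization=0.03, iab=0.004))
print(is_marine(surface_attached=True, over_ocean=False, depolarization=0.03, iab=0.004))
```

The rule uses no information on particle size or absorption, so any weakly depolarizing layer touching the ocean surface satisfies it regardless of its actual lidar ratio.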

Clean continental
The clean continental aerosol type was intended to indicate cases of low aerosol loading over land, where typing using the depolarization ratio or color ratio would not be reliable due to weak aerosol scattering. Omar et al. (2009) noted that the clean continental aerosol was not identified very frequently by the CALIOP subtype algorithm. This is largely because the IAB threshold is low and so detection is difficult, especially in daytime lighting conditions. The clean continental aerosol type was almost never identified during the daytime in this data set (only 59 unique layers). During the night, the lidar ratios measured by the HSRL are distributed over values that are significantly higher than the lidar ratio utilized by CALIOP, leading to an underestimate of the AOD by CALIOP. However, in both day and night we found that, because the IAB threshold is low, the typical AOD found for this type was also small (AOD < 0.04), so the use of an incorrect lidar ratio was not significant in terms of absolute AOD.

Dust
The CALIOP subtyping procedure identifies pure dust based solely on the layer-integrated depolarization ratio (Omar et al., 2009). The lidar ratio for dust depends on many factors, including, but not limited to, mineral composition, age, humidity, and size distribution. The 532 nm dust lidar ratio is often discussed in the literature, with a wide range of reported values. Attempting to characterize dust with a single lidar ratio presents a difficulty for global measurements such as those of CALIOP. The CALIOP lidar ratio for dust (40 sr) is on the low end of the typical range of lidar ratios (40-60 sr) measured in Europe or Africa (Mattis et al., 2002; Tesche et al., 2009, 2011; Papayannis et al., 2008; Müller et al., 2012; Esselborn et al., 2009), but is consistent with recent measurements of the lidar ratio for dust from the Arabian Peninsula (Mamouri et al., 2013), and with earlier estimates based on AERONET observations (e.g., Cattrall et al., 2005). Recent studies have demonstrated that considering the source region of the dust, any changes in its properties during transport, and any other aerosol type it mixes with would likely provide a better estimate of the lidar ratio (Schuster et al., 2012; Kim et al., 2014). However, accounting for these additional factors would require additional measurements and/or sources of information (e.g., back trajectories) that are not currently incorporated into the CALIOP data analysis scheme. The HSRL data set contains a significant trans-Atlantic transported Saharan dust component from the Caribbean 2010 campaign, which is relevant to the CALIOP global lidar ratio selection process since the Sahara is the largest dust source on the globe and a significant fraction of Saharan dust is transported over the Atlantic. A subsequent paper is planned specifically for these measurements, while here we focus only on the layers that CALIOP identified as dust.
In both day and night lighting conditions, CALIOP's value of 40 sr for the lidar ratio of dust-identified layers agrees well with the mean lidar ratio from HSRL for these layers of 38 ± 11 sr. Not surprisingly, the AOD shows little bias, and the scatterplots show that, on average, the CALIOP AOD values for dust layers are in agreement with the HSRL values for a large range of AODs, although there is a slight overestimation of larger AOD values by CALIOP at night. It is also important to note that in this data set we saw no indication of the multiple scattering impact on depolarization described by Liu et al. (2010). Indeed, these were primarily non-opaque dust layers with aerosol extinction less than 1 km⁻¹, so the multiple scattering impact is expected to be small (Liu et al., 2010).
Another important conclusion from the analysis of the dust type layers is that the CALIOP AOD is most similar to HSRL's when the mean lidar ratio from the HSRL distribution is the most similar to the value used by CALIOP, reinforcing the importance of CALIOP selecting the correct lidar ratio for the aerosol type that is identified.

Polluted dust
The distribution of lidar ratios measured by the HSRL in layers identified by CALIOP during the day as polluted dust shows no peaks, although the HSRL mean value is some 20 % less than CALIOP's value. The AODs measured by both instruments show considerable scatter, as would be expected from the broad distribution of lidar ratios. In general, however, the larger lidar ratio disparities track with the larger AOD biases. For nighttime layers, HSRL's lidar ratio distribution shows a strong peak and a mean lidar ratio value some 38 % less than the value used by CALIOP. Consequently, the plot of the CALIOP AODs versus those measured by HSRL shows a strong pattern of correspondence, with the error in the CALIOP AOD increasing nonlinearly with AOD but correlated with the lidar ratio bias.
Similar to the caveat regarding the dust aerosol type, we stress that this is not a global analysis and that this study is biased by the fact that the CALIOP polluted dust type measured by HSRL is primarily a mixture of dust + marine in these cases, while the CALIOP assumption is that polluted dust is actually dust + smoke (Omar et al., 2009; Burton et al., 2013). Both such mixtures yield the elevated depolarization ratios (0.075 ≤ depolarization < 0.20) that trigger the identification of polluted dust in the CALIOP aerosol subtyping scheme. There are insufficient coincident HSRL data on dust and smoke mixtures to evaluate CALIOP's lidar ratio in terms of a mixture of these types, but the lidar ratio used by CALIOP for this polluted dust type is considerably larger than the value that HSRL measures for layers it identifies as such. Consequently, the AODs retrieved by CALIOP in these layers are also larger than the HSRL measurements. Where the mixture of types is such that the HSRL lidar ratios exhibit a strong peak, as in the nighttime data shown here for polluted dust, there will be a strong correspondence between the AODs retrieved by CALIOP and those measured by the HSRL, and the lidar ratio bias will produce absolute CALIOP − HSRL differences in AOD that increase with retrieved AOD, as we see in the polluted dust columns of Figs. 11 and 12. Where the mixture of types is such that there is no peak in the measured (or actual) lidar ratios, the relationship between the measured (or actual) AODs and those retrieved by CALIOP will show considerable scatter, as in the daytime cases shown here. Lastly, this aerosol type highlights the impact of lidar ratio selection errors on the AOD retrieval, which is especially evident towards higher AOD values.

Polluted continental-biomass burning
The polluted continental and biomass burning aerosol types are combined here because they share the same CALIOP lidar ratio. As seen in Figs. 11 and 12, the lidar ratios measured by HSRL during both day and night for layers classified as these types are quite broadly distributed and are generally less than the fixed value used by CALIOP. Overall, the HSRL lidar ratios measured in aerosol layers identified as polluted continental or biomass burning by CALIOP show a much broader distribution than would be expected based on the variability ascribed to the CALIOP aerosol models (see Table 4). Two factors potentially contribute to this increased variability. The first is true variability in the lidar ratio for smoke and urban aerosols; for example, smoke properties are known to vary with age (Alados-Arboledas, 2011; Nicolae, 2013). The second factor is the possibility that the CALIOP subtyping routine is more prone to errors in identifying these aerosol types, which is supported, for smoke at least, by the study of Burton et al. (2013) and by studies of biomass burning lidar ratio measurements (e.g., Cattrall et al., 2005; Müller et al., 2005, 2007; Burton et al., 2012). If intra-class variability is also partly responsible for the disparities, regional variabilities in aerosol composition could play a role. Lopes et al. (2013) used AERONET optical depth measurements and CALIOP IAB measurements to estimate lidar ratio distributions for the CALIOP aerosol subtypes occurring over Brazil. Their calculations show mean percentage differences with respect to the CALIOP modeled values of −1.7 % ± 9 % for the polluted continental type and 4.3 % ± 27 % for biomass burning, suggesting that the aerosols sampled over Brazil more closely resemble the CALIOP models than do the aerosols sampled over North America.
As a consequence of the broad range of measured lidar ratios within these types, there is considerable scatter in the plots that compare the AODs. The AOD bias in almost all cases (day and night) was larger than zero, reflecting the fact that CALIOP's lidar ratio tended overall to be larger than the HSRL mean value. The high value of CALIOP's lidar ratio for this type, combined with its large difference from the measured values, will cause errors in the retrieved AOD to be a strong function of AOD.

Summary of types
A lidar ratio or some constraint (i.e., AOD or direct transmittance) must be used to retrieve extinction profiles and AOD from an elastic backscatter lidar (e.g., Young, 1995). This comparison of CALIOP's lidar ratios with those measured in this study by the HSRL shows the difficulty inherent in correctly determining an aerosol subtype using a classification algorithm, and the AOD errors that can result from using a single lidar ratio value for each aerosol type. Errors arising from either source will result in incorrect lidar ratios being passed to the HERA (i.e., extinction retrieval) algorithm, adding systematic errors that in many cases will far exceed other sources of error such as measurement error and calibration error. Studies such as this one, which identify systematic regional biases in the lidar ratio values currently used by CALIOP, can form a basis to improve the performance of the CALIOP algorithms in the future by accommodating these regional variations in the selection of lidar ratio values.
For all of the aerosol types, we found that the AOD retrieved by CALIOP tended to be correlated with that measured by HSRL when the HSRL lidar ratio distribution for a given CALIOP aerosol type was strongly peaked. In the cases of marine and dust, the lidar ratio used by CALIOP was similar to the mean value of the peaked HSRL distribution, resulting in little bias in the CALIOP AOD when compared to the HSRL measured AOD. In the cases of the polluted dust, polluted continental, and biomass burning aerosol types, the means of the HSRL lidar ratio distributions were less than the lidar ratios used by CALIOP; thus, CALIOP AODs were generally biased high. Lastly, in the limited case of the clean continental aerosol type, any mismatch in lidar ratio was not found to cause a significant bias in AOD because the IAB (and hence AOD) was quite low (< 0.04). The biases in the retrieved AOD due to errors in the lidar ratio propagate nonlinearly and can be a strong function of AOD. As discussed in Sect. 3, reporting a single bias value for each type would not represent all AOD values. As such, Tables 5 and 6 list the daytime and nighttime AOD biases, respectively, for each aerosol type, using the same method described for Fig. 6.
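The nonlinear growth of the AOD bias with AOD can be illustrated with a minimal sketch. It assumes a single homogeneous layer with no multiple scattering, for which the layer-integrated attenuated backscatter is IAB = (1 − exp(−2τ)) / (2 Sa). It is not the HERA algorithm; the function name and the lidar ratio values (50 sr true, 70 sr assumed) are hypothetical and chosen only for illustration.

```python
import math


def retrieved_aod(true_aod, true_sa, assumed_sa):
    """AOD retrieved from a layer's IAB when an assumed lidar ratio is used.

    Simplified single-layer model (assumption): no multiple scattering,
    constant lidar ratio within the layer. Lidar ratios are in sr.
    """
    # Forward model: the IAB the lidar would observe for the true layer.
    iab = (1.0 - math.exp(-2.0 * true_aod)) / (2.0 * true_sa)
    # Inversion with the assumed (possibly wrong) lidar ratio.
    arg = 1.0 - 2.0 * assumed_sa * iab
    if arg <= 0.0:
        raise ValueError("retrieval diverges for this lidar ratio and IAB")
    return -0.5 * math.log(arg)


if __name__ == "__main__":
    # Hypothetical values: true lidar ratio 50 sr, assumed lidar ratio 70 sr.
    for tau in (0.05, 0.1, 0.3):
        tau_ret = retrieved_aod(tau, true_sa=50.0, assumed_sa=70.0)
        print(f"true AOD {tau:.2f}: retrieved {tau_ret:.3f}, "
              f"relative bias {100.0 * (tau_ret / tau - 1.0):.0f} %")
```

Under these assumptions, the same lidar ratio overestimate produces a relative high bias of roughly 40 % at a true AOD of 0.05 and more than 60 % at 0.3, so the bias grows faster than linearly with AOD.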

Assessment of HERA with corresponding lidar ratios
As demonstrated in Sect. 2.6 and discussed above, the lidar ratio selection is critical to obtaining accurate layer AOD from CALIOP. In this section, we re-evaluate the layer AOD discussed in Figs. 11 and 12 only in those layers where the CALIOP modeled lidar ratio is within 30 % of the HSRL measured lidar ratio. The layers meeting this criterion accounted for 71 % of the marine layers, 32 % of the clean continental layers, 82 % of the dust layers, 33 % of the polluted dust layers, and 38 % of the polluted continental-biomass burning layers listed in Table 4. The scatterplots are shown in Fig. 13 (daytime and nighttime combined). For each aerosol type, the CALIOP layer AOD is in good agreement with the HSRL layer AOD, especially when compared with the data in Figs. 11 and 12, falling around the one-to-one line. This implies that the HERA algorithm performs remarkably well over a large range of HSRL layer AOD when given an accurate lidar ratio. In these cases the CALIOP layer AOD error can be expressed as ±0.025 ± 0.05 · AOD (from Eq. 5), which is smaller than the cases described in Table 3. Furthermore, the 1 SD layer AOD uncertainty estimate reported in the CALIOP data products is found to encompass 79 % of the nighttime HSRL layer AODs and 70 % in the daytime. This is a significant improvement from the layer AOD uncertainty presented in Sect. 3, which only encompassed 57 % of the HSRL layer AOD at night and 51 % during the daytime. This also demonstrates that the HERA uncertainty propagation is working well in the absence of either misclassification or incorrect lidar ratios provided by the classification.
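The selection criterion and the error envelope used above can be sketched as follows. This is a minimal illustration, assuming that "within 30 %" is taken relative to the HSRL lidar ratio and that the envelope has the form |CALIOP − HSRL| ≤ a + b · AOD with the HSRL AOD as the reference; both choices, the function names, and the sample values are hypothetical and are not taken from the paper's Eq. 5.

```python
def within_30_percent(sa_caliop, sa_hsrl, tolerance=0.30):
    """True if the CALIOP lidar ratio is within `tolerance` of the HSRL value.

    Assumption: the relative difference is normalized by the HSRL lidar ratio.
    """
    return abs(sa_caliop - sa_hsrl) <= tolerance * sa_hsrl


def fraction_within_envelope(caliop_aod, hsrl_aod, abs_err, rel_err):
    """Fraction of layers with |CALIOP - HSRL| <= abs_err + rel_err * HSRL AOD."""
    if len(caliop_aod) != len(hsrl_aod) or not caliop_aod:
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(
        1
        for c, h in zip(caliop_aod, hsrl_aod)
        if abs(c - h) <= abs_err + rel_err * h
    )
    return inside / len(caliop_aod)


if __name__ == "__main__":
    # Made-up layer AODs, for illustration only (not data from this study).
    caliop = [0.10, 0.20, 0.50]
    hsrl = [0.09, 0.12, 0.45]
    print(fraction_within_envelope(caliop, hsrl, abs_err=0.025, rel_err=0.05))
```

An envelope of this form is chosen so that a target fraction of the comparisons (68 % in this study) falls inside it; the sketch only evaluates the fraction for given coefficients and does not fit them.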

Comparison with previous validation of CALIOP AOD with MODIS and AERONET
Lastly, we note that the results presented here do not lead to the same conclusions drawn in several previous validation studies of the CALIOP column AOD. For example, Omar et al. (2013) and Schuster et al. (2012) both found CALIOP column AOD estimates to be lower than collocated AERONET AOD measurements. Similarly, Kim et al. (2014) found the CALIOP column AOD to be lower than collocated MODIS retrievals. Redemann et al. (2012) also investigated the CALIOP AOD using the collocated MODIS AOD. Relative to MODIS, CALIOP AOD was found to be biased low over the oceans, to have a longitude-dependent bias over land, and zero-to-low bias at the latitudes studied here, with a caveat that the 8 months studied by Redemann et al. (2012) are only a subset of the temporal range covered by this study. One conclusion frequently drawn from these studies is that lidar ratios assigned by the CALIOP aerosol models are too low. In this study we show that this conclusion can be contradicted by more in-depth evidence and analysis. Like the earlier studies, we also find that the CALIOP column AODs are biased low compared to HSRL, especially for AOD below 0.1. However, the critically important point is that the CALIOP layer AOD is almost always found to be biased high in comparisons with HSRL. It thus bears repeating that passive sensors measure or estimate optical depths over a full atmospheric column, from the top of the atmosphere to the Earth's surface, whereas the standard CALIOP retrieval algorithm only retrieves AOD estimates where layers are detected; i.e., in those regions of a column where the magnitude of the aerosol backscatter is sufficient to be readily distinguished from (always noisy) clear-air measurements. Paradoxically, one consequence of this retrieval strategy is that CALIOP can use an overestimate of the layer lidar ratio yet, due to layer detection limitations, simultaneously report an underestimate of the total column optical depth. Furthermore, this mismatch is much more likely to occur for daytime measurements: the solar illumination that is essential for passive sensor retrievals of AOD generates significant amounts of background noise in the CALIOP measurements, and can significantly degrade CALIOP's ability to detect weakly scattering layers. It is thus clear that an accurate assessment of when and where (or even if) this layer versus column AOD mismatch occurs cannot be made using passive sensor data alone, if for no other reason than that the different sensors are deriving AOD estimates using different fractions of the available atmospheric column. On the other hand, the airborne HSRL data used in this study are uniquely capable of partitioning the total column AOD retrieved by
When assessed in light of previous studies, the analyses presented here lead to the conclusion that a large, high-quality database of airborne HSRL measurements is critical to understanding the CALIOP layer and column AOD, precisely because it allows for the separate examination of errors incurred by detection and by lidar ratio selection. The vertically resolved nature of the HSRL data set has allowed us both to reveal the true mechanisms for bias in CALIOP AOD and to further demonstrate that the CALIOP extinction retrieval performs well when an accurate lidar ratio estimate is used as input. The power of this kind of comparison is a good argument for future field campaigns using HSRL over more distant parts of the globe, or for an HSRL instrument on a space-based mission, since the airborne measurements to date do not provide global coverage. Such measurements would enable future studies of regional differences.

Summary
The NASA Langley HSRL flew 106 flights along the CALIPSO orbit track between June 2006 and October 2011 and produced a rich, unique data set for validation of CALIOP data products. This data set has been used to provide an extensive, qualitative, and quantitative validation of the CALIOP level 2 aerosol layer and column AOD products for typical air masses observed in the North American and Caribbean regions.
In this paper the temporal variability of the HSRL column AOD in this database was assessed, and good correlation (r² > 0.9) was found for collocated HSRL AOD measurements with temporal separation up to 1.5 h, agreeing with previous results (Anderson et al., 2003; Redemann et al., 2005; Shinozuka and Redemann, 2011) for AOD validation studies. A typical case study from this data set was examined (7 February 2009) in which it was demonstrated that the lidar ratio selection can play a dominant role in the CALIOP AOD error, since retrieval with the correct lidar ratio produced excellent agreement in AOD across the entire scene.
The results from this study show that the CALIOP layer AOD error is dependent on both subtype classification and aerosol loading. In general, for the North American/Caribbean air masses in this study, the CALIOP level 2 (version 3.01) layer AOD product is biased high by less than 50 % for AOD values smaller than 0.3, with a somewhat higher bias for larger AOD values. The 1 SD layer AOD uncertainties in the CALIOP data products did not fully encompass the differences between the CALIOP and coincident HSRL values and captured 57 % of the range of layer AODs at night and 51 % in the daytime. However, restricting these layers to only those with similar (within 30 %) lidar ratios, the CALIOP uncertainties encompassed 79 % of the nighttime HSRL layer AODs and 70 % in the daytime. We defined an AOD error estimate in terms of absolute and relative error such that 68 % of the AOD fell into the encompassing envelope. Using the results shown in this study, we express the CALIOP layer AOD error as ±0.035 ± 0.05 · AOD at night and ±0.05 ± 0.05 · AOD during the daytime.
It is difficult to draw overarching conclusions regarding the CALIOP column AOD, because the CALIOP layer detection scheme often identifies a single aerosol mass as containing multiple layers, and these layers may be classified as different aerosol types and thus be assigned different lidar ratios. The reverse is also true; CALIOP may identify single layers containing multiple aerosol masses. Furthermore, the CALIOP column AOD may underestimate the true column AOD because of residual aerosol that goes undetected by the CALIOP layer identification scheme. The CALIOP column AOD uncertainty range (i.e., the AOD ± the quoted 1 SD) given in the CALIOP data products encompassed the AOD measured by the HSRL for 50 % of the nighttime columns and 40 % of the daytime columns. We found that, for the air masses and aerosol types in the region studied, the CALIOP column AOD error could be expressed, in terms of the CALIOP AOD, as ±0.05 ± 0.07 · AOD at night and ±0.08 ± 0.1 · AOD during the daytime, although the error varies considerably with aerosol type.
The performance of the CALIOP layer detection algorithm was also assessed to provide additional insight into the sources of errors in the CALIOP column AOD. Consistent with Winker et al. (2013), we found that CALIOP generally does not detect the weakly backscattering aerosol layers in the free troposphere, and this leads to an underestimate in the CALIOP column AOD of ∼ 0.02 at night. In night lighting conditions the AOD from these missing layers is insignificant compared with errors in the CALIOP AOD that result from errors in the lidar ratio selection (either through incorrectly identified aerosol types or lidar ratios that are different from the values measured for the type by the HSRL). At night, CALIOP generally detected a sufficient fraction of the existing aerosol layers to represent the column AOD, except for the 0.02 underestimate in the free troposphere. In the daytime, the CALIOP column AOD tends to underestimate the true column AOD due to layer detection difficulties caused by the solar background illumination. We found that CALIOP fails to detect roughly half of weak (AOD < 0.1) aerosol columns during the day. Given that the median column thickness was 1.6 km during the nighttime and 1.5 km during the daytime, we can estimate the minimum extinction detection threshold to be 0.012 km⁻¹ at night and 0.067 km⁻¹ during the daytime in a layer median sense. These minimum extinction thresholds are consistent with previously reported layer detection sensitivities (Fig. 1, Winker et al., 2013).
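The quoted thresholds appear consistent with dividing a minimum detectable AOD by the median column thickness. As a worked check, under the assumption (not stated explicitly in the text) that the relevant minimum AODs are the 0.02 nighttime underestimate and the 0.1 daytime weak-column limit, and with the symbols τ_min (minimum detectable AOD), Δz (median thickness), and σ_min (minimum detectable extinction) introduced here for illustration:

```latex
\sigma_{\min} \approx \frac{\tau_{\min}}{\Delta z}, \qquad
\sigma_{\min}^{\mathrm{night}} \approx \frac{0.02}{1.6\ \mathrm{km}} = 0.0125\ \mathrm{km^{-1}}, \qquad
\sigma_{\min}^{\mathrm{day}} \approx \frac{0.1}{1.5\ \mathrm{km}} \approx 0.067\ \mathrm{km^{-1}}
```

The nighttime value agrees with the quoted 0.012 km⁻¹ to within rounding, and the daytime value matches the quoted 0.067 km⁻¹.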
The selection of a single lidar ratio for each aerosol type has limitations when applied to a global measurement and analysis of lidar data and can lead to systematic regional biases in AOD. As suggested by Schuster et al. (2012), multiple or regional models may improve the CALIOP AOD product. The CALIOP aerosol layer lidar ratios were compared with the lidar ratio distributions measured by HSRL, and errors in CALIOP AODs were correlated with the differences between CALIOP's lidar ratios and those measured by the HSRL. We found that, for the geographical regions explored in this study, the CALIOP modeled lidar ratios and retrieved AODs are most comparable to the HSRL measurements for the marine and dust aerosol subtypes. CALIOP's lidar ratio for the clean continental aerosol subtype was considerably lower than the values measured by the HSRL, but because the AOD values were extremely small (AOD < 0.04) a corresponding bias in the CALIOP AOD was not observed. For both polluted dust and polluted continental-biomass burning, CALIOP's modeled lidar ratio was found to be larger than the mean measured value for the HSRL distributions and, as a result, the AODs retrieved by CALIOP were larger than those measured by the HSRL. CALIOP's polluted dust aerosol type is modeled as a mixture of dust + smoke, while the dust mixtures observed by the HSRL for those layers identified by CALIOP as polluted dust in this study were dominated by a mixture of dust and marine, suggesting that other mixtures should be considered by the CALIOP aerosol typing to improve the AOD products. Lastly, considering only cases where the CALIOP lidar ratio was within 30 % of the HSRL lidar ratio produced the best comparison of CALIOP AOD, demonstrating that the extinction algorithm performs properly when provided with the proper lidar ratio. In this case the difference between the AODs could be expressed as ±0.025 ± 0.05 · AOD using combined day and night data, and the CALIOP layer uncertainty range given in the CALIOP data products encompassed the AOD measured by the HSRL for 79 % of the nighttime layers and 70 % of the daytime layers.

Figure 1. Flight track map of all HSRL coincident underflights of CALIPSO in this study. Black lines represent daytime measurements and blue lines represent nighttime (updated from Rogers et al., 2011).

Figure 2. Spatially matched HSRL AOD measurements from out and back tracks with the color bar indicating the temporal separation of the measurements. The dashed line is the one-to-one line.

Figure 3. CALIOP (a) and HSRL (c) 532 nm total attenuated backscatter time series for 7 February 2009 with the white vertical line denoting the point of closest approach. The flight tracks for HSRL (blue) and CALIOP (red) are also shown (d). The mean attenuated backscatter profiles from HSRL (blue) and CALIOP (red) (e) show good agreement between the two. Finally, the CALIOP aerosol subtype product is shown (b).

Figure 4. The CALIOP layer lidar ratio time series (a) and corresponding HSRL layer lidar ratio time series (b) from 7 February 2009. Also shown are the corresponding CALIOP and HSRL layer AOD time series (d and e, respectively), the layer-summed IAB from CALIOP and HSRL (c), and the column (layer-summed) AOD from CALIOP and HSRL (f). In (f), the magenta line represents the CALIOP level 2 column AOD analyzed with a lidar ratio of 43 sr (see text).

Figure 5. Scatterplots of column AOD (a, d), individual layer AOD (b, e), and top layer only AOD (c, f). The top row shows night lighting conditions and the bottom row shows daytime lighting. The color bar indicates the number of points in each grid cell. The one-to-one line is in solid black on all figures and the dashed lines represent the error estimates reported in Table 3.

Table 3. Summary of CALIOP error estimates from HSRL. The (a) column represents the percentage of HSRL AOD that fell within 1 SD of the CALIOP uncertainty estimate from the CALIOP data files. The (b) column represents the confidence envelope (Eq. 5) for CALIOP AOD based on the comparison with HSRL AOD.

Figure 6. The median layer AOD bias (CALIOP − HSRL) as a function of CALIOP layer AOD (bottom panel) for both day (red) and night (blue) measurements. The boxes are the quartiles and the whiskers the minimum/maximum of each bin. The dash-dot lines represent the median bias as a fraction of the AOD bin (top panel).

Figure 7. Night (a) and day (b) regressions of HSRL column and layer-summed AOD. The solid black line is the one-to-one line and the regression is the dotted line.
Figure 8b shows the detected AOD fraction as a function of column optical depth.The error bars represent the SD of a single sample in each column AOD bin.While the magnitudes of the error bars are all roughly similar, the slight upward slope of the mean values (0.138 ± 0.052) indicates that the CALIOP detection algorithm performs somewhat better for larger aerosol loading and higher optical depths.For this data set, CALIOP detected ∼ 65 % of the low AOD layers (AOD < 0.3) and ∼ 80 % of the highest AOD layers.For reference, the global analysis by Redemann et al. (2012) determined that CALIOP AOD estimates were, in the mean, ∼ 74 % of MODIS values.

Figure 8. (a) Cumulative distribution of detected AOD fraction (layer/column); and (b) detected AOD fraction as a function of column AOD (blue) with number of samples (green). Error bars in (b) represent the single sample SD.

Figure 10. Layer AOD bias dependence (CALIOP − HSRL) as a function of layer lidar ratio bias (CALIOP − HSRL). Both day and night data are included, and the numbers represent the number of points in each quadrant.

Figure 11. Daytime histograms of lidar ratio (top row), AOD bias (CALIOP − HSRL) vs. Sa bias (CALIOP − HSRL) (middle row), and AOD scatterplots (bottom row) for each CALIOP aerosol type (columns). The numbers in the top row are the means and SDs of the lidar ratios measured by the HSRL.

Figure 12. Nighttime histograms of lidar ratio (top row), AOD bias (CALIOP − HSRL) vs. Sa bias (CALIOP − HSRL) (middle row), and AOD scatterplots (bottom row) for each CALIOP aerosol type (columns). The numbers in the top row are the means and SDs of the lidar ratios measured by the HSRL.

Table 1. Flights and hours along the CALIPSO track for the field missions up to 2011 containing CALIOP validation components (updated from Rogers et al., 2011).

Table 2. Number of unique 5, 20, and 80 km layers used in this study.

Table 4. CALIOP aerosol classification and corresponding combined (day + night) lidar ratios from both CALIOP and HSRL in the CALIOP classified layer. The first column lists the standard values (and uncertainties) used by CALIOP for the various aerosol classes, the following column lists the average values (and uncertainties) retrieved by the HSRL in the layers identified by CALIOP, and the last column shows how many unique layers were counted for each type.