Evaluation of a method for converting Stratospheric Aerosol and Gas Experiment (SAGE) extinction coefficients to backscatter coefficients for intercomparison with lidar observations

Aerosol backscatter coefficients were calculated using multiwavelength aerosol extinction products from the SAGE II and III/ISS instruments (SAGE: Stratospheric Aerosol and Gas Experiment). The conversion methodology is presented, followed by an evaluation of the conversion algorithm’s robustness. The SAGE-based backscatter products were compared to backscatter coefficients derived from ground-based lidar at three sites (Table Mountain Facility, Mauna Loa, and Observatoire de Haute-Provence). Further, the SAGE-derived lidar ratios were compared to values from previous balloon and theoretical studies. This evaluation includes the major eruption of Mt. Pinatubo in 1991, followed by the atmospherically quiescent period beginning in the late 1990s. Recommendations are made regarding the use of this method for evaluation of aerosol extinction profiles collected using the occultation method.


Introduction
Stratospheric aerosol consists of submicron particles (Chagnon and Junge, 1961) that are composed primarily of sulfuric acid and water (Murphy et al., 1998) and play a crucial role in atmospheric chemistry and radiation transfer (Pitts and Thomason, 1993;Kremser et al., 2016;Wilka et al., 2018). Background stratospheric sulfuric acid is supplied by the chronic emission of natural gases such as CS 2 (carbon disulfide), OCS (carbonyl sulfide), DMS (dimethyl sulfide), and SO 2 (sulfur dioxide) from both land and ocean sources (Kremser et al., 2016). The amount of sulfur in the stratosphere can be acutely, yet significantly, impacted by volcanic eruptions. This influence is not limited to relatively rare injections from large volcanic events such as the Mt. Pinatubo eruption of 1991 (McCormick et al., 1995), but episodic injections from smaller eruptions have been shown to have a significant impact as well (Vernier et al., 2011). Therefore, ongoing long-term observations of stratospheric aerosol are important from both a climate and chemistry perspective.
The Stratospheric Aerosol and Gas Experiment (SAGE) is a series of satellite-borne instruments that use the occultation method (both solar and lunar light sourced) and have a lineage that spans 4 decades, originating with the Stratospheric Aerosol Measurement II in 1978 (SAM-II, Chu and McCormick, 1979). Using the occultation technique, the SAM II and SAGE instruments made direct measurements of vertical profiles of the aerosol extinction coefficient (k, herein referred to simply as aerosol extinction) by recording light transmitted through the atmosphere from the sun or moon as it rises or sets. This attenuated light was then compared to exo-atmospheric values that were recorded when the light source was sufficiently high above the atmosphere. This technique allows for high-precision measurements on SAGE aerosol extinction in the main aerosol layer. In general, stratospheric aerosol extinction measurements are challenging due to the paucity of aerosols under background conditions and the ephemeral nature of ash and particulates injected directly from volcanic eruptions. However, occultation observations have the benefit of long path lengths (on the order of 100-1000 km, dependent on altitude). Further, due to the self-calibrating nature of this method, SAGE measurements are inherently stable (i.e., minimal impact from instrument drift) and ideal for long-term trend studies.
Due to the SAGE instrument's level of precision and the limited aerosol number density in the stratosphere, validating the aerosol extinction products has proven challenging. Successful validation is further limited by the measured parameter itself since coincident stratospheric extinction measurements are scarce. Conversely, high-quality backscatter measurements from ground-based lidar instruments are more common and, despite operating at a fixed location, may provide sufficient coincident observations for an evaluation of the SAGE aerosol product. However, the backscatter and extinction coefficient products are not directly comparable.
Previous researchers have accomplished this comparison through the application of conversion coefficients determined from balloon-borne optical particle counters (OPCs; see Jäger and Hofmann, 1991;Jäger et al., 1995;Jäger and Deshler, 2002) or the selection of a wavelength-dependent lidar ratio (S, typically ≈ 40-46 sr; see Kar et al., 2019) to invert the lidar backscatter (532 nm) to extinction, followed by wavelength correction to account for the differing SAGE and lidar wavelengths (conversion is carried out using the Ångström coefficient; Kar et al., 2019). A major limitation of balloon-based conversions is the uncertainty in the conversion factors (on the order of ±30 %-40 %; Deshler et al., 2003;Kovilakam and Deshler, 2015) and the requirement for ongoing OPC launches to accurately observe both zonal and seasonal variability. The primary limitation of lidar conversions is the challenge of appropriately selecting S. Indeed, Kar et al. (2019) showed that S is both altitude and latitude dependent and varies from 20 to 30 sr, while other reports (Wandinger et al., 1995;Kent et al., 1998) have shown S to go as high as 70 during background conditions. While a lidar ratio of 40-50 sr has been regarded as a satisfactory assumption, S is ultimately uninformed about the atmosphere in which the measurement was recorded, making appropriate selection of S difficult.
On the other hand, Thomason and Osborn (1992) invoked an eigenvector analysis based on SAGE II extinction ratios to convert extinction coefficients to total aerosol mass and backscatter coefficients to enable comparison with lidar observations. This method provided coefficients with uncertainties on the order of ±20 %-30 % and has been used in subsequent studies to convert lidar backscatter to extinction for comparison with SAGE observations (Osborn et al., 1998;Lu et al., 2000;Antuña et al., 2002Antuña et al., , 2003. As this method relied on SAGE-observed extinction coefficients it was more similar to our method than backscatter-to-extinction methods (vide supra) and may be considered a precursor to the present work.
Contrary to previous efforts to compare extinction and backscatter coefficients, the extinction-to-backscatter (EBC) method proposed in this study required relatively basic assumptions about the character of the underlying aerosol. These assumptions include composition, particle shape, and the shape of the size distribution (common assumptions in Mie theory, as further discussed below). While combining Mie theory and extinction measurements to gain insight into the nature of stratospheric aerosol is a common methodology (e.g., Hansen and Matsushima, 1966;Heintzenberg, Jost, 1982;Hitzenberger and Rizzi, 1986;Thomason, 1992;Bingen et al., 2004), the difference in our method is that we make no attempt to infer aerosol properties such as number density or particle size distribution. Instead, we apply Mie theory to infer the relationship between extinction and backscatter and use the range of the solution space of aerosol properties as a bounding box for uncertainty. Fortunately, within the regime of the available observations, this methodology is less sensitive to specific aerosol properties such that we can reasonably convert SAGE extinction to a derived backscatter for comparison with lidar. To this end, we present a method of converting SAGE-observed extinction coefficients to backscatter coefficients for direct comparison with stratospheric lidar observations. This method is presented as an alternative evaluation technique for the SAGE products with the intent of expanding our long-term trend intercomparison opportunities (i.e., to include ground-based lidar as well as the possibility of satellite-borne lidar).

SAGE
The SAGE instruments used in the current study are SAGE II (October 1985-August 2005 and SAGE III on the International Space Station (SAGE III/ISS, June 2017-present, hereafter referred to as SAGE III). The SAGE II instrument and algorithm (v7.0) have been described previously by Mauldin et al. (1985) and Damadeo et al. (2013), respectively. The SAGE III instrument was described by Cisewski et al. (2014), and the algorithm (v5.1) will be the topic of upcoming publications. A brief description will be offered here, but the reader is directed to these publications for details. SAGE II was a seven-channel solar occultation instrument (386,448,452,525,600,935, 1020 nm) that flew on the Earth Radiation Budget Satellite (ERBS) from October 1984 through August 2005. Due to the orbital inclination and the method of observation, SAGE II observations were limited to ≈30 occultations per day, with 3-4 times more observations at midlatitudes than at tropical and high latitudes as seen in Fig. 1a. The standard products included the number density of gas-phase species (O 3 , NO 2 , and H 2 O) and aerosol extinction (385, 453, 525, and 1020 nm) with a vertical resolution of ≈ 1 km (reported every 0.5 km). The SAGE II v7.0 products were used in the current analysis.
SAGE III is a solar-lunar occultation instrument that is docked on the ISS and has a data record beginning in June 2017. The onboard spectrometer is a charge-coupled device with a resolution of 1-2 nm. The spectrometer's spectral range extends from 280 to 1040 nm in addition to a lone In-GaAs photodiode at 1550 nm. Similar to SAGE II, SAGE III has a higher frequency of observations at midlatitudes compared to the tropics and high latitudes (Fig. 1b). The standard products include the number density of gas-phase species for both solar (O 3 , NO 2 , and H 2 O) and lunar (O 3 and NO 3 ) observations, as well as aerosol extinction coefficients (384, 448, 520, 601, 676, 755, 869, 1020, 1543 nm). The vertical resolution is 0.75 km, reported every 0.5 km). The SAGE III v5.1 products (July 2017-September 2019) were used in the current analysis.

Ground lidar
Ground lidar data from three stations were used within this study. To allow intercomparison with both SAGE II and SAGE III, candidate ground stations with a long-duration data record were preferred. Further, data quality is likewise important. The Network for Detection of Atmospheric Composition Change (NDACC, https://www.ndacc.org, last access: 7 August 2020) was founded to observe long-term stratospheric trends by making long-term, high-quality atmospheric measurements. Therefore, stations within this network were selected for comparison. We identified three stations that satisfied the requirements of this analysis: Table  Mountain Facility, Mauna Loa Observatory, and Observatoire de Haute-Provence. A brief description of the instruments and their algorithms is provided below.

Table Mountain Facility
The NASA Jet Propulsion Laboratory (JPL)  (McDermid et al., 1990b, a). The lidar used the third harmonic of a Nd:YAG laser to record elastic backscatter at 355 nm, which was corrected for ozone and NO 2 absorption, and Rayleigh extinction. The corrected backscatter was then used to calculate the aerosol backscatter coefficient from backscatter ratio (BSR) (Northam et al., 1974;Gross et al., 1995). Prior to June 2001 the BSR was calculated using pressure and temperature data from a National Centers for Environmental Protection (NCEP) meteorological model. Since June 2001 the BSR has been computed using the 387 nm channel from a newly installed Raman channel as the purely molecular component in the BSR. For both cases, the BSR was normalized to 1 between 30 and 35 km where it was assumed that the aerosol backscatter contribution was negligible.

Mauna Loa Observatory
The NOAA Mauna Loa Observatory (MLO; 19.5 • N, 155.6 • W; alt. 3.4 km) is located on the Big Island of Hawai'i. The dataset used here comes from the elastic and inelastic (Raman) backscatter channels of the JPL ozone DIAL that began measurements in 1993 Mc-Dermid, 1995). Just like the JPL-TMF system, the lidar used the third harmonic of an Nd:YAG laser to record the elastic backscatter at 355 nm, followed by correction for ozone and NO 2 absorption, as well as Rayleigh extinction. The corrected backscatter was then used to calculate the aerosol backscatter coefficient from the backscatter ratio using the 387 nm channel as the purely molecular component in the BSR as described in Chouza et al. (2020). The BSR was normalized to 1 at a constant altitude of 35 km where it was assumed that the aerosol backscatter contribution was negligible.

Observatoire de Haute-Provence
The Observatoire de Haute-Provence (OHP; 43.9 • N, 5.7 • E; 670 m a.s.l.) is located in southern France and has an elastic backscatter lidar record that began in 1994. The lidar design is based on DIAL ozone measurements that began in 1985. In 1993, the lidar system was updated for improved measurements in the lower stratosphere (Godin-Beekmann et al., 2003;Khaykin et al., 2017). The lidar used the third harmonic of an Nd:YAG laser (355 nm) to record elastic backscatter, followed by inversion using the Fernald-Klett method (Fernald, 1984;Klett, 1985) to provide backscatter and extinction coefficients, assuming an aerosol-free region 4264 T. N. Knepp et al.: Conversion of SAGE extinction to backscatter between 30 and 33 km and a constant lidar ratio of 50 sr. The error estimate for this method is < 10 % (Khaykin et al., 2017).

Methodology
Extinction and backscatter observations cannot be directly compared. In order to evaluate the agreement between backscatter measurements and extinction coefficient measurements, the data types must be converted to a common parameter, thereby requiring a conversion algorithm. As previously mentioned, this is usually done by converting backscatter to extinction coefficients using conversion factors from sources independent of either instrument (e.g., constant lidar ratio). Herein, we derive a process to infer this relationship based on the spectral dependence of SAGE II/III aerosol extinction coefficient measurements and only make basic assumptions on the character of the underlying aerosol. Indeed, this EBC method is proposed to act as a bridge between aerosol extinction and backscatter observations. This bridge is founded upon Mie theory (Kerker, 1969;Hansen and Travis, 1974;Bohren and Huffman, 1983) and invokes the typical assumptions required in Mie theory models: particle shape, composition, and distribution shape and width. Herein we assumed that all particles are spherical, are composed primarily of sulfate (75 % H 2 SO 4 , 25 % H 2 O by mass; Murphy et al., 1998), and that the particle size distribution (PSD) is single-mode lognormal. Refractive index values from Palmer and Williams (1975) were used in the calculations.
Particulate backscatter and extinction efficiency factors (Q sca (λ, r) and Q ext (λ, r), respectively; for derivation of Q(λ, r) see Kerker, 1969, andBohren andHuffman, 1983) were calculated for a series of particle radii (r = [1, 2, . . ., 1500] nm) and incident light wavelengths (λ = [350, 351, . . ., 2000] nm). Subsequently, a series of lognormal distributions (P(r m , σ g ), described by Eq. (1) where σ g is the geometric standard deviation and r m is the mode radius; the median radius of a lognormal distribution is commonly referred to as mode radius in aerosol literature -we adopt this convention here) were calculated for the same family of particles with five distribution widths (σ g = [1.2, 1.4, 1.5, 1.6, 1.8]) that were chosen to cover the range of likely distribution widths (Jäger and Hofmann, 1991;Pueschel et al., 1994;Fussen et al., 2001;Deshler et al., 2003). This was performed for all 1500 radii to calculate a new lognormal distribution as r m took on each value within r. Values for Q sca (λ, r), Q ext (λ, r), and P(r m , σ g ) were then fed into Eqs. (2) and (3) to produce three-dimensional lookup tables (r m ×λ×σ g ) of extinction and backscatter (k(λ, r m , σ g ) and β(λ, r m , σ g ), respectively; hereafter referred to as k and β) coefficients as a function of mode radius, incident light wavelength, and distribution width.
At this point a technical note regarding construction of the lognormal distribution must be made. Construction of a lognormal distribution fails when the mode radius is near the limits of r m (i.e., 1 or 1500 nm), yielding a truncated lognormal distribution. However, the mode radii required for this analysis (i.e., to generate the corresponding SAGE extinction ratios) ranged from ≈ 50 to ≈ 500 nm, well away from these bounds. With the backscatter and extinction lookup tables thus created, we now focus on their utilization in converting from k to β.
Wavelengths were selected based on SAGE extinction channels and available lidar wavelength, and the lookup tables were used to create the plots in Fig. 2. Though this figure only shows data for one combination of extinction and backscatter wavelengths, similar figures were generated for each combination (not shown), with the 520/1020 combination providing the best combination of linearity, atmospheric penetration depth, and wavelength overlap between SAGE II and SAGE III. This figure elucidates the relationship between the inverted lidar ratio (β/k, hereafter referred to as S −1 ), extinction ratio, and distribution width. Indeed, this figure provided the nexus between extinction and backscatter observations and between theory and observation since SAGEobserved extinctions were imported into this model to derive β 355 . To do this, SAGE extinction ratios (k 520 /k 1020 ) were used to define the abscissa value, followed by identifying the ordinate value (S −1 ) according to the line drawn in Fig. 2, followed by multiplication by the SAGE-observed k 1020 . For example, if the observed SAGE extinction ratio was 6, then S −1 ≈ 0.2 when σ g = 1.6, and the SAGE-derived backscatter coefficient (β SAGE ) can be calculated via Eq. (4), where k 1020 is the SAGE extinction product at 1020 nm. It is important to note a departure from convention in how the S −1 values are reported in Fig. 2. The standard convention would require both coefficients to be at the same wavelength. The current methodology requires these coefficients to be at different wavelengths as explained above. This deviation is only made in this conversion step (i.e., when using the data presented in Fig. 2), while subsequent discussions of lidar ratio estimates (e.g., Tables 2 and 3, Figs. 6 and 9, and Sect. 4.1.2) use the conventional lidar ratio definition.
A potential limitation of this method is that, for large particle sizes (extinction ratios < 1 in Fig. 2, corresponding to mode radius of ≈ 500 nm), two solutions for S −1 are possible. Further, for smaller particles sizes (extinction ratios > 6 in Fig. 2, corresponding to mode radius ≈ 50) the solutions rapidly diverge as a function of σ g , making selection of σ g increasingly important. However, SAGE extinction ratios were rarely outside these limits. This is seen in Fig. 3 where probability density functions (PDFs) and cumulative distribution functions (CDFs) of stratospheric extinction ratios were plotted for SAGE II and SAGE III. The stark difference in distribution shape between panels (a) and (b) is due to the SAGE II mission being dominated by volcanic eruptions, while the SAGE III mission, to date, has experienced a relatively quiescent atmosphere. Data in Fig. 3a and c were broken into two periods: (1) when the atmosphere was impacted by the Mt. Pinatubo eruption (1 June 1991-1 January 1998) and (2) periods when the impact of Pinatubo was expected to be less significant. It was observed that the majority of extinction ratios (> 90 %) were between 1 and 6 regardless of Pinatubo's impact. Therefore, we conclude that the majority of SAGE's observations can take advantage of this methodology.
While Fig. 3 shows that most extinction ratios avoid either multiple solutions or significant divergence in solutions due to σ g , it is understood that, due to uncertainty in σ g , there is an associated uncertainty in the derived β 355 . To account for this spread, SAGE-based backscatter coefficients were calculated for both extremes of σ g (i.e., 1.2 and 1.8). These two solutions were plotted in subsequent figures to illustrate this spread. Further discussion of uncertainties associated with the selection of σ g is presented in Sect. 3.2. Figure 2 shows the relationship between extinction ratio and S −1 for one combination of wavelengths. Since SAGE II and SAGE III recorded extinction coefficients at multiple wave- . Extinction ratio PDFs (panels a and b, bin width is 0.01) and CDFs (c, d) for the SAGE II and SAGE III missions. Due to the impact of the Pinatubo eruption, separate histograms were created for the SAGE II panels. Only stratospheric altitude (tropopause +2 km) were used to generate this figure to ensure exclusion of cloud contamination. In the SAGE II panels (a, c), light red indicates data impacted by the Pinatubo eruption, gray indicates data not impacted by Pinatubo, and dark red is the overlapped region of the two. lengths, there were multiple wavelength combinations from which to choose. Under ideal conditions, the β derived using this conversion methodology should be independent of wavelength combination. Indeed, it can be trivially demonstrated that, working strictly within the confines of theory (i.e., no noise or uncertainty), this is the case. However, in reality, the SAGE extinction products were impacted by errors originating in hardware (e.g., instrument noise), retrieval algorithm (e.g., how well gas species were cleared prior to retrieving aerosol extinction), and atmospheric conditions (e.g., impact of clouds). Therefore, the method's consistency was evaluated by calculating β SAGE using three wavelength combinations to form the abscissa in Fig. 2: 385/1020, 450/1020, and 520/1020 (hereafter this calculated β is referred to as β S(385) , β S(450) , and β S(520) ). The target backscatter wavelength was held constant (355 nm) in this evaluation for two reasons:

Internal evaluation of the method
(1) this is the lidar wavelength used at the three ground sites used in this study, and (2) selection of lidar wavelength does not influence the evaluation of the method's consistency.

Comparison of β
To evaluate the robustness of the EBC algorithm, β SAGE was calculated at three wavelength combinations: β S(385) , β S(450) , and β S(520) . Within this evaluation the 520 ratio acted as the reference (i.e., the 385 and 450 nm ratios were compared to the 520 ratio in subsequent statistical analyses). The intent of this comparison was to quantify and qualify the vari-ability between the differing β SAGE products. The following analysis was conducted using zonal statistics (5 • latitude, 2 km altitude bins) that were weighted by the inverse measurement error within the reported SAGE extinction products. These data are presented both graphically (Figs. 4 and 5) and numerically (Table 1).
The zonal weighted coefficient of correlation (R 2 ) and weighted slope of linear regression profiles are presented in Figs. 4a-d and 5a-d for SAGE II and SAGE III, respectively. It was observed that the coefficients of correlation and slopes between the three products were high throughout the profile (R 2 ≥ 0.85 and slope ≥ 0.78; Table 1) and were higher towards the middle of the stratosphere (R 2 >0.95 and slope ≈1). However, at lower and higher altitudes the overall performance was worse. This degradation was driven by several factors: (1) the shorter wavelengths were attenuated higher in the atmosphere due to increasing optical thickness, which led to negligible transmittance through lower sections of the atmosphere; (2) the impact of cloud contamination at lower altitudes; and (3) differences in the higher altitudes were the product of limited aerosol number density (i.e., increased uncertainty due to decreased extinction). To better understand this altitude dependence and identify altitudes at which the conversion method may be most successfully applied, we evaluated a series of altitude-based filtering criteria. A brief discussion of these criteria, and their impact on the statistics in Table 1, will be presented prior to continued discussion of Figs. 4 and 5.
Correlation plots (not shown) were generated for each latitude band and each altitude from 12 to 34 km (2 km wide bins centered every 2 km) with corresponding regression statistics to better understand how the agreement between the backscatter products varied with altitude and latitude and to aid in defining reasonable filtering criteria to mitigate the impact of spurious retrieval products typically seen at lower and higher altitudes. We observed that data collected between 15 and 31 km had higher coefficients of correlation, slopes closer to 1, and a tighter grouping about the 1 : 1 line (i.e., fewer outliers in either direction). From this, we defined the altitude-based filtering criteria to only include data collected within the altitude range 15-31 km.
As an evaluation of how much influence data outside the 15-31 km range had on this analysis an ordinary line of best fit was calculated for each combination of beta values (i.e., β S(385) vs. β S(520) and β S(450) vs. β S520 ) for the SAGE II and SAGE III missions under two conditions: (1) all available data were used, or (2) only data between 15 and 31 km were used. A summary of this evaluation is presented in Table 1 wherein it is observed that when all data throughout the profile were used the mean slope (0.78-1.02) and mean R 2 (0.87-0.98) had broad ranges, as did the corresponding standard deviations. However, when the dataset was limited to 15-31 km (values in parentheses) the range of mean slopes (0.94-0.99) and mean R 2 (0.95-0.98) decreased significantly, as did the corresponding standard deviations. It was observed that when the filtering criteria were in place the standard deviation significantly narrowed, in some cases by more than an order of magnitude.
By considering only the mean slope and mean R 2 values the impact of the filtering criteria is partially masked. The influence of these criteria is better observed by considering the minimum and maximum values for both slope and R 2 and by considering its impact on the 95th percentile (P 95 ). Here, P 95 was calculated using a nontraditional method. For slope, P 95 represents the range over which 95 % of the data fall, centered on the mean. As an example, if the mean slope is 1, how far out from 1 must we go before 95 % of the data are captured? This range is not necessarily symmetrical about the mean since either the minimum or maximum slope may be encountered prior to reaching the 95 % level. On the other hand, P 95 for R 2 indicates the lowest R 2 value required to capture 95 % of the data (R 2 = 1 acting as the upper bound). Indeed, contrasting the full-and filtered-profile P 95 values in Table 1 provides a convincing illustration of the improvement the filtering criteria had on the comparison.
From this evaluation we conclude that data outside 15-31 km significantly influenced the statistics and that the applicability of this conversion method is limited to regions where sufficient signal is received by the SAGE instruments, namely 15-31 km.
Having established an altitude range interval over which the EBC method remains robust we can continue the evaluation of the aggregate statistics as shown in Figs. 4 and 5. To gauge the overall difference between the products, P 95 values for the absolute percent differences are shown in panels (e) and (f). This is an illustrative statistic in that it shows that, for example, 95 % of the time the β S(450) and β S(520) products for SAGE II were within 10 % of each other at 24 km over the Equator. More generally, it is observed that the two products were within ±20 % of each other (all wavelengths for both SAGE II and SAGE III) throughout most of the atmosphere, similar to the R 2 and slope products. Similar to panels (a)-(d), the absolute percent difference has better agreement between the longer-wavelength products (within ±20 % for β S(450) and β S(520) ) and follows a similar contour to that seen in the R 2 (panels a and b) and slopes (panels c and d).
The high R 2 values and slopes are encouraging and we conclude that, throughout most of the lower stratosphere, the calculated backscatter coefficient is independent of SAGE extinction channel selection. It is noted that the performance of β S(385) was limited by comparatively rapid attenuation higher in the atmosphere, thereby limiting applicability of this channel within the EBC algorithm. Further, we suggest that this attenuation was the driving factor in the worse agreement between β S(385) and β S(520) . Conversely, β S(450) showed better agreement with β S(520) throughout most of the lower stratosphere, leaving two viable extinction ratios for calculating β SAGE : 450/1020 and 520/1020 nm. While the 450 nm channel will not be attenuated as high in the at-  mosphere as the 385 nm channel, it will saturate before the 520 nm channel.
In addition to comparing β as a function of extinction wavelength, the algorithm performance can be qualitatively compared between the SAGE II and SAGE III missions. While this comparison is valid, it must be remembered that the SAGE II record extends over a 20-year period, including impacts from the El Chichón (1982) and Pinatubo (1991) eruptions, which significantly influenced atmospheric composition. Conversely, the SAGE III mission is currently in its third year and, to date, has had no opportunity to observe the impact of a major volcanic eruption. On the contrary, current stratospheric conditions have been relatively clean for the past 20 years. The agreement in performance between the two missions is most readily seen by comparing the filtered slope and R 2 statistics in Table 1, wherein it is observed that the differences are statistically insignificant.
From this evaluation we determined that the selection of extinction wavelength combination had a minimal impact on the calculated backscatter products when altitudes are limited to 15-31 km (i.e., each combination of SAGE wavelengths yielded the same backscatter coefficient within the Table 1. Aggregate statistics for line-of-best-fit slope and R 2 for the SAGE II and SAGE III products compared to β S(520) . Values in parentheses were calculated after restricting the dataset to altitudes between 15 and 31 km. The last column was calculated using data from both missions and backscatter values calculated using both β S(385) and β S(385) . The slope P 95 data show the range of slopes (centered on the mean). provided errors). Therefore, we proceed with the current analysis by using the 520/1020 nm combination to convert SAGE-observed extinction coefficients to backscatter coefficients for comparison with lidar-observed backscatter coefficients.

Uncertainties
As with any study that involves modeling PSDs, the dominant sources of uncertainty are in the assumptions of aerosol composition and distribution parameters. Here, the particle number density and mode radius play a minor role. However, as seen in Fig. 2, the selection of σ g has a variable impact. The statistics presented in Sect. 3.1 were calculated using σ g = 1.5 but are not influenced by the selection of σ g since changing the selection of σ g will shift all datasets up or down equally. On the other hand, the accuracy of the method is highly dependent on σ g . As an example, setting σ g = 1.5 leads to a +32/ − 16 % uncertainty (compared to σ = 1.8 and σ = 1.2, respectively) when the extinction ratio equals 6. Since > 90 % of the stratospheric extinction ratios do not exceed 6, we consider +32/ − 16 % to act as a reasonable upper limit of expected uncertainty for this analysis. This uncertainty is depicted in subsequent figures by a shaded region that represents the extinction-ratio-dependent upper and lower bounds for β S(520,σ g =1.2) and β S(520,σ g =1.8) (i.e., for smaller extinction ratios the spread between β S(520,σ g =1.2) and β S(520,σ g =1.8) decreased). It was observed that this spread was negligible at lower altitudes but increased with altitude. Another challenge in comparing SAGE and lidar observations is the differing viewing geometries. The uncertainty introduced by these differing geometries cannot be easily accounted for. However, current versions of the algorithm (Damadeo et al., 2013) and previous studies (Ackerman et al., 1989;Cunnold et al., 1989;Oberbeck et al., 1989;Wang et al., 1989;Antuña et al., 2002;Jäger and Deshler, 2002;Deshler et al., 2003) have taken advantage of the horizontal homogeneity of stratospheric aerosol, which mitigates the impact of differing viewing geometries.

Method application
The EBC method was applied to SAGE II and SAGE III datasets for intercomparison with ground-based lidar products. A discussion of the results of each SAGE mission follows.

SAGE II
The SAGE II record spanned over 20 years and had the benefit of observing the impact of two of the largest volcanic eruptions of the 20th century: recovery from El Chichón in 1982 and the full life cycle of the Mount Pinatubo eruption of 1991, followed by a return to quiescent conditions in the late 1990s. Within this record the extinction and backscatter coefficients spanned nearly 2 orders of magnitude, providing an interesting case study.
SAGE II data were used to estimate β 355 using the 520/1020 extinction ratio (Fig. 2). For this comparison, β SAGE was calculated on a profile-by-profile basis. These profiles were then used to calculate zonal monthly means. Likewise, lidar profiles were averaged on a monthly basis for comparison. The time series, at four altitudes, are presented in Figs. 6, 7, and 8 for Table Mountain, Mauna Loa, and OHP, respectively. The spread in the β SAGE value, due to varying Figure 6. Time series of SAGE II (monthly zonal mean), lidar (monthly mean) backscatter coefficients, and a SAGE-based lidar ratio estimate at 355 nm (monthly zonal mean) over the Table Mountain Facility. The spread in the SAGE-derived backscatter and lidar ratios (both coefficients at same wavelength) represents the range of values due to changing the spread (σ ) of the lognormal distribution. The backscatter upper bound is when σ = 1.8 and the lower bound is for σ = 1.2 (vice versa for the lidar ratio). Error bars represent standard errors. Altitude bins span ±2.5 km. results in solving Eq. (4) for differing values of σ g , is represented by the black shaded time series data. It is noted that, most of the time, this shaded area is indistinguishable from the black line thickness. Error bars in Figs. 6, 7, and 8 represent the standard error (error on mean). We observed that the datasets were in qualitatively good agreement at all altitudes, especially when the atmosphere was impacted by the Pinatubo eruption (June 1991(June -1998. Statistics for the time series data are presented in Table 2. The data were broken into two time periods: (1) when the signal was perturbed by the Pinatubo eruption (labeled PE in the table, June 1991-December 1997), as defined by Deshler et al. (2003) and Deshler et al. (2006), and (2) periods outside the Pinatubo impact classified as background (labeled BG in the table). As seen in Figs. 6, 7, and 8, the return to background conditions was sooner at higher altitudes, which may influence some statistics in Table 2 since the Pinatubo time period classification (i.e., June 1991-December 1997) was applied to all altitudes. Statistics in this table were calculated using SAGE monthly zonal means and lidar monthly mean values at four altitudes. Percent differences were calculated using Eq. (5).

TMO
Data collected at 15 km showed the worst agreement due to atmospheric opacity and cloud contamination as discussed above. Conversely, the agreement was best at 20 and 25 km (percent difference within ≈ 10 %), where the atmosphere was most impacted by the Pinatubo eruption (k 520 increased by ≈ 2 orders of magnitude), followed by an extended, approximately exponential return to background conditions in  the late 1990s (Deshler et al., 2003;Deshler et al., 2006). Beginning in ≈ 1996, the stability of the lidar signal decreased as the amount of aerosol in the atmosphere decreased, with more significant fluctuations appearing immediately prior to the 1991 eruption and later in the record. In contrast to the lidar record, the SAGE record remained smooth throughout except at 30 km where it showed more variability. It was observed that during the Pinatubo time period the coefficients of correlation and line-of-best-fit slopes were higher than during background conditions. This was expected behavior for background conditions for two reasons: (1) in the absence of stratospheric injections the instruments were left to sample the natural stratospheric variability (similar to noise), which limits correlative analysis outside longduration climatological trend studies, and (2) the limited dynamic range of the observations essentially provides a correlation between two parallel lines. Overall, the percent differences for TMO show the two techniques to be in good agreement, with worse agreement occurring at 15 km, which was expected due to cloud contamination.

Observatoire de Haute-Provence
Unlike TMO, the OHP lidar record did not start until ≈ 2.5 years into the Pinatubo recovery (similar to MLO). How-ever, SAGE II recorded significantly more profiles over this latitude than over MLO, leading to a better representation of the zonal aerosol loading throughout the month. The increased differences at 15 and 30 km were expected, as discussed above. However, we did not anticipate the large difference at 25 km when the atmosphere was impacted by Pinatubo (−29.44 %). After further analysis it was determined that this difference was driven by a single 2.5-year time period that straddled both the Pinatubo time period and the beginning of the quiescent period (June 1996-January 1999. During this time the two records were consistently in substantial disagreement. This disagreement can be seen visually in Fig. 7. Removing data from this time period reduced the percent difference to −16.87 % (percent difference during background conditions was reduced to +2.90 %). In an attempt to identify the source of this discrepancy we repeated the analysis under different longitude criteria (e.g., instead of doing zonal means we used only SAGE profiles collected within 5, 10, and 20 • longitude), weekly means instead of monthly, and adjusted the temporal coincident criteria. The intention of this analysis was to determine whether variability that was local to OHP was driving the differences. However, we were unable to identify any such local variability and we currently cannot account for this anomaly within the time series.
In addition to β, the OHP data record contained a lidar ratio time series, thereby allowing comparison with the lidar ratio derived from the EBC algorithm. Percent differences for the lidar ratio comparison are presented in Table 2. The slope and R 2 values were not reported for S because the OHP S value was held static for extended periods of time, making these statistics meaningless. However, the relative difference retains meaning, and we observe that the percent difference between S values was consistently within 20 %. Changes in the lidar ratio due to changing aerosol mode radii throughout the recovery time period were in agreement with what is expected due to a major volcanic eruption. Indeed, by the end of the SAGE II mission S had recovered from the El Chichón and Pinatubo eruptions to a value of ≈ 50-60 at all altitudes as supported by the SAGE-derived S and the estimate used in the lidar retrievals.

Mauna Loa
Similar to OHP, the MLO record did not begin until ≈ 2.5 years into the Pinatubo recovery. Beginning in June 1995 the two datasets began to diverge at 20 km (Fig. 8), with the lidar record flattening out. In contrast, β SAGE continued with a quasi-exponential decay until January 1998, in agreement with the other two sites and previously published studies (e.g., Deshler et al., 2003;Thomason et al., 2018). In January 1998 the lidar signal experienced an anomaly wherein the signal decreased by approximately an order of magnitude. After this time, β SAGE was consistently larger than β Lidar . The discrepancy from June 1995 to January 1998 at 20 km is currently not understood. However, the sudden change in January 1998 coincides with a new lidar instrument setup.
The statistics in Table 2 show the MLO comparison to be the worst of the three stations (excluding the −29.44 % difference at 25 km over OHP; conversely, the MLO percent difference at 25 km was relatively small). In addition to the anomalous behavior between 1995 and 1998 the SAGE II instrument experienced relatively few overpasses over Mauna Loa's latitude (19.5 • N) as seen in Fig. 1a. Therefore, we suggest that the poor agreement between the two instruments may have been driven by inadequate sampling by SAGE.

SAGE III/ISS
To date, the SAGE III mission has made observations under relatively clean stratospheric conditions similar to conditions at the end of the SAGE II mission. Due to the limited data record (3 years since launch), the comparison between SAGE III and the Mauna Loa and OHP lidars will be cursory. Data from the Table Mountain Facility have not been released for this time period; therefore, Table Mountain was excluded from the current analysis.
The SAGE III and lidar backscatter coefficients show similar qualitative agreement at both Mauna Loa and OHP (Figs. 9 and 10, respectively), similar to what was observed in the SAGE II comparison (vide supra). During the SAGE III mission the atmosphere has been relatively stable, with a minor increase in backscatter and aerosol extinction in late 2017 due to a significant pyrocumulonimbus (pyroCB, indicated by the vertical line in the figures) event in northwestern Canada, which was comparable to a moderate volcanic eruption (Peterson et al., 2018). Smoke from the py-roCB was clearly visible over midlatitude sites like OHP (Khaykin et al., 2017), while there was no clear evidence of significant aerosol loading at low latitudes (i.e., over Mauna Loa). However, Chouza et al. (2020) showed a small increase in stratospheric aerosol optical depth over Mauna Loa during this period.
Similar to the end of the SAGE II record, calculation of a meaningful R 2 value is likewise challenging when the variability is small. Further, getting good agreement between extremely low β values is challenging since small fluctuations have a disproportionate impact on the overall percent difference (see Table 3). However, this may be indicative of two possible conclusions: (1) the EBC method has limited applicability to background conditions, and (2) the precision and accuracy of SAGE III or the ground-based lidar are too limited to make meaningful measurements during background conditions. The validity of option 1 can be challenged with the SAGE II record (compare to Table 2) wherein it was shown that the background percent differences were generally better than during the Pinatubo period. Therefore, it would seem that the EBC method is applicable to quiescent periods. Considering the precision of the SAGE instruments  and the number of lidar validation and intercomparison campaigns, the possibility of option 2 being valid seems unlikely.

Overall impression
For the SAGE II instrument the derived β SAGE products generally had high coefficients of correlation and slopes near 1 when compared with the lidar-derived products, especially in the 20-25 km range during background conditions. While agreement was consistently good in the 20 and 25 km bins (within 5 %), the agreements consistently diverged in the 15 and 30 km bins for both the Pinatubo and background time periods. The divergence at 15 km is likely attributable to optical depth and cloud contamination, but the divergence at 30 km is not as easily explained. Indeed, it may be partly caused by a lack of return signal in the lidar and lack of optical depth for SAGE (though this is generally not a challenge for SAGE instruments at this altitude). We note that this divergence was modest (± ≈ 20 %) over TMO and MLO, where β was calculated using the BSR technique. However, the divergence was significantly larger over OHP where β was calculated via the Fernald-Klett method, the lidar ratio was held constant, and the atmosphere is considered aerosolfree from 30 to 33 km. We suggest that this highlights the sensitivity of Fernald-Klett to atmospheric variability and the need to make an informed selection of lidar ratio. Perhaps the most striking feature of this analysis is how well the SAGE-derived backscatter coefficient agreed with the lidar record during the early stages of the Pinatubo eruption (Fig. 6) when particle shape and composition deviated significantly from our initial assumptions (Sheridan et al., 1992). This seems to indicate that using an extinction ratio may compensate for mischaracterization of size and composition assumptions within our model. However, further evaluation involving major volcanic eruptions is required to better understand whether this agreement is fortuitous or the EBC algorithm is actually insensitive to aerosol composition and shape.
The calculated S at each site was in good agreement with values calculated by Jäger et al. (1995). Immediately prior to the eruption S was approximately 40-45 for the lowermost altitudes (tropopause-20 km) and slightly higher (50-60) in the 25-30 km altitudes. This was followed by a quick decrease after the eruption of Pinatubo, down to values of 20 in the Jäger dataset, with our calculated value being slightly lower. Overall, the calculated S shows good agreement with the Jäger dataset in both magnitude and trend with altitude. Other studies that did not overlap with either SAGE II or SAGE III have shown similar S values to those calculated here Painemal et al., 2019).

Conclusions
A method of converting SAGE extinction ratios to backscatter coefficient (β) profiles was presented. The method invoked Mie theory as the conduit from extinction to backscatter space and required assumptions on particle shape (spherical), composition (75 % water, 25 % sulfuric acid), and distribution shape (single-mode lognormal with distribution width σ of 1.5). The general behavior of the model as a function of σ was briefly considered (Fig. 2 and Sect. 3). It was demonstrated that, due to improper selection of σ , the corresponding β value could be off by up to +32/ − 16 % when the extinction ratio exceeds 6, but > 90 % of the SAGE II and SAGE III records had extinction ratios < 6.
A major finding of this research was the demonstration of the robustness of the conversion method. It was shown that, within the specified error bars, the calculation of β was independent of SAGE wavelength combination (Sect. 3.1). Further, we showed that when altitude was limited to 15-31 km the robustness improved significantly (Table 1). Therefore, we recommend limiting the use of this conversion method to this altitude range unless appropriate modifications can be made to improve the consistency of its performance at higher or lower altitudes. Such improvements may include cloud screening at low altitudes and appropriate adjustment of size distribution parameters at higher altitudes.
The robustness of the conversion method provides an indirect validation of the SAGE aerosol spectra. If the EBC method were wavelength dependent, this would indicate a substantial error in the standard aerosol products. However, our evaluation showed that the EBC is not wavelength dependent, thereby lending credence to the SAGE aerosol product wavelength assignment.
It was shown that, overall, the SAGE II-derived β product was in good agreement with the lidar data during both background (percent difference within ≈ 10 %) and elevated (percent difference within ≈ 10-20 % depending on location) conditions. Indeed, we showed that this agreement was altitude dependent, with better agreement in the middle stratosphere and worse agreement at lower (15 km) and upper (30 km) altitudes. The reason for this divergence was attributed to increased optical depth and cloud contamination at low altitudes and decreased aerosol load at higher altitudes. The lack of optical depth at high altitudes is less of a problem for SAGE than for the lidar. This is fundamentally due to the differing viewing geometries: SAGE retains a long observation path length at 30 km, while the lidar instrument relies on few photons being backscattered at 30 km. Further, all scattered photons must re-pass through the most dense portion of the atmosphere (without being absorbed or scattered) prior to impinging on the lidar detector. This limitation is most readily observed by considering how the noise and variance increase with altitude in a lidar profile.
For the SAGE III analysis only OHP and MLO were available for comparison. The SAGE III-derived β product showed worse agreement than the SAGE II data. The lower R 2 values were attributed to a lack of variability within the data records (e.g., R 2 of parallel straight lines is 0). However, the larger percent differences may have been due to the magnitude of the backscatter values (e.g., small differences such as 2 × 10 −10 for small numbers such as 1 × 10 −10 yield large percent differences -here, 100 %). Another consideration is that the SAGE III record, to date, is short compared to SAGE II, and the lidar coverage within the SAGE III time period is approximately 1 year, further limiting the intercomparison. As the record expands (possibly including observations of moderate to major volcanic events) we expect the comparison with the lidar data to improve.
A potential application of this method is informing lidar ratio (S) selection for lidar observations. As an example, processing for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) lidar currently assumes a static lidar ratio (50 sr) for all latitudes and all altitudes. As was recently shown by Kar et al. (2019), CALIPSO extinction products have an altitudinal and latitudinal dependence compared to SAGE III. Providing better S values may improve this agreement and may be beneficial in processing CALIPSO β products as well.
Another application of this method may be providing global backscatter profiles independent of a space-based lidar such as CALIPSO. While we do not suggest that SAGE-derived backscatter coefficients can replace lidar observations, our product may be a viable alternative. With CALIPSO scheduled for decommissioning no later than 2023 (Mark Vaughan, personal communication, 2020) and no replacement scheduled for flight prior to its decommissioning date, the SAGE III backscatter product may provide a necessary link between CALIPSO and the next space-based lidar to ensure continuity of the record and provide a method of evaluating the performance of the next-generation orbiting lidar in the context of the SAGE III record and, by association, CALISPO.
Author contributions. TNK and LT developed the methodology, while TNK carried out the analysis, wrote the analysis code, and wrote the paper. TL and FC provided lidar data collected at TMF and MLO and assisted in the description of this data product in the paper. SK and SGB provided lidar data collected over OHP and assisted in the description of this data product in the paper. MR, RD, KL, and DF participated in scientific discussions and provided guidance throughout the study. All authors reviewed the paper during the preparation process.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "New developments in atmospheric Limb measurements: Instruments, Methods and science applications". It is a result of the 10th international limb workshop, Greifswald, Germany, 4-7 June 2019.