Characterization of Odin-OSIRIS ozone profiles with the SAGE II dataset

The Optical Spectrograph and InfraRed Imaging System (OSIRIS) on board the Odin spacecraft has been taking limb-scattered measurements of ozone number density profiles from 2001 to the present. The Stratospheric Aerosol and Gas Experiment II (SAGE II) took solar occultation measurements of ozone number densities from 1984–2005 and has been used in many studies of long-term ozone trends. We present the characterization of OSIRIS SaskMART v5.0× against the new SAGE II v7.00 ozone profiles for 2001–2005, the period over which these two missions overlapped. This information can be used to merge OSIRIS with SAGE II into a single ozone record from 1984 to the present, if other satellite ozone measurements are included to account for gaps in the OSIRIS dataset in the winter hemisphere. Coincident measurement pairs were selected for ±1 h, ±1° latitude, and ±500 km. The absolute value of the resulting mean relative difference profile is < 5 % for 13.5–54.5 km and < 3 % for 24.5–53.5 km. Correlation coefficients R > 0.9 were calculated for 13.5–49.5 km, demonstrating excellent overall agreement between the two datasets. Coincidence criteria were then relaxed to maximize the number of measurement pairs and the range of conditions under which measurements were taken. With the broad coincidence criteria, good agreement (< 5 %) was observed under most conditions for 20.5–40.5 km. However, mean relative differences do exceed 5 % for several cases. Above 50 km, differences between OSIRIS and SAGE II are partly attributed to the diurnal variation of ozone. OSIRIS data are biased high compared with SAGE II at 22.5 km, particularly at high latitudes. Dynamical coincidence criteria, using derived meteorological products, were also tested and yielded similar overall results, with slight improvements to the correlation at high latitudes. The OSIRIS optics temperature is low (< 16 °C) during May–July, when the satellite enters the Earth's shadow for part of its orbit. During this period, OSIRIS measurements are biased low by 5–12 % for 27.5–38.5 km. Biases between OSIRIS ascending node (northward equatorial crossing time ∼ 18:00 LT, local time) and descending node (southward equatorial crossing time ∼ 06:00 LT) measurements are also noted under some conditions. This work demonstrates that OSIRIS and SAGE II have excellent overall agreement and characterizes the biases between these datasets.


Introduction
Continuous and consistent long-term atmospheric datasets are essential for the assessment of ozone recovery. The Optical Spectrograph and InfraRed Imaging System (OSIRIS) satellite instrument has been measuring ozone profiles from 2001 to the present, yielding a consistent dataset that spans 11 yr, a full solar cycle. These limb-scattered measurements have a vertical resolution that is comparable to solar occultation measurements, but with far better global coverage.

C. Adams et al.: Characterization of Odin-OSIRIS ozone profiles with the SAGE II dataset
While good agreement between OSIRIS ozone profiles and some of those other datasets has been demonstrated (e.g., Degenstein et al., 2009; Dupuy et al., 2009), the OSIRIS profiles have not yet been fully characterized through intercomparisons.
OSIRIS and SAGE II ozone datasets are both included in the SI2N (SPARC: Stratospheric Processes and their Role in Climate; IO3C: International Ozone Commission; IGACO-O3: Integrated Global Atmospheric Chemistry Observations; NDACC: Network for the Detection of Atmospheric Composition Change) initiative, which aims to compile short-term satellite, long-term satellite, and ground-based ozone measurements in a consistent manner (SI2N, 2012). Short-term efforts include the European Space Agency's (ESA) Climate Change Initiative (CCI), which aims to compile the comprehensive observations that are necessary to characterize the global climate system and its variability. These datasets must be validated, have well-characterized errors, and be distributed in data formats that are usable for a wide range of users (Bennett, 2012). In order to produce a multi-decadal ozone record, SI2N plans to merge modern ozone measurements with the 1979-2005 SAGE I and SAGE II datasets. In order to merge datasets spatially and temporally, they must be very well characterized, and biases must be well documented.
We present the characterization of OSIRIS ozone profiles against SAGE II measurements from 2001-2005 and demonstrate that OSIRIS ozone data have the potential to be combined with the SAGE II dataset. Section 2 gives an overview of the OSIRIS and SAGE II satellite instruments and ozone datasets. In Sect. 3, the intercomparison methodology is presented. The results of the satellite intercomparisons are discussed in Sect. 4 and conclusions are given in Sect. 5.

OSIRIS and SAGE II ozone profiles
The Canadian-made OSIRIS instrument, aboard the Swedish satellite Odin, was launched into a sun-synchronous orbit on 20 February 2001 (Murtagh et al., 2002; Llewellyn et al., 2004). The optical spectrograph measures limb-scattered sunlight at 280-810 nm with a spectral resolution of approximately 1 nm. Odin has a polar orbit with a 96 min period that stays very near local dusk on the ascending track (northward equatorial crossing at ∼ 18:00 LT, local time) and near local dawn on the descending track (southward equatorial crossing at ∼ 06:00 LT), going through local midnight near the south pole and local noon near the north pole. This process repeats itself every orbit, with a slow precession in the local time of the ascending node throughout the lifetime of the mission. The orbit provides measurement coverage from ∼ 82.2° S to ∼ 82.2° N latitude. Note that due to the precession in the local time of the ascending node, OSIRIS coverage is improving in time. Due to Odin's orbit, measurements are only taken in the summer hemisphere, with coverage in both hemispheres in the spring and fall. A review of the first decade of OSIRIS measurements is given by McLinden et al. (2012).
The OSIRIS SaskMART v5.0× ozone data are used in this study. The Multiplicative Algebraic Reconstruction Technique (MART) retrieval algorithm (Roth et al., 2007; Degenstein et al., 2009) combines ozone absorption information in both the UV and visible parts of the spectrum to retrieve number density profiles from the cloud tops to 60 km (down to a minimum of 10 km in the absence of clouds). Radiative transfer is calculated using the SASKTRAN model (Bourassa et al., 2008b). Aerosol and NO2 are retrieved at the same time as ozone to reduce biases (Bourassa et al., 2007, 2008a, 2011). The SaskMART v5.0× ozone dataset has a vertical resolution of ∼ 2 km and an estimated precision of 3-4 % in the middle stratosphere (Bourassa et al., 2012). Prior to data distribution, OSIRIS data are screened using the methods described in Appendix A. Furthermore, data were screened for polar stratospheric clouds at Southern Hemisphere high latitudes. For 60-90° S, if the aerosol extinction exceeded 0.0005 km⁻¹ at a given altitude, ozone data were removed from 2 km above the given altitude down to the bottom of the profile. This screening technique was selected based on comparisons between OSIRIS ascending and descending node measurements and against ozonesonde measurements. Note that similar screening was attempted at other latitudes, but removed many profiles and did not significantly improve comparison results.
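The polar stratospheric cloud screening rule described above can be illustrated with a minimal Python sketch. This is not the OSIRIS processing code: the function name `screen_psc`, the list-based profile layout, the use of `None` for removed levels, and the choice to treat the 2 km margin as inclusive are all assumptions made for illustration. Only the 0.0005 km⁻¹ threshold, the 2 km margin, and the 60-90° S latitude range come from the text.

```python
from typing import List, Optional

PSC_EXTINCTION_THRESHOLD = 0.0005  # km^-1, threshold quoted in the text
PSC_MARGIN_KM = 2.0                # ozone is removed up to 2 km above a flagged level


def screen_psc(
    altitudes_km: List[float],
    ozone: List[Optional[float]],
    extinction: List[Optional[float]],
    latitude_deg: float,
) -> List[Optional[float]]:
    """Return a copy of `ozone` with PSC-affected levels set to None.

    Hypothetical helper: the data layout is an assumption, not the
    format used by the OSIRIS team.
    """
    screened = list(ozone)
    # The screening is applied only for 60-90 deg S.
    if not (-90.0 <= latitude_deg <= -60.0):
        return screened
    flagged = [
        z for z, ext in zip(altitudes_km, extinction)
        if ext is not None and ext > PSC_EXTINCTION_THRESHOLD
    ]
    if not flagged:
        return screened
    # Removing data below each flagged level is equivalent to removing
    # everything at or below (highest flagged level + margin).
    cutoff = max(flagged) + PSC_MARGIN_KM
    return [None if z <= cutoff else o for z, o in zip(altitudes_km, screened)]
```

With a single flagged level at 16.5 km, for example, this sketch removes all ozone values at or below 18.5 km and leaves the levels above untouched.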
SAGE II was launched in October 1984 aboard the Earth Radiation Budget Satellite (McCormick, 1987) and was in operation until late 2005. From 1984-2000, SAGE II took about 15 sunrise and 15 sunset solar occultation measurements per day, covering 80° S to 80° N in latitude. Due to the degradation of the charging system, in 2000 the SAGE II sampling was reduced to ∼ 15 measurements per day, with the observation cycle focused on either sunrise or sunset measurements. At reduced sampling, quasi-global coverage was achieved on a monthly basis. Successive measurements were separated by approximately 24° in longitude and a fraction of a degree in latitude (McCormick et al., 1989). SAGE II had measurement channels centered at 385, 448, 453, 525, 600, 940, and 1020 nm. SAGE II v7.00 ozone data, which were released in November 2012, are used in this study. Ozone slant columns are inverted to a 1 km vertical resolution and placed onto a 0.5 km altitude grid using an onion peeling technique instead of the Twomey-Chahine technique used in earlier versions (Chu et al., 1989). The SAGE II v7.00 data have slightly smaller ozone number densities than the v6.2 data, typically on the order of 1-2 %, due to the adoption of the ozone spectroscopy of SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY; Bogumil et al., 2003). Prior to comparisons with OSIRIS, SAGE II data were screened according to the recommendations of Wang et al. (2002). Furthermore, if an error value greater than or equal to 200 % was found at a given altitude level below 30 km, data at this level and all levels below were excluded from the comparisons.
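The additional SAGE II error screen (the last step above) amounts to a simple cutoff rule, sketched here in Python. The function name `screen_sage_errors` and the list-based layout are hypothetical; the recommendations of Wang et al. (2002) are not reproduced. Only the 200 % error limit and the 30 km ceiling are taken from the text.

```python
from typing import List, Optional

MAX_ERROR_PERCENT = 200.0  # error limit quoted in the text
CEILING_KM = 30.0          # the rule is triggered only below this altitude


def screen_sage_errors(
    altitudes_km: List[float],
    ozone: List[Optional[float]],
    error_percent: List[Optional[float]],
) -> List[Optional[float]]:
    """Set ozone to None at and below any level under 30 km with error >= 200 %.

    Hypothetical helper for illustration; the profile layout is an assumption.
    """
    bad_levels = [
        z for z, err in zip(altitudes_km, error_percent)
        if z < CEILING_KM and err is not None and err >= MAX_ERROR_PERCENT
    ]
    if not bad_levels:
        return list(ozone)
    # Excluding "this level and all levels below" for every bad level is
    # the same as excluding everything at or below the highest bad level.
    cutoff = max(bad_levels)
    return [None if z <= cutoff else o for z, o in zip(altitudes_km, ozone)]
```

Large errors at or above 30 km do not trigger the rule in this sketch, which follows the wording of the criterion.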

Coincidence criteria & comparison methodology
Figure 1 shows the latitudes of OSIRIS and SAGE II measurements for 2001-2005, the period during which both OSIRIS and SAGE II were operational. OSIRIS and SAGE II coordinates are given for the 25 km and 30 km tangent heights, respectively. OSIRIS limb measurements have ground-tracks of ∼ 540 km (∼ 330 km) for up-scan (down-scan) measurements. SAGE II ground-tracks are also ∼ 300-500 km long. The distribution of SAGE II satellite sunrise and sunset occultations varies throughout the year. For 96 % of SAGE II measurements for 2001-2005, the satellite sunrise (sunset) occultations are taken during local sunrise (sunset). OSIRIS has excellent coverage in the summer hemisphere and overlaps with many SAGE II occultations.
Coincident measurement pairs were selected for the three sets of criteria given in Table 1. The narrow coincidence criteria were used to select measurement pairs that sampled very similar air masses. The ±1 h time criterion reduces the impact of the diurnal variation of ozone at higher altitudes (see Sect. 4.2), while the ±500 km distance criterion corresponds to the approximate horizontal distance covered by an individual OSIRIS or SAGE II measurement. In order to increase the number of coincident measurements for the investigation of possible biases, broad coincidence criteria of ±24 h and ±1000 km were used. Note that the broad coincidence criteria allow morning and evening measurements to be compared against one another. For both the narrow and broad criteria, a ±1° latitude criterion was also imposed.
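As an illustration of how the narrow and broad criteria could be applied, the following Python sketch tests every OSIRIS/SAGE II pair against time, latitude, and distance limits. The `Profile` and `Criteria` classes, the haversine great-circle distance, the spherical Earth radius, and the exhaustive pairing are assumptions for this example. The text does not state how distance was computed or whether each measurement could appear in more than one pair.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical Earth is assumed


@dataclass(frozen=True)
class Criteria:
    max_dt_hours: float
    max_dlat_deg: float
    max_dist_km: float


# Limits quoted in the text for the narrow and broad criteria.
NARROW = Criteria(max_dt_hours=1.0, max_dlat_deg=1.0, max_dist_km=500.0)
BROAD = Criteria(max_dt_hours=24.0, max_dlat_deg=1.0, max_dist_km=1000.0)


@dataclass(frozen=True)
class Profile:
    time_hours: float  # hours since an arbitrary common epoch
    lat_deg: float     # tangent-point latitude
    lon_deg: float     # tangent-point longitude


def great_circle_km(a: Profile, b: Profile) -> float:
    """Haversine distance between two tangent points."""
    lat1, lat2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dlat = lat2 - lat1
    dlon = math.radians(b.lon_deg - a.lon_deg)
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def is_coincident(a: Profile, b: Profile, c: Criteria) -> bool:
    """True if the pair satisfies all three limits."""
    return (
        abs(a.time_hours - b.time_hours) <= c.max_dt_hours
        and abs(a.lat_deg - b.lat_deg) <= c.max_dlat_deg
        and great_circle_km(a, b) <= c.max_dist_km
    )


def find_pairs(osiris: List[Profile], sage: List[Profile], c: Criteria) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of coincident OSIRIS and SAGE II profiles."""
    return [
        (i, j)
        for i, o in enumerate(osiris)
        for j, s in enumerate(sage)
        if is_coincident(o, s, c)
    ]
```

The exhaustive double loop is adequate for a sketch; for the full 2001-2005 record a time-sorted search would be a more practical choice.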
In order to determine the impact of mismatched air masses on the agreement between the measurements for the broad coincidence criteria, dynamical coincidence criteria were also tested. Derived meteorological products (DMPs) (Manney et al., 2007) were calculated along the line-of-sight of the SAGE II measurements and directly above the OSIRIS tangent point, using meteorological fields from the UK Met Office stratosphere-troposphere data assimilation system. Measurements at a given altitude were considered coincident only if both measurements were in the stratosphere or both measurements were in the troposphere. In the stratosphere, additional criteria were used based on scaled potential vorticity (sPV) and stratospheric temperature. sPV values of 1.2 × 10⁻⁴ s⁻¹ and 1.6 × 10⁻⁴ s⁻¹ can be used to estimate the outer and inner vortex edges, respectively (e.g., Manney et al., 2008), as they typically bound the region of strongest PV gradients. Both measurements were required to either be inside the vortex (sPV > 1.6 × 10⁻⁴ s⁻¹) or outside the vortex (sPV < 1.2 × 10⁻⁴ s⁻¹) at each altitude level. Measurements taken on the vortex edge (sPV between 1.2 × 10⁻⁴ s⁻¹ and 1.6 × 10⁻⁴ s⁻¹) were not included in the comparisons. Furthermore, a temperature coincidence criterion of ±10 K was imposed at each layer in the stratosphere to account for the temperature dependence of ozone chemistry.
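The per-level dynamical test described above can be sketched as follows. The function names and the boolean stratosphere flags are hypothetical, and taking the magnitude of sPV (so that one threshold pair serves both hemispheres) is an assumption that the text does not confirm. The sPV thresholds and the ±10 K limit are the values quoted above.

```python
from typing import Optional

SPV_OUTER_EDGE = 1.2e-4  # s^-1, outer vortex edge quoted in the text
SPV_INNER_EDGE = 1.6e-4  # s^-1, inner vortex edge quoted in the text
MAX_TEMP_DIFF_K = 10.0   # temperature criterion quoted in the text


def vortex_region(spv: float) -> Optional[str]:
    """Classify a level as 'inside' or 'outside' the vortex; None on the edge.

    Assumption: the magnitude of sPV is used, so the same thresholds
    apply in both hemispheres.
    """
    magnitude = abs(spv)
    if magnitude > SPV_INNER_EDGE:
        return "inside"
    if magnitude < SPV_OUTER_EDGE:
        return "outside"
    return None


def dynamically_coincident(
    spv_a: float,
    spv_b: float,
    temp_a_k: float,
    temp_b_k: float,
    in_strat_a: bool,
    in_strat_b: bool,
) -> bool:
    """Apply the dynamical criteria to one altitude level of a measurement pair."""
    # Both measurements must be on the same side of the tropopause.
    if in_strat_a != in_strat_b:
        return False
    # No further criteria are described for tropospheric levels.
    if not in_strat_a:
        return True
    region_a, region_b = vortex_region(spv_a), vortex_region(spv_b)
    if region_a is None or region_b is None or region_a != region_b:
        return False
    return abs(temp_a_k - temp_b_k) <= MAX_TEMP_DIFF_K
```

Because the test is applied level by level, a measurement pair can contribute at some altitudes and not at others, consistent with the altitude-dependent number of coincidences reported in the results.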
Ozone number density profiles were compared on the OSIRIS altitude grid, which is regularly spaced at 1 km intervals. Since SAGE II and OSIRIS retrieve the same fundamental quantity, number density as a function of altitude, there was no need to convert ozone units or vertical coordinates. SAGE II profiles were smoothed with a triangular filter to a 2 km resolution to match the approximate vertical resolution of OSIRIS. The smoothing of the SAGE II data had < 1 % influence (in absolute difference) on both the mean relative difference between all coincident profiles and the standard deviation in the mean relative difference for 21.5-39.5 km. At high and low altitudes, small improvements to the standard deviation (∼ 1-4 %) were observed when the data were smoothed. The mean relative difference, Δ_rel, at a given altitude, z, between sets of coincident OSIRIS (M_1) and SAGE II (M_2) measurements was calculated (Eq. 1), along with the standard deviation of Δ_rel.
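For readers who want to reproduce this kind of statistic, a minimal Python sketch of a mean relative difference at a single altitude is given below. Equation (1) did not survive in this copy of the text, so the form used here, 100 % × (M_1 − M_2) divided by the mean of M_1 and M_2 and averaged over all pairs, is an assumption. The authors' definition may use a different denominator (for example M_2 alone). The function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Optional


def relative_differences(
    m1: List[Optional[float]], m2: List[Optional[float]]
) -> List[float]:
    """Per-pair relative differences (%) at one altitude.

    Assumed form: 100 * (M1 - M2) / ((M1 + M2) / 2). The denominator is an
    assumption, not the paper's Eq. (1).
    """
    diffs = []
    for a, b in zip(m1, m2):
        if a is None or b is None:
            continue  # skip pairs with a missing or screened value
        denom = 0.5 * (a + b)
        if denom == 0:
            continue  # avoid division by zero
        diffs.append(100.0 * (a - b) / denom)
    return diffs


def summarize(
    m1: List[Optional[float]], m2: List[Optional[float]]
) -> Optional[Dict[str, float]]:
    """Mean, sample standard deviation, and standard error of the differences."""
    diffs = relative_differences(m1, m2)
    n = len(diffs)
    if n == 0:
        return None
    sd = stdev(diffs) if n > 1 else 0.0
    return {"n": n, "mean": mean(diffs), "std": sd, "stderr": sd / math.sqrt(n)}
```

Here `m1` would hold the OSIRIS number densities and `m2` the smoothed SAGE II number densities at one altitude, one entry per coincident pair. The `stderr` entry corresponds to the standard error σ/√N.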

Results
Figure 2 shows a time series of OSIRIS and SAGE II ozone data for 40-45° N at 35.5 and 25.5 km. This time series is included for qualitative purposes only and demonstrates that good agreement is observed between the two datasets. The time-dependent variability of each dataset is similar and there are no obvious biases, illustrating the potential for using OSIRIS ozone profiles to continue the long-term SAGE II record. Furthermore, the SAGE II and OSIRIS datasets complement one another during their period of overlap, with excellent summertime coverage from OSIRIS and winter coverage from SAGE II. OSIRIS does not measure ozone in the winter hemisphere. Therefore, to produce a merged long-term time series with global coverage, other current satellite datasets would be required to complement the OSIRIS measurements. There are periods during which some disagreement is visually evident. However, some differences are expected as the instruments do not uniformly sample the latitude band: OSIRIS measurements are skewed toward 40° N, while SAGE II measurements are skewed toward 45° N. Care has been taken to minimize the impact of sampling biases for coincident measurement pairs, as discussed in Sect. 3. Throughout most of the time series, SAGE II and OSIRIS appear to measure similar variability in ozone. During some periods, the SAGE II data appear more compact because measurements are taken only during a 1-2 day window, while OSIRIS measurements over several days cannot be distinguished in the figure. However, the larger variability observed by OSIRIS at 35 km in May-August cannot be explained by differences in spatial or temporal sampling. The results shown in Fig. 2 are typical and consistent with time series comparisons at other altitudes and latitudes covered by both SAGE II and OSIRIS.
Figure 3 shows the overall agreement between OSIRIS and SAGE II measurements selected with the narrow coincidence criteria. There are 238 profiles meeting these coincidence criteria, with fewer valid measurements at lower and higher altitudes. The OSIRIS and SAGE II mean ozone number density profiles and standard deviations are also shown. Standard deviations from both instruments are much larger than the reported measurement errors and are similar to one another, indicating that OSIRIS and SAGE II coincidences have sampled similar large-scale seasonal and latitudinal variability in air masses.
The mean relative difference (see Eq. 1) and standard deviation are given for OSIRIS minus SAGE II. The absolute value of the mean relative difference is < 5 % for 13.5-54.5 km and < 3 % for 24.5-53.5 km, demonstrating excellent agreement. Below 13.5 km, the differences are larger, likely due to the inclusion of measurements taken below the tropopause, as discussed in Sect. 4.1. A small positive bias is observed in OSIRIS measurements at 22.5 km. This bias coincides with the altitude at which UV and visible wavelengths are merged together in the OSIRIS ozone retrievals (Degenstein et al., 2009) and could, therefore, be caused by difficulties with the merging process, although this is not evident in the convergence of the retrieval. This bias also coincides with the peak in the sensitivity of limb-scattered sunlight to aerosols at 600 nm (Fig. 1 of Bourassa et al., 2007), which could affect the retrievals at visible wavelengths. Therefore, small errors in aerosol or albedo could cause this bias. The standard deviation in the mean relative difference is ∼ 6 % in the middle stratosphere, which is within the range of the combined precision of the OSIRIS (3-4 %) and SAGE II (4 %) measurements. Correlation coefficients R > 0.9 were calculated for 13.5-49.5 km and R > 0.85 for 11.5-51.5 km, indicating strong correlation between the two datasets.
In order to merge datasets, possible biases between them must be assessed under a variety of conditions, such as latitude, season, and solar zenith angle (SZA). Therefore, broad coincidence criteria (Table 1) were used to maximize the types of measurements being compared. Overall agreement between SAGE II and OSIRIS is given in Fig. 4. For the broad coincidence criteria, 5174 coincidences were found. The standard deviations in OSIRIS and SAGE II ozone number density profiles are similar, indicating that they sample similar large-scale seasonal and spatial variability in air masses.
Fig. 4. As for Fig. 3 for broad coincidence criteria (see Table 1).
The mean relative difference for OSIRIS minus SAGE II is very similar to the results for the narrow coincidence criteria (Fig. 3c) for 15.5-40.5 km. Below 13.5 km, agreement improves due to the inclusion of more high-latitude measurements, and therefore a smaller contribution from tropospheric measurements. Above 50 km, OSIRIS is biased low compared with SAGE II. This is primarily due to the diurnal variation of ozone, as discussed in Sect. 4.2. The standard deviation in the relative differences in the middle stratosphere (∼ 8 %) is larger for the broad than for the narrow coincidence criteria, and is no longer within the combined precision of the OSIRIS (3-4 %) and SAGE II (4 %) measurements at these altitudes. Furthermore, the R correlation coefficients are smaller for the broad than for the narrow coincidence criteria.
The global comparison results are shown for dynamical coincidence criteria in Fig. 5. When the dynamical criteria are applied, ∼ 2000-4000 coincidences remain, depending on the altitude layer. Mean relative differences are within 0.5 % of mean relative differences for the broad coincidence criteria at altitudes above 18.5 km. Standard deviations improved by < 3 % (in absolute difference) and global correlation coefficients improved by < 0.04 at all altitudes compared with the broad coincidence criteria. This suggests that for the broader coincidence criteria, the reduced correlation is not caused entirely by mismatches between the air masses sampled by the coincident measurements, due to the relaxed time and distance criteria. Instead, this is probably the result of measurements being compared under a larger variety of conditions (e.g., latitude, OSIRIS measurement SZA).
Fig. 5. As for Fig. 3 for dynamical coincidence criteria (see Table 1).

Dependence on latitude and season
Figure 6 is a contour plot of mean relative differences and R correlation coefficients for both the broad and dynamical coincidence criteria, calculated for coincidences within 10° latitude bins. For the broad coincidence criteria, mean relative differences with absolute values < 5 % and correlation coefficients R > 0.5 are observed at most altitudes and latitudes. Below the tropopause, both positive and negative biases are observed, reaching magnitudes of up to 30 % for some altitude/latitude bins. Furthermore, R < 0.5 was calculated for much of the troposphere, indicating that these measurements are not well correlated. For 60° S-40° N, between the tropopause and 21.5 km, OSIRIS is biased low by up to 23 % compared with SAGE II. The high bias in OSIRIS data at 22.5 km is strongest at high latitudes, reaching 6 % in the Southern Hemisphere. For 23.5-38.5 km, mean relative differences are < 5 % at all latitudes. Above 40 km, latitudinal biases are observed and can partly be attributed to the diurnal variation of ozone, as discussed in Sect. 4.2.
When the dynamical coincidence criteria are included, the latitudinal variation of the mean relative difference and the R correlation coefficient is largely unchanged. Below ∼ 40 km, the R correlation coefficient is improved slightly (by ∼ 0.1) at high latitudes with the addition of the dynamical coincidence criteria. Furthermore, most coincidences at 80° S, for which R correlation coefficients are small (0-0.5) for the broad coincidence criteria, are removed. This suggests that R correlation coefficients at high latitudes under the broad coincidence criteria are affected slightly by mismatched air masses arising from the structure of the polar vortex. Most coincidences near the average tropopause in the tropics are removed by the criterion that both measurements be 1 km above or below the tropopause. Since the broad coincidence criteria yield approximately two times as many coincidences as the dynamical criteria with only a small impact on R correlation coefficients, the broad coincidence criteria were used for the remainder of the study.
The seasonal variation of the mean relative differences between OSIRIS and SAGE II is given in Fig. 7. Larger 20° latitude bins were used due to the limited number of coincidences. For 20.5-40.5 km, OSIRIS agrees with SAGE II within 5 % at most latitudes during most seasons. During May/June/July, OSIRIS measures less ozone than SAGE II, due to biases associated with low OSIRIS optics temperatures during the summer (see Sect. 4.3). At higher altitudes, there is a seasonal dependence in the agreement, which may be caused in part by the diurnal variation of ozone (see Sect. 4.2) and the systematic variation of the OSIRIS measurement SZA with the season (see Fig. 4a of McLinden et al., 2012). The low bias in OSIRIS data above the tropical tropopause also has a seasonal variability, with a weaker bias observed in the summer hemisphere. This seasonality may be related to the ascending and descending node biases in the OSIRIS dataset (see Sect. 4.4).

The diurnal variation of ozone at high altitudes
In the upper stratosphere and mesosphere, ozone has a diurnal variation, with larger number densities at night than during the day. At ∼ 55 km, ozone varies by ∼ 10-20 % between the daytime minimum and nighttime maximum (e.g., Huang et al., 2010; Sakazaki et al., 2013). Therefore, if two instruments sample ozone at different SZAs, biases can be introduced to the comparisons. Figure 8 shows the correlation between OSIRIS and SAGE II coincidences at 54.5 km, with various measurement SZAs highlighted. Agreement between OSIRIS and SAGE II coincidences has a strong SZA dependence, with better agreement for OSIRIS SZAs near twilight. The measurement SZA at the OSIRIS 25 km tangent height ranges from 59-91°, with lower SZAs sampled at high latitudes during the hemisphere's summer. SAGE II occultations are always at twilight SZAs of 89-90° at the 30 km tangent height. Therefore, mismatches in measurement SZA are expected to cause discrepancies between the SAGE II and OSIRIS datasets at high altitudes.
Furthermore, errors can be introduced to satellite measurements through the "diurnal effect", which is also known as "chemical enhancement" (e.g., Fish et al., 1995; Newchurch et al., 1996; Natarajan et al., 2005; Hendrick et al., 2006; McLinden et al., 2006). Sunlight can pass through a range of SZAs before reaching the instrument. Therefore, ozone is sampled at various points in its diurnal cycle, not just the reported SZA at the tangent point. At twilight, when ozone varies rapidly with SZA, this effect is the largest. Natarajan et al. (2005) calculated the impact of the diurnal effect on Halogen Occultation Experiment (HALOE) solar occultation ozone measurements. They found that the diurnal effect leads to an overestimation of ozone at 55 km of ∼ 10 % (∼ 3 %) for measurements taken at local sunrise (sunset). They reported that below 50 km, the diurnal effect was zero. Since HALOE has a similar viewing geometry to SAGE II, these values should be similar for SAGE II. For OSIRIS measurements, the magnitude of the diurnal effect depends on the azimuthal angle between incoming sunlight and the instrument as well as the SZA (McLinden et al., 2006), and therefore varies on a scan-by-scan basis.
In order to assess the impact of the diurnal variation of ozone on the OSIRIS versus SAGE II comparisons, a photochemical model (McLinden et al., 2000) was used to calculate the diurnal variation of ozone in 10° latitude bands from 80° S to 80° N for 21 March, 21 June, 21 September, and 21 December, using climatological ozone profiles (McPeters et al., 2007). Relative differences were calculated between profiles at typical OSIRIS measurement SZAs and a typical SAGE II measurement SZA (89.5°). For OSIRIS measurement SZAs < 80°, a negative bias appears in the modeled relative difference profiles at 40-45 km, reaching −5 % by 47-50 km. At 54.5 km, the relative differences calculated with the photochemical model range from −15 to −23 %, depending on the measurement latitude and season. This is consistent with the average relative difference for OSIRIS measurements taken at SZA < 80° minus coincident SAGE II measurements of −17.2 ± 0.3 % (where error is the standard error, σ/√N). Note that the diurnal effect should not have a large impact on the OSIRIS measurements for SZAs < 80° because ozone is not varying as rapidly as at twilight. Furthermore, when the solar scattering angle, the azimuthal angle between the sun and OSIRIS, is ∼ 90°, the diurnal effect does not impact OSIRIS measurements. In order to assess this fully, OSIRIS and SAGE II measurements would have to be corrected for the diurnal effect and then scaled to a common SZA prior to comparison.

OSIRIS optics temperatures
Around June each year, Earth is between OSIRIS and the sun as Odin passes over the Southern Hemisphere. During this period of eclipse, OSIRIS cools due to lack of sunlight, as is apparent in the OSIRIS optics temperatures (Fig. 1). A contour plot of mean relative differences, calculated in 1 °C temperature intervals, is shown in Fig. 9a. For 28.5-47.5 km, the OSIRIS data are biased low by 5-12 % for optics temperatures < 16 °C. Above 47.5 km, a larger low bias is observed, but this may be due in part to the diurnal variation of ozone. The positive bias in measurements for altitudes above 40 km and optics temperatures > 25 °C is due to the inclusion of more measurements at southern high latitudes, for which OSIRIS is biased high compared with SAGE II. Below 20 km, mean relative differences vary with temperature, likely due to differences in sampled latitudes. The number of OSIRIS measurements for all processed ozone data (2001-2012) within each temperature bin is also shown. Most measurements are taken at optics temperatures of 18-22 °C, for which the bias between datasets is very small. 20 % of all OSIRIS measurements were taken for optics temperatures < 16 °C. Note that the OSIRIS optics temperature distribution for coincidences with SAGE II is qualitatively similar to the 2001-2012 distribution, but only 11 % of the OSIRIS scans were taken at optics temperatures below 16 °C.
Fig. 9. Variation in relative differences for OSIRIS minus SAGE II at various OSIRIS optics temperatures. (a) Contours of mean relative differences (color-scale) calculated for 1 °C OSIRIS optics temperature bins (x-axis) at various altitudes (y-axis). The black dashed lines indicate ±5 % mean relative difference. The grey shading indicates regions for which there are fewer than 10 coincidences. Measurement pairs were selected with broad coincidence criteria (see Table 1). (b) Number of OSIRIS measurements (y-axis) at various optics temperatures (x-axis) for the entire processed dataset (2001-2012).
The low bias in OSIRIS ozone for low optics temperatures is likely due to errors in pointing and/or spectral wavelength calibration. Low temperatures may cause misalignment between OSIRIS and the star tracker due to thermal bending, flexing, or deformation torque, leading to an estimated 200-400 m error in the measurement altitude (McLinden et al., 2007). Altitude corrections as a function of OSIRIS optics temperature were estimated, but did not significantly improve the comparison results. Defocusing, reduced spectral resolution, and wavelength shifts of the instrument at low temperatures (Llewellyn et al., 2004) may also contribute to this bias. Systematic errors can be introduced if the ozone cross section is not convolved to the resolution of the spectra. These errors are expected to be largest in spectral regions for which the ozone cross section varies rapidly with wavelength, e.g., errors will be larger at UV wavelengths than at visible wavelengths. For the OSIRIS SaskMART profile retrievals, the visible spectrum is used below ∼ 25 km and UV data are used above this (Degenstein et al., 2009). Since the low bias in OSIRIS measurements at low optics temperatures occurs above 25 km, defocusing of the instrument is likely contributing to the bias.
Currently, efforts are underway to improve SaskMART ozone retrievals during the period of eclipse. The retrieval software is being updated to recalculate the wavelength calibration and resolution of the instrument for each scan by comparing the OSIRIS measured spectrum with a high-resolution solar spectrum. With this calculated point-spread function (full-width half maximum versus wavelength), the ozone cross section can be smoothed on a scan-by-scan basis, accounting for the reduced resolution when optics temperatures are low. Once these corrections have been performed, altitude corrections will also be revisited.

OSIRIS ascending versus descending node measurements
OSIRIS ascending and descending node measurements have differences in solar scattering angle and SZA that can lead to biases in ozone number density profiles due to uncertainties in the characterization of aerosol, albedo and clouds, and how they are represented in the radiative transfer model. In order to investigate this, OSIRIS descending and ascending node measurement pairs were selected using the broad coincidence criteria (Table 1). Figure 10a [...] with ascending node measurements throughout much of the stratosphere. Some of these discrepancies may be due to the diurnal variation of ozone or the diurnal effect (see Sect. 4.2); however, the magnitudes of these biases are larger than photochemical model predictions at lower altitudes (40-50 km), suggesting that there may be contributions from other effects.
Fig. 10. OSIRIS ascending and descending node measurement pairs and OSIRIS and SAGE II measurement pairs were all selected with broad coincidence criteria (see Table 1).
Mean relative differences for OSIRIS descending and ascending node measurements minus coincident SAGE II occultations are also shown in Fig. 10. The positive bias in the OSIRIS measurements at 22.5 km occurs primarily at 50-80° S in ascending node measurements and at 40-70° N in descending node measurements. The low bias in OSIRIS data above 27.5 km at 50-80° N is evident only in the descending node measurements. This corresponds to the period of measurements with low optics temperatures (see Sect. 4.3).

Conclusions
OSIRIS and SAGE II ozone profiles show excellent overall agreement, with magnitudes of mean relative differences < 5 % for 13.5-54.5 km and < 3 % for 24.5-53.5 km and correlation coefficients R > 0.9 for 13.5-49.5 km, for a narrow set of coincidence criteria.
The OSIRIS data were characterized against the SAGE II data for various latitudes and observation conditions, using a broad set of coincidence criteria. Mean relative differences between OSIRIS and SAGE II were < 5 % under many conditions, again showing excellent consistency between these two datasets. Several biases were identified and are summarized in Table 2. These biases should be considered if the OSIRIS and SAGE II datasets are combined into a single time series. Efforts are also underway to improve OSIRIS SaskMART retrievals, which should reduce some of these biases in future versions of the dataset.
Overall, excellent agreement between OSIRIS and SAGE II satellite ozone records demonstrates the potential for merging the OSIRIS and SAGE II datasets. Prior to merging, the long-term stability of OSIRIS measurements should be assessed through comparisons with other datasets. Furthermore, at least one additional ozone dataset will be required in order to cover the winter hemispheres. Therefore, biases between OSIRIS and additional datasets must be identified and quantified. Following this process, an ozone record including OSIRIS and SAGE II measurements spanning from 1984 to the present could be created.
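The two comparison statistics quoted throughout this section, the mean relative difference and the correlation coefficient R, can be illustrated with a minimal sketch for a set of coincident measurement pairs at one altitude. The function name comparison_stats is hypothetical, and the normalization of the relative difference by the mean of the two measurements is an assumption made for illustration; the text only specifies that differences are taken as OSIRIS minus SAGE II.

```python
import numpy as np


def comparison_stats(osiris, sage):
    """Mean relative difference (%) and correlation coefficient R.

    osiris, sage: 1-D arrays of coincident ozone number densities at a
    single altitude, one element per measurement pair.

    ASSUMPTION: the relative difference is normalized by the mean of the
    two measurements; the source text does not state the denominator.
    """
    osiris = np.asarray(osiris, dtype=float)
    sage = np.asarray(sage, dtype=float)

    # Relative difference of each pair, OSIRIS minus SAGE II, in percent.
    rel_diff = 100.0 * (osiris - sage) / (0.5 * (osiris + sage))

    # Pearson correlation coefficient between the two sets of values.
    r = np.corrcoef(osiris, sage)[0, 1]

    return float(np.mean(rel_diff)), float(r)
```

Applying such a function altitude by altitude would give profiles of mean relative difference and R of the kind summarized above.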

Routine screening of OSIRIS ozone data
This appendix describes the screening procedure that is applied to the retrieved OSIRIS ozone profiles prior to the distribution of data. This involves a three-step process: (1) radiances are screened for evidence of clouds and cosmic rays; (2) retrieved ozone profiles are screened using statistical techniques; and (3) retrieved ozone profiles are assessed visually. These screening procedures are described in detail in the paragraphs below.
The altitudes at which OSIRIS ozone profiles are contaminated with clouds are calculated in the SaskMART retrievals and excluded from the final product. However, in some cases, cloud altitudes are not correctly characterized. To determine if a scan is likely contaminated with a cloud, a detection ratio, $v_z$, is calculated at each altitude from $I_z/I_{40\,\mathrm{km}}$, the mean radiance of the spectrum at 743 and 745.5 nm normalized to 40 km, and $\rho_z/\rho_{40\,\mathrm{km}}$, the neutral density normalized to 40 km. The radiance of the spectrum is considered at 743 and 745.5 nm because clouds have a larger impact toward the red part of the spectrum and because emission lines from other atmospheric constituents are avoided at these wavelengths. If $v_z > 0.6$ anywhere for 15-40 km, then the scan is likely contaminated by clouds and is removed. This cloud criterion leads to the rejection of < 2 % of scans. In future versions, the retrieval software will detect these issues prior to retrieval so that the altitude range of the retrieval is appropriate.
Spectra are also scanned for radiation hits due to cosmic rays in the detector, which typically cause a large, isolated spike in the spectrum. In order to identify spectra contaminated by cosmic rays, spectra at a given altitude level, $I_z$, are normalized relative to the spectra at the altitude layers above and below. Then, around each pixel that is used for aerosol and ozone retrievals, the mean and standard deviation of normalized intensities of the ten surrounding pixels are calculated. If there is a spike in the mean and standard deviation around a given pixel, the scan is rejected. Less than 5 % of scans were rejected due to radiation hits for 2001-2011. In future versions of the retrieval algorithm, cosmic ray hits will be identified and handled prior to retrieval.
Infrequently, unstable retrievals may result in extremely large, unphysical ozone values that are not apparent in the radiance data. Therefore, for each week, ozone and aerosol data are grouped in 10° latitude bins at each altitude. For each bin, the standard deviation and the median absolute deviation (MAD) of ozone and aerosol are calculated. For most altitudes, scans with ozone or aerosol values that deviate from the mean by more than 5σ or from the median by more than 10 MAD are rejected. These criteria are loosened at the top and bottom of the profiles to account for larger natural variability. Additionally, scans are removed if at any altitude the ozone volume mixing ratio or aerosol exceeds 1, as these values are non-physical. This filtering scheme led to the removal of < 8 % of scans for 2001-2011.
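The statistical screening step described above can be illustrated with a minimal sketch, assuming the ozone (or aerosol) values for one week and one 10° latitude bin are held in a two-dimensional array with one row per scan and one column per altitude. The function name flag_outlier_scans and the array layout are hypothetical and are not taken from the operational software. The loosened thresholds at the top and bottom of the profiles and the check on values exceeding 1 are not modelled here.

```python
import numpy as np


def flag_outlier_scans(values, n_sigma=5.0, n_mad=10.0):
    """Flag scans that fail the mean/sigma or median/MAD test.

    values: array of shape (n_scans, n_altitudes) holding ozone or
    aerosol values for one week and one latitude bin (assumed layout).
    Returns a boolean array of shape (n_scans,), True where a scan
    would be rejected.
    """
    values = np.asarray(values, dtype=float)

    # Statistics are computed independently at each altitude (column).
    mean = np.nanmean(values, axis=0)
    sigma = np.nanstd(values, axis=0)
    median = np.nanmedian(values, axis=0)
    mad = np.nanmedian(np.abs(values - median), axis=0)

    # A value is an outlier if it is more than n_sigma standard
    # deviations from the mean OR more than n_mad MADs from the median.
    far_from_mean = np.abs(values - mean) > n_sigma * sigma
    far_from_median = np.abs(values - median) > n_mad * mad

    # A scan is rejected if it is an outlier at any altitude.
    return np.any(far_from_mean | far_from_median, axis=1)
```

The median-based test is included alongside the mean-based test because a single extreme value inflates the standard deviation of its own bin, whereas the MAD is largely insensitive to it.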
Very infrequently, some outlier profiles pass the radiance and statistical profile criteria described above. Therefore, the profiles are inspected visually prior to approval, comparing them collectively in bins of latitude and month. This technique has led to the identification and removal of 0.1 % of retrieved profiles for 2001-2011, the vast majority of which are attributed to periods of incorrect altitude registration.

Fig. 1. Sampling of SAGE II and OSIRIS during the comparison period. (Top) Latitude of OSIRIS 25 km tangent height (blue) and SAGE II 30 km tangent height for satellite sunrise (yellow) and sunset (red) occultations vs. measurement date. (Bottom) OSIRIS optics temperature vs. measurement date.

Fig. 6. Latitudinal variation for comparisons between OSIRIS and SAGE II measurements, calculated for 10° latitude bins (x-axis) at various altitudes (y-axis) for (top) broad coincidence criteria and (bottom) dynamical coincidence criteria (see Table 1). (Left) Contours of mean relative differences (color scale), with the black dashed lines indicating ±5 % mean relative difference. (Right) Contours of the correlation coefficient, R, with the black dashed lines indicating R = 0.5. The thick black lines indicate the average World Meteorological Organization thermal tropopause height of the coincident measurements, calculated from European Center for Medium-Range Weather Forecast analysis data. The grey shading indicates regions for which there are fewer than 10 coincidences.

Fig. 7. As for Fig. 6a for 20° latitude bins in (a) November, December and January, (b) February, March, and April, (c) May, June, and July, and (d) August, September, and October.

Fig. 8. Correlation plot of OSIRIS vs. SAGE II ozone number densities at 54.5 km altitude. The color scale indicates the SZA of the OSIRIS measurements. Linear fits (dashed lines) and fitting statistics (text) are indicated for all coincidences (grey) and for coincidences with OSIRIS measurement SZA > 85° only (red). The correlation coefficient is indicated by R, the slope is indicated by m, and the y-intercept is indicated by y. The black line indicates 1-1. Measurement pairs were selected for broad coincidence criteria (see Table 1).

Fig. 9. Variation in relative differences for OSIRIS minus SAGE II at various OSIRIS optics temperatures. (a) Contours of mean relative differences (color scale) calculated for 1 °C OSIRIS optics temperature bins (x-axis) at various altitudes (y-axis). The black dashed lines indicate ±5 % mean relative difference. The grey shading indicates regions for which ...

Fig. 10. As for Fig. 6a for (a) OSIRIS descending node minus OSIRIS ascending node measurement pairs for 2001-2005, (b) OSIRIS descending node minus SAGE II measurement pairs, and (c) OSIRIS ascending node minus SAGE II measurement pairs. OSIRIS ascending and descending node measurement pairs and OSIRIS and SAGE II measurement pairs were all selected with broad coincidence criteria (see Table 1).

Table 1. Coincidence criteria for SAGE II and OSIRIS comparisons.

Table 2. Summary of biases between OSIRIS and SAGE II ozone profiles.
... S-40° N; All; From the tropopause to 21.5 km, OSIRIS is biased low compared with SAGE II by up to 23 %. These biases are largest in the winter hemisphere.