Detecting physically unrealistic outliers in ACE-FTS atmospheric measurements

The ACE-FTS (Atmospheric Chemistry Experiment – Fourier Transform Spectrometer) instrument on board the Canadian satellite SCISAT has been observing the Earth’s limb in solar occultation since its launch in 2003. Since February 2004, high resolution (0.02 cm⁻¹) observations in the spectral region of 750–4400 cm⁻¹ have been used to derive volume mixing ratio profiles of over 30 atmospheric trace species and over 20 atmospheric isotopologues. Although the full ACE-FTS level 2 data set is available to users in the general atmospheric community, until now no quality flags have been assigned to the data. This study describes the two-stage procedure for detecting physically unrealistic outliers within the data set for each retrieved species, which is a fixed procedure across all species. Since the distributions of ACE-FTS data across regions (altitude/latitude/season/local time) tend to be asymmetric and multimodal, the screening process does not make use of the median absolute deviation. It makes use of volume mixing ratio probability density functions, assuming that the data, when sufficiently binned, are at most tri-modal and that these modes can be represented by the superposition of three normal, or log-normal, distributions. Quality flags have been assigned to the data based on retrieval statistical fitting error, the physically unrealistic outliers described in this study, and known instrumental/processing errors. The quality flags defined and discussed in this study are now available for all level 2 versions 2.5 and 3.5 data and will be made available as a standard product for future versions.


Introduction
One of the most common techniques for screening out anomalous data from a data set is to calculate the set's mean (µ) and standard deviation (σ). Data that are outside the limits of µ ± kσ, where k is some constant, are deemed to be outliers. Another common, and similar, method is to use the median and MAD (median absolute deviation) (Leys et al., 2013; Toohey et al., 2010, and references therein), in place of the mean and standard deviation respectively, where
$$\mathrm{MAD} = \mathrm{median}_i\left(\left|x_i - \mathrm{median}_j\left(x_j\right)\right|\right). \qquad (1)$$
This method is much less sensitive to extreme outliers, as the presence of outliers typically has an insignificant effect on the median value. It can be used as an efficient tool in detecting outliers for data that are normally distributed. However, using this value as a method of detecting outliers can be ineffective if the data being analysed are multimodal and/or are asymmetrically distributed about the median. In the case of data that are multimodal or asymmetrically distributed and contain multiple extreme outliers, it is likely that neither the σ nor the MAD will be an appropriate estimate of the variation, or scale, of the measurements. In such cases, they should be avoided in the detection of outliers (Rousseeuw and Croux, 1993).
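The two screening rules described above can be sketched as follows. This is a minimal illustration, not code from the ACE-FTS processing: the function names and the sample values are hypothetical, and the default scale factor of 1.4826 is the commonly cited normal-consistency constant for the MAD (the text below quotes 1.428), so it is exposed as a parameter.

```python
from statistics import mean, median, pstdev


def sigma_limits(values, k=3.0):
    """Return (lower, upper) inlier limits mu - k*sigma and mu + k*sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return mu - k * sigma, mu + k * sigma


def mad(values):
    """Median absolute deviation, Eq. (1): median_i |x_i - median_j x_j|."""
    centre = median(values)
    return median([abs(x - centre) for x in values])


def mad_limits(values, k=3.0, scale=1.4826):
    """Return (lower, upper) inlier limits median -/+ k * scale * MAD.

    `scale` converts the MAD to a sigma-equivalent for normal data; the
    default here is an assumption, not a value taken from the text.
    """
    centre = median(values)
    spread = scale * mad(values)
    return centre - k * spread, centre + k * spread


# Hypothetical sample with one extreme value.
sample = [4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 400.0]
lo_s, hi_s = sigma_limits(sample)
lo_m, hi_m = mad_limits(sample)
print("sigma limits:", lo_s, hi_s)
print("MAD limits:  ", lo_m, hi_m)
```

In this sample the extreme value inflates σ enough that it falls inside the µ ± 3σ limits, while the MAD-based limits exclude it, which illustrates the difference in sensitivity discussed above.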
In satellite remote sensing, isolating geographical/seasonal/local time regions where global satellite-based data are symmetrically distributed and uni-modal can be difficult and/or tedious. Measurements grouped into a given altitude, latitude, month, and local time bin can be driven away from typical behaviour by any number of factors (e.g. the polar vortex, a solar proton event, a sudden stratospheric warming, biomass burning, presence of polar stratospheric clouds, etc.), thereby altering the "typical" distribution of observed measurements, and hence the probability density function (PDF) of the trace species concentration.
The commonly used MAD method for detecting outliers does not explicitly make use of a PDF, but, in order for it to be useful, it does make an implicit assumption that the PDF is approximately symmetric about the median value. Other commonly used methods, such as Peirce's criterion (Peirce, 1852; Ross, 2003) and Chauvenet's criterion (Chauvenet, 1871), explicitly make use of a PDF; however, they assume that the PDF is a Gaussian distribution. It should also be noted that in atmospheric science, the use of PDFs is not uncommon in tracer and validation studies. Lary and Lait (2006), in their introduction, give excellent examples of different types of tracer studies; and studies such as Migliorini et al. (2004), Lary and Lait (2006), and Wu et al. (2008) have demonstrated that PDFs can be used as a validation tool, where PDFs as measured by different atmospheric sounders are inter-compared rather than inter-comparing co-located measurements.
The ACE-FTS (Atmospheric Chemistry Experiment - Fourier Transform Spectrometer (Bernath et al., 2005)) instrument, on board the Canadian satellite SCISAT, is a solar occultation, high spectral-resolution (0.02 cm⁻¹) Fourier transform spectrometer operating between 750 and 4400 cm⁻¹. ACE-FTS observations are used to derive volume mixing ratio (VMR) profiles of over 30 atmospheric trace gases, as well as profiles of over 20 subsidiary isotopologues of atmospheric species (Boone et al., 2005). SCISAT was launched in 2003 and ACE-FTS has been providing consistent measurements since February 2004. Atmospheric profiles range in altitude from ∼5-110 km, depending on the species, with a vertical resolution of ∼3-4 km and sampling of 1-6 km.
This study outlines the repercussions of screening data based on the σ or the MAD given non-normally distributed data and discusses a two-step process for detecting outliers that is currently carried out on the ACE-FTS level 2 data set. All data presented in this study are ACE-FTS level 2 version 3.5 (v3.5) (Boone et al., 2013) spanning February 2004 to February 2013; however, the same processes have been used for detecting outliers in version 2.5 (v2.5) data. The main differences in v3.5 from v2.5 are:
- amended sets of microwindows for all molecules, and an increase in the number of allowed interferers in the retrievals;
- improvement in temperature/pressure retrievals, leading to a reduction in unphysical oscillations in retrieved temperature profiles;
- inclusion of COCl2, COClF, H2CO, CH3OH, and HCFC-141b, and the removal of HOCl and ClO VMR retrievals.
Physically unrealistic outliers can occur in the ACE-FTS level 2 data for a number of different reasons. Many of these are caught prior to being added to the level 2 database, such as outliers due to exceedingly noisy spectra, ice contamination on the ACE-FTS detector affecting an occultation, and a variety of processing errors. However, these are not always caught by pre-screening, and other factors not accounted for in the pre-screening can contribute to the presence of outliers, for example, poor statistical fitting or convergence onto an unrealistic solution in the retrieval, or inaccurate pressure and temperature a priori information.
The outlier detection and subsequent data flagging procedures discussed in this study have only been performed on the ACE-FTS level 2 data products that have been interpolated onto a 1 km altitude grid (between 0.5 and 149.5 km) (Boone et al., 2005). The philosophical approach for identifying data as potential outliers was one of caution, in that it is better to keep some "bad" data (likely to be physically unrealistic) than to reject "good," or "true," data (likely to be physically realistic). It was also desired that the approach be consistent for all subsets of data being analysed, i.e. tolerance levels, regional limits, etc. should be the same for all species, for all seasons, at all altitudes. For the remainder of this study, these physically unrealistic data will be referred to as "unnatural" outliers, and the data that are likely to be physically realistic, yet still seemingly outlying, as "natural" outliers. All data that are not unnatural outliers will be referred to as inliers.

Detection method and results
All distributions of data discussed in this section represent the February 2004-February 2013 data, and all VMRs are given in parts per volume (ppv).
Global satellite-based measurements of trace gases in the atmosphere are typically not symmetrically distributed and are often multimodal. Different regions are governed by different, varying processes, and therefore analysis of the data is typically carried out by breaking down the data into different altitude, latitudinal, etc. bins. Figure 1 shows all the ACE-FTS H2O data at 17.5 and 35.5 km and the corresponding measurement distributions. For both subsets of H2O data, inlier limits were determined for µ ± 3σ and median ± 3 MAD × 1.428 (1.428 is the scale factor for the MAD to equal the σ assuming a normal distribution (Rousseeuw and Croux, 1993)). These limits are plotted in Fig. 1a and Fig. 1c and highlight two key points: first, using the standard deviation when there are extreme outliers can allow for the acceptance of data that are most likely physically unrealistic. Second, using the MAD on multimodal or asymmetrically distributed data can lead to the rejection of physically realistic data. For instance, as shown in Fig. 1a, the lower cut-off using the MAD of 2.76 ppm clearly excludes the low H2O concentrations that are observed in Antarctic (austral) spring. As can be seen in Fig. 1b and Fig. 1d, the H2O data at both altitude levels are not normally distributed. The data can be separated further into bins based on latitudinal regions and local times. For example, Fig. 2 shows H2O and O3 sunset data at 30.5 and 35.5 km, separated into different latitude regions (0-30, 30-60, and 60-90° S), with dashed lines representing best fits to Gaussian distributions. These regions are representative of bins often used to partition atmospheric data. Figure 2 exemplifies that using a given bin definition that leads to data with a symmetric and uni-modal distribution at one altitude level does not necessarily lead to a symmetric and uni-modal distribution of data at all altitude levels, nor across all species. For instance, in Fig. 2a the 35.5 km O3 distributions in all three latitude regions are fairly symmetric. However, the 35.5 km H2O distribution (Fig. 2c) in the mid-latitudes is highly skewed, and in the high latitudes the distribution is tri-modal. In Fig. 2b and d we see bimodal, asymmetric distributions for both O3 and H2O in the 30-60 and 60-90° S regions at 30.5 km. For high-latitude data in many species' data sets, distributions can be bimodal due to observing inside and outside of the vortex, and therefore it is not possible to find sub-regions (based on season, latitude, or local time) that will always exhibit symmetric distributions. Therefore, the ACE-FTS data screening process takes an approach that does not require the distribution of any subset of data to be symmetric or to contain just one mode.
Initially, all data were pre-screened. Any occultations that contained errors due to previously known issues (e.g. unrealistic N2O concentrations due to a convergence failure for occultations with low water levels, ice buildup on the detectors during early mission occultations, bad spectra used in the calibration, level 0-1 processing errors, etc.) were removed prior to analysis. A full list of known issues is given on the ACE validation website, https://databace.scisat.ca/validation. Then, for each species, at each altitude level, any data point with an absolute value greater than 10 000 times the median of all absolute values was rejected. Absolute values were used, as ACE-FTS VMR retrievals were allowed to be negative, and therefore, in some cases the median of the actual values could be very close to zero.
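The gross pre-screening threshold described above can be sketched in a few lines. This is only an illustration of the stated rule (reject values whose magnitude exceeds 10 000 times the median magnitude); the function name and the sample VMR values are hypothetical.

```python
from statistics import median


def gross_prescreen(values, factor=10_000.0):
    """Split values into (kept, rejected) using factor * median(|values|).

    Absolute values are used because retrieved VMRs may be negative, so the
    median of the signed values could be close to zero.
    """
    threshold = factor * median([abs(v) for v in values])
    kept = [v for v in values if abs(v) <= threshold]
    rejected = [v for v in values if abs(v) > threshold]
    return kept, rejected


# Hypothetical VMRs (ppv) for one species at one altitude level.
vmr = [-2e-9, 1e-9, 3e-9, 4e-9, 5e-9, 7.5e-3]
kept, rejected = gross_prescreen(vmr)
print("kept:", kept)
print("rejected:", rejected)
```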
The screening process started by analysing the data's PDFs. The normalized PDF of data subset x, PDF(x), multiplied by the number of data points, N, gave the expectation density function (EDF) at a given value of x,
$$\mathrm{EDF}(x) = N \times \mathrm{PDF}(x). \qquad (2)$$
The total integral of the EDF is equal to N, and the integral between any two values of x gives the number of expected data points within that range. For determining unnatural outliers, we want to find the values of x where the integral between infinity (negative and positive) and x_lim (lower and upper values) is less than 1. Anywhere that the integral (from infinity) of the EDF is less than 1 is most likely a statistical outlier, as no data points are expected to be measured beyond the values of x_lim, given the PDF. Therefore, the criterion for excluding data can be any value of x where $\int_{\pm\infty}^{x} \mathrm{EDF}(x')\,\mathrm{d}x'$ (taken in magnitude) is less than or equal to 1. This is similar to Peirce's criterion (Peirce, 1852; Ross, 2003) and Chauvenet's criterion (Chauvenet, 1871), which both assume a normally distributed PDF. The tolerance level can be varied to suit the desired acceptance level of possible outliers. For ACE-FTS data, a tolerance level, determined empirically, of 0.025 is used, which corresponds to a 97.5 % confidence of an excluded data point being an outlier, i.e. any value x where $\int_{\pm\infty}^{x} \mathrm{EDF}(x')\,\mathrm{d}x'$ (taken in magnitude) is less than 0.025 is rejected. This method, however, required determining an analytical solution for the data's EDF. For each of the 50+ ACE-FTS retrieved species, at each altitude level, the data were separated into sunset and sunrise occultations, in order to separate into similar local conditions, as well as into four different latitude regions: 60-90° S; 0-60° S; 0-60° N; and 60-90° N.
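As a rough sketch of the tail criterion, assume the EDF is N times a Gaussian mixture PDF whose components are already known. The lower tail integral is then N × CDF(x) and the upper tail integral is N × (1 − CDF(x)), so the limits can be found by bisection. The tuple layout `(weight, mean, sigma)`, the function names, and the search bracket of 40 standard deviations are illustrative assumptions; if the mixture was fitted in log-space, the limits returned here are in log-space as well.

```python
from statistics import NormalDist


def lower_tail(x, components):
    """Mixture CDF at x; components = [(weight, mean, sigma), ...]."""
    return sum(w * NormalDist(mu, s).cdf(x) for w, mu, s in components)


def upper_tail(x, components):
    """1 - CDF at x, evaluated by symmetry to keep precision in the far tail."""
    return sum(w * NormalDist(mu, s).cdf(2.0 * mu - x) for w, mu, s in components)


def _bisect(increasing, lo, hi, n_iter=200):
    """Root of a function that rises from negative at lo to positive at hi."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if increasing(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def edf_limits(components, n_points, tolerance=0.025):
    """Return (x_lo, x_hi) where the EDF tail integrals equal `tolerance`.

    Since EDF(x) = N * PDF(x), the tail integral equals N times the tail
    probability, so the target tail probability is tolerance / N.
    """
    target = tolerance / n_points
    lo = min(mu - 40.0 * s for _, mu, s in components)
    hi = max(mu + 40.0 * s for _, mu, s in components)
    x_lo = _bisect(lambda x: lower_tail(x, components) - target, lo, hi)
    x_hi = _bisect(lambda x: target - upper_tail(x, components), lo, hi)
    return x_lo, x_hi


# Hypothetical tri-modal mixture and sample size.
mixture = [(0.6, 0.0, 1.0), (0.3, 3.0, 0.5), (0.1, 6.0, 1.0)]
x_lo, x_hi = edf_limits(mixture, n_points=5000)
print("reject values below", x_lo, "or above", x_hi)
```

Raising `tolerance` toward 1 moves both limits inward and rejects more data; lowering it keeps more of the tails.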
Due to the SCISAT orbital geometry, the majority of ACE-FTS measurements were at high latitudes, and therefore each latitudinal bin had roughly the same number of profiles. The distribution of each subset was then fit to a Gaussian mixture distribution, using three Gaussian distributions. This assumed that the data were, at most, tri-modally distributed. Since it is not uncommon for distributions of atmospheric measurements to be log-normal, the fit was done in log-space (if there were negative data, a constant greater than the magnitude of the minimum value was added to the data set prior to the fit, which maintained the shape of the distribution). The fit was performed using the Matlab statistical toolbox, which uses an expectation maximization algorithm (McLachlan and Peel, 2000). In an effort to avoid fitting to extreme outliers, an ad-hoc "Olympic"-style method was employed, whereby the data set's five lowest and five highest values were excluded in the fit. Figure 3 shows the O3 distribution at 30.5 km in the 60-90° S and 60-90° N regions, along with the fitted EDFs and the three Gaussian distributions derived in the fit for both cases. Figure 4 shows three examples of ACE-FTS sunset data distributions (NO2 at 60-90° S and 30.5 km, CH4 at 0-60° N and 20.5 km, and N2O at 60-90° N and 20.5 km) and the corresponding fitted EDFs. These were chosen in order to illustrate typical results for commonly used ACE-FTS data. The average root-mean-square error (RMSE) between the EDFs and actual distributions, over all species and data subsets, is 6 % and has a standard deviation of 2 %. Figure 5 shows the inliers and unnatural outliers as determined by the EDFs for the subsets shown in Fig. 4. As can be seen, not all subsets contain many extreme outliers, e.g. NO2 at 30.5 km (Fig. 5a), which only has one detected outlier. When there are obvious outliers, this method did exclude the most extreme outliers, although perhaps not all unnatural outliers. For instance, several (potentially) anomalously low values, near 0.75 ppm, in the CH4 data (Fig. 5b) remain as inliers. This is in part due to the relatively lax tolerance level of 0.025, which is more likely to leave in outliers than if a larger value (but still less than 1) were chosen.
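The fitting step can be illustrated with a small hand-written expectation maximization loop. This is a sketch only and is not the Matlab toolbox routine used in the study: the initialisation (equal chunks of the sorted data), the fixed iteration count, the `pad` constant, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def olympic_trim(values, n_trim=5):
    """Drop the n_trim lowest and n_trim highest values before fitting."""
    ordered = sorted(values)
    return ordered[n_trim:len(ordered) - n_trim]


def to_log_space(values, pad=1e-12):
    """Log-transform, first shifting by a constant if any value is <= 0.

    `pad` is an arbitrary illustrative choice, not a value from the text.
    """
    lowest = min(values)
    shift = (pad - lowest) if lowest <= 0.0 else 0.0
    return [math.log(v + shift) for v in values], shift


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gaussian_mixture(values, n_components=3, n_iter=100, min_sigma=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns [(weight, mean, sigma), ...]."""
    data = sorted(values)
    n = len(data)
    chunk = n // n_components
    comps = []
    # Initialise each component from one chunk of the sorted data.
    for k in range(n_components):
        end = (k + 1) * chunk if k < n_components - 1 else n
        part = data[k * chunk:end]
        mu = sum(part) / len(part)
        var = sum((x - mu) ** 2 for x in part) / len(part)
        comps.append((1.0 / n_components, mu, max(math.sqrt(var), min_sigma)))
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, mu, s) for w, mu, s in comps]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and sigma of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            mu = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / nk
            comps[k] = (nk / n, mu, max(math.sqrt(var), min_sigma))
    return comps


# Synthetic bimodal, log-normally distributed VMRs (hypothetical values).
random.seed(1)
vmr = [random.lognormvariate(math.log(4e-6), 0.1) for _ in range(400)]
vmr += [random.lognormvariate(math.log(2e-6), 0.1) for _ in range(200)]
log_vmr, shift = to_log_space(olympic_trim(vmr))
components = fit_gaussian_mixture(log_vmr)
for weight, mu, sigma in components:
    print(f"weight={weight:.3f} mean={mu:.3f} sigma={sigma:.3f}")
```

Multiplying the fitted mixture PDF by the number of data points gives the EDF of Eq. (2). A production fit would normally add a convergence test and several random restarts, which are omitted here for brevity.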

It should be noted that screening using the EDF is a hard-limiting filter. Therefore, using it in the manner described above does not necessarily reject data that are unphysically anomalous for a given season. To screen the data for this type of moderate outlier, the 15 day median and a 15 day variation scale are calculated for each subset, excluding outliers as determined from the EDFs. Even on a 15 day timescale, ACE-FTS subset data can have distributions that are bimodal. In many cases, the primary mode is sampled much more frequently than the secondary mode, and therefore, without careful consideration, data within the secondary mode can be erroneously screened as unnatural outliers. To avoid this, we need a variation scale that is sensitive to outliers (unlike the MAD), but not overly sensitive to outliers (like the σ). For this, we define a variation scale that is similar to the MAD, only more sensitive to outliers, the MeAD:
$$\mathrm{MeAD} = \mathrm{mean}_i\left(\left|x_i - \mathrm{median}_j\left(x_j\right)\right|\right). \qquad (3)$$
Any data point with a value outside the bounds of median_15 ± 10 × MeAD_15 is considered to be an unnatural outlier. The value of 10 was empirically found to maximize the number of discovered unnatural outliers without rejecting obvious natural outliers. Figure 6 shows the inliers and outliers as determined by the 15 day running values for the subsets shown in Fig. 4. Clearly this step catches moderate outliers that were not detected using the EDFs, although still not all anomalous data have been screened out. The potentially anomalous values near 0.75 ppm in the CH4 data (Fig. 6b) still remain as inliers. Stricter tolerance criteria in either the EDF or running MeAD screening process would allow for these data to be screened out; however, they were found to lead to screening out natural outliers in other subsets of data, which would be discordant with our philosophical approach. Going back to the original case of H2O at 17.5 km (Fig. 1a), an example of the difference between using the MeAD as opposed to the MAD in the second step is shown in Fig. 7. However, now the focus is only on the Antarctic data. The unnatural outliers and remaining inlying data for Antarctic H2O at 17.5 km are shown for the two different approaches.
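The running screen can be sketched as below. This is a simplified single pass under stated assumptions: the window is centred on each measurement, times are given in days, each point is included in its own window, and the function names and synthetic series are hypothetical. The procedure in the study also excludes EDF outliers before computing the running values, which is not shown here.

```python
from statistics import mean, median


def mead(values):
    """Mean absolute deviation about the median, Eq. (3)."""
    centre = median(values)
    return mean([abs(x - centre) for x in values])


def running_mead_screen(times, values, window_days=15.0, k=10.0):
    """Return a list of booleans, True where a point lies outside
    median_15 +/- k * MeAD_15 computed in a centred time window."""
    half = window_days / 2.0
    flags = []
    for t, x in zip(times, values):
        window = [v for tt, v in zip(times, values) if abs(tt - t) <= half]
        centre = median(window)
        flags.append(abs(x - centre) > k * mead(window))
    return flags


# Synthetic daily series with one anomalous value on day 12.
days = list(range(30))
series = [5.0 + 0.1 * ((i % 5) - 2) for i in days]
series[12] = 12.0
flags = running_mead_screen(days, series)
print([d for d, f in zip(days, flags) if f])
```

Replacing `mead` with a median-based scale in the same loop gives the MAD variant that the text compares against in Fig. 7.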
Figure 7a shows the results when using limiting values of median_15 ± 10 × MeAD_15, where the significant majority of the data points screened as unnatural outliers are likely to be physically unrealistic for their local conditions. Using the limiting values of median_15 ± 10 × MAD_15 (Fig. 7b) leads to many more data points being flagged as unnatural outliers.
Upon inspection, many of these additional "unnatural" outliers are most likely being erroneously rejected, especially in late 2009 where ACE-FTS is most likely routinely observing dehydrated air masses. In both cases, outliers were detected in the sunrise and sunset data sets separately.
In order to explore the response to periodic extreme events and to trends, Fig. 8 shows the final inliers and unnatural outliers in all ACE-FTS HCN data at 9.5 km, which exhibit periodic increases that could correspond to biomass burning events (e.g. Crutzen and Andreae, 1990; Pommrich et al., 2010), as well as all SF6 data at 19.5 km, which exhibit a clear positive trend throughout the time series (e.g. Rinsland et al., 2005; Brown et al., 2011). Even in these instances of extreme events and a significant trend in the data, the outlier detection method outlined here is able to keep the natural outliers as inliers. The top panels (a and d) in Fig. 8 show all data points and demonstrate the extreme unnatural outliers (red dots) that can occur within the ACE-FTS data set. The middle panels (b and e) show the same data as the top panels, but without the more extreme unnatural outliers in order to better view the data; and the bottom panels (c and f) show the data with all unnatural outliers removed.
In the overwhelming majority of instances where the ACE-FTS VMR data exhibit a sudden and/or extreme change in the distribution, the unnatural outlier detection method described above does not screen out these events. Sudden stratospheric warmings cause strong descent in the northern high-latitude upper atmosphere. This leads to anomalously large concentrations of NO in the upper stratosphere-lower mesosphere, near 50 km (e.g. Manney et al., 2008; Randall et al., 2009). Figure 9a shows the time series of the final inliers in all ACE-FTS NO at 55.5 km. It can be seen that the detection method is able to keep the data during these extreme events as inliers. Anderson et al. (2012), using in situ aircraft measurements, demonstrated that in the summer there can be H2O intrusions from the upper troposphere into the lower stratosphere at Northern mid-latitudes. As can be seen in Fig. 9b, the final inlying ACE-FTS data in the summer Northern mid-latitudes, in the lower stratosphere, do exhibit large increases in H2O concentrations. Manney et al. (2011) showed that the Microwave Limb Sounder (MLS) on the Aura satellite observes decreases in lower stratospheric HCl concentrations in the Arctic vortex each spring; and in the spring of 2011, HCl concentrations were anomalously low for a prolonged period. Figure 9c shows the final inlying ACE-FTS HCl data in the Arctic, which are consistent with the MLS findings. No instances have been found in which the unnatural outlier detection system outright rejects these types of phenomena. When sudden, extreme changes do occur, the rejection of potential natural outliers has been minimized, the result of which is that the rejection of the detected unnatural outliers has an insignificant effect on the mean. The disadvantage of not screening out rare extreme events, however, is that this method is less likely to catch sporadic systematic instrument or processing errors. Therefore, continual monitoring of both the rejected and non-rejected data statistics is necessary to determine if any such errors have occurred.
Table 1 shows what percentage of ACE-FTS level 2 v3.5 profiles contain at least one detected outlier (by either step).For any given species, if all profiles that contained at least one outlier are rejected, less than 6 % of the total number of profiles will be omitted.

Conclusions
A two-step process has been developed in order to screen all ACE-FTS level 2 data for physically unrealistic outliers. The first step fits an EDF, the superposition of three Gaussian distributions, to actual distributions. This fit is done in log-space. Data in the tails of the distributions where the probability of finding a data point is less than the tolerance level are determined to be extreme unnatural outliers. The second step iteratively takes the 15 day running median and MeAD and screens for moderate seasonal unnatural outliers. Data that are further than 10 times the MeAD from the median are determined to be moderate outliers. Using these methods to screen the ACE-FTS data for unnatural outliers, a flagging system has been implemented to give ACE-FTS level 2 data users a guide for how best to use the data. Each VMR data point in each profile is flagged with an integer from 0-9. Table 2 gives the definition for each flag value. Any data with a 0 flag are recommended for use. In previous versions, data users were advised to filter out data where the percent error (the retrieval statistical fitting error divided by the retrieved value) is either greater than 100 % or less than 0.01 %; for legacy reasons, these data have been given a flag value of 1. It is recommended that data points with a corresponding flag greater than 2 be removed before any analysis is performed. This screening method alone may be adequate when only looking at one altitude level; however, profiles that contain an outlier at a given altitude level may also be compromised at lower altitude levels. Therefore it is recommended that any profile that contains a flag between 4 and 7 (inclusive) be removed before analysis. At certain altitude levels for a given species, the data can be either noisy, with a significant number of negative values, or have a strong negative bias. In either case, since the ACE-FTS retrieval allows for negative concentrations, it is possible for valid data to have values close to zero, both positive and negative. When values are systematically near zero, the percent error becomes extremely large. Therefore, in these situations, screening the data based on the percent error may introduce a bias in the data. As such, before analysis, removing data that have a corresponding flag value of 1 is only recommended at altitude levels where the overwhelming majority of data points have a VMR value greater than zero.

Flag  Definition
0     Data recommended for use
1     Percent error is not within 0.01-100 % (legacy criterion)
2     Not enough data points in the region to do statistical analysis, and percent error is within 0.01-100 %
3     Not enough data points in the region to do statistical analysis, and percent error is not within 0.01-100 %
4     Moderate unnatural outlier detected from running MeAD, percent error within limits
5     Extreme unnatural outlier detected from EDF, percent error within limits
6     Unnatural outlier detected and percent error is outside of limits
7     Instrument or processing error
8     Error fill value of −888 (data is scaled a priori)
9     Data fill value of −999 (no data)
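The flag recommendations above can be applied with a short filter. The sketch below assumes a hypothetical in-memory layout (a list of dictionaries with equal-length `vmr` and `flag` lists); it is not the format of the ACE-FTS data files, and the function name is illustrative.

```python
def screen_profiles(profiles, drop_flag_1=False):
    """Apply the recommended flag-based screening to a list of profiles.

    Each profile is a dict with equal-length lists under 'vmr' and 'flag'
    (a hypothetical layout). Profiles containing any flag from 4 to 7 are
    dropped entirely; in the remaining profiles, points with a flag greater
    than 2 are replaced by None. Set drop_flag_1=True to also remove
    flag-1 points, which the text recommends only where the VMRs are
    overwhelmingly positive.
    """
    screened = []
    for profile in profiles:
        flags = profile["flag"]
        if any(4 <= f <= 7 for f in flags):
            continue  # the whole profile may be compromised
        vmr = []
        for value, f in zip(profile["vmr"], flags):
            bad = f > 2 or (drop_flag_1 and f == 1)
            vmr.append(None if bad else value)
        screened.append({"vmr": vmr, "flag": list(flags)})
    return screened


# Hypothetical profiles: the first contains an EDF outlier (flag 5).
profiles = [
    {"vmr": [1.0e-6, 2.0e-6, 3.0e-6], "flag": [0, 0, 5]},
    {"vmr": [1.1e-6, 2.1e-6, -999.0], "flag": [0, 1, 9]},
]
result = screen_profiles(profiles)
print(result)
```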
Since the outlier detection methodology was approached with a philosophy that it is better to leave in unnatural outliers than to remove natural outliers, there are outliers that have gone unflagged, especially in data sets that are inherently noisy and at low altitudes (below ∼10 km). Level 2 data users should use the defined quality flags as a starting point for screening the data and be aware that some unnatural outliers may still exist that could be screened out prior to analysis. If data users are using the MAD in an attempt to further screen the ACE-FTS level 2 data, for best results it is advised that they ensure that the distribution of the data they are screening is neither multimodal nor heavily skewed.
The flag values for all v3.5 data are now available for download on the ACE-FTS website, and v2.5 flag values are available upon request from the lead author and will soon be made available for download on the ACE-FTS website.It is currently expected that similar flags will be a standard product within the level 2 data of all future products.

Figure 1. ACE-FTS level 2 v3.5 H2O data (left) and corresponding distributions (right). The top panel (a and b) shows all data at 17.5 km, and the bottom panel (c and d) shows all data at 35.5 km.

Figure 4. Sunrise ACE-FTS VMR distribution (blue circles) and fitted EDF (black dashed lines) for: (a) NO2 at 30.5 km in the latitude region 60-90° S; (b) CH4 at 20.5 km, 0-60° N; and (c) N2O at 20.5 km, 60-90° N. The dotted green lines are the fitted Gaussian distributions in calculating each of the EDFs, and the fitted distributions have been normalized to the measured VMR distributions.

Figure 5. Sunrise ACE-FTS data for the same data subsets as Fig. 4. The red circles are data that have been determined to be unnatural outliers as per the EDFs, and the blue dots are the inlying data.

Figure 6. Sunrise ACE-FTS data for the same data subsets as Fig. 4. The red circles are data that have been determined to be unnatural outliers as per the 15 day running median and MeAD, and the blue dots are data that have been determined to be inliers.

Figure 7. All Antarctic ACE-FTS data for H2O at 17.5 km. The red circles are data that have been determined to be unnatural outliers following two different methods (sunrise and sunset data were analysed separately), and the blue dots are data that have been determined to be inliers. (a) Unnatural outliers determined using the 15 day running MeAD. (b) Unnatural outliers determined using the 15 day running MAD. Unnatural outliers as determined by the EDFs are not shown and were not used in the analysis.

Figure 8. The final inlying data (blue dots) and unnatural outliers (red dots) for all ACE-FTS HCN data at 9.5 km (left) and SF6 data at 19.5 km (right). The top panel shows all data, the middle panel is the same as the top panel but zoomed in for clarity, and the bottom panel is all data, excluding the unnatural outliers.

Figure 9. The final inlying data for ACE-FTS: (a) Arctic NO at 55.5 km; (b) mid-latitude H2O at 14.5 km; and (c) Arctic HCl at 18.5 km.

Table 1. Percent rejection of ACE-FTS level 2 v3.5 profiles that contain one or more detected unnatural outliers (either by running MeAD or EDF).

Table 2. Definition of flag values associated with ACE-FTS level 2 data.