Critical evaluation of the MODIS Deep Blue aerosol optical depth product for data assimilation over North Africa

Moderate Resolution Imaging Spectroradiometer (MODIS) Deep Blue (DB) collection 5.1 (c5.1) aerosol optical depth (AOD) data were analyzed and evaluated for the first time by an independent research group, using eight years of Terra (2000–2007) and Aqua (2002–2009) data. Uncertainties in the DB AOD were identified and studied, and our results show that the performance of DB c5.1 is strongly dependent on surface albedo and aerosol microphysics. Using data with only “very good” quality assurance, the root-mean-square error (RMSE) of the DB Terra (Aqua) AOD is 0.24 (0.19) when validated against AERONET. Expanding upon the uncertainty analysis, the potential of applying the DB products for aerosol assimilation was explored. Empirical corrections and quality assurance procedures were developed for North Africa and the Arabian Peninsula to create a data assimilation (DA)-quality DB product. After applying those procedures, the RMSE is reduced by 18.1 % (18.2 %) for Terra (Aqua) DB data. Prognostic error models of 0.069 + 0.175 × AOD Terra DB with no noise floor and 0.048 + 0.182 × AOD Aqua DB with a noise floor of 0.104 were found for DA-quality Terra and Aqua DB data, respectively. These procedures were also applied to two months of DB collection 6 (c6) AOD data, and reductions in RMSE were found, indicating that the algorithms developed for c5.1 data are applicable to c6 data to some extent.


Introduction
Numerical weather prediction of aerosol phenomena has been implemented for air quality and visibility (Lelieveld et al., 2002; Park et al., 2003; Reid et al., 2004, 2009; Al-Saadi et al., 2005; Hollingsworth et al., 2008). Recent studies have shown that satellite aerosol retrievals can be effectively used, through data assimilation, to improve the accuracy of aerosol analyses and forecasts (e.g., Zhang et al., 2008, 2011; Benedetti et al., 2009; Sekiyama et al., 2010; Campbell et al., 2010). The operational MODIS Dark Target (DT) products in particular are attractive for assimilation as they provide aerosol retrievals over global oceans and most land areas with near-daily coverage. However, due to the high surface reflectance, traditional DT retrievals fail over bright surfaces such as the Saharan and Gobi deserts (Remer et al., 2005). This leaves large spatial gaps in the aerosol optical depth (AOD) record in desert regions, some of which host the largest aerosol loadings in the world. While other sensors such as the Multi-Angle Imaging Spectroradiometer (MISR) and the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO; Winker et al., 2009) can retrieve over bright surfaces, their limited swath and delayed data processing reduce their efficacy in aerosol forecasting applications.
Because arid regions tend to have lower surface reflectance at shorter wavelengths, the traditional DT method can often be successfully applied at blue wavelengths. The Deep Blue (DB) algorithm takes advantage of this surface phenomenology, performing aerosol retrievals at blue wavelengths (such as the 0.47 µm spectral channel in MODIS) and utilizing the selected aerosol model in the inversion to generate AOD (Hsu et al., 2004, 2006). The DB methodology has been successfully applied to both MODIS instruments and SeaWiFS to allow for large swath coverage for aerosol retrievals over and around desert regions (Hsu et al., 2004, 2006). DB has shown that AOD can be retrieved with tolerable uncertainties, even over deserts and semi-arid regions, where traditional DT methods applied to mid-visible and red wavelengths have difficulties (Shi et al., 2011b; Li et al., 2012). This has allowed DB to be applied to such sensitive applications as source function development (e.g., Ginoux et al., 2010).
While filling a significant data gap, the use of DB specifically in data assimilation applications requires the development of a prognostic error model. That is, a realistic and scene-dependent uncertainty needs to be assigned to every retrieval. Such errors are not commonly reported by aerosol retrieval developers. Instead, bulk global uncertainties are given, often expressed as an error range and a fraction of retrievals falling within that range (e.g., MODIS Dark Target (DT) over-land AOD has an expected error range of ±0.05 ± 0.15 × AOD, and roughly two-thirds of MODIS DT collection 5.1 (c5.1) AOD fall within that error range; Levy et al., 2005). Given that uncertainty is well known to be related to spatially correlated features such as land surface albedo and aerosol microphysical properties, the use of a single uncertainty value can result in large errors in models during assimilation. The inclusion of data from a region with poorly constrained lower boundary conditions could, for example, result in a fictitious "aerosol plume" in a model forecast. Hence, one necessary and unavoidable step before applying a satellite aerosol product to aerosol data assimilation is an independent evaluation of uncertainties of the product, including an assessment of both random and systematic errors (e.g., Zhang and Reid, 2006, 2010; Kahn et al., 2009; Hyer et al., 2011; Shi et al., 2011a). Data assimilation (DA) oriented products with reduced bias and more realistic descriptions of uncertainty have been generated from several different aerosol products through detailed analysis of retrieval uncertainties. For example, the data assimilation quality (DA quality) operational MODIS c5.1 products over both land and ocean are used for operational aerosol forecasting (Zhang and Reid, 2006; Shi et al., 2011a; Hyer et al., 2011). NASA GMAO performs its own retrievals based on machine learning, as standard products were of insufficient quality for assimilation (A. da Silva, personal communication, 2011). ECMWF similarly has a series of quality control processes. To date, however, arid region retrievals are not operationally assimilated.
In this study the DB aerosol products were evaluated and their uncertainty sources were investigated with a focus on North Africa and the Arabian Peninsula, the world's largest contiguous dust belt. Following Zhang and Reid (2006) and Hyer et al. (2011), this study applied a series of procedures to remove outliers and reduce systematic bias in DB aerosol products. The uncertainties of the data were examined as functions of their main sources, such as boundary conditions, observation conditions, and aerosol microphysics. Empirical studies and quality control procedures were applied to create quality-assured DB level 3 aerosol products suitable for data assimilation.

Data
The DB algorithm retrieves AOD and other ancillary parameters over visibly bright surfaces by taking advantage of dark-surface properties at blue channels (0.412, 0.47 µm) and weak absorption of dust at the red channel (0.65 µm) (Hsu et al., 2004). The climatological surface albedo, built from a cloud-free surface reflectance database over arid and semi-arid areas, is used in the retrieval process (Hsu et al., 2006). These surface albedo data, together with a set of models describing aerosols with different optical properties, are used as input to a radiative transfer simulation to generate lookup tables (LUTs) describing the observed satellite radiance at 0.412, 0.47, and 0.65 µm as a function of AOD at 0.55 µm, aerosol type, and surface albedo. Using a maximum likelihood method, the optimal combination of aerosol models is selected by matching the 1 km observed radiance to the LUT values. For pure dust aerosol cases, AOD and single-scattering albedo are reported at 0.412 and 0.47 µm, while for mixed aerosol cases the AOD and Ångström exponent are reported (Hsu et al., 2004). The DB algorithm is applied to the 1 km cloud-free MODIS pixels, and then these 1 km retrievals are aggregated into the 10 km resolution data (Hsu et al., 2004). This is different from the standard MODIS products, where radiances are aggregated to 10 km × 10 km at nadir and then the retrieval processes are applied. To identify cloud-free pixels, in addition to applying the cloud screening method following the MODIS cloud mask algorithm (Hsu et al., 2004) on the original 1 km pixels, DB also uses the AOD spatial variance computed over every 3 × 3 pixels to remove potentially cloud-contaminated pixels. The DB absorbing aerosol index (AI) is then used to retain pixels with heavy aerosol loading that are misidentified as cloudy pixels by the MODIS cloud masking algorithm (Hsu et al., 2004). The DB AI detects changes in wavelength-dependent reflectance from Rayleigh scattering due to aerosol absorption (Hsu et al., 2004), and thus can be used to distinguish heavy UV-absorbing aerosol plumes from clouds. The DB data include a quality assurance (QA) flag that labels the data into three categories: "none", "good", and "very good". The DB data also include other ancillary parameters such as viewing/scattering angles, solar zenith/azimuth angles, surface albedo, and number of pixels used, all of which were used in this study for evaluation purposes.
MODIS c5.1 DB data are currently available for 2002–2011 from Aqua, and 2000–2007 from Terra due to the known calibration issues. The spatial coverage of the data includes North Africa, the Arabian Peninsula, part of Central Asia, India, Australia, the western US, and the Andes Mountains. The spatial resolution of the data is 10 km at nadir and the revisit time is about one to two days. Compared to MISR, which also retrieves aerosol properties over bright surfaces, DB has a much wider spatial coverage and a more frequent revisit time. The uncertainties of DB AOD retrievals are listed as ±0.05 ± 20 % × AOD AERONET (Hsu et al., 2006; Huang et al., 2011). This study is based on comparisons of MODIS DB c5.1 and AERONET AOD, coupled with a contextual analysis of retrieved aerosol features. The quality-assured level 2.0 Aerosol Robotic Network (AERONET) AOD data with a stated uncertainty of 0.015 were used as the "ground truth" (Holben et al., 1998). Eight years of AERONET AOD data were collocated in space and time with Aqua DB (2002–2009) and Terra DB (2000–2007), following the method described in Shi et al. (2011a). Using this method, a pair of DB and AERONET AOD data samples are considered collocated if the temporal difference between the two data samples is within ±30 min, and the spatial distance is within 0.3° (latitude/longitude). Note that, since 0.55 µm is the primary wavelength used for data assimilation, only DB AOD data at this wavelength were used in this study. However, no AERONET data are available at this spectral channel. Therefore, AERONET data from the 0.50 and 0.67 µm spectral channels were interpolated to derive AOD values at the 0.55 µm channel following O'Neill et al. (2003).
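The collocation window and the spectral interpolation described above can be sketched as follows. This is a minimal illustration, not the procedure of Shi et al. (2011a) or O'Neill et al. (2003): it assumes a simple power-law (Ångström) interpolation in log-log space between the 0.50 and 0.67 µm channels, and it assumes that the 0.3° criterion is applied separately to the latitude and longitude offsets. The function names `interpolate_aod` and `is_collocated` are hypothetical.

```python
import math

def interpolate_aod(aod_lo, aod_hi, wl_lo=0.50, wl_hi=0.67, wl_target=0.55):
    """Interpolate AOD to wl_target (in µm), assuming a power-law spectral dependence."""
    if aod_lo <= 0 or aod_hi <= 0:
        raise ValueError("log-log interpolation needs positive AOD values")
    # Angstrom exponent between the two bracketing channels
    alpha = -math.log(aod_hi / aod_lo) / math.log(wl_hi / wl_lo)
    return aod_lo * (wl_target / wl_lo) ** (-alpha)

def is_collocated(dt_minutes, dlat_deg, dlon_deg, max_dt=30.0, max_deg=0.3):
    """True if a DB/AERONET pair falls inside the time and distance windows.

    Assumption: the 0.3 degree limit is applied to latitude and longitude
    offsets independently (a box), not to a great-circle distance.
    """
    return (abs(dt_minutes) <= max_dt
            and abs(dlat_deg) <= max_deg
            and abs(dlon_deg) <= max_deg)

# Hypothetical AERONET values at 0.50 and 0.67 µm, for illustration only
aod_550 = interpolate_aod(0.50, 0.30)
print(round(aod_550, 3), is_collocated(20.0, 0.1, -0.2))
```

By construction the interpolated value reproduces the input AOD at either bracketing wavelength, which is a quick sanity check on the implementation.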

Evaluations
In this section the general performance of DB is described, along with the sources of uncertainties in the DB products with respect to observing conditions and the QA flags provided by the DB products. Details of the evaluation procedures are illustrated in Fig. 1. The four main steps are (1) evaluating the performance of the DB products with respect to QA flags included in the datasets, (2) studying the uncertainties of the DB products as functions of observation conditions, (3) assessing the uncertainties of the DB products in relation to the spatial variations of AOD and surface albedo, and (4) developing empirical correction procedures. In the second step, the performance of the DB AOD data was analyzed as a function of various parameters including lower boundary conditions, viewing geometry, cloud contamination, aerosol microphysical properties, and other observing conditions. After applying the empirical correction steps, both 1/4° and 1° (latitude/longitude) DA-quality DB AOD products were generated; the 1/4° products were generated for evaluation purposes only. All analyses were conducted for both Terra and Aqua DB products; however, in most cases, only analyses from Aqua DB data are shown, as similar structures are found for the Terra DB product. The analyses for the Terra DB product are provided in the Supplement unless specifically mentioned.

Overall nature of the Deep Blue product
This section starts with a simple global evaluation of the DB product, and then describes the selection of areas with sufficient collocated AERONET and DB data for further evaluation. Figure 2 shows the global comparisons of the collocated Aqua DB and AERONET AOD with respect to different QA flag settings. The fractional data density is shown in Fig. 2 for every 0.5 increment of AOD for both AERONET and DB. This figure displays the traditional method of evaluating satellite data against AERONET, which is used to diagnose the uncertainties in the dataset. The regression equation AOD DB = b + a × AOD AERONET is "diagnostic" and describes the quality of the retrieval against a more accurate reference dataset (in our case, AERONET AOD, AOD AERONET). By contrast, the regression equation AOD AERONET = b + a × AOD DB is "prognostic" and describes the linear transformation that will produce values that are closest to the reference data. In this study diagnostic regression is used to capture data characteristics, and prognostic regressions are used to develop correction factors and uncertainty estimation models.
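The distinction between the two regressions can be made concrete with a short sketch. It assumes ordinary least squares for both fits (the fitting method is not specified above) and uses made-up collocated pairs; `fit_line` is a hypothetical helper, not part of any DB or AERONET software.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical collocated pairs, for illustration only
aod_aeronet = [0.10, 0.25, 0.40, 0.80, 1.20]
aod_db = [0.12, 0.20, 0.45, 0.70, 1.30]

# Diagnostic: AOD_DB = b + a * AOD_AERONET (how well does DB reproduce the reference?)
diag_slope, diag_intercept = fit_line(aod_aeronet, aod_db)

# Prognostic: AOD_AERONET = b + a * AOD_DB (how should a DB value be corrected?)
prog_slope, prog_intercept = fit_line(aod_db, aod_aeronet)

print(diag_slope, diag_intercept, prog_slope, prog_intercept)
```

With least squares, the prognostic line is not the algebraic inverse of the diagnostic line unless the two datasets are perfectly correlated: the product of the two slopes equals r², so it is at most one.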
This study makes extensive use of root-mean-square errors (RMSE), which are calculated using Eq. (1) and represent the overall deviation of the evaluated datasets from the ground truth. The uncertainty estimation model, following Zhang and Reid (2006), is based on a prognostic equation to estimate RMSE as a function of DB AOD. Development of this uncertainty estimate is discussed in Sect. 5. As Fig. 2 shows, DB AOD values have an RMSE of 0.234 with respect to AERONET AOD globally, an r² value of 0.52, and a slope of 0.87 for all available data. Note that this RMSE is probably a reflection of the data from the highest AOD range. A total of 42.8 % (14 023) of DB AOD data points fell outside the reported uncertainty range, defined by ±0.05 ± 20 % × AOD AERONET (Huang et al., 2011). When only data with a QA of "very good" are used, the RMSE drops to 0.207, r² increases to 0.75, the slope changes to 0.83, and the fraction of outliers drops to 31.7 % (1038). Although the regression slopes in Fig. 2 show little dependence on QA flags, the 11.5 % decrease in RMSE and the 11.1 percentage-point decrease in outliers from QA flags equal to "none" to "very good" show that higher quality data are selected when using the "very good" QA flag. However, this improved performance comes with an 84.3 % data loss.
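The two statistics quoted above can be computed as in the following sketch. It assumes the conventional definition of RMSE (the square root of the mean squared DB minus AERONET difference) and treats a retrieval as an outlier when its absolute error exceeds 0.05 + 0.20 × AOD AERONET; both function names are hypothetical.

```python
import math

def rmse(aod_sat, aod_ref):
    """Root-mean-square error of satellite AOD against reference AOD."""
    n = len(aod_sat)
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(aod_sat, aod_ref)) / n)

def fraction_outside_envelope(aod_sat, aod_ref, offset=0.05, frac=0.20):
    """Fraction of retrievals with |sat - ref| > offset + frac * ref."""
    outside = sum(1 for s, r in zip(aod_sat, aod_ref)
                  if abs(s - r) > offset + frac * r)
    return outside / len(aod_sat)

# Hypothetical matched pairs, for illustration only
db = [0.10, 0.50, 0.35, 0.90]
aeronet = [0.12, 0.20, 0.30, 1.00]
print(rmse(db, aeronet), fraction_outside_envelope(db, aeronet))
```

Because the squared differences are averaged, a few large errors at high AOD can dominate the RMSE, which is consistent with the remark above that the global RMSE probably reflects the highest AOD range.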
The performance of the DB AOD retrievals, however, shows a regional dependence, particularly in regard to slope. This is suggestive of microphysical bias, but since the DB algorithm utilizes a precalculated surface reflectance database that is based on a minimum reflectivity technique (Hsu et al., 2004), the regional dependence of the DB retrieval performance could also be a function of surface albedo, as this study suggests. Using all available data with the "very good" QA flag, regional comparisons between Aqua DB and AERONET for nine selected regions were conducted as shown in Fig. 3, with Fig. 4 showing the domain of each area in a different color. As indicated in Fig. 3, only four regions (North Africa, Europe, East Asia, and West Asia) have more than 400 collocated data points, which is sufficient for an evaluation study with respect to various observing conditions. The remaining regions (western North America, eastern North America, South America, southern Africa/Sub-Saharan Africa, and Australasia) either have a small number of collocated Aqua MODIS and AERONET data points or have a larger scatter in the data distribution. Of the nine selected regions, the best performance of DB data is found over North Africa, with a slope of 1.16, an r² value of 0.81, and an AOD RMSE of 0.19 between DB and AERONET AOD. However, high bias occurs when AOD is greater than one, which could be caused by multiple scattering. Contrary to the overestimation of AOD values over the North Africa region, an underestimation of AOD values is found for DB retrievals over Asia, with a higher RMSE of 0.21 for West Asia and 0.29 for East Asia. Regions other than North Africa either have very few collocated DB and AERONET data points, or have a much larger scatter between satellite and AERONET AOD values. The diagnostic and prognostic RMSE models were built for the regions in Fig. 3 with more than 400 data points, namely Europe, North Africa, East Asia and West Asia (Fig. 5a and b). The RMSE models were created using the same binning method for all of the components within each panel. The corresponding mean AERONET AOD for all the data points in each bin was plotted as the bin's x-axis value. Europe, shown in black in Fig. 5a and b, has low RMSE at low AERONET AOD, but higher RMSE at low DB AOD. This can be explained if DB is systematically underestimating AOD in this region, a possibility we will examine later in this section. Because of the limited data volume and range of retrieved AOD in the matched datasets, only the North Africa and Arabian Peninsula regions (hereafter "the study region") were used to construct the DA-quality DB products. These regions will be the main focus of discussion in this paper.
Focusing on the study region, the diagnostic RMSE analysis as a function of AERONET AOD was performed for all data and for data with QA flag values of "good" and "very good" (Fig. 5c and d). For all available data and data with "good" QA flags, the RMSE values from Aqua and Terra are very similar in both magnitude and pattern. When AERONET AOD values are smaller than 0.8, the RMSE values from both sensors remain relatively constant. Above this value the RMSE increases as AERONET AOD increases. With strict QA flag filtering, the RMSE values of DB AOD reduce to approximately 0.1 for AERONET AOD below about 0.4, with a larger reduction of RMSE shown in Aqua data.
As shown in Figs. 2 and 3, the QA flag is necessary for highlighting retrievals that are the most "trustworthy" (Hsu et al., 2004). However, there are limitations in using data with only "very good" QA flags. For example, using the QA flag also introduces artifacts in the AOD spatial distribution. Figure 6 shows the daily spatial distribution of DB AOD over the study region for 1, 2, and 3 May 2006, with all available data on the left panel, and data with only QA flags of "very good" on the right panel. For all three days, two patterns can be observed consistently from the right panel: (1) retrievals in the center of the swaths are removed because of the large scattering angles (C. Hsu, personal communication, 2012); (2) the number of retrievals is largely reduced south of 13° N, and a significant portion of low AOD retrievals are excluded by the "very good" QA flags. When averaged over a one-year period (Fig. 7), the second pattern shows up as a near-linear feature, indicated by much higher AOD values for "very good" data below 13° N (Fig. 7a and b). This pattern is introduced by a significant reduction in the number of retrievals, especially low AOD retrievals as shown in Fig. 6, when applying the "very good" QA filters (Fig. 7c). This reduction in data samples was caused by artificial thresholds in the DB retrieval algorithm on the number of pixels used in the retrieval process. Despite the disadvantage of applying "very good" QA flags, only DB data with the "very good" QA flags were used hereafter because of the reduced error in these data, and because of systematic bias in AOD values with other QA flags (see Sect. 3.2.1).

Detailed analysis for DB over North Africa and Southwest Asia
A series of analyses were performed to investigate the sources of uncertainty in the DB AOD product, including angular dependence, aerosol microphysics, surface reflectance, and other observing conditions. Aerosol layer height and surface elevation are possible uncertainty sources for retrieving aerosol using shorter wavelengths. For example, Hsu et al. (2004) mentioned that a ±2 km variation in aerosol plume height could introduce a 25 % uncertainty in AOD at 412 nm and 5 % at 490 nm. "Very good" quality data were used to conduct most of the analyses, except the analysis of angular influences, because the behavior differs between all available data and data with "very good" quality. Although most discussions focus on the study region only, a global analysis is performed for the aerosol microphysics studies (Sect. 3.2.2), as insufficient numbers of fine-mode aerosol retrievals are available in the study region.

Angular dependence
An interesting discrepancy between the DB AOD with and without QA flag filtering was discovered for the angular dependency of the AOD bias. For data with a QA flag equal to "very good", no systematic bias (AERONET AOD minus MODIS DB AOD, denoted AOD A−M) is found as a function of viewing zenith angle (θ). However, with all data, there is a strong relation between increasing viewing zenith angle and increasing AOD A−M. Figure 8 shows the average difference between DB and AERONET AOD (AOD A−M) at 0.55 µm as a function of θ over the study region. As θ values increase, the AOD A−M changes from −0.07 to about zero, indicating a smaller bias for a larger θ value. However, this relationship between AOD A−M and θ is nonexistent when the "very good" QA flag filtering is applied (Fig. 8b). Similar patterns were found for scattering angle, but are not shown in this paper. The influence of the viewing angle was then decoupled from that of albedo at 0.412 µm. It is shown that, when the surface is relatively bright (albedo between 5 and 11 %), the influence from the viewing angle is minimized. When the surface is dark (albedo smaller than 5 %), the bias of AOD varies with viewing angle for all available data.

Aerosol microphysics
Four aerosol microphysical parameters were evaluated for their impacts on the retrieval bias under cloud-free conditions. The four parameters were the Ångström exponent and single-scattering albedo (ω) from the DB product, the fine-mode fraction (η) calculated from AERONET data using the spectral deconvolution method of O'Neill et al. (2003), and the aerosol type flag included in the DB QA flag. Among all the parameters, investigations showed that the DB AOD errors are most sensitive to η. Only one third of the aerosol retrievals over the study region have η > 0.5, and all data from the matched dataset with η < 0.5 are from the study region.
Figure 9 shows the scatter plot of DB vs. AERONET AOD for two η ranges: η < 0.5 (Fig. 9a) and η > 0.5 (Fig. 9b). Underestimation of AOD is found for coarse particles with η < 0.5, and an overestimation is found for fine particles with η > 0.5 globally. Consistent relationships are also found over the study region. Since nearly two-thirds of DB aerosol retrievals in the matched dataset over the study region have η < 0.5, it is likely that AOD over the study region as a whole is underestimated by DB.
Although convincing trends are found with respect to η, a parameter that is included in the DB products needs to be selected and used for the empirical corrections described in a later section. Thus, other microphysical parameters, including the Ångström exponent and ω, were also examined, site by site and seasonally. However, no significant trends are found for these two parameters. A comparison was made between the retrieved Ångström exponent and the AERONET-derived η; no relation between the two parameters was found. Note that the DB Ångström exponent is predefined by the aerosol models contained in the lookup table. Therefore, the DB Ångström exponent will not necessarily relate to the AERONET-derived η. Finally, instead of using the externally calculated η from AERONET, the aerosol type flag, a parameter that is included in the DB products, was used to represent the aerosol microphysics in the empirical correction step (see Sect. 4).

Surface reflectance
The DB algorithm utilizes a precalculated surface reflectance database that follows a minimum reflectivity technique (Hsu et al., 2004). Therefore, it is necessary to evaluate the influence of the static albedo on AOD retrievals. Also, as mentioned in Sect. 3.2.2, AOD can be affected by inaccurate assumptions of aerosol microphysical properties in the retrieval process. To decouple the effects of aerosol microphysics and surface albedo on AOD, the surface-albedo-related DB AOD bias was investigated as a function of aerosol type and fine/coarse aerosol modes. Again, global data were used to observe the fine-mode aerosol performance, and the coarse-mode particle analyses are the same for the study region.
For all analyses, the collocated DB and AERONET AOD data were separated into four groups based on DB surface albedo (α) at a wavelength of 0.412 µm. The four albedo ranges are 0–5 %, 5–8 %, 8–11 % and above 11 %. Figure 10 shows the spatial distribution of the selected albedo ranges over the study area. As illustrated in Fig. 10, areas with albedo values higher than 11 % are located over the white sand deserts, and regions with albedo values lower than 5 % are located over semi-vegetated areas. The influences of surface albedo as well as η on DB AOD data are shown in Fig. 11.
Here again, all collocated DB and AERONET data are included as there are insufficient fine-mode AOD retrievals over the study area. The left panels of Fig. 11 show that for η < 0.5 (coarse mode), when α is less than 11 %, an underestimation in satellite AOD is observed, and a strong nonlinear trend is found. The magnitude of the underestimation is reduced when α increases from 5 to 11 %. For η > 0.5 (fine mode), however, an overestimation is found for low-albedo ranges, but not for the 8–11 % albedo range (Fig. 11, right panels). In general, for coarse-mode aerosols, a higher albedo results in a smaller underestimation, and for fine-mode aerosols an opposite pattern is observed. Also illustrated in Fig. 11, large scatter is found between DB and AERONET AOD when surface albedo (0.412 µm) values are greater than 11 % for both η > 0.5 (fine mode) and η < 0.5 (coarse mode) cases. Figure 11 highlights the necessity of decoupling the surface and aerosol microphysical factors for empirical corrections.
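The stratification by surface albedo and aerosol mode can be sketched as below. The grouping helpers are hypothetical; the sketch assumes that a value exactly on an albedo bin edge goes to the upper bin and that η = 0.5 is treated as coarse mode, neither of which is specified above. Bias is taken as AERONET minus DB, matching AOD A−M.

```python
from bisect import bisect_right
from collections import defaultdict

ALBEDO_EDGES = [5.0, 8.0, 11.0]  # per cent, DB surface albedo at 0.412 µm
ALBEDO_LABELS = ["0-5 %", "5-8 %", "8-11 %", ">11 %"]

def albedo_group(albedo_percent):
    """Map an albedo value to one of the four ranges (edge values go to the upper bin)."""
    return ALBEDO_LABELS[bisect_right(ALBEDO_EDGES, albedo_percent)]

def aerosol_mode(eta):
    """Fine mode if the fine-mode fraction exceeds 0.5, otherwise coarse (assumption at 0.5)."""
    return "fine" if eta > 0.5 else "coarse"

def mean_bias_by_group(records):
    """records: iterable of (aod_aeronet, aod_db, albedo_percent, eta) tuples."""
    sums = defaultdict(lambda: [0.0, 0])
    for aod_aeronet, aod_db, albedo, eta in records:
        key = (aerosol_mode(eta), albedo_group(albedo))
        sums[key][0] += aod_aeronet - aod_db  # AERONET minus DB
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical matched retrievals, for illustration only
records = [(0.5, 0.4, 3.0, 0.2), (0.7, 0.5, 4.0, 0.3), (0.2, 0.3, 9.0, 0.8)]
print(mean_bias_by_group(records))
```

A positive mean in a group indicates DB underestimation relative to AERONET, and a negative mean indicates overestimation.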

Observing conditions
Cloud contamination is one of the potential sources of uncertainties for satellite aerosol products. However, 93 % of retrievals with "very good" QA are free of MODIS-detected cloud. The error statistics of the remaining 7 % do not show significant differences, and do not demonstrate the systematic offset in AOD shown in the MODIS dark-target over-land product (Hyer et al., 2011).
Surface elevation is another potential source of uncertainties when using the blue wavelength for retrieval. The relationship between AOD A−M and the surface elevation of the AERONET stations was studied as a function of AERONET AOD. However, no significant trend was found between surface elevation and AOD A−M. Yet, such a study may be biased, as only a limited number of AERONET sites are located at high elevation.
DB products also contain a parameter that records the number of 1 km level 1b MODIS reflectance pixels used in creating the 10 km resolution AOD retrievals. The quality of the DB retrievals was checked with respect to this parameter, and a noticeable high bias in AOD A−M of 0.11 was found when all of the 1 km pixels are used in the retrieval process, as shown in Fig. 12. The DB data have a low bias over most of the scenarios except when the number of pixels used is around 60–80. The pattern of AOD A−M increasing when 100 pixels were used is also found in Terra. However, for the rest of the scenarios, there is no systematic low bias found (see Fig. S6 in the Supplement).

Statistical analysis for spatial variations
In Sect. 3.2, sources of physical-based uncertainties of the DB AOD have been identified. The DB aerosol data are reported at a spatial resolution of 10 km, and therefore the regional variations of surface albedo and aerosol optical properties within the 10 km domain could also affect the accuracy of the DB AOD values, as illustrated by Eq. (2). Equation (2) shows the relationship between the uncertainties in DB AOD values and three main contributors: (1) regional variations of AOD (STE AOD), (2) regional variations of surface albedo (STE sfc), and (3) physically based uncertainties as described in Sect. 3 (physical parameters, or PP).
Here, STE x represents the spatial variance of parameter x and is defined as the standard error of component x, calculated from the sample size N, the individual sample values x i, their expected value µ, and the standard deviation σ. The standard error is calculated using a 3 × 3 (approximately 30 km × 30 km) moving window around a given aerosol retrieval.
The goal of this study is to evaluate potential sources of uncertainties in the DB aerosol products, and to develop quality assurance steps and empirical methods to minimize bias and noise. Therefore, the first two terms on the right-hand side (RHS) of Eq. (2) need to be studied and removed for the further development of empirical correction methods. It is difficult to completely decouple the three terms listed on the RHS of Eq. (2). However, it is possible to identify scenarios that minimize the first two terms, as shown in Fig. 13.
Figure 13 shows the analyses of normalized AOD bias (AOD A−M divided by DB AOD) as a function of STE sfc with respect to surface reflectance (Fig. 13a), DB AOD (Fig. 13b), and aerosol type (Fig. 13c). Figure 13a shows that for darker surfaces (albedo lower than 8 %), the variation of STE sfc is low. Higher STE sfc values are found over regions with brighter surfaces (e.g., 8 % < albedo < 11 %), especially when the normalized aerosol bias becomes negative. Figure 13b suggests that larger STE sfc values correspond to regions with low AOD values. When the normalized aerosol bias reaches −1.0, the largest mean values of STE sfc correspond to AOD values smaller than 0.25. When separating the STE sfc based on aerosol type, the STE sfc of smoke particles oscillates around 0.0015, while those of "mixed" and "dust" particles fluctuate at much larger values and reach 0.003. This indicates that both "mixed" and "dust" aerosol retrievals contain data that are largely biased by STE sfc.
Similar analyses were conducted for STD AOD as functions of surface reflectance and aerosol type. However, no significant trend was found. Figure 14 was introduced to show the STD AOD as a function of AOD. Although globally an increasing trend is found between STD AOD and AOD (Fig. 14a), over the study region the STD AOD is nearly invariant with respect to AOD other than when AOD is smaller than 0.1 (Fig. 14b). An STD AOD cutoff has been used as a method to exclude cloud-contaminated pixels (e.g., Shi et al., 2011a). Figure 14b suggests that a flat STD AOD cutoff can be applied to the study region, which is done in the next section. Section 4 describes how scenarios with significant contributions from STE sfc and/or STE AOD were identified and removed as part of the QC procedures.

Figure 8. The differences in AOD between Aqua DB and AERONET as a function of viewing angle over the study region for (a) total AOD without QA filter, and (b) AOD with "very good" QA. Data were averaged for every 10° viewing angle (except for 10° to 30° in Fig. 8b) and one-standard-deviation bars are shown.

Development of QA/QC procedures for DA-quality DB over North Africa and Southwest Asia
Based on the discussions in Sect. 3, Level 3 DA-quality DB data over the study region were constructed in two steps. First, noisy data were removed using various filters, including QA flags, a standard error check, and buddy checks over the study region. Table 1 lists all the filtering criteria with the corresponding data loss. Next, empirical corrections were applied based on the aerosol microphysical properties and surface conditions. During the standard error check, scenarios with significant contributions from STE sfc and STE AOD were identified. Among nine cases for three STE sfc ranges (0.00-0.001, 0.001-0.002, and 0.002-0.004) and three STE AOD ranges (0.0-0.01, 0.01-0.03, and 0.03-0.05), large scatter is found for STE AOD ranging from 0.03 to 0.05. Therefore, to filter out data with large spatial variations in either AOD or surface albedo, only data with STE sfc less than 0.004 and STE AOD less than 0.03 were used to construct DA-quality DB AOD data.
Following the STE AOD filtering, a buddy check was performed. This test searches for adjacent retrievals and rejects any retrieval without an adjacent retrieved AOD. It is designed to detect isolated retrievals and is aimed at removing retrievals that occur between clouds and are subject to cloud contamination. Also, retrievals within the geographical range of 10° S to 13° N and 12° W to 25° E were excluded due to the spatial AOD bias related to the QA flag, as discussed in Sect. 3.1.
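The screening steps above can be illustrated with a minimal sketch. It assumes a regular grid in which missing retrievals are stored as NaN and treats the eight surrounding cells as "adjacent"; both choices, as well as the function names, are illustrative assumptions rather than the implementation used in this study. The thresholds and the excluded latitude/longitude box are those quoted in the text.

```python
import math

STE_SFC_MAX = 0.004  # STE sfc threshold quoted in the text
STE_AOD_MAX = 0.03   # STE AOD threshold quoted in the text


def passes_ste_check(ste_sfc, ste_aod):
    """Keep a retrieval only if both standard errors are below the thresholds."""
    return ste_sfc < STE_SFC_MAX and ste_aod < STE_AOD_MAX


def in_excluded_box(lat, lon):
    """True inside the excluded region (10° S to 13° N, 12° W to 25° E)."""
    return -10.0 <= lat <= 13.0 and -12.0 <= lon <= 25.0


def buddy_check(grid):
    """Return a copy of `grid` with isolated retrievals set to NaN.

    `grid` is a list of rows of floats; NaN marks a missing retrieval.
    Assumption: "adjacent" means any of the 8 surrounding grid cells.
    """
    n_rows, n_cols = len(grid), len(grid[0])

    def has_buddy(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                    if not math.isnan(grid[rr][cc]):
                        return True
        return False

    return [
        [v if not math.isnan(v) and has_buddy(r, c) else math.nan
         for c, v in enumerate(row)]
        for r, row in enumerate(grid)
    ]
```

In this sketch a retrieval surrounded only by missing cells is dropped, while two neighbouring retrievals keep each other.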
As mentioned in Sect. 3.2.2, aerosol type was decoupled from the surface albedo for the purpose of empirical correction. Four aerosol species, defined by the aerosol type flag, are "mixed", "dust", "smoke", and "sulfur". Over the study region, no retrieval labeled as "sulfur" was found in the collocated dataset. Therefore, only retrievals with the aerosol type reported as "mixed", "dust", or "smoke" were discussed. Figures 15-17 show the comparisons between DB and AERONET AOD with decoupled aerosol type and albedo range (similar setting as Fig. 11). Empirical correction steps were established based on Figs. 15-17, but using DB AOD as the independent variable, for a total of nine scenarios. Three types of aerosols (mixed, dust, and smoke) for three ranges of albedo (low: 0-5 %, medium: 5-8 %, and high: 8-11 %) were considered. Coefficients (slopes and offsets) for the linear empirical correction equations are listed in Tables 2 and 3 for Aqua and Terra, respectively. Figures 15 to 17 show that both linear and nonlinear patterns exist between DB and AERONET AOD values. Linear corrections were, therefore, applied to the identified scenarios that showed linear relationships between satellite and AERONET AOD.
For low-albedo regions with mixed aerosol types (Fig. 15a), a nonlinear relationship is found between DB and AERONET AOD. Therefore, two linear corrections were made, one for AOD from 0.0 to 0.25 and one for AOD of 0.25 and above. Similarly, for dusty regions (as identified by the DB product) with a surface albedo (412 nm) range of 5-8 % (Fig. 16b), linear corrections were made for AOD ranges of 0.0 to 1.0 and 1.0 and above for Aqua (0.0 to 0.9 and above 0.9 for Terra). These corrections were based on linear regressions in prognostic analyses, which used DB AOD as the x-axis. When slopes from prognostic analyses are inversely proportional to slopes from diagnostic analyses, slope corrections were applied. In three cases the prognostic and diagnostic slopes are inconsistent, and no corrections were made for those scenarios: Aqua data over mixed aerosol and dust regions with albedo between 8 and 11 %, and Terra data over mixed aerosol regions with the same albedo range. As mentioned before, the coarse-mode aerosol is the dominant aerosol mode over the study region, and there is an insufficient number of collocated pairs of Aqua DB and AERONET data for smoke aerosol types. Therefore, one linear correction was applied to retrievals with the DB smoke aerosol type. We also excluded smoke aerosol retrievals for regions with DB-retrieved surface albedo values greater than 0.08.
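The structure of such a two-range linear correction, combined with the slope limit of 1.3 adopted in this study, can be sketched as follows. The breakpoint of 0.25 follows the low-albedo mixed-aerosol case in the text, but the slopes and offsets below are placeholders chosen only for illustration; they are not the coefficients of Tables 2 and 3, and the function and variable names are hypothetical.

```python
import math

SLOPE_LIMIT = 1.3  # upper limit on the slope correction, as stated in the text

# Each segment is (upper AOD bound, slope, offset).
# PLACEHOLDER coefficients for illustration only, NOT the values in Tables 2 and 3.
PLACEHOLDER_SEGMENTS = [
    (0.25, 1.0, 0.05),
    (math.inf, 1.5, -0.075),
]


def correct_aod(db_aod, segments=PLACEHOLDER_SEGMENTS):
    """Apply a piecewise-linear correction with DB AOD as the independent variable."""
    for upper, slope, offset in segments:
        if db_aod < upper:
            # The slope is capped so that no correction multiplies AOD by more than 1.3.
            return min(slope, SLOPE_LIMIT) * db_aod + offset
    raise ValueError("no segment covers this AOD value")
```

With these placeholder values, the second segment's nominal slope of 1.5 is reduced to 1.3 by the cap, which is the behaviour the slope limit is meant to enforce.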
Finally, slope corrections are limited to 1.3 for both Aqua and Terra DB data. This slope threshold is rather arbitrary and was applied to avoid significant corrections to the DB AOD. Details of the steps and parameters for the corrections mentioned above are included in Tables 2 and 3. Table 4 shows the sensitivity study concerning the arbitrary limitation of the slope corrections. For the selected slope limits of 1.1, 1.2, and 1.3, the smallest RMSE occurs when the slope correction limit is set to 1.3. Again, the main reason for limiting the slope correction is to avoid potential discontinuities in the data that are created by the application of large corrections.

Using the data screening steps and empirical correction procedures described in the previous section, the DA-quality DB AOD data were generated. In this section the accuracy of the newly generated data is evaluated through intercomparison with ground observations and through the prognostic and diagnostic models of the RMSE. The comparison of DB and AERONET AOD before and after the quality assurance and empirical correction steps is shown in Fig. 18 for Aqua and Terra over the study region in order to estimate the prognostic uncertainty. Reductions in both bias and noise are clearly visible for both DA-quality Terra and Aqua DB AOD data. The slopes of the regression between AERONET and the newly generated DB AOD are 0.88 and 0.87 for Aqua and Terra, respectively. The nonlinear features for both Aqua and Terra are weakened, but not eliminated, due to the restriction in the empirical corrections that the multipliers cannot exceed 1.3. The RMSE values were checked for three AOD ranges: total AOD, AOD greater than 0.5, and AOD greater than 1.0. For Aqua, after applying the QA steps and empirical corrections, the corresponding RMSE changes from 0.19 to 0.16 (18.1 % error reduction), from 0.33 to 0.24 (26.3 % reduction), and from 0.54 to 0.37 (32.3 % reduction). Similarly, for Terra, the corresponding RMSE changes from 0.24 to 0.17 (18.2 % error reduction), from 0.35 to 0.27 (22.9 % reduction), and from 0.55 to 0.35 (36.4 % reduction). The total data losses, calculated against the total number of retrievals with "very good" QA flags, are 28.5 % for Aqua and 44.5 % for Terra.
Figure 19 shows the RMSE of the new product as a function of DB AOD before and after all processes. The upper panels are for total AOD, while the lower panels show dust and mixed aerosol types separately. Smoke aerosol particles were not included due to insufficient data samples. In Fig. 19 the same binning methods were used for the original data and the corresponding DA-quality data. However, the binning methods vary between datasets (e.g., dust vs. mixed aerosol) due to their respective data distributions. Figure 19a shows two noise floor lines. The noise floor is defined as the RMSE value in the range where RMSE is invariant to AOD variations; it represents the basic RMSE introduced by the system. As Fig. 19a and c show, RMSE values are reduced for all AOD ranges after the correction processes. For total AOD less than 0.4, the noise floors of RMSE of the original and newly generated data are 0.113 and 0.104, respectively. Different trends are found for different aerosol types.
For example, the RMSE values increase as DB AOD increases for mixed-type aerosol particles. However, for dust particles the minimum RMSE appears around a DB AOD value of 0.3. This V-shaped RMSE distribution indicates a larger retrieval uncertainty for dust AOD values smaller than 0.3. Figure 19b and d show a similar analysis to Fig. 19a and c, but using Terra DB data. One distinct difference between Aqua and Terra is that no noise floor of RMSE is found for Terra data. In the prognostic analyses, a sudden increase in RMSE values is found at an AOD value around 0.5 (black dots in Fig. 19d). This sudden increase is due to outliers from the mixed type of aerosol particles in the high surface albedo case (see Fig. S9 in the Supplement for details). Generally, the RMSE analyses show that the newly generated DA-quality data have smaller RMSE values than the original data for both Aqua and Terra.
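The prognostic error models reported in the abstract (0.048 + 0.182 × AOD with a noise floor of 0.104 for Aqua, and 0.069 + 0.175 × AOD with no noise floor for Terra) can be evaluated with a short sketch. Treating the noise floor as a lower bound on the linear model, i.e., taking the larger of the two values, is an assumption made here for illustration, and the function names are hypothetical.

```python
def prognostic_rmse(db_aod, intercept, slope, noise_floor=0.0):
    """Expected RMSE of a DA-quality DB retrieval as a function of DB AOD.

    Assumption: the noise floor acts as a lower bound on the linear model.
    """
    return max(noise_floor, intercept + slope * db_aod)


def aqua_rmse(db_aod):
    # Coefficients and noise floor for DA-quality Aqua DB, as quoted in the abstract.
    return prognostic_rmse(db_aod, intercept=0.048, slope=0.182, noise_floor=0.104)


def terra_rmse(db_aod):
    # Coefficients for DA-quality Terra DB; no noise floor was found.
    return prognostic_rmse(db_aod, intercept=0.069, slope=0.175)
```

Under this assumption, the Aqua error estimate stays at the noise floor for small AOD and follows the linear model once the linear term exceeds 0.104.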
The level 3 quality-assured data were generated over the study region by spatially averaging the AOD data at a one-degree or a quarter-degree latitude and longitude resolution. Figure 20 shows the spatial plots of the original DB data, the "very good" QA quality DB data, and the newly generated data for Terra and Aqua separately for 2007. The main features are similar before and after the empirical corrections and QA procedures. When compared with DB data that have the "very good" QA flag, high AOD noise was reduced, and general AOD values were increased due to the correction of the nonlinear features. All data with surface albedo values exceeding 11 % were removed. Also, data for regions below 13° N were not included due to the QA filtering issue mentioned in Sect. 3.1. Figure 20 shows that Terra AOD is higher than Aqua AOD by approximately 0.1. Because dust aerosols have a diurnal feature, the difference in local overpass time between the two satellites may cause this difference. Also, Terra AOD has a larger bias, as shown in Figs. 5d and 19b, which can also contribute to this difference.
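The spatial averaging step can be sketched as a simple binning of individual retrievals into latitude/longitude cells. The sketch assumes an unweighted arithmetic mean per cell and identifies each cell by its south-west corner; the text does not specify the weighting, so both are illustrative assumptions, as is the function name.

```python
import math
from collections import defaultdict


def grid_average(lats, lons, aods, cell_deg=0.25):
    """Average AOD retrievals into cells of `cell_deg` degrees.

    Returns a dict mapping the south-west corner (lat, lon) of each
    non-empty cell to the unweighted mean AOD of the retrievals in it.
    NaN values (filtered retrievals) are skipped.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell index -> [sum, count]
    for lat, lon, aod in zip(lats, lons, aods):
        if math.isnan(aod):
            continue
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[key][0] += aod
        sums[key][1] += 1
    return {
        (i * cell_deg, j * cell_deg): total / count
        for (i, j), (total, count) in sums.items()
    }
```

Calling the function with `cell_deg=1.0` gives the one-degree product, and the default gives the quarter-degree product.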
Through an independent study, we have also evaluated the newly generated level 3 Aqua DB AOD data for 2010 and 2011, which are not included in the analyses described in Sects. 2-4. AERONET level 1.5 data were used instead of AERONET level 2.0 data, since level 2.0 AERONET data were not available from all sites over the study region for 2010 and 2011 when the study was conducted. Again, with the empirical correction and quality assurance steps, both bias and noise are reduced. The RMSE for the newly generated data is reduced by 11 %, from 0.227 to 0.202, and the r² changes from 0.74 to 0.77 for prognostic purposes (Fig. 21). Note that four outliers, shown as blue dots in Fig. 21, were manually removed from the analyses for both the original and DA-quality DB data.
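For reference, the RMSE against AERONET and the relative reduction quoted above can be computed as in the following sketch, which assumes collocated satellite and AERONET AOD values are supplied as equal-length sequences (the function names are illustrative). For the independent 2010-2011 evaluation, a change from 0.227 to 0.202 corresponds to a reduction of about 11 %.

```python
import math


def rmse(retrieved, truth):
    """Root-mean-square error of retrieved AOD against reference (AERONET) AOD."""
    if len(retrieved) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((r - t) ** 2 for r, t in zip(retrieved, truth)) / len(truth)
    )


def percent_reduction(before, after):
    """Relative reduction in RMSE, in percent."""
    return 100.0 * (before - after) / before
```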

Preliminary analysis using the collection 6 MODIS DB data
A new version of the MODIS DB product (collection 6, c6) is currently under development with a targeted release date of next year. C6 of the DB algorithm includes important updates to the retrieval process and the QA flag standards, resulting in important differences in the data product, including a large increase (roughly 2×) in the number of retrievals with "very good" QA flags. We therefore tested the algorithm developed in this study using two months (April 2006 and July 2008) of preliminary c6 DB data. When the empirical corrections and QA procedures that were developed based on DB c5.1 data were applied to DB c6 data with QA equal to "very good", and AERONET AOD was used as truth, the RMSE was reduced from 0.160 (preliminary DB c6) to 0.137 (modified DB c6) for total AOD and from 0.11 to 0.07 for AOD greater than 0.5 (Fig. 22). The slope of the comparison between DB c6 and AERONET AOD also became closer to one, increasing from 0.79 to 0.94. Similarly, a higher r² value of 0.809 was found, compared to 0.688 for the preliminary DB c6 data. This preliminary analysis indicates that the issues identified and the quality assurance steps developed in this study are partially applicable to DB c6 data.
Quarter-degree averaged AOD spatial distributions of DB c6 data were also plotted for April 2006 and compared to the DB c5.1 data distributions. Figure 23 includes an AOD monthly map over the study region for all available data, "very good" QA data, and DA-quality data for both c5.1 (left panels) and c6 (right panels). Several significant changes between the spatial distributions were observed. One significant change is that the AOD pattern in DB c6 data is smoother for all three categories of data, especially for "very good" quality data, because DB c6 contains twice the amount of data as DB c5.1. Lower AOD values were also observed in DB c6 when compared to c5.1. Comparing the modified and the preliminary DB c6 AOD, the modified DB c6 AOD values are smaller than the preliminary DB c6 values over regions with AOD less than 0.3, and the modified DB c6 data over regions that have very high surface reflectance were removed. Although detailed analyses are still required for DB c6 data, the statistics show that our method is robust and can reduce bias in DB c6 data. The analysis results provide useful information to the MODIS DB team and hopefully will be considered in the DB c6 product.

Conclusions
A thorough analysis with an emphasis on North Africa and Southwest Asia was conducted to evaluate the DB c5.1 aerosol products through the use of ground-based AERONET data. Retrieval biases and uncertainties were analyzed as functions of sampling and observation-related factors such as surface conditions, observation geometry, aerosol microphysics, cloud contamination, and other parameters that are used in the retrieval process. Updated quality assurance procedures, filtering processes, and empirical correction steps were developed for constructing new quality-assured DB products. Prognostic models were built for evaluating the newly developed data product against AERONET observations. Our findings include:
1. QA flags can be used to improve the quality of the DB AOD data. An important systematic bias in DB AOD as a function of viewing angle is eliminated by the use of the "very good" QA flag. However, both the data density and the geographic distribution of DB data are affected by the QA flag, and users of the product should be aware of this.
2. Particle size and surface albedo were identified to be significant to retrieval accuracies, and were highlighted and decoupled from the remaining parameters. For coarse-mode aerosols, the higher the surface albedo is, the lower the underestimation of DB AOD. For fine-mode aerosols, however, the higher the albedo is, the lower the overestimation of DB AOD.
3. New QA and empirical correction procedures were constructed, and new level 3 DB c5.1 products were created for future application in data assimilation. Reductions in RMSE, calculated using ground-based AOD from AERONET as truth, of 18.1 % and 18.2 % were found for the quality-assured products when compared to the original DB products for Aqua and Terra, respectively.
4. An independent validation of DB c5.1 data over 2010 and 2011 was also conducted and improvements to the new dataset were found as well.The newly developed level 3 products will be used in aerosol data assimilation and aerosol climate studies.
5. Lastly, the algorithm developed in this study was also tested using the preliminary DB c6 data, which are targeted to be released next year. The quality assurance steps developed from the DB c5.1 data improve the accuracy of DB c6 data, indicating that the algorithm and methods developed in this study are at least partially applicable to the new version of the DB data. We also hope that the issues identified in this study can provide useful information to the DB team in developing future versions of the DB product.

Fig. 1. Flow chart of the production process for the level 3 DA-quality DB aerosol product.

Fig. 2. Comparisons between Aqua DB and AERONET for 2002-2009 for diagnostic purposes for (a) all data and (b) data with "very good" QA globally. The red line is the linear fit line and the blue lines are the 95 % confidence interval lines. The color contours show the fractional data density.

Fig. 3. Regional comparisons between Aqua DB AOD and AERONET AOD for 2002-2009 with only QA equal to "very good" for (a) Northwest America, (b) Northeast America, (c) South America, (d) Europe, (e) North Africa, (f) Southern Africa/Sub-Saharan Africa, (g) East Asia, (h) Australasia, and (i) West Asia. The blue line is the linear fit line and the black lines are the 95 % confidence interval of the linear fit line.

Fig. 4. The domains for areas that are shown in Fig. 2. Western North America is shown in indigo, eastern North America is shown in dark slate blue, South America is shown in blue, Europe is shown in sky blue, North Africa is shown in spring green, southern Africa/Sub-Saharan Africa is shown in lime green, Australasia is shown in orange, West Asia is shown in white, East Asia is shown in yellow, and other regions are shown in black.

Fig. 5. The RMSE of DB AOD against AERONET AOD for (a) data with "very good" QA flag over Europe (black), North Africa (blue), East Asia (green), and West Asia (red) in Fig. 2 as a function of AERONET AOD; (b) similar to (a), but as a function of DB AOD; (c) all data over the study region as a function of AERONET AOD; and (d) similar to (c), but for data with QA equal to "very good" and "good".

Fig. 6. Quarter-degree spatial averages of satellite aerosol observations over the study region for DB AOD for three days. The first, second, and third rows correspond to DB data on 1, 2, and 3 May 2006. The left column is all available DB data and the right column is DB data with QA equal to "very good" only.

Fig. 7. Spatial distributions of DB for 2006: (a) AOD before the QA filtering, (b) only AOD with "very good" QA, and (c) number of retrievals available after the QA filtering. Red dots in (a) represent the AERONET sites.

Fig. 8. The differences in AOD between Aqua DB and AERONET as a function of viewing angle over the study region for (a) total AOD without QA filter, and (b) AOD with "very good" QA. Data were averaged for every 10° of viewing angle (except for 10° to 30° in Fig. 8b), and one standard deviation bars are shown.

Fig. 9. Comparisons between Aqua DB AOD and AERONET AOD globally during 2002-2009 under cloud-free conditions for (a) fine-mode fraction smaller than 0.5 and (b) fine-mode fraction greater than 0.5. The blue dots represent the averaged DB AOD for each AERONET AOD bin. The thicker black line is the linear fit line and the thin black line is the 95 % confidence interval. The red dashed line is the 1 to 1 line.

Fig. 11. Comparisons between coarse- and fine-mode Aqua DB AOD and AERONET AOD at 0.55 µm globally for 2002-2009 with albedo at 0.412 µm. Each row represents data from a range of albedo: (a) and (b) are for albedo less than 0.05, (c) and (d) for albedo between 0.05 and 0.08, (e) and (f) for albedo between 0.08 and 0.11, and (g) and (h) for albedo greater than 0.11. The left panels show the coarse mode with the fine-mode fraction less than 0.5, and the right panels show the fine mode with the fine-mode fraction greater than 0.5. The blue line is the linear regression line, and the red line is the polynomial regression line.

Fig. 12. AOD bias (ΔAOD A−M, AERONET minus Aqua DB AOD) as a function of the number of pixels used for Aqua DB over the study region. The error bars indicate one standard deviation above and below the mean.

Fig. 13. Normalized ΔAOD (ΔAOD A−M divided by Aqua DB AOD) varying with STE sfc as a function of (a) surface reflectance at 0.412 µm, (b) DB AOD, and (c) aerosol type. The error bars indicate one standard deviation above and below the mean.

Fig. 14. Scatter plot of standard error threshold of Aqua AOD versus Aqua AOD at 0.55 µm. Dots represent the averaged standard error (blue) of AOD and the 1.5 standard deviation (red) for AOD increments of 0.1 for AOD < 0.5 and increments of 0.3 for AOD > 0.5. The blue lines and red lines show the linear fit of the corresponding dots. (a) is for DB AOD globally, and (b) for DB AOD over the study region.

Fig. 18. Scatter plot of DB AOD versus AERONET level 2.0 AOD at 0.55 µm over the study region. The blue line is the linear regression line for all data (except in (c), where it is for data smaller than 1.5), and the black lines are the 1.0 standard deviation lines of the data. (a) is for the original Aqua DB aerosol products, (b) for the DA-quality Aqua DB aerosol products, and (c) and (d) are similar to (a) and (b), but for Terra DB.

Fig. 19. RMSE of DB AOD compared to AERONET AOD as a function of DB AOD for all data and for mixed and dust aerosol types over the study region: (a) and (c) for Aqua, and (b) and (d) for Terra. The RMSE of original and DA-quality mixed and dust aerosols are indicated by the different colors of dots.

Fig. 20. Spatial distribution of AOD at 0.55 µm from the DB aerosol products for 2007. The black color represents regions with no data, the blue color represents areas with low AOD loadings, and the pink color indicates locations with extremely high AOD values. Rows 1, 2, and 3 represent the original data, data with "very good" QA flags, and the DA-quality data, respectively. The left column is Terra DB data and the right column is Aqua DB data.

Fig. 21. Scatter plot of Aqua DB versus AERONET level 2.0 AOD at 0.55 µm from 2010 to 2011 for an independent study. The blue line is the linear regression line for all of the data. (a) is for the original Aqua DB aerosol products, and (b) for the DA-quality Aqua DB aerosol products.

Fig. 22. Scatter plot of MODIS Aqua DB preliminary c6 versus AERONET level 2.0 AOD at 0.55 µm for April 2006 and July 2008. The blue line is the linear regression line for all data and the black lines are the 1.0 standard deviation lines. (a) is for the preliminary DB c6 aerosol products, and (b) for the modified Aqua DB aerosol products using the procedures that were developed based on DB c5.1 data.

Fig. 23. Spatial distribution of AOD at 0.55 µm from the DB aerosol products for April 2006. The black color represents regions with no data, the blue color represents areas with low AOD loadings, and the pink color indicates locations with extremely high AOD values. Rows 1, 2, and 3 represent the original data, data with "very good" QA flags, and the DA-quality data, respectively. The left column is Aqua c5.1 DB data and the right column is Aqua c6 DB data.

Table 1. Filters and thresholds used in the QA procedures, with the corresponding data loss, for generating the DA-quality Aqua DB AOD product; values for Terra DB are given in parentheses. The percentages of data loss for all procedures after the QA filtering were calculated based on the number of retrievals that had QA equal to "very good".

Table 4. Statistical analyses of different slope limitations for the empirical correction procedures for Aqua DB data when validating against AERONET AOD.