Application of spectral analysis techniques to the intercomparison of aerosol data – Part 4: Combined maximum covariance analysis to bridge the gap between multi-sensor satellite retrievals and ground-based measurements



Introduction
Global aerosol properties are highly variable in space and time. Aerosols from different regions generally have different chemical compositions and emission sources, and are subject to different meteorological conditions. Understanding the spatial and temporal variability of aerosols is critical in quantifying their direct and indirect climate effects. Satellite observations have become, and will remain, an indispensable source of information about aerosol characteristics for use in various assessments of climate change (King et al., 1999). In the past decade, many satellite sensors have been developed to monitor global aerosol properties and have greatly advanced our knowledge of aerosols and their variability. These aerosol products have been validated against ground-based measurements from the Aerosol Robotic Network (AERONET; Holben et al., 1998; Dubovik et al., 2002), and their accuracy and reliability have been confirmed (e.g., Levy et al., 2010; Kahn et al., 2005; Sayer et al., 2012; Torres et al., 2007). As a result, they have been extensively used in various aerosol and climate related studies. For example, Kalashnikova and Kahn (2005) used Multiangle Imaging Spectroradiometer (MISR) and Moderate Resolution Imaging Spectroradiometer (MODIS) aerosol products to study mineral dust plume evolution over the Atlantic. Torres et al. (2010) studied the anomalous biomass burning in the Southern Hemisphere using aerosol retrievals from the Ozone Monitoring Instrument (OMI) and MODIS. Hsu et al. (2012) investigated global and regional trends in aerosol optical depth using Sea-viewing Wide Field-of-view Sensor (SeaWiFS) measurements. In these studies, usually only one or two datasets were used to study the physical problem. With multiple datasets available, it is desirable to take advantage of all available pieces of information in one analysis in order to yield more reliable results. Several authors have used aerosol retrievals from multiple sensors in their studies. Nabat et al. (2013) examined mean aerosol optical depth over the Mediterranean using nine satellite-derived AOD products. Carboni et al. (2012) evaluated desert dust optical depth retrievals from eight different satellite instruments. Another application of multi-sensor aerosol data is to validate and constrain aerosol parameterizations in climate models. Kinne et al. (2003, 2006) compared global monthly mean aerosol properties between AeroCom aerosol modules and several satellite datasets. Liu et al. (2006) assessed the GISS ModelE aerosol climatology against multiple satellite retrieval products. In these multi-sensor applications, although the different datasets achieved an overall global agreement, considerable regional differences were revealed that were associated with different aerosol sources or transport regimes. Regional differences between satellite-retrieved aerosol properties were also reported for India (Prasad and Singh, 2007), for South America (Ahn et al., 2008), and for Southeast Asia (Xiao et al., 2009). Therefore, effective and efficient use of multi-sensor datasets requires an understanding of the strengths and weaknesses of each dataset in representing different aerosol types and variability in different regions of the world.
Previously, we have demonstrated that spectral decomposition techniques such as Principal Component Analysis (PCA) can be effectively used to examine the spatial and temporal variability in multi-dimensional aerosol observations (Li et al., 2009, 2011, 2013a). Many global and regional aerosol source regions and their seasonal and interannual variability are successfully captured by the dominant orthogonal modes. We further introduced the Maximum Covariance Analysis (MCA) method, which allows the variability revealed by a particular satellite dataset to be verified through comparison with ground-based measurements from AERONET (Li et al., 2014). In Li et al. (2013b), we applied Combined Principal Component Analysis (CPCA) to achieve a parallel examination and comparison of the spatial and temporal variability in aerosol optical depth as measured by multiple satellite datasets. The CPCA method is powerful both in confirming the agreement and in finding locations and times of disagreement between the satellite data sets. However, a major drawback is that the CPCA methodology by itself does not accommodate the inclusion of scattered ground observations, as combining different fields assumes equal weight and is thus only suitable for gridded data with the same spatial mapping. The MCA does incorporate ground-based data; however, its results alone are not sufficient to select which dataset best characterizes aerosol variability for a particular region, in that the method only evaluates one satellite dataset against AERONET. For multi-sensor data analysis, it is necessary to simultaneously examine the capability of each dataset in representing aerosol variability for particular regions, in order to determine which dataset or datasets provide the best constraints on the aerosol property for the regions of interest.
[...] (1) integrating all available information from both satellite and surface measurements, resulting in a more complete view of the picture; (2) the common modes of variability revealed through CPCA can be further confirmed, and the problems in each satellite dataset can be identified through the comparison with ground truth measurements; (3) the examination and comparison is associated with specific aerosol sources, types or events, which are essential both for understanding the physics of the problem and for improving satellite retrievals. The goals of this paper are to introduce and highlight the utility of the CMCA technique and thereby promote its usage by the aerosol data community. We describe data selection, preprocessing and the detailed analysis procedure in Sects. 2 and 3. In Sect. 4, we present the results of our global analysis and two representative regional case studies that demonstrate the usefulness of this technique, while readers are welcome to use the method to explore additional regions based on their specific interest. Finally, a summary and discussion of potential extended usage of the CMCA technique is given in Sect. 5.

Datasets
We use monthly mean, gridded Aerosol Optical Depth (AOD) products from four satellite sensors: MODIS, MISR, OMI and SeaWiFS. These four data sets have all been validated against ground observations and have reasonably good global coverage.
Only over-land data are used, primarily because the majority of AERONET stations are located over land. The ground-based observations are from 58 selected AERONET stations (considered a benchmark for satellite data). The period of study is chosen to be January 2005 to December 2010, which corresponds to the period of the longest overlap for the four satellite data records. Finally, because OMI AOD is reported at 500 nm while MODIS, MISR and SeaWiFS report AOD at multiple wavelengths, we interpolate MODIS, MISR and SeaWiFS AOD to 500 nm according to the Ångström relationship, as detailed below, to facilitate parallel comparison.
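As an illustration of the wavelength conversion, the following minimal sketch applies the two-wavelength form of the Ångström relationship, as would be appropriate for the MODIS case (470 nm and 660 nm). The function names and the sample AOD values are hypothetical, and the text does not specify how more than two wavelengths are combined for MISR and SeaWiFS, so that case is not covered here.

```python
import numpy as np

def angstrom_exponent(aod_1, aod_2, wl_1, wl_2):
    """Angstrom exponent from AOD at two wavelengths (same units for both)."""
    return -np.log(aod_1 / aod_2) / np.log(wl_1 / wl_2)

def aod_at_wavelength(aod_1, aod_2, wl_1, wl_2, wl_target=500.0):
    """Interpolate AOD to wl_target assuming AOD ~ wavelength**(-alpha)."""
    alpha = angstrom_exponent(aod_1, aod_2, wl_1, wl_2)
    return aod_1 * (wl_target / wl_1) ** (-alpha)

# Hypothetical AOD values at 470 nm and 660 nm (illustration only).
aod_500 = aod_at_wavelength(0.30, 0.20, 470.0, 660.0)
print(f"AOD at 500 nm: {aod_500:.3f}")
```

Because the functions rely only on NumPy element-wise operations, they can be applied directly to gridded monthly fields as well as to scalars.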

MODIS
The MODIS instrument is a multi-spectral radiometer designed to retrieve aerosol microphysical and optical properties over land and ocean (Tanré, 1997; Levy et al., 2007). The 2330 km swath width of the MODIS instrument produces global coverage in 1 or 2 days and captures most of the aerosol variability owing to this high sampling frequency. The MODIS on the Aqua platform is used here, as Terra MODIS AOD is not as complete as Aqua over desert regions. The official Level 3 monthly mean AOD product at 1° × 1° resolution is used for this study (MYD08_M3, collection 5.1, available from http://ladsweb.nascom.nasa.gov/). We use QA weighted averages ("*QA_Mean_Mean" variables, Hubanks et al., 2008) for both dark target (DT, Levy et al., 2010) and deep blue (DB) AOD retrievals (Hsu et al., 2004, 2006). The deep blue algorithm covers most of the dust regions and is thus important for the analysis. Note that in the MODIS Collection 6 data, merged DT and DB will be provided as a standard product (Levy et al., 2013). However, since the Collection 6 data are not yet available for Level 3, we merge these two products ourselves here. The DT and DB products are combined following the procedure described by Levy et al. (2013) for Collection 6, which determines the selection of the DT or DB product according to the MODIS NDVI climatology. Specifically, for NDVI > 0.3, DT data are selected; for NDVI < 0.2, DB data are selected; and for 0.2 ≤ NDVI ≤ 0.3, the average of DT and DB AOD is used. Nonetheless, this merging of the DT and DB products will result in a seasonally varying product type for some regions with seasonally varying NDVI, especially for semi-arid areas such as the Sahel, Western US, North India and East China. To examine the consistency of these two products, in Fig. 1 we plot the merged DT and DB time series at four AERONET stations with changing vegetation type: Banizoumbou, Beijing, Bratts Lake and Kanpur. We find that the time series appear rather smooth, indicating that the merging of the DT and DB products has negligible influence on the overall data consistency. The MODIS AOD is interpolated to 500 nm using measurements at 470 nm and 660 nm.
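The NDVI-based selection rule stated above can be sketched as follows. This is only an illustration of the three threshold cases; the function name is hypothetical, and the handling of missing values (here NaN simply propagates) is an assumption, since it is not specified in the text.

```python
import numpy as np

def merge_dt_db(dt, db, ndvi):
    """Merge dark target (DT) and deep blue (DB) AOD using NDVI thresholds.

    NDVI > 0.3         -> DT
    NDVI < 0.2         -> DB
    0.2 <= NDVI <= 0.3 -> average of DT and DB
    Assumption: NaN inputs propagate to the output unchanged.
    """
    dt, db, ndvi = (np.asarray(a, dtype=float) for a in (dt, db, ndvi))
    return np.where(ndvi > 0.3, dt,
                    np.where(ndvi < 0.2, db, 0.5 * (dt + db)))

# Hypothetical grid cells: vegetated, arid, and transitional.
merged = merge_dt_db([0.1, 0.2, 0.3], [0.5, 0.6, 0.7], [0.4, 0.1, 0.25])
print(merged)
```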

MISR
The MISR is a multi-angle sensor with nine pushbroom cameras on the EOS Terra platform. The zonal overlap of the common swath of all nine cameras is at least 360 km in order to provide multi-angle coverage in 9 days at the equator and 2 days at the poles (Diner et al., 1998). Compared to MODIS, the multi-angle view of MISR performs better over bright surfaces (Kahn et al., 2005, 2010), while its lower sampling may not fully resolve short scale variability. In this study, we use version 31 Level 3 gridded monthly products, available from http://eosweb.larc.nasa.gov. The original 0.5° × 0.5° data resolution has been rescaled to 1° × 1°. The rescaling is performed by assigning equal weights to each sub-grid, and the final 1° × 1° grid is considered valid only when more than half of the sub-grids have valid data. The data are also interpolated to 500 nm using measurements at the four MISR wavelengths of 446 nm, 555 nm, 672 nm and 865 nm.
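A minimal sketch of the rescaling rule described above is given below. It assumes that missing values are stored as NaN, that the 0.5° grid has even dimensions, and that the equally weighted mean is taken over the valid sub-grids only; the function name is hypothetical.

```python
import numpy as np

def regrid_half_to_one_degree(field):
    """Average a 0.5-degree field (nlat, nlon) onto a 1-degree grid.

    Each coarse cell is the equally weighted mean of its valid sub-grids and
    is kept only when more than half of the four sub-grids (i.e. at least
    three) contain valid data. Assumption: NaN marks missing data.
    """
    field = np.asarray(field, dtype=float)
    nlat, nlon = field.shape
    blocks = field.reshape(nlat // 2, 2, nlon // 2, 2)
    valid = np.isfinite(blocks)
    n_valid = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    coarse = np.full(n_valid.shape, np.nan)
    keep = n_valid > 2
    coarse[keep] = total[keep] / n_valid[keep]
    return coarse

# Hypothetical 2 x 4 field: first coarse cell has 3 valid values, second has 2.
fine = np.array([[1.0, 2.0, np.nan, np.nan],
                 [3.0, np.nan, 1.0, 1.0]])
coarse = regrid_half_to_one_degree(fine)
print(coarse)
```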

SeaWiFS
The SeaWiFS instrument was launched on the SeaStar spacecraft in 1997. It is also a wide view imager, with a swath width of 1502 km, and covers the globe in approximately 2 days. The SeaWiFS over-land aerosol retrieval uses the deep blue algorithm developed by Hsu et al. (2004, 2006). The AOD data over land have been validated using AERONET measurements (Sayer et al., 2012). Here we use the standard Level 3 monthly mean AOD product (Version 004, available from http://mirador.gsfc.nasa.gov/). The data are converted to 500 nm using the reported AOD values at 412 nm, 490 nm and 670 nm.

OMI
The OMI sensor (Levelt et al., 2006) on the EOS Aura satellite has been providing global aerosol measurements since October 2005. The OMI instrument also has a wide swath of 2600 km and produces daily global coverage. The AOD data used here are derived from the UV algorithm (OMAERUV, Torres et al., 2007). The AOD is primarily retrieved at 388 nm using the instrument's two near-UV channels, and the 500 nm AOD reported in the standard product is converted according to the spectral dependence of the assumed aerosol model (Torres et al., 2007; Ahn et al., 2008). While the reliability of the 500 nm AOD is affected by aerosol model assumptions, comparison with AERONET, MODIS and MISR showed reasonable agreement (Torres et al., 2007; Ahn et al., 2008). Moreover, the upgraded OMI algorithm by Torres et al. (2013), which makes use of aerosol layer information derived from CALIPSO and AIRS, produced noticeable improvements in the retrieval of dust and smoke aerosols. [...] does not explicitly account for ocean color effects and retrievals over ocean are limited to only high AOD conditions, it is only used over land and in regional analyses. The wide swath of OMI provides daily global coverage. However, its relatively large footprint (13 km × 24 km at nadir) makes cloud contamination a more serious issue in OMI retrievals (Torres et al., 2007).

AERONET
AERONET (Holben et al., 1998) is a ground-based sun-photometer network with over 400 stations globally. The AERONET AOD is derived from direct beam solar measurements (Holben et al., 2001) at two UV and five visible channels. The measurements from AERONET are usually regarded as ground truth when assessing satellite retrievals of aerosol properties. In this study, we also consider the AOD variability represented by AERONET data as the benchmark against which we evaluate the different satellite datasets. The data used are the Version 2 Level 2 quality assured and cloud screened (Smirnov et al., 2002) [...] and Table 1 lists the station name, location, aerosol type and the number of available monthly mean data points. The aerosol type information for the AERONET stations is mostly obtained from existing references, including Kinne et al. (2003), Kahn et al. (2010), and Garcia et al. (2012); for several stations not covered in the literature, the aerosol type is inferred from the station description available on the AERONET website (http://aeronet.gsfc.nasa.gov/). The AERONET AOD is converted to 500 nm using measurements from 380 nm to 870 nm by applying a second-order polynomial fit of ln(AOD) vs. ln(wavelength), as recommended by Eck et al. (1999).
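The second-order polynomial conversion can be sketched as below. The function name, the particular set of channels and the AOD values are hypothetical and serve only to illustrate the fit of ln(AOD) against ln(wavelength); the actual channels available at a given station may differ.

```python
import numpy as np

def aod_500_from_spectrum(wavelengths_nm, aod, target_nm=500.0):
    """Evaluate a 2nd-order polynomial fit of ln(AOD) vs. ln(wavelength)."""
    coeffs = np.polyfit(np.log(wavelengths_nm), np.log(aod), deg=2)
    return float(np.exp(np.polyval(coeffs, np.log(target_nm))))

# Hypothetical channels between 380 nm and 870 nm with illustrative AOD values.
channels = np.array([380.0, 440.0, 675.0, 870.0])
aod_obs = np.array([0.42, 0.36, 0.22, 0.16])
print(f"AOD at 500 nm: {aod_500_from_spectrum(channels, aod_obs):.3f}")
```

At least three channels are needed for the quadratic fit to be determined; with exactly three, the curve passes through all measurements.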

Treatment of missing data
As mentioned in Sect. 2.5, the completeness of the time series is critical to the construction of the temporal covariance matrix. For AERONET data, we apply a linear interpolation to the time series to fill the gaps. The interpolation is performed on the de-seasonalized data, constructed by removing the multi-year average seasonal cycle, so that the influence of interpolation on the seasonal variability is minimized. The full data series is then reconstructed by adding the seasonal cycle back. Figure 3 shows the raw and interpolated time series at the Minsk station, which is a typical example with several scattered gaps. We can see that the interpolation performs well without introducing much uncertainty.
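The gap-filling steps described above (remove the seasonal cycle, interpolate linearly, add the cycle back) can be sketched as follows. The function name is hypothetical, and the sketch assumes a monthly series starting in January, at least one valid value for every calendar month, and constant extrapolation at the series ends (the behavior of `np.interp`), none of which is specified in the text.

```python
import numpy as np

def fill_gaps_deseasonalized(series, period=12):
    """Fill NaN gaps by linear interpolation of the de-seasonalized series."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    month = t % period
    # Multi-year average seasonal cycle from the available data.
    cycle = np.array([np.nanmean(series[month == k]) for k in range(period)])
    anomaly = series - cycle[month]
    valid = np.isfinite(anomaly)
    # Linear interpolation of the anomalies across the gaps.
    filled_anomaly = np.interp(t, t[valid], anomaly[valid])
    # Reconstruct the full series by adding the seasonal cycle back.
    return filled_anomaly + cycle[month]

# Hypothetical 3-year monthly series with scattered gaps.
truth = 0.3 + 0.1 * np.sin(2 * np.pi * np.arange(36) / 12.0)
gappy = truth.copy()
gappy[[5, 17, 20]] = np.nan
filled = fill_gaps_deseasonalized(gappy)
```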
For the satellite data, we focus on the 60° S to 60° N domain, where the monthly mean products have nearly full coverage. Nonetheless, we do find that there are a few regions with persistently missing data. These regions include the Tibetan Plateau for SeaWiFS and OMI, Central Australia for MODIS and the intertropical convergence zone for SeaWiFS. For these regions, we apply a data availability mask to each monthly mean map to exclude them from the analysis. Figure 4 shows the mask of the four datasets. Overall, the removed data only account for a small portion of the global map and do not affect major aerosol source regions.

Combined Maximum Covariance Analysis
The CMCA technique can be viewed as a combination of the MCA and CPCA techniques, which have been described in Li et al. (2014) and Li et al. (2013b), respectively. In CMCA, a Singular Value Decomposition (SVD) is performed between the joint satellite data matrix and the AERONET data matrix to extract the modes of variability that maximize the covariance between these two fields. In this way, the modes retain the orthogonality feature, and the leading modes will both have the highest correlation between the two data fields and explain the most variance of each individual field. Specifically, we arrange each satellite data field and the AERONET data by space and time dimension as

$$X = \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} \qquad (1)$$

where $m$ is the number of spatial locations (number of grid boxes for satellite data and number of stations for AERONET) and $n$ is the number of measurements at each location (length of the data time series). The data are centered by removing the temporal mean from each row of $X$. In addition, we also create an anomaly data matrix by removing the multi-year averaged seasonal cycle from each row, in order to examine the interannual variability.

After organizing the data sets in this manner, the data matrices of the satellites are combined into one large $4m \times n$ matrix as

$$X_{\mathrm{sat}} = \begin{pmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{pmatrix} \qquad (2)$$

where $X_1, \ldots, X_4$ denote the data matrices of the four satellite datasets. It is important to note that the combining of the data matrices assumes equal weight, which requires that the fields being combined have the same measure. For the question here, as the four fields are measurements of the same physical quantity (AOD) and are mapped to the same spatial resolution (1° × 1°), this requirement is satisfied. Next, we construct the cross covariance matrix $C$ between the joint satellite field $X_{\mathrm{sat}}$ and the AERONET data matrix $X_{\mathrm{AERONET}}$ (Eq. 3), where $X^{T}_{\mathrm{AERONET}}$ denotes the transpose of $X_{\mathrm{AERONET}}$. The orthogonal modes that maximize the covariance between $X_{\mathrm{sat}}$ and $X_{\mathrm{AERONET}}$ are then found by an SVD of $C$:

$$C = U \Sigma V^{T} \qquad (4)$$

$U$ and $V$ are orthogonal matrices whose columns are singular vectors for $X_{\mathrm{AERONET}}$ and $X_{\mathrm{sat}}$, respectively, and each pair of singular vectors represents co-varying modes between the two data fields. In the SVD, the singular values in $\Sigma$, which are the covariances between each pair of singular vectors, are organized in descending order, so that the first mode represents the most covariance between the two fields. As the covariance can be expressed as

$$\mathrm{cov}(X, Y) = r_{X,Y} \sqrt{S_X^2 S_Y^2} \qquad (5)$$

where $\mathrm{cov}(X, Y)$ denotes the covariance between $X$ and $Y$, $r_{X,Y}$ denotes the correlation between $X$ and $Y$, and $S_X^2$ and $S_Y^2$ are the variances of $X$ and $Y$, respectively, maximizing the covariance implies the maximization of both the correlation and the variances. Therefore, the leading modes will represent the correlated variability in the two data sets and account for most of the variance.
The singular vectors in $U$ and $V$ are the spatial patterns of the AERONET data and the combined satellite field, respectively. To find the spatial pattern of each individual satellite field, we divide the $V$ matrix back into four segments as

$$V = \begin{pmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{pmatrix} \qquad (6)$$

Each segment will have dimension $m \times n$, and its columns are the spatial patterns of the corresponding individual satellite dataset. The time series $A$ and $B$ describing how each mode oscillates in time are then found by projecting $U$ back to $X_{\mathrm{AERONET}}$ and projecting $V$ back to $X_{\mathrm{sat}}$:

$$A = U^{T} X_{\mathrm{AERONET}}, \qquad B = V^{T} X_{\mathrm{sat}} \qquad (7)$$

Let $\sigma_i$ denote the $i$th element of $\Sigma$; the Squared Covariance Fraction (SCF) explained by the $i$th mode is then given by

$$\mathrm{SCF}_i = \frac{\sigma_i^2}{\sum_j \sigma_j^2} \qquad (8)$$

The major advantage of CMCA over MCA and CPCA is that CMCA effectively incorporates all available information. We will be able to examine the coherency as well as discrepancies across satellite datasets in parallel, and to further identify the strengths and weaknesses of each dataset by evaluating their individual spatial modes against the AERONET results.
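To make the procedure concrete, a minimal NumPy sketch of the CMCA steps on synthetic data is given below. It is an illustration under stated assumptions rather than the exact implementation used for the results: the cross covariance matrix is formed with the AERONET matrix on the left, so that the left singular vectors pair with AERONET and the right singular vectors with the joint satellite field, and it is normalized by n − 1; both choices are assumptions. The function name and the synthetic data are hypothetical, and grid boxes with missing data are assumed to have been removed beforehand.

```python
import numpy as np

def cmca(sat_fields, x_aeronet):
    """Combined maximum covariance analysis (illustrative sketch).

    sat_fields : list of (m, n) arrays on a common grid (space x time)
    x_aeronet  : (m_a, n) array of station time series
    """
    def center(x):
        # Remove the temporal mean from each row (each location).
        return x - x.mean(axis=1, keepdims=True)

    x_sat = np.vstack([center(f) for f in sat_fields])   # joint field
    x_aer = center(x_aeronet)
    n = x_aer.shape[1]
    # Assumed orientation and normalization of the cross covariance matrix.
    c = x_aer @ x_sat.T / (n - 1)
    u, s, vt = np.linalg.svd(c, full_matrices=False)     # s is descending
    v = vt.T
    a = u.T @ x_aer                                       # AERONET time series
    b = v.T @ x_sat                                       # satellite time series
    scf = s ** 2 / np.sum(s ** 2)                         # squared covariance fraction
    m = sat_fields[0].shape[0]
    # Split V back into one block of spatial patterns per satellite dataset.
    v_segments = [v[i * m:(i + 1) * m] for i in range(len(sat_fields))]
    return u, v_segments, a, b, s, scf

# Synthetic example: four "satellite" fields and three "stations"
# sharing one seasonal signal plus noise.
rng = np.random.default_rng(0)
n_time, m_grid, m_sta = 24, 6, 3
signal = np.sin(2 * np.pi * np.arange(n_time) / 12.0)
sat_fields = [np.outer(rng.random(m_grid), signal)
              + 0.05 * rng.standard_normal((m_grid, n_time)) for _ in range(4)]
x_aeronet = (np.outer(rng.random(m_sta), signal)
             + 0.05 * rng.standard_normal((m_sta, n_time)))
u, v_segments, a, b, s, scf = cmca(sat_fields, x_aeronet)
print("SCF of leading mode:", scf[0])
```

With this construction, the covariance between the paired time series of a mode equals the corresponding singular value, which is a convenient consistency check.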

Results
We start by presenting the results of the global analysis, followed by two typical regional examples. Since the main purpose of this paper is to introduce the usage and demonstrate the effectiveness of the CMCA technique, we choose not to dive into detailed regional analysis but rather leave some open questions for the readers to explore using this method.

Global analysis
Analysis is first performed on the full datasets with the seasonal cycle left in. [...] agreement in the semi-annual variability of aerosol optical depth. Note that the correlations between the PC time series are also quite high (above 0.9) for these three modes. However, notable differences can also be identified across the data sets. An obvious example is the Indian subcontinent. In Mode 1, MODIS, MISR and OMI all have positive signals over this region, while SeaWiFS has weak negative signals. Turning to AERONET, we find that the three stations over this region also have negative signals, consistent with SeaWiFS but different from MODIS, MISR and OMI. It is thus highly possible that SeaWiFS captures the seasonality of aerosol variability over the Indian subcontinent well, while the other three datasets may have lower skill over this region. As a result, this region will be examined in greater detail in the next section.
In fact, with spatial modes from multiple satellites, regions with the highest uncertainty can be highlighted by examining the spread (standard deviation) of the four spatial patterns. Figure 7 shows the standard deviation fields of the four spatial maps for each mode. Regions with the largest spread are marked by red rectangles. For Mode 1, in addition to India, East Asia also appears to have larger disagreement. This region has been an emerging global aerosol source region over the past decade, with heavy pollution from industrialized urban areas, especially in East China, and also seasonal dust pollution from Central North Asia. However, as most AERONET stations in East China were established in recent years, we found almost no qualified stations for the purpose of this study. The large disagreement across the satellite measurements over this region therefore suggests the necessity for continuous monitoring of aerosol properties from the surface in this region. For Mode 2, South America, the Sahel, Central Asia and Borneo Island appear to have the largest discrepancy. Looking back to Fig. 7, it is seen that for South America and the Sahel, MODIS and SeaWiFS both have strong positive and negative signals, respectively, and are in good agreement with AERONET, while the signals for MISR and OMI are generally weaker, especially for MISR over the Sahel and OMI over South America. Li et al. (2013b) have discussed the problems in these two satellite datasets for these two regions and found underestimation in MISR and OMI during the peak biomass burning season over South America, as well as the weaker [...] et al., 2006), it should also be the focus of future AERONET instrumentation deployment. The differences in Mode 3 are similar to those in Mode 1 and Mode 2, and we therefore omit the discussion here.
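The spread diagnostic can be sketched in a few lines. The function name and the sample values are hypothetical, and the use of the population standard deviation (NumPy's default, `ddof=0`) with NaN marking masked grid boxes is an assumption. Because the four patterns of a mode are segments of the same singular vector, they share one sign convention and can be compared directly.

```python
import numpy as np

def mode_spread(patterns):
    """Standard deviation across datasets of one mode's spatial patterns.

    patterns : array-like of shape (n_datasets, nlat, nlon); NaN = masked.
    """
    stack = np.asarray(patterns, dtype=float)
    return np.nanstd(stack, axis=0)

# Hypothetical 1 x 2 maps from four datasets: agreement in the first grid
# box, disagreement in the second.
maps = [[[1.0, 0.0]], [[1.0, 2.0]], [[1.0, 0.0]], [[1.0, 2.0]]]
spread = mode_spread(maps)
print(spread)
```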
With respect to the results of the analysis of the anomaly dataset, we again present the first three modes, based on the behavior of the variance explained curve shown in Fig. 8. These three modes, as shown in Fig. 9, are also consistent with Li et al. (2013a, b, 2014) and reveal aerosol source regions and their interannual variability. It is encouraging that all four satellite data sets agree well with AERONET qualitatively. Quantitative examination of the standard deviation maps (Fig. 10) reveals discrepancies in the signal strength over South America and the Sahel, which are similar to those in Fig. 8 and were previously discussed by Li et al. (2013). In Mode 3, Eastern Europe is highlighted with larger uncertainty. This is related to an extreme event and will be further investigated in the next section. East and Southeast Asia also appear in the spread map of Mode 3, which again suggests that additional observations are needed in these areas.
While the global results mainly confirm our previous findings, the advantage of using CMCA is clearly seen: comparing multiple satellite datasets in parallel and simultaneously validating the variability associated with specific aerosol types and/or source regions against AERONET in one spatial map. Without prior knowledge, these results would be very difficult to obtain by direct comparison, as one would need to compare hundreds of spatial maps or time series from numerous regions.

Regional analysis
In this section, we present the results of two regional case studies. These studies focus on the added information content of the temporal variability, and demonstrate the advantage of the CMCA technique in identifying problems associated with extreme events, interannual variability and seasonal variability.
In the global analysis of the anomaly data (focusing on interannual variability), we identified a "hot spot" in Eastern Europe, i.e., a region that has large disagreement among the four data sets (Mode 3 in Fig. 10). Here we further examine this disagreement using CMCA by isolating this region. CMCA is performed over Europe within the spatial domain of 6° W to 56 [...] an extremely strong peak in 2010. MODIS and MISR agree well with AERONET, with peaks of similar strength. SeaWiFS also has a peak in 2010, but it is much weaker than that of AERONET, while the OMI data do not show any outstanding peak in this year. Various reasons may account for the problems in SeaWiFS and OMI. For example, over-conservative cloud screening may mistake smoke pixels for clouds, and the row anomaly that has developed in the OMI instrument since 2008 (http://www.knmi.nl/omi/research/product/rowanomaly-background.php) may lead to OMI missing this event due to reduced sampling. Our CMCA results suggest that the retrieval of AOD by SeaWiFS and OMI may need to be improved for this region to sufficiently represent this type of extreme event.
Our next example focuses on the analysis of annual variability over the Indian subcontinent, which is another major source of discrepancy revealed through the global analysis (see Fig. 7). A major difficulty encountered for India is that few AERONET stations over this area have qualified data records for the construction of the temporal covariance matrix. Therefore, we only have four stations available for this analysis.
Nonetheless, the distribution of these stations does cover the typical aerosol source regions of the Gangetic Plain, Thar Desert and South India.
Figure 13 shows the first two modes for India, which account for ~98 % of the variance. The first mode mainly represents the variability of dust aerosols around the Thar Desert. The PC has a regular summer/winter (boreal) seasonal cycle. The second mode highlights the Gangetic Plain in North India, and its PC time series displays a semi-annual variability with two peaks, in the late (boreal) spring to summer and in the fall, respectively. The Gangetic Plain has highly variable aerosol types in different seasons. During the pre-monsoon (March-May) and monsoon (June-August) seasons, this region is primarily influenced by dust aerosols, while during the post-monsoon (September-November) and winter (December-January) seasons, anthropogenic aerosols compose a larger fraction of the total aerosol loading (Singh et al., 2004; Dey et al., 2010). The four datasets all agree with AERONET over the Thar Desert in Mode 1. However, with respect to the Gangetic Plain, more differences appear. In Mode 1, only SeaWiFS agrees well with AERONET over this region, with negative signals around the two AERONET sites that are coherent with the AERONET signal. The other three satellite datasets, especially MODIS, have positive signals in this area. For Mode 2, SeaWiFS and MODIS capture the semi-annual variability well and are consistent with AERONET, while the signals for MISR and OMI are much weaker than those observed by AERONET, SeaWiFS and MODIS. This result implies that the seasonality of AOD over the Gangetic Plain may be problematic in the MODIS, MISR and OMI datasets.
We also examine the interannual variability of the Gangetic Plain region using the anomaly data. This region appears in the dominant mode, which is shown in Fig. 14.
Interestingly, while the SeaWiFS dataset best represents the seasonal variability of AOD over the Gangetic Plain, Fig. 14 indicates that on the interannual time scale, this dataset differs the most from AERONET compared to the other three datasets. The positive anomalies on the SeaWiFS spatial map are both narrower and weaker.
To explain this paradox in the SeaWiFS data, as well as the problems in the MODIS, MISR and OMI datasets, we compare the time series between the AERONET measurements and the satellite data for the Kanpur station, which is located in the center of the Gangetic Plain. The raw time series, multi-year averaged seasonal cycle, and the anomaly time series for each of the satellite datasets plotted against AERONET at Kanpur are shown in Fig. 15. The correlation coefficient between the two time series on each panel is indicated in the upper left corner. We are able to see that, overall, the SeaWiFS data have the highest correlation with AERONET for the raw time series and the seasonal cycle. For the latter in particular, the correlation is above 0.9. Compared with the AERONET time series, MISR and OMI both have an overall low bias, which is larger during the winter months. For MODIS, however, there is an overall high bias during the summer months but an underestimation during the winter. These differences lead to a stronger summer peak and a weaker winter peak in the MODIS, MISR and OMI data, which is responsible for the positive projection of the winter-summer seasonality (PC 1 of Fig. 13) on these three datasets. On the other hand, for AERONET and SeaWiFS, the intensity of the winter peak is comparable to or even stronger than the summer peak. As a result, the variability of these two datasets is captured by PC 2, which has an associated semi-annual time scale. The comparison between the interannual variability using AOD anomalies (right column of Fig. 15), however, displays a completely different picture.
Unlike for the raw time series and seasonal cycle, SeaWiFS now has the lowest correlation with AERONET on the interannual time scale. Not only does it fail to capture several strong anomalies in 2005, 2008 and 2009, but the variance of its time series is also considerably lower than that of AERONET. The variance of the SeaWiFS anomaly time series is 0.0041, while that for AERONET is 0.0136, and those for MODIS, MISR and OMI are 0.0113, 0.0083 and 0.0051, respectively. Accordingly, the weaker signal in the SeaWiFS spatial mode in Fig. 14 is attributed to both this low correlation and low variance.
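The two diagnostics used in this comparison, the correlation and the variance of the anomaly time series, can be computed as sketched below. The function names and the synthetic series are hypothetical; the sketch assumes complete monthly series starting in the same calendar month and uses the sample variance (`ddof=1`), which is an assumption.

```python
import numpy as np

def monthly_anomaly(series, period=12):
    """Remove the multi-year averaged seasonal cycle from a monthly series."""
    series = np.asarray(series, dtype=float)
    month = np.arange(series.size) % period
    cycle = np.array([series[month == k].mean() for k in range(period)])
    return series - cycle[month]

def compare_anomalies(satellite, ground):
    """Correlation of the anomalies and the variance of each anomaly series."""
    a_sat, a_gnd = monthly_anomaly(satellite), monthly_anomaly(ground)
    r = np.corrcoef(a_sat, a_gnd)[0, 1]
    return r, a_sat.var(ddof=1), a_gnd.var(ddof=1)

# Synthetic 6-year example: the "satellite" series has a stronger seasonal
# cycle but only half the interannual signal of the "ground" series.
rng = np.random.default_rng(1)
t = np.arange(72)
seasonal = 0.3 + 0.1 * np.sin(2 * np.pi * t / 12.0)
interannual = 0.05 * rng.standard_normal(72)
ground = seasonal + interannual
satellite = 2.0 * seasonal + 0.5 * interannual
r, var_sat, var_gnd = compare_anomalies(satellite, ground)
print(f"r = {r:.2f}, variance ratio = {var_sat / var_gnd:.2f}")
```

The example shows that a dataset can track the anomalies perfectly in phase and still have a much lower anomaly variance, which is why both quantities are reported.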
From the global analysis and regional studies, we can clearly see that the CMCA technique is an efficient and effective way to analyze and compare multi-sensor data. To first order, spectral decomposition reduces data dimensionality and limits the comparison to only the first few leading modes that explain the bulk of the variance in the data. Moreover, by integrating all available information, many variations, source regions and events can be further confirmed. Most importantly, the analysis helps to identify the strengths and weaknesses of each dataset in representing aerosol variability for specific regions and on different time scales, which is essential for understanding the capability of the data and making the best use of them.
[...] clearly associated with specific aerosol source regions, events or temporal scales represented by each orthogonal mode, which provides useful insights into the underlying physics of the problem.
Examples of a global and two representative regional analyses are presented and discussed to show the usage of the CMCA method. Globally, the results indicate that all four datasets agree reasonably well with AERONET for major aerosol source regions, including dust over North Africa and the Arabian Peninsula, biomass burning over South America and South Africa, and mixed aerosol types over the Sahel. The interannual variability of the source regions also agrees well. These results suggest that these patterns are the most believable, and that we should be confident in using all or any of the four satellite datasets in the study of aerosol properties over these regions and their temporal variability.
The purpose of the regional case studies is to illustrate the ability of the CMCA method to identify potential problems in certain datasets. The strengths and weaknesses of each dataset are identified through direct comparison between the positive/negative signals in the spatial patterns of the satellite and AERONET data maps. The nature of the problem can then be further examined by comparing the raw time series. Moreover, the capability of each dataset in capturing the variability on seasonal and interannual time scales can be separately assessed. The results from our regional analysis indicate that SeaWiFS and OMI do not capture the intensive Russian wildfire in August 2010. The AOD seasonality over the Indian Gangetic Plain needs to be improved for MODIS, MISR and OMI. SeaWiFS has the best agreement with AERONET on the seasonal variability over this region; however, on interannual time scales, its agreement is poorer than that for MODIS, MISR and OMI.
Because the main purpose of this paper is to present the CMCA technique, we did not analyze all interesting regions. However, readers are encouraged to use this technique for comprehensive analysis covering more regions and events, or in studying specific regions of their interest. Although this technique has been applied here to satellite and AERONET data, it can be adapted for model-data comparison and validation as well as for use with other ground-based network measurements (e.g., MPLnet).

Model validation is an important potential application of the CMCA method. On one hand, with multiple observational datasets available, it is desirable to incorporate all pieces of information to yield a more robust validation. On the other hand, as chemical transport models are usually constrained using satellite observations, large uncertainty in the observations will also result in poorly constrained model fields. Therefore, places where retrieval skill is low often correspond to those where model fields are inaccurate. For example, Trivitayanurak et al. (2012) found poor agreement between the GEOS-Chem simulated AOD and MODIS AOD for the Southeast Asia region due to the uncertainties in satellite retrieval. The CMCA technique will identify these regions, and thus provide insights into the problems in the satellite data, the model, or both.

When the model resolution is coarser than that of the satellite data, the model field should be treated in the same manner as AERONET is here. For models with spatial resolution comparable to the satellite data, the model field can instead be treated as one of the satellite fields and directly compared with the other satellite datasets and AERONET. Traditional model validation usually compares averaged time series between model and data for the globe and several representative regions, while the CMCA offers a new approach with a simultaneous spatial and temporal view. It also provides an effective and efficient way to identify problems that are not easily detected by traditional methods. With the continuous development of remote sensing datasets as well as climate models, we believe this technique will become a useful tool for the data retrieval, data analysis and modeling communities.

The four satellite datasets also agree well with AERONET.

created a 4-D climatology of monthly

And the evaluation work by Ahn et al. (2014) on the upgraded algorithm indicated improved agreement with ground-based observations and accuracy comparable to that of MODIS and MISR. Here we use Collection 003 data from the upgraded algorithm at 1° × 1° spatial resolution, available from the Goddard Earth Sciences Data and Information Services Center (http://mirador.gsfc.nasa.gov/). Note that as the current OMAERUV algorithm

monthly mean AOD product. As the CMCA technique requires the construction of the temporal cross covariance matrix, the completeness of the AERONET AOD time series is critical to the success of the analysis. Therefore, we select stations primarily based on the availability of a continuous data record for the study period of 2005 to 2010. Three steps are involved in the selection and quality control of AERONET data: (1) data from all stations are automatically screened by a threshold of at least 8 monthly mean data points each year from 2005 to 2010; (2) the selected stations are further manually screened by removing stations with relatively large gaps (≥ 3 months) in the time series, because we need to interpolate to fill the gaps and interpolation across gaps greater than 3 data points generally results in large uncertainty; (3) a few stations that do not strictly meet the above criteria are added to account for regions with representative aerosol variability. These stations are primarily based in Asia, including Pune and Gandhi_College in India, Mukdahan in Thailand and Singapore in Singapore City. A total of
58 stations are selected globally. Figure 2 shows the distribution and associated aerosol types of the selected stations,

Figure 5 shows the variance explained by the first 20 modes. The first 3 modes each explain more than 10 % of the variance, and there is a sharp drop in variance from Mode 3 to Mode 4. Based on this behavior, we determine the first 3 modes to be dominant. The spatial patterns and PC time series are displayed in Fig. 6. The results are very similar to those of PCA and MCA presented in our previous studies (Li et al., 2013a, b, 2014), and thus we will not repeat the discussion here. Special attention should be paid, however, to the agreements and disagreements between the signals at the AERONET stations and those of the underlying maps, as these provide information on the capability of each dataset to represent the associated aerosol variability. For example, in Mode 1, all four datasets and AERONET exhibit positive signals of similar strength over the dust-dominated regions of Northwest Africa and the Arabian Peninsula. This is an indication that dust variability over these regions is well represented by all satellite datasets. The same applies to the biomass burning aerosol source regions of South America, the Sahel and South Africa shown in Mode 2. Mode 3 also reveals reasonable

°E and 40°N to 60°N.
The first mode of the anomaly data, shown in Fig. 11, clearly highlights the Eastern European region. This mode accounts for 42.3 % of the variance. On the spatial maps, both MODIS and MISR exhibit strong positive signals, while SeaWiFS and OMI have a weak or absent signal. The two AERONET stations located in this region also have positive signals, in accordance with MODIS and MISR but in disagreement with SeaWiFS and OMI. The PC time series of this mode exhibits a high peak in August 2010. Therefore, this mode is most likely associated with the documented intense Russian wildfire in the summer of 2010 (Witt et al., 2011; Konovalov et al., 2011; Chubarova et al., 2012). The spatial patterns of the four satellite datasets indicate that MODIS and MISR capture this event, while it is less well represented in the SeaWiFS and OMI datasets. To confirm this conclusion, we compare the time series of the AERONET data and the satellite data at the Moscow_MSU_MO station, located at the center of the positive anomaly with the strongest signal. The results are presented in Fig. 12, and it is clearly seen that the AERONET data at this station are mostly temporally flat except for

In this study, we introduce a new spectral decomposition technique based on Principal Component Analysis and Maximum Covariance Analysis. By extracting the modes of variability that maximize the covariance between the combined satellite field and ground-based AERONET observations, the CMCA has the advantage of evaluating each individual dataset using AERONET simultaneously.
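The idea of maximizing the covariance between a combined satellite field and the station observations can be illustrated with a minimal NumPy sketch. It is a generic maximum covariance analysis computed via a singular value decomposition of the cross-covariance matrix, applied to spatially concatenated satellite anomalies. It is not necessarily the exact formulation used in this work: the function name `cmca`, the array layout (time by grid point, time by station), the lack of any normalization or area weighting, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def cmca(sat_fields, stations):
    """Hypothetical sketch: MCA between a combined satellite field and stations.

    sat_fields : list of (n_time, n_grid_k) arrays, one per satellite dataset
    stations   : (n_time, n_station) array of station time series
    """
    # Work with anomalies: remove the temporal mean at every location.
    sat_anom = [f - f.mean(axis=0) for f in sat_fields]
    sta_anom = stations - stations.mean(axis=0)

    # Concatenate all satellite datasets along the spatial dimension.
    combined = np.hstack(sat_anom)
    n_time = combined.shape[0]

    # Cross-covariance between every satellite grid point and every station.
    cov = combined.T @ sta_anom / (n_time - 1)

    # Singular vectors are the paired spatial patterns of each mode.
    u, s, vt = np.linalg.svd(cov, full_matrices=False)

    # Fraction of squared covariance explained by each mode.
    scf = s**2 / np.sum(s**2)

    # Split the combined patterns back into one block per satellite dataset.
    edges = np.cumsum([f.shape[1] for f in sat_fields])[:-1]
    sat_patterns = np.split(u, edges, axis=0)

    # Expansion coefficients (time series) of each mode.
    sat_pcs = combined @ u
    sta_pcs = sta_anom @ vt.T
    return sat_patterns, vt.T, sat_pcs, sta_pcs, s, scf


# Synthetic example (assumed sizes): 72 months, 4 datasets, 6 stations,
# all sharing one annual-cycle signal plus weak noise.
rng = np.random.default_rng(0)
n_time, n_grid, n_sta = 72, 50, 6
signal = np.sin(2 * np.pi * np.arange(n_time) / 12)
sat_fields = [
    np.outer(signal, rng.normal(size=n_grid))
    + 0.1 * rng.normal(size=(n_time, n_grid))
    for _ in range(4)
]
stations = np.outer(signal, rng.normal(size=n_sta)) + 0.1 * rng.normal(
    size=(n_time, n_sta)
)
sat_patterns, sta_patterns, sat_pcs, sta_pcs, s, scf = cmca(sat_fields, stations)
```

Because the satellite patterns of a mode are split per dataset while the station pattern is shared, the sign and strength of each dataset's pattern near a station can be compared directly with the station's own loading, which is the kind of comparison made in the spatial maps discussed here.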

Fig. 8. Variances explained by the first 20 CMCA modes of global satellite and AERONET anomaly data.

Fig. 14. The first mode of the anomaly data over the Indian subcontinent. Unlike Fig. 13, on interannual time scales, MODIS and MISR best represent the AOD variability over the Gangetic Plain, while the SeaWiFS and OMI patterns have less coherency with AERONET.
Such information is critical in many aspects of satellite data application, such as developing aerosol parameterization schemes and extending station measurements to a broader spatial context. In this study, we develop a new technique, the Combined Maximum Covariance Analysis (CMCA), to bridge the gap between MCA and CPCA by examining and comparing the spatial and temporal variability retrieved by multiple satellite sensors as well as incorporating more randomly distributed ground-based station data such as AERONET. Compared with previous techniques, the advantages of the CMCA include:

over the Sahel due to its underestimation of AOD during the (boreal) fall and overestimation of AOD during the (boreal) spring. The CMCA successfully confirms these conclusions with the help of AERONET. Li et al. (2013b) also investigated the problem for Central Asia around the Taklamakan desert and indicated that the low sampling frequency of MISR may miss dust emission events and thus lead to an underestimation of the variability. Unfortunately, there is no AERONET station in this area to confirm this hypothesis. The disagreement over Borneo Island in Mode 2 comes from the positive signals seen on the MODIS and MISR maps, with no signal in OMI. SeaWiFS has consistently missing data over this region due to its difficulty in cloud screening with the lack of IR channels (personal communication with Andrew Sayer, August 2012). Again, no AERONET station is available here. As this region is a major biomass burning source region (Generoso et al., 2003; van der Werf