Aerosol retrieval experiments in the ESA Aerosol_cci project

Within the ESA Climate Change Initiative (CCI) project Aerosol_cci (2010–2013), algorithms for the production of long-term total column aerosol optical depth (AOD) datasets from European Earth Observation sensors are developed. Starting with eight existing precursor algorithms, three analysis steps are conducted to improve and qualify the algorithms: (1) a series of experiments applied to one month of global data to understand several major sensitivities to assumptions needed due to the ill-posed nature of the underlying inversion problem, (2) a round robin exercise of “best” versions of each of these algorithms (defined using the step 1 outcome) applied to four months of global data to identify mature algorithms, and (3) a comprehensive validation exercise applied to one complete year of global data produced by the algorithms selected as mature based on the round robin exercise. The algorithms tested included four using AATSR, three using MERIS and one using PARASOL. This paper summarizes the first step. Three experiments were conducted to assess the potential impact of major assumptions in the various aerosol retrieval algorithms. In the first experiment a common set of four aerosol components was used to provide all algorithms with the same assumptions. The second experiment introduced an aerosol property climatology, derived from a combination of model and sun photometer observations, as a priori information in the retrievals on the occurrence of the common aerosol components. The third experiment assessed the impact of using a common nadir cloud mask for AATSR and MERIS algorithms in order to characterize the sensitivity to remaining cloud contamination in the retrievals against the baseline dataset versions.
The impact of the algorithm changes was assessed for one month (September 2008) of data: qualitatively by inspection of monthly mean AOD maps and quantitatively by comparing daily gridded satellite data against daily averaged AERONET sun photometer observations for the different versions of each algorithm globally (land and coastal) and for three regions with different aerosol regimes. The analysis allowed for an assessment of sensitivities of all algorithms, which helped define the best algorithm versions for the subsequent round robin exercise; all algorithms (except for MERIS) showed some, in parts significant, improvement. In particular, using common aerosol components and partly also a priori aerosol-type climatology is beneficial. On the other hand, the use of an AATSR-based common cloud mask meant a clear improvement (though with significant reduction of coverage) for the MERIS standard product, but not for the algorithms using AATSR. It is noted that all these observations are mostly consistent for all five analyses (global land, global coastal, three regional), which can be understood well, since the set of aerosol components defined in Sect. 3.1 was explicitly designed to cover different global aerosol regimes (with low and high absorption fine mode, sea salt and dust).

Published by Copernicus Publications on behalf of the European Geosciences Union. T. Holzer-Popp et al.: Aerosol retrieval experiments in the ESA Aerosol_cci project.


Introduction
The IPCC has identified anthropogenic aerosols as the most uncertain climate forcing constituent (IPCC, 2007; GCOS-92, WMO, 2004), which calls for further work to improve all types of available observations. The satellite aerosol retrieval situation (even with the most recent specific aerosol instruments) can be characterized as follows (see e.g. Kokhanovsky and de Leeuw, 2009; de Leeuw et al., 2011). The first algorithms worked with only one or two independent measurements (which required assumptions about all but the one retrieval parameter, aerosol optical depth, AOD). The second generation of algorithms/instruments provides several independent observations (spectral, angular, polarization) to better limit the retrieval solution and reduce the number of a priori assumptions. Due to the non-linear, non-isotropic and non-homogeneous propagation of light through the earth-atmosphere system, the sensitivity and thus the retrievable information is different for every sensor/algorithm combination. These sensitivities depend on atmospheric aerosol load and its characteristics as well as properties of the underlying surface, the presence of clouds, the presence of trace gases and instrument characteristics such as spectral range, polarization and viewing angles. Therefore, products from different instruments cannot easily be compared or merged even if converted to a common reference wavelength. On the other hand, the complementary sensitivities of different instruments hold the potential to increase the number of observations if used in a synergetic way.
The primary objective of the study described in this paper is to better understand and quantify the reasons for differences between the various aerosol products from the different algorithms and sensors described in Sect. 4. The assessment was based on a detailed inter-comparison of the different algorithm approaches. In order to quantify the influence of each assumption, several experiments were then carried out by producing global one-month datasets from eight precursor algorithms with different prescribed aerosol properties and cloud masking.
Section 2 summarizes the analysis concept of Aerosol_cci. The common steps to improve and harmonize the algorithms are described in Sect. 3. These steps included the definition of common aerosol components and an aerosol-type climatology, and the definition of a common nadir cloud mask. Section 4 describes the algorithms participating in the analysis and the specific implementation of the experiments for each of them. Section 5 gives an overview of the datasets produced, the evaluation tools used and the results of the experiments. The results are discussed in Sect. 6.

The analysis concept of Aerosol_cci
Within the ESA Climate Change Initiative (CCI, Hollmann et al., 2013), 13 Essential Climate Variables are under investigation, each of them in a dedicated project. Following GCOS principles, each project started with a thorough analysis of user requirements and subsequently of available algorithms for producing consistent, satellite-based, long-term data sets. With regard to the aerosol variables, the project Aerosol_cci (July 2010–July 2013) brings together the major European aerosol retrieval experts and the AeroCom (aerosol model inter-comparison initiative) user community represented by its leaders. Aerosol_cci focuses on European total column AOD retrieval algorithms. In addition, the OMI/SCIAMACHY absorbing aerosol index and GOMOS stratospheric extinction profiles are also considered in the project (not analysed here, since they do not provide total AOD).
The overall concept for the qualification of AOD algorithms in Aerosol_cci consists of three steps: (1) several algorithm experiments conducted on a minimal statistically significant amount of data in order to understand the effects of major assumptions (this paper); (2) a round robin exercise (four months, one in each season) to evaluate the improved algorithms versus a more comprehensive independent ground-based dataset and thus identify mature algorithms (de Leeuw et al., 2013); and (3) the production of a complete validated one-year ECV product for assessment by the climate model community.
For the experiments, datasets covering the entire globe for one complete month (September 2008) were chosen as a compromise between statistical significance and production effort with eight algorithms and several experiments. The evaluation of the datasets was conducted by consideration of statistical parameters (mean bias, root mean square error (RMSE), and Pearson correlation coefficient) obtained from comparison of gridded 1° latitude-longitude daily satellite products (level 3) versus AERONET daily averaged aerosol optical depth (AOD) interpolated to a reference wavelength of 0.55 µm. All experiments began with an analysis of baseline datasets for each algorithm (prior to making any changes).
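The three evaluation statistics named above can be illustrated with a minimal sketch. It assumes the satellite and AERONET values have already been collocated into paired daily AOD values at 0.55 µm; the function name `validation_stats` and the sample numbers are hypothetical and are not taken from the project's evaluation tools.

```python
import math


def validation_stats(satellite_aod, aeronet_aod):
    """Return (mean bias, RMSE, Pearson r) for paired daily AOD values.

    Pairs where either value is None (no retrieval or no AERONET
    observation on that day) are skipped.
    """
    pairs = [(s, a) for s, a in zip(satellite_aod, aeronet_aod)
             if s is not None and a is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two collocated pairs")

    diffs = [s - a for s, a in pairs]
    bias = sum(diffs) / n                      # satellite minus AERONET
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_s = sum(s for s, _ in pairs) / n
    mean_a = sum(a for _, a in pairs) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in pairs)
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    var_a = sum((a - mean_a) ** 2 for _, a in pairs)
    pearson_r = cov / math.sqrt(var_s * var_a)
    return bias, rmse, pearson_r


# Hypothetical collocated values (not project data).
satellite = [0.12, 0.30, None, 0.08, 0.45]
aeronet = [0.10, 0.25, 0.20, 0.09, 0.50]
bias, rmse, r = validation_stats(satellite, aeronet)
print(f"bias={bias:.3f} rmse={rmse:.3f} r={r:.3f}")
```

Reporting all three numbers together is useful because a constant offset shows up in the bias and RMSE while leaving the correlation untouched.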
This paper summarizes the algorithm experiments conducted as listed in Table 1 and the analysis made on them: (0) baseline datasets produced with the precursor algorithms prior to any changes; (1) the use of a common set of optical aerosol properties with four components, which were externally mixed; (2) the additional use of an AeroCom/AERONET-based aerosol-type climatology as a priori information; (3) the use of a common AATSR nadir cloud mask for ENVISAT algorithms. It should be noted that not all algorithms could conduct all experiments due to a number of technical constraints.
Following the analysis discussed here, the round robin analysis (de Leeuw et al., 2013) also included assessments of satellite level 2 datasets (10 km super pixels) as well as inter-comparisons to external reference datasets from other satellite instruments (MODIS, MISR) or from models such as the AeroCom median.

Common changes to the algorithms for the experiments
The algorithm development within Aerosol_cci was based on existing precursor algorithms with an initial focus on ENVISAT sensors and PARASOL with a later extension to predecessor instruments (e.g. on board ERS-2) and successor sensors (e.g. Sentinels). The key aerosol Essential Climate Variable (ECV) product of Aerosol_cci is global multi-spectral AOD with additional information on aerosol type/aerosol optical properties, both including pixel-wise error information. The following three sub-sections describe the setup of the three experiments conducted to study the sensitivity of the retrieval results to two of the three most critical parts of aerosol retrieval algorithms: assumptions on aerosol optical properties and cloud masking. The third critical element, namely surface treatment, is typically intrinsic to each retrieval algorithm and thus was not (yet) assessed for all algorithms. For aerosol optical properties and cloud masking the ultimate goal was to arrive at harmonized definitions for a community algorithm.

Definition of common aerosol components
Aerosol size distributions in global modelling and satellite retrieval are commonly approximated by multi-modal log-normal number size distributions (Seinfeld and Pandis, 1998), covering a size range from a few nanometres to several tens of micrometres:
$$\frac{\mathrm{d}N(r)}{\mathrm{d}\ln r} = \sum_{i=1}^{n} \frac{N_i}{\sqrt{2\pi}\,\ln\sigma_i} \exp\left[-\frac{(\ln r - \ln r_{gi})^2}{2\ln^2\sigma_i}\right] \qquad (1)$$
where each log-normal mode is defined by three parameters: aerosol number concentration $N_i$, number mode radius $r_{gi}$ and (geometric) standard deviation $\sigma_i$.
For use in satellite retrievals, only those particles need to be included which are large enough to be detected by optical instruments, i.e. with sizes larger than about 0.05 µm in radius. For those particles the scattering efficiency differs significantly from zero. Furthermore, because physical, chemical and optical properties of particles with radii smaller or larger than about 0.5 µm are usually quite different, the size distributions used in aerosol retrievals are usually described by a bi-modal distribution (n = 2 in Eq. 1). The two size modes are commonly referred to as the fine and coarse modes.
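A bi-modal log-normal number size distribution of this kind can be evaluated with a short sketch. It uses the standard log-normal form in d N / d ln r and the standard relation r_eff = r_g exp(2.5 ln² σ) between number mode radius and effective radius for a log-normal mode. The function names and the mode parameters below are illustrative placeholders, not the values of Table 2.

```python
import math


def lognormal_mode(r, n_i, r_g, sigma):
    """dN/dln(r) of a single log-normal mode.

    n_i   : number concentration of the mode
    r_g   : number mode radius (same unit as r)
    sigma : geometric standard deviation (> 1)
    """
    ln_sigma = math.log(sigma)
    z = (math.log(r) - math.log(r_g)) / ln_sigma
    return n_i / (math.sqrt(2.0 * math.pi) * ln_sigma) * math.exp(-0.5 * z * z)


def size_distribution(r, modes):
    """Sum of log-normal modes; `modes` is a list of (n_i, r_g, sigma)."""
    return sum(lognormal_mode(r, n_i, r_g, sigma) for n_i, r_g, sigma in modes)


def effective_radius(r_g, sigma):
    """Effective radius (ratio of third to second moment) of one mode."""
    return r_g * math.exp(2.5 * math.log(sigma) ** 2)


# Placeholder fine and coarse modes (radii in micrometres).
fine = (1000.0, 0.07, 1.7)
coarse = (1.0, 0.6, 2.0)
for radius in (0.05, 0.1, 0.5, 1.0, 5.0):
    print(radius, size_distribution(radius, [fine, coarse]))
print("fine-mode effective radius:", effective_radius(fine[1], fine[2]))
```

Because the effective radius grows with the width of the mode, two modes with the same number mode radius can have quite different effective radii.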
For the Aerosol_cci experiments, the choices made for $r_{gi}$ and $\sigma_i$ are presented for each size mode in Table 2 (de Leeuw et al., 2013). These choices were based on probability distribution statistics derived from AERONET analysis, provided in Fig. 1, and a detailed literature review of the various definitions currently in use in the eight precursor and other aerosol retrieval algorithms. In basing these choices on AERONET statistics, the authors are well aware of existing limitations (bi-modal size distribution, assumptions on refractive indices) but consider this dataset the most comprehensive and uniform available source of aerosol property knowledge. Table 2 also provides the complex refractive indices used for the mid-visible region. The two fine mode types are taken as the two extremes in terms of absorption; the reality (in terms of absorption) is a combination of these two types. As can be seen in the joint probability distribution of the upper part of Fig. 1, based on AERONET sun photometer data, the most frequent fine mode size (in terms of the effective radius) is near 0.14 µm, which was thus chosen as the characteristic value for the Aerosol_cci fine mode definitions shown in Table 2.
The coarse mode is dominated by two quite different aerosol types: spherical non-absorbing sea salt particles and non-spherical absorbing dust particles. Based on an AERONET probability distribution for the coarse mode shown in the lower part of Fig. 1, the effective radius was set to 1.94 µm for these two coarse mode aerosol types. Here it is noted that for sea salt aerosol the size distribution is slightly different from the one recently derived by Sayer et al. (2012), based on version 2 of the AERONET retrieval algorithm (Dubovik and King, 2000; Dubovik et al., 2006; Sinyuk et al., 2007); Sayer et al. (2012) derived an effective radius of 2 µm, a variance of 0.72 and a refractive index of (1.363, 3 × 10⁻⁹). The assumed log-normal size mode is wider than the fine mode. It is noted that a small contribution of aerosol particles larger than 15 µm in radius cannot be ruled out. For dust the variability of effective radii between different regions is depicted in Fig. 2. As a result of this variability, any global definition can only describe an average and may differ from the reality in each specific retrieval case.
For the calculation of the aerosol optical properties for the aerosol types in Table 2, the particles are assumed spherical and a Mie code can be applied, except for dust, for which the aerosol is modelled as an ensemble of randomly oriented spheroids with scattering kernels generated based on Dubovik et al. (2006) using a combination of T-matrix and improved geometrical optics calculations. The distribution of aspect ratios ranging between 1.44 and 3.0 was derived by Dubovik et al. (2006) by fitting phase matrices of dust measured by Volten et al. (2001) in laboratory experiments. Although spheroids may be unable to represent the entire shape complexity of dust, this spheroid method is preferable to methods for spheres. An important issue is also the choice of the correct refractive index for dust (Volten et al., 2001). Observational data for the Sahara region (Dubovik et al., 2002; Sinyuk et al., 2003) demonstrate that the dust absorbing strength is spectrally dependent and decreases from the UV (imaginary part of the refractive index near 0.005) to the near-IR (imaginary part of the refractive index near 0.001). Dust indices of refraction vary with source region and are not well characterized globally, so that an area of significant uncertainty still remains.
All experiments described in this paper (except the baseline references) use this definition of four basic aerosol components, which are externally mixed, with the mixing fractions in some cases (partly) retrieved in a way specific to each algorithm.

Definition of a common aerosol component climatology
Having defined four common aerosol components for use in the retrieval algorithms (Table 2), the particular aerosol model applied to each retrieval pixel can then be determined by three external mixing fractions of AOD550 (aerosol optical depth at 0.55 µm, the usual mid-visible reference wavelength): the fine mode fraction of total AOD, the fraction of weakly absorbing fine mode AOD of total fine mode AOD, and the dust fraction of the coarse mode AOD.
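How the three mixing fractions partition a total AOD550 into the four common components can be sketched as follows. The function name `component_aods` and the dictionary keys are hypothetical labels chosen for this illustration; the arithmetic simply follows the definitions of the three fractions given above.

```python
def component_aods(total_aod, fine_fraction, weak_abs_fraction, dust_fraction):
    """Split total AOD at 0.55 um into the four common components.

    fine_fraction     : fine mode AOD / total AOD
    weak_abs_fraction : weakly absorbing fine mode AOD / fine mode AOD
    dust_fraction     : dust AOD / coarse mode AOD
    """
    for name, value in (("fine_fraction", fine_fraction),
                        ("weak_abs_fraction", weak_abs_fraction),
                        ("dust_fraction", dust_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")

    fine = total_aod * fine_fraction
    coarse = total_aod - fine
    return {
        "fine_weakly_absorbing": fine * weak_abs_fraction,
        "fine_strongly_absorbing": fine * (1.0 - weak_abs_fraction),
        "dust": coarse * dust_fraction,
        "sea_salt": coarse * (1.0 - dust_fraction),
    }


# Hypothetical pixel: half fine mode, mostly weakly absorbing, little dust.
parts = component_aods(0.4, 0.5, 0.75, 0.25)
for component, aod in parts.items():
    print(component, round(aod, 3))
```

By construction the four component AODs always sum to the total, so three fractions are sufficient to describe the external mixture.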
The aerosol component experiments differ in the way these fractions were determined. In the first experiment, algorithms tested a completely free retrieval of the three fractions and their associated AOD. In the second experiment, a priori information for these three fractions, all or in part (depending on capabilities of the different algorithms) based on climatological data, was introduced. Since no global daily a priori maps of the aerosol type for 2008 are available, a climatology was used. Such a climatology has been extracted from AeroCom model median global monthly maps (Kinne et al., 2006, Appendix) which were locally improved by using high quality statistics on the occurrence of aerosol components available from analysis of ground-based remote sensing from the AERONET sun photometer network (Holben et al., 1998). Climatological data produced with this combination of AeroCom model and AERONET measurements for the month of September for the three mixing fractions (and for reference the total AOD) are presented in Fig. 3. In order to demonstrate the implementation of the common aerosol model in the retrieval algorithms, this figure does not show parameters such as single scattering albedo, but rather the three mixing fractions used in the retrievals.

Selection of a common cloud mask
Reliable cloud masking is an essential part of aerosol remote sensing algorithms as cloud contamination can significantly increase measured reflected radiance and thus retrieved AOD. In recent years, radiative transfer in the vicinity of clouds has also come under discussion, as it is not as straightforward to detect cloud-contaminated pixels as it may seem (Koren et al., 2007, 2008). Especially for satellite observations with spatial resolution on the order of 1 km this may result in significant misinterpretation (Koren et al., 2008; Coakley et al., 2005). Thus cloud masking also has to take into account some "twilight" or "safety" zone around clouds to reduce impacts of three-dimensional effects or contamination from sub-pixel clouds on aerosol retrievals.
Cloud masking is an application of satellite remote sensing with a long history. Cloud information for different applications (such as cloud properties, atmospheric sounding, aerosol or sea surface remote sensing, vegetation and land surface observations) has different requirements on cloud detection schemes. Most cloud detection techniques use similar physical principles, but there are large differences in thresholds defined in accordance with the intended application. Consequently, cloud masking results differ, even when applied to the same sensor.
To exclude cloud masking effects on the results from the different retrieval algorithms, an experiment was carried out in which all participating aerosol retrieval algorithms used a common cloud mask. In order to choose a well performing cloud mask with a reasonable effort, a set of 17 globally distributed scenes from four different days in September 2008 (1, 6, 7, and 25) was selected. These scenes covered the most difficult conditions (in terms of aerosol remote sensing) with different types of clouds and partly coincident high aerosol loadings (heavy smoke plumes from biomass burning, airborne dust transported over the ocean, industrial haze). Visual identification of obvious cloud and aerosol patterns in true colour composite images from the MODerate resolution Imaging Spectro-radiometer (MODIS) was used as additional independent comparison.
An example result of one single scene analysis is shown in Fig. 4. No effort was made to use common projections and visualisations, as the main differences became visible even with a qualitative analysis of images as provided by the respective partners.
It is evident in Fig. 4 that all AATSR cloud masks are able to detect the main cloud features well, but the three AATSR nadir cloud masks are not identical. The cloud fraction estimated by the ESA operational AATSR mask is generally much higher than that from APOLLO, thus leaving fewer observations for aerosol retrieval (higher sensitivity near cloud edges, classification of structured land surfaces as clouds, artificial patterns arising over ocean). Part of the smoke plume (shown in the MODIS RGB image) is flagged as cloudy by the ESA operational AATSR mask. On the other hand, the ESA operational mask partly fails to detect clouds associated with shallow inland convection. The ADV AATSR mask misses part of the cirrus band. The MERIS standard algorithm cloud flag fails to detect the cirrus cloud band due to the lack of TIR channels.
The analysis of all 17 scenes (not shown) led to the following overall outcome: the ESA operational AATSR mask fails to detect a substantial amount of closed large-scale stratocumulus fields with high reflectance. APOLLO, and to some extent also the ESA operational AATSR cloud mask, classify inland water bodies and river estuary regions with high amounts of dissolved particles or shallow water as cloudy. For heavy dust events APOLLO and the ESA operational AATSR mask frequently fail to distinguish between dust and cloud. The ESA operational AATSR cloud mask fails to detect part of the shallow (scattered) clouds with warm tops. The FMI AATSR cloud mask misses part of the cirrus cloud cover. Potential high-reaching biomass burning plumes are classified as cloudy by all masks. The MERIS standard algorithm cloud flag has a generally lower cloud fraction compared to all AATSR masks and it partly fails to detect scattered (strato-) cumulus clouds within dust plumes.
After evaluating the strengths and limitations of the different cloud masks used in the precursor algorithms, it was agreed to use the APOLLO cloud mask as the common nadir cloud mask for the third experiment. Because it is evident that APOLLO also cannot provide a perfect cloud mask, a safety zone of 10 km around any cloudy pixel was adopted (making this a highly conservative experiment designed to minimize cloud contamination).
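The effect of a safety zone can be illustrated by dilating a binary cloud mask. The sketch below assumes a regular grid of roughly 1 km pixels, so that a 10 km zone corresponds to a radius of about 10 pixels, and it uses a Euclidean disc as the neighbourhood; both choices, and the function name `add_safety_zone`, are assumptions of this illustration and not a description of the APOLLO implementation.

```python
def add_safety_zone(cloud_mask, radius_px):
    """Return a mask that also flags every pixel within radius_px of a cloud.

    cloud_mask : list of rows, each a list of booleans (True = cloudy)
    radius_px  : safety zone radius in pixels (Euclidean distance)
    """
    rows, cols = len(cloud_mask), len(cloud_mask[0])
    # Pre-compute the pixel offsets that fall inside the disc.
    offsets = [(dy, dx)
               for dy in range(-radius_px, radius_px + 1)
               for dx in range(-radius_px, radius_px + 1)
               if dy * dy + dx * dx <= radius_px * radius_px]

    masked = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if not cloud_mask[y][x]:
                continue
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < rows and 0 <= xx < cols:
                    masked[yy][xx] = True
    return masked


# Small hypothetical scene with a single cloudy pixel in the centre.
scene = [[False] * 7 for _ in range(7)]
scene[3][3] = True
for row in add_safety_zone(scene, 2):
    print("".join("#" if flagged else "." for flagged in row))
```

Even a single cloudy pixel removes a sizeable neighbourhood from the retrieval, which is consistent with the loss of coverage reported for this experiment.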
The APOLLO scheme is based on a variety of different tests with solar and thermal channels including reflectance ratios, brightness temperature differences and histogram tests. The method was originally developed by Saunders and Kriebel (1988) and re-evaluated and updated by Kriebel et al. (2003) for the AVHRR operated on the NOAA satellite series. The APOLLO algorithm has also been transferred to a set of other satellite sensors including AATSR on board ENVISAT. For application in the field of aerosol retrieval, another set of updates to the AATSR adaptation of APOLLO was described in Holzer-Popp et al. (2008). It is important to note that the APOLLO adaptation to AATSR differs significantly from the ESA standard cloud mask algorithm for AATSR.
As an illustration of spatial aerosol retrieval limitations in the experiments conducted, Fig. 5 presents the global monthly mean cloud fraction for September 2008 obtained with the APOLLO method applied to AATSR (adapted as described in Holzer-Popp et al., 2008). It is evident that over ocean the cloud fraction is generally higher than over land, especially in the subtropical subsidence regions, and only a few regions over land have more than 60 % cloud-free observations. This common nadir cloud mask was used with the native AATSR orbit pixel resolution of 1 km (at nadir) in the third experiment of Aerosol_cci for the retrievals using ENVISAT morning observations (AATSR, MERIS, synergetic AATSR + SCIAMACHY); it could not be directly transferred to the afternoon PARASOL data due to the large temporal variability of clouds (here no cloud mask experiment was made).

Algorithms participating in the analysis
The initial focus of Aerosol_cci was on ENVISAT instrumentation (AATSR, MERIS, SCIAMACHY), whose visible reflectances are cross-calibrated to within a few percent (Kokhanovsky et al., 2007). With several instruments on the same platform, the temporal and spatial collocation differences are also minimized when inter-comparing retrieval products. One additional European instrument assessed was PARASOL due to the very high information content of its multi-spectral, multi-angular and polarization measurements. The precursor algorithms, which provide total column aerosol optical depth and were included in the experiments, are listed in Table 3 together with their main characteristics. All algorithms apply cloud, snow and glint masking prior to aerosol retrieval. All core retrieval algorithm features (numbers of observations used, surface treatment) remained unchanged for each algorithm through all experiments in this study. The aerosol model and cloud mask information given in Table 3 is valid only for the baseline datasets; both were altered in the experiments conducted. For each algorithm the key features and the specific way of implementing the experiments are briefly discussed in the following sub-sections.

The FMI AATSR retrieval (ADV/ASV)
The AATSR dual-view (ADV) algorithm over land is based on the so-called k assumption, where the ratio (k) of the ground reflectances for the two views is assumed to be independent of wavelength (Flowerdew and Haigh, 1995). The k ratio is computed at the 1.61 µm wavelength, where reflectance due to aerosols is in first approximation assumed to be negligible compared to ground reflectance. Over ocean the aerosol single view (ASV) algorithm minimizes the discrepancy between the TOA-measured and the modelled reflectances. Uncertainty of the retrieved AOD was determined on pixel level by propagating the measurement error in the TOA reflectance through the retrieval process (Kolmonen et al., 2013, and references cited therein).
For this study the two basic aerosol components in the precursor algorithm were replaced by the four common components defined in Sect. 3.1. The aerosol climatology described in Sect. 3.2 was implemented in three different ways. First, the fine mode fraction, the mixture between absorbing/non-absorbing fine particles, and the dust fraction were used without modifications in the retrieval (experiment 2). Second, the fine mode fraction was retrieved. Third, the fine mode fraction and the mixture between absorbing/non-absorbing fine particles were retrieved (experiment 1). All mixtures were treated as external ones. The new version is more complex when compared to the baseline algorithm because one new mixture is introduced. The fine mode fraction was converted to the above-mentioned mixing ratio during retrieval. For the cloud mask experiment no. 3 described in this paper, the common cloud mask was collocated in the algorithm and used instead of the default ADV/ASV cloud screening (Curier et al., 2009).

The Oxford RAL Aerosol and Cloud retrieval (ORAC)
ORAC is an optimal estimation retrieval scheme designed for the retrieval of aerosol and/or cloud properties from AATSR (as well as the upcoming Sea and Land Surface Temperature Radiometer that will form part of the European Sentinel series satellites). It provides rigorous uncertainty propagation (providing pixel-by-pixel uncertainty on retrieved quantities) and inclusion of a priori knowledge and ensures that the most is made of the information provided by the measurements, by calculating all retrieved parameters as a function of all measurements simultaneously. Surface reflectance is retrieved as a bi-hemispherical reflectance, with the directional dependence of the BRDF being constrained by a priori values provided by the MODIS MCD43B surface BRDF product over land (Sayer et al., 2011) and a comprehensive sea surface reflectance model (Sayer et al., 2010) over ocean.
Incorporating the common aerosol models (Sect. 3.1) into ORAC was achieved by producing a total of ten new aerosol classes based on the climatology presented in Fig. 3, which represent the full range of fine mode absorbing/non-absorbing ratios and dust/sea salt ratios found in the climatology. For the "free retrieval" ORAC product (experiment 1), the aerosol class for each retrieved pixel was selected based on a chi-squared goodness of fit measure. For the prescribed aerosol-type product (experiment 2), the class was selected to match the climatology, although the coarse-fine mode ratio was still free to vary (as the effective radius remains a retrieval parameter). Since using the APOLLO-based common cloud mask resulted in a significant decrease in the quality of the product, the respective experiment is not included in this paper.
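Selecting an aerosol class by a chi-squared goodness of fit measure can be sketched in a few lines. This is a simplified illustration, not the ORAC code: it assumes each candidate class has already been fitted and supplies its best-fit modelled measurements, and it uses only the measurement term of the cost (an optimal estimation cost function typically also contains an a priori term, which is omitted here). The names `chi_squared` and `select_class` and all numbers are hypothetical.

```python
def chi_squared(measured, modelled, sigma):
    """Sum of squared residuals, each normalised by its measurement uncertainty."""
    return sum(((m - f) / s) ** 2 for m, f, s in zip(measured, modelled, sigma))


def select_class(measured, sigma, class_fits):
    """Return the class name whose best-fit model has the lowest chi-squared.

    class_fits maps a class name to the modelled measurements obtained
    when the retrieval is run with that class.
    """
    return min(class_fits,
               key=lambda name: chi_squared(measured, class_fits[name], sigma))


# Hypothetical reflectances in three channels and their uncertainties.
measured = [0.100, 0.080, 0.060]
sigma = [0.010, 0.010, 0.010]
class_fits = {
    "class_01": [0.102, 0.090, 0.061],
    "class_02": [0.120, 0.081, 0.060],
    "class_03": [0.101, 0.079, 0.062],
}
best = select_class(measured, sigma, class_fits)
print(best, chi_squared(measured, class_fits[best], sigma))
```

Normalising each residual by its uncertainty keeps noisy channels from dominating the choice of class.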

The Swansea University AATSR retrieval (SU)
The algorithm is based on iterative optimization of AOD and aerosol model subject to multiple constraints (over land a multi-angular constraint, over ocean a spectral constraint). The uncertainty in the retrieved AOD is derived from the curvature of the error surface near the minimum, and per-channel instrument and surface model uncertainties. The retrieval of aerosol properties is normally made at a coarser grid than the sensor resolution, to allow computational efficiency and to minimize registration error. For the ATSR series, the ratio of surface reflectances at the nadir and forward viewing angles is well correlated across wavebands, and the variation in anisotropy may be modelled simply (Veefkind and de Leeuw, 1998; North et al., 1999). This avoids the need for assumptions on absolute surface brightness or spectral properties. The method differs from other approaches by using a more sophisticated physically based surface model to account for spectral variation of the surface anisotropy owing to the variation of the fraction of scattered light with wavelength (North et al., 1999).
For the subsequent experiments the atmospheric look-up table set was replaced by a new set derived using the common aerosol model definition for four pure components. For experiment 1, the best fitting pure component model was returned without a priori assumptions. For experiment 2, a larger set based on 35 external mixtures of these components was derived, and continuous component fractions defined by the local climatology were estimated by tetrahedral interpolation of radiative components. The climatology model was used as an a priori estimate of aerosol type, and retrieval proceeded using this in addition to a fixed set of mixtures. The best model was chosen based on optimisation as before, with weighting of the error function parameterized to favour the climatology model and to force the retrieved model to the climatology for low AOD (< 0.2), where the constraint from the data is weak. The APOLLO common cloud mask (experiment 3) including the safety zone was implemented for nadir viewing; however, the original cloud mask was used to further screen clouds in the forward view since a common forward cloud mask was not defined.

The Synergetic Aerosol Retrieval for AATSR and SCIAMACHY (SYNAER)
The synergistic aerosol retrieval method SYNAER delivers AOD and an estimation of the type of aerosols from a predefined representative set of aerosol mixtures in the lower troposphere over both land and ocean by exploiting a combination of a radiometer (e.g. AATSR) and a spectrometer (e.g. SCIAMACHY). The radiometer is used for inversion of AOD and related surface reflectance for the different aerosol mixtures, for which the selection is then based on spectral fitting of the collocated spectrometer measurements. It is important that the entire method for both sensors uses the same aerosol model and radiative transfer code. Pixel level AOD uncertainties are estimated with a parameterization increasing with two terms: surface reflectance (aerosol-surface discrimination error) and AOD (aerosol-type discrimination error). The SYNAER information content for aerosol type was analysed theoretically in Holzer-Popp et al. (2008).
In this study the four common optical components were used to define a set of 36 aerosol mixtures covering a realistic range of atmospheric aerosol conditions.Whereas experiment 1 allowed free retrieval of the three mixing fractions, for experiment 2 the common AeroCom/AERONET climatology was used to identify the nearest discrete mixture along the three mixing fractions.Since SYNAER uses the APOLLO cloud mask, the experiment on the common cloud mask was of limited scope for SYNAER.However, the different size of the safety zone (5 km for SYNAER baseline and 10 km for the common cloud mask) was tested in the third experiment.As the nadir-only approach of SYNAER (adopted for consistency with successor instruments AVHRR + GOME-2 on board METOP) makes the surface brightness parameterization important, an experiment was conducted for this algorithm only (not shown here), where the surface albedo of all retrieval pixels was reduced by 0.01 at the retrieval wavelengths of 0.67 µm (over land) and of 0.87 µm (over ocean).
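The selection of the nearest discrete mixture in experiment 2 can be illustrated with a minimal sketch. The grid of mixing fractions below is purely hypothetical (the actual 36 SYNAER mixtures are not listed here), and the Euclidean distance is an assumed similarity measure, not necessarily the one used in SYNAER.

```python
from itertools import product

# HYPOTHETICAL discretisation of the three mixing fractions. It happens to
# give 4 x 3 x 3 = 36 combinations, but it is NOT the actual SYNAER mixture set.
FINE_MODE_FRACTIONS = (0.1, 0.4, 0.7, 0.9)
LESS_ABSORBING_FRACTIONS = (0.0, 0.5, 1.0)
DUST_FRACTIONS = (0.0, 0.5, 1.0)

MIXTURES = list(product(FINE_MODE_FRACTIONS,
                        LESS_ABSORBING_FRACTIONS,
                        DUST_FRACTIONS))


def nearest_mixture(climatology, mixtures=MIXTURES):
    """Return the discrete mixture closest to the climatological fractions.

    `climatology` is a tuple (fine mode fraction, fraction of the less
    absorbing component in the fine mode, dust fraction in the coarse mode).
    Closeness is measured with a squared Euclidean distance (an assumption).
    """
    def squared_distance(mixture):
        return sum((m - c) ** 2 for m, c in zip(mixture, climatology))

    return min(mixtures, key=squared_distance)


if __name__ == "__main__":
    print(nearest_mixture((0.65, 0.4, 0.9)))
```

In a free retrieval (experiment 1) all mixtures would be candidates for the spectral fit, whereas here the climatology alone fixes the mixture.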

The Bremen Aerosol retrieval for MERIS (BAER)
For the determination of AOD from observations of the MERIS single-view multi-spectral imager, the Bremen AErosol Retrieval algorithm (BAER) (von Hoyningen-Huene et al., 2003, 2006) solves the radiative transfer equation for the aerosol reflectance after subtracting Rayleigh path reflectance calculated with a digital elevation model (GTOPO30). Over land the variable surface albedo is considered by a mixing model of surface reflectance of "green vegetation" and "bare soil" tuned by the normalized differential vegetation index. Over ocean a similar mixing model is used, tuning water leaving reflectance by mixing of a clean ocean spectrum with one of coastal water using the normalized differential pigment index for tuning. The Fresnel reflectance of the water surface is modelled using Cox and Munk (1954).
For experiment 1 the common aerosol optical components defined in Sect. 3.1 have been implemented in BAER with a limited number of fixed mixtures of them. BAER has not yet been adapted to the common cloud mask.

The ESA MERIS standard retrieval
The MERIS standard aerosol retrieval algorithm over land was originally designed to work over Dense Dark Vegetation (DDV) targets and was extended to brighter surfaces (as DDV spatial cover is low) where the spectral albedo can be predicted, as it is linearly related to the Atmospherically Resistant Vegetation Index (ARVI). In calculating ARVI, MERIS benefits from the available blue channel, which is missing in AATSR. Cloud contamination is the main issue of the standard product, as the ESA standard MERIS cloud mask is not robust enough over land (the MERIS instrument has no thermal infrared bands).
Assessing the use of the common cloud mask (experiment 3) derived from AATSR APOLLO (together with the 10 km safety zone) was thus of high interest for this MERIS algorithm, although this led to a vast reduction of the data coverage. For experiment 2, the common aerosol components defined in Sect. 3.1 were implemented together with their geospatial prescription through the common aerosol-type climatology defined in Sect. 3.2. A free retrieval of aerosol type (experiment 1) was not tested, since MERIS does not provide enough information to retrieve the aerosol mixing fractions.
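As an illustration of the index mentioned above, the following sketch computes ARVI from blue, red and near-infrared reflectances using the commonly cited definition with a weighting factor gamma (gamma = 1 is assumed here as default). The linear albedo predictor is only a placeholder: its slope and intercept are hypothetical parameters, not the coefficients used in the MERIS standard algorithm.

```python
def arvi(nir, red, blue, gamma=1.0):
    """Atmospherically Resistant Vegetation Index.

    The red reflectance is replaced by a "red-blue" combination,
    rb = red - gamma * (blue - red), which reduces the sensitivity
    of the index to atmospheric aerosol. gamma = 1 is an assumed default.
    """
    red_blue = red - gamma * (blue - red)
    return (nir - red_blue) / (nir + red_blue)


def predicted_albedo(arvi_value, slope, intercept):
    """Linear albedo prediction from ARVI.

    `slope` and `intercept` are HYPOTHETICAL parameters that would have to
    be fitted per wavelength; no values from the actual algorithm are implied.
    """
    return intercept + slope * arvi_value


if __name__ == "__main__":
    index = arvi(nir=0.40, red=0.05, blue=0.04)
    print(round(index, 4))
```

With gamma = 0 the expression reduces to the ordinary normalized difference of near-infrared and red reflectances.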

The Aerosol Load and Altitude from MERIS over Ocean retrieval for MERIS (ALAMO, ocean only)
The MERIS ALAMO (Aerosol Load and Altitude from MERIS over Ocean) algorithm has been primarily developed for aerosol altitude retrievals using MERIS data (Dubuisson et al., 2009). Necessary inputs for altitude retrievals, such as aerosol optical properties, are derived in a first step with an initial assumption on the layer altitude. The cloud masking and AOD retrieval schemes are a close adaptation of the MODIS algorithm. The aerosol products of ALAMO include the optical depth and the mixing ratio of fine and coarse modes. Aerosol models used for the ALAMO baseline are the same as the ones used for the most current version of MODIS products. In a second step the altitude of the aerosol layer is estimated using the MERIS O2 A absorption channel. A pixel reclassification is done after the altitude retrieval to remove high thin clouds based on a threshold on altitude and spatial variance of altitude. The set of common aerosol components for experiment 1 was implemented in ALAMO with a number of fixed mixtures and allowing retrieval of the fine/coarse ratio, whereas the other mixing fractions were kept fixed.

The PARASOL retrieval (ocean only)
In this study only the PARASOL aerosol retrieval over ocean is considered. It is based on a comparison between spectral, directional and/or polarized radiances and look-up tables built for a set of aerosol models, different AOD and geometrical conditions. Thanks to the use of directional and polarized information, several parameters (size, refractive index, shape) describing aerosol properties can be derived when the scattering angle range is large enough (at least 125-155°). The first step of the algorithm is to perform cloud screening and then correct for ozone or water vapor absorption effects (Vesperini et al., 1999) and for potential stratospheric contamination (Lafrance and Herman, 1998). Overall, the cloud detection has been shown to be efficient except for situations with overcast high thin clouds that are difficult to identify.
When using the Aerosol cci components in free retrieval the algorithm did not mix the two fine mode aerosol components. Unlike ENVISAT, the equatorial crossing time of PARASOL is 13:30, which makes the use of the APOLLO AATSR (10:30 equatorial crossing time) common cloud mask irrelevant.

Results from algorithm experiments

Evaluation approach
For evaluation, 1° × 1° gridded level 3 satellite datasets, produced with the Aerosol cci experiments covering the month of September 2008, were compared with daily averaged AERONET sun photometer data (daylight hours only). The daily satellite data (in fact one daytime snapshot per satellite) were retrieved on the days when AERONET observations in the grid were reported. Coherent pairs of valid daily observations from satellite and sun photometer were thus retained at each station for clear-sky conditions when both sun photometer and satellite retrievals were successful.
AeroCom tools were used to evaluate the Aerosol cci satellite retrieval versions. These tools were initially programmed to perform analyses for the AeroCom project (http://aerocom.met.no/). They perform comparisons between different models and between models and ground-based measurements for many model parameters and can read observations from about 20 different observation networks for different variables (Schulz et al., 2009). The outputs are, e.g. maps, difference maps, measurement number maps, time series on the location of ground-based observations, zonal mean plots, model versus observations scatter plots, statistical analyses (e.g. correlation, bias, RMSE), etc.
In order to facilitate the access to the analyses, the output of the AeroCom tools is presented via a web interface (http://aerocom.met.no/cgi-bin/aerocom/surfobsannualrs.pl). An adaptation of the tools was made to enable them to use gridded daily satellite-based data as a new data source (by treating the satellites as a model) and to account for the specifics of satellites as a data source (e.g. work only with common data points of the different satellite retrievals as described in de Leeuw et al., 2013).
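The pairing described above can be sketched as follows. This is not the AeroCom tool code: the data layout (dictionaries keyed by date and grid cell), the function names and the use of the integer-degree lower-left corner as cell index are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon):
    """Index of the 1 x 1 degree cell containing (lat, lon): lower-left corner."""
    return (math.floor(lat), math.floor(lon))


def daily_station_means(observations):
    """Average sun photometer AOD per station and day.

    `observations` is an iterable of (station, date, aod) tuples.
    Returns {(station, date): mean AOD}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for station, date, aod in observations:
        entry = sums[(station, date)]
        entry[0] += aod
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def collocate(satellite, station_means, station_coords):
    """Keep only days on which both satellite and station report a value.

    `satellite` maps (date, cell) to the gridded daily AOD of one dataset;
    `station_coords` maps station name to (lat, lon).
    Returns a list of (station, date, satellite AOD, reference AOD).
    """
    pairs = []
    for (station, date), reference in sorted(station_means.items()):
        cell = grid_cell(*station_coords[station])
        value = satellite.get((date, cell))
        if value is not None:
            pairs.append((station, date, value, reference))
    return pairs
```

Days with a sun photometer value but no successful satellite retrieval in the grid cell (and vice versa) simply drop out of the list of pairs.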
To quantify the performance of the different versions of each retrieval algorithm in the experiments, reference data sets were compiled from sun photometer data. High-quality AOD data are provided by the ground-based sun/sky photometer networks of AERONET, PHOTONS, SKYnet and GAW (Holben et al., 2001). In contrast to aerosol remote sensing from space, these ground-based transmission measurements require no a priori assumption of aerosol absorption or radiative background. The error in individual retrieved AOD measurements has been estimated (Eck et al., 1999; Dubovik et al., 2002) to be ∼ 0.01, or 5-10 % for AOD values smaller than 0.2. Using AERONET reference data averaged over the day somewhat increased their uncertainties in cases of highly variable aerosol conditions. Even though limited to the land-based observation sites, having access to a global set of sun/sky photometer data provided the possibility to establish solid statistics. In this evaluation total AOD interpolated to 0.55 µm from the direct sun observations of AERONET, level 2, version 2 was used.
This paper focuses on the global statistical analysis (separated for land and coastal sites) and in addition looks into a few selected regions representative of different aerosol regimes. The differentiation between land and coastal pixels was made on the basis of the land/sea flag in the ORAC datasets on a 10 km grid. The regions analysed are South America and surrounding oceans (60° S-20° N, 105-30° W, representing biomass burning), northern Africa and the Mediterranean (0-45° N, 20° W-50° E, representing mineral dust), and East Asia (0-50° N, 90-150° E, representing anthropogenic smoke). Europe and North America, where the bulk of the global AERONET stations are located, are included in the global statistics. AeroCom tools also provide monthly mean AOD maps calculated from daily satellite data. These monthly mean maps were also visually inspected to judge the overall differences between algorithms and versions (or their reduction) and in particular to see how far features of the global aerosol distribution could be resolved by the algorithms.
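The interpolation method is not specified here; one common way to interpolate sun photometer AOD to 0.55 µm is a two-wavelength Ångström power law, sketched below. The choice of the bracketing channels (0.44 and 0.675 µm in the usage example) is an assumption.

```python
import math


def angstrom_exponent(aod1, wl1, aod2, wl2):
    """Angstrom exponent from AOD at two wavelengths (same unit for wl1, wl2)."""
    return -math.log(aod1 / aod2) / math.log(wl1 / wl2)


def aod_at(target_wl, aod1, wl1, aod2, wl2):
    """Interpolate AOD to `target_wl` assuming AOD ~ wavelength ** (-alpha)."""
    alpha = angstrom_exponent(aod1, wl1, aod2, wl2)
    return aod1 * (target_wl / wl1) ** (-alpha)


if __name__ == "__main__":
    # Assumed example values: AOD 0.30 at 0.44 um and 0.20 at 0.675 um
    print(aod_at(0.55, 0.30, 0.44, 0.20, 0.675))
```

By construction the power law reproduces the two input values exactly, and the interpolated value at 0.55 µm lies between them.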
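The regional selection can be expressed as simple bounding boxes. The coordinates below are the ones given in the text; the dictionary layout and function name are illustrative assumptions.

```python
# (lat_min, lat_max, lon_min, lon_max) in degrees; west and south are negative
REGIONS = {
    "South America": (-60.0, 20.0, -105.0, -30.0),                 # biomass burning
    "Northern Africa/Mediterranean": (0.0, 45.0, -20.0, 50.0),     # mineral dust
    "East Asia": (0.0, 50.0, 90.0, 150.0),                         # anthropogenic
}


def regions_for(lat, lon):
    """Return the names of all analysis regions containing the station."""
    return [
        name
        for name, (lat_min, lat_max, lon_min, lon_max) in REGIONS.items()
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]


if __name__ == "__main__":
    print(regions_for(-10.0, -60.0))   # a station in Amazonia
    print(regions_for(50.0, 10.0))     # a central European station: global only
```

Stations outside the three boxes, such as most European and North American sites, contribute only to the global statistics.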

Experiment analysis results
The analysis of the impact of the various algorithm experiments was made by visual inspection of monthly mean AOD maps and quantitative comparison statistics against AERONET daily AOD measurements. As an initial way of assessing the results of the various experiments, Figs. 6-9 show maps of monthly mean (simple average of all gridded daily AOD values) for September 2008. These maps allowed checking qualitatively whether typical large-scale features of the global aerosol distribution could be retrieved by the various algorithms (e.g. biomass burning plumes west of South Africa, dust plume west of the Sahara, seasonal biomass burning in South America and South Africa, industrial/urban pollution in West China and India, low AOD values in remote oceanic regions and higher mid-latitudes and over large mountain regions such as Tibet or the Rocky Mountains, dust loading in semi-arid regions). Also the global coverage of the different datasets could be estimated, showing some differences in extending to high latitudes and cloud-induced gaps in the tropics.
Figure 6 shows the baseline datasets for all eight algorithms prior to the experiments. Although there is qualitative agreement on several of the characteristic features, there are also many qualitative and quantitative differences between the maps over both land and ocean (e.g. global oceans, biomass burning in South America and Africa, industrial pollution in East China and India, Europe and North America) and large differences in the global mean AOD values ranging from 0.09 to 0.32.
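The unweighted monthly mean used for the maps can be sketched as below, together with the number of contributing days per grid box. The representation of the daily grids as nested lists with `None` for missing values is an assumption chosen to keep the example self-contained.

```python
def monthly_mean(daily_grids):
    """Simple (unweighted) average of all valid daily values per grid box.

    `daily_grids` is a list of 2-D lists of equal shape, one per day, with
    None marking grid boxes without a valid retrieval on that day.
    Returns (mean_grid, count_grid); the mean is None where no day contributed.
    """
    n_rows = len(daily_grids[0])
    n_cols = len(daily_grids[0][0])
    mean = [[None] * n_cols for _ in range(n_rows)]
    count = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            values = [grid[i][j] for grid in daily_grids
                      if grid[i][j] is not None]
            count[i][j] = len(values)
            if values:
                mean[i][j] = sum(values) / len(values)
    return mean, count
```

Because every valid day has the same weight, grid boxes sampled on only one or two days can show monthly values dominated by single events, which is one reason to inspect the count maps alongside the mean maps.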
Figure 7 shows the results of the first experiment using the common aerosol components for seven of the retrievals. There is a general tendency for better agreement of the features for most but not all of the algorithms and a significant reduction of the differences in the global mean AOD values, now ranging from 0.13 to 0.22. High AOD regions due to biomass burning in South America and Central/South Africa and over adjacent oceans, or due to North African dust are now at least partly visible in all of them, which could be a result of allowing larger absorption with mid-visible single-scattering albedo of the strongly absorbing component at 0.8 (e.g. the MERIS algorithms, see Table 3). Also over India, East China, Europe and North America the differences between most of the datasets are reduced, although the features still do not agree everywhere. Four algorithms now have similar background oceanic AOD (and the other three agree at least in the tropics). MERIS has very high AOD at high latitudes and for SYNAER the high AOD features in Central Asia of the baseline dataset get even more pronounced.
Figure 8 shows the results of the second experiment, where the AeroCom/AERONET climatology was used as a priori information for the aerosol mixing fractions, for five algorithms. Here, the agreement of features over land between the three AATSR and the one MERIS algorithm in the (sub-)tropics is further enhanced. ADV shows very high AOD at high latitudes and increased AOD over tropical oceans. SYNAER has generally much lower AOD, which is assumed to be partly linked to an interplay of aerosol absorption (now prescribed by the climatology) and the surface parameterization developed using different assumptions on aerosol absorption. Over ocean some retrievals seem to completely fail with the climatology-prescribed aerosol mixture (not meeting fit quality criteria). Global mean AOD values of the algorithms range from 0.07 to 0.26, which is a wider spread than in experiment 1.
The use of the common cloud mask in the third experiment is shown for four retrievals in Fig. 9. It is noted that ADV and MERIS-STD used the common climatology (thus comparing experiment 3 directly to experiment 2), while SU and SYNAER went back to free retrieval (thus comparing experiment 3 to experiment 1). The impact of using the common cloud mask is relatively small for the retrievals using AATSR. For SYNAER (where only the size of the safety zone was changed because APOLLO was used in the baseline version already) a reduction of numbers of available dark fields was observed, which led to a minor change of the resulting map. The largest change is visible in the MERIS-STD dataset when using the AATSR-based common cloud mask, with a significant reduction in coverage and a general reduction of AOD values. Global mean AOD values of the algorithms show a similar range as in experiment 2.
For the maps shown in Figs. 6-9 the numbers of pixels contributing in each grid box to the monthly mean are shown in the appendix (Figs. A1-A4). It is obvious that MERIS and PARASOL provide more pixels (up to 15) than all three AATSR datasets (up to 6), due to the much smaller swath of AATSR, and even more than SYNAER, which is limited in addition to AATSR by the nadir/limb alternating of SCIAMACHY (up to 3). Large differences of numbers occur for the three MERIS datasets (BAER exploiting by far the most pixels, which points to an issue with cloud flagging, and ALAMO exploiting only few pixels). Coverage often does not change much between the different experiments, with few exceptions. Major increases of numbers are observed for SU and ALAMO in experiment 1, whereas clear decreases happen for BAER in experiment 1, for SU in experiment 2 over ocean and for MERIS-STD in experiment 3 due to using the AATSR cloud mask with half of the coverage only.

Fig. 6. September 2008 unweighted monthly mean AOD550 for eight total column precursor algorithms: baseline datasets (experiment number 0 in Table 1). From top left to bottom right: AATSR ADV, AATSR ORAC, AATSR SU, SYNAER, MERIS BAER, MERIS STANDARD, MERIS ALAMO (ocean only), PARASOL (ocean only).
For a quantitative analysis, scatter plots have been used to assess bias, RMSE and correlation with AERONET measurements. This analysis is separated for land and coastal stations to grasp the different performance of the retrieval algorithms over land and ocean (though the land/ocean specific parts of each algorithm have not been changed throughout the experiments). Over the open ocean the number of reference points (from the MAN network) was not sufficient for statistical analysis. Figures A5 to A12 in the supplementary material show these scatter plots of satellite AOD at 0.55 µm versus AERONET (all gridded daily means) for the datasets produced by the various algorithms and experiments as shown in Figs. 6 to 9, separated over land and ocean. It is obvious that, except for PARASOL, all datasets showed a clearly weaker performance than the reference datasets (e.g. MODIS, MISR).
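The three statistics can be computed from the satellite/AERONET data pairs as in the following minimal sketch. The sign convention of the bias (satellite minus AERONET) is an assumption, and the function is not taken from the AeroCom tools.

```python
import math


def evaluation_statistics(satellite, reference):
    """Mean bias, RMSE and Pearson correlation for paired AOD values.

    `satellite` and `reference` are equally long sequences of collocated
    daily AOD values. Bias is defined here as satellite minus reference
    (an assumed convention).
    """
    n = len(satellite)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two matching data pairs")

    differences = [s - r for s, r in zip(satellite, reference)]
    bias = sum(differences) / n
    rmse = math.sqrt(sum(d * d for d in differences) / n)

    mean_sat = sum(satellite) / n
    mean_ref = sum(reference) / n
    covariance = sum((s - mean_sat) * (r - mean_ref)
                     for s, r in zip(satellite, reference))
    var_sat = sum((s - mean_sat) ** 2 for s in satellite)
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    correlation = covariance / math.sqrt(var_sat * var_ref)

    return {"n": n, "bias": bias, "rmse": rmse, "correlation": correlation}
```

Returning the number of pairs together with the statistics is deliberate: with few pairs, as in some of the regional analyses, the statistics carry little weight.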
For each algorithm and each experiment conducted, the results of the analysis depicted in the scatter plots are summarized in Tables 4a-e. These data clearly show that the number of data points coincident with AERONET observations may change significantly between the different algorithms and between the experiments. Similar observations to those on the numbers of all pixels used in the maps (Figs. A1-A4) can be made with the satellite-AERONET data pairs. Evidently, the wider swath of MERIS provides larger numbers of data points compared to AATSR. The synergetic retrieval constrained by two instruments and with even larger pixel size had the lowest number of data points, as expected. PARASOL also has a large pixel size, resulting in fewer data points. But even when the same sensor was used, the numbers differed due to different quality thresholds for cloud-free super pixels of 10 × 10 km2 within the level 2 products. Outstanding changes in numbers of data pairs between experiments of one algorithm are obvious for MERIS. For the MERIS standard algorithm, the use of the common cloud mask based on AATSR led to a major decrease, as one would expect in view of the reduction to half of the swath width. However, the decrease went much further and yielded even significantly fewer pixels than for AATSR-only algorithms. The reason for this needs further investigation. Interestingly, for ALAMO the coverage almost doubled with using the common aerosol components, while SYNAER had a reduced number of data pairs. This indicates sensitivity in algorithm convergence, depending on aerosol assumptions.

Fig. 7. September 2008 unweighted monthly mean AOD550 for seven total column precursor algorithms as in Fig. 6: datasets with use of common aerosol components and (partly) free retrieval (experiment number 1 in Table 1).
In addition to the global analysis for all land and all coastal stations, regional analysis was conducted in three regions each representing a different aerosol regime (South America biomass burning, northern Africa/Mediterranean mineral dust, East Asia anthropogenic smoke). Interpretation of the results for the different regions needs to account for the different numbers of data pairs in each region and the implications for the statistical reliability. This is also why the visual assessment of aerosol features and coverage in the monthly mean maps discussed earlier was taken into account. Filtering for common data points over all four experiments and eight algorithms, as applied in the round robin analysis described in de Leeuw et al. (2013) for which four months' worth of data was analysed, was not done here, since this would have reduced the number of common points in all 24 datasets to a statistically weak sample.

Fig. 8. September 2008 unweighted monthly mean AOD550 for five total column precursor algorithms as in Fig. 6: datasets with use of common aerosol components and AeroCom/AERONET climatology as prescription or a priori information (experiment number 2 in Table 1).
The statistical parameters listed in Tables 4a-e are plotted in Figs. 10-14 for the global land, global coastal, South American, northern African/Mediterranean and East Asian analysis in order to get a better overview of results of the experiments. Comparisons are always made between subsequent experiments of one algorithm. Note that as an exception experiment 3 compares to experiment 1 in two cases as denoted in Tables 4a-e, and for the MERIS-STD algorithm experiment 2 compares to experiment 0, since no free retrieval was possible. It should also be noted that for two of the eight algorithms only results over ocean are available (highlighted in the plots), where correlation values are typically higher and RMSE lower than over land.
Figures 10-14 show the following changes in the sequence of experiments (mostly similar for the three regions and the global land or coastal analysis): for AATSR-only algorithms the introduction of the common aerosol properties and/or the use of the AeroCom/AERONET climatology as a priori led to (sometimes even very large) improvement of correlations and partly to a slight reduction of RMSE and/or bias, whereas the use of the common cloud mask showed little or even a small negative effect. For SYNAER the bias was reduced by the common aerosol properties and even further by the increased safety zone in the cloud mask (probably reducing remaining cloud contamination), but the use of the aerosol-type climatology increased the bias, which is probably due to the fact that the surface parameterization used for all experiments assumes too bright a surface albedo and thus prefers the wrong aerosol absorption; impact on RMSE and correlations was mostly small or even negative. The MERIS algorithms did not benefit from the common aerosol properties with or without the use of the a priori aerosol component climatology. Using the AATSR-based common cloud mask led to a slight improvement of all statistical parameters for the MERIS standard algorithm. For PARASOL, with its highest information content and smallest dependence on aerosol model assumptions, only the impact of using the common aerosol properties was assessed. Generally the impact is quite small, with either correlations or RMSE improving (sometimes the other quantity then degrades a bit). The largest positive impact is observed over North Africa/Mediterranean (both RMSE and correlation improve), whereas for East Asia a negative impact is observed (based on small numbers of data pairs).

Fig. 9. September 2008 unweighted monthly mean AOD550 for four total column precursor algorithms as in Fig. 6: datasets with use of common aerosol components and (partly) free retrieval or AeroCom/AERONET climatology as a priori information and common cloud mask (experiment number 3 in Table 1).

Discussion and conclusions
Several experiments were conducted in order to understand the role of major modules in eight European aerosol retrieval algorithms. For the experiments, datasets covering the entire globe and one complete month (September 2008) were produced in order to allow for statistical analysis and include cases in all climate zones and with all major types of aerosol. It can be questioned whether a one-month global dataset is sufficient for identifying the impacts of algorithm changes, but this approach was chosen as a pragmatic trade-off between statistical soundness and processing efforts. The subsequent analysis steps (round robin exercise with four months, one in each season, see de Leeuw et al. (2013), and validation with complete twelve months of the same year, in preparation) show that the limited analysis of only one month of global data summarized here has helped to identify possible improvements (both demonstrated in this paper, such as the revised optical aerosol model, and identified for subsequent algorithm development, such as post-processing to further reduce cloud contamination) and to ultimately reach algorithms which performed significantly better than the baseline algorithms.
The evaluation of the datasets was conducted by assessing statistical parameters (mean bias, root mean square error (RMSE), and Pearson correlation coefficient) of gridded 1° latitude-longitude daily satellite products (level 3) versus AERONET daily averaged AOD, with both the retrievals and the sun photometer measurements interpolated to a reference wavelength of 0.55 µm. All experiments started from an analysis of baseline datasets for each algorithm prior to any changes. Up to three experiments were conducted with the various precursor algorithms: use of common optical aerosol properties, additional use of a common aerosol-type climatology, and use of a common cloud mask.
Across algorithms the use of a common definition of aerosol components harmonized the retrievals (in terms of their internal construction), which is documented in a clearly reduced range of global AOD mean values obtained from the participating algorithms and, in part, better similarity between regional features across the eight algorithms. The analysis versus AERONET data shows an improvement of all algorithms (except for MERIS) in at least one or sometimes several statistical parameters (including coverage). The additional use of a common climatology of aerosol type led to further improvement of several algorithms, but could also reduce the algorithm performance. Obviously a monthly model-based climatology also has its limitations, which may in some cases outweigh the benefit from constraining the retrievals. Using the common cloud mask (which still had important deficiencies, namely missing forward cloud mask and missing dust flag) led to minor changes or even decreased accuracy for one algorithm. On the other hand, an increased size of the safety zone around clouds was shown to be beneficial (tested with one algorithm) at the expense of data coverage. In the one case of applying the AATSR-based common cloud mask to a MERIS algorithm a clear improvement in all statistical parameters was shown, but at the cost of a major reduction in pixel numbers. It is noted that all these observations are mostly consistent for all five analyses (global land, global coastal, three regional) with few exceptions, which are likely linked to the small numbers of successful retrievals matching up with AERONET data in some cases. This can be understood well, since the experiments did not make any changes to the ocean/land specific parts of the algorithms and the set of aerosol components defined in Sect. 3.1 was explicitly designed to cover different aerosol regimes (with low and high absorption fine mode, sea salt and dust).
The experiments allowed studying the sensitivities of each participating algorithm and drawing conclusions for the improved setup of the round robin algorithm in the subsequent analysis step. For all three AATSR algorithms the common definition of aerosol components should be used, as well as the a priori constraints on the mixing ratios from the climatology. Over ocean, SU coverage was reduced significantly by using the climatology, and for ORAC overall accuracy decreased slightly. The common cloud mask, which could only be tested for the ADV and SU algorithms, should not be used in the form tested in the third experiment (without forward mask and dust flag), since it had little impact (ADV) or even decreased accuracy (SU) and introduces an additional external dependence. In the ORAC algorithm an error occurred in the common cloud mask implementation which needs correction. In general for AATSR algorithms, further work was identified as necessary for reducing cloud contamination by post-processing. Additionally, problems at high latitudes were found for ADV which need analysis and correction. Among the three AATSR algorithms SU seems to be more accurate overall, which might be due to the advanced surface treatment, but the good results might also be due to the stricter quality control, which results in relatively poor data coverage. It is pointed out that the purpose of this study was not to select a "best" algorithm among those for one sensor; this task is part of the subsequent round robin exercise.

Fig. 12. Regional evaluation results for the different algorithms and experiments over South America and surrounding oceans. Each plot shows from top to bottom correlation, RMSE and bias. The x axis gives the number of the respective experiment as defined in Table 1. Each plot provides results for one individual algorithm. Plots shaded in grey refer to results for experiments/algorithms only over ocean.
For SYNAER, the common aerosol components and the increased cloud safety zone showed some positive impact, but the use of the aerosol-type climatology decreased coverage and accuracy. Improving the surface parameterization was identified as highest priority (singular experiment not shown here), before being able to draw further conclusions on the added value of the aerosol-type climatology. For all three MERIS algorithms, the common aerosol components should be used for consistency with the AATSR results, although for the algorithm versions tested this led to some reduction in product accuracy. For the ALAMO algorithm over ocean, the use of the common components increased coverage significantly but reduced its accuracy. Since the common cloud mask (only tested for the ESA operational algorithm) allowed for accuracy improvement, but at the cost of a significantly reduced coverage, a user-oriented trade-off between coverage and accuracy with regard to the cloud contamination is needed for each application. For the PARASOL algorithm the stability of its results with regard to the assumptions on aerosol properties was confirmed, together with a reduced noise when using the common components, in particular in the mineral dust region tested.
So far the highest accuracy was observed for PARASOL (ocean only) with convincing statistics. The three AATSR retrievals showed clear improvement over the various experiments. The nadir-only algorithms (MERIS, SYNAER), which are more dependent on their surface parameterizations, need further improvement of this critical aspect; MERIS algorithms are also very sensitive to cloud contamination, with the opportunity for improvement in synergy with AATSR.
In conclusion it can be stated that the experiments revealed opportunities for algorithm improvement (as shown for the use of the common aerosol components and in part also the a priori aerosol climatology) and identified critical sensitivities where further work is needed. MERIS cloud masking was improved by using the AATSR mask, which benefits from the availability of thermal infrared bands; however, this severely limited the available MERIS swath, resulting in a significant reduction of coverage. Generally, other means of reducing remaining cloud contamination such as a larger safety zone (tested for SYNAER), with the drawback of further ... It should be noted that the cloud screening in AERONET leads to a bias towards cloud-free conditions, so that the evaluation of different cloud masks is limited in its extent.
To some extent harmonization between different algorithms, different sensors and different principles has been achieved by the use of both common optical aerosol components and an aerosol climatology, as documented in global mean AOD values and the enhanced similarity of monthly mean maps.This is important as it will facilitate future merged datasets, because the AOD due to each component can be compared between the retrievals.For total AOD this harmonization is mostly important for those retrievals with the lowest information content.The common cloud mask needs further improvement: utilization of the forward view, improving the discrimination between desert dust outbreaks and clouds, post-processing applied as a secondary means to avoid cloud contamination.
In a subsequent step the analysis has been extended from one month of global data to four months (one in each season) to further substantiate the results with a larger data amount in a round robin analysis (de Leeuw et al., 2013) but then limited to only one "best" version for each algorithm.The choices for these "best" algorithm versions in this round robin exercise have been based on the experiments described in this paper.This subsequent step showed further improvements of several of the algorithms (e.g. by post-processing to avoid cloud contamination, and/or by detecting and correcting bugs in some algorithms).Ultimately a full global one-year dataset of Aerosol Essential Climate Variables will be produced and will be validated and assessed by aerosol climate model users.
As one potentially critical element of aerosol retrievals, the treatment of brightness and directional reflections of land surfaces could not be assessed by similar experiments since it is a core component of each algorithm that cannot be easily modified.Furthermore, it has different importance for different classes of algorithms: dual or multiple viewing instruments (AATSR, PARASOL, MISR) avoid to a first order any dependence on surface brightness by effective decoupling of the surface reflectances and the path radiance or by fully retrieving surface directional behaviour.Nadir-only algorithms (MERIS, synergetic AATSR + SCIAMACHY), on the other hand, are dependent on an assumption or parameterization (from vegetation index and mid-infrared reflectances) of surface brightness (of dark fields or all pixels).It is thus planned to use atmospherically corrected MODIS reflectance data around selected AERONET stations in order to assess the parameterizations used in the respective precursor algorithms.
Over ocean it can generally be assumed that the surface reflectance is very low in the red and infrared bands. However, for specific situations (wind speed dependent white cap fraction, coastal sediments and chlorophyll) this assumption is not valid and auxiliary datasets and parameterizations need to be used. Here, preparations have been made to harmonize auxiliary data (e.g. ECMWF re-mapped wind field analysis) where this is feasible. However, for sediments and chlorophyll, the use of an external climatology or daily dataset would mean that the aerosol-surface separation has already been solved in a different retrieval. Furthermore, evaluation of AOD and surface treatment over ocean is difficult due to the low number of ground-based (ship) observations for September 2008; for later years this situation improves with sun photometer observations from the Marine Aerosol Network (Smirnov et al., 2011).
The results of all analysis steps are available at the Aerosol cci project website, http://www.esa-aerosol-cci.org, where all datasets, experiments and documents are published and available to the scientific community; the website also provides links to the ftp data server at ICARE (accessible on request) and to the open AeroCom and ICARE visualization and analysis tools.

Fig. 1. AERONET probability distribution statistics: the upper panel shows the frequency for aerosol sizes smaller than 0.5 µm as a function of the associated effective radius (x axis) and the aerosol optical depth at 0.44 µm (y axis). The lower panel shows the frequency for aerosol sizes larger than 0.5 µm as a function of the effective radius (x axis) and the aerosol optical depth at 0.44 µm (y axis).

Fig. 3. Climatologies of the three external mixing fractions for the aerosol components of Table 2. The monthly maps shown here are based on AeroCom model median/AERONET AOD550 aerosol type for September; these fractions are used as a priori information for the aerosol type in the retrievals in the second experiment: fine mode fraction (upper left panel), fraction of less absorbing component in the fine mode (upper right panel), and fraction of dust in the coarse mode (lower left panel). As reference (not used as a priori) the AOD550 distribution is also shown (lower right panel).
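As an illustration of how the three mixing fractions of Fig. 3 can describe a mixture of the four common components (low and high absorption fine mode, sea salt and dust), the following minimal sketch splits a total into four component fractions. It assumes that the fractions are nested (the second and third fractions apply within the fine and coarse mode, respectively) and that they refer to AOD550; the function and key names are hypothetical and are not taken from any of the retrieval algorithms.

```python
def component_fractions(fine_fraction, weak_abs_in_fine, dust_in_coarse):
    """Convert three mixing fractions into four component fractions.

    Assumption (not stated in the paper): the fractions are nested, i.e.
    weak_abs_in_fine applies within the fine mode and dust_in_coarse
    applies within the coarse mode. All names are hypothetical.
    """
    inputs = {
        "fine_fraction": fine_fraction,
        "weak_abs_in_fine": weak_abs_in_fine,
        "dust_in_coarse": dust_in_coarse,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    coarse_fraction = 1.0 - fine_fraction
    return {
        "fine_weakly_absorbing": fine_fraction * weak_abs_in_fine,
        "fine_strongly_absorbing": fine_fraction * (1.0 - weak_abs_in_fine),
        "coarse_dust": coarse_fraction * dust_in_coarse,
        "coarse_sea_salt": coarse_fraction * (1.0 - dust_in_coarse),
    }


# Illustrative values only, not read from the climatology maps.
fractions = component_fractions(0.6, 0.75, 0.5)
```

By construction the four fractions sum to one, so multiplying each by the retrieved total AOD550 would give the AOD attributed to each component under these assumptions.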
for 1 September 2008, east off the coast of Madagascar. True colour RGB images from AATSR (left panels) and MODIS-Terra (right panels) show the main cloud-aerosol features from south to north: optically thick stratiform clouds, an east-west cirrus band, an east-west aerosol plume (only visible in MODIS), and north-south bands of convective clouds. Between the two true colour images cloud masks are shown from left to right: a composite of AATSR APOLLO (AVHRR Processing scheme Over Land, cLouds and Ocean, used in SYNAER) and ESA operational (as used in ORAC and SU algorithms) cloud masks, an ADV AATSR cloud mask and the MERIS ESA operational cloud mask.

Fig. 4. Single scene analysis example for 1 September 2008 west off Madagascar. From left to right the image shows: AATSR RGB composite, AATSR cloud mask composite from APOLLO and ESA standard masks, AATSR ADV cloud mask, MERIS ESA standard cloud mask, MODIS-Terra true colour RGB composite. The colour codes in the three cloud masks are as follows: AATSR AP/STD: green = land flagged as cloud free in both masks, blue = water flagged as cloud free in both masks, white = both masks agree to flag as cloudy, red = only standard mask flags as cloudy, yellow = only APOLLO flags as cloudy; AATSR ADV: number of positive cloud tests from 0 (dark blue) over green, yellow to red; MERIS STD: cloud fraction from 0 (black) to 1 (white).

Fig. 5. Global monthly mean cloud cover from DLR APOLLO AATSR (common nadir cloud mask used in Aerosol cci) for September 2008.

Fig. 10. Global evaluation results for the different algorithms and experiments over land. Each plot shows from top to bottom correlation, RMSE and bias. The x axis gives the number of the respective experiment as defined in Table 1. Each plot provides results for one individual algorithm.

Fig. 11. Global evaluation results for the different algorithms and experiments over coastal sites. Each plot shows from top to bottom correlation, RMSE and bias. The x axis gives the number of the respective experiment as defined in Table 1. Each plot provides results for one individual algorithm. Plots shaded in grey refer to results for experiments/algorithms only over ocean.

Fig. 13. Regional evaluation results for the different algorithms and experiments over northern Africa and the Mediterranean. Each plot shows from top to bottom correlation, RMSE and bias. The x axis gives the number of the respective experiment as defined in Table 1. Each plot provides results for one individual algorithm. Plots shaded in grey refer to results for experiments/algorithms only over ocean.

Fig. 14. Regional evaluation results for the different algorithms and experiments over East Asia. Each plot shows from top to bottom correlation, RMSE and bias. The x axis gives the number of the respective experiment as defined in Table 1. Each plot provides results for one individual algorithm. Plots shaded in grey refer to results for experiments/algorithms only over ocean.

Fig. A1. September 2008 monthly number of pixels contributing to the AOD550 maps shown in Fig. 6 for eight total column precursor algorithms: baseline datasets (experiment number 0 in Table 1).

Table 1. Overview of algorithm experiments conducted in Aerosol cci.

Table 2. Log-normal parameters for two coarse and two fine mode aerosol components and their associated mid-visible refractive indices (note that the mode number radius and standard deviation [or variance] define the effective radius, which is the ratio of the 3rd moment to the 2nd moment of the radius distribution). ω0 denotes the single scattering albedo (from de Leeuw et al., 2013).
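The relation between the log-normal parameters and the effective radius noted in the caption of Table 2 can be sketched numerically. The example below assumes that the mode number radius is the median radius r_n of the number distribution and that the width is given as a geometric standard deviation σ_g (conventions differ between algorithms, so this should be checked against the table before use). Under these assumptions the k-th moment is r_n^k exp(k² ln²σ_g / 2), so that the ratio of the 3rd to the 2nd moment reduces to r_eff = r_n exp(2.5 ln²σ_g). Function names and input values are hypothetical and are not taken from Table 2.

```python
import math


def lognormal_moment(r_n, sigma_g, k):
    """k-th raw moment of a log-normal number size distribution.

    Assumptions: r_n is the median (number mode) radius and sigma_g is the
    geometric standard deviation (dimensionless, >= 1).
    """
    return r_n ** k * math.exp(0.5 * k * k * math.log(sigma_g) ** 2)


def effective_radius(r_n, sigma_g):
    """Effective radius as the ratio of the 3rd to the 2nd moment."""
    return lognormal_moment(r_n, sigma_g, 3) / lognormal_moment(r_n, sigma_g, 2)


# Illustrative values only (radius in micrometres), not taken from Table 2.
r_eff = effective_radius(0.07, 1.7)
closed_form = 0.07 * math.exp(2.5 * math.log(1.7) ** 2)
```

The moment ratio and the closed form agree to rounding error, which is a quick consistency check when converting between the two parameter sets.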

Table 3. Precursor algorithms and sensors targeted in Aerosol cci total column experiments.

Table 4a. Summary of global statistical parameters versus AERONET daily mean AOD550 over land stations for the experiments conducted. Each cell provides number of data points (N), mean bias (b), RMSE (σ) and correlation (R).

Table 4b. Summary of global statistical parameters versus AERONET daily mean AOD550 over coastal stations for the experiments conducted. Each cell provides number of data points (N), mean bias (b), RMSE (σ) and correlation (R).

Table 4c. Summary of regional statistical parameters versus AERONET daily mean AOD550 over South America and surrounding oceans for the experiments conducted. Each cell provides number of data points (N), mean bias (b), RMSE (σ) and correlation (R).

Table 4d. Summary of regional statistical parameters versus AERONET daily mean AOD550 over northern Africa/Mediterranean for the experiments conducted. Each cell provides number of data points (N), mean bias (b), RMSE (σ) and correlation (R).

Table 4e. Summary of regional statistical parameters versus AERONET daily mean AOD550 over East Asia for the experiments conducted. Each cell provides number of data points (N), mean bias (b), RMSE (σ) and correlation (R).