Statistically optimized inversion algorithm for enhanced retrieval of aerosol properties from spectral multi-angle polarimetric satellite observations

. The proposed development is an attempt to enhance aerosol retrieval by emphasizing statistical optimization in inversion of advanced satellite observations. This optimization concept improves retrieval accuracy relying on the knowledge of measurement error distribution. Efﬁcient application of such optimization requires pronounced data redundancy (excess of the measurements number over number of unknowns) that is not common in satellite observations. The POLDER imager on board the PARASOL microsatellite registers spectral polarimetric characteristics of the reﬂected atmospheric radiation at up to 16 viewing directions over each observed pixel. The completeness of such observations is notably higher than for most currently operating passive satellite aerosol sensors. This provides an opportunity for profound utilization of statistical optimization principles in satellite data inversion. The proposed retrieval scheme is designed as statistically optimized multi-variable

Abstract. The proposed development is an attempt to enhance aerosol retrieval by emphasizing statistical optimization in inversion of advanced satellite observations. This optimization concept improves retrieval accuracy relying on the knowledge of measurement error distribution. Efficient application of such optimization requires pronounced data redundancy (excess of the measurements number over number of unknowns) that is not common in satellite observations. The POLDER imager on board the PARASOL microsatellite registers spectral polarimetric characteristics of the reflected atmospheric radiation at up to 16 viewing directions over each observed pixel. The completeness of such observations is notably higher than for most currently operating passive satellite aerosol sensors. This provides an opportunity for profound utilization of statistical optimization principles in satellite data inversion. The proposed retrieval scheme is designed as statistically optimized multi-variable fitting of all available angular observations obtained by the POLDER sensor in the window spectral channels where absorption by gas is minimal. The total number of such observations by PARASOL always exceeds a hundred over each pixel and the statistical optimization concept promises to be efficient even if the algorithm retrieves several tens of aerosol parameters. Based on this idea, the proposed algorithm uses a large number of unknowns and is aimed at retrieval of extended set of parameters affecting measured radiation.
Correspondence to: O.  The algorithm is designed to retrieve complete aerosol properties globally. Over land, the algorithm retrieves the parameters of underlying surface simultaneously with aerosol. In all situations, the approach is anticipated to achieve a robust retrieval of complete aerosol properties including information about aerosol particle sizes, shape, absorption and composition (refractive index). In order to achieve reliable retrieval from PARASOL observations even over very reflective desert surfaces, the algorithm was designed as simultaneous inversion of a large group of pixels within one or several images. Such multi-pixel retrieval regime takes advantage of known limitations on spatial and temporal variability in both aerosol and surface properties. Specifically the variations of the retrieved parameters horizontally from pixel-to-pixel and/or temporary from day-to-day are enforced to be smooth by additional a priori constraints. This concept is expected to provide satellite retrieval of higher consistency, because the retrieval over each single pixel will be benefiting from coincident aerosol information from neighboring pixels, as well, from the information about surface reflectance (over land) obtained in preceding and consequent observations over the same pixel.
The paper provides in depth description of the proposed inversion concept, illustrates the algorithm performance by a series of numerical tests and presents the examples of preliminary retrieval results obtained from actual PARASOL observations. It should be noted that many aspects of the described algorithm design considerably benefited from experience accumulated in the preceding effort on developments of currently operating AERONET and PARASOL retrievals, as well as several core software components were inherited from those earlier algorithms.

Introduction
The research presented in this paper aims to develop a new retrieval algorithm optimized for deriving maximum information content using the data redundancy available from advanced satellite observations, such as those from POLDER/PARASOL observations. The design of the POLDER imager allows collecting rather comprehensive characterization of angular distribution of both total and polarized components of solar radiation reflected to space. The observations in window channels where the effect of absorption by atmospheric gases are minimal are usually used for aerosol retrievals. The complete set of such observations collected operationally by POLDER/PARASOL over each pixel includes angular measurements of both total radiances and linear polarization at 0.49, 0.675 and 0.87 µm and angular measurements of only total radiances at 0.44, 0.565 and 1.02 µm. The number of viewing directions is similar for all spectral channels and varies from 14 to 16 depending on observed geographical location. The completeness of such observations is significantly higher comparing to any currently operating passive satellite aerosol sensors. In addition PARASOL provides nearly global coverage every 2 days. Therefore, such complete set of PARASOL observations potentially provides very valuable basis for enhanced characterization of global aerosol.
However, rigorous interpretation of redundant satellite observation is a very challenging task. Indeed, the optimized inversion requires applying complex multi-variable inversion algorithms. Such methods are time-consuming and challenging for implementation. This is why the rigorous methods of inversion optimization are not generally used for processing very large data sets provided by satellite aerosol imagers. Instead, most satellite aerosol retrievals use look-up tables of simulated satellite signals pre-computed for some limited selected scenarios of aerosol and underlying surface combinations. The modeled scenario that provides the best match of the observed radiances is accepted as the retrieved solution. With some modification, this strategy is adopted in most of aerosol satellite retrievals because it allows rapid operational processing of satellite images. For example, it is successfully employed in retrievals of single-view AVHHR (Stowe et al., 1997;Mishchenko et al., 1999;Higurashi and Nakajima, 1999), TOMS (Torres et al., 1998), MODIS Tanré et al., 1997;Remer et al., 2005), etc. At the same time, applying the same methodology for processing observations from imagers with multi-viewing capabilities, such as MISR Martonchik et al., 1998;Kahn et al., 2007Kahn et al., , 2009, SEVIRI (Govaerts et al., 2010;Wagner et al., 2010;Carrer et al., 2010), or POLDER (Deuzé et al., 2001;Herman et al., 2005;Tanré et al., 2011), reveals some deficiencies of the look-up table retrievals. The multidirectional observations have notably higher sensitivity to the details of aerosol and surface properties, and the retrieval of larger number of parameters is expected. Correspondingly, the required comprehensive look-up tables of such observations may have larger dimensions and thus be less suitable for operational use. As a result, most look-up table based algorithms rely only on the selected sub-sets of the observations with highest sensitivity to the aerosol parameters and retrieve reduced set of characteristics. For example, the current POLDER/PARASOL operational retrieval algorithm over ocean  uses measurements of total and polarized reflectances only at two spectral channels (0.67 and 0.87 µm) . The look-up table algorithm works efficiently for these two channels because they are sensitive to the scattering of both fine and coarse mode aerosols. At the same time, observations at these two channels are insensitive to vertical variability of aerosol and not strongly affected by water-leaving radiation. The POLDER/PARASOL retrieval over land (Deuzé et al., 2001) uses only polarized measurements of reflected light at the same two channels. Such strategy is used because the contribution of aerosol into polarized reflectance generally dominates over the contribution of the land reflectance, while contribution of land surface into total reflected radiation is usually comparable or stronger than that of aerosol. Therefore, as discussed by Deuzé et al. (2001), utilization of only polarized observations allows one to derive aerosol properties and to avoid challenging issue of separation of surface and aerosol contributions into the total reflectance. Although this algorithm has successfully provided valuable aerosol retrievals from POLDER observations, several shortcomings were identified in the POLDER aerosol products. First, PARASOL retrieval over land provides information only about fine aerosol particles, because the contribution of large aerosol (predominantly non-spherical dust) over land to the polarized reflectances often is small. In addition, the correct interpretation of PARASOL observations of desert dust even over ocean surface is challenging due to difficulties to model appropriately the light scattering by non-spherical particles of desert dust (Gérard et al., 2005). Second, since the PARASOL algorithm, both over land and ocean, relies on the observations of only two spectral channels, the retrieved aerosol spectral properties are not always fully consistent with the observations at other channels.
The retrieval algorithm proposed here fits the complete set of PARASOL observation in all spectral channels (with the exception of the channels dominated by gaseous absorption such as 0.763, 0.765 and 0.910 µm) and including both measurements of total radiances and linear polarization (if available). Based on this strategy, the algorithm is driven by larger number of unknown parameters and aimed on retrieval of an extended set of parameters affecting measured radiation. For example, the approach allows the retrieval of both the optical properties of aerosol and underlying surface from PARASOL observations over land. Also, comparing to the current operational PARASOL retrieval, the proposed algorithm is designed to provide more detailed information about aerosol properties including the particle size distribution, complex refractive index, parameters characterizing aerosol particle shape and vertical distribution. This setup of the aerosol retrieval algorithm is based on accumulated experience and current understanding of the high potential of using spectral multi-angular polarimetric observations from space for improving global aerosol monitoring. Diverse aspects of aerosol retrieval improvements by using advanced satellite observations have been already demonstrated and outlined in numerous previous studies (e.g. see Kokhanovsky and de Leeuw, 2009). For example, the studies by , Kalashnikova et al. (2005), Kalashnikova and Kahn (2006) demonstrated the possibility of deriving not only aerosol loading but also some information about aerosol particle size, morphology and shape from observations from the MISR imager that provides multiple view observations of total reflectance in 9 directions in 4 spectral channels (0.44, 0.55, 0.67 and 0.87 µm). These studies have suggested a high importance of using multi-angular observation geometry for deriving more detailed aerosol information. However, most of the known comparisons of the aerosol parameters derived from multi-viewing images (such as MISR and POLDER) with the aerosol products of single view satellite sensors do not indicate clear advantage of multi-viewing observation for aerosol monitoring. For example, Kokhanovsky et al. (2007) compared the aerosol retrievals obtained from different satellite platforms over land with ground-based AERONET observations and indicated significant differences in currently available satellite products and did not reveal any notable advantage of one particular satellite sensor. The study suggested that retrieval indeterminacies are likely part of the observed discrepancies, and their reduction will likely be aided by new missions incorporating spectral multi-angular polarimeters. Indeed, sensitivity analysis of Mishchenko and Travis (1997a,b) related the possibility of potential important improvements of satellite aerosol retrievals with use of spectral multi-angular polarization as well as intensity of reflected sunlight. Later studies by Chowdhary et al. (2002Chowdhary et al. ( , 2005 demonstrated the possibility of retrieving the detailed aerosol properties from spectral multi-angular Research Scanning airborne Polarimeter (RSP) over water. This polarimeter is an aircraft-based prototype of the APS instrument projected to be part of the future NASA Glory mission (Mishchenko et al., , 2007. The analysis of RSP observations over land by Waquet et al. (2007Waquet et al. ( , 2009a illustrated the possibility of reliable aerosol retrievals over reflective land surfaces. It should be noted here that Kokhanovsky et al. (2007) did not identify any superiority of POLDER results (included into the comparisons) over other satellite imagers. This fact has several probable explanations. First, although POLDER sensor has multi-viewing polarimetric capabilities, the spectral range of POLDER observation is notably narrower that spectral coverage of some single viewing sensors, such as a MODIS. Similar remark is valid for comparisons on multi-viewing MISR satellite instrument. Second, the algorithm processing POLDER observations was not designed to take full advantage of positive redundancy of spectral multi-angle polarimetric observations. Indeed, the POLDER retrieval algorithm by Deuzé et al. (2001) oriented on rapid operational processing uses polarized reflectance at two visible spectral channels. Therefore, the positive polarimetric information from other spectral channels, as well as any information from total reflectance observation, was not used. Waquet et al. (2007) demonstrated that using polarimetric observations over a wider spectral range is essential for aerosol retrieval over land from polarimetry observations. Waquet et al. (2007Waquet et al. ( , 2009a followed Deuzé et al. (2001) approach and used only polarized reflectances. At the same time, the Waquet et al. (2007Waquet et al. ( , 2009a algorithm was driven by a large number of unknowns and was of significantly higher complexity than the POLDER algorithm. An algorithm of such level of complexity has never been applied to POLDER/PARASOL observations. In addition, one could expect that including total reflectance into such an enhanced retrieval scheme could result in additional improvements of the aerosol retrieval. Indeed, the spectral angular measurements of total reflectance are shown to provide valuable aerosol information even over land surfaces (e.g. Martonchik et al., 2004;Liu et al., 2004;Kahn et al., 2005;Diner et al., 2005;etc.). In addition, rigorous sensitivity studies suggest high importance of using observations of both total and polarized reflectances for reliable aerosol retrieval (Mishchenko and Travis, 1997a,b;Landgraf, 2005b, 2007). That is why the retrieval concept described in this paper pursues inversion of both total radiances and linear polarization measurments and includes implementations of several important algorithm refinements. The realization of this concept is expected to result in enhancement of completeness and accuracy of POLDER aerosol retrieval.
The presented algorithm developments essentially rely on the available positive research heritage from previous remote sensing aerosol retrieval developments, in particular those from the POLDER and AERONET retrieval activities. The general inversion scheme will be designed as multi-term LSM fitting by . Such an inversion strategy allows for the use of a continuous space of solutions instead of a limited set of predetermined solutions as used in look-up table based algorithms. During more than a decade the algorithm developed by  was successfully employed for processing observation of AERONET of ground-based sun/sky-radiometers (Holben et al., 1998). During this period the algorithm has passed notable evolution and several useful modifications were added into the inversion procedure. The modifications of that algorithm were effectively applied for interpretation of coincident up and down-looking remote sensing observations. For example, Sinyuk et al. (2007) conducted the retrieval of both atmospheric aerosol and land surface properties from a combination of AERONET data with coincident MISR or POLDER satellite observations. Gatebe et al. (2010) implemented the joint retrieval of detailed properties of multilayered aerosol and underlying surface reflectance from a 978 O. Dubovik et al.: Statistically optimized inversion algorithm for enhanced retrieval of aerosol properties combination of AERONET data and airborne measurements by NASA's Cloud-Absorption Radiometer. The main details of the resulting fine-tuned numerical inversion scheme are discussed by Dubovik (2004) and below in Sect. 3 of the current paper. The modeling of PARASOL observed reflectances is implemented using approaches and computer codes developed previously for accurate radiative transfer equation solutions and for modeling aerosol single scattering and surface reflectance properties. Specifically, the multiple solar light interactions with the atmosphere and underlying surface are accounted for using the successive order of scattering radiative transfer code by Lenoble et al. (2007). This approach and actual computer code have been used and refined in POLDER-1, POLDER-2 and POLDER/PARASOL data analysis. The land surface reflectance of total solar radiance is approximated by the model of Rahman et al. (1993) that has already been successfully used for interpretation of MISR (e.g. Martonchik et al., 1998), SEVIRI (e.g. Govaerts et al., 2010 and RSP (e.g. Litvinov et al., 2010) observations. The reflectance of polarized radiation by land surface is approximated by using the models developed by Nadal and Bréon (1999) and Maignan et al. (2009) from the observations by POLDER. These models were also validated by RSP observations (Litvinov et al., 2011). The modeling of aerosol single scattering properties is adopted from AERONET developments. This scattering model seems rather suitable for applying to multi-angular polarimetric observations, since this model was demonstrated to accurately reproduce the observations by ground-based radiometers that have high sensitivity to the fine features of angular and spectral aerosol scattering. For example, the software developed by Dubovik et al. (2006) for AERONET allows for very fast simulations of scattering by non-spherical aerosol particles. As discussed by Herman et al. (2005) and Gérard et al. (2005), the adequate modeling of scattering by non-spherical aerosol particles is critical for analysis of PARASOL observations.
In addition, as a part of the PARASOL aerosol algorithm improvement, a new aspect has been introduced into the concept of satellite data inversion. Specifically, in order to overcome some difficulties related to the limited information of the PARASOL observations over a single pixel, the retrieval is organized as a simultaneous inversion of a large group of pixels within one or several images. For example, derivation of aerosol properties over bright land is known to be an extremely difficult task. The multi-pixel retrieval regime takes advantages from known limitations on spatial and temporal variability in both aerosol and surfaces properties. Similar ideas have already been used in different forms for improving satellite retrievals. For example, Martonchik et al. (1998) derive the surface reflectance properties from a group of nearby MISR pixels in a 16 × 16 km region by relying on the similarity of aerosol properties over this area.  have built the SEVIRI aerosol and surface retrieval concept assuming rather limited time variability of the land surface reflectance properties. Even more explicitly the idea was used in studies by Lyapustin et al. (2008) and Lyapustin and Wang (2009), who used the limited time variability of land surface for screening cloud-contaminated data or the limited space variability of aerosol properties for constraining aerosol retrieval from MODIS observations. Temporal smoothness constraints on surface reflectance variability were used by Quaife and Lewis (2010) in the method for inverting linear bi-directional surface reflectance models from MODIS observations. Here, the satellite retrieval is designed as a statistically optimized simultaneous fitting of the observations over a group of pixels implemented under additional inter-pixel constraints. Specifically, the variations of the retrieved parameters horizontally from pixel-to-pixel or temporary from day-to-day over the same pixel are limited by the additional a priori constraints, in a similar manner to how it is applied in inverse modeling by Dubovik et al. (2008). The inclusion of these additional constraints is expected to provide retrieval of higher consistency for aerosol retrievals from satellites, because the retrieval over each single pixel will be benefiting from coincident aerosol information from neighboring pixels, in addition to the information about surface reflectance (over land) obtained in preceding and consequent observations over the same pixel.
It should be noted that this paper focuses on detailed implementation of core ideas for a new PARASOL retrieval algorithm, however it does not address many aspects of an operational implementation of the algorithm. For example, issues such as cloud-screening, retrieval time requirements and other important aspects of algorithm implementation for operational processing are to be addressed in follow-on studies.

General structure of the algorithm
The general structure of the algorithm is shown in Fig. 1. In order to make the algorithm more flexible it is divided into several interacting but rather independent modules. Each module has rather particular function. The interactions between the modules are minimized for a straightforward exchange of a very limited set of parameters. The "Forward Model" and "Numerical Inversion" are the two most complex and elaborated modules in the developed algorithm. The organization of the algorithm by modules enhances the flexibility in algorithm utilization. For example, the "Numerical Inversion" module implements quite universal operations that have no particular relation to the physical nature of the inverted observations. This module can, in principle, be used in any other application not related to atmospheric remote sensing. The "Forward Model" module does not have such universal applicability as the "Numerical Inversion" module. Nonetheless, the "Forward Model" module is developed in a quite universal way allowing modeling quite a broad variety of atmospheric remote sensing problems. As a result of such organization of the algorithm, it can equally be applied (with minimal changes) to inverting observations from other satellite sensors or from the ground. In addition, such an algorithm structure was helpful in adapting physical models and computer routine fragments inherited from previous AERONET and POLDER developments.
The following several Sections of the paper provide a full description of the "Forward Model" and "Numerical Inversion" algorithm modules. A number of optional adjustments are suggested for setting both aerosol physical model and retrieval scheme. Although the algorithm is tuned for inverting PARASOL observations, some aspects of aerosol parameterization and inversion implementation (in particular a priori constraint settings) can be modified and adjusted for optimizing the algorithm performance if it is applied to other remote sensing observations. For example, two alternative strategies are suggested for implementing numerical inversion of satellite image observations: conventional pixel-by-pixel inversion and a new multi-pixel inversion strategy. According to this new multi-pixel approach, the retrieval developed as simultaneous inversion of a large group of pixels within one or several images. Such a retrieval regime takes advantage of known limitations of spatial and temporal variability in both aerosol and surface properties.

Forward model of POLDER/PARASOL observations
The aerosol retrieval algorithm is designed to invert the POLDER/PARASOL observations acquired in window channels shown in Table 1, that is: the total radiance in 6 window channels: 0.44, 0.49, 0.565, 0.675, 0.87 and 1.02 µm, and the linear polarization in 3 of these channels: 0.49, 0.675 and 0.87 µm, reflected by a ground pixel. In each channel, observations of the same pixel are performed nearly simultaneously in up to 16 viewing directions (Deschamps et , 1994). It is assumed that the light observed at the top of the atmosphere is only linearly polarized. In the polarized channels, besides the total reflected radiance, I , the measurements provide the Stokes parameters Q and U referred to axes perpendicular and parallel to the local meridian plane, i.e. Q = I p cos(2α) and U = I p sin (2α) where I p is the polarized component of reflected radiance and α is the angle between the meridian plane and the polarization direction.
Let I = (I, Q, U, V ) T and E 0 = (E 0 , 0, 0, 0) T stand, respectively, for the Stokes' vectors of the observed electromagnetic radiation and of the incident unpolarized solar radiation; the subscript "T " denotes transposition and V is assumed to be negligible.
The reflected radiance may be written: where the terms M scat and M reflec correspond to the light reflected as a result of single interaction of incident solar light, respectively, with the atmosphere and surface. In Eq. (1) it is assumed that polarized light is referred to axes perpendicular and parallel to the scattering and reflection planes (here, both formed by the solar and viewing directions); and the matrix L transforms the Stokes' vector into the plane of observations (details are given in Lenoble et al., 2007).
Under assumption of plane parallel multi-layered atmosphere, the single scattering term, M scat , at the top of the atmosphere can be expressed as: where τ i is the optical thickness of the i-th atmospheric layer (i = 1, ..., N numbered from the top to the bottom of the atmosphere) and τ i is the optical depth of the bottom of layer i (i.e. τ i = k=1,...,j τ k ); P i ( ;λ) and ω i 0 denote the phase matrix and single scattering albedo of the i-th atmospheric layer, m = 1/µ 0 + 1/µ 1 .
The optical properties τ i , P i ( ; λ) and ω i 0 of each atmospheric layer include the contributions of aerosol (characterized in i-th layer by τ i,a , ω a 0,i and P a i ( ; λ)), molecular scattering (characterized in i-th layer by τ i,mol , ω mol 0,i = 1 and P mol i ( ; λ)) and atmospheric gases (characterized in i-th layer by τ i,gas and ω scattering albedo ω i 0 and phase matrix P i ( ; λ) of the i-th atmospheric layer are: and and the extinction optical thickness τ of the atmosphere is the sum of the corresponding components: The properties of atmospheric molecular scattering τ mol and P mol ( ; λ) are well known and can be calculated with sufficient accuracy. The absorption of atmospheric gases τ gas has rather minor contributions in the POLDER/PARASOL window channels and can be accounted for using known climatologies, as well as using available information from ancillary observations. For example, the present development uses the same procedure as used in the operational algorithm by Deuzé et al. (2001). That procedure corrects the water vapor absorption using PARASOL measurements in 0.910 µm spectral band. The minor absorption from ozone, NO 2 and O 2 are accounted using the climatology data. Thus, the most challenging part in modeling single scattering properties of the atmosphere is the modeling of aerosol contribution, i.e. aerosol extinction τ a , single scattering albedo ω a 0 and phase matrix P a ( ; λ). These properties depend on aerosol microphysics: particle size, shape and composition (refractive index). All these characteristics are driven by the parameters included in the vector of unknowns and correspondingly they are retrieved from the observations. The single reflection M reflec at the top of atmosphere can be calculated as: where reflection matrix R (µ 0 ; µ 1 ; ϕ 0 ; ϕ 1 ; λ) describes the surface reflection properties in the plane formed by the solar and viewing directions. For the ocean surface the reflection R (µ 0 ; µ 1 ; ϕ 0 ; ϕ 1 ; λ) is mainly governed by the wind speed at sea level as suggested by the Cox-Munk model (Cox and Munk, 1954) employed in the currently operational POLDER algorithm (Deuzé et al., 2001;Herman el al., 2005;Tanré et al., 2011). In contrast, the reflection matrix of the land surface may differ very strongly from location to location. Therefore, in the present algorithm, the key properties of the land surface reflectance are included in the set of unknowns and retrieved from the observations. As follows from Eq. (1), once the single scattering terms M scat and M reflec are defined one needs to account for multiple interactions of scattered light with atmosphere and surface. In the present algorithm these interactions are accounted for by rigorously solving vector radiative transfer equation. Thus, the forward model of reflected radiances measured by POLDER/PARASOL contains three main components: (i) aerosol single scattering, (ii) surface reflection and (iii) solving vector radiative transfer equation for accounting for multiple scattering. The following parts of this section will describe each of these components in detail.
It should be noted that the forward model for reproducing POLDER/PARASOL observations is designed by means of adapting the atmospheric modeling strategies and computer routines developed within previous POLDER and AERONET activities. At the same time, several important modifications required for optimizing the forward modeling performance have been implemented in the present PARA-SOL algorithm. Specifically, the models of land surface reflectance have been introduced into the radiative transfer calculations, the number of aerosol parameters driving the model has been reduced, the different regimes of the radiative transfer calculations have been designed for allowing faster but less accurate calculations. These and other forward model modification allowed the performance of the developed "on-line" inversion procedure to attain the standards required for operational processing (achieving sufficient speed of computations, etc.). Figure 3 shows the data flow within the "Forward model" block of the algorithm. Three main complementary efforts are involved in modeling the atmospheric radiation field observed by the POLDER sensor: -Modeling of single scattering properties of the atmospheric aerosol.

982
O. Dubovik et al.: Statistically optimized inversion algorithm for enhanced retrieval of aerosol properties -Modeling of the surface reflectance properties.
-Accounting for multiple scattering effects using full radiative transfer model.
These aspects are described in detail below.

Aerosol single scattering properties
The modeling of the aerosol scattering matrices has been implemented following the ideas employed in AERONET retrieval algorithm by  and Dubovik et al. (2002bDubovik et al. ( , 2006. In order to account for aerosol non-sphericity, the atmospheric aerosol is modeled as an ensemble of randomly oriented spheroids. Specifically, AERONET operational retrievals use the concept developed by Dubovik et al. (2006) and model the particles for each size bin as a mixture of spherical and non-spherical aerosol components. The nonspherical component was modeled by ensembles of randomly oriented spheroids (ellipsoids of revolution). According to this concept, aerosol particles of non-spherical component have size-independent distribution of shapes and the modeling of the total aerosol optical thickness τ of nonspherical aerosol can be written as the following: where c ε τ (λ; k; n; r; ε) denotes the extinction cross-sections of spherical particle and randomly oriented spheroid, λwavelength, n and k -real and imaginary parts of the refractive index, ε spheroid axis ratio (ε = a/b, a -axis of spheroid rotational symmetry, b -axis perpendicular to the axis of spheroid rotational symmetry), r -radius of volumeequivalent sphere. Correspondingly the mixture of spheroid includes the flattened oblate spheroids (ε < 1), spheres (ε = 1) and elongated prolate spheroids (ε > 1). The characteristics r and ε are used here for describing size and shape of the ensemble of spheroids. Analogously to the combination of a and b, the combination of r and ε allows unique definition of the spheroid shape. As discussed by Mishchenko et al. (1997), the usage of r and ε is convenient for separating the effect of particle shape and size in analysis of aerosol mixture light scattering. Then the functions dN (r) d lnr and dN(ε) d lnε denote the number particle size and the number particle shape (axis ratio) distributions accordingly.
For performing fast and accurate calculations of aerosol extinction by polydisperse non-spherical aerosols, the size and shape integration can be approximated by the double sum, e.g.
where K ε ext (λ; k; n; r i ; ε k ) = (9) = lnr i + lnr where v(r) is the volume of particle, dV (r) d lnr = v(r) dN (r) d lnr is the volume particle size distribution, A k (ε) and B i (r) are the functions providing correspondingly the interpolation of shape distribution between the selected points ε k and the interpolation of size distribution over selected points r i . In studies by Dubovik et al. (2002bDubovik et al. ( , 2006, the coefficients A k (ε) for integrating over axis ratio were assumed as rectangular functions A k (ε) = const. For approximating size distribution between the used size bins r i , the trapezoidal approximation was chosen by . Such kinds of interpolation are traditionally applied in aerosol applications (e.g. see Twomey, 1977), where the functions B i (r) are defined as isosceles triangles.
It should be noted that in Eq. (8) the integral over size is approximated by a sum using values of volume size distribution dV(r)/dlnr (in place of the number size distribution dN(r)/dlnr) defined in logarithmically equidistant points r i . Utilization of both the volume size distribution and logarithm of radius was chosen for the convenience of the algorithm implementation. In principle, the particle number distributions dN(r)/dlnr or dN(r)/dr could be equally used in Eq. (8) (e.g. see King et al., 1978;King, 1982). At the same time, the usage of both the volume of the particle (instead of number) and logarithmic scale in binning of the size distribution helps to optimize the approximation given by Eq. (8). First, these choices help to improve the accuracy of this approximation (a smaller number of points N r provides appropriate accuracy). Second, under this representation, the kernels K ε ext (λ; k; n; r i ; ε k ) for different points r i are closer in values. This is one of the favorable conditions for implementing inversion. Therefore, volume size distribution dV(r)/dlnr is often used as retrieved aerosol characteristic in the algorithms applied to invert the optical data of high sensitivity to aerosol particle size. For example, a similar size distribution representation was used in earlier studies by Nakajima et al. (1983Nakajima et al. ( , 1996 for retrieving aerosol properties from ground-based sky-radiometers. In the AERONET retrieval,  represented volume size distribution dV(r)/dlnr by N r = 22 points r i . These points are equidistant in logarithmic space and cover the size range from 0.05 to 15 µm. This size range was chosen following the sensitivity analysis by , which showed that the aerosol particles of smaller and larger sizes produce negligible contribution to AERONET radiometer observations. This range of aerosol particle sizes is slightly wider than the one used in earlier studies by Nakajima et al. (1996). As discussed later in this paper, the size range of aerosol is modified for retrieving aerosol from PARASOL observations.
Using the approximation given by Eqs. (8)-(9), Dubovik et al. (2006) developed a numerical tool for fast calculations of scattering properties of spheroid mixture. The quadrature coefficients K ε ext (λ; k; n; r i ; ε k ) for the extinction, as well as for absorption cross-sections and scattering matrices have been calculated and stored into the look-up tables for a wide range of n and k (1.3 ≤ n ≤ 1.7; 0.0005 ≤ k ≤ 0.5). The calculations were done for spheroids with axis ratio ε ranging from ∼0.3 (flattened oblate spheroids) to ∼3.0 (elongated prolate spheroids) and for 41 narrow size bins covering the size-parameter range from ∼0.012 to ∼625. The look-up tables were arranged into a software package allowing fast, accurate, and flexible modeling of scattering by randomly oriented spheroids with different size and shape distributions. In addition, Dubovik et al. (2006) used the developed software and showed that spheroids can closely reproduce single-scattering matrices of mineral dust measured in the laboratory by Volten et al. (2001). It was shown that scattering matrices have rather limited sensitivity to the minor details of axis ratio distribution dN (ε k ) d ln ε . Therefore, Dubovik et al. (2006) have suggested and demonstrated that AERONET retrieval may rely on rather simple assumption that shape (axis ratio) distribution dN (ε k ) d ln ε in the non-spherical fraction of any tropospheric aerosol is the same. Based on this conclusion the aerosol scattering model was set in AERONET retrieval as a mixture of spherical and non-spherical fractions, and dN(ε k ) d ln ε obtained by Dubovik et al. (2006) from fitting Volten et al. (2001) observations was employed as shape distribution for non-spherical fraction. Based on this assumption, the integration over ε in Eq. (7) can be done once and for all for each size bin, and, therefore, modeling of aerosol optical properties (τ a (λ), ω a 0 and P a ( ; λ)) in AERONET retrieval is implemented in a particularly convenient form. For example, for modeling τ a (λ) one can write: where C sph is the fraction of the spherical particles. Note, that while the preparation of the core look-up tables K nons ext (λ; k; n; r i ) required several years of computations, the resulting software allows very fast simulations of scattering by non-spherical aerosol particles. The calculation takes a fraction of a second for any realistic combination of aerosol size distribution and complex refractive index. At present, it is probably the only approach that can calculate scattering matrices for non-spherical particles as part of the retrieval without relying on look-up tables of scattering matrices. Once the forward model in AERONET retrieval was updated with non-spherical aerosol light scattering modeling capabilities by Dubovik et al. (2006) (as shown in Eq. 10) the fraction C sph was included in the set of retrieved parameters along with the concentrations for 22 bins of size distribution.
It is noteworthy that the spheroid model developed by Dubovik et al. (2002bDubovik et al. ( , 2006 appeared to be rather useful for AERONET and other aerosol remote sensing applications. First, the utilization of this model has significantly improved the AERONET operational retrieval of aerosol with pronounced coarse mode fraction (e.g. see Reid et al., 2003;Eck et al., 2005;Dubovik et al., 2006). The same model has been shown to reproduce adequately the ground-based polarimetric observations of non-spherical desert dust. Specifically, the efficient application of the model to the polarimetric observations has been done by Dubovik et al. (2006) for a case study and Li et al. (2009) for an extended series of observations. In addition, it was shown that the spheroid model allows qualitative reproduction of the main characteristic features of lidar observations of non-spherical desert dust. For example, the increase of extinction-to-backscattering lidar ratio and a high depolarization of signal regularly observed in lidar observations of desert dust, and traditionally associated with aerosol particle non-sphericity, can be adequately reproduced using a spheroid-based model (see discussion by Dubovik et al., 2006). Cattrall at al. (2005) showed that lidar ratios calculated from aerosol properties derived from AERONET observations using a spheroid model agree well with known lidar observations of desert dust. Furthermore, Veselovskii et al. (2010) have used the approach suggested by Dubovik et al. (2006) and incorporated the spheroid model into the algorithm retrieving aerosol properties from lidar observations. That is, probably, one of the first attempts to interpret quantitatively the sensitivity of the lidar observations to particle non-sphericity. The non-spherical coarse aerosol models derived from climatologies of AERONET retrievals had been successfully incorporated into MODIS and SEVIRI satellite retrievals (Levy et al., 2007a,b;Wagner et al., 2010). The AERONET retrievals are being used for making accurate calculations of atmospheric broadband fluxes and aerosol radiative forcing. These calculations were shown to agree very reasonably with available coincident ground-based flux observations in desert regions  and globally (Garcia et al., 2008). Derimian et al. (2008) demonstrated that the neglect of desert dust non-sphericity in climatic assessment leads to ∼10 % systematic overestimation of cooling of the atmosphere by desert dust aerosol on the top of the atmosphere. The retrieval algorithm developed here for POLDER/ PARASOL uses the same modeling strategy as described above. Correspondingly, a solution is sought in continuous space of aerosol size distribution parameters, aerosol particle shape and complex refractive indices (see Table 1). However, due to differences in information content of AERONET and POLDER measurements, the retrieved size distribution is represented by a smaller number of bins N r . Instead of N r = 22, retrieved by AERONET, here N r is reduced to 16 and even significantly smaller numbers. In order to assure that the aerosol model remains adequate and its accuracy is acceptable even if number of aerosol bins is small, the performance of Dubovik et al. (2006) software was analyzed for situations corresponding to POLDER/PARASOL measurements with reduced number of aerosol bins. It was found that the accuracy of the calculations remains practically unchanged if the aerosol size bins corresponding to very small and very large particles have been eliminated, i.e. N r = 16 covers size range from r min = 0.07 µm to r max = 10 µm. The contribution of smaller and larger particles into POLDER/PARASOL observations is negligible. Nonetheless, the need for optimizing the software of Dubovik et al. (2006) by using even smaller number of aerosol bins N r ≤ 10, was identified. It was found that sufficiently accurate modeling of POLDER/PARASOL observations can be achieved if the shape of each single bin is optimized. Specifically, utilizations of log-normally shaped bins provided notable improvements if N r ≤ 10. The conducted small series of calculations suggested some advantages of modeling aerosol size distribution as superposition of the log-normal functions with fixed parameters: For example, Fig. 4 illustrates that a small number of lognormal bins retains the realistically smooth shape of atmospheric aerosol size distribution, while applying triangular or trapezium approximation leads to appearance of inadequate features (apparent triangle tops) in the size distribution. Thus, the new option allowing the usage of log-normally shaped bins was included in Dubovik et al. (2006).

Modeling surface reflectance
The reflective properties of ocean surface are modeled analogously to the currently operational POLDER algorithm (Deuzé et al., 2001;Herman et al., 2005;Tanré et al., 2011). The Fresnel's reflection on the agitated sea surface is taken into account using the Cox and Munk model (Cox and Munk, 1954). The water leaving radiance is nearly isotropic (Voss et al., 2007) and modeling shows that its polarization is negligible (e.g. Chami et al., 2001;Chowdhary et al., 2006;Ota et al., 2010). This term and the white cap reflection are taken into account by Lambertian unpolarized reflectances. The whitecap reflectance is driven by the wind speed at sea surface according to the Koepke (1984) model. The seawater reflectance at short wavelengths is not negligible and depends on the properties of oceanic waters. Thus, in present model, the wind speed and the magnitude of seawater reflectance at short wavelengths need to be known a priori or retrieved together with aerosol. The modeling of the reflectance by the land surfaces has been adjusted to the needs of newly developed POLDER/PARASOL retrieval. The aerosol retrieval algorithm by Deuzé et al. (2001) over land relies only on the PARASOL measurements of polarized reflectance and, correspondingly, it does not consider the detailed directional scattering properties of total reflectance by land surface. Therefore, the "forward model" was modified to account adequately for both total and polarized properties of surface reflectance.
In remote sensing applications the effects of directionality of land surface reflectance are often accounted for by semi-empirical models driven by a small number of internal parameters. For example, the Ross-Li model (Ross, 1981;Li and Strahler, 1992;Wanner et al., 1995) is employed for characterization of directional properties of land surface reflectance derived from MODIS observation (Justice et al., 1998). The Rahman-Pinty-Verstraete (RPV) model (Rahman et al., 1993) is successfully used for the analysis of MISR observations by Martonchik et al. (1998) and SEVIRI by  and Wagner et al. (2010). The comparisons of the models with satellite (Maignan et al., 2004(Maignan et al., , 2009) and aircraft (Litvinov et al., 2010(Litvinov et al., , 2011 data showed that, generally, both Ross-Li and RPV models are comparably capable of reproducing the multi-angle observations of land surfaces. Since the RPV model was applied more extensively to interpretation of multi-directional images (e.g. MISR, SEVIRI), it has been retained in the present POLDER/PARASOL algorithm as a primary formulation for modeling Bidirectional Reflectance Distribution Function (BRDF). Rahman et al. (1993) describe BRDF as: cos α i = cos ϑ 1 cos ϑ 2 − sin ϑ 1 sin ϑ 2 cos (ϕ 1 − ϕ 2 ), (12c) G = tan 2 ϑ 1 + tan 2 ϑ 2 − 2 tan ϑ 1 tan ϑ 2 cos (ϕ 1 − ϕ 2 ) This model provides bi-directional reflectance as a function of four empirical parameters: ρ o (λ) characterizes intensity of reflectance. κ (λ) characterizes anisotropy of reflectance, θ (λ) characterizes forward/backscattering contributions in total reflectance, h 0 (λ) is a "hot spot" parameter. All these parameters are considered as unknowns and included in the set of retrieved parameters (see Table 1). The available airborne and satellite polarimetric observations in the visible and infrared showed that the Bidirectional Polarization Distribution Function (BPDF) of land surface tends to have rather small values (compared to BPDF) with no spectral dependence (e.g., Rondeaux and Herman, 1991;Nadal and Bréon, 1999;Maignan et al., 2004Maignan et al., , 2009Waquet et al., 2009a,b;Litvinov et al., 2010Litvinov et al., , 2011. Most theoretical models developed for approximating observed BPDF are based on the Fresnel equations of light reflection from the surface. For example, Nadal and Bréon (1999) have proposed simple two-parameter non-linear function of the Fresnel reflection for characterization of atmospheric aerosol over land surfaces based on POLDER observations of land surface reflectance. Recently, Maignan et al. (2009) have introduced a new linear BPDF model with only one free parameter and demonstrated that this simple model allows a similar fit to the POLDER measurements as more complex non-linear model by Nadal and Bréon (1999). This model has been used in the present POLDER/PARASOL retrieval algorithm as a primary model of polarized reflectance of land surface.
The model by Maignan et al. (2009) describes the polarized reflectance as: where BPDF is given as a linear function of polarized component F 12 (α i , n) of the Fresnel reflection matrix (dependent on incident angle α i and refractive index n) multiplied by an empirical coefficient: The attenuation term exp(−z) reflects the observed tendency of decreasing polarized reflectance with increasing vegetation cover, where z is the Normalized Difference Vegetation Index (NDVI). The NDVI value z was obtained from the reflectance measurements concomitant with the polarization observations, B is a free parameter that should be chosen to fit the observed BPDF angular dependence. Thus, in the present algorithm the land surface reflectance properties are modeled using Eqs. (12) and (13) for simulating BRDF and BPDF accordingly. However, since these formulations are semi-empirical and derived completely independently, one needs to exclude physically unrealistic combinations of BRDF and BPDF. Therefore, in order to assure that the surface reflectance of polarized radiation in any geometry does not exceed the reflectance of total radiation, the reflectance matrix R (µ 0 ; µ 1 ; ϕ 0 ; ϕ 1 ; λ) of the land surface is represented as a sum of two surface reflection phenomena: where the matrix for diffuse unpolarized reflectance R diff is modeled using Eq. (12): and the matrix for specular reflectance R spec is modeled as matrix of Fresnel reflectance F (α i , n) scaled by ρ Maignan (ϑ 1 ; ϕ 1 ; ϑ 2 ; ϕ 2 ) empirical coefficient: where the Fresnel matrix F (α i , n) defined for Stokes parameters referred to directions parallel and perpendicular to the reflection plane can be written as (e.g. see Lenoble et al., 2007): The Fresnel matrix F (α i , n) depends on incident angle α i and refractive index n. The coefficients r r and r l are defined as where the refraction angle α r is related to α l through the Snell-Descartes refraction law given as: The straightforward analysis of the above equations shows that the definition of R (µ 0 ; µ 1 ; ϕ 0 ; ϕ 1 ; λ) given by Eq. (14) secures the physically correct ratio between polarized and total radiation components (R ij ≤ R 11 ) for any combination of ρ RPV and ρ Maignan . Thus, in the present algorithm, the BRDF and BPDF properties are driven by four free spectral parameters of RPV model (ρ o (λ), κ (λ), θ (λ), h 0 (λ)) and one free (generally spectrally dependent) parameter B (λ). All these parameters have been added into the retrieved vector of unknowns as shown in Table 1.
It should be noted, however, that the Rahman et al. (1993) and Maignan et al. (2009) formulations, chosen as primary models for BRDF and BPDF in the present algorithm, have limited accuracy (e.g. see Litvinov et al.,2010Litvinov et al., , 2011. Therefore both the forward model and inversion module of the present algorithm have an assumed option of changing both or either BRDF or BPDF models. For example, in the present algorithm version, the BRDF can be simulated using the Ross-Li model and BPDF can be modeled using Nadal-Bréon (1999) formulation. Nonetheless, the utilizations of these secondary BRDF or BPDF models will not be discussed in detail.

Full forward radiative model
Accounting for multiple scattering effects in the atmosphere is implemented by the successive order of scattering radiative transfer code (Lenoble et al., 2007) that was used in PARA-SOL operational retrievals (Deuzé et al., 2001;Herman el al., 2005;Tanré et al., 2011). The code provides full information about the atmospheric radiation field under the assumption of the plane parallel atmosphere. In order to reduce calculation time for inverting PARASOL observations, the V component in the Stokes vector has been neglected. Hansen (1971) has demonstrated that the radiation properties measured by passive remote sensing exhibit negligible circular polarization of the electromagnetic field. The developed version of successive order of scattering radiative transfer code allows calculations of atmospheric radiances for several N k aerosol components. Each aerosol component can be described by defined vertical profile of spectral extinction k ext,k (h, λ) and altitude independent phase matrix P k ( , λ) and single scattering albedo ω k 0 (λ). In the present set up of the aerosol retrieval code these optical properties are detremined based on microphysical model of atmospheric aerosol. Correspondingly, only parameters describing aerosol microphysics are directly included in the set of retrieved parameters listed in Table 1. Specifically, the vertically invariant P k ( , λ) and ω k 0 (λ) are driven by: the shape of the size distribution dV k (r i )/dlnr giving the aerosol particle volume in the total atmospheric column per unit of surface area (in the unites of µm 3 /µm 2 ); the real n k (λ) and imaginary k k (λ) parts of the complex refractive index; and the fraction of the spherical particles C k,sph . The spectral dependence of optical thickness τ k (λ)/τ k (λ i ) is also vertically invariant and defined by these parameters, while the absolute value of τ k (λ) additionally depends on the total volume of the aerosol in the atmospheric column: In order to account for vertical variability of the aerosol extinction k ext,k (h, λ), the additional characteristic c k (h) was added. The function c k (h) defines the vertical distribution of aerosol concentration and the optical thickness of k-th aerosol component in each of i-th atmospheric layer is defined as: The aerosol concentration profile c k (h) is assumed as a Gaussian function normalized to unity, i.e.: where h BOA -Bottom Of the Atmosphere (BOA) height and h TOA -Top Of the Atmosphere (TOA) height. In the retrieval, the standard deviation characterizing the width of the aerosol layer is fixed to σ k = 0.75 km. Therefore, only one parameter characterizing aerosol vertical distribution is included into the retrieval: h k,0 -the mean altitude of the k-th aerosol component layer. The present version of the POLDER/PARASOL retrieval code is set to retrieve only one aerosol component. The algorithm derives single values of complex refractive index and fraction of spherical particles for particles of all sizes. Such retrieval assumption is based on the earlier AERONET sensitivity studies by Dubovik et al. ( , 2006 that indicated major limitations in discriminating between refractive indices and shapes of aerosol particles of fine and coarse modes. At the same time, the possibility of retrieving several aerosol components with different complex refractive indices and vertical distributions is also assumed (see Table 1) and sensitivity of polarimetric observations to multi-component aerosols is planned to be verified in future studes. In addition, in order to harmonize the radiative transfer code with the structure and needs of general inversion approach, several modifications have been implemented. Specifically, the modifications were aimed to increase the speed of calculations by allowing admissible decrease of the accuracy of modeling. The three possible tradeoffs permitting reduction of computation time without any significant loss of retrieval accuracy were identified and implemented.

Adjustable number of terms in the expansion of the phase matrix and in the quadrature of directional integration
The accuracy of radiative transfer calculations strongly depends on the number of terms M used in the expansion of the phase matrix into Legendre polynomials and number of terms N used in Gaussian quadrature for zenithal integration. The values should satisfy the inequality 4 N − 1 > 2 M to retain conservation of energy in the successive order of scattering integration. The values M and N should be sufficiently large to provide accurate calculation. However, the larger M and N the longer the calculation time. At the same time, the high accuracy of the modeling is not always required during the retrieval. For example, studies by  showed that when observations of groundbased radiometers are inverted, the successful retrieval can be achieved using the approximate and quick calculations of the first derivatives. Correspondingly, the retrieval time can be significantly decreased because the calculations of the first derivatives is the most time consuming component of Newtonian's retrieval algorithms. Following this strategy, the operational retrieval of aerosol from AERONET data relies on analytical single scattering approximation for calculations of derivatives. The possibility of using this tradeoff in POLDER retrieval algorithm was tested. It has been concluded that using single scattering approximation is not sufficient for conducting retrieval from satellite observations. Nonetheless, it has been found that the Jacobians estimated numerically as finite differences on basis of the full radiative transfer calculations implemented with significantly reduced values of M and N provide fast and accurate retrievals. This approach is used in the present algorithm. The utilization of linearized radiative transfer code (e.g. see Hasekamp and Landgraf, 2005a) that provides all derivatives with respect to aerosol and surface properties in a single run would be alternative and promising strategy of accelerating satellite observation inversion. This strategy was not used in the present study because the linearization of radiative transfer code is a rather complex effort that requires significant time investments. The possibility of using this approach will be considered in future studies. At the same time, a retrieval algorithm relying on the numerical calculation of the first derivatives is probably more flexible in practical applications. Indeed, in the present POLDER/PARASOL algorithm the set of retrieved aerosol or/and surface parameters can be changed with no modifications in the calculations of the first derivatives. If the derivatives are calculated analytically, achieving such flexibility could be more difficult.

Truncation of the phase matrix
The truncation of the phase function is a technique whereby the scattering effects from the sharply increasing forward peak of the phase function are calculated separately from those of the rest of the phase function, which permits the accurate but much faster modeling of diffuse radiation. For example, the AERONET aerosol operational retrieval by  employs the discrete ordinate radiative transfer code by Nakajima and Tanaka (1988) that uses the efficient procedure of the phase function truncation and provides very fast and accurate calculation of downwelling diffuse radiation in moderately thick atmospheres. The detailed discussion of different methods of implementing the phase matrix truncation and their comparison are given in the recent paper by Rozanov and Lyapustin (2010).
Following the ideas of Nakajima and Tanaka (1988) the truncation of the phase matrix has been implemented in the successive order of scattering code as a part of present studies. In the developed version of the PARASOL algorithm the use of truncation is optional but recommended. The utilization of the phase matrix truncation allows for decreasing the number of terms M in the expansion of the truncated phase function and N in the Gaussian quadrature for azimuth integration. According to the results of the conducted tests, accurate PARASOL retrievals can be achieved with the following recommended values: M = 21 and N = 10 -for calculating the fit to PARASOL observations; M = 15 and N = 7 -for calculating Jacobian matrices.

Flexible angular representation in the phase function
The successive order of scattering radiative transfer code uses the phase function values at N ang angles corresponding to the points of Gaussian quadrature. These values are provided to the radiative transfer code from the Dubovik et al. (2006) software generating aerosol single scattering properties. The software provides the phase function for the set of fixed scattering angles allowing sufficiently accurate modeling of angular variability of scattering. For example, in AERONET retrieval the phase function is calculated at 83 Gaussian points. At the same time, the speed of aerosol single scattering modeling depends on the number of the scattering angles used. In order to enhance the flexibility of the retrieval code the possibility of using the phase function at nearly arbitrary set of scattering angels has been implemented, with the values of the phase function at the required Gaussian points derived from the software data by convenient interpolation. The results of the series of the numerical tests have shown that POLDER/PARASOL observations can be adequately modeled using set of N ang = 35 selected scattering angles.

Numerical inversion
In contrast to the majority of existing satellite retrieval algorithms, this effort is one of the first attempts to develop an aerosol satellite retrieval using statistically optimized multivariable fitting. Such strategy does not rely on the preassumed classes of potential solutions. Instead the solution is sought in a continuous space of solutions under statistically formulated criteria optimizing the error distribution of the retrieved parameters. The implementation of some elements of such a strategy was pursued in the earlier developments of satellite retrieval algorithms. For example, the statistical optimization of the retrieval solutions was used for inversion of MISR observations by Martonchik et al. (1998), in the retrieval algorithms proposed by Chowdhary et al. (2002Chowdhary et al. ( , 2005 and by Waquet et al. (2007Waquet et al. ( , 2009a for inverting anticipated GLORY/APS observations and applied to RSP data, in the retrieval approach suggested by Landgraf (2005b, 2007) for applying to multiple-viewing-angle intensity and polarization measurements and the retrieval algorithm developed by  for processing SEVIRI/ MSG observations. Detailed descriptions of inversion methods can be found in various textbooks (Tikhonov and Arsenin, 1977;Twomey, 1977;Tarantola, 1987;Press et al., 1992;Rodgers, 2000;Doicu, 2010). However, the textbooks generally provide the reviews of well-established inversion procedures and do not suggest the reader sufficient explanations as to which method and why should be chosen for a particular application. The approach used here is focused on clarifying the connection between different inversion methods established in atmospheric optics and unifying the key ideas of these methods into a single inversion procedure. It follows the developments by , Dubovik (2004), Dubovik et al. (2008). The methodology has several original (compare to standard inverse methods) features optimized for remote sensing applications. As shown in detailed description by Dubovik (2004), the methodology addresses such important aspects of inversion optimization as accounting for errors in the satellite observations, inversion of multi-source data with different levels of accuracy, accounting for a priori and ancillary information, estimating retrieval errors, clarifying potential of employing different mathematical inverse operations (e.g. comparing iterative versus matrix inversion methods), accelerating iterative convergence, etc. The concept uses the principles of statistical estimation and suggests a generalized multi-term Least Square type formulation that complementarily unites advantages of a variety of practical inversion approaches, such as Phillips-Tikhonov-Twomey constrained inversion (Phillips, 1962;Tikhonov, 1963;Twomey, 1963), Kalman filter (Kalman, 1960), Newton-Gauss and Levenberg-Marquardt iterations, etc. This approach provides significant transparency and flexibility in development of remote sensing algorithms for deriving such continuous characteristics as vertical profiles, size distributions, spectral dependencies of some parameters, etc. For example, compared to the popular "Optimal Estimation" equations (Rodgers, 2000), the multi-term Least Square type formulation allows harmonious utilization of not only a priori estimate term but instead, or in addition, using a priori terms limiting derivatives of the solution (see discussion by Dubovik, 2004 andDubovik et al., 2008). This methodology has resulted from the multi-year efforts on developing inversion algorithms for retrieving comprehensive aerosol properties from AERONET ground-based observations. Two alternative scenarios are proposed for inverting satellite observations: single-pixel retrieval and multiple-pixel retrieval. The single-pixel retrieval is a conventional approach when observations of the satellite instrument over each single pixel (e.g. in the case of POLDER/PARASOL the pixel size is 5.3 km × 6.2 km at nadir) are inverted completely independently. The multiple-pixel retrieval is a newly suggested approach when the observations of the satellite instrument over a group of pixels are inverted simultaneously and extra a priori constraints on the inter-pixel variability of the retrieved parameters is applied. It is expected that applying such constraints will help to improve reliability of the retrieval.

Single-pixel retrieval
This part of the inverse procedure is adopted, with some modifications, from the AERONET  inversion algorithm. There are two key aspects in the algorithm organization: the general organization of observation fitting and the a priori data representation.

Single-pixel observation fitting
Formally, the retrieval algorithm is designed as a multi-term LSM that implements statistically optimum fitting of several sets of observations and a priori constraints under assumption of normally distributed uncertainties (detailed discussion is given by Dubovik, 2004). It provides the solution of the following system of equations: Here, f * is a vector of the PARASOL measurements, f is a vector of measurement uncertainties, a is a vector of unknowns. The second line in Eq. (17) represents the a priori smoothness assumptions used to constrain variability of size distribution and spectral dependencies of the real and imaginary parts of the refractive index as well as the spectral dependencies of parameters of surface reflectance model. The matrix S includes the coefficients for calculating mth differences (numerical equivalent of the derivatives) of dV (r j )/dlnr, n(λ j ), k (λ j ), ρ o (λ j ), κ (λ j ), θ (λ j ); 0 * -vector of zeros and ( a) -vector of the uncertainties characterizing the deviations of the differences from the zeros. Formally, this equation states that all these m-th differences are equal to zeros within the uncertainties ( a). The third line in Eq. (17) includes the vector of a priori estimates a * and a * is the vector of the uncertainties in a priori estimates.
In order to account for the non-negative character of the observed radiances and retrieved aerosol (dV (r j )/dlnr, n (λ j ), k (λ j )) and surface reflectance (ρ o (λ j )) parameters, the assumption of log-normal error distribution was used. The noise distribution is apparently the most appropriate for the positively defined values. The log-normal noise distribution implies that the logarithms of the observed positively defined values are normally distributed. Thus, for convenience of formulating the statistically optimized solution of Eq. (17) the logarithmic transformation is used for both measured f j and retrieved a i parameters (see detailed discussions on validity of applying this transformation in the publications by King, 2000, Dubovik, 2004). Correspondingly, the errors f , ( a), and a * are assumed normally distributed. Table 2 shows the exact definitions of each element of the vectors f * and a. In addition, Table 2 shows the variability ranges allowed for each retrieved parameter.
The statistically optimized solution of Eq. (17) defined on the base of Maximum Likelihood Method corresponds to the minimum of the following quadratic form: In generally non-linear case the minimum can be obtained by the following iterative procedure: where a p is a solution of the p-th so-called, in statistical estimation formalism, Normal System The matrix in the left side of Eq. (18c) is also known as Fisher Matrix A p and the right side of Eq. (18c) represents the gradient ∇ (a p ) of quadratic form (a p ) in vicinity of a p : where f p = f (a p ) − f * , K p -Jacobi matrix of the first derivatives ∂f j (a p ) ∂a i . It should be noted that Fisher Matrix A p can be considered as so-called Hessian matrix of second-order partial derivatives of the quadratic form (a p ) (e.g. see Bevington, 1969;Tarantola, 1987). Correspondingly, Eq. (18c) can be also written as follows: where ∇∇ T (a p ) is the matrix with the elements ∇∇ T (a p ) j i = ∂ 2 (a) ∂a j ∂a i a=a p . In the present inversion strategy, all equations are expressed via W -weighting matrices that defined by dividing the corresponding covariance matrix C by its first diagonal element ε 2 , i.e. W = (1/ε 2 )C. This formal transformation of the inversion equations allows rather transparent interpretation of Lagrange multipliers γ -parameters determining the contributions of a priori terms into solution. Following the concept proposed by  the Lagrange multipliers γ a and γ are defined as: where ε 2 f , ε 2 and ε 2 a are the first diagonal elements of corresponding covariance matrices C f , C and C a . Note, that if Surface BRDF and BPDF parameters: one can assume a simplified situation when the POLDER observations f * , a priori estimates of differences ( a) * and a priori estimates of parameters a * are independent and have the same accuracy, i.e. C f = ε 2 f I (N f ×N f ) , C = ε 2 I (N ×N ) and C a = ε 2 a I (N a ×N a ) , then corresponding weighting matrices are simply equal to the unity matrices: In addition, as discussed by Dubovik (2004), straightforward increasing of N f by adding redundant very similar observations is not necessarily beneficial for the retrieval. For example, from a practical viewpoint, adding many observations with identical or nearly identical observation conditions does not bring new information and does not improve the accuracy of the solution. However, formally even such addition of observations artificially increases the impact of the measurement term on the solution compare to a priori second and third terms in Eq. (18). Therefore, the Lagrange multipliers γ are scaled as: As suggested by Dubovik (2004) such scaling reflects the inevitable decrease of accuracy of single observation ε 2 f if the number of observation N f increases excessively. The above equation is written under an assumption that increasing the number of measurements in the coordinated set of remote sensing observations inevitably results in a decrease of the accuracy of each single measurement in this observation set. For example, if a satellite sensor takes one single observation, the expected variance of measurement error is ε 2 f,N . If the same sensor makes N f space-and/or time-coordinated observations the variance of the error in each single observation increases by the factor N f , i.e. ε 2 f,N ∼ N f ε 2 f,1 . This increase can be explained by the fact that the consistency of the N f coordinated observations should be assured by controlling N f relations between the N f observations. The control of each of those relationships introduces a random error ε 2 f,N , correspondingly the error variance of a single measurement in N f dimensional observation increases in N f times. The coefficient t p ≤ 1 in Eq. (18b) is adjusted to provide monotonic decrease of (a p ), i.e.
If all assumptions are correct, the minimum value of the above quadratic form can be theoretically estimated as follows: Note that the minimum value of (a p ) relates to ε 2 f because the weighting matrices were used instead of using directly the covariance matrices. Once the value of measurement error is known ε 2 f , Eq. (20b) can be used to verify the consistency of the retrieval. Specifically, the inability to achieve the above minimum can indicate the presence of unidentified biases or inadequacy in the assumptions made.
It should be noted that the control of "measurement residual" f (a p ) (the first term of quadratic form in Eq. (18a)) is a very useful tool for diagnostics of the retrieval dynamics. Specifically, the final value of f (a p ) should be close to the level of the expected measurement noise. Indeed, if the algorithm has found the right solution the value of the total residual (a p ) should be rather small and determined mainly by the random errors of observations. The contributions of the a priori residual terms in Eq. (18a) should not be significant, because generally the weights of a priori terms (a p ) and a (a p ) are minor compared to the weight of the "measurement residual" term f (a p ). However, at early iterations when the solution approximation is very far from the solution, f (a p ) is dominated by linearization errors and has the value much higher than the level of the expected measurement noise. Therefore, since accuracy of a priori data is independent of the iteration, the weight of the a priori term should be increased. Correspondingly, this additional enhancement of the a priori data impact on the solution improves the convergence of non-linear fitting. For example, in developed POLDER/PARASOL algorithm, following  and Dubovik (2004), the strength of a priori constraints is adjusted dynamically as a function of the measurement residual f (a): where f (a p ) reflects the accuracy of the POLDER observation fit at the p-th iteration and provides an indication of how the iterations converge to the solution. For example, at the last iteration, when the solution estimate a p last is expected to be in a small vicinity of the actual solution, the value of the residual f (a) of the measurement fit can be estimated as: Using ε 2 f (a p ) Eq. (20c) adjusts the values of the Lagrange multipliers γ and γ a in Eq. (18) and enhances the contribution of a priori constraints at the earlier iterations. As suggested by Dubovik (2004), this dynamic determination of a priori constraints improves convergence of non-linear iterations analogously to Levenberg-Marquardt formulations. At the same time, in contrast to the original Levenberg-Marquardt method, the idea of enhancing constraints on the solutions at earlier iterations in the formulations by Dubovik (2004) is included harmoniously within the framework of united statistical estimation approach. For example, if no smoothness constraints are used (i.e. the smoothness terms in the right sides of Eqs. (17) and (18) are eliminated) and no a priori estimates a * are available, then one can assume a * = a p , Eq. (18) become equivalent to Levenberg-Marquardt formulations (see details in Dubovik, 2004).
Thus, the inversion procedure described above is driven by the limited set of the input characteristics that includes the weighting matrices W ... and corresponding variances ε ... . The vector of POLDER/PARASOL measurements includes 2 components: where index "I" denotes total reflectance observations and index "P" denotes the observations of the degree of linear polarization (see Table 2). Since the total radiance and degree of linear polarization are measured with different accuracy, the covariance matrix assumed for logarithms of measurements f * has the array structure: Correspondingly, the weighting matrix W f is defined as: where I... denote the unity matrices of corresponding dimension and γ P is a ratio of the total and polarized reflectance variances: The variance of the errors in measurements of total reflectance is expected at the level of 2 % relative to the magnitude of observed radiance I , i.e. ε I = (lnI ) ≈ I I = 0.02. The variance of the errors in measurements of the degree of linear polarization is expected at the absolute level of 1 %, i.e. P = 0.01 and ε p = (lnP ) ≈ P P = 0.01 P . Equation (21b) assumes the simplest structure of the covariance matrix when intensity and polarization observations are equally accurate for all spectral channels and angles of observations. If more detailed information about covariance matrix is available it can be trivially integrated into Eq. (21).
In contrast to the numerous remote sensing algorithm relying on a priori estimates as suggested by Rodgers (2000), the a priori estimates a * were not used at all, i.e. ε 2 a → ∞ (although in practical application of the algorthm to satellite observation the using of, at least for land surface parameters, the climatological values a * with adequately estimated variances ε 2 a can be beneficial). Thus, in contrast to the methodology by Rodgers (2000), the retrieval was constrained here using exclusively the a priori smoothness constraints as discussed in the follow on Section.

A priori smoothness constraints in single pixel fitting
The vector a includes several components: a = a v a n a k a sph a Vc a h a brdf,1 a brdf,2 a brdf,3 a bpdf T , where a v , a n , a k , a sph denote the components of the vector a corresponding to the dV(r i )/dlnr, n (λ i ), k (λ i ) and C, a h is equal to the logarithm of mean altitude of the aerosol layer h a . The element a Vc is the logarithm of total volume concentration, while a v includes logarithms of values dV(r)/dlnr normalized by total volume concentration. This modification (compare to AERONET algorithm) allows decoupling the retrieval of the total amount of aerosol from the shape of the size distribution. This decoupling of the size distribution parameter is essential for satellite imager algorithm, since it allows applying different constraints on the shape of size distribution and aerosol loading. The three components a brdf,1 , a brdf,2 and a brdf,3 that are related to the logarithms of spectrally dependent parameters ρ o (λ), κ (λ) and θ (λ) of employed RPV model. Note, that here the "hot spot" parameter h 0 (λ) that is present in Eq. (12) is not included in the vector a, because it is not retrieved near-backscattering direction nor observed (it is significant only in a narrow range of scattering angles around the backscattering direction). At the same time, h 0 (λ) is retrieved if the "hot spot" is observed.
In such situation it is included in the retrieval similarly to other BRDF parameters (ρ o (λ), κ (λ) and θ (λ)). The vector a bpdf includes the logarithms of the spectrally dependent free parameter B (λ) of the employed Maignan et al. (2009) model. The detail description of each element of the vector a is given in Table 2. The a priori smoothness constraints are applied in the POLDER/PARASOL algorithm on several different components of the vector a differently. For example, for a given by Eq. (22), the matrix S has the following array structure: a v a n a k a sph a Vc a h a brdf,1 a brdf,2 a brdf,3 a bpdf where the corresponding matrices S ... have different dimensions and represent differences of different order (3 for size distribution, 1 for n (λ), 2 for k (λ), 2 for ρ o (λ) 1 for κ (λ), θ (λ) and B (λ). The lines in Eq. (23) corresponding to a h , a Vc and a sph contain only zeros because no smoothness constraint can be applied. It should be noted that for making formulations more transparent only one coefficient γ was shown in Eq. (18). However, the actual algorithm uses 7 (or even 8 if h 0 (λ i ) is retrieved) multipliers γ ,i . The errors ( a) are assumed independent for different components of the vector ( a) * and the smoothness matrix in Eq. (18) has the following array structure: where i = S T i W −1 i S i uses the derivative matrices S i (i = 1, ...,7) S v , S n , S k , S brdf,1 , S bdrf,2 , S brdf,3 , S bpdf . If it is assumed that the covariance matrices of errors ( a) have the structure C ,i = ε 2 ,i I (N ,i ×N ,i ) for each components of ( a) * , i.e. weighting matrices are W ,i = I (N ,i ×N ,i ) . Correspondingly, the quadratic form (a p ) in Eq. (18) can be written as the following sum: where γ ,i = ε 2 f /ε 2 ,i . The utilization of the smoothness constraints for a single retrieved function y(x i ) was originated in the papers by Phillips (1962), Tikhonov (1963) and Twomey (1963). Although application of the smoothness constraints is usually considered to be an implicit constraint on derivatives, in these original papers and most of follow-on studies the solutionâ was constrained by minimizing m-th differences m of the vectorâ components: The corresponding "smoothness" matrix = (S k ) T (S k ) was defined using S (m) matrices of m-th differences S (m) (i.e. m = S (m)â ). For example, S (2) (m = 2) is: The present development follows the concept of  and Dubovik (2004) that considers smoothness constraints explicitly as a priori estimates of the derivatives of the retrieved characteristicy(x i ). The values of m-th derivatives g m of the function y(x) characterize the degree of its non-linearity and, therefore, can be used as a measure of y(x) smoothness. For example, smooth functions y(x), such as, a constant, straight line, parabola, etc. can be identified by the m-th derivatives as follows: These derivatives g m can be approximated by differences between values of the function a i = y(x i ) in N a discrete points x i as: The corresponding matrix of m-th derivatives S g,(m) (i.e. g m = S g,(m)â ) can easily be defined using Eq. (25). For example, S g,(2) (m = 2) is: One can see that S g,(2) has significantly more complex structure than the conventional definition of smoothing matrix by Eq. (25). At the same time for the special case of uniformly spaced ordinates I = = const the matrices S g,(m) and S (m) have straightforward relation: S g,(m) = −(2m) S (m) .
In difference with Eq. (25), Eq. (26) allows for applying smoothness constraints in more general situations when 1 (x i ) = const. For example, in present algorithm there is a number of spectral parameters that are functions of λ and the algorithm deals with their values defined for each spectral channel λ i . Obviously the (λ i+1 −λ i ) = const and using standard definition of differences by Eq. (25) for smoothing spectral parameters (e.g. n(λ), ρ o (λ) etc.) is not completely correct. Applying the limitations on the derivatives defined by Eq. (27) is more rigorous if no significant changes of derivatives of y(x) are expected for different ordinates x i . Although using Eq. (27) leads to a loose of transparency in definitions of matrices S (m) , generating those matrices on algorithmic level is rather straightforward and defining of Lagrange parameters γ is more logical. Therefore, in present algorithm, the smoothness constraints are applied to limit directly the numerical equivalents of m-th derivatives given by Eq. (25), by a priori assumption that m-th derivatives are equal to zeros with some uncertainty (g * = 0 * = 0 + g ).
In order to estimate correctly the strength of a priori constraints  and Dubovik (2004) suggested to relate the strength of a priori smoothness constraints (i.e. the uncertainty ε 2 ,i in a priori known deviations of derivatives from zeros in second line of Eq. (17) to the integral norm of the derivatives defined as: where b i characterizes the smoothness of the physical continuous function y i (x): the closer b i to zero the more smooth the function y i (x). Indeed, there is a direct relation of this integral to the quadratic forms ,i (a p ) in Eq. (24) to the integral norm of the derivatives given by Eq. (28). Namely, if the values of function y i (x) are known at N i discrete points x i , one can estimate the norm of the m-th derivatives: where g,i = (S g,i ) T W −1 i S g,i , S g,i are the matrices of the coefficient for estimating derivatives g m,i x j as shown in Eq. (27), W i is diagonal weighting matrix with the elements {W i } jj = 1/( m,i (x j )). Therefore, the multiplier γ ,i = ε 2 f /ε 2 ,i in the minimized quadratic form of Eq. (24) can be directly related to the known values of b i as follows: Here y ns i (x) denotes most unsmooth real function y i (x). Based on this definition, the smoothness a priori condition imposed by the second line in Eq. (17) allows any function y i (x) to be solution that has norm of m-th derivatives b m i smaller than b m i max . In the framework of used here statistical approach this definition means the second system in Eq. (17) defines the derivatives equal zeros: g * i = 0 * i = 0 + i,g . The errors g are assumed independent for each i-th (i = 1, 2, ..., 7) component of vector a (see Eq. 22) have the following diagonal covariance matrix with the elements of the diagonal defined as: C g,i j = ε 2 g,i m,i (x j ) . This assumes that the deviation of estimates g m,i x j from zero has variance ε 2 g,i / m,i (x j ) . This variance is chosen inversely proportional to the coefficient m,i (x j ) calculated as shown by Eq. (27). This inverse proportionality to m,i (x j ) reflects the fact that the deviations of g m,i (x j ) from zeros can be stronger at smaller scales. This is essential in general case when 1 (x j ) = const. Thus, following this consideration we define the strength of a priori constraint applied on each retrieved function (dV(r)/dlnr, n (λ), k (λ), ρ o (λ), κ (λ), θ (λ) and B (λ) by calculating values of b m i max using most unsmooth (variable) examples of corresponding physical functions, i.e. most unsmooth size distributions, spectral dependencies of refractive indices and BRDF parameters. Table 3 shows the orders of the derivatives used for smoothness constraints and the corresponding values of Lagrange multipliers γ ,i . The general structure of the numerical inversion data flow implemented for the retrieval of aerosol over single pixel is summarized in Fig. 5.

Multiple-pixel retrieval
In contrast with most satellite retrievals, the algorithm developed here does not implement the measurement fitting for each single pixel independently. Instead, the fitting is implemented for a group of pixels and is constrained by the extra a priori limitations on inter-pixel variability of aerosol and/or surface reflectance properties. This approach improves the stability of satellite data inversions because the information content of the reflected radiation from single pixels is sometimes insufficient for a unique retrieval of all retrieved parameters. For example, deriving aerosol properties over bright land is known as an extremely difficult task. On the other hand, if the surface reflectance and aerosol properties happen to be the same for a certain time period or over some area, one can trivially achieve higher redundancy in the retrieval by applying single pixel algorithm to the observations collected from such multiple pixels. This fundamental tendency can be formulated as a priori known statistical limitations that Surface BRDF: Surface BPDF: can serve as an extra constraint making multi-pixel retrieval more robust and reliable. Dubovik et al. (2008) discussed using this concept in a frame of statistical estimation formalism for improving global aerosol inverse modeling retrievals. In principle, this approach allows for applying smoothness on 4-D (temporal and spatial: x-y-z) variability of the retrieved characteristics, while for inverting POLDER/PARASOL imager observations only 3-D constraints (temporal and horizontal: x − y) are used. Methodologically quite similar concept was used by Quaife and Lewis (2010) for imposing temporal (1-D) smoothness constraints on surface reflectance variability for inverting linear kernel-driven BRDF models from MODIS observations. It should be noted that application of the multi-pixel approach is discussed here only as a tool for applying extra spatial and temporal constraints on the retrieved aerosol and surface parameters. At the same time, this approach when a group of pixels is inverted together is also convenient for rigorous accounting for cross-pixel coupling in this forward modeling of reflected radiances. This coupling appears due to the multiple light scattering effects and it introduces some dependence of reflected radiances over observed pixel on the properties of surface and aerosol over neighboring pixels. In satellite remote sensing this is known as adjacency effect that may introduce some additional errors in the aerosol retrieval. This effect is significant for high-resolution observations and becomes negligible for satellite images with resolution ∼2 km and larger (e.g., see Tanré et al., 1981;Kaufman, 1982;Lyapustin and Kaufman, 2001). Since POLDER/PARASOL observations have single pixel resolution of ∼5.3 km × 6.2 km, the adjacency effect was not accounted for in our algorithm. In addition, accurate accounting for this effect requires complex and time consuming 3-D radiative calculations that are difficult to adapt to the requirement of the operational satellite retrieval algorithms.

Multiple-pixel observation fitting
In order to make the equations more compact Eq. (17) defined for i-th single pixel can be denoted as follows: Then, the inversion of multi-pixel observations can be considered as a solution to the following combined system of equations: where index "i" (i = 1, 2, 3, ...) denotes the number of each single pixel. Correspondingly, the total vector of unknowns a is composed from the vectors of unknowns a i of each i-th pixel: The matrices S x , S y and S t include the coefficients for calculating m-th differences (numerical equivalent of m-th derivatives) of spatial (x−/y−) or temporal (t−) inter-pixel variability for each retrieved parameter a k characterizing dV(r j )/dlnr, n (λ j ), k (λ j ), ρ o (λ j ), κ (λ j ), θ (λ j ); 0 * x , 0 * y , 0 * t -vectors of zeros and ( x a), ( y a) and ( t a) -vectors of the uncertainties characterizing the deviations of the differences from the zeros. The solution of the multi-pixel system can be implemented using formally the same sequence of the operations as shown in the single pixel case. However, the minimized quadratic form (a p ), its gradient ∇ (a p ) and Fisher Matrix A p formulated for inversion of combined multi-pixel Eq. (32b) would be defined via corresponding single-pixel terms: Surface BPDF: where i (a p ), ∇ i (a p ) and A i,p are defined for N single pixels. The smoothing inter-pixel matrix inter is defined as: It is easy to observe that if inter-pixel matrix inter is eliminated the solution of the multi-pixel system of N pixels can be considered as solution of N independent single pixel systems. However, the matrix inter has non-zero non-diagonal elements and ∇ (a p ) is not equal to a simple sum of ∇ i (a p i ). Therefore, the solution of the multi-pixel system of N pixels is not equivalent to the solution of N independent single pixel systems. At the same time, as shown in the Appendix A the inter-pixel smoothness matrix inter is sparse and contains many zeros. This fact allows multi-pixel constraints without a significant increase of computation time.
Also, the values of Lagrange parameters γ x , γ y and γ t are defined using similar concepts to those discussed for applying smoothness constraints on variability of aerosol and surface parameters in single pixel inversion scenario as shown in Eq. (29). The only difference is that in Eq. (29) the function y i (x) denotes the variability of each single parameter of aerosol or surface over x-, y-or t-coordinates. Note, that while Eq. (34) suggests the same strength of all constraints (that is done for clarity of formulation), the variability limitations can be very different for each single parameter. This is achieved by rather straightforward re-scaling of smoothing matrices corresponding to each single limitation (i.e. smoothing variability of each a i over each of x-, y-or t-coordinates can be different). Also differences/derivatives of different order can be used in each single constraint. This strategy is fully implemented in the current algorithm. Table 4 outlines the details of the inter-pixel constraints employed in the developed algorithm. Figure 6 shows the design of multi-pixel inversion algorithm as a rather straightforward generalization of single-pixel inversion.
The diagram in Fig. 7 emphasizes the similarity of multipixel inversion with N pixel single-pixel retrievals. The difference is only in the additions of the inter-pixel smoothing terms into the N pixel pixels joint Normal System. The matrix inter defining the inter-pixel smoothing terms is very sparse and transparent (see Appendix A), therefore computer time required for definition of these smoothing terms is insignificant compare to other calculations performed in the inversion. Another difference is that the multi-pixel approach requires solving the N pixel pixels joint Normal System that can have rather high dimension. At the same time, the Fisher matrix A p of this system is also very sparse and a number of numerical tools are available for optimizing the solution of linear system with such structure.

Inter-connection between neighboring groups in multiple-pixel fitting
The previous Section described the simultaneous inversion of a group of pixels. The proposed multi-pixel strategy uses expected continuity in pixel-to-pixel variability of aerosol and land surface properties as an additional retrieval constraint. At the same time pixel group has limited 9 Fig. 6   Fig. 6. The diagram illustrates the approach for retrieval aerosol from a multiple pixels of POLDER/PARASOL. size: N x × N y × N t . One can naturally expect the continuity/similarity of aerosol and surface properties observed the edges of this pixel group and with the properties in the pixels-neighbors located just outside of the N x × N y × N t pixel group. These additional constraints enforcing continuity of retrieved properties between different inverted groups of pixels can be rather logically added to the inversion formulations described in the preceding Sections. As shown by detailed derivation in Appendix B, the additional conditions of continuity of aerosol and surface properties with the values of corresponding parameter in the neighborhood of the inverted pixel group can be implemented by adding extra terms in Eq. (34). Specifically, the minimized quadratic form (a p ), its gradient ∇ (a p ) and Fisher Matrix A p will include the extra-terms edge (a p ), ∇ edge (a p ) and (A p ) edge accordingly. As demonstrated in Appendix B, these additional terms are rather transparent and can be calculated with very minor computational effort. At the same time, this way of equation organization provides rather powerful additional tools constraining the retrieval. Indeed, the inversion of satellite observations over large geographical areas and extended time periods cannot be implemented simultaneously due to natural limitations of computer resources. Instead, the large records of satellite observations can be inverted sequentially by small pixel parcels/groups of limited N x × N y × N t size. Adding intergroup constraints makes such sequential retrieval nearly equivalent to simultaneous inversion of all data. The only difference is that the retrieval for each small pixel group would not benefit from the information contained in the observations over the pixels inverted in subsequent retrieval acts. At the same time, the retrieval would fully benefit from all preceding retrievals. Thus, taking into account that spatial and temporal correlations of both aerosol and surface properties have rather limited range, one can expect that sequential pixel parcel-by-parcel retrieval with the interparcel continuity constraints should produce the extended fields of the retrieved parameters. The high consistency of those fields are enforced by unified inter-pixel constraints.
It is worthwhile mentioning the analogy of the sequential parcel-by-parcel retrieval with the Kalman Filter type retrieval sequences. For example, if the "edge" inter-pixel constraints are implemented using only first differences, and if they were applied only in the time dimension (when the retrieved values of unknowns in the given time are used to constrain the solution in consequent time moment), then the technique described above in this Section becomes full equivalent of known Kalman Filter retrieval technique (Kalman, 1960).

Assuming common parameters for several different pixels in the retrieval
The multi-pixel retrieval scheme suggested in the above Sections takes advantage of the limited spatial variability of aerosol or land surface reflectance for different pixels or temporary for the same pixel. However, there are some situations when it is reasonable to assume that some parameters do not change at all from pixel to pixel. For example, in the retrieval algorithms developed for geostationary observations, the land surface reflectance parameters and most of the aerosol parameters are assumed diurnally constant (e.g. . Moreover, even in the retrievals of aerosol from polar-orbiting observations (e.g. POLDER, MODIS), neglecting surface property variations on small time scales can be a reasonable assumption. Therefore, a possibility of assuming the inter-pixel invariant parameters can be a useful option for constraining the retrieval and verifying the importance of neglected variations. Inclusion of such assumptions, in principle, can be implemented by a significant increase of the corresponding Lagrange multiplier in Eqs. (34) and (B17) (in Appendix B). This would enforce very high correlation between corresponding retrieved parameters and make them practically equal. However, such enforcement of the full dependence of the retrieved parameters has serious deficiencies. Indeed, the increase of γ t would result in a strong increase in the smoothness matrices inter and edge into the total Fisher matrix in Eqs. (34) and (B18) (in Appendix B). Correspondently, making contributions of inter and edge dominant would doubtlessly produce degenerated Normal systems. Therefore, if one can expect that properties of aerosol or surface in different pixels do not change, the retrieval of a single group of those constant parameters for many pixels seems more logical than the retrieval of several groups of parameters (one group per each pixel) forced by the enhanced smoothness constraints to have very close values in different pixels. For example, this retrieval approach leads to a significantly smaller number of simultaneously retrieved parameters that essentially simplifies the numerical inversion. However, if the algorithm is based on this straightforward strategy it looses flexibility, because practical implementation of the algorithm allowing easy increase or decrease in the number of fixed parameters is logistically very challenging. Therefore, here we suggest another approach. The approach uses the general formulation of the inversion strategy via inter-pixel smoothness constraints even for retrieval scenarios when some derived parameters are fixed to be the same for different observations (e.g. assuming that some parameters of BRDF are intra-day independent over the same pixel) or even for one set of observation (e.g. assuming that some parameters of BRDF are spectrally independent in any single pixel).
The approach is the following. First, the Normal system is built under the assumption that every single parameter that drives the forward model is retrieved. Then, as shown in Appendix C, fixing several parameters equal in the inversion can be achieved by rather simple modification of the Normal system without any other changes in the inversion algorithm. Assume one has defined the Fisher matrix A (N ) and gradient ∇ (N) for deriving N parameters a i . Then if we assume that n + 1 parameters are equal, i.e. a i = a i k (k = 1, ..., n + 1), we should decrease number of the retrieved parameters and recalculate Fisher matrix and gradient for new decreased set of the retrieved parameters. Direct realization such reduction of the retrieved set of parameters requires significant changes in the algorithm. However, analyzing structure of the equations we noted that the Fisher matrix and gradient for the reduced set of parameters can be obtained trivially from the Fisher matrix and gradient calculated for original extended set of retrieved parameters. Specifically, we can implement the retrieval simply by decreasing the dimensions of the Fisher matrix and gradient by summing up n + 1 of lines and columns of the equal parameters. As a result the solution can be obtained by solving modified Normal system with Fisher matrix A (N−n) and gradient ∇ (N−n) of smaller dimensions: (N − n) × (N − n) and (N − n) correspondingly. The details are shown in Appendix C. It is important to note that the described reorganization of the Normal system is rather straightforward and is not time consuming.

Algorithm functionality and sensitivity tests
The algorithm is designed to retrieve a rather extended set of parameters describing the atmospheric aerosol and underlying surface reflectance. The list of the parameters is given in Table 1. The algorithm derives basically the same group of aerosol parameters in both scenarios of observation over ocean or land. At the same time, there is some flexibility in setting the representation of the aerosol size distribution. The size distribution can be modeled using different number of size bins N i : (i) large number of the size bins (N i ≥ 10) modeled as tri-angle functions; (ii) small number of the size bins (N i < 10) modeled as log-normal size functions with different assumed width. The smaller number of bins can be set for reducing retrieval implementation time. The retrieval of large numbers of size bins can be a preferable option in situations where observations are more sensitive to details of aerosol properties. For example, contribution of the atmospheric aerosol dominates the POLDER observations over the dark ocean surface. The modeling of ocean surface reflection fully follows the approach the currently operational POLDER algorithm (Deuzé et al., 2001;Herman el al., 2005;Tanré et al., 2011). The only difference assumed in the developed algorithm is a possibility to retrieve the values of the Lambertian reflectance linked to the seawater reflectance at short wavelengths (0.44 and 0.49 µm) and to the whitecaps reflectance properties. The details of algorithm performance in this situation have not been tested yet. We plan to provide analysis and illustrations of the retrieval algorithm performance over ocean in future publications.
The design of the inversion algorithm allows two complementary inversion scenarios. The conventional Single-Pixel Inversion is designed to retrieve all above aerosol and surface parameters for each single pixel of observation. The Multi-Pixel Retrieval implements the simultaneous inversion of satellite observations over a group of the neighboring pixels obtained during the limited time period (e.g. several weeks). This strategy allows applying rather diverse constraints on variability of every retrieved parameter in space and time. It also allows the assumption of time or space pixel independence of any single parameter or a group of the retrieved parameters. Based on the known experience in aerosol and land surface retrieval developments, and on general understanding of limitations of the information content (checked in sensitivity tests) the following combination of assumptions has been chosen for applying to POLDER/PARASOL observations.
For aerosol (within each single pixel): -The aerosol size distribution is assumed smooth.
-The spectral dependence of n(λ i ) and k(λ i ) is assumed smooth.

For aerosol (between pixels):
-The moderate spatial variability is assumed for all aerosol parameters with the exception of aerosol loading (driven by aerosol total concentration).
-Significant spatial variability of aerosol loading (total concentration) is allowed.
For surface reflectance (within each single pixel): ρ o (λ) is assumed moderately spectrally smooth.
-B (λ i ) is assumed strongly to moderately spectrally smooth.
For surface reflectance (between pixels): -All parameters of surface reflectance are assumed constant during ∼7 days.
-No constraints are applied at spatial variability of the parameters. Table 5 summarizes the above assumptions and Tables 3  and 4 show the Lagrange parameters used for applying smoothness constraints on the size or spectral dependencies of the retrieved aerosol and surface properties in each single pixel and on the spatial and temporal variability of the any single retrieved parameter between different pixels.
A series of sensitivity tests has been performed to verify the performance and potential of the developed algorithm.
The tests have been designed to provide rather compact and conclusive illustration of capabilities and limitations of the algorithm to derive full set of aerosol and surface reflectance parameters from POLDER type satellite observations. The discussion below will be focused on the satellite observations over land surfaces. The tests verifying the performance of the algorithm over dark surfaces are not discussed in this paper. Nonetheless such tests were also performed and they showed rather robust retrieval of all aerosol parameters in most tested situations. Some results of such tests have been shown in the paper by Kokhanovsky et al. (2010) that discusses the outcome of the series of aerosol retrieval "blind test". In the framework of this study the observations of aerosol over dark surfaces by different satellites were simulated for a single chosen undisclosed aerosol model. These observations were distributed to different research groups involved in aerosol retrieval developments. The groups willing to participate in such tests have derived the aerosol properties from the obtained synthetic observations and returned the obtained aerosol retrieval results to the distributor of the data. Once all retrieval results were collected, the assumed aerosol model was disclosed and compared with the collected retrieval results. Based on the analysis of the "blind test" outcome the present algorithm (mentioned in the paper by Kokhanovsky et al., 2010, as "LOA-2") was rated among the algorithms providing the most comprehensive and accurate results. The retrieval results provided for studies by Kokhanovsky et al. (2010) were obtained using conventional pixel-by-pixel retrieval approach. Table 5. Summary of the a priori assumptions applied for constraining of each element of the vector of unknowns.

AEROSOL:
Parameter: Single-Pixel Inter-Pixel constraints constraints C v no t -no; X -mild; Y -mild dV(r i )/dlnr weak size smoothness t -constant during 7 days; X-Y (horizontal) -mild n(λ i ) strong spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild k(λ i ) mild spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild SURFACE REFLECTANCE: Parameter: Single-Pixel Inter-Pixel constraints constraints ρ o (λ i ) weak spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild κ(λ i ) strong spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild θ (λ i ) strong spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild h 0 (λ i ) weak spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild B(λ i ) strong spectral smoothness t -constant during 7 days; X-Y (horizontal) -mild The sensitivity tests for retrieval over land surface were aimed to verify the performance of the retrieval approach under conditions maximally reproducing the real environment. With that purpose, the POLDER observation geometry and the aerosol and surface characteristics have been assumed to mimic closely the observations over two AERONET (Holben et al., 1998)

observation sites, Banizoumbou (Niger) and
Mongu (Zambia). The properties of both aerosol and surface reflectance were extensively studied and discussed in the scientific literature. For example, the observations over Banizoumbou were discussed in a number of papers Johnson et al., 2009;Sow et al., 2009, etc.), and the observations over Mongu were discussed in many studies (Dubovik et al., 2002a;Eck et al., 2003;Schmid et al., 2003;Haywood et al., 2003;Pilewskie et al., 2003;Lyapustin et al., 2006;Sinyuk et al., 2007, etc.). Two rather distinct aerosol types dominate the aerosol over AERONET sites: desert dust and biomass burning. Mongu is highly affected by biomass burning aerosol. The aerosol over Banizoumbou site is strongly impacted by desert dust outbreaks with notable seasonal contributions of smoke. Biomass burning is known to be dominated by fine absorbing spherical particles. All optical properties of biomass burning aerosol (extinction, single scattering albedo, etc.) generally have distinctly decreasing spectral dependence (e.g. Eck et al., 1999;Dubovik et al., 2002a;Reid et al., 2005, etc.). In contrast, desert dust is dominated by coarse non-spherical and weakly absorbing particles. Therefore, the extinction of desert dust is generally spectrally flat (e.g. Eck et al., 1999;Holben et al., 2001, etc.). Dust particles generally are nearly non-absorbing in visible with some moderate absorption appearing at ∼0.5 µm and increasing towards UV spectral range (e.g. Kaufman et al., 2001;Dubovik et al., 2002a;Lafon et al., 2006, etc.). Biomass fine aerosol particles strongly polarize the scatted light, while the polarization by coarse non-spherical desert dust particles is weak (e.g. Volten et al., 2001;Dubovik et al., 2006, etc.). The two series of tests were conducted: (i) when desert dust was dominating and (ii) when biomass burning was dominating. The sensitivity tests were carried out for a wide range of aerosol loadings. The 16 test scenarios were designed to cover the range from τ (0.44 µm) = 0.01 to τ (0.44 µm) = 4. The size distributions for both biomass and desert dust aerosol models were adapted from original observations over Banizoumbou and Mongu AERONET sites. The 22 AERONET size distribution bins were used for generating synthetic observations. Since the size distributions were adapted from actual AERONET observations, they did not have perfect multi-modal shape. This fact makes the tests more challenging but more realistic. The values of complex refractive index for λ = 0.44, 0.67, 0.87 and 1.02 µm were adapted from actual AERONET observations. The values for intermediate spectral channels λ = 0.49 and 0.55 µm were obtained by the interpolation. At the same time it should be noted that values chosen for both the size distribution and complex refractive index are generally close to the values reported from AERONET retrieval climatology for dust and biomass burning by Dubovik et al. (2002a). The random noise at the level of 1 % for intensity and 0.5 % for degree of linear polarization of PARASOL signal have been added to the simulated "synthetic" PARASOL observations. These synthetic observations were inverted by the present algorithm using two retrieval scenarios: conventional pixel-by-pixel and multi-pixel retrieval. In the first scenario all synthetic observations were inverted fully independently. For multi-pixel retrieval, the "synthetic" PARASOL observations were assumed to be observed during ∼1 week in four consequent observations (with time difference of 1 to 2 days). Table 6. Definition of " the initial guess" for the vector of unknowns.

Aerosol Properties Surface Reflectance
C v = C 0 (corresponding to the value of τ aer (0.44)∼0.05); ρ o (λ i ) = 0.05 (i = 1, ..., N λ ) dV(r i )/dlnr = 0.1; (i = 1, ..., N r ) κ(λ i ) = 0.75 (i = 1, ..., N λ ) It was assumed that the four neighboring pixels were observed at each observation time. For example, Fig. 6 shows situation, when 9 (3 × 3, where N x = 3 and N y = 3) pixels observed during 3 consequent observations. One can imagine similar case when 4 (2 × 2, where N x = 2 and N y = 2) pixels observed during 4 consequent observations. The constraints were applied on the inter-pixel variability of the retrieved aerosol and surface properties as described above in this Section and summarized in Tables 2-5. Specifically, the properties of the surface reflectance were assumed constant during the week for the same pixels. No constraints were applied on either temporal or horizontal variability of aerosol concentration and for all other aerosol parameters only weak horizontal variability was allowed during the same day (i.e. observed by the same PARASOL image). All retrievals in the tests were conducted using most detailed size distribution representation: 16 logarithmically equidistant size bins covering the range of aerosol particle radii from 0.1 to 7 µm. The same initial guess was used for the unknown parameters in every pixel. It is described in Table 6. Figures 8-11 show the results of the sensitivity test for retrieving biomass burning aerosol over Mongu. The illustrations demonstrate the retrieved normalized size distribution, τ (0.44 µm), ω o (λ) and spectral albedo of the surface calculated using the retrieved parameters of aerosol microphysics and surface reflectance respectively. Each figure shows the retrieval results for three situations. Left part of each figure shows the results of the retrieval for the case with no noise added. It should be noted however, that some discrepancies were present even in this case since the size distribution is modeled using 22 logarithmically equidistant size bins covering range of particle radii from 0.05 to 15 µm, while retrieval uses only 16 size bins covering radii from 0.1 to 7 µm. The central and the right parts of each figure show the results of the retrieval with no noise added at the level of 1 % for total radiances and 0.5 % for degree of linear polarization. Both the left and central parts show the results obtained using conventional pixel-to-pixel retrieval approach, while the right part of each figure shows results obtained using the multi-pixel retrieval strategy. Figures 12-15 are analogous to Figs. 8-11 with the difference that they illustrate the retrieval results of desert dust over Banizoumbou (Niger).
The analysis of the results shows that in situations where no noise is added, all aerosol characteristics are retrieved rather accurately. Some notable deviations are seen for single scattering albedo and size distribution in situations with very low aerosol loading when τ (0.44 µm) ≤ 0.2. The deviations are particularly notable for the size distribution retrievals of the coarse size particles and for single scattering albedo of biomass burning at long wavelengths. These   deviations can be explained by the fact that at such low aerosol loadings the aerosol contribution into radiation observed by satellite is negligible compare to the contribution of land surface reflectance. Therefore even minor perturbations of the observation may significantly affect the retrieval results. In contrast, in situations with very high aerosol loading τ (0.44 µm) ≥ ∼2-2.5, the contribution of aerosol dominates in the observed reflected radiation and the reflectance properties of the underlying land surface become nearly invisible for satellite. Correspondingly, the retrievals of surface reflectance become unstable, as it is seen for retrieved surface reflectance values at short wavelengths. All these tendencies become pronounced once the random noise is added. As one can see from central parts of Figs. 8-15, the size distribution and single scattering albedo retrieved by conventional pixel-by-pixel approach deteriorate in most of the cases including the situations with moderate and even high aerosol loading with τ (0.44 µm) reaching 0.4 and even 0.8 (for single scattering albedo in particular). In addition one can see the significant complications in the retrieval of shape of aerosol size distribution, in particular for larger particle sizes. The size distribution retrieved for desert dust (see Fig. 13) becomes much wider and the maximum shifts towards smaller sizes. In retrievals of biomass burning size distributions ( Fig. 9) there are also significant complications for the large radii. This difficulty can be explained by the wellknown fact (e.g. see Bohren and Huffman, 1983) of strongly decreasing scattering efficiency (per unit of particle volume, i.e. k ext (λ,r)/v(r)) for particles with sizes much larger than the wavelength of observations. Therefore retrieving accurate shape of size distribution for radii ≥ ∼3 µm from POLDER/PARASOL observations appears to be very difficult. The retrieval of aerosol optical thickness τ (0.44 µm) seems to be rather reliable in situations when random noise is added with the exception of cases of very high aerosol loading τ (0.44 µm) ≥ ∼3 where the retrieval errors reach the values of 0.5 or even larger (see Fig. 8). These high errors can be explained by the fact that in such situations the reflected  radiances are dominated by the multiple scattering of very high orders and some of fine agular features of reflected light distribution are smoothed out. Correspondingly, deriving detailed aerosol and surface properties becomes more difficult. The analysis of the right parts of Figs. 8-15 shows that using multi-pixel approach significantly improves the retrievals of all aerosol and surface parameters. This tendency is not surprising because added constraints allow propagation and consolidation of useful information from different observational situations. For example, in the situations with low aerosol loading the satellite observes mainly surface reflectance properties. Correspondingly, once the constraints limiting time variability of the surface reflectance are applied, this information is supplied into the interpretations of observations corresponding to moderate and high aerosol loading over the same pixel. Similarly, the constraints of horizontal variability of aerosol properties help to improve the retrieval of aerosol by benefiting from observations of the same or similar aerosol properties over several pixels with somewhat different conditions of observations (geometry, surface reflectance, aerosol loading).
It should be noted that such retrieved parameters as aerosol mean height, fraction of spherical particles, detailed parameters of BRDF and BPDF are not shown in Figs. 8-15. All these parameters have been included in the tests. The results are shown only for most significant aerosol and surface parameters commonly discussed in remote sensing retrieval analysis. In a future study, it is planned to implement the comprehensive sensitivity analysis of the retrieval approach suggested here. The sensitivity studies shown in the present paper are aimed to provide insightful but preliminary outlook at the expected performance of the retrieval. Nonetheless, it is possible to state here that the tests have shown reasonably robust retrieval of all sought parameters. Figures 11 and 15 show the retrieval of surface albedo. The retrievals of the detailed BRDF and BPDF parameters generally demonstrate the same trends. In pixel-by-pixel retrieval scenario all parameters were retrieved rather accurately in situations with low and moderate aerosol loading. Using multi-pixel retrieval helped to achieve stable retrieval of all surface reflectance parameters in all situations including the cases with high aerosol loading. In contrast, the retrieval of the mean altitude of the aerosol layer h a was retrieved robustly when aerosol loading was moderate or high. When pixelby-pixel retrieval scenario was used, h a was retrieved with accuracy better than 1 km for τ (0.44 µm) ≥ ∼0.5 and better than 0.5 km for τ (0.44 µm) ≥ ∼1.0. Once the multi-pixel scenario was applied, h a was retrieved with accuracy better than 0.5 km for τ (0.44 µm) ≥ ∼0.5 and better than 0.3 km for τ (0.44 µm) ≥ ∼1.0. The retrieval of the spherical particle fraction was rather successful for desert dust aerosol, where the coarse mode is dominant. When pixel-by-pixel retrieval scenario was used, C sph was retrieved with the accuracy of ∼ 50 % in situation with τ (0.44µm) ≥∼1.0. In multi-pixel retrieval scenario it was retrieved with the accuracy of ∼30 % in situations with τ (0.44 µm ) ≥ ∼0.2 and better than 10 % for τ (0.44 µm) ≥ ∼1.0.
Finally, Figs. 16-19 summarize the performance of the retrieval in an extended series of numerical tests with added random noise. The series were composed by the tests analogous to those illustrated in Figs. 8-15, with the difference that each test was repeated 100 times with different realizations of the generated noise. The random noise was added at the level of standard deviation σ = 1 % for total radiances and σ = 0.5 % for degree of linear polarization. Thus Figs. 16-19 show the results of 3200 inversions = 100 × 16 × 2 (100 -number of tests with different realizatios of modeled rundom noise; 16 -aerosol loading covering τ (0.44 µm) from 0.01 to 4.0; 2 -number of observational sites: Banizoumbou and Mongu) and the differences: (τ (retrieved) τ ("true"))/τ ("true") and ω o (retrieved)ω w ("true") for λ = 0.44 and 1.02 µm. All retrievals were conducted using most robust multi-pixel retrieval scenario. Table 7 provides a brief summary of the test outcome. It provides the standard deviations σ τ/τ (λ) of the relative errors calculated using the retrievals obtained with 100 different realizations of added random errors as follows: and These estimates can be considered as reference values for the expected retrieval errors. It should be noted that the real errors in actual POLDER/PARASOL observations (Fougnie et al., 2007) are likely to have 2 to 3 times higher levels than the errors modeled. Therefore, the actual retrieval errors should be expected to be higher than the numbers given in the Table 7 by a factor of 2 or 3. In any case, these estimates are to be verified in a more comprehensive sensitivity analysis.

Algorithm applications to real POLDER/PARASOL observations
As a final stage of this study the developed algorithm was applied to actual observations by POLDER/PARASOL. For consistency with the sensitivity test, the algorithm was used to process the full 2009 year of POLDER/PARASOL data over Banizoumbou (Niger) and Mongu (Zambia) AERONET sites. Specifically, the algorithm was applied to 4 POLDER/PARASOL pixels surrounding the exact locations of the sites. The multi-pixel inversion scenario was employed in the retrieval. Since the current studies did not  address the cloud-screening of the satellite data, the algorithm was applied to PARASOL data identified as cloud-free by the current operational POLDER algorithm (Deuzé et al., 2001;Herman el al., 2005;Tanré et al., 2011). In addition, the retrieval outliers with the fitting residual higher than 5 % were eliminated from the final results.
It should be noted that the algorithm has been developed here for utilization of complete set of POLDER/PARASOL observations as shown in Table 1. However, the analysis of PARASOL in-flight calibration and performance studies performed by Fougnie et al. (2007) indicate that PARASOL 0.443 µm band does not meet to the mission requirements due to unidentified stray light problem and, therefore, it is not recommended for use. Nonetheless for consistency with the sensitivity tests performed in Section 5, we have decided to use PARASOL measurements at 0.443 µm in these illustrative applications of the algorithm. Such a decision was also made with the idea to benefit from unique information content of the observations at the shortest wavelength. Indeed, the optical thickness of aerosol is generally higher at 0.443 µm compare to other PARASOL spectral bands, while the land surface reflectance is generally lower at 0.443 µm compare to other bands. In order to minimize the possible effect of the higher uncertainty at 0.443 µm spectral band, we have introduced the coefficient for correcting observations at this wavelength. Specifically, the data were inverted using a set of different correction coefficients that could increase or decrease the measured radiances at 0.443 µm up to 10 %. The coefficient corresponding to the best fit of corrected observations by the theoretical model was used in the illustrations below. Such strategy for correction of PARASOL observation is in agreement with evaluation of possible effect of Table 7. The summary of the sensitivity tests for aerosol retrievals from simulations mimicking POLDER observations over Mongu (Zambia) and over Banizoumbou (Niger) with the added random noise at the level of standard deviation σ = 1% for total radiances and σ = 0.5% for the degree of linear polarization.  Fig. 19. Statistics of the retrieval errors of desert dust single scattering albedo over Banizoumbou in the series of the numerical tests for 100 different realizations of random noise added (multi-pixel retrieval): (left panel) 0.44 µm; (right panel) 1.02 µm. stray light contamination (B. Fougnie, personal communication, 2011). In future studies we plan to investigate comprehensively the propagation of the PARASOL 0.443 µm band errors to the aerosol retrieval results. We hope to identify the best strategy for addressing this uncertainty in the operational application of the current algorithm of using the corrected 0.443 µm data (e.g. as it is done here) or eliminating them completely from the aerosol retrieval.
The τ (0.44 µm) and ω o (0.44 µm) calculated using the set of retrieved aerosol parameters (Table 2) were compared with available AERONET data. The comparisons showed rather robust performance of the algorithm. The retrieved values of aerosol optical thickness closely followed the AERONET observations with a correlation coefficient of ∼0.9 for Banizoumbou and ∼0.87 for Mongu. The values of aerosol single scattering albedo also agreed reasonably well with AERONET data for observations with high aerosol loading. The differences between the values of ω o (0.44 µm) obtained from PARASOL and from AERONET did not exceed 0.03-0.05 for the cases when τ (0.44 µm) ≥ ∼0.5. It also should be noted that the retrieval with different correction coefficients did not exhibit dramatic changes in the retrieval. For example, the differences in retrieval of τ (0.44 µm) and ω o (0.44 µm) due to changes in the correction coefficient had generally smaller magnitudes than mean differences between PARASOL and AERONET results.
Figures 20-21 illustrate the comparison of POLDER/ PARASOL retrievals with the AERONET observations during 2 months over Banizoumbou and Mongu. These particular periods were chosen for the illustrations because they corresponded to the most complete and continuous AERONET data records during the seasons with the presence of high aerosol loadings. As can be seen from the illustrations, the aerosol loading trends retrieved from PARASOL agree well with the AERONET observations. The values of aerosol single scattering albedo are also in reasonable agreement with   AERONET values when τ (0.44 µm) ∼0.5 or larger. At the same time, one can identify some notable differences between AERONET data and PARASOL retrievals in some single points. Nonetheless, our analysis showed that usually these points correspond to days when AERONET data indicate the partial cloudiness during the day as identified by the Smirnov et al. (2000) cloud-screening procedure. Therefore, such outliers can probably be explained by some sky inhomogeneities in one or all PARASOL observed pixels that cover an area of 12 × 12 km around the AERONET site. It should be also noted that comparisons of AERONET observations and PARASOL retrieval results showed that the retrievals tend to underestimate the spectral dependence of retrieved aerosol properties. The desert dust retrievals tend to show more pronounced spectral dependence of aerosol properties, while the aerosol properties retrieved for biomass burning tend to have smaller spectral dependence than observed from AERONET. As follows from the sensitivity tests and our general understanding, this is likely due to very low sensitivity of satellite observations to the shape of the aerosol size distribution for large particles. Indeed, the contribution of very large particles (r > 3 µm) into radiation reflected to space is significantly smaller than from fine particles, while the spectral variability of aerosol properties is very sensitive to the presence of such large particles. Therefore, deriving fully adequate spectral dependence of aerosol optical properties from satellite observations appears to be a harder task than deriving aerosol loading especially over bright land surfaces considered in present test. Apart from this relatively minor issue, the present analysis showed very encouraging performance of the developed algorithm over bright land surfaces, i.e. in conditions that are considered traditionally the most difficult for retrieval of aerosol from satellites. This result is particularly encouraging because the algorithm is designed to provide rather extensive set of the retrieved parameters providing detailed characterization of the properties of aerosol and the underlying surface. Both the results of numerical sensitivity tests and the obtained results of actual PARA-SOL data inversion suggest that the processing the PARA-SOL with developed algorithm can provide global products of total aerosol optical thickness and single scattering albedo rather consistent with the accuracy requirements formulated by Mishchenko et al. (2004Mishchenko et al. ( , 2007. However, some limitations in accuracy and scope of the aerosol information derived by presented algorithm should be noted. For example, the present approach does not discriminate between optical properties of aerosol fine and coarse modes (see Sect. 3.3). Such retrieval assumption is consistent with the resutls of limited tests conducted in scope of this study for POLDER type imager. In addition, analyzing the standard deviations σ τ/τ (λ) resulted of retrieval tests with synthetic PARASOL observations perturbed of random noise and shown in Table 7 one can see that errors in the retrieved total aerosol optical thickness τ (λ) (especially for high values of τ ) could exceed the value of 0.04 suggested by Mishchenko et al. (2004Mishchenko et al. ( , 2007 as APS retrieval uncertainty expected over land for optical thickness of both fine and coarse modes of aerosol. At the same time, the results presented here are preliminary. Further testing, verification and tuning of the presented algorithm are planned. In addition, the efforts are also planned for the acceleration of the developed algorithm. In the current state of the algorithm the implementation of the retrieval for a single PARASOL pixel (5.3 × 6.2 km) required in average of ∼10 s of computer time. This time should be decreased in 50 to 100 times, in order to reprocess existing data archive of PARASOL observations in reasonable time framework. Such acceleration of the algorithm is expected to be achieved by reengineering and optimizing the algorithm, for example, using parallel programming and implementing the retrievals at several computers in parallel.

Conclusions
The paper has discussed in detail a concept for a new stateof-the-art algorithm developed for deriving detailed properties of atmospheric aerosol from satellite observations. The proposed retrieval does not use precalculated look-up tables commonly utilized in satellite retrievals for fitting observations. Instead, a more general approach is applied that searches in continuous space for the solutions and optimizes the statistical properties of the obtained retrieval. Such optimization can be achieved by adjusting the structure of the deviations in an effort to fit observations by theoretical model under conditions where the amount of observations exceeds the number of retrieved parameters. The set of observations provided by modern enhanced spectral multi-viewing spectral polarimeters allows applying such optimization. The algorithm described in this paper was adapted for applying to observations of POLDER/PARASOL imager that registers reflected atmospheric radiation at six wavelengths in up to 16 directions. The algorithm fits total radiance and linear pollarization observed in all directions in all available spectral channels using generalized multi-term Least Square type numerical inversion formulation. That formulation allows fitting several sets of both observations and a priori data. The concept complementarily unites advantages of a variety of practical inversion approaches, such as Phillips-Tikhonov-Twomey constrained inversion, Kalman filter, Newton-Gauss and Levenberg-Marquardt iterations, etc. This methodology has resulted from a multi-year effort at developing inversion algorithms for retrieving comprehensive aerosol properties from AERONET ground-based observations. The algorithm is driven by a large number of unknowns and designed as a retrieval of extended set of parameters affecting measured radiation. For example, over land the algorithm is set to retrieve parameters of the land surface reflectance together with detailed information about aerosol sizes, shape, absorption and composition (refractive index) and aerosol layer elevation. In addition, the algorithm is developed as simultaneous inversion of a large group of pixels within one or several images. Such, multi-pixel retrieval regime takes advantage of known limitations on spatial and temporal variability in both aerosol and surface properties. Specifically the pixel-to-pixel and/or day-to-day variations of the retrieved parameters are enforced to be smooth by additional appropriately set a priori constraints. This concept is aimed to achieve higher consistency in satellite retrievals because in such an approach the solution over each single pixel is benefiting from information contained in coincident observations over neighboring pixels, as well as from information about surface reflectance (over land) obtained in preceding and consequent observations over the same pixels. The paper provided detailed description of full set of formulations necessary for realizing this concept.
The performance of the developed algorithm has been demonstrated by application to both synthetically generated and real POLDER/PARASOL observations. First, a series of sensitivity tests was conducted by applying the algorithm to the synthetic POLDER/PARASOL observations over vegetated and desert surfaces. The simulations were designed to mimic satellite observations over well-studied AERONET network sites in Mongu (Zambia) and Banizoumbou (Niger). The synthetic POLDER/PARASOL signals were perturbed by random noise prior to applying the retrieval algorithm. Both the conventional pixel-by-pixel and newly suggested multi-pixel retrieval approaches were tested. The results of the tests showed that the complete set of aerosol parameters can be robustly derived with acceptable accuracy in both situations over both vegetated and desert surfaces. The summary of the error analysis is provided. In addition, the algorithm was applied to one year of PARASOL observations over both Mongu and Banizoumbou AERONET sites. The comparison of the derived aerosol properties with available observations by AERONET ground-based sun/sky-radiometers indicated encouraging consistency of PARASOL derived optical thickness and single scattering albedo with those obtained by AERONET. At the same time, the presented tests and analysis of the retrieval from actual PARASOL observation had somewhat limited character and were aimed to provide an introduction and some limited illustration of the proposed retrieval algorithm. More comprehensive studies for testing and tuning the developed algorithm are planned in future efforts. Such important aspects of algorithm implementation for operational processing as cloud-screening and retrieval time requirements are to be addressed in follow-on studies.
It should be noted that the research efforts described in this paper considerably relied on the accumulated experience and many aspects of the retrieval, as well as actual computer tools that were inherited from precedent efforts on development of currently operating AERONET retrieval and PARASOL aerosol retrieval algorithms.

Matrices of multi-pixel smoothness constraints
In order to determine the index of pixels we will follow first the changes of x i -coordinates (i = 1, ..., N x ) than y i -coordinates (i = 1, ..., N y ) and t i -time coordinates (i = 1, ..., N t ). For the simple case when N x = N y = N t = 2, the total vector of unknowns a is composed of vectors a i of each i-th pixel as: a (x 1 ; y 1 ; t 1 ) a (x 2 ; y 1 ; t 1 ) a (x 1 ; y 2 ; t 1 ) a (x 2 ; y 2 ; t 1 ) a (x 1 ; y 1 ; t 2 ) a (x 2 ; y 1 ; t 2 ) a (x 1 ; y 2 ; t 2 ) a (x 2 ; y 2 ; t 2 ) The spatial and temporal smoothness constraints are applied separately along corresponding coordinates, i.e. constraints of variability over x i -coordinates are applied only to the vectors with the same values of y i and t i ; constraints of variability over y i -coordinates are applied only to the vectors with the same values of x i and t i and constraints of variability over t i -coordinates are applied only to the vectors with the same values of x i and y i . Therefore, if x = x i+1 − x i = const, y = y i+1 − y i = const and t = t i+1 − t i = const, for the vector a defined by Eq. (A1) the corresponding matrices of first differences are defined as: where the smoothness matrix D is defined as: where S 1 is the matrix of first differences. The matrix I d ij is a diagonal matrix of dimension N a × N a (N a is dimension of corresponding vector a i ) with the all elements on the diagonal equal to corresponding element d ij of matrix D, i.e.: The definition of inter can be generalized for situations with a larger number of pixels. For example, for the case similar to above but for N x = N y = N t = 3, Eq. (A3) are transformed to the following: where the diagonal matrices I d ij are defined according to Eq. (A5) with the differences that the smoothness matrix D is generated as the base of the matrix of first S 1 or second differences S 2 for N x = N y = N t = 3 as: The two examples given above show a clear pattern in defining the matrix inter for N x = N y = N t = 2 and 3 that can easily be generalized to the situation with higher dimensions. It can be observed from this pattern that inter always retains the sparse array structure. Specifically, the dimension of the matrix inter is: while the number of non-zero elements does not exceed the following value: where m is the order of the differences/derivatives used for inter-pixel smoothing. Obviously, the value of Eq. (A9) is generally significantly smaller than all the elements of inter and thus inter is essentially a sparse matrix. It should be noted that all of the above formulations are given for the idealized situation where x i = x i+1 − x i = const, y i = y i+1 − y i = const, t i = t i+1 − t i = const and N x = N y = N t . In reality, none of these conditions are assured and the algorithm should be able to handle the more general situation where x i = y i = t i = const as well as N x = N y = N t . In such general situations, the above equations describing the matrix inter loose some of their simplicity and transparency. Nonetheless, the general sparse structure of the matrix inter is always conserved. Also, even the equations loose some transparency, the realization of interpixel smoothness constraints inter is always rather simple on the algorithmic level and the most general case has been realized in the algorithm described here.

Constraining multiple-pixel retrieval by available information in the neighborhood of the inverted pixel group
If the values of a * nbr parameters a in the pixels neighboring the inverted pixel group are known, then the basic system Eq. (32) can be complemented by the additional equations: x,y,t = S x,y,t a + ( a) s * x,edge = S x,edge a + ( x a) s * y,edge = S y,edge a + y a s * t,edge = S t,edge a + ( t a) , where S x,edge , S y,edge , S t,edge are matrices containing the coefficients defining the m-th finite differences of parameters a describing the properties of the inverted pixel group with a * nbr . These matrices can be trivially derived from S m matrices (cf. Eq. 25b). For example, in the simple case where N x = 3, N y = 1 and N t = 1, one can write:   a * before a a * after   =   a (x 1 ; y 1 ; t 1 ) a (x 2 ; y 1 ; t 1 ) a (x 3 ; y 1 ; t 1 )     a (x 4 ; y 1 ; t 1 ) a (x 5 ; y 1 ; t 1 ) a (x 6 ; y 1 ; t 1 )     a (x 7 ; y 1 ; t 1 ) a (x 8 ; y 1 ; t 1 ) a (x 9 ; y 1 ; t 1 ) (B2) If Eq. (B1) uses the second differences, continuity with a * before can be expressed using the following formulations: 0 * 1 = a * 2 − 2 a * 3 + a 4 + 1 0 * 2 = a * 3 − 2 a 4 + a 5 + 2 .
In matrix form this equation can be written as: Finally, using Eqs. (30) and (34), we can define s * x,edge and S x,edge used in Eq. (B1) as follows: Thus, the derivations shown by Eqs. (B1)-(B11) demonstrate the principle of defining the constraints that enforces the continuity of the retrieved properties with those obtained for the pixels neighboring the inverted pixel group. Analogously the constraints can be included for continuity over coordinates y and t. In the general case of a large inverted pixel group one can use rather large numbers of equations similar to those of Eqs. (B3) and (B8). The maximum number of such edge continuity equations can be estimated as: where m is the order of the differences/derivatives used for inter-pixel smoothing.
Based on the additional equations shown in Eq. (B1), the constraining of the multi-pixel system solution by additional conditions of continuity of aerosol and surface properties with values in the neighborhood of the inverted pixel group can be implemented by adding extra terms in the Eq. (20). The quadratic form (a p ), its gradient ∇ (a p ) and Fisher Matrix A p formulated for inversion of combined multi-pixel Eq. (26) should be modified as follows: It should be noted that the illustrative derivations shown in Eqs. (B3)-(B12) are given for the idealized situation where x i = x i+1 − x i = const and N y = N t = 1. In reality, the edge continuity can be applied over all three coordinates x,y,tand the algorithm should be enabled to handle the most general situation when x i = y i = t i = const as well as N x = N y = N t . In such general situations, the above equations describing the matrix edge loose their simplicity and transparency. Nonetheless, the general sparse structure of the matrix edge is always conserved. Also, even the equations loose some transparency, the realization of inter-pixel smoothness edge constraints edge is always rather simple on the algorithmic level and it takes negligible computer time to implement. The values of the Lagrange parameters γ x , γ y and γ t used for applying the "edge" inter-pixel constraints are exactly the same as those used in Sect. 4.2.1 describing inter-pixel smoothness constraints application in the simultaneous inversion of a group of pixels.

Assumption of inter-pixels constant parameters in the retrieval
The general idea of this technique of imposing an extra assumption of equal parameters can be demonstrated using the following simple illustration. Assuming, Eq. (17) is represented by the following simple system corresponding Normal system Aa = ∇ (a) is defined as If two parameters a 1 and a 2 are retrieved from one single observation f * 1 , then one can write: For such situation the Fisher matrix A and gradient ∇ (a) are A (2) = U T U = u 2 1 u 1 u 2 u 2 u 1 u 2 2 ; (C4) If only it is assumed that a 1 = a 2 and only one parameter retrieved, then the definitions given by Eq. (C3) should be replaced by: Correspondingly, the Fisher matrix A and gradient ∇ (a) become Comparing Eqs. (C4) and (C5) one can note the following trivial relation between A (1) , ∇ (1) and A (2) , ∇ (2) :
These relationships can be easily generalized. Let us assume that we have defined the Fisher matrix A (N ) and gradient ∇ (N ) for deriving N different parameters a i . Then it was decided (or discovered) that n + 1 of those initially different N parameters represent the same single parameter, i.e. a i = a i k (k = 1, ..., n + 1). In this situation, the new Fisher matrix A (N−n) and gradient ∇ (N−n) can be obtained as follows: The resulting matrix A (N−n) has reduced dimension (N − n) × (N − n). Thus, Eqs. (C8)-(C10) allow significant flexibility in designing the retrieval. The option is fully implemented in the algorithm presented here.