The Community Cloud retrieval for CLimate (CC4CL) &ndash; Part 2: The optimal estimation approach

Abstract. The Community Cloud retrieval for Climate (CC4CL) is a cloud property retrieval system for satellite-based multispectral imagers and is an important component of the Cloud Climate Change Initiative (Cloud_cci) project. In this paper we discuss the optimal estimation retrieval of cloud optical thickness, effective radius and cloud top pressure based on the Optimal Retrieval of Aerosol and Cloud (ORAC) algorithm. Key to this method is the forward model, which includes the clear-sky model, the liquid water and ice cloud models, the surface model including a bidirectional reflectance distribution function (BRDF), and the "fast" radiative transfer solution (which includes a multiple scattering treatment). All of these components and their assumptions and limitations will be discussed in detail. The forward model provides the accuracy appropriate for our retrieval method. The errors are comparable to the instrument noise for cloud optical thicknesses greater than 10. At optical thicknesses less than 10 modeling errors become more significant. The retrieval method is then presented describing optimal estimation in general, the nonlinear inversion method employed, measurement and a priori inputs, the propagation of input uncertainties and the calculation of subsidiary quantities that are derived from the retrieval results. An evaluation of the retrieval was performed using measurements simulated with noise levels appropriate for the MODIS instrument. Results show errors less than 10 % for cloud optical thicknesses greater than 10. Results for clouds of optical thicknesses less than 10 have errors up to 20 %.


Abstract. The Community Cloud retrieval for Climate (CC4CL) is a cloud property retrieval system for satellite-based multispectral imagers and is an important component of the Cloud Climate Change Initiative (Cloud_cci) project. In this paper we discuss the optimal estimation retrieval of cloud optical thickness, effective radius and cloud top pressure based on the Optimal Retrieval of Aerosol and Cloud (ORAC) algorithm. Key to this method is the forward model, which includes the clear-sky model, the liquid water and ice cloud models, the surface model including a bidirectional reflectance distribution function (BRDF), and the "fast" radiative transfer solution (which includes a multiple scattering treatment). All of these components and their assumptions and limitations will be discussed in detail. The forward model provides the accuracy appropriate for our retrieval method. The errors are comparable to the instrument noise for cloud optical thicknesses greater than 10. At optical thicknesses less than 10 modeling errors become more significant. The retrieval method is then presented describing optimal estimation in general, the nonlinear inversion method employed, measurement and a priori inputs, the propagation of input uncertainties and the calculation of subsidiary quantities that are derived from the retrieval results. An evaluation of the retrieval was performed using measurements simulated with noise levels appropriate for the MODIS instrument. Results show errors less than 10 % for cloud optical thicknesses greater than 10. Results for clouds of optical thicknesses less than 10 have errors up to 20 %.

Introduction
Remote sensing of clouds from satellites is vitally important for advancing our understanding of the Earth and its climate. Essential cloud parameters to retrieve are optical thickness, particle size and cloud top pressure. These parameters are critical for determining the liquid and ice water content of clouds and for evaluating both radiative and latent heating rates. Methods used to retrieve cloud properties from radiometric measurements made from satellite-based sensors abound. These methods differ in the types of measurements used, the assumptions made and the way the forward problem (the simulation of measurements given cloud properties) is inverted to obtain an estimate of cloud properties from the measurements.
The theoretical basis for retrieving cloud optical thickness and particle size from solar reflectance measurements has been discussed by several authors (Hansen and Pollack, 1970;King, 1997) while measurement studies with airborne sensors have also been presented (Twomey and Cocks, 1982;Foot, 1988Foot, , 1998Rawlins and Foot, 1990). The re-trieval relies on the fact that the reflectance from clouds at non-absorbing visible wavelengths (0.63 and 0.86 µm) is primarily a function of optical thickness, whereas in the nearinfrared (1.6, 2.1 and 3.7 µm) both liquid water and ice particle absorption become significant so that the reflectance is primarily a function of particle size. Essentially, the problem involves the solution of two nonlinear equations for two unknowns. Nakajima and King (1990) formally outlined the theory as it is applied in many retrievals today. In a companion paper, Nakajima et al. (1991) applied the technique to ER-2-based measurements made with a multispectral cloud radiometer at 0.754 and 2.160 µm observing cloud along the same ground track as aircraft-based in situ measurements. The encouraging results were followed by similar retrievals of optical thickness and effective radius using AVHRR (Nakajima and Nakajma, 1995;Ou et al., 1999), MODIS (Rolland et al., 2000;Platnick et al., 2003Platnick et al., , 2017King et al., 2010), VIIRS , GOES (Walther et al., 2013) and SEVIRI (Roebeling et al., 2006). Research has also addressed the choice of near-infrared channels with several aspects to consider, including stronger particle absorption at 3.7 µm and therefore a stronger function of reflectance to particle size compared to 1.6 and 2.1 µm, deeper photon penetration at 1.6 and 2.1 µm compared to 3.7 µm (Nakajima and King, 1990;Platnick, 2000) and the need to account for both solar and thermal components at 3.7 µm.
At night the retrieval of optical thickness and effective radius is more difficult. Past studies have focused on a splitwindow method (Inoue, 1985;Prabhakara et al., 1988;Parol et al., 1991) which relies on the radiative difference of cloud particles at two wavelengths in the infrared window region (8.5-12 µm). Unfortunately, this method is heavily dependent on a priori knowledge of atmospheric temperature and the cloud boundaries, both of which affect the cloud temperature. Studies have shown that explicitly including cloud boundary information from lidar or radar significantly improves retrieval uncertainties (Miller et al., 2000;Cooper et al., 2003) but, of course, this is limited to observations coincident with these active sensors.
The retrieval of cloud top pressure typically relies on matching thermal emission from a cloud in the 11 µm window to a vertical location in a known temperature profile. Without accounting for cloud transparency, this method produces heights that are biased low for semi-transparent clouds due to the contribution of thermal emission from below the cloud. Including additional thermal channels allows for the possibility of retrieving information on cloud transparency from which it is possible to separate the cloud from the below-cloud signal. In particular, the 3.7 µm channel paired with 11 µm has been used to retrieve cloud top pressure along with a single microphysical parameter that describes the cloud radiative thickness in the infrared, typically referred to as the "effective emissivity" (Szejwach, 1982;Wu, 1987;Liou et al., 1990;Ou et al., 1995). In a similar man-ner,  have shown that the relative interdependence of optical thickness and effective radius in the 3.7 and 11.0 µm combination allows these to be retrieved along with cloud top pressure, although with a significantly greater uncertainty than using solar wavelengths during the day. CO 2slicing techniques use multiple wavelengths in the 15 µm CO 2 band with increasing absorption and therefore increasingly higher peaking weighting functions providing sensitivity to semi-transparent clouds at a range of heights from the middle to upper troposphere (McCleese and Wilson, 1976;Menzel et al., 1992;Wylie and Menzel, 1999). It is possible to use CO 2 -slicing results to estimate the cloud top pressure for the thermal methods discussed above to improve the uncertainty in the cloud radiating temperature and the retrieved optical thickness and effective radius (Cooper et al., 2003). Use of CO 2 slicing is limited to instruments that have at least one 15 µm CO 2 band, such as MODIS (Menzel et al., 2008) and SEVIRI (Hamann et al., 2014), and is therefore not possible with the long heritage of AVHRR.
Some other retrievals using both solar and thermal channels to obtain information on both optical thickness and/or microphysics and cloud top pressure have been presented. In cases where a near-infrared channel is not available, optical thickness and cloud top pressure may be retrieved with a solar and thermal channel (Rossow and Schiffer, 1991;Minnis et al., 1990Minnis et al., , 1993. Alternatively, the radiative thickness can be represented in a "cloud amount" effectively accounting for semi-transparent clouds to obtain a better cloud top pressure (Shenk and Curran, 1973;Reynolds and Vonder Haar, 1977). Finally, Arking and Childs (1985) combined solar (visible and near-infrared) and thermal measurements to estimate optical thickness, a microphysical parameter and cloud top pressure.
The retrieval techniques discussed so far suffer from several drawbacks. First, most of them are separated into solar and thermal methods even though the measurements in these spectral regions are not independent of parameters retrieved in the other. As a result, not all of the available information may be used, i.e., solar information on the thermal optical thickness of semi-transparent clouds and thermal information on particle size. Although some methods discussed above use both solar and thermal channels, their usage is not simultaneous and, therefore, information shared between the different wavelengths may not be optimally used. In addition, the resulting retrievals may not be radiatively consistent with each other. As a consequence, forward modeling using the solar retrieved optical thickness and effective radius for a cloud with its top placed at the thermally retrieved cloud top pressure may produce simulated radiances that are significantly different than the observed radiances. This inconsistency could have significant impacts on broadband flux computations for radiation studies. Finally, except in some specific cases, these methods tend to lack a formal characterization of their uncertainties which incorporates measurement noise and the uncertainty of assumed parameters.
The optimal estimation approach to inverse problems is a statistical inversion method based on Bayes' theorem. Application to atmospheric retrievals was presented by Rodgers (1976) and Marks and Rodgers (1993) and formally outlined by Rodgers (2000). Although the method was originally applied to atmospheric temperature and composition retrievals, application to other atmospheric constituents such as clouds began in the late 1990s and has grown steadily in the past 20 years. In optimal estimation the parameters to be retrieved are inputs into a forward model that produces simulated measurements. These inputs are optimized to obtain the best match between the real and simulated measurements, while being constrained by a priori knowledge of the state. Optimal estimation offers several advantages over more traditional retrieval algorithms: -It is able to use any number of channels, where the independent information provided by each channel contributes to the retrieval, maximizing the use of the available information, whereas traditional methods are usually limited to a few preselected channels.
-The parameters are retrieved simultaneously, providing a retrieval that is radiatively consistent over the wavelengths of the measurements, provided that the noise characteristics of the instruments are well known.
-It is able to easily incorporate measurements from multiple sensors for synergistic retrieval algorithms, i.e., passive and active measurements.
-The same framework may be applied to the retrieval of different parameters such as aerosol and cloud, providing more consistency in aerosol-cloud interaction studies, for example.
-A priori information can be explicitly included in the retrieval in a way that is consistent with the measurements. This a priori information can be thought of as virtual measurements that help constrain the retrieval.
-It provides a rigorous characterization of the retrieval uncertainties, including propagation of measurement noise, the uncertainty of assumed parameters and the uncertainty in the forward model.
-It provides a framework to objectively evaluate the information content of the measurements in a way that is consistent across different measurement sources.
A look at the cloud retrieval literature reveals an increasing usage of optimal estimation including application to AVHRR (Heidinger, 2003;Heidinger and Pavolonis, 2009;Walther and Heidinger, 2012), MODIS (Cooper et al., 2007), ATSR (Poulsen et al., 2012), GOES (Miller et al., 2001;Walther et al., 2013) and SEVIRI (Watts et al., 2011). There have also been several studies using data from multiple instruments synergistically including passive solar and/or thermal observations combined with cloud profiling radar Cooper et al., 2003). Finally, objective information content analyses for cloud retrievals from satellite imagers, using visible, near-infrared and thermal infrared channels simultaneously, have been presented Cooper et al., 2006), including a technique for determining the optimal set of channels using the optimal estimation framework.
This paper is Part 2 of two papers describing the Community Cloud retrieval for Climate (CC4CL) retrieval system in the context of the Cloud_cci (Hollmann et al., 2013;Stengel et al., 2017) project. Part 1 (Sus et al., 2018) describes the system as a whole including a description of the supported satellite imagers, ancillary input details, cloud detection and classification, the gridding over space and time, the application of CC4CL to selected scenes and the validation of those scenes against Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) products. Our intention with Part 2 is to describe the details of the optimal estimation retrieval including the forward model and the inversion technique based on the Optimal Retrieval of Aerosol and Cloud (ORAC) algorithm (Thomas et al., 2009;Poulsen et al., 2012). Taking advantage of the benefits of an optimal estimation framework, ORAC retrieves aerosol, cloud and volcanic ash parameters and supports measurements from several different sensors. The focus of this paper will be on the retrieval of cloud parameters given a set of satellite imager measurements that are comparatively consistent with the AVHRR heritage channels centered at 0.615 (AVHRR-2) or 0.630 (AVHRR-3), 0.862, 1.61 or 3.74, 10.8 and 12.0 µm. Note that for Cloud_cci CC4CL uses this channel configuration even for instruments with additional channels (see Part 1 for supported instruments) to produce a consistent time series across sensors, although ORAC, and the CC4CL system in itself, is flexible enough to use any number of imager channels from the visible to the infrared. Finally, a summary of all datasets generated in Cloud_cci using CC4CL is given in Stengel et al. (2017). Section 2 will briefly clarify this, listing the required parameters. Section 3 will present our forward model, discuss assumptions and limitations, and provide a validation using a more advanced, albeit much slower, reference model. Section 4 presents our retrieval method in general and discusses specific details of the use of this method and some additional products derived from the results, while Sect. 5 presents a theoretical study of the performance of the retrieval. Finally, some concluding remarks are given in Sect. 6.

Data
The method described in this paper requires satellite imager measurements and several ancillary quantities, all of which are prepared for input in a preprocessing stage described in detail in Part 1. We will only briefly summarize here what is required. The measurements include, for each pixel, reflectance for the visible and near-infrared wavelengths and the brightness temperature for the thermal wavelengths. The method also requires the corresponding pixel geolocation (latitude and longitude) and solar and instrument geometry (solar zenith angle, satellite zenith angle and relative azimuth angle, defined as the shortest absolute difference between the solar and satellite azimuth angles). Several ancillary quantities are also required. These include meteorological profiles of pressure, temperature, water vapor and ozone, as well as surface reflectance and emissivity characteristics. In addition to the measurements and ancillary quantities, an estimate of their uncertainty characteristics is also required for an accurate estimate of the uncertainty of the retrieved quantities. Preprocessing is also responsible for cloud masking and classification and the retrieval methodology described in this paper will assume the properties of liquid water or ice cloud based on the cloud classification.

Forward model
The forward model contains the physics that simulates radiances as observed by a satellite instrument at the top of the atmosphere (TOA) given both retrieval and assumed model parameters. In addition, derivatives of the TOA radiances with respect to the retrieval parameters must also be computed. The forward model can be thought of as consisting of several component models and a radiative transfer solution that computes the radiances and associated derivatives given the outputs of the component models and solar and instrument geometry. The component models are a clear-sky model including molecular (Rayleigh) scattering, absorption and emission; a cloud layer model including cloud particle scattering, absorption and emission; a surface reflectance model incorporating ocean and land surface bidirectional reflectance distribution functions (BRDFs).
It is important to develop a forward model that accounts for the physics to the desired accuracy of the retrieval but is also computationally efficient enough to be used for largescale data processing. The ORAC forward model is numerically efficient by making use of both an offline component and an online component. The offline component handles the expensive particle scattering computations and the multiple scattering radiative transfer computations, the results of which are then used to produce look-up tables (LUTs) of fundamental radiative operators. These LUTs are then used in the online component, along with simple arithmetic expressions, to compute the "fast" radiative transfer solution.

Clear-sky model
Our model assumes a plane-parallel atmosphere with the levels defined by the ancillary meteorological input profiles discussed in Part 1. The required meteorological inputs are pressure, height, temperature, specific humidity and ozone mixing ratio. The surface is taken to be at the bottom level.
To account for the effects of molecular absorption and emission in a clear-sky atmosphere, transmittances and thermal radiance profiles are required. Specifically, the required transmittances include two profiles, one for transmittance from each level to TOA T ac (p) and one for transmittance from each level to the surface T bc (p) , both for a satellite slant path, where p is the atmospheric pressure at a particular level. The required thermal radiance profiles include upwelling radiance, which is the total radiance emanating from the layers between the corresponding level and TOA L ↑ ac (p) ; downwelling radiance, which is the total radiance emanating from the layers between the corresponding level and the surface L ↓ ac (p) ; and upwelling radiance, where for each level in the profile is the total radiance from the layers from the surface to that level L ↑ bc (p) . The transmittances and thermal radiance profiles are computed with the Radiative Transfer for TOVS (RTTOV) model (Saunders et al., 1999;Hocking et al., 2014) version 12.1. RTTOV is a "fast" radiative transfer model for downwardviewing passive visible, infrared and microwave satellite radiometers, spectrometers and interferometers. RTTOV's methodology is based on linear regression of line-by-line computations from LBLRTM version 12.2 (Clough et al., 2005) and the AER molecular database version 3.2. An extensive set of regression predictors is used based on the molecule type with pressure, temperature, specific humidity, ozone mixing ratio and slant path angle as variables. In ORAC, water vapor and ozone are the only variable gases with RTTOV treating effects from other gases using climatology. Additional input parameters include the surface emissivity and the surface skin temperature discussed in Part 1.
In our case RTTOV actually only provides the transmittance T ac (p) and thermal radiance profiles L ↑ ac (p) and L ↓ ac (p) but T bc (p) and L ↑ bc (p) can be computed from RT-TOV output with and respectively, where T * is the transmittance from the surface to TOA for a satellite slant path and L TOA is the clear-sky TOA radiance.
Since the clear-sky transmittance and emission profiles are independent of cloud, their computation is considered a preprocessing task performed only once in the preprocessing phase discussed in Part 1. The effect of the satellite zenith angle is removed from the transmittance profiles with where T (θ ) is transmittance for an air mass factor 1/ cos θ and θ v is the mean satellite zenith angle within a grid box. An air mass factor is then reapplied to the transmittances on a pixel-by-pixel basis in the RT solution discussed in Sect. 3.6 using the pixel's solar or satellite zenith angles as appropriate. Equation (3) is an approximation as satellite zenith angle is one of the linear regression predictors, although it has been shown that the effect is minimal compared to other sources of model uncertainty.
In addition to the molecular absorption and emission effects, RTTOV includes an extinction component due to molecular (Rayleigh) scattering in the transmittance calculations. In ORAC this effect must be removed because, as will be discussed in Sect. 3.4, Rayleigh scattering is incorporated into the multiple scattering treatment of the cloud layer. The corrected transmittance T corr is given by where T Ray = e −τ Ray (λ,p,θ v ) , the pressure-and pathdependent Rayleigh scattering optical thickness and the Rayleigh optical thickness of the entire atmosphere at nadir is (Hansen and Travis, 1974) τ 0 Ray (λ) = 0.008569λ −4 (1 + 0.0113λ −2 + 0.00013λ −4 ). (6) For each layer bounded by upper and lower pressure levels p u and p l , respectively, the Rayleigh scattering optical thickness τ Ray (λ, p l , p u ) that we apply (Sect. 3.4) is computed according to Justus and Paris (1985) with where the Rayleigh optical thickness from TOA to the pressure level p is and the Rayleigh optical thickness of the entire atmosphere is where p 0 = 1013.25 hPa, p s is the surface pressure in hPa and λ is in µm. The Rayleigh single-scattering albedo is implicitly set to unity and the single-scattering phase function P Ray (λ, ) is computed with P Ray (λ, ) = 3 4 1 + cos 2 where the depolarization factor δ is taken to be constant at 0.0279 from Young (1981). Note that ORAC does not take the variation of surface pressure due to terrain height or meteorology into account. Combined with the lack of polarization in the radiative transfer calculations, this means that ORAC is not currently suitable for use with instrument channels in the blue or ultraviolet, where the Rayleigh signal is much stronger and will vary significantly with terrain height. The effects due to polarization at these small wavelengths would require a full vector radiative transfer solution.

Cloud layer model
The cloud layer model accounts for particle scattering, absorption and emission effects and is parameterized in terms of particle type (liquid water droplets or ice crystals) and the following retrieved quantities: cloud optical thickness at 0.55 µm τ 0.55,c , effective radius of the cloud particle size distribution r e,c and cloud top pressure p c .
The cloud layer model is assumed to consist of a single layer containing only a single particle type. The cloud is assumed to be geometrically infinitely thin and planeparallel and is linearly interpolated into the plane-parallel atmospheric model at a given cloud top pressure p c , producing values of pressure, height and temperature for the cloud at p c . In addition, transmittances T ac (p c ) and T bc (p c ) and thermal radiance quantities L ↑ ac (p c ), L ↓ ac (p c ) and L ↑ bc (p c ) are also interpolated from their respective profiles discussed in Sect. 3.1. An infinitely thin cloud model allows for a significant performance gain as the effects of the atmospheric gases are separated from that of the cloud. This is an important part of the "fast" radiative transfer solution discussed in Sect. 3.6.
Optical thickness τ (λ) is defined as where z is the vertical depth into the cloud, H is the geometric thickness of the cloud, λ is wavelength and β e (λ) is the extinction coefficient defined as the sum of the scattering and absorption components, β s (λ) and β a (λ), respectively. The extinction coefficient β e (λ) may be related to the extinction cross section σ e (λ, r) and efficiency Q e (λ, r) by where r is, in general, the sphere equivalent particle radius, n(r) is the particle size distribution in terms of number density, and σ e (λ) = σ s (λ) + σ a (λ) and Q e (λ) = Q s (λ) + Q a (λ) are sums of their scattering and absorption components, respectively.
If we introduce the normalized size distributionn(r) = n(r)/N , where then Eq. (12) may be written as whereσ e (λ) is the normalized ensemble-averaged extinction cross section. This value is dependent on the shape of the size distribution but independent of the density of particles and represents the spectrally dependent microphysical influence on the cloud optical thickness. Since the cloud is parameterized according to the cloud optical thickness at 0.55 µm but forward model simulations must be performed in each wavelength channel the required optical thickness τ (λ) in these channels must be determined from the spectral variation defined by The effective radius r e is defined as the ratio of the third and second moments of the particle size distribution: while the effective variance of the distribution is defined as v e = ∞ 0 (r − r e ) 2 r 2 n(r) dr ∞ 0 r 2 n(r) dr .
(17) Hansen and Travis (1974) and Mishchenko and Travis (1994) show that different size distributions that have the same effective radius and effective variance have similar scattering and absorption properties. Thus, by parameterizing the size distribution in terms of effective radius we reduce the dependency on the details of the size distribution depending mainly on the assumption of the distribution's width. It is from these relationships and further assumptions based on particle type that the three fundamental radiative transfer quantities required for the multiple scattering computations discussed in Sect. 3.4 are parameterized: the total optical thickness τ (λ), the ensemble-averaged singlescattering albedo ω(λ) and the ensemble-averaged singlescattering phase matrix F(λ, ). It is through these parameterizations that the radiative transfer quantities depend on particle concentration and size distribution.
The single-scattering albedo ω(λ) is the proportion of radiation incident on a particle that is scattered rather than absorbed and is given as .
The (4 × 4) single-scattering phase matrix F(λ, ) describes the angular distribution and state of polarization of scattered radiation given the state of polarization of the incident radiation, where the radiation is described by the fourelement Stokes vector I = [I, Q, U, V ]. This form of the phase matrix, depending only on the angle between the incident and scattering directions , is valid for mediums composed of randomly oriented particles. The ORAC forward model makes the so-called "scalar approximation" where only the intensity I is considered. In this case only the (1,1) element of the phase matrix is required, which is referred to as the single-scattering phase function P (λ, ) with the following normalization condition: Likeσ e (λ), both ω(λ) and P (λ, ) are size distribution normalized so they are independent of the density of particles. The equations presented so far in this section are independent of particle type and shape and are valid for all mediums of randomly oriented particles. It is the source of the normalized scattering and absorption cross sectionsσ s (λ, r) andσ a (λ, r) and the phase function P (λ, ) that varies with particle type. They are computed with physical models such as Mie theory (Mie, 1908) or T-matrix theory (Mishchenko et al., 2002) or, in the case of ice crystals, geometric optics (Liou, 2002).

Liquid water droplets
For liquid water droplets the Mie theory code implementation presented by Grainger et al. (2004) is used for the computation of Q s [λ, r, m(λ)], Q a [λ, r, m(λ)] and P [λ, r, m(λ), ], where m(λ) = m r (λ) + im i (λ) is the refractive index of the particle composition. It is convenient to integrate the computations across an analytical size distribution described by a limited number of parameters for which we use the modified gamma distribution given by Deirmendjian (1969). The distribution, in terms of number density as a function of radius n(r), is given as where the parameters a, α, b and γ are positive, α is an integer and the mode r m of the distribution occurs when r = α bγ 1/γ . In ORAC a = 2.373, α = 6, b = α/r m and γ = 1.
Atmos. Meas. Tech., 11, 3397-3431, 2018 www.atmos-meas-tech.net/11/3397/2018/ As discussed, the size distribution is parameterized in terms of effective radius, which is related to the modifiedgamma distribution of Eq. (20) by while the effective variance is related to the modified-gamma distribution by For the complex index of refraction for liquid water, values were taken from Hale and Querry (1973) for wavelengths in the range 0.25 ≤ λ ≤ 0.69 µm, Palmer and Williams (1974) for 0.69 < λ ≤ 2.0 µm and Downing and Williams (1975) for λ > 2.0 µm.

Ice crystals
Ice crystal single-scattering property models provided by Baum et al. (2011Baum et al. ( , 2014 are used, which are based on in situ cloud microphysical measurements and single-scattering calculations. The in situ measurements provide information on particle size distributions, ice water content and the median mass diameter (Heymsfield et al., 2013). The singlescattering properties are provided by Yang et al. (2013) and are based on a combination of the Amsterdam discrete dipole approximation, the T-matrix method and the improved geometric-optics method. They are provided for each habit at 189 discrete sizes between 2 and 10 000 µm and for 445 discrete wavelengths ranging from 0.2 to 100 µm. The bulk single-scattering properties are computed for each wavelength by integrating over particle size distribution and the nine ice particle habits to produce size distribution averaged properties for a "general habit mixture". These are made available as a function of effective radius in 23 bins from 5 to 60 µm. It has been shown that ice particle roughening significantly impacts the single-scattering phase function (Baum et al., 2010) and that the general habit mixture with severe roughening provides the closest comparison with polarization measurements . These are the models used in the ORAC retrieval.
ORAC currently accepts ice crystal scattering properties up to an effective radius of 92 µm. The ice crystal properties are only provided up to 60 µm and therefore linear extrapolation is used up to 92 µm. The error incurred in the approximation is minimal as the variation in the properties tends to flatten as effective radius increases and cloud ice crystal effective radii greater than 60 µm are rare.

Surface reflectance model
The surface is characterized by a BRDF which is computed differently for ocean and land surface. The BRDF over ocean is computed using the methodology outlined by Sayer et al. (2010), which includes three components: where ρ sg is the sunglint off wave facets (Cox and Munk, 1954), ρ wc is the reflectance from surface foam, so-called "white caps" (Koepke, 1984), and ρ ul is the scattering from within the water, so-called "underlight" (Morel and Prieur, 1977). The required physical parameters include the horizontal wind vector u and v (m s −1 ), obtained from the meteorological input, to determine wave statistics and white-cap coverage, as well as chlorophyll and dissolved organic matter concentration C (mg m −3 ), a globally averaged value obtained from climatology. The BRDF over land is a weighted sum of an isotropic kernel (unity) and two BRDF kernels (Wanner et al., 1997), both dependent on solar and satellite geometry only: where K vol (θ 0 , θ v , φ) is known as the Ross-thick kernel which parameterizes volumetric scattering and K geo (θ 0 , θ v , φ) is the Li-sparse kernel which parameterizes geometric shadowing. Although our method is independent of the source of the weights f (λ), in the current CC4CL implementation the weights are provided by the 0.05 • MODIS MCD43C1 BRDF ancillary input discussed along with references in Part 1. Over snow and ice the surface reflectance is assumed to be Lambertian with reflectance values taken from the ASTER Spectral Library Version 2.0 (Baldridge et al., 2009). This reflectance is combined with the reflectance for either ocean or land as described above based on the fraction of snow/ice provided by the preprocessing stage described in Part 1.

Reflectance and transmission operators
The next step in the forward model is the computation of reflectance, transmission and emissivity operators which are used in the "fast" RT solution described in Sect. 3.6. This computation involves the solution to the radiative transfer equation (RTE) for monochromatic radiation through a single plane-parallel homogeneous layer given as where L(λ, τ, µ, φ) is the radiance along the direction specified by the cosine of the polar angle µ and the azimuthal angle φ at optical depth τ (λ) measured perpendicular to the surface of the medium. The second term on the right-hand side is the source function given by where P (λ, µ, φ, µ , φ ) is the single-scattering phase function for radiation from the direction (µ , φ ) scattered into the direction (µ, φ) and ω(λ) is the single-scattering albedo. The first term on right-hand side of Eq. (26) represents the source from multiple scattering. The second term is the solar single-scattering source given by where E 0 (λ) is the TOA incident solar irradiance, and µ 0 and φ 0 are the solar zenith and azimuth angles, respectively. The third term on the right-hand side of Eq. (26) is the thermal emission source given by where B(λ, T ) is the Planck black body function and T is the average layer temperature. For performance reasons, the operators are precomputed and stored in an LUT from which the values for an arbitrary set of geometric and optical parameters may be linearly interpolated. The LUTs are computed with the DIscrete Ordinates Radiative Transfer (DISORT) software package (Stamnes et al., 1988). This step, although slow, is performed offline and the resulting LUTs are static. DISORT is a general purpose plane-parallel RTE solver for monochromatic radiation with support for multiple layers, absorption and multiple scattering, and solar and thermal emission sources. It implements the delta-M correction of Wiscombe (1977) for strongly asymmetric phase functions and the TMS singlescattering correction of Nakajima and Tanaka (1988).
DISORT still makes approximations, which can limit its accuracy in certain circumstances. The most important of these are as follows: -It assumes a plane-parallel atmosphere and so does not account for the curvature of the Earth. This is important at solar and satellite zenith angles greater than approximately 75 • .
-It is a one-dimensional model and so cannot reproduce the effects of horizontal gradients in the scattering medium. This is important where strong gradients exist, such as near cloud edges and/or in broken cloud fields, but this limitation is not relevant as we treat each pixel independently.
-It does not model polarization effects and so cannot be used to model measurements made by instruments which are sensitive to polarization. In addition, this socalled "scalar approximation" does not take into account polarization introduced into the diffuse component of radiance by Rayleigh scattering and/or the surface, and the subsequent depolarization effect of particles.
To compute the operators DISORT must be provided with solar and instrument geometry and the optical thickness, single-scattering albedo and phase function for each layer. In addition to particle effects, the LUTs account for Rayleigh scattering and therefore, even though the operators are for a single homogeneous cloud, the computation is performed for an entire atmospheric profile. For this we use the midlatitude summer profile provided by McClatchey et al. (1972) to compute the Rayleigh scattering optical properties in each layer as described in Sect. 3.1. In order to account for multiple scattering between molecules and cloud particles we give the cloud layer a physical thickness of 1 km and place it at a top pressure of 560 hPa. In this layer, the optical properties for Rayleigh scattering and those of the cloud must be combined by (assuming dependence on λ) where the Rayleigh single-scattering albedo is equal to unity. The computations are performed at optical thickness, effective radius, solar zenith angle, satellite zenith angle and relative azimuth angle vertices defined by the dimensions in Tables 1 and 2 for liquid water and ice cloud, respectively. The reflectance and transmission operators represent the transfer of either direct beam or diffuse incoming radiation resulting in either direct beam or diffuse outgoing radiation. They are computed separately for both direct beam and diffuse incoming sources with results for both direct beam or

CC4CL FM operators
Beam solar Diffuse solar Beam thermal Diffuse thermal diffuse outgoing radiation being produced simultaneously. In addition, an operator representing the emissivity of the cloud is produced by including a thermal source in the cloud layer. A total of seven operators are required (assuming dependence on τ 0.55,c and r e,c ): is the downward diffuse reflectance from the cloud, as viewed from a specific direction.
3. R dd (λ) is the bihemispherical reflectance of the cloud.
4. ε(λ, θ v ) is the emissivity of the cloud, as viewed from a specific direction.

T
↓ bb (λ, θ 0 ) is the downward direct transmission of the cloud of the direct solar beam.
is the downward diffuse transmission of the cloud, as illuminated by the direct solar beam.
is the upward diffuse transmission of the cloud, as viewed from a specific direction.
A schematic diagram of the operators and their interaction with the surface is presented in Fig. 1. The symbol ↓ denotes transmission from the top to the bottom of the atmosphere, while ↑ indicates the reverse. Dependence on the solar zenith, satellite zenith and relative azimuth angles is denoted by θ 0 , θ v and φ, respectively. The pairs of b and d subscripts denote the type of radiation each term operates on and produces; for example, T ↓ bd (λ, θ 0 ) operates on the direct beam (b) of solar radiation and produces the diffuse radiation (d) that results at the bottom of the atmosphere. Note that can be used for each of these for a total of seven LUTs.
The operators are computed for each channel across the channel's spectral interval and then convolved with the channel's instrument response function with where X is either reflectance R, transmittance T or emissivity ε; SRF i is the spectral response function for channel i; and λ 1 and λ 2 define the channel's spectral interval.
The inclusion of Rayleigh scattering effects in the reflectance and transmission operators requires some approximation. This is because the Rayleigh scattering parameters included are defined for each individual atmospheric layer whereas the cloud layer is added to a single layer. This cloud layer must be placed at a fixed top pressure within the atmosphere (560 hPa) when producing the LUTs, since the cloud top pressure is not an LUT variable. This is an approximation since Rayleigh scattering effects are pressure dependent and therefore vary with height. This means that when the retrieval places the particle layer higher than 560 hPa, the Rayleigh scattering effects will be slightly overestimated, whereas if it is placed lower than 560 hPa, the Rayleigh scattering will be slightly underestimated. We have investigated the effects of this and have determined that relative to other sources of error these effects are small at the wavelengths used in this work but if smaller wavelength channels were to be incorporated (less than approximately 6 µm) the Rayleigh scattering pressure dependence would probably have to be considered in more detail. Rayleigh scattering could of course be included in the clear-sky transmittances but this would be an extinction only effect and would not account for multiple scattering between molecules and between molecules and particles.

Surface reflectance operators
Interaction with the surface is parameterized by four reflectance operators (assuming dependence on the appropriate BRDF kernel parameters discussed in Sect. 3.3): 1. ρ bb (λ, θ 0 , θ v , φ) is the bidirectional reflectance. This is the reflectance of the surface to direct beam illumination at θ 0 , as viewed from a specific direction θ v . It is the reflectance that would be observed by a satellite instrument in the absence of an atmosphere.
2. ρ bd (λ, θ 0 ) is the directional-hemispherical reflectance. This is the fraction of incoming direct beam illumination at θ 0 that is reflected across all zenith angles. This is also referred to as the "black-sky albedo".
This is the reflectance of the surface to purely diffuse illumination, as viewed from a specific direction θ v .
4. ρ dd (λ) is the bihemispherical reflectance. This is the reflectance of the surface to purely diffuse illumination, across all viewing directions. This is also referred to as the "white-sky albedo".
A schematic diagram of the operators and their interaction with a cloud is presented in Fig. 1, while a more detailed diagram of the operators themselves is given in Fig. 2. The first term ρ bb (λ, θ 0 , θ v , φ) is computed directly from the BRDF over land or ocean, where over ocean or land, respectively. The three other terms are derived from the BRDF integrated over solar and/or satellite geometry written as (assuming dependence on λ) and Note that ρ db (λ, θ 0 ) ≡ ρ bd (λ, θ v ) when θ 0 = θ v ; that is, they are both hemispherical reflectance, just seen from different zenith angles.

The "fast" radiative transfer solution
The "fast" radiative transfer solution uses the cloud reflectance, cloud transmission and surface reflectance operators along with clear-sky transmittances to simulate the measurements as observed by a satellite sensor at TOA. Shortwave solar reflectance computations are separated from the longwave thermal infrared brightness computations. Although not strictly required, this separation results in an efficient implementation as components such as surface reflectance or cloud top emission are specific to solar and thermal wavelengths, respectively. At thermal wavelengths, where there is a significant solar contribution, separate calculations are performed and the resulting solar reflectance is converted to brightness temperature and added to the resulting thermal brightness temperature computation.

Solar reflectance
Using the reflectance and transmission operators described in Sect. 3.4, the surface reflectance operators described in Sect. 3.5, and neglecting molecular absorption, the observed reflectance of the cloud-surface system can be written as (assuming channel dependence) Schematic diagrams of each of the four ORAC surface reflectance operators. Blue represents incident radiation and red represents reflected radiation. The long arrows represent beam transport and the hemispheres with short arrows represent diffuse transport.
Here we have four terms resulting from a single surface reflection in Eq. (36), which can be described as follows: is the directly transmitted solar beam that is reflected off the surface into the viewing direction of the satellite and directly transmitted back through the atmosphere.
is the directly transmitted solar beam that is diffusely reflected off the surface and diffusely transmitted into the viewing direction of the satellite.
is the diffusely transmitted solar beam that is reflected off the surface into the viewing direction of the satellite and directly transmitted back through the atmosphere.
is the diffusely transmitted solar beam that is diffusely reflected off the surface and diffusely transmitted into the viewing direction of the satellite.
The terms following on from these describe the rapidly diminishing series of multiple reflections between the surface and overlaying atmosphere. For these terms the assumption has been made that the surface-cloud pair are essentially Lambertian reflectors so that only the bihemispherical reflectance of the cloud is required in the series. This is equivalent to saying that neglecting directly transmitted solar radiation, the sky is equally bright in all directions.
By gathering terms, Eq. (36) can be simplified to give This can then be further simplified, using the appropriate series limit, to give Finally, the reflectance at the top of the cloud (TOC), including molecular absorption below the cloud, is obtained by scaling the terms in Eq. (38) by the appropriate below-cloud clear-sky transmittance terms, subscripted with "bc": and the reflectance at TOA including molecular absorption above the cloud is obtained by scaling R TOC (θ 0 , θ v , φ) by the appropriate above-cloud clear-sky transmittance terms, subscripted with "ac": where T ac (θ ) = T ac (0, p c ) secθ and T bc (θ ) = T bc (0, p c ) secθ , and T ac (0, p c ) and T bc (0, p c ) are above-cloud and belowcloud nadir transmittances, respectively, interpolated from transmittance profiles T ac (0, p) and T bc (0, p), respectively, obtained from RTTOV as described in Sect. 3.1. For diffuse radiation the nadir transmittances are scaled in a manner appropriate for a isotropic radiation field. Consider an isotropic radiation field L(θ, φ) = L incident on a layer of optical depth τ . The incident irradiance is and the transmitted irradiance neglecting Rayleigh scattering (accounted for in the cloud operators) is One can equate this to transmission through a scaled optical depth τ * so π Le −τ * = 2π 0 π/2 0 L(θ, φ)e −τ/ cos θ sin θ cos θ dθ dφ, 3.6.2 Thermal brightness temperature The observed TOA brightness temperature is given by (assuming channel dependence) where L ac (θ v ) is the upward radiance into the viewing direction from the atmosphere above the cloud, L ↓ ac is the downward radiance from the atmosphere above the cloud, L ↑ bc is the upward radiance from the atmosphere below the cloud, B(T c ) is the Planck function as a function of the cloud top temperature T c and ε(θ v ) is the cloud emissivity operator described in Sect. 3.4. The clear-sky radiance terms L are obtained from RTTOV as described in Sect. 3.1. L ↑ bc is computed as a preprocessing task but actually depends on the surface temperature T s , which is a retrieved parameter. Therefore, L ↑ bc must be updated during the retrieval process. For this we approximate L ↑ bc by assuming a linear relationship between T s and L ↑ bc around the a priori base state, i.e., the ancillary surface temperature input, given as where L ↑ bc,a is the upward radiance from the atmosphere below the cloud computed with the a priori surface temperature T s,a .

Derivatives
The gradient of the forward model (∂y/∂x), where y is a radiance measurement in a single channel and x is one of the retrieved parameters, is required for the following two purposes: 1. The gradient with respect to parameters which are to be derived from the measurements (state parameters) is required for the inversion of the nonlinear forward model discussed in Sect. 4.
2. The gradient with respect to parameters which are considered known and are not retrieved, e.g., meteorology, surface reflectance and surface emissivity, is used to judge the sensitivity to these parameters and thus to estimate their contribution to the retrieval uncertainty.
Derivatives of the forward model may be obtained through straightforward linearization of the forward model equations already given and, as a result, the derivations will not be presented here.

Assumptions and limitations
The forward model introduces some assumptions and limitations that contribute to uncertainty and may under certain conditions bias retrieval results. Inaccuracies which result from these assumptions and limitations are termed forward model uncertainty and do not include uncertainties in the input atmospheric and surface parameters (termed retrieval parameter uncertainty) or uncertainties in RTTOV (these are forward model uncertainty, but their evaluation lies outside the scope of this paper). The assumptions and limitations may be grouped into two lists. The first list involves limitations related to instrument resolution and assumptions related to limited information content: -Satellite pixels are assumed to be either completely clear or completely overcast. Retrievals from pixels with subgrid variability, i.e., broken cloudiness, will be biased and therefore unrepresentative of the clouds within the pixel (Zhang et al., 2012).
-Satellite pixels are assumed to be either completely of land or completely of ocean so that the BRDF and emissivity assumptions will be either for land or ocean. Retrievals from pixels with both land and ocean, such as with coastlines, islands and inland waters, will be biased since the BRDF and emissivity will be unrepresentative of at least part of the pixel.
-The liquid water droplet size distribution has an assumed shape and width and only varies in effective radius. Deviations from this assumption will result in biases particularly in optical thickness and effective radius. Although it is theoretically possible to retrieve additional size distribution parameters besides effective radius, the lack of information in typical multispectral image measurements has made this impractical in the current implementation of the forward model.
-Ice crystal scattering properties are computed based on shape (habit) and size. The ice cloud bulk scattering models used in the forward model must assume a size distribution and a mixture of possible habits. These assumptions are based on in-depth analysis of aircraftbased in situ measurements. Deviations from this assumption will result in biases, particularly in optical thickness and effective radius.
The second list involves assumptions and limitations in the radiative transfer solution: -The forward model characterizes the cloud layer with an infinitely thin geometric thickness. Since the peak sensitivity of the thermal channels to the cloud is within the cloud itself, the cloud will be placed at a height below the top of a real cloud with finite geometric thickness.
-The forward model contains only a single cloud layer. Retrievals from pixels with more than one cloud layer, where the upper cloud layer is optically thin (cirrus overlying liquid water cloud), will be the result of radiance contributions from both clouds resulting in a bias away from the properties of either cloud. For example, relative to the cirrus cloud and assuming typical cloud conditions, the optical thickness will be biased high, effective radius will be biased low and the cloud top pressure will be at a level between the cloud layers.
-Each pixel is processed independently, which means that the radiative transport that occurs in each pixel occurs independently of that in the neighboring pixels; i.e., horizontal transport between neighboring pixels is not accounted for. In the literature this is referred to as the independent pixel approximation (Cahalan et al., 1994). It has been shown that the biases incurred can be significant at higher spatial resolutions, whereas on the order of 1 km the subgrid variability mentioned above will usually dominate (Heidinger and Stephens, 2002).
-The forward model is a scalar operator model in which radiation that is incident and reflected/transmitted from/through the cloud layer or reflected from the surface is modeled as one of two types: directional or hemispherical. This is in contrast to multi-stream models that model reflection and transmission with a range of quadrature points over the upward and downward hemispheres. The use of scalar operators is one of the primary reasons the forward model is orders of magnitude faster than a multi-stream model. Note that the operators themselves are computed with a multi-stream model to accurately account for multiple scattering effects within the cloud layer.
-Finally, the assumption of a plane-parallel atmosphere and ignoring polarization effects, both discussed in Sect. 3.4, should also be included in this list.

Validation
In this section we discuss the evaluation of the forward model with respect to a "reference forward model". The focus will be based on the use of scalar operators versus a multi-stream solution while all other aspects of the two models are essentially identical; i.e., they use the same input parameters, both use RTTOV for gas transmittance, both use the same method to compute Rayleigh scattering parameters and both have the same limitations and assumptions listed in the first list of Sect. 3.8. The assumptions of a geometrically infinitely thin cloud and on ignoring polarization effects are common and well explored in the literature. Likewise, the plane-parallel assumption and the assumption of 1-D geometry are inherent in both the fast forward model and the reference model on which numerous studies have been performed (see Heidinger and Stephens (2002) and authors therein). The assumption of a single layer in a multilayer case is covered in a another study performed by the authors (McGarragh et al., 2018). The reference forward model divides the atmosphere into as many layers as the meteorological input contains. Gas transmittance is taken from the clear-sky RTTOV computations. The Rayleigh scattering optical thickness for dry air τ Ray (λ) is computed according to Justus and Paris (1985). The Rayleigh scattering phase matrix F Ray (λ, ) is computed from the depolarization ratio, which may be determined from the King factor for dry air (Peck and Reeder, 1972). Liquid water particle single-scattering properties are computed using the Mie theory code implementation presented by Grainger et al. (2004), assuming the same size distribution parameters used for the fast forward model. The refractive indices for liquid water are taken from Hale and Querry (1973), Palmer and Williams (1974) and Downing and Williams (1975). Ice crystal single-scattering properties are taken from the same source as the fast forward model. A standard discrete ordinate solution is performed accounting for absorption, emission and multiple scattering with solar and thermal sources using DISORT (Stamnes et al., 1988). Unlike the fast forward model, for channels with both a solar and thermal component a single solution is performed. In addition, the solution includes delta-M scaling (Wiscombe, 1977), the TMS single-scattering correction of Nakajima and Tanaka (1988) and the so-called pseudo-spherical approximation in which the solar beam is modeled through a spherical shell (Dahlback and Stamnes, 1991).
The comparisons that follow are presented in the form of 2-D plots of fractional difference given by where x is either reflectance R or brightness temperature L.
The comparisons revolve around a base state where any two parameters in the state are varied in a plot. The base state can be summarized as follows: midlatitude summer temperature, pressure and trace gas profiles provided by McClatchey et al. (1972); all four BRDF operators, ρ bb (λ, θ 0 , θ v , φ), ρ bd (λ, θ 0 ), ρ db (λ, θ v ) and ρ dd (λ), are set to 0.2; surface emissivity s (λ) = 0.8; retrieval parameters τ c and r e,c are set to the a priori values indicated in Tables 5 and 6 for liquid water and ice cloud, respectively; retrieval parameter p c is set to 800 and 245 hPa for liquid water and ice cloud, respectively; retrieval parameter T s is set to 290 • K; The results are given as a function of optical thickness versus effective radius, solar zenith angle, relative azimuth angle and the four BRDF parameters for 0.65 µm and effective radius, cloud top pressure and surface temperature for 3.70 and 11.0 µm. The satellite zenith angle is omitted as the results for it are relatively similar to those for the solar zenith angle. For the BRDF parameters ρ dv and ρ dd are combined as they both account for incident diffuse radiation and the effect of ρ dv is usually relatively small. We provide the minimum and maximum values ( min and max , respectively) in order to maintain a useful scale in the plots. In addition, values equivalent to prelaunch measurement noise requirements for the comparable MODIS channels are also provided as req . MODIS is an instrument , where x is either reflectance R or radiance L, between the reference forward model and the fast forward model for liquid water cloud as a function of optical thickness τ 0.55,c versus effective radius r e,c (µm), solar zenith angle θ 0 , relative azimuth angle φ, bidirectional surface reflectance ρ 0v , directional-hemispherical surface reflectance ρ 0d and a combination of hemispherical-directional and bihemispherical surface reflectance ρ dv + ρ dd for wavelength λ = 0.65 and effective radius r e,c (µm), cloud top pressure p c (hPa) and surface temperature T s (K) for 3.70 and 11.00 µm.
with relatively high accuracy requirements and one of the Cloud_cci instruments. This value by no means should be considered a requirement for the difference between the two models and is only there as a relevant reference. The difference between these models is quite a different value es-pecially since we assume that the reference model, although more accurate than the fast model, surely cannot simulate the measurements exactly. In Fig. 3 fractional differences between the reference and fast forward models are presented for liquid water cloud at 0.65, 3.70 and 11.0 µm. The choice of the wavelengths is sufficient to cover the full range of wavelengths used for our retrieval, i.e., a shortwave solar wavelength where multiple scattering of solar radiation will dominate (0.65 µm), a thermal wavelength dominated by thermal emission and absorption (11.0 µm) and a mixed channel with both a solar and a thermal component (3.70 µm). Note that values in the red direction are due to overestimation by the fast forward model relative to the reference forward model and values in the blue direction are due to underestimation. At 0.65 µm it is apparent that the errors tend to be small, above an optical thickness of approximately 10. Small in this case is considered close to req . One exception is at a solar zenith angle of around 20 • , corresponding to single-scattering angles of around 140 • where there is rapid variation from the first and second rainbows from 120 to 150 • . The other two exceptions are at relative azimuth angles of around 90 • , a singlescattering angle of around 132 • (again within the rainbow region) and at 180 • , where at the base state solar and satellite zenith angles of 35 • corresponds to backscattering. It is also apparent that the differences tend to be low below an optical thickness of approximately 0.1. This indicates that the surface reflection is well characterized in the solution since most of the signal at low optical thicknesses will be from the surface. It is in the critical region of optical thickness around unity -the transition between single and multiple scattering -where the differences are largest. For variation in bidirectional reflectance there is a sweet spot around 0.2-0.3 with overestimation below and underestimation above. This is due to the fact that the model does not account for the reflection from the bottom of the cloud of incident beam radiation from the surface and the sweet spot falls where this is compensated by incident diffuse radiation. The same minimum in the differences occurs for ρ 0d and ρ dv + ρ dd , although to a lesser degree.
For the thermal wavelengths the differences remain below 0.5 % for variation in effective radius and cloud top pressure. Interestingly, as a function of surface temperature the difference can be much larger for a optical thicknesses below 10 away from the base state of 290 K. This is due to the linear approximation of Eq. (48).
For ice cloud (Fig. 4) the results are, in almost all cases, better than that for liquid water. Most of the variation as a function of effective radius has disappeared, most likely due to the flatter phase function for ice particles relative to liquid water droplets. For the same reason, most of the features in the plots as a function of solar zenith and relative azimuth angle have also disappeared. It is also apparent that, as a function of the BRDF parameters, the underestimation has grown somewhat at the expense of overestimating a change attributed to the flatter phase function and a slight change in the balance discussed above.

Optimal estimation
The ORAC retrieval algorithm is based on the optimal estimation approach for atmospheric inverse problems described by Rodgers (2000), in which the input state to a forward model is optimized to obtain the best match between real measurements and simulated measurements computed with a forward model while being constrained by a priori knowledge of the state. The relationship between the n element state vector x and the m element measurement vector y is given by where F is the forward model, b is the set of all other assumed model parameters not in the state vector x and represents the measurement and forward model error. Optimal estimation falls in the category of statistical inversion methods based on Bayes' theorem: where x and y are continuous random variables, P (x) is the prior probability density function (PDF) of the state x before the measurements are made, P (y) is the prior PDF of the measurements y before the measurements are made, P (y|x) is the conditional PDF of y given x and P (x|y) is the conditional PDF of x given y. The solution is obtained by minimizing P (x|y) to obtain the maximum posteriori solution, the solution that has the maximum probability of being the truth. If the PDFs are assumed to follow a Gaussian distribution P (x|y) can be expressed as a χ 2 distribution: where S is the measurement, forward model and forward model parameter error covariance matrix, x a is the a priori state vector and S a is the a priori error covariance matrix. x a and S a denote the best guess of the state before the measurement is made and the uncertainty of this guess, respectively. Equation (52), known as the cost function, is a combination of the squared deviations between the measurements and the forward model and the retrieved state vector and the a priori state vector, each weighted by their associated covariance matrix. The retrieval problem is that of finding the minimum value of χ 2 .
As with most atmospheric inverse problems, our cloud retrieval problem is ill-posed in that noise in the measurements y and forward model and parameter errors leads to significant errors in the estimate of x. the solution is stable with respect to perturbations in y (Engl et al., 2000;Vogel, 2002). If any of these conditions are not met then the problem is ill-posed, leading to non-existence, non-uniqueness (due to discretization of the problem) and/or ill-conditioning (due to amplification of errors in x due to errors in y) (Doicu et al., 2010). It is for this reason that an a priori constraint is required. The fact that the problem is nonlinear requires an iterative method. Finally, in order to perform the iteration efficiently, while maintaining a stable step size, a form of regularization is required. In ORAC regularization is achieved with the Levenberg-Marquardt (Levenberg, 1944;Marquardt, 1963) method ap-plied to Gauss-Newton iteration, leading to where the subscript i denotes the number of the current iteration, K i is the m × n weighting function matrix, γ i is the Levenberg-Marquardt regularization parameter and D i is an n × n diagonal scaling matrix. Each column of K i contains the derivatives of the forward model with respect to each state parameter given by Thus, for a linear system, we could write Central to the Levenberg-Marquardt method is the regularization parameter γ i , which controls the type of step taken at each iteration. On the one hand, if γ i → 0, the algorithm behaves like Gauss-Newton iteration, which will provide an exact solution to a linear problem in one iteration. On the other hand, if γ i → ∞, the step direction tends to steepest descent and the step size tends to zero. The optimal value of γ i will be one that maximally reduces the cost function for each iteration. The procedure for determining the value of γ i is to start with a small value (so the initial iteration will resemble Gauss-Newton), then at each iteration if, as a result of the step given by Eq. (53), the cost function increases, do not update the state vector and increase γ i for the next step; if the cost function is decreased by the step given by Eq. (53), update the state vector and decrease γ i for the next step.
Our implementation uses a factor of 10 for increasing and decreasing γ i . The initial value γ 0 is chosen to be the mean of the diagonal elements of the Hessian K T i S −1 ,i K i . The scaling matrix D is used to ensure that the state space parameters are of similar magnitude to avoid ill-conditioned matrix operations and to therefore ensure numerical stability. Alternatively, in ORAC D = I, where I is the identity matrix, and the scaling is performed directly on the state vector parameters, their a priori values, their a priori uncertainties and the derivatives of the simulated measurements with respect to the state vector parameters according to the scaling values in Tables 3 and 4 for liquid water and ice cloud, respectively. With this method, scaling parameters may be chosen in a more intuitive way than setting the scaling matrix D directly. Finally, in some cases the step size will be large enough to push state parameters out of a physically reasonable range. In this case the values are bound to the ranges listed in Tables 3  and 4. This iterative procedure, presented as a flow chart in Fig. 5, is continued until either the convergence criteria are satisfied or a maximum number of iterations is exceeded. In the former case the retrieval is said to have "converged" while the latter case can generally be rejected as a failed retrieval. ORAC uses the change in the cost function between iterations to determine whether the algorithm has converged. A negligible change in cost between iterations indicates that the retrieval is no longer improving the fit between the measurements and the forward model. In ORAC the cost change threshold is set to 0.05 m and the maximum number of iterations is set by "trial and error" to 40, although the number of iterations usually required to achieve convergence is less than half this value. An additional, purely Gauss-Newton step (γ = 0) is performed to test for false convergence. If this step changes the cost by greater than one then γ is reinitialized and the iteration continues.
After successful convergence the retrieved statex is set to x i , where i is the index of the last iteration, and an uncertainty estimate for the retrieved stateŜ is calculated witĥ where the uncertainty of a particular parameterx k is defined as the square root of the corresponding diagonal element σ k = Ŝ kk .

Measurement vector and covariance matrix
In general, the measurement vector y contains solar reflectance (during the day) and/or thermal brightness temperature values (day and night) for any number of channels. as a result, for the rest of this paper, we will assume that these channels are available during the day while at night only the thermal channels 3.74, 10.8 and 12.0 µm are available. (Please see the introduction of this paper and the references therein for more details on the sensitivities of these channels.) The 1.61 µm channel is only available from a subset of the AVHRR-3 platforms and, in these cases, to save bandwidth the 1.61 µm channel is used during the day while at night the 3.74 µm channel is used. These are our effective radius sensitive channels and, to maintain a consistent record over the AVHRR heritage, we elect to use only the 3.74 µm channel by default unless it is unavailable during the day for AVHRR, in which case the 1.61 µm channel is used. The other instruments used with Cloud_cci and their channel configurations are discussed in Part 1. The optimal estimation framework allows for explicit inclusion of uncertainties from the measurements, the forward parameters and the forward model itself. These uncertainties are combined into the so-called "measurement and forward model" covariance matrix S ,i for inversion iteration i given by where S y is the covariance matrix describing the measurement uncertainties, S fm describes the forward model uncertainties due to incomplete physics or computational approximations, S b describes the uncertainties in the forward model input parameters in b and K b,i is a weighting function matrix which propagates this uncertainty into measurement space and is given by The dependence of S ,i on the iteration comes from the dependence of K b,i on the state vector; i.e., K b,i is a linearization around a base state defined by x i . Although it is possible to include covariance in the CC4CL implementation, in the current configuration all three covariance matrices are assumed to be diagonal, that is, it is assumed that there is no correlation in the noise from different channels and no correlation in the forward model and forward model parameter uncertainties. S y is based on the prelaunch error characterization of the instrument noise given in Part 1. Uncertainties in the forward model itself S fm were determined through rigorous sensitivity studies (Watts et al., 1998;Sayer, 2009;Siddans et al., 2011) and include effects from subpixel inhomogeneity, assumed liquid water droplet size distribution, ice crystal models (habit and size distribution), the surface BRDF model, the radiative transfer assumptions and assumptions made in the "fast" radiative transfer solution. Uncertainties in the forward model parameters S b are obtained from published uncertainties, if available, and include uncertainties in meteorology (pressure, temperature, water vapor and ozone) and surface parameters such as temperature, wind vector, surface reflectance and emissivity and ice/snow extent.

State vector and a priori state vector and covariance matrix
The retrieval state vector x can be written as where τ 0.55,c is the total cloud optical thickness at a wavelength of 0.55 µm, r e,c (µm) is the cloud particle effective radius, p c (hPa) is the cloud top pressure and T s (K) is the temperature of the surface. The transformation to log 10 space for optical thickness is desirable as the forward model is a strong nonlinear function of optical thickness. An added benefit is that negative values of optical thickness, which may be encountered during the inversion process and must be bounded to a minimum value, will be avoided in log 10 space. The size distribution for liquid water droplets is assumed to be the modified gamma distribution as discussed in Sect. 3.2.1 and aircraft measurements of ice crystals from optical probes indicate exponential type distributions (Heymsfield and Platt, 1984;Arnott et al., 1994;Mitchell and Arnott, 1994;Kinne et al., 1997;Wyser, 1998), for neither of which is a log 10 transformation of the effective radius appropriate. The relationship of the forward model with cloud top pressure and surface temperature is weakly nonlinear; therefore these parameters are also retrieved as absolute values. The a priori state vector x a is written analogously to the state vector x and the a priori covariance matrix S a is diagonal; i.e., the a priori standard deviations are assumed to be uncorrelated. The a priori state vector depends on the assumed phase, the values for which are presented in Tables 5  and 6, along with their associated uncertainties for liquid and ice cloud, respectively. The values for the cloud parameters, τ 0.55,c , r e,c and p c are chosen to be typical average values for each phase, although in the current retrieval setup the standard deviation for these values is set to 10 8 , so that the corresponding diagonal elements of S a are 10 16 , effectively eliminating any constraint they have on the retrieval.
Surface temperature T s was chosen as a retrieval parameter so that information in thermal channels can be used to refine the assumed surface temperature, minimizing errors which could bias retrieval results, especially cloud top pressure. Retrievals will be particularly sensitive to surface tem-perature in the case of thin clouds due to significant contributions to the measured brightness temperature in the thermal channels from both the cloud and the surface. For example, assuming a negative lapse rate, a positive change in surface temperature will have a similar effect as a positive change in cloud top pressure. In the case of low clouds, the thermal contrast between the surface and the clouds will be smaller approaching the uncertainties of thermal measurements and the surface temperature. Due to the similar, approximately linear, relationship that thermal measurements have with cloud top pressure and surface temperature, some constraint is required. Since cloud top pressure is unconstrained, the surface temperature must be constrained. The a priori surface temperature is obtained from the ECMWF ERA-Interim reanalysis input and is given standard deviations of 5 and 2 K for land and sea, respectively, so that the corresponding diagonal element of S a is either 25 or 4. Comparison with in situ measurements made at the surface indicate that the errors in the reanalysis surface temperature on the average remain well within these uncertainties but can approach and or exceed these values in situations such as ocean upwelling near the land and over land surfaces such as desert with strong diurnal effects. The values of 5 and 2 K for land and sea have been chosen through trial and error to provide an optimal balance between constraint and the use of the available measurement information on surface temperature. It should be noted that the retrieved surface temperature under cloudy conditions is not intended for use as a product but, depending on cloud optical thickness, and therefore the surface signal, there is at least some reduction in uncertainty relative to that of the ECMWF reanalysis inputs. In cases where the cloud is optically thick (negligible surface signal) the lack of sensitivity and the a priori constraint assures that the surface temperature does not diverge significantly from the ECMWF value.
It should also be noted that estimates of uncertainty do not account for systematic errors in the a priori, in particular on a regional basis. Even if the a priori inputs are unbiased globally they will have some regional bias. Users should be aware that when averaging these data, the uncertainty will not tend towards zero as the a priori uncertainty is systematic.
Our retrieval algorithm has different pathways depending on illumination conditions: "day" (solar zenith angle θ 0 < 80 • ), "twilight" (80 ≤ θ 0 < 90 • ) or "night" (θ 0 ≥ 90 • ). During the day all solar and thermal channels provided are used and the state vector is complete. During twilight conditions, due to the difficulty of modeling solar radiation at solar zenith angles greater than 80 • , we use only channels with a thermal component (11 and 12 µm) and include only p c and T s in the state vector. This can lead to significant biases as the cloud optical thickness and effective radius, affecting cloud infrared transparency, are fixed at the a priori values. As a result, twilight retrievals should be used with caution, especially for ice cloud. At night (no solar signal) the retrieval is limited to the thermal channels 3.7, 11 and 12 µm and lacks a large amount of optical thickness and effective radius in-formation but even in the thermal-only measurements a significant amount of this information on these parameters exists (see Sect. 1). As such, we include these parameters in the state vector for night (as in day) to allow these to vary in a way that is consistent with thermal-only measurements. This significantly improves our estimate of cloud top pressure at night. Ultimately, we do not report these in the final product at night but together they contain enough information for a single microphysical parameter so-called "effective emissivity" discussed in Sect. 4.6. Effective emissivity is a common parameter tied to cloud top pressure retrievals (see Sect. 1) used to describe cloud transparency that can be used for nighttime radiative flux calculations.

First guess
The first guess x 0 defines the state of the inversion iteration at i = 0. This is distinct from the a priori state x a , which is the best estimate of the state before the measurements are made, i.e., it is independent of the measurements, such as that from climatology or the reanalysis input. It is common to use the a priori state as the first guess but this is not the only option. In fact, measurements may be used to determine a first guess that is closer to the retrieval than the a priori resulting in faster convergence of the inversion. In some cases the choice of first guess may change the retrieval result depending on the existence of less than optimal minima that may or may not exist. In our case, the first guesses for τ 0.55,c , r e,c and T s are set to the a priori values.
The first guess for p c is derived from the measurements by interpolating the observed brightness temperature in the 11 µm window channel within the reanalysis temperature profile, which is in fact a simple retrieval of cloud top pressure in itself, assuming an opaque cloud. The methodology carefully deals with both tropospheric inversions and the tropopause to bypass their effect. The steps involved are as follows: 1. Remove inversions including the tropopause from the temperature profile.
-Skip past any surface inversion by searching for the lowest level at which the temperature decreases with height.
-Locate inversions within the boundary layer defined as levels with a temperature lower than that of the level above them.
-If an inversion is found, locate the top of the inversion, being the next level up at which the temperature decreases relative to the previous. -Overwrite all values from the bottom of the inversion to two points above the top of the inversion (assuring the inversion has some width) by linearly extrapolating from the two levels just beneath the inversion.
-Locate the tropopause as the lowest level between 500 and 30 hPa for which the lapse rate is less than 2 K km −1 and remains below that level for at least 2 km. -Overwrite all values from the tropopause up by extrapolating from the two levels just beneath the tropopause.
2. Interpolate the 11 µm brightness temperature onto the new temperature profile.
-If the brightness temperature is outside the range of the profile set p c,0 to the minimum or maximum temperature (as appropriate). -Search through the profile for the first pair of levels that bound the requested temperature. For a liquid phase retrieval search from the bottom of the profile up. Otherwise, search top down. -Linearly interpolate brightness temperature between those located levels to determine p c,0 .

Diagnostics
The retrieval implementation produces a number of diagnostic fields, a few of which will be mentioned here as they are presented later. First, the final cost χ 2 (Eq. 52) normalized by the number of measurements m, given by χ 2 N = χ 2 /m, is output. This quantity serves to indicate how well the measurements fit the forward model given the final estimated state vector, i.e., the state vector from the last iteration. A value of less than unity is generally accepted as a "good fit", but it should be noted that a good fit does not necessarily mean that the retrieval is an accurate estimate of the true state. We will show in Sect. 5 that, due to non-uniqueness in the retrieval space, it is possible to obtain a good fit with an unrepresentative estimate of the state. The number of iterations used to achieve convergence on the retrieved state is also output. This can also be useful to indicate the quality of the retrieval but, again, a large (small) number of iterations does not necessarily indicate a poor (good) retrieval. Finally, two information quantities are produced, including the averaging kernel matrix A and the number of degrees of freedom for signal d s . The averaging kernel is given by where G is referred to as the gain matrix. A quantifies the response of the retrieval to changes in the true state vector about the retrieved state vector. The diagonal elements of A range from 0 to 1, where for a perfect retrieval A would be an identity matrix indicating that changes in each state vector element are perfectly represented by the retrieval. Instead of the averaging kernel we will present the degrees of freedom for signal given by where tr(A) is the trace of A, which describes the number of useful independent quantities there are in the measurements, i.e., the number of independent pieces of information.

Derived products
Several products are produced that are derived from the retrieved state and the assumed input parameters. These include cloud top height H c (km), cloud top temperature T c (K), cloud water path (CWP, g m −2 ), spectral cloud albedo R bd (λ, θ 0 ) and spectral cloud effective emissivity ε(θ v ). H c and T c are obtained by linear interpolation onto the pressure profile at the retrieved cloud top pressure p c .
CWP is derived from the retrieved cloud optical thickness τ 0.55,c and effective radius r e,c (Han et al., 1994) with assuming a density ρ for water and ice of 1.0 and 0.9167 g cm −3 , respectively, and an extinction coefficient Q e for water and ice of 2.0 and 2.1 g cm −3 , both valid for particles large with respect to wavelength. A "spectral cloud albedo", a directional-hemispherical reflectance, also referred to as black-sky albedo, defined as (with dependence on λ, τ c,0.55 and r e,c assumed) is derived from the retrieved τ c,0.55 and r e,c , where R bb (λ, τ c , r e,c , θ 0 , θ v , φ) is the bidirectional reflectance of either liquid water or ice cloud introduced in Sect. 3.4. Cloud albedo R bd (λ, τ c,0.55 , r e,c , θ 0 ), is obtained from LUTs, depending on phase and wavelength, built in the same manner as the operators introduced in Sect. 3.4. Similarly "spectral cloud effective emissivity" ε(λ, τ c,0.55 , r e,c , θ v ) is derived from the LUTs and is in fact the same as the cloud emissivity operator described in Sect. 3.4. At night, this parameter can be thought of as containing the microphysical information retrieved in the optical thickness and effective radius, both of which are not reported at night.
Finally, to account for the fact that the retrieval algorithm may place the cloud top at a location lower than the physical top, as discussed in Sect. 3.8, separate corrected cloud top pressure, height and temperature values are produced by approximating the observed brightness temperature as emitted from one optical depth into the cloud. Assuming the cloud is vertically homogeneous with a constant lapse rate, we can write the corrected T c as where BT is the observed brightness temperature atmospherically corrected to cloud top, σ e is the cloud particle extinction cross section and N is the cloud particle number concentration. Using the measurements at 11 and 12 µm provides two simultaneous equations in T c which can be solved using σ e values in an LUT. Corrected P c and H c values are obtained by linear interpolation onto the temperature profile at the corrected T c . It must be noted that the result of this correction will no longer be radiatively consistent with the other retrieved variables. In other words, broadband radiative flux computations using the corrected cloud top pressure or height values will be biased. As with the primary retrieval parameters, an estimate of the uncertainty in the derived parameters is computed. For this, standard uncertainty propagation is used: where x d is a particular derived parameter.

Retrieval performance
In this section we test the performance of the retrieval system just presented. The methodology can be summarized as follows: 1. First, produce radiances using the "fast" forward model for a given set of parameters. These parameters include the parameters that are assumed to be known, such as meteorology and surface reflectance/emissivity to which a Gaussian variability is applied in accordance with their uncertainty. The parameters also include the set of retrieval parameters which are taken as exact.
2. Then, apply Gaussian noise to the radiances. For this we choose the prelaunch noise characteristics for MODIS.
3. Finally, perform a retrieval on the simulated radiances and compare the results to the retrieval parameters used to produce the radiances.
The channels simulated and subsequently used in the retrieval were the MODIS bands comparable to the AVHRR heritage channels 0.630 0.862, 3.74, 10.8 and 12.0 µm, specifically MODIS bands 1, 2, 20, 31 and 32. Note that we performed the same performance analysis using the 1.61 µm channel rather than the 3.74 µm channel and the results were similar.
In the comparisons that follow we look at the fractional error of the retrieved optical thickness τ c,0.55 , effective radius r e,c and cloud top pressure p c , defined as where x is the true value of a retrieved parameter andx is the retrieved estimate of the parameter. In addition, we will show the estimate of the uncertainty of each retrieval parameter normalized by the retrieved estimate given by where σ (x) is the estimated standard deviation of the parameter x such that σ x k = Ŝ kk , where k is the index of the parameter in the state vector x. Finally, we will show the final cost χ 2 normalized by the number of measurements m: χ 2 N , the number of iterations used and the degrees of freedom for signal computed for the retrieved state d s .
In Fig. 6 we present daytime liquid water cloud retrieval results as a function of optical thickness τ c and effective radius r c,e , with everything else set to our base state. A cost χ 2 < 1 is considered a good fit. It is clear that there is a good fit in all the retrievals. This does not necessarily mean the retrievals are accurate, only that a good fit has been obtained. As already mentioned, these retrievals suffer from non-uniqueness, which means it is possible to obtain a good fit for a less-than-optimal solution. Looking at the error (x) (red and blue represent overestimation and underestimation, respectively) for optical thickness and effective radius we can see that there is a breakdown around the critical optical thickness of unity, where a transition from the singlescattering regime to multiple scattering approximately occurs. It is worth pointing out that absolute errors in optical thickness are still quite small at low cloud optical thickness in the context of cloud retrievals. Given the measurement information available it is difficult to obtain an accurate retrieval for optical thicknesses less than 0.1. We show these low optical thicknesses because subvisible cirrus (optical depth > 0.03) are known to exist (Jensen et al., 1996;Reverdy et al., 2012) but also to demonstrate that the retrieval is acceptable in most cases (optical depth > 1.0). There is also an increase in the number of iterations and degrees of freedom at low optical thicknesses. The increase in the degrees of freedom is due to the increased sensitivity to surface temperature at low cloud optical thicknesses, making it more difficult to distinguish the thermal cloud signal from the thermal surface signal and therefore subjecting the retrieval to more non-uniqueness. The increases in the error in effective radius at low optical thicknesses and smaller particle sizes are due to a decreased sensitivity to smaller particles. With cloud top pressure there is a very small error. It should be noted that the analysis does not take into account the potential biases from treating the cloud as geometrically infinitely thin, as discussed in Sect. 3.8, since we used that model to produce our synthetic measurements. The uncertainties for optical thickness are below 20 % for optical thicknesses from about 4 to 30 but increase outside of this range and are invariant with effective radius. For effective radius the uncertainties are almost all below 20 % above an optical thickness of approximately 5 except for effective radii below approximately 6 µm. Finally, the uncertainties for cloud top pressure are all below approximately 10 % for an optical thickness greater than unity. Our base state describes the parameters that are not explicitly indicated as something else in our discussion of the results. These include the use of the same base state values used for the forward model validation in Sect. 3.9. Unlike the forward model validation, in the interest of brevity, we do not show results as a function of geometry and choose the same base state values (θ 0 = 35.0 • , θ v = 35.0 • and φ = 90.0 • ).
For ice cloud (Fig. 7), the cost increases slightly for a small set of retrievals, the number of iterations increases for an optical thickness around unity and the degrees of freedom for optical thicknesses less than unity also increase relative to liquid water due to a stronger sensitivity to ice cloud at thermal wavelengths and therefore to surface emission. The error for optical thickness is similar to that for liquid water while the larger error in effective radius when τ c < 1 is now for larger particles and opposite in sign. This suggests that the peak in sensitivity to effective radius is at around 30 µm and the effective radius is subject to overestimation for smaller particles and underestimation for larger particles. For cloud top pressure, the error is larger for ice cloud when τ c < 1, which is most likely due to an increase in the importance of absorption and emission for larger particles at thermal wavelengths which will decrease in importance as optical thickness increases from the single-scattering regime to multiple scattering. Finally, relative to liquid water cloud, the minimum in the optical thickness uncertainties shifts to slightly lower optical thicknesses, the effective radius uncertainties are improved down to an optical thickness of around 0.5 due to greater sensitivity to effective radius at thermal wavelengths, and the cloud top pressure uncertainties are somewhat larger due to the reason stated above for error and due to the greater sensitivity to surface emission and the difficulty of separating cloud and surface signals. Figures 8 and 9 repeat Figs. 6 and 7, respectively, except at night. Even though we do not currently report optical thickness and effective radius at night, we still present them here to show that there is enough sensitivity to these parameters to help the cloud top pressure retrieval. From the cost we see that we obtain a good fit for liquid water cloud and the number of iterations is mostly lower than 14. The degrees of freedom vary little, with values around 3. The reason for this is the lack of sensitivity to optical thickness and therefore relatively little change to the sensitivity to surface temperature with optical thickness. The error in optical thickness is beyond 50 % almost everywhere except for some parts of the 1-10 µm effective radius region. The error for effective radius is significantly better and it is this microphysical information that really helps improve the cloud top pressure retrieval of liquid water cloud, which has very little error for clouds of greater than approximately 1 optical depth. The uncertainties for optical thickness are all greater than 100 % while for effective radius they are lower for larger optical thicknesses and effective radii (30-70 %), as expected. For cloud top pressure the uncertainties are almost all less than 20 % for optical thicknesses greater than unity.
For ice cloud (Fig. 9) the night retrieval obtains good fits for all, a slightly larger number of iterations in the optical thickness region around unity and similar degrees of freedom compared to the liquid water cloud retrieval at night. The interesting observation is that the retrieval of optical thickness at night is better for ice cloud in the range between 1 and 10. This is important as it is in this range that semitransparent cirrus clouds occur. Without this sensitivity to optical thickness, the cloud top pressure would be significantly overestimated. Notice that at optical thicknesses below unity, both the optical thickness and cloud top pressure are overestimated. Finally, like the rest of the cases, estimates of uncertainty are overestimated but one can argue that the estimate at night is better for ice cloud than for liquid water cloud. The uncertainties for optical thickness and effective radius are improved relative to liquid water cloud in cases where the error is also improved and the uncertainties for cloud top pressure are mostly still below 20 % at optical thicknesses greater than unity except at large effective radii. In Fig. 10 we show daytime liquid water cloud retrieval results as a function of optical thickness τ c and cloud top pressure p c , with otherwise everything else set to our base state. We perform the retrieval for both a relatively small (for liquid water droplets) effective radius (r e,c = 4 µm) and a relatively large effective radius (r e,c = 20 µm). We choose these values to be well away from the a priori effective radius for liquid water cloud of 12 µm. The small effective radius retrieval suffers from large cost values at optical thicknesses from 1 to 10 and at cloud top pressure values lower than approximately 550 hPa, although most liquid water clouds will be lower in the atmosphere. The number of iterations is consistent with the cost, with more iterations used for higher cost retrievals. The larger degrees of freedom at large cloud top pressure values, even with larger optical thicknesses, are due to the closer proximity to the surface, making it harder to distinguish the cloud from the surface thermal signals. The retrieval errors are mostly what we expect for liquid water cloud. For optical thickness, the errors are small for lower clouds with an optical thickness greater than approximately unity (bearing in mind that most liquid water clouds will have considerably larger optical thicknesses than unity). For effective radius small errors begin at an optical thickness of approximately two. Finally, for cloud top pressure the error is small for optical thicknesses greater than unity for clouds below 400 hPa. For an effective radius of 20 µm the results are in general better than for 4 µm. This is not surprising as the sensitivity shifts closer to 3.7 µm and the other thermal channels. Figure 11 shows the same results as Fig. 10 but for ice cloud. In this case we choose small and large effective radii of r e,c = 10 µm and r e,c = 50 µm, respectively, to be well away from the a priori effective radius for ice cloud of 30 µm. The results are, in general, better as larger errors are confined to lower optical thicknesses. The cost and number of iteration plots are much better in the region of large values that were observed in the corresponding plots for liquid water. As we have seen before, for most of the cases of optical thickness less than unity, the degrees of freedom for ice increase relative to liquid water. Also, unlike for the small effective radius, the number of degrees of freedom is larger for optical thicknesses less than unity at all levels, not just low levels. This is due to the greater sensitivity to larger particles at thermal wavelengths and the difficulty of separating the cloud and surface emission signals. Finally, looking at the effective radius errors for both the liquid water cloud with an effective radius of 4 ( Fig. 10) and the effective radius errors for ice cloud with an effective radius of 50 (Fig. 11), we can see the flip in errors that we discussed in reference to Figs. 6 and 7 for small/large liquid water/ice particles. In Fig. 12 we show daytime liquid water cloud retrieval results as a function of optical thickness τ c and r c,e . In this case we perform the retrieval for both a relatively small surface albedo (A = 0.01) and a relatively large albedo (A = 0.9), with everything else set to our base state. a small spike around an optical thickness of unity for small effective radii, while the number of iterations are all reasonable. The number of degrees of freedom follows the same pattern as in Fig. 6, which is the same as this case except with the base state albedo of 0.2, but the values at low optical thicknesses are much higher. This is because of the increased influence of the surface emission signal if we assume the surface emissivity is 1 minus albedo. The errors are almost all acceptable (even for cloud top pressure at small optical thicknesses). These results indicate why aerosol retrievals generally perform well over ocean and that the retrieval of subvisible cirrus properties may also be possible over ocean. For an albedo of 0.9 the results for optical thickness and effective radius are significantly different and highlight the difficulty of retrieving these properties over bright surfaces (bearing in mind that an albedo of 0.9 is typical of fresh snow cover). The number of degrees of freedom is now fixed at 3, the value we would expect if surface temperature had little effect. The retrieval of optical thickness is overestimated for optical thicknesses less than unity, something that we expect as the surface will look like cloud in this case. More interesting is that optical thickness is underestimated for optical thicknesses greater than 10. Our first thought is that at these optical thicknesses the surface should have little effect and that this error suggests a problem with non-uniqueness and that we may be getting stuck in a suboptimal minimum. The retrieval of effective radius in this case is also problematic, with underestimation at small optical thicknesses and overestimation at large values. In fact, the opposite direction of the errors compared to that of optical thickness further suggests a problem with non-uniqueness.
For ice cloud (Fig. 13) the results with the low and high albedos are comparable to that of liquid water, with the differences primarily with the large cost values for large optical thicknesses and large effective radii and the overestimation of cloud top pressure for optical thicknesses less than unity. In this case it is the overestimation of optical thickness that drives the cloud top pressure solution lower in the atmosphere.

Conclusions
This paper describes the optimal estimation component of the Community Cloud retrieval for Climate (CC4CL) based on the Optimal Retrieval of Aerosol and Cloud (ORAC) algorithm. An extensive forward model is described which includes emission, absorption and multiple scattering of radiation from both solar and thermal sources. The surface is characterized by a bidirectional reflectance distribution function (BRDF) specific to either ocean or land. The model's "fast" radiative transfer solution is separated into solar and thermal components and the model's assumptions and limitations were addressed. Validation was undertaken with a reference forward model, i.e., a more extensive forward model that attempts to eliminate some of the most important assumptions in the "fast" solution. Results show that in relation to the simple scalar operators, for optical thicknesses greater than 10, the errors are comparable to instrument noise, but it should be noted that this error is the difference between the reference forward model and the "fast" forward model and not a measure of the total errors in forward modeling. At small optical thicknesses (less than 0.1-1.0) the errors become larger, especially at optical thicknesses approaching the critical regime of unity, where the contribution of single and multiple scattering to the total shortwave signal is comparable. Fortunately, these optical thicknesses of less than unity are uncommon for most cases in cloud remote sensing.
The retrieval method is then described, including the optimal estimation approach, the input measurements and a priori quantities along with their associated uncertainties, our choice of the iteration first guess and quantities derived from the retrieved parameters. Particular attention was paid to the estimation of the retrieval uncertainty. The performance of the retrieval method was assessed theoretically by simulating measurements using a range of values for the retrieval parameters and then subsequently performing a retrieval on these simulated measurements to which Gaussian noise levels appropriate for MODIS were added. The errors are less than 10 % for optical thicknesses larger than 10 and less than 20 % for optical thicknesses larger than unity. For night our retrieval does not report cloud optical thickness or effective radius but uses the information content in these values to improve the cloud top pressure results. These results are consistent with our forward model analysis. For optical thicknesses less than unity the results become problematic, which could have implications for the retrieval of subvisible cirrus, but, as with successful aerosol retrievals over dark surfaces (such as ocean), the results are comparable to those of optical thicknesses larger than 10. Finally, compared to the actual errors our estimation of the retrieval uncertainty is comparable, again, at cloud optical thickness greater than unity.
It is worth noting that the ORAC algorithm is being extended to retrieve properties in two cloud layers. The publication for this work is in progress and is expected to have the citation of McGarragh et al. (2018).