Interactive comment on "A method for cloud detection and opacity classification based on ground based sky imagery" by M. S. Ghonima

Regarding the dependence of the performance of the algorithm on solar zenith angle: our method has recently been applied to a TSI stationed at Copper Mountain, Nevada (35.785° N, 115.0012° W). The algorithm was compared to a set of 60 manually annotated images encompassing a wide variety of cloud conditions, ranging from broken altostratus clouds to scattered fair-weather cumulus clouds.


Introduction
Clouds play an important role in Earth's climate; however, there are still large uncertainties in the cloud-climate feedback (Solomon et al., 2007). Cloud reflection and, in some cases, enhancement of incoming solar radiation is an active research area (Cess et al., 1995; Kindel et al., 2011; Luoma et al., 2012). For solar power applications, cloud transmissivity is the critical parameter and it is a function of cloud characteristics such as vertical and horizontal extent, droplet concentration, and size distribution. Aerosols also affect the radiation budget, firstly by scattering and absorbing solar radiation and secondly by acting as cloud condensation nuclei, thereby modifying the radiative properties of clouds (Twohy et al., 2005; Kim and Ramanathan, 2008).
Cloud properties such as cloud optical depth and cloud fraction can be estimated from satellite images (Rossow and Schiffer, 1999; Zhao and Di Girolamo, 2006). Satellites sample on a global scale; however, their resolution is coarse at 1 km for the geostationary (GOES-12-15 series) satellites and 250 m, with only 1-2 images taken per day, for the polar orbiting MODIS satellite. Another major shortcoming of satellite retrieved data, in the field of solar resource assessment, is the inability of satellites to determine solar obstruction accurately for a specific site due to uncertainties in cloud height and depth retrievals. Thus, ground based sky imagers (henceforth, we will refer to all types of ground based sky imagers as SIs) were developed in order to address the need for atmospheric imaging at higher spatial and temporal resolution. Since the first digital SIs were developed at University of California, San Diego (Johnson et al., 1989; Shields et al., 1993; Shields et al., 1998a; Shields et al., 1998b; Shields et al., 2009), various groups have designed different SIs. The most popular design consists of a digital camera coupled with an upward looking fisheye lens to provide a field of view (FOV) of about 180° (Seiz et al., 2007; Souza-Echer et al., 2006; Calbó et al., 2008; Cazorla et al., 2008b; Román et al., 2012). Another SI system design uses a downward looking camera on top and a spherical mirror (Pfister et al., 2003; Long et al., 2006; Neto et al., 2010; Chow et al., 2011).
Cloud detection using SIs is generally based on a thresholding technique that utilizes the camera's red-green-blue (RGB) channel magnitudes to determine the red-blue ratio (RBR) (Shields et al., 1993). The Shields et al. (1993) algorithm uses fixed ratio thresholds to identify opaque clouds; thin clouds are detected through a comparison with a clear sky background RBR library as a function of solar angle, look angle, and site location. Souza-Echer et al. (2006) used saturation in the hue, saturation, and luminance (HSL) colorspace with fixed thresholds for cloud detection. Cazorla et al. (2008b) classified clouds based on neural networks. Neto et al. (2010) utilized the multidimensional Euclidean geometric distance (EGD) and Bayesian methods to classify image pixels based on cloud and sky patterns. Shields et al. (2010) added an adaptive thresholding technique to account for variations in haze amount in real time. Finally, Li et al. (2011) developed a hybrid thresholding technique (HYTA) that is based on both fixed and adaptive thresholding techniques for cloud detection.
SIs have also been used to detect aerosols (Cazorla et al., 2008a, 2009; Huo and Lü, 2010). Cazorla et al. (2008a) obtained Aerosol Optical Depth (AOD) at different wavelengths from pixel counts in the red and blue channels of the SI input to a neural network. The presence of aerosols modifies the ratio of red to blue scattered light and can adversely impact the performance of cloud classification algorithms. The main purpose of this paper is to create dynamic thresholding techniques for cloud detection that account for aerosol variations. Clear sky, optically thin, and thick cloud pixels are classified on a pixel-by-pixel basis for each image. Compared to other algorithms in the literature, our method provides an accurate means to classify the pixels of a sky image captured by a commercially produced sky imager into three different classes, as it takes aerosol conditions into account. Section 2 presents the experimental setup. Section 3 outlines the method by which the images are classified and Sect. 4 presents the results and discussion of the classification. Finally, Sect. 5 provides concluding remarks.

Sky camera setup and environment
The University of California, San Diego (UCSD) is located 0.5 km from the Pacific Ocean in a temperate climate averaging 5 kWh m−2 day−1 of global horizontal irradiation. Maritime shallow cumulus clouds are the most common form of clouds; however, during summer mornings, marine layer stratus overcast clouds are prevalent. Maritime aerosols such as sea salt are dominant, but urban-industrial aerosols originating locally and sometimes from the Los Angeles metropolitan area also impact the San Diego atmosphere (Ault et al., 2009). In the absence of clouds, the AOD at 500 nm averages about 0.1 and typically ranges from 0.02 to 0.3.
A Total Sky Imager 440A (TSI, Yankee Environmental Systems) was installed on the UCSD campus (32.885° N, 117.240° W, and 124 m m.s.l.) in August 2009. The TSI consists of a camera that looks down on a spherical mirror reflecting the sky. The mirror contains a dull black rubber shadowband which tracks the sun in order to reduce the dynamic range of the sampled sky signal, thus increasing radiometric resolution in the portion of the sky which is of interest. Images are taken by the TSI every 30 s. Sky images on selected days between January and July 2011 were used, representing a range of cloud and atmospheric conditions.
The TSI outputs 24-bit (8 bits for each RGB channel) JPEG images with a resolution of 640 by 480 pixels, of which the mirror occupies 420 by 420 pixels. A small loss of information occurs due to JPEG compression. Pixels of the image corresponding to the shadowband and the camera arm are identified automatically and excluded. Pixels at a FOV > 140° are also excluded due to distortion.

Image metrics for cloud and aerosol characterization
Compared to the clean cloudless atmosphere, both clouds and aerosols enhance red versus blue intensity, increasing the red-blue ratio RBR = R/B (Shields et al., 1993) and the red-blue difference RBD = R − B (Heinle et al., 2010).
Both RBR and RBD take into account the chrominance (CrCb), reflected by the difference (R − B); RBR is also a function of the intensity or luminosity (Y) of the image due to normalization by B, while RBD is not a function of Y. Images captured by the TSI are automatically compressed to JPEG with a downsampling ratio of 4:2:0, in which CrCb are sampled on each alternate line and Y is not subsampled. As a result of this downsampling, chrominance has lower resolution than luminosity, and RBR will have a higher resolution (i.e. more unique values) than RBD.
Another parameter proposed by Yamashita et al. (2005) and Li et al. (2011) is the normalized red-blue ratio, NRBR = (B − R)/(B + R). Equation (3) shows that NRBR can be written as a nonlinear, monotonically decreasing function of RBR: NRBR = (1 − RBR)/(1 + RBR). For our cloud decision algorithm, we will be using an offset to a clear sky pixel RBR magnitude for cloud detection and opacity classification; thus there will be no difference in accuracy between RBR and NRBR. Finally, for this paper we will use the RBR parameter as it has a higher resolution than the RBD and will provide similar results to NRBR in our cloud detection and opacity classification (CDOC) algorithm.
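The three metrics above and the NRBR-RBR relation can be illustrated with a minimal numpy sketch; the channel values below are made-up examples, not TSI data.

```python
import numpy as np

def rbr(red, blue):
    """Red-blue ratio (Shields et al., 1993)."""
    return red / blue

def rbd(red, blue):
    """Red-blue difference (Heinle et al., 2010)."""
    return red - blue

def nrbr(red, blue):
    """Normalized red-blue ratio (Yamashita et al., 2005; Li et al., 2011)."""
    return (blue - red) / (blue + red)

# Illustrative pixels: clear sky is blue-dominant; a thick cloud is nearly gray.
r = np.array([80.0, 200.0])   # red channel: [clear, thick cloud]
b = np.array([160.0, 205.0])  # blue channel
print(rbr(r, b))              # ~[0.5, 0.98]: thick-cloud RBR is near one

# NRBR is a nonlinear, monotonically decreasing function of RBR
# (divide numerator and denominator of (B - R)/(B + R) by B):
assert np.allclose(nrbr(r, b), (1 - rbr(r, b)) / (1 + rbr(r, b)))
```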

Effect of atmospheric properties on spectral features
In a clean, cloudless atmosphere, Rayleigh scattering of incoming solar radiation dominates. Since the magnitude of Rayleigh scattering is inversely proportional to the fourth power of the wavelength, visible light in the blue spectrum is predominantly scattered. Consequently, in a clear atmosphere and outside the circumsolar region, Rayleigh scattering causes the input to the blue channel of the TSI camera to be higher than that of the red and green channels. In a cloudless atmosphere with high AOD, incoming solar radiation is scattered due to both Mie and Rayleigh scattering. Since Mie scattering is less dependent on wavelength, more light at larger wavelengths is scattered. This in turn causes the magnitude of the red and green channels to increase relative to the blue channel, especially near the circumsolar region, where forward-lobe Mie scattering is dominant. Near and inside the circumsolar region, the RGB channels of the image saturate due to the high intensity of the direct solar beam and forward scattering by aerosols. Thin clouds are challenging to detect as their RBR is similar to that of clear sky, especially in haze (atmosphere with high AOD). Optically thick clouds, on the other hand, result in similar signals across the RGB wavelengths. Since thick clouds have an RBR of around one, they can be easily identified in a clear atmosphere (RBR ∼ 0.5), even under high AOD. It should be noted that the measured RBRs are also affected by camera specifications such as the spectral responsivity of the sensing device. Thus, the RBR will vary between different SI instruments.
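The λ⁻⁴ dependence can be made concrete with representative wavelengths; the 450 nm and 650 nm values below are assumed for illustration, not the TSI's actual channel responses.

```python
# Rayleigh scattering efficiency scales as lambda**-4. With representative
# wavelengths of ~450 nm (blue) and ~650 nm (red), the ratio of scattered
# intensities in a clean atmosphere is:
blue_nm, red_nm = 450.0, 650.0
ratio = (red_nm / blue_nm) ** 4
print(round(ratio, 2))  # ~4.35: blue light is scattered roughly 4x more than red
```

This preferential scattering of blue is why clear-sky RBR sits well below one, while the weaker wavelength dependence of Mie scattering pushes RBR upward under haze or cloud.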

Effect of aerosol optical depth on clear sky red blue ratio
In order to determine the effect of AOD variation on the channel magnitudes of the TSI, we compared the RBR of the TSI images with the AOD measurement taken at 500 nm by an Aerosol Robotic NETwork (AERONET; Holben et al., 1998; Smirnov et al., 2000) sun photometer located less than 3 km away, at the Scripps Institution of Oceanography, UCSD. Cloudless sky condition images on 35 days between January and June 2011 with solar zenith angles (SZAs) less than 70° were considered. Absence of clouds on these days was confirmed using visual inspection of the images. While TSI images are taken every 30 s, AERONET timesteps are irregular, at 0.25 air mass intervals for SZA < 70°. To generate a representative RBR value for an image, an average was taken over all the pixels that lie in a circular band between the 35° and 45° scattering angles.
The mean RBR of the pixels is then compared with the nearest AOD measurement (no more than 5 min time difference, depending on SZA; Ghonima, 2011). There is a strong correlation between RBR and 500 nm AOD (τ500), with a coefficient of determination (COD) of 0.797 for a linear regression of RBR = 0.87 τ500 + 0.40.
The direct relationship can be explained by the fact that increasing AOD in the atmosphere increases Mie scattering. As a result of the increased Mie scattering, light will be scattered more evenly across the spectrum, thereby increasing the RBR. The correlation was higher for RBR versus τ500 than for the red channel versus τ500, because normalizing the red channel by the blue channel helps to remove variations caused by SZA and image zenith angle (IZA) dependence, resulting in a more stable metric for comparison. Figure 1 demonstrates that AOD affects the RBR and, furthermore, that the AOD can be determined from the RBR of a TSI, enabling haze corrections to the CDOC algorithm thresholds.
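Since the fit is linear it can be inverted to estimate AOD from the circumsolar-band mean RBR; the coefficients below are the regression reported above and are specific to this site and instrument.

```python
def aod_from_rbr(mean_rbr, slope=0.87, intercept=0.40):
    """Invert the reported fit RBR = 0.87*tau500 + 0.40 (UCSD TSI regression;
    not transferable to other imagers without recalibration)."""
    return (mean_rbr - intercept) / slope

# A band-mean RBR of 0.487 corresponds to tau500 ~ 0.1, typical for this site:
print(round(aod_from_rbr(0.487), 3))
```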

Cloud detection and opacity classification algorithm
In our algorithm, pixels in the images collected by the TSI are classified into three classes (clear, thin or thick) based on the difference between a pixel's actual RBR and the corresponding expected RBR if the pixel were clear.A haze correction factor (HCF) is added to account for the effects of variations in AOD on RBR.

Clear sky library
In a clear sky, the RBR is largest near the sun and decreases with increasing sun-pixel angle (SPA, Figs. 2, 4c). The RBR also increases near the horizon (large IZAs) due to increased optical path and larger aerosol concentrations near the surface (Gueymard and Thevenard, 2009). Consequently, for cloud detection these dependencies should be removed. A clear sky library (CSL, Shields et al., 1993) provides reference RBR for each pixel and time from historical clear day images. In our CSL the RGB intensities for each pixel are stored in a matrix as a function of IZA (Fig. 2), SPA, and solar zenith angle (SZA) from historical images on a clear day (Fig. 3a). Given the large sun-earth distance, the SPA is nearly identical to the scattering angle that the photon experiences at the scattering molecule or particle.
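A simplified sketch of the library structure: per-pixel clear-sky RBR stored per integer SZA. The dict layout and nearest-SZA lookup are assumptions for illustration; the paper's CSL additionally organizes pixels by SPA and IZA.

```python
import numpy as np

def build_csl(clear_rbr_images, szas):
    """Store per-pixel clear-sky RBR arrays keyed by integer SZA (simplified)."""
    csl = {}
    for img_rbr, sza in zip(clear_rbr_images, szas):
        csl[int(round(sza))] = img_rbr
    return csl

def lookup_csl(csl, sza):
    """Return the stored clear-sky RBR image for the nearest available SZA."""
    key = min(csl, key=lambda k: abs(k - sza))
    return csl[key]
```

For example, building the library from two clear images taken at SZA 43.2° and 50.0° and querying at 44° returns the 43° entry.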
The CSL is updated on every clear day throughout the year because the changing solar position and its projection on the TSI mirror, as well as aerosol climatology, affect RGB magnitudes and RBR. For example, the variation in RBR of the CSL computed on different days is highest near the solar region (small SPA), with a standard deviation of 0.1 versus magnitudes of ∼ 0.5 (Fig. 3b). Therefore, when the cloud decision algorithm is applied for a certain day it utilizes the CSL generated on the closest date.

Cloud detection and opacity classification metrics
An example sky image from 20 February 2011 and the corresponding RBR image are shown in Figs. 4a and b, respectively. For each integer SZA, the RBR from the CSL is obtained as a function of SPA and IZA for each pixel (Fig. 4c). A pixel is classified as a thick cloud if the difference Diff = RBR − CSL (Eq. 5, Fig. 4d) between the pixel RBR and the CSL RBR is greater than the thick cloud threshold (see Sect. 3.3). Once the CSL is subtracted, Fig. 4d shows that all clear areas assume a similar Diff value, and opaque clouds in all areas of the image can now be clearly distinguished from clear sky. Consequently, Diff allows the use of a uniform threshold for comparison of all clouds with respect to the clear sky ratio across all pixels.
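In code, the thick-cloud test reduces to one array comparison; the threshold value below is a placeholder, not the tuned value derived in Sect. 3.3.

```python
import numpy as np

def diff_image(rbr, csl):
    """Diff: pixel RBR minus the clear-sky-library RBR."""
    return rbr - csl

THICK_THRESHOLD = 0.2  # hypothetical placeholder value

rbr = np.array([[0.50, 0.95], [0.55, 0.52]])  # illustrative RBR image
csl = np.full((2, 2), 0.50)                   # illustrative clear-sky RBR
thick = diff_image(rbr, csl) > THICK_THRESHOLD
print(thick)  # only the 0.95 pixel exceeds the threshold
```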
Figure 1 showed that AOD significantly affects the RBR, but this is not accounted for in the CSL. By dynamically correcting the CSL for aerosol content, more consistent thresholds can be chosen to distinguish between clear pixels and thin clouds. For example, if the CSL was generated on a day with small AOD and applied to a clear day with large AOD, Diff would be positive throughout the image, which may lead to false overcast cloud detection. Thus, a haze correction factor (HCF, Shields et al., 2010) to the CSL is introduced to account for variation of AOD. A single HCF is used across each CSL, as Shields et al. (2010) found that the change in RBR with AOD is approximately independent of SPA and IZA, except for a small dependence in the solar aureole and near the horizon.
HCF is determined iteratively at each time step (Fig. 5). First, the CSL is initialized and HCF is set to 1 (first box). The third box (decision diamond) describes how clear pixels are selected based on Diff, and the "yes" branch shows how clear sky RBR is obtained from these clear pixels. Pixels are determined to be clear with 96 % confidence if Diff is below a threshold which is calculated based on the probability density function (PDF) of clear pixels (see Sect. 3.3). Next, HCF is calculated by dividing the mean of the clear pixels' RBR by the corresponding CSL's mean RBR. The CSL is then multiplied by the HCF to obtain the aerosol-corrected CSL (CSL_HCF). Depending on the difference in AOD between the day under consideration and the day when the CSL was generated, HCF can be either greater or less than one. The iteration continues until convergence below an error threshold. If no clear pixels can be identified (e.g. for overcast skies) or if the correction is too large (more than 20 %), then HCF = 1. Now the difference between the image and the CSL corrected by the HCF can be calculated as Diff_HCF = RBR − (CSL × HCF). (6) Another method to control for the AOD effect on RBR proposed by Shields et al. (2010) is the perturbation ratio, which is the ratio of the current pixel RBR to the CSL pixel RBR.
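The loop in Fig. 5 can be sketched as follows. The clear-pixel threshold, iteration cap, and convergence tolerance are assumed placeholder values (the paper derives its clear threshold from the clear-pixel PDF); the two fallback cases (no clear pixels, correction over 20 %) follow the text.

```python
import numpy as np

def estimate_hcf(rbr, csl, clear_thresh=0.05, max_iter=20, tol=1e-4):
    """Iteratively estimate the haze correction factor (sketch of Fig. 5)."""
    hcf = 1.0
    for _ in range(max_iter):
        # Tentative clear pixels: Diff against the corrected CSL is small.
        clear = (rbr - csl * hcf) < clear_thresh
        if not clear.any():
            return 1.0                # e.g. overcast: no correction possible
        new_hcf = rbr[clear].mean() / csl[clear].mean()
        if abs(new_hcf - 1.0) > 0.20:
            return 1.0                # correction too large: fall back to 1
        if abs(new_hcf - hcf) < tol:
            return new_hcf            # converged
        hcf = new_hcf
    return hcf
```

For a uniformly hazy but cloud-free scene the loop converges in two passes to the ratio of mean image RBR to mean CSL RBR, while a uniformly overcast scene (no pixels within the clear threshold) returns HCF = 1.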

Prt = RBR / CSL, (7)
Prt_HCF = RBR / (CSL × HCF). (8)
There are other differences, however. In the Shields et al. (2010) algorithm, the perturbation ratio was used only for thin clouds, and not for thick clouds; also, the spatial variance in this perturbation ratio was used to help distinguish between heavy haze and thin cloud (Shields et al., 2010; personal communication, 2011). In Sect. 4.1, we will compare the performance of the CDOC algorithm based on Diff and Prt.

Training data and threshold determination
We generated a training set that consisted of 60 images collected on 5 different days between January and June 2011. Twelve images were sampled from each day, spaced at 5 min intervals to avoid excessive overlaps in the clouds sampled. The days were chosen to represent the different cloud and atmospheric conditions encountered in coastal southern California. Images with completely clear skies were excluded from the training set. Each pixel in the image was manually classified into clear, thin, and thick (opaque) clouds by drawing polygons on the image.
The training set of manually annotated images was utilized to determine the thick cloud and clear sky Diff threshold values through trial and error. The objective was to maximize the overall accuracy for all three classes subject to the constraint (refer to Sect. 4.1) that the clear sky and thick cloud accuracies are greater than 80 %. For photovoltaic solar power generation, attenuation of solar radiation by thick cloud causes the most significant impact, while thin clouds have a relatively small effect. As a result, the clear sky and thick cloud thresholds are chosen to maximize thick cloud and clear sky accuracy rather than thin cloud accuracy.
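The trial-and-error tuning can be mechanized as a small grid search over candidate (clear, thick) threshold pairs; the candidate values and the search structure are assumptions for illustration, and only the >80 % accuracy constraint comes from the text.

```python
import numpy as np
from itertools import product

def search_thresholds(diff, labels, candidates):
    """Grid-search (clear, thick) Diff thresholds on annotated pixels.
    labels: 1=clear, 2=thin, 3=thick. Maximizes overall accuracy subject
    to clear and thick accuracies each exceeding 80%."""
    best, best_acc = None, -1.0
    for clr_t, thk_t in product(candidates, repeat=2):
        if clr_t >= thk_t:
            continue
        pred = np.where(diff > thk_t, 3, np.where(diff > clr_t, 2, 1))
        acc = (pred == labels).mean()
        clear_acc = (pred[labels == 1] == 1).mean() if (labels == 1).any() else 1.0
        thick_acc = (pred[labels == 3] == 3).mean() if (labels == 3).any() else 1.0
        if clear_acc > 0.8 and thick_acc > 0.8 and acc > best_acc:
            best, best_acc = (clr_t, thk_t), acc
    return best, best_acc
```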
The CDOC algorithm is as follows: first, the image RBR and CSL RBR are input (Fig. 6). Second, the HCF is determined as outlined in Sect. 3.2. Next, Diff and Diff_HCF (or Prt and Prt_HCF) are computed. Finally, based on the thick cloud and clear sky thresholds, pixels are classified into clear sky, thin cloud, and thick cloud classes.
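Putting the steps together, a pixelwise classification sketch; the threshold values are hypothetical placeholders, and, following the flowchart, the thick-cloud decision uses the uncorrected Diff while the clear/thin split uses the HCF-corrected difference.

```python
import numpy as np

CLEAR_T, THICK_T = 0.05, 0.2  # hypothetical placeholder thresholds

def cdoc_classify(rbr, csl, hcf):
    """Pixel classes: 1=clear, 2=thin, 3=thick. Thick clouds are tested with
    Diff against the uncorrected CSL; clear vs thin uses Diff_HCF."""
    diff = rbr - csl
    diff_hcf = rbr - csl * hcf
    out = np.where(diff_hcf > CLEAR_T, 2, 1)  # clear vs thin cloud
    out = np.where(diff > THICK_T, 3, out)    # thick cloud overrides
    return out

rbr = np.array([0.50, 0.58, 0.90])
csl = np.full(3, 0.50)
print(cdoc_classify(rbr, csl, hcf=1.04))  # [1 2 3]
```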

Training set
In order to understand the potential accuracies for the different methods, based on the training set of manually annotated images, a PDF was generated for clear, thin, and thick cloudy pixels using the metrics of Diff and Prt with the HCF applied for the clear and thin cloud pixels (Fig. 7). For the metrics to be selective one would expect distinct and sharp peaks with little overlap in the distributions. All PDFs of Diff_HCF and Prt_HCF for the different classes followed a near Gaussian distribution. If the HCF is not applied (Fig. 8), there is more variance in the clear sky PDF and a second peak appears due to the variations in aerosol content within the training set. The thick clouds have a distinct PDF with a much larger RBR than clear sky or thin clouds. The Prt metric results in greater overlap between the clear sky and thin cloud pixel distributions, causing more misclassifications.
Visual inspection revealed that the algorithm sometimes misclassified thick clouds in the circumsolar region on overcast days. The reason for the misclassification is that pixels in the circumsolar region saturate on clear days. Consequently, the RBR in the CSL is close to 1, which is similar to the RBR of thick clouds. Hence the small difference (or near-unity ratio) between the thick cloud RBR and the CSL results in clouds being misclassified as clear or thin. In order to correct the misclassification, we decreased the thick cloud threshold value in the circumsolar region.

In order to evaluate the performance of the CDOC algorithm, we use a confusion matrix (Kohavi and Provost, 1998) for the three classes, (1) clear pixels, (2) thin cloud pixels, and (3) thick cloud pixels; thus there are nine possible outcomes for the CDOC algorithm (Table 1). Kohavi and Provost (1998) define accuracy as the sum of correct classifications made by the algorithm, i.e. (TC11 + TC22 + TC33), divided by the sum of all categories. This metric is independent of the number of clear, thin, or thick clouds observed and will be used for evaluating the performance of our algorithm and to determine the threshold values. For the CDOC algorithms based on Diff and Prt, we generated confusion matrices with and without the application of the HCF (Tables 2, 3).
The cloud decision algorithm based on Diff outperforms the algorithm based on Prt. The Diff algorithm has a high accuracy in classifying thick cloud and clear sky pixels. However, the accuracy is smaller for pixels with thin clouds. The HCF improved the Diff thin pixel accuracy by 5 points (Table 2). Especially noteworthy is the very low likelihood of Diff clear/thick cloud confusion; less than 2 % of clear pixels were classified as thick clouds and less than 3 % of thick clouds were classified as clear. The low accuracy for thin clouds is at least partly related to biases in the manual classification due to human error by the observer; visually it is hard to delineate the "cloud edges" of thin clouds. Moreover, thin clouds usually have gaps of clear sky and do not have uniform textures. This is reflected in the overlap in Diff and Prt values between the classes that is evident in the PDFs (Figs. 7, 8). Thus, with a fixed threshold we are bound to misclassify pixels that lie in the overlap region. We will base our cloud decision algorithm on Diff_HCF as it has yielded more accurate results.
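The accuracy metric of Kohavi and Provost (1998) is simply the trace of the confusion matrix over the total pixel count; a minimal sketch with made-up labels (not the paper's data):

```python
import numpy as np

def confusion_matrix(true, pred, n=3):
    """Row = true class, column = predicted class (classes numbered 1..n)."""
    cm = np.zeros((n, n), dtype=int)
    for t, p in zip(true, pred):
        cm[t - 1, p - 1] += 1
    return cm

def overall_accuracy(cm):
    """(TC11 + TC22 + TC33) / total: diagonal sum over all outcomes."""
    return np.trace(cm) / cm.sum()

true = [1, 1, 2, 3, 3, 3]   # manual labels (illustrative)
pred = [1, 2, 2, 3, 3, 1]   # algorithm output (illustrative)
cm = confusion_matrix(true, pred)
print(overall_accuracy(cm))  # 4/6 ≈ 0.667
```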
Table 1. Confusion matrix for CDOC. For example, true class 11 (TC11) denotes the percentage of clear pixels that were correctly classified as clear pixels, false class 12 (FC12) denotes the percentage of clear pixels that were classified as thin cloud pixels, false class 13 (FC13) denotes the percentage of clear pixels that were classified as thick cloud pixels, false class 21 (FC21) denotes the percentage of thin cloud pixels that were classified as clear pixels, and so on.

While the results in Tables 2 and 3 were obtained for challenging conditions with a mixture of cloud types, the CDOC algorithm has high accuracies for classifying clear sky images, as will be shown in the next section (note, however, that much of the more challenging circumsolar region is not considered due to the shadowband, Fig. 4). Thus, manual inspection is only required for one or two clear days to initialize the CSL. Afterwards, days during which no clouds are detected by the algorithm can readily be added to the CSL.

Validation set
In order to evaluate the performance of the CDOC algorithm, an "out-of-sample" set of 30 manually annotated images was chosen. To avoid biasing the selection of images towards particular sky conditions, we used images collected within 30 min of solar noon for 12 to 16 February and 16 to 20 April. These periods were chosen because, at this site, they represented a large range in aerosol content: 0.1-0.13 in the April set compared to 0.017-0.059 in the February set. Reviewing the algorithm's classification accuracies at different SZAs (39-65°) in the training set at a different site (not shown), there was no considerable change in accuracy with SZA. Thus, images captured around solar noon were chosen for the validation set.
Table 4 shows the CDOC performance metrics by image, and certain images from the set are illustrated in Fig. 9. For overcast skies (100 % thick; Table 4, Images 7-10, 13-15, 25, 29; Fig. 9a, b), the CDOC algorithm accurately classified more than 95 % of the pixels. However, in some cases even after the lower thick cloud threshold was applied in the solar region (Sect. 3.3), thick cloud pixels were incorrectly classified as thin clouds due to the high RBR of the CSL (Fig. 4c).
In the case of clear skies, the algorithm was on average over 99 % accurate (Table 4, Images 1-6, 16-18, 22; Fig. 9c, d), but a few pixels were misclassified as thin clouds due to objects or corrosion present on the mirror, especially in the solar region.
For the case of skies with few thin clouds (Table 4, Image 12; Fig. 9e, f), 91 % of pixels were correctly classified. Discrepancies between visual and automated classification can be explained by inaccuracies of the visual classifier. For broken skies with a mixture of thick and thin clouds (Fig. 9g, h, i, j) the algorithm performed close to that of manual classification (Table 4, Images 19, 24).
A confusion matrix was generated to determine the performance of the algorithm for the validation set (Table 5). We note that the accuracy for clear sky pixel identification and thick cloud pixel identification is higher than that of the training sample (Table 2) because for randomly chosen images there are more cases of completely or predominantly clear or overcast images, which simplify CDOC (Table 4). Thin cloud pixels have a lower classification accuracy compared to the other classes as their Diff_HCF value falls into a transition region between clear sky and thick cloud pixels, which creates difficulties as discussed in Sect. 4.1.

Comparison to fixed thresholding technique
We compared the CDOC algorithm against classifying the RBR image based on fixed uniform thresholds used in the original TSI algorithm (Long et al., 2006). The TSI algorithm now shipped with the instrument fits a predetermined function to vary the clear/thin/thick RBR thresholds across the image depending on the sun-pixel distance. To apply the technique, we first used the training set to determine the optimal RBR thresholds yielding the highest accuracies for the three classes (Table 6). Then, we applied these thresholds to the validation set (Table 7).
Comparing both methods, we see that the CDOC method is superior to the fixed threshold method, as higher accuracies were obtained for all three classes for both the training and validation sets. That is, in the validation set for clear, thin, and thick cloud, respectively, the CDOC algorithm provided 96.0 %, 60.0 %, and 96.3 % accuracy, as compared with 89.3 %, 56.1 %, and 91.5 %, even though we had optimized the fixed RBR thresholds. One of the shortcomings of the fixed threshold method is that the threshold values need to be modified throughout the year to account for changes in AOD and degradation and/or soiling of the mirror. Also, the classification accuracy is sensitive to the thresholds chosen.
Our CDOC algorithm, on the other hand, classifies pixels by comparing them to a CSL that is modified through the year to account for changes in aerosol content, solar position and instrument degradation.

Conclusions
The purpose of this study was to develop a methodology to automatically classify clear sky, thin cloud, and thick cloud image pixels obtained from a ground-based sky imager. This method was applied to Total Sky Imager (TSI) imagery. The red-blue ratio (RBR) on cloudless days was shown to be well correlated with aerosol optical depth (AOD). As a result, a haze correction factor (HCF) was introduced to account for AOD effects on the RBR. By applying the correction factor we were able to better distinguish between haze and thin clouds in the atmosphere. In order to classify the images we compared each pixel's RBR to the corresponding clear sky RBR that was auto-calibrated using the HCF. CDOC was found to be more accurate when based on the difference in RBR from a clear sky RBR rather than the ratio of RBR to clear sky RBR. Comparing automated and visually classified images, the algorithm was found to be very accurate in classifying thick cloud and clear sky pixels in a variety of sky conditions. Thin cloud pixel classification accuracy was lower, due to the small range of Diff values over which thin cloud is classified, as well as difficulties in marking and defining thin cloud boundaries.
The method developed provides a significant improvement over the TSI's original software in pixel classification accuracy. The haze correction factor avoids the need to constantly adjust the threshold values for cloud classification. This paper introduces a method of applying some aspects of the HCF approach developed by Shields et al. (1993, 2010) in a manner that can be applied with TSI instruments. We also found that our algorithm obtained better results using the difference between the image RBR and the CSL's RBR to identify pixels as thick, thin, or clear, rather than the ratio.
The CDOC algorithm will be implemented to improve short-term solar forecast accuracy by improving cloud detection as well as providing the added information of cloud opacity.

Fig. 1 .
Fig. 1. Scatter graph of RBR from a total sky imager versus AOD from AERONET for data on 35 clear days in January-June 2011 (dots); RBR is extracted from sun-pixel (scattering) angles of 35° to 45°. The line is a linear regression fit (Eq. 4).

Fig. 2 .
Fig. 2. Spherical mirror of the TSI and solar geometry. The thick black line shows the TSI's spherical mirror. The square on the left shows a particular pixel for illustration purposes. The sun-pixel angle is the angle between the solar direct beam and the pixel. The thin black circle is drawn through the pixel and denotes a line of constant Image Zenith Angle (IZA). IZA is the angle between the pixel and a vertical line through the center of the imager.

Figure 4c shows the CSL extracted from 12 February for SZA = 43°. Consistent with Fig. 3a, the CSL is fairly homogeneous across the image, with the exception of the solar region (small SPAs, large RBR) and large SPAs and IZAs.

Fig. 5 .
Fig. 5. Flow chart for determining the HCF, which is executed pixel by pixel. The first box represents the initialization of HCF, RBR, and CSL. Since the selection of clear pixels also depends on HCF (see the third box from the top), the HCF must be obtained iteratively. (i, j) denote the pixel number in the image.

Fig. 6 .
Fig. 6. Flowchart of the CDOC algorithm, which is executed pixel by pixel. Note that thick clouds are determined based on Diff in Eq. (5), while the distinction between clear sky and thin clouds is based on Diff_HCF in Eq. (6). A similar process is applied for the perturbation ratio (Eqs. 7 and 8).

Fig. 9 .
Fig. 9. Total sky image (a, c, e, g, i) and CDOC (b, d, f, h, j) for: (1) overcast skies (a, b) taken on 20 April 2011 and corresponding to image 15 in Table 4; (2) clear skies (c, d) taken on 12 February 2011 and corresponding to image 16 in Table 4; (3) few thin clouds (e, f) taken on 19 April 2011 and corresponding to image 12 in Table 4; (4) partly cloudy skies (g, h) and (i, j) taken on 13 February 2011 and 14 February 2011, corresponding to images 19 and 24 in Table 4, respectively. For the classification images, a value of 3 on the color scale represents thick clouds, 2 represents thin clouds, and 1 represents clear skies.

Table 2 .
Confusion matrix for CDOC of the training set based on the Diff and Diff_HCF metrics (Eqs. 5 and 6, respectively). All values are in [%].

Table 3 .
Confusion matrix for CDOC of the training set based on the Prt metric (Eqs. 7 and 8, respectively). All values are in [%].

Table 4 .
Results of manual classification and the CDOC algorithm for the validation images, as well as AOD measurements at 500 nm averaged over the time period of the sky images. Note that for overcast skies (18-20 April) there are no AOD measurements.

Table 5 .
Confusion matrix for CDOC of the validation set based on the Diff_HCF metric (Eq. 6). All values are in [%] and add up to 100 % across rows.

Table 6 .
Confusion matrix for CDOC of the training set based on fixed RBR thresholds. All values are in [%] and add up to 100 % across rows.

Table 7 .
Confusion matrix for CDOC of the validation set based on fixed RBR thresholds. All values are in [%] and add up to 100 % across rows.