Real time automatic cloud detection using a low-cost sky camera

Characterizing the atmosphere is one of the most complex studies to undertake due to its non-linearity and phenomenological variability. Clouds are amongst the most variable atmospheric constituents, changing their size and shape over short periods of time. There are several sectors in which the study of cloudiness is of vital importance. In the renewables field, the increasing development of solar technology and the emerging trend for constructing and operating solar plants across the earth's surface require very precise control systems that provide optimal energy production management. Likewise, airports are hubs where high-precision periodic observations of cloud coverage are required to inform airport operators about the state of the atmosphere. This work presents an autonomous, real-time cloud detection system based on the digital image processing of a low-cost sky camera. The system's overall success rate is approximately 94% for all sky conditions.

Mobotix Q24 camera installed at the University of Almería.
The facility has a Mediterranean climate with a high maritime aerosol presence. The images produced are high resolution, captured by a fully digital color CMOS sensor (2048 x 1536 pixels). One image is recorded every minute in JPEG format, an optimal interval for identifying clouds in the sky. The three distinct channels represent the red, green and blue levels. Each image pixel channel is made up of 8 bits, taking values between 0 and 255.
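As a minimal illustration of the image format described above, the three 8-bit channels can be handled as a NumPy array. A synthetic frame stands in for a real Mobotix capture, which would instead be decoded from the recorded JPEG:

```python
import numpy as np

# Synthetic stand-in for one 2048 x 1536 sky-camera frame: three 8-bit
# channels (R, G, B) whose values lie between 0 and 255.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(1536, 2048, 3), dtype=np.uint8)

# Split into the three color channels discussed in the text.
r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
print(frame.shape, frame.dtype, int(r.min()), int(r.max()))
```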
For this work, images were selected from all possible sky types, spanning the earliest times of the day to the latest, just before sunset, and at different times of the year. The period studied was from 2013 to 2019. The following describes the methodology and the steps involved in processing an image. All developments were carried out in the Matlab environment because it is an optimal platform for operating with matrices very quickly; this efficiency is the main reason we used this software for the methodology. As the starting point, one raw image is presented in Figure 2, in which the sky is represented as a common image. Here, we see the circular representation of the sky appearing over a black background. Therefore, the first step is to determine the area of interest from the raw image by applying a white mask, thus obtaining the image in Figure 3, where the sky is perfectly identified in the circular area. After applying this customized mask, the image is ready to be used in the developed cloud detection algorithm. The Matlab environment allows us to change the color space of the images so that we can study specific properties. This is the case with the HSV (Hue, Saturation, Value), NTSC and YCbCr color spaces. The first gives information about the gray color scale, where pixel values vary between 0 and 1. In the second, the first component, luminance, represents grayscale information, and the last two components make up the chrominance (color information). The last one is used to digitally encode the color information in computing systems: Y represents the brightness of the color, Cb is the blue component relative to the green component and Cr is the red component relative to the green component. Figure 4 shows the image presented beforehand in the three color spaces. The different colors of the three images represent the main image characteristics.
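A sketch of the masking and color-space steps could look like the following. The disc centre and radius are assumed placeholders (the paper's actual calibration is not reproduced), and the standard-library `colorsys` conversion stands in for Matlab's `rgb2hsv`:

```python
import colorsys
import numpy as np

# Illustrative geometry for the circular sky area; not the paper's values.
h, w = 1536, 2048
cy, cx, radius = h // 2, w // 2, 700

# Boolean mask: True inside the circular sky disc, False outside.
yy, xx = np.ogrid[:h, :w]
inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2

frame = np.zeros((h, w, 3), dtype=np.uint8)
frame[inside] = (120, 160, 220)   # bluish "sky" fill for the demo
frame[~inside] = 255              # white mask outside the sky area

# Per-pixel RGB -> HSV conversion with the standard library (values in [0, 1]).
r, g, b = frame[cy, cx] / 255.0
hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
print(round(hue, 3), round(sat, 3), round(val, 3))
```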
Focusing solely on the inside of the circle, the blue color identifies the more saturated areas of the HSV image. These areas are represented by red and yellow in the NTSC image, and by orange, rose and red in the YCbCr image. Normally, in the NTSC and YCbCr color spaces, the pixel values acquire inflexible static values in each channel, whereas in the HSV color space the pixels (represented by the three bands/channels) provide values with better precision for the purpose of cloud detection. Moreover, clouds are not perfectly represented in all three color spaces, so it is important to define the most significant color space to work with. Given that the HSV color space represents cloudy pixels better, and more clearly, it has been used together with the RGB color space to identify clouds in the image processing procedure. For the complete procedure, the developed algorithm was structured into different parts, as described in the following subsections. Different tables define the specific criteria for image processing. Each table was built following the purpose of its subsection, fitting the intensity levels of the channels to obtain a precise detection of the zones (clouds or sky). To that end, different images taken at different times of the day and of the year were analysed. Afterwards, a general initial state was assumed so as to be precise in the adjustment of the intensity values. Normally, each table starts by analyzing the R value and comparing it with the G and B channels. If the intensities of these channels lie within particular ranges, the HSV channels are also analyzed. Therefore, this algorithm was formed gradually, part by part, with the parts remaining connected and sequential, in the same order as they appear in the manuscript.
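The concrete thresholds live in the tables that follow and are not reproduced in this text; the sketch below (all numeric bounds are placeholders) only illustrates the sequential structure just described, in which the RGB comparison gates the HSV check:

```python
# Illustrative only: placeholder thresholds, not the values from the tables.
# First the R channel is compared with G and B; only if those intensities
# fall in the expected band are the HSV channels examined.
def classify_pixel(r, g, b, h, s, v):
    if r > 100 and abs(r - g) < 30 and abs(r - b) < 30:   # RGB pre-filter
        if s < 0.25 and v > 0.6:                          # HSV confirmation
            return "cloud"
    return "sky"

# A grey-white pixel passes both gates; a saturated blue pixel fails the first.
print(classify_pixel(200, 195, 190, 0.1, 0.05, 0.78),
      classify_pixel(70, 120, 220, 0.6, 0.68, 0.86))
```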
3.1 Recognition of the solar area: classification of pixels

The first step for detecting clouds in the whole sky image is to determine the solar area. Being able to recognize this area is fundamental for establishing the Sun's position in the image. To track the solar altitude angle each minute, the Cartesian coordinates are obtained, with the south being represented by the center-bottom pixels and the east by the center-right pixels.

Subsequently, the original image (in JPG format), defined by the RGB color space, is also converted into the HSV color space.
As seen in the previous images, the sun appears as bright pixels, so one needs to consider the position of the pixels to determine the bright solar pixels. To do this, after locating the sun pixel, a matrix is created to determine the distance of the other pixels from it. This operation allows us to classify whether a bright pixel is a 'solar pixel' or not (based on its position). As a general rule, when the values of the red, green and blue channels are all greater than 160, the pixel is identified as being in the sun area. Figure 5 shows the general detection of the sun pixels. The main step consists of applying a green mask to the pixels placed in the sun area. After that, the idea is to detect whether these pixels are cloudless or overcast. Table 1 shows the rules for determining cloudless pixels in the solar area.
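The bright-pixel rule and the distance matrix can be sketched as follows; the 30-pixel distance threshold and the toy frame are illustrative assumptions, not the paper's calibrated values:

```python
import numpy as np

# A pixel is treated as part of the sun area when R, G and B all exceed 160
# (the rule stated in the text) and it lies close enough to the located sun
# pixel. The distance threshold is an illustrative assumption.
def solar_pixels(frame, sun_yx, max_dist):
    h, w, _ = frame.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - sun_yx[0]) ** 2 + (xx - sun_yx[1]) ** 2)
    bright = (frame > 160).all(axis=-1)   # R, G and B all above 160
    return bright & (dist < max_dist)

frame = np.zeros((100, 100, 3), dtype=np.uint8)
frame[40:60, 40:60] = 250   # bright blob standing in for the sun
frame[0:5, 0:5] = 250       # bright but far away: not a solar pixel
mask = solar_pixels(frame, sun_yx=(50, 50), max_dist=30)
print(bool(mask[50, 50]), bool(mask[2, 2]), int(mask.sum()))
```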
Different strategies are carried out to determine cloudless pixels in the sun area according to the pixel intensity in each image channel. Figure 6 shows the general detection of cloudless pixels in the sun area (represented in red) after this filter has been applied.
Subsequently, the algorithm looks for cloudy pixels in the same area in case some clouds are present. Table 2 shows the condition for classifying the pixels in the solar area as cloudy. Only one rule is applied for detecting cloudy pixels in the solar area. In these situations, when a cloud is identified by means of a pixel, the mask applied is also green. When the solar area has been fully treated, the algorithm focuses on the rest of the image, starting with the solar area periphery. Table 2. Criteria for selecting cloudy pixels in the solar area.

3.2 Detection of bright zones around the solar area
The pixels located around the solar area have an intermediate bright characteristic. In other words, the pixels present values lower than the solar area pixels and higher than those in the rest of the image. The size of this area varies according to the day and the atmospheric conditions at each moment. Table 3 shows the adjusted criteria for determining these pixels.
Table 3. Criteria for detecting bright pixels around the solar area.
One of the most important tasks is to locate each pixel. The Dis variable almost always appears because the pixel's placement is very important in this process. Therefore, to distinguish previously classified areas in subsequent processes, the yellow color is used to mark the new area (Fig. 7).
The new pixels classified as yellow do not represent a homogeneous area; they are dispersed across the image, but at a distance of less than 650 units from the central solar pixel. In this new preprocessing, there are gaps between the yellow and green pixels that still need to be classified. With the solar and surrounding areas processed, the algorithm looks for cloudy pixels in the rest of the image.

3.3 Detection of cloudy pixels in the rest of the image
In general, clouds present several characteristics that allow us to identify the most common cloud types (white or extremely dark clouds). Table 4 shows the general pattern for detecting the clouds in the complete image by characterizing the digital levels of these common cloud types.
If a cloudy pixel is detected, it is marked in white. There are many cases in which some pixels are identified as cloudy although no clouds are present in the sky. This is caused by the similarity in the range of channel values, whereby dark skies can be confused with dark clouds. However, this mistake can be remedied during the algorithm's subsequent steps. An example is presented in Figure 8, where a few pixels are classified as cloudy in white.
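Table 4's exact digital levels are not reproduced in this text; as a hedged stand-in, the widely used red-to-blue ratio test captures the same intuition (cloud pixels are near-achromatic, clear sky is strongly blue). The 0.75 threshold is an assumption for illustration:

```python
import numpy as np

# Stand-in for the Table 4 criteria: cloud pixels have R close to B
# (near-achromatic), while clear-sky pixels are dominated by the blue channel.
def cloudy_mask(frame, ratio_threshold=0.75):
    r = frame[..., 0].astype(float)
    b = frame[..., 2].astype(float) + 1e-6   # avoid division by zero
    return (r / b) > ratio_threshold

sky = np.array([[[70, 120, 220]]], dtype=np.uint8)     # deep blue: clear sky
cloud = np.array([[[200, 205, 210]]], dtype=np.uint8)  # grey-white: cloud
print(bool(cloudy_mask(sky)[0, 0]), bool(cloudy_mask(cloud)[0, 0]))
```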

Only a few pixels are classified as cloudy near the sun area. The first picture for this day showed no clouds in the image, so no cloudy pixels should have been generated. In spite of this, a few pixels are interpreted as cloud. Once the solar area pixels and cloudy pixels have been evaluated, the process continues with the pixels that remain unclassified.

3.4 Detection of cloudless pixels in the image excluding the solar area
After the solar area has been classified, the rest of the image is analyzed to identify whether a pixel represents a cloud or not. Table 5 presents the set of rules implemented to detect the cloud-free pixels in the parts of the image not including the solar area.
One can see that the Dis variable was not used, even though the criteria identify pixels that lie outside the solar area. This is because, for cloudy pixels, the digital pixel levels never fall within the ranges shown in the table. For this reason, it was not necessary to include the aforementioned variable in the rules used (Fig. 9). In the image, the sun area and the surrounding area have been processed, along with a small part of the remaining image. Therefore, at this point in the algorithm, it is possible that a large part of the image has still not been processed. Consequently, a further step is necessary to conclude the algorithm and classify all the pixels.

3.5 Determination of non-classified pixels
The final steps for classifying the pixels in a complete image establish a statistical criterion which depends on the pixels that have already been classified. Knowing the number of pixels of each color, we determine those pixels that do not yet have a label.
To do this, there are different strategies for establishing the classification criteria for these, as yet, unclassified pixels. Table 6 shows the steps to determine whether the pixels should be classified as cloudless; if not, they will be classified as cloudy. In the table, different expressions appear. SkyPixels are pixels that have been classified as cloudless, whereas CloudPixels are those that have been labelled as cloudy. Red, green and yellow pixels have been obtained in the previous processes, and NonClass is used to refer to the pixels that remain unclassified. For these operations, the Matlab environment allows us to perform matrix operations in an efficient way. This part of the algorithm results in a matrix in which all the pixels have been labelled, as shown in Figure 10.
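Table 6's exact rules are not shown here; the toy sketch below assumes a simple majority criterion over the already-classified pixel counts (SkyPixels vs. CloudPixels) to resolve the NonClass pixels, which mirrors the counting idea described above:

```python
import numpy as np

# Toy label matrix standing in for the partially classified image; the
# majority rule used to resolve "none" pixels is an assumed simplification.
labels = np.array([
    ["sky", "sky", "none"],
    ["cloud", "sky", "none"],
    ["sky", "cloud", "sky"],
])
sky_pixels = int((labels == "sky").sum())      # SkyPixels count
cloud_pixels = int((labels == "cloud").sum())  # CloudPixels count
fill = "sky" if sky_pixels >= cloud_pixels else "cloud"
resolved = np.where(labels == "none", fill, labels)
print(sky_pixels, cloud_pixels, fill)
```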
As one can see, all the pixels have been assigned a color: blue, yellow, red or green. Now, the mission is to finish the classification process according to a common criterion.

3.6 Final step in the sky cam image classification
To finish the sky cam image processing, a final step is needed in which the differently colored pixels are converted to determine whether they are cloudless or cloudy. This process was defined following the experience gathered from working with a great number of images and scenarios. Table 7 shows the specific criteria for assigning the final pixel classification and, consequently, the final processed image. An error in one of the criteria presented in the tables would mean an error in the cloud detection; the image would then not have a valid processing result and would be identified as wrong. Table 7. Criteria for the last classification of pixels in the sky cam image processing.

4 Results
In this section, we present the results of the cloud detection algorithm. In order to analyze the behavior of the software developed under different sky conditions, this section presents several pictures from various sky scenarios. A total of 850 images were taken from 2013 to 2019 at different times (from sunrise to sunset). The images were processed with the analytical objective of obtaining an accurate identification of clear sky and clouds. Therefore, this section is divided into subsections as follows.

4.1 Sky images processed under all sky conditions
To analyze the quality of the developed model, several images have been processed and studied. In general, the processed image should represent, on visual inspection, the most important clouds appearing in the original image. Important clouds are those that can be identified clearly (not merely by a few pixels). As examples, the following figures present the image processing procedure carried out by the algorithm described in the previous section. Figure 12 shows two randomly chosen examples in which clear, cloudless skies appear.
In the case of cloudless skies, the sun can vary in form and size depending on the solar altitude angle. The algorithm takes the solar altitude angle into account to identify clouds based on the variability in pixel intensity according to the sun position.
Specifically, the sun is the key intensity point of the pixel values, and the solar position determines a pathway for performing the cloud recognition. Nevertheless, for the two cloudless days represented, the sky is free of clouds, as shown in the images marked with the letter H. As one can observe, for each original image (marked with the letter A), a sequence of images appears showing the steps the algorithm takes to finally obtain the image identifying the sky and cloudy pixels. In these cases, no cloud was detected and, therefore, the final images are completely blue. When the sky is not completely free of clouds, one can have a sky that is either partially or completely covered. In the first case, when observing a sky camera image, varying portions of sky and cloud may appear. Therefore, it is important to determine the boundaries between the cloud and the sky as effectively as possible. Figure 13 shows two different partially cloudy situations. In the first sequence of images (the top line), the algorithm ends up classifying the green and yellow pixels as cloud. In the other sequence (the bottom line), the area detected in green turns red where the sun appears. This is because the algorithm's established criteria for identifying the pixels as cloud have not been met; therefore, they are marked in red. Following this, the clouds are optimally detected in image E and blue sky is detected in image F. Finally, the image is resolved, classifying the pixels in red, green and yellow as cloudless. It is notable how the clouds generated significant brightness in the solar area, mistakenly classifying the solar area pixels as clouds.
Figure 14 shows two cases in which virtually the entire image is covered by clouds. The top sequence of images shows a day when there was a lot of cloudiness and only small portions of sky. As the algorithm is executed, it is interesting to see how no red zone has been detected (attributing the solar area as cloudless) because the clouds in this case have a profile that is perfectly identified by the set of rules presented in the previous tables. Here, the breaking clouds have been correctly classified in image F. To conclude, image H shows the result of the process, with the identification and classification of all the processed pixels being virtually identical to the original image. The bottom sequence of images shows another day with more clouds in the image and, as one can observe, again no red pixels have appeared. Following the steps described in the previous sections, the image processing very precisely determines the areas of blue sky and clouds (image H).

4.2 Statistical results and comparison with TSI-880
In order to make a statistical evaluation of the developed model's efficacy, we used a model that is already established and published (Alonso et al., 2014a, b); this works with images from a sky camera with a rotating shadow band (the TSI-880 model) installed on the CIESOL building, providing a hemispheric view of the sky (fish-eye vision). In short, the TSI-880 camera model is based on a sky classification using direct, diffuse and global radiation data, with the sky classified as clear, partially covered or covered to support the cloud identification. An example of a partially cloudy day is presented in Figure 15, at a time shortly after sunrise.
In Equation 1, two variables appear: Successes and Total cases; the success rate is obtained as Success rate (%) = (Successes / Total cases) × 100. The first variable counts the processed images with practically perfect cloud identification. Almost perfect means that the processed image adequately represents what appears in the original image (whether from the TSI-880 camera or from the Mobotix camera). Thus, a hit is a final processed image that matches the original. The total number of cases is the total number of images analysed. The evaluation is done visually, since there is no tool capable of measuring the difference between a raw and a processed image (hence the importance and value of having a cloud detection algorithm). In this sense, the reference is always the original image: any processed image should resemble it.
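Equation 1 reduces to a one-line computation. For instance (the counts below are purely illustrative, not the study's actual figures), 47 hits in 50 analysed images give a 94.0 % success rate:

```python
# Success rate as defined in Equation 1: the percentage of processed images
# whose cloud identification visually matches the original image.
def success_rate(successes, total_cases):
    return 100.0 * successes / total_cases

print(success_rate(47, 50))  # prints 94.0
```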
Once the function was defined, Figure 16 shows the image processing efficacy, comparing the Mobotix Q24 camera to the TSI-880 camera in terms of the sky classification.
Figure 16. Graphic representation of the image processing success rates for cloud detection using the different sky cams and algorithms, according to sky conditions.
The results have been divided into three basic groups: one representing the probabilistic results for clear skies, another for partially covered skies and the third for covered skies. In each group, a bar represents the success rate (%). As can be seen, all the presented results are above 80%. The best results were obtained for clear skies, where the two cameras had the same success rate, at 98.8%. For partially covered skies, the Q24 camera provided better results than the TSI-880, with a success rate of 90.6% versus 87.1% for the TSI-880. In the case of overcast skies, the Mobotix camera again had a higher success rate, at 87.1%, while the TSI-880 camera reached 84.4%. In general, we can say that the cameras had very similar success rates despite the slight differences found on days with clouds. Figure 17 shows a comparative graph of the overall hits in cloud detection.
As can be seen from the graph, the two cameras had very similar values overall; the cloud detection image processing for the Mobotix camera had a success rate of 93.7% while the TSI-880 camera had a value of 92.3%.

In addition, it should be noted that the TSI-880 camera requires a high level of maintenance to ensure the optimal quality of the images taken; this is due to its special design, in which the glass is a rotating dome that must be cleaned periodically, taking care not to scratch it. Moreover, the glass is rotated by a motor that needs to be checked regularly in order to operate properly. In contrast, the Mobotix Q24 camera has dimensions similar to a surveillance camera, with only a small glass panel protecting the lens; this means that the maintenance requirements are reduced significantly. For our work, it was not necessary to clean the glass for several months, and the device still produced sharp, appropriate images allowing the algorithm to correctly identify the clouds present. Consequently, this article demonstrates that a new algorithm has been developed which is capable of offering the same performance as the TSI-880 camera without needing radiation measurements to perform the digital image processing, and requiring only minimal maintenance to acquire quality images for the cloud detection process.

This work presents a model to detect the cloudiness present in images in real time. It uses a low-cost Mobotix Q24 sky camera which only requires the digital levels of the image.
To detect clouds in the camera images, different areas of the image are differentiated. First, pixels are tagged in the solar area and its surroundings, being assigned red, green or yellow colors. Subsequently, the algorithm detects cloudy pixels in the rest of the image and then clear sky pixels. Finally, the tagged pixels are classified as cloud or sky, obtaining a final image that resembles the original. The cloud detection system developed has been compared to a published and referenced system that is also based on digital image levels but uses a TSI-880 camera. In general, the results are very similar for both models. Under all sky conditions, the system developed with the Mobotix Q24 camera presented a higher success rate (around 93%) than the TSI-880 camera (around 92%).
Under clear sky conditions, the processing of both cameras gave the same result (a 98% success rate). Under partially covered skies, the Mobotix camera performed better, with a success rate higher than 90% (the TSI-880 success rate was 87%). Under overcast skies, the Mobotix camera had a success rate of 87%, while the TSI camera's success rate was 3% lower.
One of the main advantages of the new system is that there is no need for direct, diffuse and global radiation data to perform the image processing (as is the case with the TSI-880); this greatly reduces costs, as it makes cloud detection possible using sky camera images alone. Another major advantage is the minimal maintenance required to clean the camera, meaning the system is almost autonomous and can automatically obtain high quality images in which the clouds can be defined optimally.
With this method, a new system is presented which combines the digital channels from a very low-cost sky camera. It can be installed in the control panel of any solar plant or airport, of whatever type. The system represents a new development in predicting cloud cover and solar radiation over the short term.
Moreover, this new development opens up the possibility of extrapolating the algorithm to other cameras. This task is probably neither easy nor direct, since each camera has its own optics, which makes it more difficult to adapt a custom-made algorithm to a camera with different optics. However, a novel idea has emerged: to adapt this new system to other cameras (with very slight modifications) and see whether it is possible to obtain hit rates in the same range as for the Mobotix camera. The modifications will be necessary because each lens has its own properties regarding saturation levels, exposure, etc. Perhaps it is possible to assume a correlation between the intensity of the pixel channels for different technologies.