Estimating Solar Irradiance Using Sky Imagers

Ground-based whole sky cameras are extensively used for localized monitoring of clouds nowadays. They capture hemispherical images of the sky at regular intervals using a fisheye lens. In this paper, we propose a framework for estimating solar irradiance from pictures taken by those imagers. Unlike pyranometers, such sky images contain information about cloud coverage and can be used to derive cloud movement. An accurate estimation of solar irradiance using solely those images is thus a first step towards short-term forecasting of solar energy generation based on cloud movement. We derive and validate our model using pyranometers co-located with our whole sky imagers. We achieve a better performance in estimating solar irradiance and in particular its short-term variations as compared to other related methods using ground-based observations.

Clouds have a significant impact on the amount of produced solar energy. Therefore, it is important to model the incoming solar radiation accurately. In this paper, we attempt to measure the rapid fluctuations of solar irradiance using ground-based sky cameras.
The analysis of clouds and several other atmospheric phenomena is traditionally done using satellite images. However, satellite images have either low temporal or low spatial resolution. A popular instrument is the Moderate-resolution Imaging Spectroradiometer (MODIS) (Pagano and Durham, 1993), which is on board the Terra and Aqua satellites and provides a large-scale view of cloud dynamics and various atmospheric phenomena. Data from MODIS are usually available only twice a day for a particular location, which is useful for a macro-analysis of cloud formation over the earth's surface. Another illustrative example of such satellite data is the HelioClim-1 database from the Global Earth Observation System of Systems (GEOSS) (Lautenbacher, 2006). It provides hourly and daily averages of the surface solar radiation received at ground level (Lefèvre et al., 2014). Ouarda et al. (2016) assessed the solar irradiance from six thermal channels of the Spinning Enhanced Visible and Infrared Imager (SEVIRI) instrument. However, these are temporal and spatial averages. Solar energy applications require knowledge of the solar irradiance at specific locations and at all times throughout the day. Therefore, images obtained from satellites are generally not conducive for continuous analysis and prediction, especially in regions where cloud formation is highly localized.

Related Work
Several existing works analyze ground-based images together with different meteorological observations. Most of them correlate the cloud coverage obtained from sky images with meteorologists' observations. Silva and Souza-Echer (2016) validated cloud coverage measurements obtained from a ground-based automatic imager against human observations for two meteorological stations in Brazil. Huo and Lu (2012) performed similar field experiments for three sites in China. The computation of such cloud coverage percentages is important in solar energy generation, since cloud coverage strongly affects the amount of solar radiation reaching a particular location.
The correct estimation of solar irradiance is particularly important in tropical countries like Singapore, where the amount of received solar irradiance is high. Rizwan et al. (2012) demonstrated that tropical countries are conducive for installing large central power stations powered by solar energy, because of the large amount of incident sunlight throughout the year.
Several attempts have been made to estimate the solar radiation from general meteorological measurements such as temperature, humidity and precipitation (Hargreaves and Samani, 1985; Donatelli and Campbell, 1998; Bristow and Campbell, 1984; Hunt et al., 1998). These existing models aim to provide global solar radiation using different sensors. Alsadi and Nassar (2017) demonstrated such estimation models from the perspective of a photovoltaic (PV) solar field, showing that succeeding rows of PV panels receive less solar radiation than the first row. They also provided an analytical solution by including the design parameters in the estimation model.
In addition to solar irradiance estimation, there have been several efforts in forecasting the solar irradiance with a lead time of a few minutes. Baharin et al. (2016) proposed a machine-learning forecast model for PV power output, using Malaysia as a case study. Similarly, Chu et al. (2015) used a reforecasting method to improve PV power output forecasts with lead times of 5, 10, and 15 minutes. Satellite images have also been used in the realm of solar analytics. Mueller et al. (2004) proposed a clear-sky model based on radiative transfer models obtained from Meteosat's atmospheric parameters. However, satellite data have lower temporal and spatial resolutions.
Recently, with the development of low-cost photogrammetric techniques, sky cameras are being deployed for such purposes. These sky cameras have both high temporal and high spatial resolution, and are able to provide more localized information about atmospheric events. Alonso-Montesinos and Batlles (2015) used sky cameras to quantify the total solar radiation. Yang and Chen (2015) studied solar irradiance variability using entropy and covariance. Dev et al. (2018a) used triple exponential smoothing for analyzing the seasonality of the solar irradiance. However, these approaches do not model the sharp short-term variations of solar radiation.

Outline
In this paper, we use images obtained from WSIs to accurately model the fluctuations of the solar radiation. There are several advantages of using a WSI for this instead of a pyranometer. Common weather stations generally use a solar sensor that measures the total solar irradiance. It is a point measurement providing scalar information for a particular location, and does not provide information on clouds and their evolution over time. On the other hand, the wide-angle view of a ground-based sky camera provides extensive information about the sky. It allows the tracking of clouds over successive image frames, and the prediction of their future location. In this paper, we attempt to model solar irradiance from sky images. This can also help in solar energy forecasting, which is useful in photovoltaic systems (Lorenz et al., 2009).
The main contributions of this paper are as follows:
- We develop a framework to accurately estimate and track the rapid fluctuations of solar irradiance;
- We propose a method for estimating solar irradiance using ground-based sky camera images;
- We conduct extensive benchmarking of our proposed method against other solar irradiance estimation models.
The rest of the paper is organized as follows. Section 2 describes our experimental setup that captures the sky/cloud images and collects other meteorological sensor data. Our framework for estimating solar irradiance is presented in Section 3. Section 4 discusses the evaluation of our approach, and its benchmarking with other existing solar estimation models. We discuss the possible applications of our approach in Section 5. We also point out a few limitations of our approach, and ways to address them. Section 6 concludes the paper.

Data Collection
Our experimental setup consists of weather stations and ground-based WSIs. These devices are co-located on the rooftop of our university building in Singapore (1.34°N, 103.68°E). They continuously capture various meteorological data, and we archive them for subsequent analysis.

Whole Sky Imager (WSI)
Commercial WSIs are available on the market. However, those imagers have a high cost, low image resolution, and little flexibility in operation. In our research group, we have designed and built custom low-cost high-resolution sky imagers, which we call WAHRSIS, i.e. Wide Angle High Resolution Sky Imaging System (Dev et al., 2014). A WAHRSIS imager essentially consists of a high-resolution digital single-lens reflex (DSLR) camera with a fish-eye lens and an on-board micro-computer. The entire device is contained inside a weather-proof box with a transparent dome for the camera. Over the years, we have built several versions of WAHRSIS (Dev et al., 2014, 2015). They are now deployed at several locations around our university campus, capturing images of the sky at regular intervals. Our WAHRSIS cameras are calibrated with respect to white balancing, geometric distortions and vignetting. The imaging system in WAHRSIS is modified so that it captures the near-infrared region of the spectrum. Hence, the red channel of the captured image is more prone to saturation, which renders the captured image reddish. Therefore, we employ custom white balancing in the camera to compensate for the alteration caused by the near-infrared capture. Figure 1 depicts captured images obtained with automatic and custom white balancing. We use the popular toolbox by Scaramuzza et al. (2006) for the geometric calibration of WAHRSIS. This process involves the computation of the intrinsic parameters of the camera. We use a black-and-white regular checkerboard pattern, and position it at various points around the camera. Figure 2 illustrates a few sample positions of the checkerboard in the calibration process.
Using the corner points and the known dimensions of the pattern, we can estimate the intrinsic parameters of the camera.
Finally, we apply vignetting correction to the images captured by our sky camera. Owing to the fish-eye nature of the lens, the area around the centre of the lens is brighter than at the sides. We use an integrating sphere to correct this variation of illumination. Figure 3 depicts an image captured inside an integrating sphere that provides a uniform illumination distribution in all directions. We use luminance characteristics from this reference image to correct the images captured by our sky camera.

Weather Station
In addition to the sky imagers, we have also installed co-located weather stations. We use Davis Instruments 7440 Weather Vantage Pro for our recordings. It measures rainfall, total solar radiation, temperature and pressure at intervals of 1 minute.
The resolution of the tipping-bucket rain gauge is 0.2 mm/tip.
It also includes a solar pyranometer measuring the total solar irradiance flux density in W/m². This consists of both direct and diffuse solar irradiance components. The solar sensor integrates the solar irradiance across all angles, and provides the total solar irradiance. On a clear day with no occluding clouds, the solar sensor response can be approximated by a typical cosine response, shown in Figure 4 for varying solar incident angles. The solar sensor reading is highest around noon, when the incident angle of the sun rays is at its minimum, whilst the reading is low during morning and evening hours.
The solar radiation under a clear sky can be modeled using the solar zenith angle and the earth's eccentricity. Several clear sky models have been developed for various regions. The best clear-sky model for Singapore is provided by Yang et al. (2012).
We performed a comparison of various clear sky models in Singapore (Dev et al., 2017), and found that this model performs best. The clear-sky irradiance is expressed as:

I_cs = 0.8277 E_0 I_sc (cos α)^1.3644 e^(−0.0013 (90 − α)),

where E_0 is the eccentricity correction factor for the earth, I_sc is the solar irradiance constant (1366.1 W/m²), and α is the solar zenith angle (measured in degrees). The factor E_0 is calculated as:

E_0 = 1.00011 + 0.034221 cos Γ + 0.00128 sin Γ + 0.000719 cos 2Γ + 0.000077 sin 2Γ,

where Γ = 2π(d_n − 1)/365 is the day angle (measured in radians) and d_n is the day number of the year.
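As an illustration, the clear-sky model above can be computed directly from the day number and the solar zenith angle. The sketch below uses the coefficients of the equations in this section; the function names are ours.

```python
import math

I_SC = 1366.1  # solar irradiance constant, in W/m^2


def eccentricity_correction(day_number):
    """Eccentricity correction factor E_0 from the day angle Gamma."""
    gamma = 2.0 * math.pi * (day_number - 1) / 365.0  # day angle, radians
    return (1.00011 + 0.034221 * math.cos(gamma) + 0.00128 * math.sin(gamma)
            + 0.000719 * math.cos(2 * gamma) + 0.000077 * math.sin(2 * gamma))


def clear_sky_irradiance(alpha_deg, day_number):
    """Clear-sky irradiance (W/m^2) for solar zenith angle alpha in degrees."""
    cos_a = math.cos(math.radians(alpha_deg))
    if cos_a <= 0.0:  # sun below the horizon
        return 0.0
    e0 = eccentricity_correction(day_number)
    return 0.8277 * e0 * I_SC * cos_a ** 1.3644 * math.exp(-0.0013 * (90.0 - alpha_deg))
```

For example, around noon (small zenith angle) the model yields values close to 1000 W/m², decaying towards sunrise and sunset as the zenith angle grows.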
As an illustration, we show the clear-sky radiation for the 1st of September 2016 in Figure 5, compared to the actual solar irradiance measured by our weather station. We also show the deviation of the measured solar radiation from the clear-sky model. We observe that there are rapid fluctuations in the measured readings. In our previous work (Dev et al., 2016b), we observed that these rapid fluctuations are caused by incoming clouds that obstruct the sun from direct view. Such information about the cloud profile and its formation cannot be obtained from a point-source solar recording. Therefore, we aim to model these rapid fluctuations in the measured solar radiation from wide-angle images captured by our sky cameras.

Modeling Solar Irradiance
This section presents our model for computing solar irradiance from images captured by a whole sky imager. We sample pixels using a cosine weighted hemispheric sampling to simulate the behavior of a pyranometer based on the fisheye camera lens.
We then compute the relative luminance using the image capturing parameters after gamma correction. We finally derive an empirical fitting function to scale the computed luminance estimates to match measured irradiance values.

Cosine Weighted Hemispheric Sampling
The behavior of our fisheye lens with focal length f is modeled by the equisolid equation r = 2f sin(θ/2), relating the distance r of any pixel from the center of the image to the angle θ of its incident light ray. This allows us to project a captured image onto the unit hemisphere, as shown in Figure 6.
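The equisolid projection can be sketched as follows. The function name is ours, and the image centre and pixel-unit focal length are illustrative; the actual values come from the geometric calibration described in Section 2.

```python
import math


def pixel_to_hemisphere(px, py, cx, cy, f):
    """Project an image pixel onto the unit hemisphere using the equisolid
    model r = 2 f sin(theta / 2).  (cx, cy) is the image centre and f the
    focal length, both in pixels."""
    dx, dy = px - cx, py - cy
    r = math.hypot(dx, dy)                             # distance from image centre
    theta = 2.0 * math.asin(min(r / (2.0 * f), 1.0))   # angle from the optical axis
    phi = math.atan2(dy, dx)                           # azimuth of the pixel
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
```

The centre pixel maps to the zenith (0, 0, 1), and pixels at radius r = f√2 map to the horizon, since there θ = π/2.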
The solar irradiance is composed of a direct component relating the sun light reaching the earth without interference, as well as diffuse and reflected components. Given the high resolution of our images, we consider randomly sampled pixel locations on the hemisphere as input to the luminance computation. We follow a cosine weighted hemispheric distribution function, the center of which is at the location of the sun. This is because clouds in the circumsolar region have the highest impact on the total solar irradiance received on the earth's surface (Dev et al., 2016b). We provide more emphasis to the clouds around the sun, as compared to those near the horizon. In our previous work (Dev et al., 2016c), we used a cloud mask around the sun to estimate the solar irradiance. However, this requires the additional step of optimizing the size of the cropped image for best results. Therefore, we adopt the strategy of cosine weighted hemispheric sampling.
The first step is to compute the sampled locations from the top of the unit hemispheric dome. Each location is computed as follows, using two uniform random numbers R_1 and R_2 (0 ≤ R_1, R_2 ≤ 1) as input:

(x, y, z) = (cos(2πR_1) √R_2, sin(2πR_1) √R_2, √(1 − R_2)).

This is represented in Figure 6.
The second step is to detect the location of the sun using a thresholding method. This is needed to align the center of the previously computed distribution (i.e. the top of the hemispheric dome) with the actual sun location on the unit sphere. We choose a threshold of 240 in the red channel R of the captured RGB image, and compute the centroid of the largest area above the threshold (Savoy et al., 2016). We then compute the rotation matrix transforming the z-axis unit vector into the unit vector pointing towards the sun. We apply this rotation to all the sampled points, as shown in Figure 6. As a result, the number of sampled points in a region of the hemisphere is proportional to the cosine of the angle between the sun direction and the direction to that region. We experimentally concluded that this achieves a good balance between all irradiance components.
We consider the pixel values of a total of 5000 points sampled using this method as input for the irradiance estimation.
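The two steps above can be sketched in NumPy as follows. The function names are ours, and Rodrigues' formula stands in for the rotation-matrix computation; the sampling formula is the one given above.

```python
import numpy as np


def cosine_weighted_samples(n, seed=0):
    """Draw n directions on the unit hemisphere with a cosine-weighted
    density about the zenith, from two uniform variates R1, R2 per sample."""
    rng = np.random.default_rng(seed)
    r1, r2 = rng.random(n), rng.random(n)
    phi = 2.0 * np.pi * r1                 # azimuth
    x = np.cos(phi) * np.sqrt(r2)
    y = np.sin(phi) * np.sqrt(r2)
    z = np.sqrt(1.0 - r2)                  # density proportional to cos(angle to zenith)
    return np.stack([x, y, z], axis=1)


def rotate_to_sun(points, sun_dir):
    """Rotate the samples so the distribution peak moves from the zenith to
    sun_dir, via Rodrigues' formula for the z-axis -> sun_dir rotation."""
    z = np.array([0.0, 0.0, 1.0])
    s = np.asarray(sun_dir, dtype=float)
    s = s / np.linalg.norm(s)
    v = np.cross(z, s)                     # rotation axis (unnormalized)
    c = float(np.dot(z, s))                # cosine of the rotation angle
    if np.linalg.norm(v) < 1e-12:          # sun at (or opposite) the zenith
        return points if c > 0 else -points
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    R = np.eye(3) + vx + vx @ vx / (1.0 + c)
    return points @ R.T
```

For cosine-weighted samples, the mean of the component along the distribution axis is 2/3, which is a convenient sanity check.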

Relative Luminance Calculation
For each of the n sampled pixels in the RGB image, we compute its luminance value Y_i as follows:

Y_i = 0.2126 R_i + 0.7152 G_i + 0.0722 B_i.

This formula is proposed in SMPTE Recommended Practice 177 (SMPTE, 1993) to compute the luminance of an image from its R, G and B values.
The JPEG compression format encodes images after applying a gamma correction. This non-linearity mimics the behavior of the human eye, and needs to be reversed in order to compute the irradiance. We use a gamma correction factor of 2.2, which is most commonly used in imaging systems (Poynton, 2003). We thus apply the following formula, assuming pixel values normalized between 0 and 255:

Y_i ← 255 (Y_i / 255)^2.2.

We then average the gamma-corrected luminance values of all n sampled points in the image, and denote the result by N = (1/n) Σ_{i=1}^{n} Y_i.
However, each image of the sky camera is captured with varying camera parameters such as ISO, F-number and shutter speed. These camera parameters can be read from the image metadata, and are useful to estimate the scene luminance. The brightness N of the sampled points is proportional to the number of photons hitting the camera sensor. This relationship between scene luminance and pixel brightness is linear (Hiscocks and Eng, 2011), and can be modeled using the camera parameters:

N = K_c (e_t S / f_s²) L_s,

where N is the average pixel value, K_c is a calibration constant, e_t the exposure time in seconds, f_s the aperture number, S the ISO sensitivity and L_s the luminance of the scene.
We can thus compute the relative luminance L_r, which is proportional to L_s up to the constant K_c, as follows:

L_r = N f_s² / (e_t S).
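The luminance pipeline of this section — the SMPTE weighted sum, the gamma linearization, averaging, and normalization by the exposure settings — can be summarized in a short sketch. The function name and the 8-bit input format are our assumptions.

```python
import numpy as np

GAMMA = 2.2  # standard gamma assumed by JPEG encoding


def relative_luminance(rgb_samples, exposure_time, f_number, iso):
    """Relative scene luminance from sampled 8-bit RGB pixels (shape (n, 3))
    and the exposure settings read from the image metadata."""
    rgb = np.asarray(rgb_samples, dtype=float)
    # SMPTE RP 177 luminance from the R, G, B values
    y = 0.2126 * rgb[:, 0] + 0.7152 * rgb[:, 1] + 0.0722 * rgb[:, 2]
    y = 255.0 * (y / 255.0) ** GAMMA       # undo the JPEG gamma correction
    n_bar = y.mean()                        # N: average sample luminance
    # invert N = K_c * (e_t * S / f_s^2) * L_s, up to the constant K_c
    return n_bar * f_number ** 2 / (exposure_time * iso)
```

Note that the result scales with the square of the F-number, so images captured at different apertures become comparable.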

Modeling Irradiance from Luminance Values
Using our hemispheric sampling and relative luminance computation, we obtain one relative luminance value L_r per image. We use this value to estimate the solar radiation. The usual sunrise time in Singapore is between 6:40am and 7:05am, and the sunset time is approximately between 6:50pm and 7:10pm local time, according to the National Environment Agency of Singapore (nea). Therefore, we consider images captured in the time interval from 7:00am till 7:00pm.
We use our ground-based whole sky images captured during the period from January 2016 till August 2016 to model the solar radiation. The solar irradiance is computed as the flux of radiant energy per unit area normal to the direction of flow. The first step in estimating irradiance from the luminance is thus to cosine-weight it according to its direction of flow. We weight our measurements according to the solar zenith angle α, based on the empirical evidence of our experiments on solar irradiance estimation. The modeled luminance L is expressed as:

L = L_r cos(α).

Let us assume that the actual solar radiation recorded by the weather station is S. We check the nearest weather station measurement for all the images captured by WAHRSIS between April 2016 and December 2016. Figure 7 shows the scatter plot between the image luminance and the solar radiation. The majority of the data follows a linear relationship between the two.
However, it deviates from linearity for higher values of luminance. This is mainly because the mapping between scene luminance and the obtained pixel value in the camera sensor becomes non-linear for large luminances. A more detailed discussion is provided in Section 5.
We attempt to fit a linear model and higher-order polynomial regressors to model the relationship between the image luminance from sky camera images and the measured solar radiation. Figure 7 shows the best-fit curves for several orders of polynomial functions. In order to provide an objective evaluation of the different models, we also compute the RMSE value between the actual and regressed values. Table 1 summarises the performance of the different polynomial orders. We observe that lower-order polynomials of degree 1 and 2 perform slightly worse than higher-order polynomials. However, higher-order polynomial models are ill-conditioned. Therefore, we choose the cubic model to describe the measured solar radiation S as a function of the image luminance L. This is based on the assumption that the mapping from scene luminance to pixel values in the captured image is linear for lower luminance values, and behaves in a non-linear fashion for higher luminance values. We use this selected model in all our subsequent discussions and evaluations. We model the solar radiation as S = a_3 L^3 + a_2 L^2 + a_1 L + a_0, with a_3 = −4.25e−12, a_2 = 3.96e−07, a_1 = 0.00397 and a_0 = 7.954 for our data. This model is derived specifically for an equatorial region like Singapore, and the regression constants are based on our WAHRSIS sky imaging system. They would have to be fine-tuned for other regions and imaging systems using our methodology. To facilitate this, we make the source code of all the simulations in this paper available online at https://github.com/Soumyabrata/estimate-solar-irradiance.
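The cubic model can be evaluated, or re-fitted to data from another site or camera, as sketched below. The function names are ours, np.polyfit stands in for whatever regression routine is preferred, and the coefficients are the ones reported above, valid only for our data.

```python
import numpy as np

# Coefficients (a3, a2, a1, a0) reported in the text; specific to our data
# and WAHRSIS setup, and to be re-fitted for other deployments.
COEFFS = [-4.25e-12, 3.96e-07, 0.00397, 7.954]


def estimate_irradiance(luminance):
    """Estimated solar radiation S (W/m^2) from the modeled luminance L."""
    return np.polyval(COEFFS, luminance)


def fit_cubic(luminance, irradiance):
    """Re-fit the cubic model on paired (luminance, measured irradiance) data."""
    return np.polyfit(luminance, irradiance, deg=3)
```

For example, a zero luminance maps to the intercept a_0 = 7.954 W/m², and re-fitting on data generated by the model itself recovers its predictions.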

Experimental Validation
In this section, we evaluate the accuracy of our proposed approach. Our model is derived from WAHRSIS images captured from January to August 2016, and we evaluate its accuracy on the same set of images. Furthermore, we benchmark our algorithm against other existing solar radiation estimation models.

Evaluation
One of the main advantages of our approach is that the rapid fluctuations of solar radiation can be accurately tracked from the image luminance. We illustrate this by providing the measured solar readings for the 1st of September 2016 in Figure 8. The clear-sky model follows a cosine response and is shown in black, while the measured solar recordings are shown in red. We normalize our computed luminance so that it matches the measured solar readings: we multiply each data point by a conversion factor such that the distance between corresponding inter-samples of luminance and weather station recordings is minimized (cf. Appendix A for details). This normalization maps the computed image luminance to a scale similar to that of the cosine clear-sky model. We observe that our computed luminance from the whole-sky image and the measured solar radiation closely follow each other. We emphasize that it is important to accurately track the rapid solar fluctuations. Unlike other solar estimation models based on meteorological sensor data, our proposed model can successfully estimate the peaks and troughs of solar readings. Using our proposed methodology, we compute the luminance of all the captured images. Subsequently, using our proposed cubic model, we estimate the corresponding solar radiation values. The estimated solar irradiance values are compared with the actual irradiance values obtained from the solar sensors in the co-located weather station, which serve as the ground-truth measurements. Figure 9 shows the histogram of differences between the estimated and actual solar radiation. We observe that the estimates do not deviate much from the actual solar radiation. Nearly half (

Benchmarking
We benchmark our proposed approach against other existing solar estimation models. To the best of our knowledge, there are no models that estimate short-term fluctuations of solar irradiance from ground-based images. However, remote sensing analysts have been using other meteorological sensor data, e.g. temperature, humidity, rainfall and dew point temperature, to estimate the daily solar irradiance. One of the pioneering works was done by Hargreaves and Samani (1985), who proposed a model based on daily temperature variations. Donatelli and Campbell (1998) improved the model by including clear-sky transmissivity as one of the factors. Bristow and Campbell (1984) proposed a model of solar radiation estimation that includes the atmospheric transmission coefficient. Subsequently, Hunt et al. (1998) showed that the solar estimation model can be further improved by incorporating precipitation data. We illustrate the various benchmarking models in Figure 10. Unfortunately, most of these approaches fail to capture the short-term variations of the solar radiation.
We calculate the Root Mean Square Error (RMSE) between the estimated and measured solar radiation, and Spearman's rank correlation coefficient, as evaluation metrics. The RMSE of an estimation algorithm represents the standard deviation of the differences between the actual and estimated solar radiation values. The Spearman correlation is a non-parametric measure characterizing the relationship between measured and estimated solar radiation; it does not assume that the underlying data are drawn from a normal distribution. We report both metrics in Table 2. Our proposed approach achieves the best results amongst all methods. Note that the training and testing sets of images are identical, and all images are considered for benchmarking purposes. To analyze possible selection bias, we choose a random selection of images as the training set, and fit our linear regressor on these selected training images.
The RMSE values are then calculated on these training images. We perform this analysis for varying percentages of training images. Each experiment is repeated 100 times to remove any selection bias.
Figure 11(a) shows the results on the training images. We observe that the variation of the RMSE values gradually decreases as we increase the number of training images. Moreover, we check the variation of the RMSE values when the test images are distinct from the training images: once we choose a random selection of images as the training set, the remaining images are considered as the test set. We show the RMSE on such images in Figure 11(b). As expected, the variation of the RMSE values increases with a higher percentage of training images. The linear regressor model overfits the data, and yields a higher variation in the error when tested on fewer test images. However, the average RMSE does not vary much in any case. Therefore, we conclude that our proposed model is free from selection bias, and generalizes well for random selections of training and testing images.
We show the scatter plots between the measured and estimated solar radiation for the different benchmarking algorithms in Figure 12. We observe that there is no strong correlation for most of these existing algorithms. This is because meteorological sensor data alone, without cloud information, cannot capture the sharp fluctuations of the solar radiation. This is an important limitation of these models, which we have attempted to address in this paper. Our model based on sky images has additional information about cloud movement and its evolution, which is the fundamental cause of rapid solar radiation fluctuations. In our proposed model, most of these short-term variations are captured (cf. Figure 8).

Short-term Forecasts
Our proposed approach estimates the solar radiation accurately, with the smallest RMSE compared to other models. The main advantage of our approach is that it can be used on predicted images as well, opening the potential for short-term solar irradiance forecasting, which is needed in the solar energy field. As an initial case study, we have exploited optical flow techniques to estimate the direction and flow of cloud motion vectors between two successive image frames. We use the (B − R)/(B + R) ratio channel of the sky/cloud image, where B and R are the blue and red channels respectively. We implement an optical flow technique (flo) that uses a conjugate gradient solver to obtain the flow field. Using the images captured at times t and t − 2 minutes, we estimate the horizontal and vertical translation. Under the assumption that the flow of cloud motion vectors for the next 2 minutes is similar to that of the previous frames, we estimate the frame at t + 2 minutes, and subsequently the frame at t + 4 minutes. Figure 14 illustrates this. We obtain a forecast accuracy of 70% for a prediction lead time of up to 6 minutes.
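The translation-and-advection idea can be sketched as follows. Phase correlation is used here as a simple stand-in for the conjugate-gradient optical flow solver used in our experiments, and the function names are ours; it estimates a single dominant shift between two ratio-channel frames and advects the latest frame accordingly.

```python
import numpy as np


def estimate_shift(prev_frame, curr_frame):
    """Dominant (dy, dx) translation between two frames via phase correlation."""
    f1 = np.fft.fft2(prev_frame)
    f2 = np.fft.fft2(curr_frame)
    cross = f2 * np.conj(f1)            # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # keep only the phase
    corr = np.fft.ifft2(cross).real     # peak at the translation offset
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    dy = dy - h if dy > h // 2 else dy  # map wrapped indices to signed shifts
    dx = dx - w if dx > w // 2 else dx
    return dy, dx


def forecast_frame(frame, dy, dx):
    """Persistence forecast: advect the latest frame by the estimated motion."""
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)
```

Applying the estimated shift once yields the t + 2 minutes frame, and applying it again yields the t + 4 minutes frame, mirroring the persistence assumption above.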
In future work, we plan to use our proposed methodology of estimating solar irradiance on this predicted sky/cloud image.
This will enable us to provide more stable and reliable forecasts of solar irradiance.

Scope for Improvement
There is still scope for improvement in our approach. First, we use JPEG images instead of uncompressed RAW images for the computation of the scene luminance. The JPEG compression algorithm introduces non-linearities in the pixel values, which affects the estimation of solar irradiance. We could generate more consistent results by using only RAW format images.
Nevertheless, we currently use JPEG images, as they have a significantly smaller size, which is more practical from an operational point of view. Uncompressed RAW images are much larger, which makes capturing and storing them at short intervals impractical given the processing requirements. In our future work, we intend to explore the use of RAW format images for the computation of solar irradiance values from sky cameras.
Secondly, our captured images span a wide range of camera settings, with varying shutter speed, ISO and aperture values. This is disadvantageous because the relationship between pixel value and camera aperture value becomes non-linear for larger F-numbers; the relationship deviates from linearity for F-numbers above 4.0 (Hiscocks and Eng, 2011). A significant percentage of the images used in deriving our proposed model have large F-numbers, where this non-linearity sets in. This can be solved by using the aperture priority mode of the sky camera, wherein the F-number is fixed and the exposure time varies dynamically to match the lighting conditions of the scene.

Conclusions & Future work
We presented a method for estimating the rapid fluctuations of the solar irradiance using the luminance of images taken by a whole sky imager. We are able to estimate the rapid short-term variations, which significantly improves on the state-of-the-art.
This approach is of interest for solar energy generation, because these variations cause a sudden decrease in the electricity output from solar panels. Short-term predictions of such ramp-downs are needed to maintain the stability of the power grid.
Combining our solar irradiance estimation approach with cloud movement tracking in the input images could ultimately lead to better irradiance predictions. Such information on rapid fluctuations of solar irradiance can assist in establishing a high-reliability solar energy generation system. We also plan to explore methodologies from time-series modelling (Dev et al., 2018a) to predict solar irradiance.

Code availability
The source code of all simulations in this paper is available at https://github.com/Soumyabrata/estimate-solar-irradiance.

Appendix A: Derivation of Normalization Factor
Let a_1, a_2, ..., a_t be the weather station records for t time stamps, and let b_1, b_2, ..., b_t be the luminance values computed for the corresponding time stamps. We attempt to estimate the conversion factor x such that the objective function f(x), representing the inter-sample distances between the weather station records and the computed luminance values, is minimized. We represent the objective function f(x) as:

f(x) = Σ_{i=1}^{t} (a_i − x b_i)².

We compute the derivative f′(x) and equate it to 0, and the normalization factor x is found as:

x = (Σ_{i=1}^{t} a_i b_i) / (Σ_{i=1}^{t} b_i²).
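This closed-form least-squares solution can be verified numerically with a few lines of code (the function name is ours):

```python
import numpy as np


def normalization_factor(a, b):
    """Closed-form minimizer of f(x) = sum_i (a_i - x * b_i)^2:
    x = sum(a_i * b_i) / sum(b_i^2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / np.dot(b, b))
```

When the weather station records are an exact multiple of the luminance values, the factor recovers that multiple, and small perturbations of x around the solution can only increase f(x).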