19 Mar 2021


Review status: this preprint is currently under review for the journal AMT.

Applying self-supervised learning for semantic cloud segmentation of all-sky images

Yann Fabel1, Bijan Nouri1, Stefan Wilbert1, Niklas Blum1, Rudolph Triebel2,4, Marcel Hasenbalg1, Pascal Kuhn6, Luis F. Zarzalejo5, and Robert Pitz-Paal3
  • 1German Aerospace Center (DLR), Institute of Solar Research, 04001 Almeria, Spain
  • 2German Aerospace Center (DLR), Institute of Robotics and Mechatronics, 82234, Oberpfaffenhofen-Weßling, Germany
  • 3German Aerospace Center (DLR), Institute of Solar Research, 51147 Cologne, Germany
  • 4Technical University of Munich, Chair of Computer Vision & Artificial Intelligence, 85748 Garching, Germany
  • 5CIEMAT Energy Department, Renewable Energy Division, 28040 Madrid, Spain
  • 6EnBW Energie Baden-Württemberg AG, 76131 Karlsruhe, Germany

Abstract. Semantic segmentation of ground-based all-sky images (ASIs) can provide high-resolution cloud coverage information for distinct cloud types, which is useful for meteorology, climatology and solar-energy applications. Since the shape and appearance of clouds are variable and different cloud types look highly similar, a clear classification is difficult. Therefore, most state-of-the-art methods focus on the distinction between cloudy and cloud-free pixels, without taking the cloud type into account. Cloud classification, on the other hand, is typically performed separately at the image level, neglecting the clouds' positions and considering only the prevailing cloud type. Deep neural networks have proven to be very effective and robust for segmentation tasks; however, they require large training datasets to learn complex visual features. In this work, we present a self-supervised learning approach that exploits much more data than purely supervised training and thus increases the model's performance. In the first step, we use about 300,000 ASIs in two different pretext tasks for pretraining. One of them pursues an image reconstruction approach. The other is based on the DeepCluster model, an iterative procedure of clustering and classifying the neural network output. In the second step, our model is fine-tuned on a small labeled dataset of 770 ASIs, of which 616 are used for training and 154 for validation. For each of them, a ground truth mask was created that classifies each pixel as clear sky, low-layer, mid-layer or high-layer cloud. To analyze the effectiveness of self-supervised pretraining, we compare our approach with random initialization and with ImageNet-pretrained weights, using the same training and validation sets. Achieving 85.8 % pixel-accuracy on average, our best self-supervised model outperforms the conventional approaches of random (78.3 %) and pretrained ImageNet initialization (82.1 %). The benefits become even more evident when regarding precision, recall and intersection over union (IoU) on the respective cloud classes, where the improvement is between 5 and 20 percentage points. Furthermore, we compare the performance of our best model on binary segmentation with a clear-sky library (CSL) from the literature. Our model outperforms the CSL by over 7 percentage points, reaching a pixel-accuracy of 95 %.
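
The evaluation metrics named in the abstract (pixel-accuracy, and per-class precision, recall and IoU) can all be derived from a pixel-wise confusion matrix. The following is a minimal sketch in plain Python, not the authors' evaluation code. The class indices (0 = clear sky, 1 = low-layer, 2 = mid-layer, 3 = high-layer), the function names and the toy masks are assumptions made for illustration only. The helper `to_binary` shows one plausible way to collapse the four classes into cloud versus clear sky for a binary comparison.

```python
from typing import Dict, List, Sequence

# Assumed class order for illustration; the abstract only names the classes.
CLASSES = ["clear sky", "low-layer", "mid-layer", "high-layer"]

Mask = Sequence[Sequence[int]]  # 2-D grid of class indices, one per pixel


def confusion_matrix(truth: Mask, pred: Mask, n_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def class_metrics(cm: List[List[int]], k: int) -> Dict[str, float]:
    """Precision, recall and IoU for class k (0.0 when a ratio is undefined)."""
    tp = cm[k][k]
    fp = sum(row[k] for row in cm) - tp  # predicted k, truth differs
    fn = sum(cm[k]) - tp                 # truth k, prediction differs

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


def to_binary(mask: Mask, clear_sky: int = 0) -> List[List[int]]:
    """Collapse all cloud classes into one label (1); clear sky becomes 0."""
    return [[0 if value == clear_sky else 1 for value in row] for row in mask]


# Toy 2x4 masks (hypothetical data, not taken from the paper).
truth = [[0, 0, 1, 1], [2, 2, 3, 3]]
pred = [[0, 1, 1, 1], [2, 3, 3, 3]]

cm = confusion_matrix(truth, pred, len(CLASSES))
low_layer = class_metrics(cm, 1)
binary_cm = confusion_matrix(to_binary(truth), to_binary(pred), 2)

print("pixel accuracy:", pixel_accuracy(cm))
print("low-layer metrics:", low_layer)
print("binary pixel accuracy:", pixel_accuracy(binary_cm))
```

In this sketch the metrics are computed for a single pair of masks. Whether the reported figures are averaged per image or pooled over all validation pixels is not stated in the abstract, so either aggregation would need to be checked against the full paper.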


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on amt-2021-1', Anonymous Referee #1, 23 Apr 2021
    • AC2: 'Reply on RC1', Yann Fabel, 11 Oct 2021
  • RC2: 'Comment on amt-2021-1', Anonymous Referee #2, 06 Sep 2021
    • AC1: 'Reply on RC2', Yann Fabel, 11 Oct 2021



Total article views: 641 (including HTML, PDF, and XML), calculated since 19 Mar 2021
  • HTML: 425
  • PDF: 195
  • XML: 21
  • Total: 641
  • BibTeX: 15
  • EndNote: 6



Latest update: 17 Oct 2021
Short summary
This work presents a new approach to exploit unlabeled image data from ground-based sky observations to train neural networks. We show that our model can detect cloud classes within images more accurately than models trained with conventional methods using only small, labeled datasets. Novel machine learning techniques, as applied in this work, enable training with much larger datasets, leading to improved accuracy in cloud detection and less need for manual image labeling.