<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" dtd-version="3.0">
  <front>
    <journal-meta>
<journal-id journal-id-type="publisher">AMT</journal-id>
<journal-title-group>
<journal-title>Atmospheric Measurement Techniques</journal-title>
<abbrev-journal-title abbrev-type="publisher">AMT</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Atmos. Meas. Tech.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">1867-8548</issn>
<publisher><publisher-name>Copernicus GmbH</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>

    <article-meta>
      <article-id pub-id-type="doi">10.5194/amt-8-1757-2015</article-id><title-group><article-title>Bayesian cloud detection for MERIS, AATSR, and <?xmltex \hack{\newline}?> their combination</article-title>
      </title-group><?xmltex \runningtitle{Bayesian cloud detection for MERIS, AATSR, and their combination}?><?xmltex \runningauthor{A.~Hollstein et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Hollstein</surname><given-names>A.</given-names></name>
          <email>andre.hollstein@gfz-potsdam.de</email>
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Fischer</surname><given-names>J.</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Carbajal Henken</surname><given-names>C.</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-3408-5925</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Preusker</surname><given-names>R.</given-names></name>
          
        </contrib>
        <aff id="aff1"><label>1</label><institution>GeoForschungsZentrum Potsdam (GFZ), Telegrafenberg A17, 14473 Potsdam, Germany</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Institute for Space Sciences, Department of Earth Sciences, Freie Universität Berlin, <?xmltex \hack{\newline}?> Carl-Heinrich-Becker-Weg 6–10, 12165 Berlin, Germany</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">A. Hollstein (andre.hollstein@gfz-potsdam.de)</corresp></author-notes><pub-date><day>15</day><month>April</month><year>2015</year></pub-date>
      
      <volume>8</volume>
      <issue>4</issue>
      <fpage>1757</fpage><lpage>1771</lpage>
      <history>
        <date date-type="received"><day>30</day><month>September</month><year>2014</year></date>
           <date date-type="rev-request"><day>6</day><month>November</month><year>2014</year></date>
           <date date-type="rev-recd"><day>27</day><month>February</month><year>2015</year></date>
           <date date-type="accepted"><day>2</day><month>March</month><year>2015</year></date>
      </history>
      <permissions>
<license license-type="open-access">
<license-p>This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">http://creativecommons.org/licenses/by/3.0/</ext-link></license-p>
</license>
</permissions><self-uri xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015.html">This article is available from https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015.html</self-uri>
<self-uri xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015.pdf">The full text article is available as a PDF file from https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015.pdf</self-uri>


      <abstract>
    <p>A broad range of different Bayesian cloud detection schemes is applied to
measurements from the Medium Resolution Imaging Spectrometer (MERIS), the
Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The
cloud detection schemes were designed to be numerically efficient and suited
for the processing of large amounts of data. Results from the classical and
naive approaches to Bayesian cloud masking are discussed for MERIS and AATSR as
well as for their combination. A sensitivity study on the resolution of
multidimensional histograms, which were post-processed by Gaussian smoothing,
shows how a theoretically insufficient amount of truth data can be used to set
up accurate classical Bayesian cloud masks. Sets of exploited features from
single and derived channels are numerically optimized, and results for naive
and classical Bayesian cloud masks are presented. The application of the
Bayesian approach is discussed in terms of reproducing existing algorithms,
enhancing existing algorithms, increasing the robustness of existing
algorithms, and setting up new classification schemes based on manually
classified scenes.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <title>Introduction</title>
      <p>Cloud masking of Earth observation measurements is an important and often
crucial part of various remote sensing retrievals. This includes, but is not
limited to, the retrieval of cloud and aerosol microphysical parameters, the
estimation of cloud cover, ocean color retrievals, and in general,
algorithms which include atmospheric correction schemes. Cloud masking
algorithms differ widely in their complexity, computational requirements, and
assumptions about what a cloud is and which physical process is
exploited for its detection. Implementations of particular algorithms are
often application specific, which makes the resulting cloud masks application
specific as well and generally complicates the intercomparison of results from
different cloud masks.</p>
      <p>This paper emphasizes the application of Bayesian methods for the cloud
masking of the complete 9.5-year time series of the Medium Resolution Imaging
Spectrometer (MERIS) <xref ref-type="bibr" rid="bib1.bibx20" id="paren.1"/> and the Advanced Along-Track Scanning
Radiometer (AATSR) <xref ref-type="bibr" rid="bib1.bibx13" id="paren.2"/> on board the Environmental
Satellite (ENVISAT) and is part of the European Space Agency (ESA)
Cloud CCI (Climate Change Initiative) project <xref ref-type="bibr" rid="bib1.bibx11" id="paren.3"/>. Thus,
the requirements for the cloud masking scheme, which is described in Sects. <xref ref-type="sec" rid="Ch1.S2"/> to <xref ref-type="sec" rid="Ch1.S5"/>, are
robustness, accuracy, and computational efficiency. Several possible
applications of the Bayesian method are discussed in Sect. 
<xref ref-type="sec" rid="Ch1.S7"/>. Results for MERIS and AATSR are discussed
separately but with a focus on their combination within the Synergy
product, in which daytime AATSR measurements are mapped onto the MERIS swath and
their mutual overlap is used. The Synergy data set in combination with one of
the presented cloud detection schemes will be used for the retrieval of cloud
microphysical parameters using the FAME-C algorithm, which was described by
<xref ref-type="bibr" rid="bib1.bibx1" id="text.4"/>. The development within Cloud CCI is ongoing and
finalization of the actual algorithm is planned for the near future.</p>
      <p>Major challenges of cloud detection are validation, the correct
classification of scenes with clouds for mountainous regions and over snow-
and ice-covered areas, and the distinction between clouds and optically thick
aerosol plumes such as dust storms. These points are discussed in more detail
in Sect. <xref ref-type="sec" rid="Ch1.S7"/>.</p>
      <p>Common approaches to cloud masking are hierarchies of thresholds
<xref ref-type="bibr" rid="bib1.bibx21 bib1.bibx22" id="paren.5"><named-content content-type="pre">e.g.,</named-content></xref>, complex statistical models
<xref ref-type="bibr" rid="bib1.bibx18 bib1.bibx6" id="paren.6"><named-content content-type="pre">e.g.,</named-content></xref>, or other Bayesian approaches
<xref ref-type="bibr" rid="bib1.bibx3 bib1.bibx24 bib1.bibx16 bib1.bibx14 bib1.bibx9" id="paren.7"><named-content content-type="pre">e.g.,</named-content></xref>.
A classification scheme for Bayesian cloud masks which helps to clearly
distinguish the various approaches to Bayesian cloud masking is introduced in Sect. <xref ref-type="sec" rid="Ch1.S3"/> and, in addition, a short overview
of the relevant literature using such schemes is given.</p>
      <p>The cloud detection schemes presented here are computationally highly efficient and
well suited for the processing of large amounts of data, which also makes them
good candidates for future application to the Ocean Land Colour
Instrument (OLCI) <xref ref-type="bibr" rid="bib1.bibx19" id="paren.8"/> and the Sea and Land Surface Temperature
Radiometer (SLSTR) <xref ref-type="bibr" rid="bib1.bibx2" id="paren.9"/> on board the Sentinel-3 satellite
<xref ref-type="bibr" rid="bib1.bibx17" id="paren.10"/> and its operational follow-ups.</p>
</sec>
<sec id="Ch1.S2">
  <title>Bayesian inference for cloud masking</title>
      <p>Bayes' theorem can be used to reverse joint probabilities. It is appealing to
apply it to cloud masking since its theory is widely adopted, its
implementation on a computer system is straightforward, and its results are
probabilities which can be directly interpreted. The theorem allows the
computation of the probability <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>  that a particular measurement
with feature <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> is affected by a cloud when the occurrence
probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> of the feature under
cloudy and non-cloudy conditions are known. Here, <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the
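occurrence probability of <inline-formula><mml:math display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> under the condition of the occurrence of <inline-formula><mml:math display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula>, i.e., the conditional probability that is often written as <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>|</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>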
      <p>With <inline-formula><mml:math display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula> being the case that a measurement is affected by clouds and
<inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> being a set of features associated with that measurement,
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> can be expressed as

              <disp-formula specific-use="align" content-type="numbered"><mml:math display="block"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mtd></mml:mtr><mml:mlabeledtr id="Ch1.E1"><mml:mtd/><mml:mtd/><mml:mtd><mml:mrow><?xmltex \hack{\hspace{1.3cm}}?><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

          where <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the background probability of cloudiness and <inline-formula><mml:math display="inline"><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover></mml:math></inline-formula> is the
negation of <inline-formula><mml:math display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula>, which states that a measurement is not affected by clouds.
The occurrence probability of the feature <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> can be expressed in
terms of the joint probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>,
because cloudiness and non-cloudiness are the only two considered classes for
each measurement.</p>
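      <p>As a minimal sketch of how Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) could be evaluated numerically, the following Python function computes the cloud probability from the two occurrence probabilities of the feature and the background probability of cloudiness. The function name, the argument names, the use of NumPy, and the fallback to the background probability for features that were never observed under either condition are assumptions made for illustration; they are not taken from the processing chain described in this paper.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def cloud_probability(p_f_cloud, p_f_clear, p_cloud=0.5):
    """Evaluate Eq. (1) for scalars or arrays.

    p_f_cloud: occurrence probability of the feature under cloudy conditions
    p_f_clear: occurrence probability of the feature under non-cloudy conditions
    p_cloud:   background probability of cloudiness
    """
    p_f_cloud = np.asarray(p_f_cloud, dtype=float)
    p_f_clear = np.asarray(p_f_clear, dtype=float)

    numerator = p_cloud * p_f_cloud
    evidence = numerator + (1.0 - p_cloud) * p_f_clear

    # Assumption: if the feature was never observed in either class,
    # the measurement carries no information and the prior is returned.
    observed = evidence > 0.0
    safe_evidence = np.where(observed, evidence, 1.0)
    return np.where(observed, numerator / safe_evidence, p_cloud)
```
]]></preformat>
      <p>For a background probability of 0.5 the prior cancels, so the function returns the same value as the simplified form given in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>).</p>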
      <p>Evaluating Bayes' theorem involves only a few arithmetic operations, so
a specific implementation can be very fast and efficient, which is
important when large amounts of data are to be processed. Additional
computations involve the feature <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> and the a priori joint
probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which are discussed in
the following sections.</p>
      <p>With an appropriate set of thresholds, one can convert the probability
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> into a cloud mask. For instance, any probability strictly
higher than 50 % could be interpreted as cloud, but other thresholds or
more classes can be used. This is discussed in more detail in Sect. <xref ref-type="sec" rid="Ch1.S7.SS1"/>, but this choice clearly depends on the target
application and is independent of the Bayesian approach. Other
applications, such as the construction of a cost function as described by
<xref ref-type="bibr" rid="bib1.bibx3" id="text.11"/>, are also viable alternatives.</p>
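      <p>The conversion of probabilities into a mask can be sketched as follows. The first function applies the strict 50 % threshold mentioned above; the second illustrates the use of more than two classes. The function names and the class edges of 0.25 and 0.75 are hypothetical values chosen for illustration only, since the appropriate choice depends on the target application.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def to_cloud_mask(probability, threshold=0.5):
    """Binary mask: True where the cloud probability strictly exceeds the threshold."""
    return np.asarray(probability, dtype=float) > threshold


def to_classes(probability, edges=(0.25, 0.75)):
    """Three classes from two hypothetical edges: 0 = clear, 1 = uncertain, 2 = cloudy."""
    return np.digitize(np.asarray(probability, dtype=float), edges)
```
]]></preformat>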
      <p>Estimating the value of the background probability <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is not discussed in
detail in this paper and for all following applications a value of <inline-formula><mml:math display="inline"><mml:mn>0.5</mml:mn></mml:math></inline-formula> is
used. This choice basically states that for each measurement an equal
probability of it being cloudy or not cloudy is assumed. This assumption is
of course valid on neither a global nor a local scale, and a rich body of
knowledge about the spatial and temporal distribution of cloud occurrence
probabilities exists. Such knowledge, typically in the form of external
climatologies, could be used to estimate <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> but would eventually shift
derived climatologies towards the external one, which would then effectively
lead to circular arguments. This point might be of lower importance for some
applications, e.g., operational processing by weather services, but within
Cloud CCI climatological data sets will be derived from the full MERIS
and AATSR time series and circular arguments are best avoided. Since a
decision for the actual value of <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> has to be made, its impact on derived
data sets should be investigated and communicated to potential users. In
general, the background probability can be a function of external or auxiliary
data like position or time of year. In the general case, the estimation of
the joint probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> should be
consistent with <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
      <p>For the special case of <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mn>0.5</mml:mn></mml:mrow></mml:math></inline-formula>, Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) simplifies to
          <disp-formula id="Ch1.E2" content-type="numbered"><mml:math display="block"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
        Setting up a particular Bayesian cloud mask algorithm involves several
decisions, such as specifying the measurement feature <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> and choosing
a technique to estimate <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which allows
us to group the various possible approaches to Bayesian cloud masking into
distinct subgroups. This natural grouping allows the
presented approach to be clearly separated from other algorithms and is discussed in Sect. <xref ref-type="sec" rid="Ch1.S3"/>. In addition, a short overview of the
relevant literature is given.</p>
</sec>
<sec id="Ch1.S3">
  <title>Classification of Bayesian cloud masks</title>
      <p>Several papers on Bayesian approaches to cloud masking have been published in
the past and fundamental differences between the various algorithms are often
buried in the technical details of the particular paper. A nomenclature which
aims to clearly separate different approaches to Bayesian cloud masking is
discussed in the following.</p>
      <p>Let the feature <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> from Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) be a set of
<inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> real numbers <inline-formula><mml:math display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, where
the <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are typically determined from measurements <inline-formula><mml:math display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">M</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>M</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, auxiliary data <inline-formula><mml:math display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>A</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, and
external data <inline-formula><mml:math display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">E</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>E</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The components <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are
computed from prescribed feature functions <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which generally depend on
all of the above introduced classes of data:
<inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">M</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">A</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">E</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. In the case of the MERIS and AATSR
Synergy, the set of measurements <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">M</mml:mi></mml:math></inline-formula> includes radiances and brightness
temperatures for a single collocated pixel. Auxiliary data <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">A</mml:mi></mml:math></inline-formula> are
available at negligible computational cost, such as time stamps,
geolocation, and data flags. External data may be a function of the
available measurements <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">M</mml:mi></mml:math></inline-formula> and auxiliary data <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">A</mml:mi></mml:math></inline-formula>, and their
procurement is by definition associated with non-negligible computational
cost. This category essentially introduces significant external knowledge
about the measurement and common examples are online radiative transfer (RT)
simulations, nontrivial interpolation in numerical weather prediction (NWP)
data, or the use of climatologies.</p>
      <p>Let us call the feature set <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> independent when it is only a function
of the measurements <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">M</mml:mi></mml:math></inline-formula> and auxiliary data <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">A</mml:mi></mml:math></inline-formula> and dependent when
external data <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">E</mml:mi></mml:math></inline-formula> are additionally exploited. Both classes can be further
subdivided with respect to weak and strong dependence to describe <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula>
even more precisely. A weakly dependent feature set could, for example, depend on
interpolation in NWP data, which is of negligible computational cost, while a
strongly dependent feature set could depend on online RT with non-negligible
numerical cost. Strongly independent feature sets would then depend only on
measurements <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">M</mml:mi></mml:math></inline-formula>, while weakly independent feature sets could in
addition depend on auxiliary data <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">A</mml:mi></mml:math></inline-formula>.</p>
      <p>This paper focuses on Bayesian cloud masks based on strongly independent
features. Only MERIS and AATSR measurements and trivial functions operating
on them are used to construct the feature set. This class of features allows
the implementation of a numerically highly efficient algorithm that is
straightforward to parallelize and vectorize. With no dependency on
external data, the algorithm can be used in non-operational environments
where the acquisition of NWP data can require significant effort. In general,
there is no obvious reason why the techniques discussed in the
following sections should be limited to the independent case.</p>
      <p>The second major branch in Bayesian cloud masking schemes involves the
computation of the joint probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The classical approach aims at the direct computation
of these two joint probabilities, while the naive approach treats the
components <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> of the feature set <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> as statistically independent
and decouples the joint probabilities on <inline-formula><mml:math display="inline"><mml:mi mathvariant="bold-italic">F</mml:mi></mml:math></inline-formula> into a product of joint
probabilities of the <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>:
          <disp-formula id="Ch1.E3" content-type="numbered"><mml:math display="block"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∏</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
        One can either construct the feature set very carefully, such that this
strong assumption holds (e.g., follow <xref ref-type="bibr" rid="bib1.bibx16" id="text.12"/> and their
discussion on cloud texture and cloud top temperature), or simply accept its
violation and the possible effects on the cloud masking scheme. Formally
proving the statement of Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) seems to be only
possible for a rather limited class of features.</p>
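      <p>A minimal sketch of the naive decoupling in Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) is given below. It assumes, purely for illustration, that the occurrence probability of each feature component is estimated from a normalized one-dimensional histogram of truth data for one class and that all features are scaled to a common value range. The function names, the number of bins, and the value range are hypothetical and do not describe the configuration used in this paper.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def fit_naive(features, n_bins=10, value_range=(0.0, 1.0)):
    """Per-feature normalized histograms for one class (cloudy or non-cloudy).

    features: array of shape (n_samples, n_features), assumed to lie in value_range.
    Returns the shared bin edges and an array of shape (n_features, n_bins).
    """
    edges = np.linspace(value_range[0], value_range[1], n_bins + 1)
    histograms = []
    for column in np.asarray(features, dtype=float).T:
        counts, _ = np.histogram(column, bins=edges)
        histograms.append(counts / counts.sum())
    return edges, np.array(histograms)


def naive_likelihood(sample, edges, histograms):
    """Eq. (3): product of the per-feature occurrence probabilities."""
    n_features, n_bins = histograms.shape
    # Bin index of every feature component, clipped to the valid range.
    index = np.clip(np.digitize(sample, edges) - 1, 0, n_bins - 1)
    return float(np.prod(histograms[np.arange(n_features), index]))
```
]]></preformat>
      <p>Fitting one set of histograms to cloudy truth data and one to non-cloudy truth data yields the two occurrence probabilities required by Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>). The memory footprint grows only linearly with the number of features, which is the practical appeal of the naive approach.</p>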
      <p>Computing the joint probabilities in the classical approach can be greatly
simplified by assuming an analytic form and estimating its parameters.
Depending on the assumed form, for instance multivariate Gaussian
<xref ref-type="bibr" rid="bib1.bibx24 bib1.bibx16" id="paren.13"><named-content content-type="pre">e.g.,
see</named-content></xref>, the resulting cloud mask could be called
classical Gaussian. As for the naive approach, it will be difficult to
formally prove the validity of such assumptions.</p>
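      <p>To illustrate what a classical Gaussian cloud mask involves, the following sketch estimates the mean and covariance of a feature set from truth data of one class and evaluates the resulting multivariate Gaussian density for a new measurement. This is a generic textbook formulation written with NumPy; the function names are hypothetical, at least two feature components and a non-singular covariance matrix are assumed, and the code does not reproduce the implementation of the cited works.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def fit_gaussian(features):
    """Mean vector and covariance matrix of truth data for one class.

    features: array of shape (n_samples, n_features) with n_features of at least 2.
    """
    data = np.asarray(features, dtype=float)
    return data.mean(axis=0), np.cov(data, rowvar=False)


def gaussian_density(sample, mean, cov):
    """Multivariate Gaussian density, used as the occurrence probability of F."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    deviation = np.asarray(sample, dtype=float) - mean
    # Assumption: cov is non-singular, so the linear system can be solved.
    mahalanobis = deviation @ np.linalg.solve(cov, deviation)
    norm = np.sqrt((2.0 * np.pi) ** mean.size * np.linalg.det(cov))
    return float(np.exp(-0.5 * mahalanobis) / norm)
```
]]></preformat>
      <p>Evaluating the density once with the parameters fitted to cloudy data and once with those fitted to non-cloudy data provides the two terms needed in Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>).</p>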
      <p>The classical and naive approaches can be mixed when one or more subsets of the
<inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are treated as statistically independent, such that the decoupling of
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> becomes partial. For this class of
Bayesian cloud masks we propose using the terms “mostly naive” when the majority
of features are decoupled and “mostly classical” when the majority of features
are not decoupled.</p>
      <p>This paper discusses both the classical and the naive approach, with an
emphasis on the application of classical Bayesian cloud masks based
on strongly independent features. As will be shown later in the paper, the
classical approach gives better results for cloud masking in our scenario,
and the strongly independent feature set was chosen to allow the
implementation of a very fast algorithm.</p>
      <p>Cloud detection methods based on Bayesian probabilities have been used for
cloud masking in the past; a short overview is given here without attempting to
describe them in full. <xref ref-type="bibr" rid="bib1.bibx3" id="text.14"/> used Bayesian
probability with strongly dependent features to derive a cost function for a
1D-Var retrieval of cloudiness. The classification is based on the
exploitation of microwave and infrared channels and, in addition, external data
from NWP simulations. <xref ref-type="bibr" rid="bib1.bibx24" id="normal.15"/> used Advanced Very High Resolution Radiometer (AVHRR) channels, derived
channels such as reflectance ratios and brightness temperature differences,
and textural measures to construct strongly independent features. It was
found that textural measures are most important for nighttime
measurements. The joint probabilities were separated by assuming a
multivariate Gaussian form and were expressed in terms of mean values and
associated covariances. <xref ref-type="bibr" rid="bib1.bibx16" id="normal.16"/> used nighttime thermal infrared
measurements at <inline-formula><mml:math display="inline"><mml:mn>3.7</mml:mn></mml:math></inline-formula>, <inline-formula><mml:math display="inline"><mml:mn>11</mml:mn></mml:math></inline-formula>, and <inline-formula><mml:math display="inline"><mml:mn>12</mml:mn></mml:math></inline-formula> <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m to construct a mostly
classical Bayesian cloud mask. Textural features were assumed to be
independent from measurements in thermal channels and were separated when
computing the joint probabilities. <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> was estimated from NWP data and the
algorithm was discussed with an emphasis on operational NWP centers such
that these features are likely only weakly dependent.
<xref ref-type="bibr" rid="bib1.bibx14 bib1.bibx15" id="normal.17"/> discussed a mostly classical Bayesian algorithm
with strongly dependent features for the <inline-formula><mml:math display="inline"><mml:mn>3.9</mml:mn></mml:math></inline-formula>, <inline-formula><mml:math display="inline"><mml:mn>11</mml:mn></mml:math></inline-formula>, and <inline-formula><mml:math display="inline"><mml:mn>12</mml:mn></mml:math></inline-formula> <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m
channels of the SEVIRI instrument. External knowledge is introduced by NWP
data and a fast radiative transfer model, and textural features were separated
from spectral features assuming independence. <xref ref-type="bibr" rid="bib1.bibx9" id="text.18"/> discussed
a naive Bayesian cloud mask with strongly dependent features for the AVHRR
instrument. A surface type classification using external MODIS data was used
to maximize the detection rate. CALIPSO lidar measurements were used as truth
data to compute histograms from which the occurrence probabilities for each
feature were estimated.</p>
</sec>
<sec id="Ch1.S4">
  <title>Construction of feature sets</title>
      <p>Channels of the MERIS and AATSR instruments cover the spectral range from
412 nm to <inline-formula><mml:math display="inline"><mml:mn>12</mml:mn></mml:math></inline-formula> <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m and are referenced in this paper by their central
wavelength, given in nm for MERIS and in <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m
for AATSR. Figures <xref ref-type="fig" rid="Ch1.F1"/> and <xref ref-type="fig" rid="Ch1.F2"/> show examples of
possible features for two particularly interesting scenes over Greenland and in the
vicinity of the Korean peninsula. Each figure shows an RGB image, various
single channels, and a selection of trivial functions which combine two
channels. Both figures include a panel with results of the non-Bayesian
Synergy cloud mask, which is briefly discussed in Sect. <xref ref-type="sec" rid="Ch1.S6"/>. Figure <xref ref-type="fig" rid="Ch1.F1"/> shows a scene over Greenland
with its center located at <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>59</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>31</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>12</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> W and <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>79</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn mathvariant="normal">0</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn mathvariant="normal">0</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> N with
high and low clouds over a large ice- or snow-covered region. Figure <xref ref-type="fig" rid="Ch1.F2"/> shows a scene in the vicinity of the Korean peninsula with
its center located at <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>125</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>52</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>12</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> E and <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>37</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>45</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>36</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> N. This
scene shows a pronounced dust storm mixed with a deck of clouds.</p>
      <p>Strongly independent features are constructed using a single channel or any
combination of channels in a trivial function. Such combinations have been
called derived channels in the literature <xref ref-type="bibr" rid="bib1.bibx24" id="paren.19"><named-content content-type="pre">e.g.,</named-content></xref>.
Considered here are all basic arithmetic operations (<inline-formula><mml:math display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mo>,</mml:mo><mml:mo>-</mml:mo><mml:mo>,</mml:mo><mml:mo>×</mml:mo><mml:mo>,</mml:mo><mml:mo>/</mml:mo></mml:mrow></mml:math></inline-formula>) and, in
addition, the index function d<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>x</mml:mi><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>-</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which can be used to
create indices such as the normalized difference vegetation index
<xref ref-type="bibr" rid="bib1.bibx12" id="paren.20"><named-content content-type="pre">see</named-content></xref>, the normalized difference snow index
<xref ref-type="bibr" rid="bib1.bibx7" id="paren.21"><named-content content-type="pre">see</named-content></xref>, or other general channel indices. Even when well-known
and generally accepted combinations of channels and indices are used, it is
unclear whether a specific combination is the best possible candidate for the
particular data set and one has to rely on the experience of the involved
experts.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F1"><caption><p>Several views of a scene over Greenland from
17 July 2007 with the image centered at <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>59</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>31</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>12</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> W and
<inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>79</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn mathvariant="normal">0</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn mathvariant="normal">0</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> N. Single panels include a pseudo RGB view, results of the
non-Bayesian Synergy cloud mask (with white indicating clouds; see
Sect. <xref ref-type="sec" rid="Ch1.S6"/>), as well as single channels and simple functions
operating on two channels. The function d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> denotes the index function and
is defined as d<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>x</mml:mi><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>-</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Units are not shown and the
color scales are stretched to maximize the visible contrast.</p></caption>
        <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f01.jpg"/>

      </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F2"><caption><p>Similar to Fig. <xref ref-type="fig" rid="Ch1.F1"/> but for a
scene near the Korean peninsula. The center of the images is located at
<inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>125</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>52</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>12</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> E and <inline-formula><mml:math display="inline"><mml:mrow><mml:msup><mml:mn>37</mml:mn><mml:mo>∘</mml:mo></mml:msup><mml:msup><mml:mn>45</mml:mn><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mn>36</mml:mn><mml:mrow><mml:mo>′</mml:mo><mml:mo>′</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> N.</p></caption>
        <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f02.jpg"/>

      </fig>

      <p>In contrast to approaches based on expert knowledge, an objective measure for
any given set of feature functions is exploited to numerically search for the
best possible set of feature functions. The Hanssen–Kuipers skill
score <xref ref-type="bibr" rid="bib1.bibx8 bib1.bibx26" id="paren.22"><named-content content-type="pre">see</named-content></xref>, evaluated with respect to a given
validation data set, is an appropriate metric for this problem. It is
sometimes also referred to as the Hanssen–Kuipers discriminant and is essentially the
difference of the hit rate and the false alarm rate of the cloud mask with
respect to a validation data source. It covers the range of <inline-formula><mml:math display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> to <inline-formula><mml:math display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>,
with <inline-formula><mml:math display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> being a perfect representation of the validation source. From now
on, only the term “skill score” is used.</p>
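      <p>As a minimal sketch of the skill score described above, the following Python function computes the hit rate minus the false alarm rate from two Boolean sequences. The function name, the toy data, and the convention that True means cloudy are assumptions made for illustration and are not taken from the paper.</p>
<preformat>
```python
def hanssen_kuipers(predicted, truth):
    """Return hit rate minus false alarm rate for two Boolean sequences.

    True is assumed to mean "cloudy". Both classes must occur in truth,
    otherwise one of the two rates is undefined (division by zero).
    """
    hits = misses = false_alarms = correct_negatives = 0
    for p, t in zip(predicted, truth):
        if t and p:
            hits += 1
        elif t:
            misses += 1
        elif p:
            false_alarms += 1
        else:
            correct_negatives += 1
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate


# Toy example: 4 cloudy and 4 clear pixels in the (artificial) truth.
truth = [True, True, True, False, False, False, False, True]
mask = [True, True, False, False, True, False, False, True]
score = hanssen_kuipers(mask, truth)  # 3/4 hit rate minus 1/4 false alarm rate
```
</preformat>
      <p>A mask identical to the truth yields a score of +1 and its exact inverse yields -1, which matches the range stated above.</p>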
      <p><?xmltex \hack{\newpage}?>Validation of cloud masks for MERIS and AATSR on board ENVISAT is a difficult
task since no generally accepted and available set of truth data exists. A
commonly used approach is to generate truth data by means of manual
classification of images by human experts or the use of data from ground-based stations. Converting a ground truth to a pixel-by-pixel truth can be
complicated, and possibly insufficient spatial coverage can limit the
applicability of that approach. Consequently, most approaches for generating
truth data for MERIS and AATSR are based on the manual
classification of sample data by human experts
<xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx6 bib1.bibx22" id="paren.23"><named-content content-type="pre">e.g.,</named-content></xref>. Such data sets
can be called artificial truth because, although they are used as if they were truth,
it is arguable whether such data sets are in fact truth.</p>
      <p>To demonstrate the feasibility of the Bayesian approach, results from the
Synergy cloud mask (see <xref ref-type="bibr" rid="bib1.bibx6" id="text.24"/> and Sect. <xref ref-type="sec" rid="Ch1.S6"/> for a brief description) were chosen as a source of
artificial truth data; it is therefore assessed whether Bayesian cloud
masks can reproduce this Synergy cloud mask. The major advantage of this
approach is that large numbers of artificial truth data can be created
without significant effort. Clearly, all shortcomings of this seeding
algorithm will be present in this data set and will limit the success of the
application of the Bayesian technique.</p>
      <p>Optimizing the choice for a particular set of feature functions is not
straightforward, since this problem is noncontinuous with a varying number of
free parameters. First, the number of feature functions has to be set. Then,
for each feature, a feature function from the pool of considered functions
has to be selected. The identity function, all four basic arithmetic
operations, and the index function are considered as feature functions. As a
last step, the input channels for each feature function must be set.
Depending on the chosen functions and channels, a maximum of <inline-formula><mml:math display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>×</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>
channels can be included in the computation of a feature set with <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>
elements.</p>
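      <p>The three choices described above (number of features, feature function, input channels) can be sketched as a small random generator. This is an illustrative sketch only: the names FEATURE_FUNCTIONS, random_feature_set, and evaluate, the representation of a feature as a (name, channels) pair, and the restriction to two distinct channels per two-channel function are assumptions, not the authors' implementation.</p>
<preformat>
```python
import random

# Pool of two-channel feature functions named in the text:
# the four basic arithmetic operations and the index function dx.
FEATURE_FUNCTIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
    "dx": lambda a, b: (a - b) / (a + b),
}


def random_feature_set(n_features, n_channels, rng):
    """Draw n_features random (function name, channel indices) pairs."""
    features = []
    for _ in range(n_features):
        name = rng.choice(["identity"] + sorted(FEATURE_FUNCTIONS))
        if name == "identity":
            channels = (rng.randrange(n_channels),)
        else:
            # Assumption: the two input channels are distinct.
            channels = tuple(rng.sample(range(n_channels), 2))
        features.append((name, channels))
    return features


def evaluate(feature, spectrum):
    """Apply one feature to a spectrum given as a sequence of channel values."""
    name, channels = feature
    if name == "identity":
        return spectrum[channels[0]]
    a, b = (spectrum[c] for c in channels)
    return FEATURE_FUNCTIONS[name](a, b)


rng = random.Random(0)
feature_set = random_feature_set(4, 22, rng)  # four features, 22 channels
```
</preformat>
      <p>Because each feature uses at most two channels, a set of four features drawn this way touches at most eight channels, in line with the bound given above.</p>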
      <p>Then, for a particular feature set, the prerequisites for computing the joint
probabilities must be carried out, which is described in detail in Sect. <xref ref-type="sec" rid="Ch1.S5"/>. Once this step is completed, the
Hanssen–Kuipers skill score for the selected set of validation data can be
computed.</p>
      <p>The only numeric optimization procedure that we are aware of, which is
generally applicable to this situation, is a random search in the huge search
space spanned by this outlined procedure. This is quite a different approach
to that of a human expert, who would likely start an educated search but
might not attempt to cover the whole search space. The number of possible
combinations depends on the number of chosen features and the number of
available channels (22 in the case of the MERIS and AATSR Synergy) and can be
estimated using the binomial coefficient. In the simplest case, where merely
the identity function is used, no channel is used more than once, and four
features are to be selected, the search space spans <inline-formula><mml:math display="inline"><mml:mrow><mml:mfenced close=")" open="("><mml:mfrac linethickness="0"><mml:mn>22</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:mfrac></mml:mfenced><mml:mo>=</mml:mo><mml:mn>7315</mml:mn></mml:mrow></mml:math></inline-formula>
elements. When only functions of two channels are to be selected and
re-selected and channels can be used multiple times, then the search space
consists of <inline-formula><mml:math display="inline"><mml:mrow><mml:mfenced open="(" close=")"><mml:mfrac linethickness="0"><mml:mrow><mml:mn mathvariant="normal">5</mml:mn><mml:mo>×</mml:mo><mml:mo>(</mml:mo><mml:msup><mml:mn>22</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>-</mml:mo><mml:mn>22</mml:mn><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">4</mml:mn></mml:mfrac></mml:mfenced><mml:mo>=</mml:mo><mml:mfenced close=")" open="("><mml:mfrac linethickness="0"><mml:mn>2310</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:mfrac></mml:mfenced><mml:mo>≈</mml:mo><mml:mn>1.2</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn>10</mml:mn><mml:mn>12</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> entries. The enormous size of the search space makes it difficult to
completely cover it by a search, but the random search can be allowed to run
appropriately long such that a result of sufficient quality is obtained. One
can expect that a large number of different sets of feature functions will
essentially exhibit very similar classification skills. Not all of the considered
feature functions are symmetric under a change of the parameter order (subtraction,
division, and the index function are not), but
the overall classification result might be approximately symmetric. With four features and two possible parameter orders each, this
alone would decrease the search space by a factor of approximately <inline-formula><mml:math display="inline"><mml:mn>16</mml:mn></mml:math></inline-formula>. In
addition, the classification results might depend only weakly on
the feature function itself; e.g., the index function d<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>x</mml:mi><mml:mo>(</mml:mo><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>
might be as effective as a ratio <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>a</mml:mi><mml:mo>/</mml:mo><mml:mi>b</mml:mi></mml:mrow></mml:math></inline-formula>, which would decrease the effective
size of the search space.</p>
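      <p>The two search-space sizes quoted above can be checked with a few lines of Python. The variable names are chosen for illustration; the counts (22 channels, four features, five two-channel functions, ordered pairs of distinct channels) follow the text.</p>
<preformat>
```python
from math import comb

n_channels = 22
n_features = 4

# Identity function only, no channel used more than once.
identity_only = comb(n_channels, n_features)

# Five two-channel functions applied to ordered pairs of distinct channels.
n_two_channel_functions = 5
ordered_pairs = n_channels**2 - n_channels
candidates = n_two_channel_functions * ordered_pairs
two_channel = comb(candidates, n_features)

print(identity_only, candidates, f"{two_channel:.1e}")
```
</preformat>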
      <p>The proposed random search might not be able to cover the complete search
space, but with a sufficiently long runtime one will be able to find solutions
with a sufficiently high skill score. In addition, unusual combinations of
channels might be found which would not be considered in an educated search by
a human expert. The features shown in Figs. <xref ref-type="fig" rid="Ch1.F1"/> and <xref ref-type="fig" rid="Ch1.F2"/>
are frequently found in searches when results from the non-Bayesian Synergy cloud mask are used as artificial truth.</p>
      <p>The physical meaning of a certain feature set and why it might be better or
worse than a different one is not discussed here and is also not within the
scope of this paper. This knowledge is very useful for educated searches but
is not necessarily needed in this setup. However, for the experienced expert
it might be only slightly surprising which channels are found to be successful
by the optimization scheme. There is also no apparent reason why human
experts should not compete with the optimization scheme in order to find an
optimum set of features. This is especially important for applications where
only a small fraction of the search space can be tested using the
optimization approach.</p>
      <p>Implementing such a search strategy is straightforward. A generator of random
feature functions must be implemented and each of these instances can be
tested for its skill score with respect to the artificial truth. This
procedure is easily parallelizable, and one could store only the results with
a skill score higher than some predefined value. At any given time during an
ongoing search, one can sort these results and evaluate the top results.</p>
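      <p>A minimal sketch of such a search loop is given below. The generator and the scoring function are passed in as arguments; the toy versions used here (a candidate as four distinct channel indices and an arbitrary score) are placeholders invented for illustration and do not represent the Bayesian cloud mask or its skill score.</p>
<preformat>
```python
import random


def random_search(generate, score, n_trials, threshold, rng):
    """Keep (score, candidate) pairs whose score exceeds threshold, best first."""
    kept = []
    for _ in range(n_trials):
        candidate = generate(rng)
        value = score(candidate)
        if value > threshold:
            kept.append((value, candidate))
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept


# Toy stand-ins (assumptions): four distinct channel indices out of 22,
# scored by an arbitrary function that merely produces values below 1.
def toy_generate(rng):
    return tuple(sorted(rng.sample(range(22), 4)))


def toy_score(candidate):
    return 1.0 - sum(candidate) / 100.0


results = random_search(toy_generate, toy_score, 200, 0.6, random.Random(1))
top_results = results[:5]
```
</preformat>
      <p>Since every trial is independent, the loop body could be distributed over several workers and the kept results merged and sorted afterwards.</p>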
</sec>
<sec id="Ch1.S5">
  <title>Estimation of background joint probabilities</title>
      <p>The background joint probabilities <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>
could be computed in various ways, but here only the frequentist approach
based on sample data is considered. A sufficiently large number of already-classified measurements are converted into their corresponding set of
features,
and probability density histograms are produced, from which the probabilities
are estimated. In the naive Bayesian approach, as many one-dimensional
histograms as there are features are needed, while in the classical Bayesian
approach a single <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>-dimensional histogram is used. When these histograms
are stored in a computer system, the handling of any reasonable number of one-dimensional histograms poses no specific problem, while an array of dimension
<inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> grows rapidly in memory with increasing number of bins <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. For the
sake of simplicity, the same number of bins is assumed for each particular
dimension. With four bytes per float and twenty bins per feature, one would
need about <inline-formula><mml:math display="inline"><mml:mn>0.6</mml:mn></mml:math></inline-formula> MB to store a single histogram for four features but already
about <inline-formula><mml:math display="inline"><mml:mn>4883</mml:mn></mml:math></inline-formula> MB for seven features. This limits the practical number of
features for the classical Bayesian approach to about four to six at the time
of writing this paper.</p>
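      <p>The growth of the histogram size with the number of features can be reproduced with a short calculation. The sketch assumes four bytes per stored value and the same number of bins in every dimension, and reports sizes in units of 2^20 bytes; the function name is hypothetical.</p>
<preformat>
```python
def histogram_bytes(n_bins, n_features, bytes_per_value=4):
    """Memory of a dense n_features-dimensional histogram with n_bins per axis."""
    return bytes_per_value * n_bins**n_features


for n_features in range(2, 8):
    size_mib = histogram_bytes(20, n_features) / 2**20
    print(f"{n_features} features: {size_mib:.1f} MiB")
```
</preformat>
      <p>With twenty bins per feature, each additional feature multiplies the required memory by twenty.</p>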
      <p>However, the main argument of <xref ref-type="bibr" rid="bib1.bibx24" id="text.25"/> and <xref ref-type="bibr" rid="bib1.bibx9" id="text.26"/>
against the use of the classical approach is that one generally does not have
enough truth data available to robustly derive the histograms in a completely
frequentist way. This can be a valid point for real truth data, which are
limited in principle, but not so much for artificial truth data. Here, the
number of available data is merely a function of the available human labor
for manual classification or computational resources when an existing cloud
masking scheme is used to produce artificial truth data.</p>
      <p>The left panels of Figs. <xref ref-type="fig" rid="Ch1.F3"/> and
<xref ref-type="fig" rid="Ch1.F4"/> show results of two-dimensional histograms for
MERIS and AATSR Synergy data. For both cases, almost 1 million spectra were
used to compute both histograms. Shown is the difference between the histograms
for <inline-formula><mml:math display="inline"><mml:mi>C</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover></mml:math></inline-formula>. Both choices of features recreate the Synergy cloud
mask reasonably well with a skill score of about <inline-formula><mml:math display="inline"><mml:mn>0.76</mml:mn></mml:math></inline-formula>. The cloud masking
setup is discussed in detail in Sect. <xref ref-type="sec" rid="Ch1.S7"/>.
The main point here is that with enough data points these histograms can be
computed. The two-dimensional case was chosen since this is simple to
visualize.</p>
      <p>The right panels of Figs. <xref ref-type="fig" rid="Ch1.F3"/> and <xref ref-type="fig" rid="Ch1.F4"/> show remarkably similar histograms with only slightly
smaller skill score values of about <inline-formula><mml:math display="inline"><mml:mn>0.75</mml:mn></mml:math></inline-formula>, but only <inline-formula><mml:math display="inline"><mml:mn>1000</mml:mn></mml:math></inline-formula> randomly selected
measurements from the original data set were used to produce these histograms.
A simple Gaussian smoothing filter was applied to both histograms and each
Gaussian smoothing factor was chosen such that the skill score as a function
of the Gaussian smoothing was maximized. This is the first main result of
this paper. This numerical experiment shows that, at least for some sets of
feature functions, the <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>-dimensional histograms can be approximated by
using very few data points and an appropriate Gaussian smoothing factor. The
best smoothing factor for both cases is slightly different and is obtained
from optimization. More detailed results are shown in Sect. <xref ref-type="sec" rid="Ch1.S7.SS1"/>. In addition to the previously discussed parameters,
e.g., the construction of features, classical Bayesian cloud masks are defined
by the number of bins used in the histograms and the chosen Gaussian
smoothing parameter, which is discussed in Sect. <xref ref-type="sec" rid="Ch1.S7.SS1"/>.</p>
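      <p>To illustrate how Gaussian smoothing spreads sparse histogram counts to neighboring bins, the following sketch applies a separable Gaussian kernel to a small two-dimensional histogram using only the Python standard library. The function names, the truncation of the kernel at about three widths, the handling of the histogram edges (contributions falling outside are dropped), and the interpretation of the width in units of bins are assumptions; the paper does not specify these details.</p>
<preformat>
```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights, truncated at about 3 sigma (in bins)."""
    radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Spread each bin over its neighbors; mass leaving the array is dropped."""
    radius = len(kernel) // 2
    n = len(values)
    out = [0.0] * n
    for i, v in enumerate(values):
        if v == 0.0:
            continue
        for k, w in enumerate(kernel):
            j = i + k - radius
            if j in range(n):
                out[j] += v * w
    return out


def smooth_2d(hist, sigma):
    """Separable smoothing: first along rows, then along columns."""
    kernel = gaussian_kernel(sigma)
    rows = [smooth_1d(row, kernel) for row in hist]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# A 9 x 9 histogram holding a single sample in the central bin.
hist = [[0.0] * 9 for _ in range(9)]
hist[4][4] = 1.0
smoothed = smooth_2d(hist, 1.0)
```
</preformat>
      <p>In this example the single count is redistributed over the surrounding bins while the total stays at one, because the truncated kernel does not reach the histogram edges. In the setting described above, the width would be tuned such that the skill score is maximized.</p>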
      <p>It should be noted that this is an extreme case and we do not propose to use
so few data points to construct cloud masks for real-world applications.
These two examples merely show how well this approach operates and that a
surprisingly small number of data might be sufficient to explore the
application of classical Bayesian cloud masks.</p>
      <p>The Gaussian smoothing approach works reasonably well and is so far only
justified by its actual success for a particular problem, where in fact
sufficient numbers of artificial truth data are available. Its general
application to situations with limited numbers of such data is therefore not
very well justified. However, numerical experiments with the available data
have shown that this approach yields remarkably good results. Other
functional kernels have not been tested, but the Gaussian approach seems
sufficient since the convolved histograms yield nearly the same skill score
as the original histograms. Success of this approach is likely based on the
fact that the smoothing procedure distributes data to neighboring bins but
does not strongly change the defining spectral features of the measurements.
That is, it implicitly creates data which could represent different viewing
geometries or situations with slightly varying optical parameters. Hence,
this approach is not justified by first principles but rather with working
examples which strengthen our expectations that this approach will work
reasonably well for any other set of features.</p>
</sec>
<sec id="Ch1.S6">
  <title>Synergy cloud mask</title>
      <p>The Synergy cloud mask is discussed in detail by <xref ref-type="bibr" rid="bib1.bibx6" id="text.27"/> and
is implemented as an external processor for the BEAM toolbox
<xref ref-type="bibr" rid="bib1.bibx4" id="paren.28"/>. It is based on radiative transfer simulations covering
all spectral bands of MERIS and AATSR and statistical analysis of data
classified by human experts. Within the frame of the ESA Cloud CCI project phase 1,
the years 2007–2009 of the MERIS and AATSR time series were processed. The
derived cloud cover (or cloud number) was assessed in several validation
exercises, e.g., compared to cloud numbers from the GEWEX CA database
<xref ref-type="bibr" rid="bib1.bibx23" id="paren.29"/>, which consists of a number of data sets with gridded
and monthly mean cloud number derived from a variety of satellite
instruments. Results of global mean cloud number are in line with GEWEX cloud
numbers <xref ref-type="bibr" rid="bib1.bibx10" id="paren.30"/>. The cloud mask product from the years 2007 to
2009 can be used as a large source of artificial truth data for the synergy
data set.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><caption><p>Difference of the two-dimensional histograms
for <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The left panel shows direct
results using 990 000 globally distributed measurements, while for the right
panel only 1000 measurements were used. The histograms on the right side were
post-processed using Gaussian smoothing with a width parameter of <inline-formula><mml:math display="inline"><mml:mn>1.84</mml:mn></mml:math></inline-formula>.</p></caption>
        <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f03.pdf"/>

      </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4" specific-use="star"><caption><p>Similar to Fig. <xref ref-type="fig" rid="Ch1.F3"/>
but for a different set of features and a different Gaussian smoothing factor
of <inline-formula><mml:math display="inline"><mml:mn>2.15</mml:mn></mml:math></inline-formula>. This set of features includes the MERIS Oxygen A band absorption
channel.</p></caption>
        <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f04.pdf"/>

      </fig>

</sec>
<sec id="Ch1.S7">
  <title>Application to MERIS, AATSR, and their synergistic product</title>
      <p>When the computation of <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>C</mml:mi><mml:mo mathvariant="normal">¯</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is based on the
frequentist approach and artificial truth data, then three major
applications of the technique become feasible. Results from existing
algorithms can be reproduced using the Bayesian technique, which could
potentially speed up and simplify the cloud masking of large numbers of data.
With the Synergy cloud mask from Sect. <xref ref-type="sec" rid="Ch1.S6"/> as an example,
this procedure is discussed in the following Sect. <xref ref-type="sec" rid="Ch1.S7.SS1"/>.
When the existing algorithm is reproduced reasonably well, one can use this
technique to further enhance the algorithm, which is discussed in Sect. <xref ref-type="sec" rid="Ch1.S7.SS2"/>. A simple example in which data classified by a human
expert are used to set up a Bayesian cloud mask is discussed in Sect. <xref ref-type="sec" rid="Ch1.S7.SS3"/>.</p>
<sec id="Ch1.S7.SS1">
  <title>Reproduction of existing algorithms</title>
      <p>A Bayesian cloud mask can be used to approximate independent algorithms but
with the advantage of possibly drastically decreased computation times.
However, it is not obvious that a particular algorithm is reproducible to a
sufficient extent with this technique. Artificial truth data from the Synergy
cloud mask, which was briefly discussed in Sect. <xref ref-type="sec" rid="Ch1.S4"/>,
are used as a test case and a large number of Bayesian cloud masks with
different feature sets were created and ranked according to their skill
score. The joint probabilities were estimated using globally equally
distributed data from the year <inline-formula><mml:math display="inline"><mml:mn>2007</mml:mn></mml:math></inline-formula>, and similarly distributed data from the
year <inline-formula><mml:math display="inline"><mml:mn>2008</mml:mn></mml:math></inline-formula> were used to compute the skill score, which is used to assess the
ability of the cloud mask to reproduce the Synergy cloud mask. An even
regional and temporal distribution of the initial data is crucial to cover the
widest possible range of combinations of surface reflectance, atmospheric
condition, and non-cloudy and cloudy cases. When the background probabilities
are estimated such that they cover the same representative range of surface
and atmospheric conditions, correct classification is limited only by the
information content carried by the selected set of features. For instance,
when bright snow and desert surfaces are not included in the set of cloud-free
cases, such scenes can easily be misclassified as cloudy, even if the set
of features would in principle be sufficient for a correct classification.</p>
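      <p>The estimation step described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names (<monospace>bin_index</monospace>, <monospace>train</monospace>, <monospace>cloud_probability</monospace>), the toy feature values, and the choice of four equally sized bins per feature are assumptions made for illustration only. The sketch counts training samples per histogram bin separately for the cloudy and the cloud-free class and derives the cloud probability from the ratio of these joint counts, since the common normalization cancels.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def bin_index(features, lo, hi, n_bins):
    """Map a feature vector to a tuple of histogram bin indices."""
    idx = []
    for x, a, b in zip(features, lo, hi):
        i = int((x - a) / (b - a) * n_bins)
        idx.append(min(max(i, 0), n_bins - 1))  # clip to the valid range
    return tuple(idx)


def train(samples, labels, lo, hi, n_bins):
    """Count training samples per bin, separately for cloudy and clear."""
    cloudy, clear = Counter(), Counter()
    for f, is_cloudy in zip(samples, labels):
        (cloudy if is_cloudy else clear)[bin_index(f, lo, hi, n_bins)] += 1
    return cloudy, clear


def cloud_probability(f, cloudy, clear, lo, hi, n_bins):
    """P(C | F) = N(F, C) / (N(F, C) + N(F, not C)); None for empty bins."""
    key = bin_index(f, lo, hi, n_bins)
    total = cloudy[key] + clear[key]
    return cloudy[key] / total if total else None


# Toy training data (made-up values): two features per sample and the
# truth label taken from a reference cloud mask.
train_x = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.7),
           (0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.9, 0.1)]
train_y = [True, True, True, False, False, False, False]
lo, hi, n_bins = (0.0, 0.0), (1.0, 1.0), 4

cloudy, clear = train(train_x, train_y, lo, hi, n_bins)
p_bright = cloud_probability((0.95, 0.85), cloudy, clear, lo, hi, n_bins)
p_dark = cloud_probability((0.05, 0.05), cloudy, clear, lo, hi, n_bins)
p_unseen = cloud_probability((0.1, 0.9), cloudy, clear, lo, hi, n_bins)
print(p_bright, p_dark, p_unseen)
```
]]></preformat>
      <p>In this toy example, bright scenes occur only in the cloudy training class, so any bright scene receives a high cloud probability regardless of its true nature, and a feature combination that never occurred in the training data yields no estimate at all. Both effects illustrate why the training data need to cover a representative range of conditions.</p>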
      <p>The presented results do not necessarily represent a global optimum since only a
small fraction of the search space was covered in the finite search time.
For a given number of features and a given approach (classical or naive
Bayesian), the skill scores did not exceed a certain upper bound in any test
case, but many feature sets with skill scores close to that soft limit
were found.</p>
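      <p>A search that covers only part of the space of feature sets can be sketched as follows. The search strategy is not restated in this section, so the random sampling used here is an assumption, as are the function names, the candidate feature names, and the made-up scoring function <monospace>toy_score</monospace>, which merely stands in for training and evaluating one Bayesian cloud mask. The scoring function saturates at a fixed value to mimic a soft limit that many different feature sets can reach.</p>
<preformat preformat-type="code"><![CDATA[
```python
import itertools
import random


def random_search(candidates, n_features, score, n_trials, seed=0):
    """Score randomly drawn feature sets and return them ranked by score."""
    rng = random.Random(seed)
    evaluated = {}
    for _ in range(n_trials):
        subset = tuple(sorted(rng.sample(candidates, n_features)))
        if subset not in evaluated:  # skip feature sets already scored
            evaluated[subset] = score(subset)
    return sorted(evaluated.items(), key=lambda item: item[1], reverse=True)


# Hypothetical information value per candidate feature (made-up numbers).
VALUE = {"f412": 0.50, "f442": 0.48, "f620": 0.20,
         "f865": 0.25, "f11um": 0.45, "f12um": 0.40}


def toy_score(subset):
    """Stand-in for the skill score of one trained cloud mask."""
    return min(0.80, sum(VALUE[name] for name in subset))


candidates = sorted(VALUE)
ranked = random_search(candidates, 2, toy_score, n_trials=8)

# Exhaustive evaluation is only feasible here because the toy space is tiny.
best_possible = max(toy_score(s) for s in itertools.combinations(candidates, 2))
print(ranked[0], best_possible)
```
]]></preformat>
      <p>With a limited number of trials, the best feature set found is at most as good as the true optimum, which is why the ranked results are best read as good candidates rather than as proven optima.</p>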
      <p>Figures <xref ref-type="fig" rid="Ch1.F5"/> and <xref ref-type="fig" rid="Ch1.F6"/> show the global
distribution of skill scores for two classical Bayesian cloud masks based on
sets of two and four features. The increase to four features improves the
results, although not dramatically for the mean global skill score. The two
feature sets were the best candidates within the allowed search time for the
full Synergy set of channels. The results are best for ocean areas and worst
for areas with mountains (Nepal, west coast of northern USA), deserts
(Sahara, Arabian Peninsula), and ice- and snow-covered areas (poles, Siberia).
These are the areas where one would naturally expect major
difficulties in detecting clouds. The local skill scores in these areas were
significantly improved by increasing the number of features used to four.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5"><caption><p>Global distribution of skill scores for a
classical Bayesian cloud mask using only two strongly independent features.
Data are shown for the year <inline-formula><mml:math display="inline"><mml:mn>2008</mml:mn></mml:math></inline-formula> and the joint probabilities of the mask
were estimated with data from the year <inline-formula><mml:math display="inline"><mml:mn>2007</mml:mn></mml:math></inline-formula>. The global skill score is
<inline-formula><mml:math display="inline"><mml:mn>0.78</mml:mn></mml:math></inline-formula> and the features used are shown in the title of the figure.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f05.pdf"/>

        </fig>

      <p>Interpreting spatial patterns of skill score or reproducibility is not
straightforward. It is difficult to differentiate between poor
reproducibility caused by inherent limitations of the selected feature set
and that caused by inconsistencies or errors in the truth data. In general,
when one decides to trust the truth data, one can only vary
methodological parameters such as the selected features or the bin size of the
histograms in order to optimize the reproducibility. It is then up to the
potential user whether a certain skill score meets the requirements for the
desired application.</p>
      <p>The data used to produce Figs. <xref ref-type="fig" rid="Ch1.F5"/> and <xref ref-type="fig" rid="Ch1.F6"/>
were sorted and used to generate the overview shown in Fig. <xref ref-type="fig" rid="Ch1.F7"/>.
Shown is the computed cloud probability
from the two Bayesian cloud masks, separated for the cloudy and
non-cloudy group as classified by the Synergy cloud mask. The threshold of
<inline-formula><mml:math display="inline"><mml:mn>0.5</mml:mn></mml:math></inline-formula> cloud probability is also shown and was used as separation between the
cloudy and non-cloudy class. This representation shows the cause of non-unity
skill score. Here, the misses (number of red points before crossing the blue
line vs. those beyond) and the false alarms (number of green points after
crossing the blue line vs. number of points before crossing) are quite
similar. Figure <xref ref-type="fig" rid="Ch1.F7"/> shows that the Bayesian cloud mask
with four features exhibits a much smoother distribution of probabilities and
a decreased rate of misses, while the improvement of the false alarm rate is
only minor. The figure also illustrates the impact of changing the threshold
value. The overall skill score appears to be almost unaffected by such a
change: the false alarm rate decreases when the threshold is increased,
but at the same time the rate of misses increases, which counteracts the gain
in skill score.</p>
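      <p>The trade-off between misses and false alarms can be made explicit with a short Python sketch. The probabilities and truth labels below are made-up values, and the score is assumed to be of the Hanssen-Kuipers type (hit rate minus false alarm rate), which may differ from the skill score definition used in this study; the function names are likewise illustrative.</p>
<preformat preformat-type="code"><![CDATA[
```python
def rates(probabilities, truth, threshold=0.5):
    """Return (miss rate, false alarm rate) for one probability threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, is_cloudy in zip(probabilities, truth):
        flagged = p >= threshold  # pixel is classified as cloudy
        if is_cloudy and flagged:
            hits += 1
        elif is_cloudy:
            misses += 1
        elif flagged:
            false_alarms += 1
        else:
            correct_negatives += 1
    miss_rate = misses / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return miss_rate, false_alarm_rate


def skill_score(probabilities, truth, threshold=0.5):
    """Assumed score: hit rate minus false alarm rate."""
    miss_rate, false_alarm_rate = rates(probabilities, truth, threshold)
    return (1.0 - miss_rate) - false_alarm_rate


# Made-up cloud probabilities: five cloudy pixels followed by five clear ones.
probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.1, 0.2, 0.3, 0.45, 0.65]
truth = [True] * 5 + [False] * 5

for threshold in (0.3, 0.5, 0.7):
    miss_rate, false_alarm_rate = rates(probabilities, truth, threshold)
    score = skill_score(probabilities, truth, threshold)
    print(threshold, miss_rate, false_alarm_rate, round(score, 3))
```
]]></preformat>
      <p>In this toy data set, raising the threshold from 0.5 to 0.7 removes the false alarms but doubles the rate of misses, so the assumed score stays the same, whereas lowering the threshold to 0.3 removes the misses at the cost of many more false alarms.</p>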

      <?xmltex \floatpos{t}?><fig id="Ch1.F6"><caption><p>Similar to Fig. <xref ref-type="fig" rid="Ch1.F5"/>
but for a different classical Bayesian cloud mask based on four strongly
independent features. The global skill score is <inline-formula><mml:math display="inline"><mml:mn>0.83</mml:mn></mml:math></inline-formula>.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f06.pdf"/>

        </fig>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1"><caption><p>Best found results for
feature sets of classical Bayesian cloud masks with two strongly independent
features which best recreate Synergy cloud mask results. The results are
separated for the Synergy of MERIS and AATSR, MERIS, and AATSR. Channels are
referenced by their central wavelength. MERIS channels use the unit nm,
while  AATSR channels use <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m.</p></caption><oasis:table frame="topbot"><?xmltex \begin{scaleboxenv}{.95}[.95]?><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row>  
         <oasis:entry colname="col1"><inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col2">Instrument</oasis:entry>  
         <oasis:entry colname="col3">Skill</oasis:entry>  
         <oasis:entry colname="col4">Feature set</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1"/>  
         <oasis:entry colname="col2"/>  
         <oasis:entry colname="col3">score</oasis:entry>  
         <oasis:entry colname="col4"/>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.781</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4"><inline-formula><mml:math display="inline"><mml:mn>620</mml:mn></mml:math></inline-formula>–900 nm, 412 nm–11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.780</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4"><inline-formula><mml:math display="inline"><mml:mn>442</mml:mn></mml:math></inline-formula> nm–11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 778–708 nm</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.776</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">885–620 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 442 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.781</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4"><inline-formula><mml:math display="inline"><mml:mn>412</mml:mn></mml:math></inline-formula> nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(885, 865 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.774</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">412 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(900, 681 nm)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.773</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">442 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(900, 708 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.707</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">12<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>0.55 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 3.7<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.706</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">0.55<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(3.7, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">2</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.706</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">0.55<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(12, 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup><?xmltex \end{scaleboxenv}?></oasis:table></table-wrap>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2" specific-use="star"><caption><p>Similar to
Table <xref ref-type="table" rid="Ch1.T1"/> but for classical
Bayesian cloud masks based on four strongly independent features.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row>  
         <oasis:entry colname="col1"><inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col2">Instrument</oasis:entry>  
         <oasis:entry colname="col3">Skill</oasis:entry>  
         <oasis:entry colname="col4">Feature set</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1"/>  
         <oasis:entry colname="col2"/>  
         <oasis:entry colname="col3">score</oasis:entry>  
         <oasis:entry colname="col4"/>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.826</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">1.6 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 681<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>778 nm,  d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(0.55 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 760 nm),  d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 412 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.821</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">412 nm, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m,  753 nm <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 1.6 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(11, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.820</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">442 nm, 3.7–11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(665, 753 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.822</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">412 nm, 900 nm <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 510 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(760, 620 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(885, 865 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.821</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">753<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>510 nm, 442 nm <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 412 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(865, 753 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(885, 760 nm)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.818</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">442, 665<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>900 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(560, 510 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(865, 885 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.765</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">12, 0.55 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 12<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 0.55<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.757</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">0.67 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 12<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 0.87<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>0.55 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 11<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>0.67 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">4</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.757</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">0.55<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(0.55, 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(12, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T3" specific-use="star"><caption><p>Similar to
Table <xref ref-type="table" rid="Ch1.T1"/> but for naive Bayesian
cloud masks based on five strongly
independent features.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row>  
         <oasis:entry colname="col1"><inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col2">Instrument</oasis:entry>  
         <oasis:entry colname="col3">Skill</oasis:entry>  
         <oasis:entry colname="col4">Feature set</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1"/>  
         <oasis:entry colname="col2"/>  
         <oasis:entry colname="col3">score</oasis:entry>  
         <oasis:entry colname="col4"/>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.756</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 760, 412, 560 nm <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 490 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 865 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.751</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">681–900 nm, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m–412 nm, 0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>865 nm, 560 nm <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(708, 490 nm)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">Synergy</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.750</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">778, 560 nm, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m–412 nm, 900–620 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(1.6 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 442 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.753</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">412, 442, 865, 560<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>490 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(681, 900 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.750</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">412, 510–708 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(885, 760 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(665, 900 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(620, 412 nm)</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">MERIS</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.749</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">760, 412, 865–490 nm, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(900, 708 nm), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(681, 778 nm)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.695</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">11, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 11–0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 3.7<inline-formula><mml:math display="inline"><mml:mo>/</mml:mo></mml:math></inline-formula>11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 0.55 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.692</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">0.55, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 3.7–12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 11–0.87 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(11, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
       <oasis:row>  
         <oasis:entry colname="col1">5</oasis:entry>  
         <oasis:entry colname="col2">AATSR</oasis:entry>  
         <oasis:entry colname="col3"><inline-formula><mml:math display="inline"><mml:mn>0.691</mml:mn></mml:math></inline-formula></oasis:entry>  
         <oasis:entry colname="col4">11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 12–3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m <inline-formula><mml:math display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 3.7 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(0.87, 11 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m), d<inline-formula><mml:math display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula>(0.55, 12 <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p>Similar results can be achieved by using different combinations of feature
functions and channels. An overview of results for the Synergy data,
MERIS, and AATSR alone is given in Tables <xref ref-type="table" rid="Ch1.T1"/>,
<xref ref-type="table" rid="Ch1.T2"/>, and <xref ref-type="table" rid="Ch1.T3"/>.
Tables <xref ref-type="table" rid="Ch1.T1"/> and <xref ref-type="table" rid="Ch1.T2"/> show results for classical
Bayesian cloud masks with strongly independent feature sets for two and
four features, respectively. Table <xref ref-type="table" rid="Ch1.T3"/>
shows results for naive Bayesian cloud masks with five strongly independent
features. Classical Bayesian cloud masks based on two strongly independent
features show best results when the complete Synergy channel set or MERIS alone
is used. The results for AATSR alone are significantly inferior. For the
Synergy data set, the <inline-formula><mml:math display="inline"><mml:mn>11</mml:mn></mml:math></inline-formula> <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m channel in combination with a MERIS channel
in the blue (412 and 442 nm) is found in all three top results. For
MERIS alone, a combination of a channel in the blue and an index of red and
short-wave infrared channels is found in the top results. It is quite counterintuitive that the best results for MERIS are achieved with only three
different channels, while the algorithm had the freedom to select up to four
channels. The best result for the set of Synergy channels included four
channels, which relates more to the naive intuition that more channels carry
more information and would therefore be better suited for the application.
However, since the search space was not fully covered, a better solution for
MERIS with four channels could still be found.</p>
      <p>Table <xref ref-type="table" rid="Ch1.T2"/> shows similar results but
for classical Bayesian cloud masks based on a set of four features. Again,
Synergy and MERIS results are significantly better than those from AATSR,
while the Synergy results are only slightly better than those from MERIS
alone. Every available feature function appears somewhere in the results,
although no single result uses all of them.</p>
      <p>Similar studies were also performed for higher numbers of features, but no
results with significantly higher skill scores were found. The skill score
results for three features lie between the
two discussed results, such that four features seem to be the best choice to
reproduce the Synergy cloud mask with a classical Bayesian cloud mask based
on strongly independent features.</p>
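      <p>The skill score is not restated in this section. As a minimal sketch, and assuming that it is the Hanssen–Kuipers discriminant (hit rate minus false-alarm rate) suggested by the cited work of Hanssen and Kuipers, the comparison of a candidate cloud mask with the artificial truth could look as follows. The function name <monospace>hanssen_kuipers</monospace> and the boolean input convention are illustrative choices, not part of the original algorithm.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def hanssen_kuipers(truth, predicted):
    """Hit rate minus false-alarm rate for boolean cloud flags.

    Assumes that both classes (cloudy and non-cloudy) occur in `truth`;
    otherwise one of the two rates is undefined.
    """
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    hits = np.sum(truth & predicted)
    misses = np.sum(truth & ~predicted)
    false_alarms = np.sum(~truth & predicted)
    correct_negatives = np.sum(~truth & ~predicted)
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return float(hit_rate - false_alarm_rate)
```
]]></preformat>
      <p>Under this assumption, a perfect reproduction of the artificial truth yields a score of 1, and a mask that carries no information yields a score near 0.</p>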

      <?xmltex \floatpos{t}?><fig id="Ch1.F7"><caption><p>Cloud probability from the two classical
Bayesian cloud masks from Fig. <xref ref-type="fig" rid="Ch1.F5"/> (dashed line,
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mi mathvariant="bold-italic">A</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>) and Fig. <xref ref-type="fig" rid="Ch1.F6"/> (solid line,
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">F</mml:mi><mml:mi mathvariant="bold-italic">B</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>) separated by cases which were labeled as cloudy (red) and
non-cloudy (green) by the Synergy cloud mask. The same data as in
Figs. <xref ref-type="fig" rid="Ch1.F5"/> and <xref ref-type="fig" rid="Ch1.F6"/> were used and the
results were sorted for a better overview. The threshold of 0.5 cloud
probability is marked with a blue line.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f07.pdf"/>

        </fig>

      <p>Similar searches for naive Bayesian cloud masks with strongly independent
features were performed for 5 and <inline-formula><mml:math display="inline"><mml:mn>15</mml:mn></mml:math></inline-formula> features, and results for five
features are shown in Table <xref ref-type="table" rid="Ch1.T3"/>. The
search with <inline-formula><mml:math display="inline"><mml:mn>15</mml:mn></mml:math></inline-formula> features did not show significantly better results than the
ones shown. In general, these results are not as successful in reproducing
the Synergy cloud mask as the approaches with the classical Bayesian cloud
mask. Skill scores for AATSR alone are smaller than for MERIS and Synergy and
also generally smaller than for the classical approach with four features.</p>
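      <p>For orientation, the naive approach can be sketched as a product of one-dimensional class-conditional histograms. The following is an illustrative sketch only: the function names, the equidistant binning between the sample minimum and maximum, and the add-one smoothing of the counts are assumptions of this example and are not taken from the operational implementation.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def fit_naive_mask(features, is_cloud, n_bins=40):
    """Estimate 1-D likelihood histograms per feature for both classes.

    features: array of shape (n_samples, n_features)
    is_cloud: boolean array of shape (n_samples,), the artificial truth
    """
    features = np.asarray(features, dtype=float)
    is_cloud = np.asarray(is_cloud, dtype=bool)
    edges, p_cloud, p_clear = [], [], []
    for column in features.T:
        e = np.linspace(column.min(), column.max(), n_bins + 1)
        h_cloud, _ = np.histogram(column[is_cloud], bins=e)
        h_clear, _ = np.histogram(column[~is_cloud], bins=e)
        # Add-one smoothing (assumption of this sketch) avoids zero likelihoods.
        p_cloud.append((h_cloud + 1) / (h_cloud.sum() + n_bins))
        p_clear.append((h_clear + 1) / (h_clear.sum() + n_bins))
        edges.append(e)
    prior = is_cloud.mean()  # cloud fraction of the truth sample
    return edges, p_cloud, p_clear, prior


def naive_cloud_probability(model, features):
    """Combine the per-feature likelihoods under the independence assumption."""
    edges, p_cloud, p_clear, prior = model
    features = np.asarray(features, dtype=float)
    log_cloud = np.full(len(features), np.log(prior))
    log_clear = np.full(len(features), np.log(1.0 - prior))
    for i, e in enumerate(edges):
        # Map each value to its bin; values outside the range use the edge bins.
        idx = np.clip(np.digitize(features[:, i], e) - 1, 0, len(e) - 2)
        log_cloud += np.log(p_cloud[i][idx])
        log_clear += np.log(p_clear[i][idx])
    return 1.0 / (1.0 + np.exp(log_clear - log_cloud))
```
]]></preformat>
      <p>Because only one-dimensional histograms are stored, the memory demand grows linearly with the number of features, which is what makes searches with 15 features feasible in the first place.</p>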
      <p>Concluding this aspect, it is possible to find feature sets that reproduce
the Synergy cloud mask reasonably well even without covering the complete
search space. For a soft upper limit of the skill score, different feature
sets with similar skill score can be found. This is actually not surprising
and represents the fact that the same classification results in terms of
skill score can be achieved with many different feature sets. From a
technical point of view, it is then sufficient to choose one of the results with
the best skill scores, even if this might not be the absolute global maximum.</p>
      <p>Some commonly used features, such as the brightness temperature difference of
<inline-formula><mml:math display="inline"><mml:mn>11</mml:mn></mml:math></inline-formula>  and <inline-formula><mml:math display="inline"><mml:mn>12</mml:mn></mml:math></inline-formula> <inline-formula><mml:math display="inline"><mml:mi mathvariant="normal">µ</mml:mi></mml:math></inline-formula>m, did not appear in the shown results. However, this
does not indicate that the found features are in general superior to those
missing. It simply means that during the search no set of features was
found that included them and showed better results. Restricting the search
space to cover only selected features is simple and could be used to limit
the results to features with known physical meaning.</p>
      <p>For both classical and naive Bayesian cloud masks, a specific set of
features should be evaluated as a whole. The effect of a certain feature on
the skill score for the total feature set can be estimated by evaluating
results for a particular set with and without the feature in question. The
effect on the skill score when adding a feature to a given set might strongly
depend on the original feature set. In addition, features which show only
poor reproduction skill when used alone might significantly improve the skill
score for a certain set of features.</p>
      <p>Next, the impact of the number of bins <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, Gaussian smoothing value, and
sample size of the artificial truth data set is discussed. The sensitivity of
the Bayesian cloud mask in terms of skill score with respect to a certain
feature set is shown in Figs. <xref ref-type="fig" rid="Ch1.F8"/> and <xref ref-type="fig" rid="Ch1.F9"/>. Both figures show skill scores for Synergy
cloud mask artificial truth data with respect to number of bins, Gaussian
smoothing factor, and sample size of the artificial truth data. Figure <xref ref-type="fig" rid="Ch1.F8"/> shows an extreme case where only <inline-formula><mml:math display="inline"><mml:mn>100</mml:mn></mml:math></inline-formula> randomly
selected globally distributed cases were used as artificial truth. Again, the
year <inline-formula><mml:math display="inline"><mml:mn>2007</mml:mn></mml:math></inline-formula> was used as pool for the artificial truth and the year <inline-formula><mml:math display="inline"><mml:mn>2008</mml:mn></mml:math></inline-formula> to
compute the skill score. The skill score of a cloud mask based on
such a small sample clearly depends on the sample itself. The procedure
was repeated 10 times and the achieved mean skill score is shown. The
standard deviation in the last digit is shown in parentheses.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F8"><caption><p>Skill score of a classical Bayesian
cloud mask with four strongly independent features with respect to number of
bins for each dimension of the underlying histograms and the applied Gaussian
smoothing. Artificial truth data are taken from <inline-formula><mml:math display="inline"><mml:mn>2007</mml:mn></mml:math></inline-formula> and skill scores were
computed for the year <inline-formula><mml:math display="inline"><mml:mn>2008</mml:mn></mml:math></inline-formula>. Only <inline-formula><mml:math display="inline"><mml:mn>100</mml:mn></mml:math></inline-formula> randomly selected and globally
distributed spectra were used to compute the histograms. This selection was
repeated 10 times and mean values for the skill score are shown. The
standard deviation on the last significant digit is shown in parentheses.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f08.pdf"/>

        </fig>

      <p>With no Gaussian smoothing applied, the skill score clearly decreases with
increasing number of bins since the sample size is much too small for this
resolution. Also, the impact of the sample is largest when the standard
deviation is highest. The skill score increases with increasing number of
bins and Gaussian smoothing until a maximum is reached. With a further increase in
bin number and smoothing, the skill score decreases only slightly. In
this case, an optimal combination of bin number and smoothing can be found. When smaller
values are used, the skill scores are drastically reduced, but when larger
values are used, the skill score decreases only slightly.</p>
      <p>A similar sensitivity study is shown in Fig. <xref ref-type="fig" rid="Ch1.F9"/>, but here a much larger sample size of
artificial truth data was used. Again, without Gaussian smoothing the
smallest number of bins shows the best results, while with increasing number
of bins the skill score decreases because the total number of histogram cells grows with
the fourth power of the number of bins per dimension. A large plateau of consistently
stable and high skill score values is found for numbers of bins above <inline-formula><mml:math display="inline"><mml:mn>25</mml:mn></mml:math></inline-formula>
and Gaussian smoothing  above <inline-formula><mml:math display="inline"><mml:mn>0.9</mml:mn></mml:math></inline-formula>.</p>
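      <p>The fourth-power growth can be made concrete with a short calculation of the mean number of truth samples per histogram cell for four features. The helper name <monospace>mean_samples_per_bin</monospace> is hypothetical, and the list of bin numbers is chosen for illustration only; the sample size of 100 k cases is the one stated for this sensitivity study.</p>
<preformat preformat-type="code"><![CDATA[
```python
def mean_samples_per_bin(n_samples, n_bins, n_features=4):
    """Average number of truth samples per cell of an n_features-D histogram."""
    return n_samples / n_bins ** n_features


# 100 k cases distributed over a four-dimensional histogram.
for n_bins in (5, 10, 25, 40):
    cells = n_bins ** 4
    print(n_bins, cells, mean_samples_per_bin(100_000, n_bins))
```
]]></preformat>
      <p>With 10 bins per dimension there are on average 10 samples per cell, whereas with 25 bins fewer than one sample per cell remains, so that without smoothing most cells of the histogram are empty.</p>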
      <p>In both cases, for small and very large sample sizes of artificial truth
data the skill score decreases with increasing Gaussian smoothing for small
numbers of bins. This clearly shows that overly strong Gaussian smoothing can
destroy information in an accurately estimated histogram but distributes
information in incomplete histograms such that it better represents the true
probability density.</p>
      <p>In general, one can not perform such studies to assess the optimal number of
bins and value of Gaussian smoothing parameter, because only an insufficient
number of artificial truth data might be available. The presented results
from numerical experiments indicate that for four features and a sufficiently
large sample of artificial truth data, a bin size of <inline-formula><mml:math display="inline"><mml:mn>40</mml:mn></mml:math></inline-formula> with a Gaussian
smoothing of <inline-formula><mml:math display="inline"><mml:mn>1.5</mml:mn></mml:math></inline-formula> is a good choice. This result holds not only for the
presented feature set but also for many other sets which have been assessed
during this research.</p>
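      <p>The interplay of binning and smoothing discussed above can be illustrated with a minimal sketch of a classical Bayesian cloud mask built from smoothed multidimensional histograms. This is not the operational implementation: the function names are hypothetical, the smoothing width is assumed to be given in units of bins, and empty cells are assumed to fall back to the cloud fraction of the truth sample.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from scipy.ndimage import gaussian_filter


def fit_classical_mask(features, is_cloud, n_bins=40, sigma=1.5):
    """Build a cloud probability lookup table from joint histograms.

    features: array of shape (n_samples, n_features)
    is_cloud: boolean array of shape (n_samples,), the artificial truth
    sigma: width of the Gaussian smoothing in units of bins (assumption)
    """
    features = np.asarray(features, dtype=float)
    is_cloud = np.asarray(is_cloud, dtype=bool)
    edges = [np.linspace(c.min(), c.max(), n_bins + 1) for c in features.T]
    h_cloud, _ = np.histogramdd(features[is_cloud], bins=edges)
    h_clear, _ = np.histogramdd(features[~is_cloud], bins=edges)
    if sigma > 0:
        # Smoothing spreads counts into neighbouring, possibly empty cells.
        h_cloud = gaussian_filter(h_cloud, sigma=sigma, mode="constant")
        h_clear = gaussian_filter(h_clear, sigma=sigma, mode="constant")
    total = h_cloud + h_clear
    prior = is_cloud.mean()
    with np.errstate(invalid="ignore", divide="ignore"):
        # Cells that remain empty fall back to the prior cloud fraction.
        probability = np.where(total > 0, h_cloud / total, prior)
    return edges, probability


def cloud_probability(model, features):
    """Look up the cloud probability for each row of `features`."""
    edges, probability = model
    features = np.asarray(features, dtype=float)
    idx = tuple(
        np.clip(np.digitize(features[:, i], e) - 1, 0, len(e) - 2)
        for i, e in enumerate(edges)
    )
    return probability[idx]
```
]]></preformat>
      <p>Applying the mask to a measurement is then a single table lookup, and a cloud flag follows from comparing the returned probability with a threshold such as 0.5.</p>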
</sec>
<sec id="Ch1.S7.SS2">
  <title>Enhancements of existing algorithms</title>
      <p>It was shown so far that Bayesian cloud masks can be used to reproduce at
least one existing cloud mask up to a certain extent. It is unclear, however,
what the limiting factors are in global skill score with respect to this
particular cloud mask. A major contributor to this upper limit can be
inconsistencies in the artificial truth data set. Examples are shown in panels
a and b of Figs. <xref ref-type="fig" rid="Ch1.F10"/> and <xref ref-type="fig" rid="Ch1.F11"/>, which actually show the surroundings of the scenes
shown in Figs. <xref ref-type="fig" rid="Ch1.F1"/> and <xref ref-type="fig" rid="Ch1.F2"/>. Both figures show
some classification errors of the Synergy cloud mask. The top part of Fig. <xref ref-type="fig" rid="Ch1.F10"/> shows a partly cloudy scene over a large ice- or
snow-covered area which is completely masked as cloudy (white areas in panel
b). In addition, the arrow-shaped land area in the lower part of the figure
(Brodeur Peninsula on Baffin Island) is clearly not cloudy but is classified
as cloudy. Similarly in Fig. <xref ref-type="fig" rid="Ch1.F11"/>, the complete dust
storm east of the Korean peninsula is marked as cloudy. Such classification
errors introduce inconsistencies which affect the produced histograms and are
in general difficult to reproduce with an independent system.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F9"><caption><p>Similar to
Fig. <xref ref-type="fig" rid="Ch1.F8"/> but the sample size of the artificial
truth was 1000 times larger with 100 k cases.</p></caption>
          <?xmltex \igopts{width=241.848425pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f09.pdf"/>

        </fig>

      <p>The appearance of such errors does not mean that the algorithm should be
abandoned and with it all the work that has been invested into developing it.
Panels c and d in Figs. <xref ref-type="fig" rid="Ch1.F10"/> and <xref ref-type="fig" rid="Ch1.F11"/> show how the Bayesian cloud mask technique can be
used to enhance this existing algorithm when errors in the artificial truth
data are manually corrected by a human expert. Synergy cloud mask results
from these two orbits were manually corrected and used as artificial truth to
produce a classical Bayesian cloud mask based on four strongly independent
features. The two orbits were then reprocessed and the resulting cloud masks
and cloud probabilities are shown. Some artifacts at land and ice boundaries
are still present, but the major classification errors were strongly reduced.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F10"><caption><p><bold>(a)</bold> shows an RGB view of a
larger area of the scene which is shown in Fig. <xref ref-type="fig" rid="Ch1.F1"/>.
<bold>(b)</bold> shows results of the non-Bayesian Synergy cloud mask with some
classification errors over the top snow and ice region and the arrow-shaped
land area in the bottom of the figure. <bold>(c)</bold> shows results of a
Bayesian cloud mask which is based on corrected artificial truth from this
scene and the one shown in Fig. <xref ref-type="fig" rid="Ch1.F11"/>. <bold>(d)</bold> shows
the cloud probability results of this Bayesian cloud mask.</p></caption>
          <?xmltex \igopts{width=227.622047pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f10.jpg"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F11"><caption><p>Similar to
Fig. <xref ref-type="fig" rid="Ch1.F10"/> but a larger area corresponding to
Fig. <xref ref-type="fig" rid="Ch1.F2"/> is shown. <bold>(b)</bold> shows results of the non-Bayesian
Synergy cloud mask where the strong dust storm is completely classified as
cloud. <bold>(c)</bold> shows results of a Bayesian cloud mask which is based on
corrected artificial truth from this scene and the one shown in
Fig. <xref ref-type="fig" rid="Ch1.F10"/>. <bold>(d)</bold> shows the cloud probability
results of this Bayesian cloud mask.</p></caption>
          <?xmltex \igopts{width=227.622047pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f11.jpg"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F12" specific-use="star"><caption><p>Manual classifications of the scenes shown in
Figs. <xref ref-type="fig" rid="Ch1.F1"/> and <xref ref-type="fig" rid="Ch1.F2"/>. Shown are the cloudy and
non-cloudy classification together with an RGB view for two scenes (two
leftmost panels, blue is non-cloudy, red is cloudy), the resulting
cloud mask (two middle panels), and the cloud probability (rightmost two
panels).</p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://amt.copernicus.org/articles/8/1757/2015/amt-8-1757-2015-f12.jpg"/>

        </fig>

      <p>This result is merely shown as proof of concept for the enhancement of
existing algorithms. The shown case was limited to only two scenes which were
manually corrected and used as artificial truth for the Bayesian cloud mask,
which is therefore only strictly applicable to these two scenes. In a
realistic approach, one would need some knowledge on where the existing
algorithm performs below the requirements. This poses no real limitation and
will always be the case; otherwise one would have no incentive to improve the
existing algorithm. These cases, e.g., limited to certain areas, known weather
conditions, or certain periods of time, could be excluded from the artificial
truth data set while other correctly classified results are still included.
These introduced data gaps, or better representativity gaps, can then be
filled with artificial truth data from manual classification. Such an
approach can be used to focus the attention of the human experts to areas
where their expertise is most strongly needed and to use their available
labor in the most efficient way.</p>
      <p>As discussed in Sect. <xref ref-type="sec" rid="Ch1.S7.SS1"/>, possibly many
different feature sets can be used to recreate the algorithms which were used
to produce the artificial truth data. This property can be used to produce
much more robust cloud masking algorithms. When the seeding algorithm cannot
cope with missing data, e.g., when a required channel is flagged as
unusable or saturated, one can simply switch to a different Bayesian cloud
mask which does not depend on that channel. The operational version of the
cloud mask for the Cloud CCI project contains several ranked Bayesian cloud
masks, and when the top mask fails to produce a result, masks of successively lower rank are
tried until one of them produces a result or the last mask has been used. This
approach can greatly reduce the number of unprocessed measurements for a
cloud masking scheme.</p><?xmltex \hack{\newpage}?>
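      <p>The ranked fallback can be sketched in a few lines. This is a hedged illustration and not the Cloud CCI code: the function name <monospace>apply_ranked_masks</monospace>, the dictionary layout of the channels, and the convention that unusable channels are marked by a missing entry, <monospace>None</monospace>, or a non-finite value are all assumptions of this example.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def apply_ranked_masks(ranked_masks, channels):
    """Return (rank, result) of the best-ranked mask whose channels are usable.

    ranked_masks: list of (required_channel_names, mask_function), best first
    channels: dict mapping a channel name to its value; a channel counts as
              unusable if it is missing, None, or not finite (assumption)
    """
    for rank, (required, mask_function) in enumerate(ranked_masks):
        values = [channels.get(name) for name in required]
        if all(v is not None and np.isfinite(v) for v in values):
            return rank, mask_function(*values)
    return None, None  # no mask could be applied to this measurement
```
]]></preformat>
      <p>Returning the rank together with the result allows downstream users to see which mask produced a given classification, which may be useful when the lower-ranked masks have somewhat lower skill scores.</p>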
</sec>
<sec id="Ch1.S7.SS3">
  <title>Cloud masks from manually classified data</title>
      <p>Human experts can produce artificial truth data of high quality by careful
manual classification of MERIS, AATSR, or Synergy images. It is of great
advantage that the spatial resolution of MERIS and AATSR images is high
enough  that spatial and spectral patterns together can be used to
classify data points. Cloud shadows, for instance, can be used to clearly
distinguish clouds from snow and ice surfaces. The algorithm
itself is not based on spatial information, but such information was certainly used to create
the artificial truth data. It is beyond the scope of this paper to produce a
cloud mask with global applicability, but the aim is to demonstrate how
straightforward such a procedure would be. The results presented here are
then clearly applicable to OLCI and SLSTR on board the upcoming Sentinel-3
satellite.</p>
      <p>The same two orbits which were discussed in Sects. <xref ref-type="sec" rid="Ch1.S4"/>
and <xref ref-type="sec" rid="Ch1.S7.SS2"/> are used for the procedure. Both orbits contain
scenes which are in general difficult to classify accurately, such as clouds
over a snow- and ice-covered region, cloud-free snow- and ice-covered surfaces,
or a pronounced dust storm. The manual classification setup was designed
such that no special computational knowledge is needed to perform the cloud
classification. For each test orbit, image files containing several layers
were created. The various image layers include an RGB image, contrast
stretched gray-scale images from Synergy channels, and several feature
functions which were found to be of good performance in the Bayesian
framework. To classify actual pixels, the human expert has to color areas
(e.g., blue color for cloud free and red color for cloudy) in a blank image
layer. By adjusting the transparency of the single layers, each scene can be
carefully inspected before a decision is made. The actual shape of the
colored areas is of lower importance as well as the actual number of
classified areas. However, the total variability of possible cases and scenes
should be included in the classification.</p>
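      <p>Turning the painted layer into artificial truth data is a simple per-pixel operation. The sketch below assumes an 8-bit RGB layer in which cloudy areas are painted pure red and cloud-free areas pure blue, as in the example above; the function name <monospace>labels_from_layer</monospace> and the color thresholds are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def labels_from_layer(layer_rgb):
    """Convert a painted 8-bit RGB layer into truth labels.

    Returns 1 for cloudy (red), 0 for cloud free (blue), and -1 for
    pixels that the human expert left unclassified.
    """
    layer = np.asarray(layer_rgb)
    red, green, blue = layer[..., 0], layer[..., 1], layer[..., 2]
    labels = np.full(layer.shape[:-1], -1, dtype=np.int8)
    # Thresholds are illustrative; they tolerate slight color deviations.
    labels[(red > 200) & (green < 50) & (blue < 50)] = 1
    labels[(blue > 200) & (green < 50) & (red < 50)] = 0
    return labels
```
]]></preformat>
      <p>Only pixels with a label of 0 or 1 would enter the histograms, so the unclassified remainder of the scene has no influence on the resulting cloud mask.</p>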
      <p>Results of this procedure are shown in Fig. <xref ref-type="fig" rid="Ch1.F12"/>. The
leftmost two panels show an RGB view of the scene, with the areas classified
by the human expert marked in blue and red. The total
classified area is small compared to the total size of the scene. Then, this
data set was used as artificial truth and a classical Bayesian cloud mask with
four strongly independent features was set up to process the two orbits.
The resulting cloud masks are shown in the middle two panels, while the
actual cloud probability is shown in the two rightmost panels.</p>
      <p>The Bayesian cloud mask is clearly able to separate the clouds from the snow
and ice surface, does not misclassify the land area (see Sect. <xref ref-type="sec" rid="Ch1.S7.SS2"/>), and is able to mostly separate clouds from the dust
storm. Most importantly, the human expert does not need to be an expert on
how to implement this mask or how to design hierarchies of thresholds;
rather, they
simply translate classification decisions into cloud mask results. These
images can be stored for future enhancements of the artificial truth data set
and as self-describing documentation of the algorithm.</p>
      <p>This approach is most straightforward when the spatial resolution of the
instrument in question is high enough that the human expert can use the
spatial pattern information to correctly distinguish cloudy from non-cloudy
areas. For global applicability, a higher number of orbits with
representative spatial and seasonal sampling should be included in the set of
considered artificial truth data. Especially complex cases such as scenes
with ice, snow, sun glint, mountains, or dust storms should be included in
the classification effort.</p>
</sec>
</sec>
<sec id="Ch1.S8" sec-type="conclusions">
  <title>Conclusions</title>
      <p>The application of the classical and naive Bayesian cloud masking technique
to MERIS, AATSR, and their Synergy was discussed in detail. Bayesian cloud
masks based on independent features are numerically highly efficient and are
very well suited for the fast processing of large amounts of data. This technique will be applied to a
reprocessing of the 9.5-year time series of MERIS and AATSR measurements within
ESA's Cloud CCI project.</p>
      <p>Details of the actual implementation of the Bayesian cloud mask for Cloud CCI
are not part of this paper. The algorithm is implemented in Python and is
based on the multiprocessing, SciPy, and NumPy libraries <xref ref-type="bibr" rid="bib1.bibx25" id="paren.31"/>.
Effective parallelization is achieved through separation of CPU-bound and
input / output (<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>I</mml:mi><mml:mo>/</mml:mo><mml:mi>O</mml:mi></mml:mrow></mml:math></inline-formula>) tasks. Processing time per orbit is largely dominated by
<inline-formula><mml:math display="inline"><mml:mrow><mml:mi>I</mml:mi><mml:mo>/</mml:mo><mml:mi>O</mml:mi></mml:mrow></mml:math></inline-formula> and the actual time spent in the Bayesian scheme is one order of
magnitude smaller than the total run time. Currently, the scheme supports the
classical and naive approach for independent Bayesian cloud masks. The final
set of features for processing the complete Cloud CCI period of 9.5 years
will be determined in the near future before starting the generation of level
3 data.</p>
      <p>Sufficient numbers of artificial truth data and the frequentist approach
can be used to estimate multidimensional histograms for the estimation of
background joint probabilities. Gaussian smoothing of appropriate width can
be used to drastically reduce the actual numbers of truth data needed to
compute histograms for the classical Bayesian approach. This post-processing
step greatly improves our ability to further explore the classical Bayesian
approach.</p>
      <p>Due to restrictions of modern computer hardware, the practical limit for the
classical Bayesian approach is reached with six to seven features. This does
not actually restrict its applicability, since trivial feature functions can
be used which combine any number of measurements into a single feature.</p>
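      <p>The origin of this limit can be illustrated with a rough memory estimate for the joint histograms. The numbers below rest on assumptions that are not stated in the text: 40 bins per dimension, 8 bytes per histogram cell, and one histogram per class (cloudy and non-cloudy). The helper name <monospace>histogram_memory_gib</monospace> is hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def histogram_memory_gib(n_features, n_bins=40, bytes_per_cell=8, n_classes=2):
    """Memory in GiB needed to hold dense joint histograms for all classes."""
    cells = n_bins ** n_features
    return n_classes * cells * bytes_per_cell / 2 ** 30


# Memory demand grows by a factor of n_bins with every additional feature.
for n_features in (4, 5, 6, 7):
    print(n_features, round(histogram_memory_gib(n_features), 2))
```
]]></preformat>
      <p>Under these assumptions, four features require a few tens of MiB, while six features already require roughly 61 GiB; a coarser binning or a smaller data type shifts the limit but does not remove the exponential growth.</p>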
      <p>It was found that classical Bayesian cloud masks with four strongly
independent features are the best choice for the cloud masking of MERIS,
AATSR, and their Synergy measurements when the Synergy cloud mask is used as
a benchmark. The classical approach gave significantly better results than the
naive approach. MERIS and the MERIS–AATSR Synergy give very similar results
in terms of cloud classification, while AATSR alone shows significantly
smaller skill scores. The MERIS Oxygen-A absorption channel was found to be
present in the best results when the set of selected feature functions and
channels was numerically optimized.</p>
      <p>The broad spectral range and the number of available channels within the
Synergy data set can be used to set up Bayesian cloud masks with very similar
classification skill but based on different combinations of channels. This
can be used to design cloud masking schemes which are robust against
partially missing data.</p>
      <p>It was shown how Bayesian cloud masks can be used to reproduce the results of
existing algorithms, to improve existing algorithms, and to set up new
classification schemes based on manual classification by human experts.
Reproducing existing algorithms offers the perspective of increased numerical
efficiency and processing robustness. The approach based on manual image
classification is straightforward for the human expert. Classified scenes can
be stored and revisited if the produced cloud masks show misclassifications
in certain areas or weather conditions. When errors are not traceable to errors
in the manual classification,  additional scenes can be added to the set
of artificial truth data to increase the chance of correct classification.</p>
      <p>The presented results for MERIS and AATSR can be used to implement an
accurate and highly efficient cloud masking scheme for OLCI and SLSTR on board
the upcoming Sentinel-3 satellite. In particular, the additional oxygen
absorption channels from the OLCI instrument might be used within an improved
and numerically efficient cloud classification algorithm.</p>
      <p>Although this paper is focused on strongly independent Bayesian cloud masks,
there is no apparent reason which prevents the application of the introduced
techniques to the case of dependent Bayesian cloud masks. It is
straightforward to include external information such as clear sky radiance
estimators or NWP fields in the proposed optimization strategy for the
construction of features. The application of Gaussian smoothing to derived
histogram fields is independent of external information and can be used to
reduce the numbers of needed truth data. To actually assess the added value of
the external data, one must ensure that the quality of the truth data is
sufficient. In the case of MERIS and AATSR, one likely needs a reasonably
large set of manually classified data.</p>
</sec>

      
      </body>
    <back><ack><title>Acknowledgements</title><p>This work has been funded by the European Space Agency in the framework
of the Climate Change Initiative project and by the German Federal
Ministry of Education and Research (BMBF) in the framework of the
<inline-formula><mml:math display="inline"><mml:mrow><mml:mtext>HD</mml:mtext><mml:mo>(</mml:mo><mml:mtext>CP</mml:mtext><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> project.<?xmltex \hack{\newline}?><?xmltex \hack{\newline}?>
Edited by:  A. Kokhanovsky</p></ack><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Carbajal Henken et al.(2014)Carbajal Henken, Lindstrot, Preusker, and
Fischer</label><mixed-citation>Carbajal Henken, C. K., Lindstrot, R., Preusker, R., and Fischer, J.: FAME-C:
cloud property retrieval using synergistic AATSR and MERIS observations,
Atmos. Meas. Tech. Discuss., 7, 4909–4947, <ext-link xlink:href="http://dx.doi.org/10.5194/amtd-7-4909-2014" ext-link-type="DOI">10.5194/amtd-7-4909-2014</ext-link>,
2014.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Coppo et al.(2010)Coppo, Ricciarelli, Brandani, Delderfield, Ferlet,
Mutlow, Munro, Nightingale, Smith, Bianchi, Nicol, Kirschstein, Hennig,
Engel, Frerick, and Nieke</label><mixed-citation>Coppo, P., Ricciarelli, B., Brandani, F., Delderfield, J., Ferlet, M., Mutlow,
C., Munro, G., Nightingale, T., Smith, D., Bianchi, S., Nicol, P.,
Kirschstein, S., Hennig, T., Engel, W., Frerick, J., and Nieke, J.: SLSTR: a
high accuracy dual scan temperature radiometer for sea and land surface
monitoring from space, J. Mod. Optic., 57, 1815–1830,
<ext-link xlink:href="http://dx.doi.org/10.1080/09500340.2010.503010" ext-link-type="DOI">10.1080/09500340.2010.503010</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>English et al.(1999)English, Eyre, and Smith</label><mixed-citation>
English, S., Eyre, J., and Smith, J.: A cloud-detection scheme for use with
satellite sounding radiances in the context of data assimilation for
numerical weather prediction, Q. J. Roy. Meteor.
Soc., 125, 2359–2378, 1999.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Fomferra and Brockmann(2005)</label><mixed-citation>
Fomferra, N. and Brockmann, C.: Beam-the ENVISAT MERIS and AATSR toolbox,
in: MERIS (A)ATSR Workshop 2005,   597, p. 13, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Gómez-Chova et al.(2006)Gómez-Chova, Camps-Valls,
Amorós-López, Guanter, Alonso, Calpe, and Moreno</label><mixed-citation>
Gómez-Chova, L., Camps-Valls, G., Amorós-López, J., Guanter, L.,
Alonso, L., Calpe, J., and Moreno, J.: New cloud detection algorithm for
multispectral and hyperspectral images: Application to ENVISAT/MERIS and
PROBA/CHRIS sensors, in: IEEE International Geoscience and Remote Sensing
Symposium, IGARSS,  2757–2760, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Gómez-Chova et al.(2008)Gómez-Chova, Camps-Valls,
Muñoz-Marí, Calpe, and Moreno</label><mixed-citation>
Gómez-Chova, L., Camps-Valls, G., Muñoz-Marí, J., Calpe, J., and Moreno,
J.: Cloud screening methodology for MERIS/AATSR Synergy products, in: Proc.
2nd MERIS/AATSR User Workshop, ESRIN, Frascati,   22–26, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Hall et al.(2002)Hall, Riggs, Salomonson, DiGirolamo, and
Bayr</label><mixed-citation>
Hall, D. K., Riggs, G. A., Salomonson, V. V., DiGirolamo, N. E., and Bayr,
K. J.: MODIS snow-cover products, Remote Sens. Environ., 83,
181–194, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Hanssen and Kuipers(1965)</label><mixed-citation>
Hanssen, A. W.  and  Kuipers, W. J. A.: On the Relationship Between the Frequency of Rain and
Various Meteorological Parameters: (with Reference to the Problem of Objective Forecasting), Staatsdrukkerij-en Uitgeverijbedrijf, 1965.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Heidinger et al.(2012)Heidinger, Evan, Foster, and
Walther</label><mixed-citation>
Heidinger, A. K., Evan, A. T., Foster, M. J., and Walther, A.: A naive Bayesian
cloud-detection scheme derived from CALIPSO and applied within PATMOS-x,
J. Appl. Meteorol. Clim., 51, 1129–1144, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Hollmann and Lecomte(2013)</label><mixed-citation>Hollmann, R. and Lecomte, D. P.: Climate Assessment Report, Tech. rep., ESA
Cloud CCI, available at:
<uri>http://www.esa-cloud-cci.org/sites/default/files/documents/public/Cloud_CCI_D4-2_CAR_1.0.pdf</uri> (last access: 13 April 2015), 2013.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Hollmann et al.(2013)Hollmann, Merchant, Saunders, Downy, Buchwitz,
Cazenave, Chuvieco, Defourny, De Leeuw, Forsberg, Holzer-Popp, Paul, Sandven,
Sathyendranath, and Roozendael</label><mixed-citation>
Hollmann, R., Merchant, C., Saunders, R., Downy, C., Buchwitz, M., Cazenave,
A., Chuvieco, E., Defourny, P., De Leeuw, G., Forsberg, R., Holzer-Popp, T.,
Paul, F., Sandven, S., Sathyendranath, S., and Roozendael, M.: The ESA
climate change initiative: Satellite data records for essential climate
variables, B. Am. Meteorol. Soc., 94, 1541–1552,
2013.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Kriegler et al.(1969)Kriegler, Malila, Nalepka, and
Richardson</label><mixed-citation>
Kriegler, F., Malila, W., Nalepka, R., and Richardson, W.: Preprocessing
transformations and their effects on multispectral recognition, in: Remote
Sens. Environ., VI, vol. 1, p. 97, 1969.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Llewellyn-Jones et al.(2001)Llewellyn-Jones, Edwards, Mutlow, Birks,
Barton, and Tait</label><mixed-citation>
Llewellyn-Jones, D., Edwards, M., Mutlow, C., Birks, A., Barton, I., and Tait,
H.: AATSR: Global-change and surface-temperature measurements from Envisat,
ESA Bulletin, 105, 11–21, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Mackie et al.(2010a)Mackie, Embury, Old, Merchant, and
Francis</label><mixed-citation>
Mackie, S., Embury, O., Old, C., Merchant, C., and Francis, P.: Generalized
Bayesian cloud detection for satellite imagery. Part 1: Technique and
validation for night-time imagery over land and sea, Int. J.
Remote Sens., 31, 2573–2594, 2010a.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Mackie et al.(2010b)Mackie, Merchant, Embury, and
Francis</label><mixed-citation>
Mackie, S., Merchant, C., Embury, O., and Francis, P.: Generalized Bayesian
cloud detection for satellite imagery. Part 2: Technique and validation for
daytime imagery, Int. J. Remote Sens., 31, 2595–2621,
2010b.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Merchant et al.(2005)Merchant, Harris, Maturi, and
MacCallum</label><mixed-citation>
Merchant, C., Harris, A., Maturi, E., and MacCallum, S.: Probabilistic
physically based cloud screening of satellite infrared imagery for
operational sea surface temperature retrieval, Q. J. Roy.
Meteor. Soc., 131, 2735–2755, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Miguel et al.(2007)Miguel, Bruno, Jean-Loup, Mark, Florence, Ulf,
Constantinos, Pierluigi, Bruno, and Jerome</label><mixed-citation>
Miguel, A., Bruno, B., Jean-Loup, B., Mark, D., Florence, H., Ulf, K.,
Constantinos, M., Pierluigi, S., Bruno, G., and Jerome, B.: Sentinel-3 – the
ocean and medium-resolution land mission for GMES operational services, ESA
Bulletin, 131, 24–29, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Murtagh et al.(2003)Murtagh, Barreto, and Marcello</label><mixed-citation>
Murtagh, F., Barreto, D., and Marcello, J.: Decision boundaries using Bayes
factors: the case of cloud masks, IEEE T. Geosci. Remote, 41, 2952–2958, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Nieke(2008)</label><mixed-citation>
Nieke, J.: Status of the optical payload and processor development of ESA's
Sentinel 3 mission, in: IEEE International Geoscience and Remote Sensing
Symposium, IGARSS 2008, 4, 427–430, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Rast et al.(1999)Rast, Bezy, and Bruzzi</label><mixed-citation>Rast, M., Bezy, J. L., and Bruzzi, S.: The ESA Medium Resolution Imaging
Spectrometer MERIS – a review of the instrument and its mission,
Int. J. Remote Sens., 20, 1681–1702,
<ext-link xlink:href="http://dx.doi.org/10.1080/014311699212416" ext-link-type="DOI">10.1080/014311699212416</ext-link>, 1999.
</mixed-citation></ref><?xmltex \hack{\newpage}?>
      <ref id="bib1.bibx21"><label>Rossow and Garder(1993)</label><mixed-citation>
Rossow, W. B. and Garder, L. C.: Cloud detection using satellite measurements
of infrared and visible radiances for ISCCP, J. Climate, 6,
2341–2369, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Schlundt et al.(2011)Schlundt, Kokhanovsky, von Hoyningen-Huene,
Dinter, Istomina, and Burrows</label><mixed-citation>Schlundt, C., Kokhanovsky, A. A., von Hoyningen-Huene, W., Dinter, T.,
Istomina, L., and Burrows, J. P.: Synergetic cloud fraction determination for
SCIAMACHY using MERIS, Atmos. Meas. Tech., 4, 319–337,
<ext-link xlink:href="http://dx.doi.org/10.5194/amt-4-319-2011" ext-link-type="DOI">10.5194/amt-4-319-2011</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Stubenrauch et al.(2012)Stubenrauch, Rossow, Kinne, Ackerman, Cesana,
Chepfer, and Di</label><mixed-citation>
Stubenrauch, C., Rossow, W., Kinne, S., Ackerman, S., Cesana, G., Chepfer, H.,
and Di, L.: Assessment of Global Cloud Data Sets from Satellites, A Project
of the World Climate Research Programme Global Energy and Water Cycle
Experiment (GEWEX) Radiation Panel, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Uddstrom et al.(1999)Uddstrom, Gray, Murphy, Oien, and
Murray</label><mixed-citation>
Uddstrom, M. J., Gray, W. R., Murphy, R., Oien, N. A., and Murray, T.: A
Bayesian cloud mask for sea surface temperature retrieval, J.
Atmos. Ocean. Tech., 16, 117–132, 1999.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>van der Walt et al.(2011)van der Walt, Colbert, and
Varoquaux</label><mixed-citation>van der Walt, S., Colbert, S., and Varoquaux, G.: The NumPy Array: A Structure
for Efficient Numerical Computation, Comput. Sci. Eng., 13,
22–30, <ext-link xlink:href="http://dx.doi.org/10.1109/MCSE.2011.37" ext-link-type="DOI">10.1109/MCSE.2011.37</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Woodcock(1976)</label><mixed-citation>
Woodcock, F.: The evaluation of yes/no forecasts for scientific and
administrative purposes, Mon. Weather Rev., 104, 1209–1214, 1976.</mixed-citation></ref>

  </ref-list><app-group content-type="float"><app><title/>

    </app></app-group></back>
    </article>
