HelioFTH: combining cloud index principles and aggregated rating for cloud masking using infrared observations from geostationary satellites

In this paper a cloud mask and cloud fractional coverage (CFC) retrieval scheme called HelioFTH is presented. The algorithm is self-calibrating and relies on infrared (IR) window-channel observations only. It needs no input from numerical weather prediction (NWP) or radiative transfer models, nor from other satellite platforms. The scheme is applicable to the full temporal and spatial resolution of the Meteosat Visible and InfraRed Imager (MVIRI) and the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) sensors. The main focus is on the separation of middle- and high-level cloud coverage (HCC) from low-level clouds based on an internal cloud-top pressure (CTP) product. CFC retrieval employs an IR-only cloud mask based on an aggregated rating scheme. CTP retrieval is based on a Heliosat-like cloud index for the MVIRI IR channel. CFC from HelioFTH, the International Satellite Cloud Climatology Project (ISCCP) DX and the Satellite Application Facility on Climate Monitoring (CM SAF) were validated with CFC from the Baseline Surface Radiation Network (BSRN) and the Alpine Surface Radiation Budget (ASRB) network. HelioFTH CFC differs by not more than 5–10 % from CM SAF CFC but it is higher than ISCCP-DX CFC. In particular, the conditional probability of detecting cloud-free pixels with HelioFTH is raised by about 35 % compared to ISCCP-DX. The HelioFTH CFC is able to reproduce the day-to-day variability observed at the surface. In addition, the HelioFTH HCC was inter-compared with CM SAF and ISCCP-DX over different regions and stations. The probability of false detection of cloud-free HCC pixels is of the same order as for ISCCP-DX when compared to the CM SAF HCC product over the full-disk area. HelioFTH could be used for generating an independent climate data record of cloud physical properties once its consistency and homogeneity are validated for the full Meteosat time series.


Introduction
Clouds play an essential role in determining the earth's radiation balance. They are essential factors regulating the global water cycle. Although clouds are of utmost relevance, the largest uncertainty of modeled climate predictions is related to the feedback of clouds to greenhouse gas changes (Trenberth et al., 2007). The automated identification of clouds in satellite measurements is a challenging task and a basic requirement for the processing of cloudy and clear-sky products. The Global Energy and Water Cycle Experiment (GEWEX) Radiation Panel initiated the GEWEX Cloud Assessment Project in 2005 with the objective to determine the accuracy and uncertainty sources of cloud properties retrieved from satellite observations in order to ease usability for the climate community. A summary of the GEWEX Cloud Assessment results was published by Stubenrauch et al. (2013). A well-known and valuable cloud dataset is the International Satellite Cloud Climatology Project (ISCCP) DX dataset (Rossow et al., 1996). ISCCP-DX is a pixel-level cloud product based on data from polar orbiting and geostationary satellites with 30 km horizontal and 3 h time resolution. It is the only global dataset that is freely available, that covers more than 25 yr and that resolves the diurnal cycle. Among others, the ISCCP-DX product includes data of the Meteosat geostationary satellites since July 1983.
B. Dürr et al.: Infrared-based cloud masking
The main requirements for the new cloud masking and cloud-top pressure retrieval scheme called HelioFTH are
1. It shall be applicable to daytime and nighttime Meteosat Visible and InfraRed Imager (MVIRI) and Spinning Enhanced Visible and InfraRed Imager (SEVIRI) observations without quality differences throughout the day. Therefore, the scheme can only be based on infrared window channel observations.
2. It shall be applicable to the full spatial and temporal resolution of MVIRI and SEVIRI observations.
3. No auxiliary input data from numerical weather prediction (NWP) or radiative transfer models, nor from other satellite platforms, shall be necessary.
4. It shall be able to detect middle- and high-level clouds, i.e., clouds with cloud-top pressure smaller than 680 hPa.
The Satellite Application Facility on Climate Monitoring (CM SAF) of EUMETSAT (European Organisation for the Exploitation of Meteorological Satellites), in cooperation with the Centre National de la Recherche Scientifique (CNRS), produced and released a long-term data record of free tropospheric humidity (FTH). The FTH retrieval is reliable under clear-sky and low-level cloud conditions, i.e., in the presence of clouds with cloud-top pressure larger than approximately 680 hPa. In a future release the FTH product shall be based on minimum temporal and spatial resolutions of 1 h and 0.25°, respectively. This goes beyond the specifications of currently available cloud mask data records. Therefore, CM SAF initiated the development of HelioFTH and intends to utilize the results.
The calculation of the long-wave cloud index (LCI) for HelioFTH using Heliosat principles (Cano et al., 1986; Beyer et al., 1996; Hammer et al., 2003; Dürr and Zelenka, 2009; Posselt et al., 2012) is based on raw sensor counts instead of brightness temperature. The calculation of brightness temperature is dependent on the calibration of the MVIRI IR window channel. Despite the blackbody cavity on board the Meteosat First Generation satellites, a vicarious calibration method has to be applied to obtain the calibration coefficients (Gube et al., 1996). The vicarious calibration is based on radiative transfer calculations for cloud-free pixels using atmospheric input data from NWP models. Using cloud-free pixels only leads to some deficiencies in the calibration of the coldest sensor counts (Knapp, 2008). Another potential vulnerability is the dependency on satellite data from different platforms, as demonstrated by Knapp (2008) for the ISCCP B1 dataset. Therefore a feasibility study is presented here which elaborates the potential to define an IR cloud mask for geostationary satellites based on Heliosat principles and a modified SPARC ("Separation of Pixels Using Aggregated Rating over Canada") rating scheme without the need for auxiliary model or satellite input data.
The paper is structured as follows. Section 2 gives an overview of satellite and surface data used to obtain and validate the different HelioFTH cloud products. Section 3 describes the various processing steps of the HelioFTH scheme in detail. Section 4 contains the results of validation against independent surface cloud observations and the results of satellite inter-comparison with ISCCP-DX and CM SAF products. Section 5 gives an overview of the upcoming activities. Section 6 concludes with a summary of the paper.

Data
CM SAF and ISCCP-DX cloud products and surface cloud observations are used to validate the results of the HelioFTH scheme.

Satellite data
The investigation area covers the Meteosat full disk, which is roughly approximated by a regular latitude/longitude grid from 60° S to 60° N and 60° W to 60° E in this paper. Two regional areas were additionally analyzed: Europe (30

Meteosat-7
The MVIRI spatial resolution at nadir is 5 km × 5 km for the IR channel at 10.8 µm (IR10.8) and 2.5 km × 2.5 km for the visible (VIS) channel. Half-hourly (hereinafter referred to as instantaneous) IR10.8 and VIS raw sensor counts from the Meteosat-7 satellite for April 2004 were obtained as level 1.5 OpenMTP files from EUMETSAT's U-MARF archive.

CM SAF
The cloud screening and cloud masking are performed using the NWC SAF MSG v2010 algorithm, which is described in more detail in Derrien and Gléau (2005). The cloud mask comprises 6 categories: cloud filled, cloud free, partially cloudy, non-processed, snow/ice contaminated and undefined. The cloud fractional cover is defined as the fraction of cloudy pixels per grid square compared to the total number of analyzed pixels in the grid square. Pixels are counted as cloudy if they belong to the classes cloud filled or partially cloudy. Fractional cloud cover is expressed in percent. The cloud mask has been produced in an operational environment since summer 2006 (Schulz et al., 2009). Therefore, the CM SAF team processed an off-line set of the CM SAF cloud products based on SEVIRI data for April 2004, which is used as the reference month for this paper.
A typical issue with passive IR is the detection of thin clouds with an optical thickness of approximately 0.3 or less. Some thin clouds (particularly ice clouds) over cold ground surfaces may remain undetected even if their cloud optical thickness is higher than the above-mentioned detection limit. Even though a special twilight transition procedure has been applied, the switch from the day- to the nighttime algorithm might lead to spurious spikes. Finally, a distinct dependency on satellite-viewing zenith angle (VZA) occurs that leads to an overestimation of cloudiness at high VZA (Kniffka et al., 2012).
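The cloud fractional cover definition above can be illustrated with a minimal sketch. The class label strings and the function name are hypothetical, and it is an assumption here that only the classes cloud filled, cloud free and partially cloudy count as "analyzed" pixels; the CM SAF processing may treat snow/ice-contaminated pixels differently.

```python
# Hypothetical labels for the cloud mask categories named in the text.
CLOUDY_CLASSES = {"cloud_filled", "partially_cloudy"}
# Assumption: only these three classes count as analyzed pixels.
ANALYZED_CLASSES = CLOUDY_CLASSES | {"cloud_free"}


def cloud_fractional_cover(pixel_classes):
    """Return the cloud fractional cover of one grid square in percent.

    Returns None if the grid square contains no analyzed pixel.
    """
    analyzed = [p for p in pixel_classes if p in ANALYZED_CLASSES]
    if not analyzed:
        return None
    cloudy = sum(1 for p in analyzed if p in CLOUDY_CLASSES)
    return 100.0 * cloudy / len(analyzed)
```

For example, a grid square with one cloud-filled, one partially cloudy, two cloud-free and one undefined pixel yields 2 cloudy out of 4 analyzed pixels.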

ISCCP-DX
ISCCP provides cloud properties over a period of more than 25 yr (Rossow and Schiffer, 1991; Rossow et al., 1996; Rossow and Schiffer, 1999). This project was established in 1982 to analyze weather satellite radiance measurements (from geostationary and polar orbiting satellites) to infer the global distribution of clouds, their properties, and their diurnal, seasonal and inter-annual variations. This project and its results are considered to be the state of the art today on what can be derived from routine weather satellite data to study the role of clouds in climate. ISCCP is the first existing thematic climate data record (TCDR) for cloud physical properties. The ISCCP-DX product contains a cloud mask and CTP and is available at 30 km and 3 h spatio-temporal resolution. The 3-hourly ISCCP-DX product was obtained from the EOS data server (http://eosweb.larc.nasa.gov/PRODOCS/isccp/tableisccp.html) and the cloud flag was calculated according to Rossow et al. (1996, see Sect. 2.3.4). The ISCCP-DX cloud mask is based on an IR threshold test during night and a VIS (if available) or a near-infrared threshold test (not available for Meteosat-7) during the day. Stubenrauch et al. (2013) provide estimates of the uncertainties: cloud fractional cover within 10 % and CTP within 100 hPa.

Long wave cloud index based on radiation data
For April 2004 surface radiation measurements were obtained from the Alpine Surface Radiation Budget (ASRB) network (Marty et al., 2002) as level 004 files. Two-meter air temperature (T_2m) and relative humidity (RH) were measured by the Automatic network (ANETZ) or by the Swiss Meteorological Network (SMN), respectively, both maintained by MeteoSwiss (Suter et al., 2006). Surface radiation data, air temperature and relative humidity from the Baseline Surface Radiation Network (BSRN) were obtained from the BSRN FTP server at the Alfred Wegener Institute (ftp://ftp.bsrn.awi.de). Surface incoming solar (SIS) and surface downward long-wave (SDL) radiation, T_2m and RH were measured at all investigated surface sites. Partial Cloud Amount (PCA) in octa was estimated with APCADA (Automatic Partial Cloud Amount Detection Algorithm) (Dürr and Philipona, 2004). A shortwave cloud flag (SCF) based on SIS was used to detect high thin cirrus clouds during daytime (Dürr and Zelenka, 2009). Table 1 gives an overview of the subset of ASRB sites and BSRN stations where the necessary input parameters for APCADA were available for automatically estimating cloud cover from surface radiation measurements.
LCI observed at ASRB sites was first introduced by Dürr (2004) as the so-called cloud-free index saturation (CFI_sat). LCI observed at ASRB stations is defined as where r = CFI_max − CFI_cloud free, CFI_max = 1/AC, CFI_cloud free = AC/AC = 1. The cloud-free index (CFI) is defined as where A = SDL/(σT^4) is the apparent emissivity of the sky with σT^4 the Planck emission of a black body and T the absolute 2 m air temperature given in Kelvin, and AC = SDL_cloud free/(σT^4) is the corresponding empirical apparent emissivity of a cloud-free sky (Dürr and Philipona, 2004). An LCI value of 100 % indicates low clouds, where the long-wave emission of the cloud base is equal to the Planck emission at T_2m. LCI ≤ 0 % indicates cloud-free conditions, where SDL emitted by the sky is lower than or equal to the upper limit of SDL for cloud-free situations statistically obtained from site measurements.
All surface measurements were available as 10 min averages. In this paper the temporal resolution was reduced to 30 min intervals by using every third 10 min average only.
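The surface-based LCI can be sketched in a few lines. The forms CFI = A/AC and LCI = 100 % × (CFI − CFI_cloud free)/r used below are assumptions inferred from the surrounding definitions (they give 100 % when A = 1 and 0 % when A = AC); the function and argument names are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant in W m^-2 K^-4


def surface_lci(sdl, sdl_cloud_free, t2m):
    """Long-wave cloud index in percent from surface measurements.

    sdl            -- measured surface downward long-wave radiation (W m^-2)
    sdl_cloud_free -- empirical cloud-free SDL for the same conditions (W m^-2)
    t2m            -- absolute 2 m air temperature (K)
    """
    blackbody = SIGMA * t2m ** 4
    a = sdl / blackbody               # apparent emissivity of the sky, A
    ac = sdl_cloud_free / blackbody   # apparent cloud-free emissivity, AC
    cfi = a / ac                      # assumed form of the cloud-free index
    cfi_max = 1.0 / ac                # sky radiating like a black body at t2m
    cfi_cloud_free = 1.0              # AC / AC
    r = cfi_max - cfi_cloud_free
    return 100.0 * (cfi - cfi_cloud_free) / r  # assumed form of LCI
```

With this form, a sky emitting exactly the cloud-free SDL gives 0 %, and a sky emitting σT^4 gives 100 %, in line with the interpretation given in the text.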

Synoptic cloud observations
Synoptic observations of total cloud amount (SYN) based on World Meteorological Organization (WMO) standards are available for all sites except Carpentras for different times as indicated in Table 1. Nighttime observations are available for Payerne only.

Formulation of the HelioFTH scheme
The Heliosat cloud index for visible channels estimates the relative influence of the clouds on SIS: high reflectances at the cloud top are correlated with small SIS values at the surface, i.e., the clouds' influence on SIS is close to 100 % compared to cloud-free conditions, where the cloud influence is 0 %. Analogously, the HelioFTH scheme proposed here estimates the influence of the clouds on SDL: very low cloud-top sensor counts are correlated with large SDL values at the surface, i.e., the clouds' influence on SDL is close to 100 % compared to cloud-free conditions, where the cloud influence is 0 %. The most critical information for obtaining a realistic LCI out of the HelioFTH scheme is the apparent cloud-base temperature, which mainly determines the amount of SDL radiation received at the surface. However, compared to visible radiation the path length of infrared radiation in clouds is short. Therefore it is not possible to retrieve the cloud-base temperature directly from IR10.8 measurements. Thus a relation between the observed cloud-top temperature and the cloud-base temperature has to be formulated. Validation with surface measurements showed that in general the colder the cloud-top temperature, the larger the measured SDL at the surface, i.e., the warmer the cloud-base temperature. That means that the vertical extent of the clouds tends to increase if the cloud top reaches higher up in the troposphere. Therefore, the following formulation for LCI is suggested: with C being the instantaneous raw IR sensor count of the satellite, while C_max, the maximum raw IR sensor count, corresponds to a cloud-free, clean and dry sky, and C_min corresponds to the coldest cloud-top temperatures.
Stand-alone cloud coverage products, i.e., cloud fractional coverage (CFC) and high cloud coverage (HCC), and an internal cloud-top pressure (CTP) product for the separation of middle- and high-level clouds are obtained from limb-view corrected MVIRI raw IR10.8 sensor counts by use of LCI based on the Heliosat cloud index principles and with a cloud-free flag c based on a modified formulation of the SPARC scheme. Figure 1 shows the flow chart of the main processing steps of the HelioFTH scheme, which are described in detail in the following sections. The core element of the HelioFTH scheme is the cloud-free flag c, which separates cloud-free from cloudy pixels.
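The satellite-based LCI described above can be sketched as follows. The normalization (C_max − C)/(C_max − C_min) is an assumption modeled on the Heliosat cloud index; it is chosen so that C = C_max gives 0 % and C = C_min gives 100 %, as the text requires. The clipping range of −50 to 110 % is the normal LCI range quoted later in the paper, and the function and argument names are hypothetical.

```python
def satellite_lci(c, c_max, c_min, lower=-50.0, upper=110.0):
    """LCI in percent from raw IR sensor counts (Heliosat-like sketch).

    c     -- instantaneous (limb-view corrected) raw IR count
    c_max -- count of a cloud-free, clean and dry sky
    c_min -- count of the coldest cloud tops
    """
    if c_max <= c_min:
        raise ValueError("c_max must be larger than c_min")
    # Assumed normalization: 0 % at c == c_max, 100 % at c == c_min.
    lci = 100.0 * (c_max - c) / (c_max - c_min)
    # Restrict to the normal LCI range quoted in the paper.
    return max(lower, min(upper, lci))
```

A count halfway between C_min and C_max gives an LCI of 50 %; counts well above C_max are clipped at the lower bound.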

Limb-view correction of satellite count
The limb darkening effect is accounted for by the application of a purely geometric correction function using VZA (Minnis and Harrison, 1984, see Fig. 1). Their theoretical limb darkening function is parametrized here as follows: where C is the raw sensor count, C the limb-view corrected IR10.8 sensor count and VZA is given in radians.

Definition of modeled maximum satellite count
LCI (Eq. 3) depends on a stable retrieval of C_max, which is proportional to the maximum planetary brightness temperature. The diurnal cycle of C_max mainly depends on solar geometrical parameters such as day length or the actual sun position. Mannstein et al. (1999) suggested a combination of cosine and sine functions to model the diurnal cycle of C_max over Northern Africa for each satellite pixel. However, this function is not applicable for the short day length over mid- and higher latitudes during wintertime. Therefore, the combination of cosine and sine functions was replaced by a bell-shaped curve in this paper: where ω = 2π/N_slot (N_slot is the total number of Meteosat observation slots per day, e.g., N_slot = 48 for Meteosat-7), t denotes the slot number, a_0 is the minimum of the C_max diurnal cycle, a_1 is the amplitude of the bell-shaped curve, a_2 is the half-width of day length in radians and a_3 is the true sun time at UTC = 12:00 in radians. The half-width of the day length a_2 is calculated by where δ is the sun declination (radians), φ the latitude (radians), and d_yr the actual day of the year starting with 1 at 1 January. The true sun time a_3 is calculated by t_eq = 0.0172 + 0.4281 cos(θ) − 7.3515 sin(θ) where t_eq is the equation of time, and λ is the longitude (degrees east). Parameters a_0 and a_1 in Eq. (5) have to be fitted to obtain the diurnal cycle of C_max for each satellite pixel (see Sect. 3.4).

Modification of SPARC scheme
Khlopenkov and Trishchenko (2007) published a scheme to detect cloud, snow and cloud shadows from AVHRR data called SPARC ("Separation of Pixels Using Aggregated Rating over Canada"). SPARC uses aggregated rating instead of branch rating within the cloud detection. The modified version of SPARC used in the HelioFTH algorithm employs the limb-view corrected raw IR sensor counts (C), which are proportional to the planetary brightness temperature. They are compared to a dynamic threshold C_max,real, which is proportional to the diurnal cycle of the surface skin temperature. In analogy to the T score suggested by SPARC, a temperature (T) score is calculated for the MVIRI IR10.8 counts: where C_max,real is the realistic diurnal cycle of C (see Sect.
A binary cloud-free (CFR) flag observed at surface radiation sites was defined as CFR = (PCA = 0 or SCF = 0), where PCA indicates the partial cloud amount from the APCADA scheme and SCF a shortwave cloud flag based on SIS measurements (see also Sect. 2.2). This formulation of CFR allows the inclusion of cloud-free situations also during nighttime (PCA = 0), and minimizes the occurrence of cirrus clouds during daytime (SCF = 0).
Over ocean: a Heliosat-based processing scheme using a simplified formulation of SPARC was applied to MVIRI VIS data to define a reference cloud mask over water, since very few continuous radiation measurements are available over water. Clouds over water can easily be detected due to the large brightness contrast. Therefore, the cloud mask based on VIS data was used as a reference for cloud-free and cloudy pixels to determine the SPARC factors for HelioFTH over water.
The resulting C_offs factors over land (−0.1314 a_0,med) and over water (−0.0768 a_0,med) are dependent on the median value of a_0 over the full disk (a_0,med = median(a_0)), whereas C_scale over land (−0.0457) and over water (−0.0625) is a constant.
The spatial behavior of C is tested by applying the SPARC uniformity score (U_temp), enhanced with the simultaneous testing of the temporal behavior of the spatial differences of C. This allows moving or developing clouds (indicated by enhanced changes of C) to be distinguished from the spatio-temporal evolution of cloud-free pixels. First the mean spatial difference of C to the 8 surrounding pixels is calculated for the current (t0) and the 3 preceding slots (t0-1, t0-2, t0-3): where i and j indicate indexes in column and row direction, respectively, and t = t0, t0-1, t0-2, t0-3. Afterwards the temporal variability C_var of C_t is calculated by summing the absolute differences of C_t between adjacent slots, normalized with the number of slot differences s involved: Finally the spatio-temporal difference score D is calculated using LDA as follows: where C_var,offs indicates the offset (over land: 0.9451, over water: 0.7043) and C_var,scale the scale factor (over land: 0.4933, over water: 0.3304) of C_var.
1 Linear discriminant analysis (LDA) and the related Fisher's linear discriminant are methods used in statistics and machine learning to find a linear combination of features which characterize or separate two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.
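The two intermediate quantities of the spatio-temporal test can be sketched as below. This is only an illustration under stated assumptions: the spatial difference is taken as the mean absolute difference to the 8 neighbors (the paper's equation may use signed differences), border pixels are not handled, and all function names are hypothetical. The final score D is not reproduced because its exact form is not recoverable from the text.

```python
def mean_neighbor_difference(field, i, j):
    """Mean absolute difference between pixel (i, j) and its 8 neighbors.

    field is a list of rows; i is the column index and j the row index.
    Assumption: absolute differences are averaged.
    """
    rows, cols = len(field), len(field[0])
    if not (0 < j < rows - 1 and 0 < i < cols - 1):
        raise ValueError("pixel (i, j) must not lie on the image border")
    center = field[j][i]
    diffs = [
        abs(center - field[j + dj][i + di])
        for dj in (-1, 0, 1)
        for di in (-1, 0, 1)
        if (di, dj) != (0, 0)
    ]
    return sum(diffs) / len(diffs)


def temporal_variability(spatial_diffs):
    """Sum of absolute changes between adjacent slots, divided by their number.

    spatial_diffs holds the mean spatial differences for the slots
    t0-3, t0-2, t0-1 and t0 (in that order).
    """
    steps = [abs(b - a) for a, b in zip(spatial_diffs, spatial_diffs[1:])]
    if not steps:
        raise ValueError("at least two slots are required")
    return sum(steps) / len(steps)
```

With four slots there are three slot differences, so s = 3 in the notation of the text.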
The final expression for the aggregated rating F used in this paper is Values of F below zero indicate cloud free, and above-zero cloudy conditions.The stronger the deviation from zero, the more probable the classification becomes.However, the separation of cloud free from cloudy pixels with aggregated rating based on a single channel only is not sufficient, yet.
To refine the separation between cloud-free and cloudy pixels a cloud-free flag c based on the fuzzy-logic principle is introduced: where

Daily update of C max
To update C_max in Eq. (3) for each slot, the previous and instantaneous C values are weighted according to the cloud-free flag c: Once a day the coefficients a_0 and a_1 of C_max are fitted if C_max was changed by Eq. (17) for at least one slot. All slots have equal weight in the fitting process, because C_max has already been weighted slot-wise by Eq. (17).
Limits for a_0 and a_1 are required to reduce the number of outliers of C_max due to misclassified clouds. These limits were obtained by visual inspection of a number of full-disk maps of C during summer- and wintertime. All thresholds are multiplied by a factor y defined as a function of sun declination δ and latitude φ to roughly mimic the yearly cycle of SIS: The minimum value of a_0 over water is a_0 = 60 + 40 y for |φ| < 70° and a_0 = 20 + 80 y for |φ| ≥ 70°. A lower (a_1 = 10 y) and an upper limit (a_1 = 120 y) are applied for a_1 over land. In the current version of HelioFTH the limits are constant. The processing of longer time series of MVIRI data covering Meteosat 2-7 may show that these limits have to be dynamic due to sensor gain changes, satellite changes and sensor degradation.
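The limits quoted above can be applied as in the following minimal sketch. The factor y is taken as an input because its defining equation is not reproduced here; it is assumed to be non-negative. The function names are hypothetical.

```python
def a0_minimum_over_water(lat_deg, y):
    """Minimum allowed a_0 over water for latitude lat_deg (degrees)."""
    if abs(lat_deg) < 70.0:
        return 60.0 + 40.0 * y
    return 20.0 + 80.0 * y


def clamp_a1_over_land(a1, y):
    """Restrict a fitted a_1 over land to the interval [10 y, 120 y].

    Assumes y >= 0 so that the lower limit does not exceed the upper one.
    """
    return max(10.0 * y, min(120.0 * y, a1))
```

For y = 0.5, for instance, a fitted amplitude of 200 counts over land would be reduced to the upper limit of 60 counts.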

Realistic diurnal cycle of C max
The main problem of using C_max in Eq. (3) is the fact that the ground measured diurnal cycle of C_max is often smaller due to the damping effect of clouds on SIS, where the measured diurnal amplitude a_1 can be close to zero. Therefore, for the final version of LCI, C_max in Eq. (3) is replaced by C_max,real, which is defined here as where LCI_t is calculated by Eq. (3), but using the previous value of C_max,real for slot t, and s is the number of available slots for that day. Here the range of LCI_t is restricted to 0–100 % instead of the normal range of LCI values, which is restricted to −50 to 110 %.

Daily update of C min
According to Eq. (3), determination of LCI requires the current minimum IR sensor count of the satellite, C_min, which is proportional to the coldest observed cloud-top temperatures.
Once a day at 15:00 UTC, when the tropical thunderstorms in the center of the Meteosat viewing field reach their maximum height extension, the median value of the N = 99 lowest C values within the zonal band from 30° S to 30° N is used as the instantaneous minimum value of C (C_min). Finally, C_min is obtained as the median C_min value over the last 15 days.
The resulting median C_min value is assumed to be constant for the whole full disk and over one day.
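The two-step C_min estimate described above can be sketched as follows. The values N = 99, the 30° band and the 15-day window come from the text; the function names and the flat-list data layout are hypothetical.

```python
from statistics import median


def instantaneous_c_min(counts, latitudes, n_lowest=99, band_deg=30.0):
    """Median of the n_lowest counts within the zonal band |lat| <= band_deg.

    counts and latitudes are equally long sequences for the 15:00 UTC slot.
    """
    in_band = sorted(
        c for c, lat in zip(counts, latitudes) if abs(lat) <= band_deg
    )
    if not in_band:
        raise ValueError("no pixel inside the zonal band")
    return median(in_band[:n_lowest])


def running_c_min(daily_c_min, window_days=15):
    """Median of the instantaneous C_min values over the last window_days days."""
    return median(daily_c_min[-window_days:])
```

Taking a median of the lowest counts, followed by a median over several days, makes the estimate robust against single noisy pixels and against days without deep convection.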

Correction of sudden satellite count changes
The application of raw IR10.8 sensor counts (C) causes problems if the satellite sensor changes, e.g., if data of the backup satellite are used instead of the original counts. This may cause a sudden change of the median C value observed over the full-disk area. Coincidentally, April 2004 was affected by such a sudden extreme change of the median value (C_med = median(C)) of C over the full disk. Sudden extreme changes of C are monitored by with t indicating the current slot and t − 1 the last available slot. C_change can notably affect a_0 and a_2, which are corrected immediately for slot t if necessary: with i = 0 or i = 2. The limit of 0.1 was chosen heuristically and probably has to be modified after processing a longer Meteosat IR10.8 time series. Additionally, the offset C_offs for the T test has to be updated by calculating the new median value of a_0 (a_0,med,t = median(a_0, t)) and by multiplying it with the corresponding factor over land or water. In the current formulation of the HelioFTH scheme daily constant factors are used. If a sudden extreme change occurs, the factors are changed immediately. Thus a couple of slots for that day will have unrealistic coefficients, which are not flagged in the current formulation of the HelioFTH scheme.
The sudden change on 14 April 2004 at 09:00 UTC was caused by a change of the MVIRI calibration coefficients. The value of C_med,t / C_med,t−1 was 0.84, thus C_med dropped by approximately 16 %. This dramatically affects the retrieval of C_max, because most of the pixels after the change of C would be misinterpreted as overcast by the modified SPARC scheme (Sect. 3.3), and C_max would remain more or less unchanged after the sudden change.
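The monitoring and correction step can be illustrated with a small sketch. It assumes that the change is measured by the ratio of the full-disk medians of consecutive slots, that a correction is triggered when this ratio deviates from 1 by more than the limit of 0.1, and that the affected coefficient is simply rescaled by the ratio. These three choices are assumptions made for illustration, and the function names are hypothetical.

```python
def count_change_ratio(c_med_now, c_med_prev):
    """Ratio of the full-disk median counts of the current and previous slot."""
    return c_med_now / c_med_prev


def correct_coefficient(a_i, ratio, limit=0.1):
    """Rescale a fitted coefficient if the median count changed abruptly.

    Assumption: the coefficient scales linearly with the median count.
    """
    if abs(ratio - 1.0) > limit:
        return a_i * ratio
    return a_i
```

For the event on 14 April 2004 the ratio was 0.84, so under these assumptions a coefficient of 100 counts would be rescaled to about 84 counts, while a ratio of 0.95 would leave it untouched.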

Definition of HelioFTH products
The HelioFTH cloud fractional coverage (CFC) product classes are based on the modified SPARC cloud-free flag c in Eq. (16): where CFC = 1 indicates cloud free, CFC = 2 partially cloudy, CFC = 3 overcast and CFC = 255 undefined pixels. The limit c_lim = 0.66 was estimated by localizing the minimum position between the two peaks of cloud-free and partially cloudy values from the distribution of c at the ASRB sites Payerne and Locarno-Monti. ISCCP-DX and CM SAF CFC products are transformed to the same cloud classes using the corresponding ISCCP-DX and CM SAF products. However, ISCCP-DX has no partially cloudy values, i.e., CFC = 2 is missing.
A requirement for the development of HelioFTH is the capability to separate low-level clouds from middle- and high-level clouds. For this separation CTP information is needed, with low-level clouds being defined by CTP > 680 hPa (Rossow and Schiffer, 1991). The retrieval of CTP from a single channel and with the additional requirement not to use, for example, NWP input is highly challenging. Therefore, the CTP retrieval can only be based on a simple scheme, and CTP is not considered as a stand-alone product; rather, it is used for the separation of low-level from middle- and high-level clouds only.
The HelioFTH middle- and high-level cloud coverage (HCC) product comprises all partially cloudy (CFC = 2) or overcast (CFC = 3) pixels where CTP ≤ 680 hPa, which corresponds to a cloud-top height of about 3000 m a.s.l.
LCI contains some implicit information about CTP. Based on empirical comparisons of HelioFTH LCI with CTP products from ISCCP-DX and CM SAF, we propose the following linear relationship between CTP and LCI: where CTP is given in hectopascal (hPa) with CTP_min = 50 hPa, LCI_max = 100 %, and LCI_min = 0 %. Only LCI values greater than LCI_min are used to calculate CTP. CTP is undefined for cloud-free pixels.
The maximum possible value of CTP follows the US standard atmosphere (McClatchey et al., 1971) and is defined as where z is the mean pixel altitude given in meters a.s.l. and Δz = 500 m an altitude offset roughly accounting for the vertical extent of the clouds. CTP in Eq. (25) is set equal to CTP_min if CTP < CTP_min.
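The CTP retrieval and the HCC test can be sketched as follows. Two assumptions are made: CTP_max is taken from the tropospheric barometric formula of the standard atmosphere evaluated at z + 500 m, and the linear relationship interpolates between CTP_max at LCI_min and CTP_min at LCI_max. The paper's own equations may differ in detail, and the function names are hypothetical.

```python
def ctp_max_standard_atmosphere(z, dz=500.0):
    """Pressure in hPa at altitude z + dz (m) from the barometric formula.

    Assumption: standard-atmosphere troposphere with 1013.25 hPa at sea level.
    """
    return 1013.25 * (1.0 - 2.25577e-5 * (z + dz)) ** 5.25588


def cloud_top_pressure(lci, z, ctp_min=50.0, lci_min=0.0, lci_max=100.0):
    """Cloud-top pressure in hPa from LCI (%), or None for cloud-free pixels."""
    if lci <= lci_min:
        return None  # CTP is undefined for cloud-free pixels
    ctp_max = ctp_max_standard_atmosphere(z)
    fraction = (lci - lci_min) / (lci_max - lci_min)
    # Assumed linear relationship between LCI and CTP.
    ctp = ctp_max - fraction * (ctp_max - ctp_min)
    return max(ctp, ctp_min)


def is_middle_or_high_level(ctp, limit=680.0):
    """True if the cloud top lies at or above the 680 hPa level."""
    return ctp is not None and ctp <= limit
```

Under these assumptions a sea-level pixel with LCI = 50 % receives a CTP of roughly 500 hPa and is therefore counted as middle- or high-level cloud.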

Verification approach
PCA values (cloud cover in octa, see Sect. 2.2) for surface radiation sites are transformed to CFC cloud classes in the following way: where CFC = 1 indicates cloud free, CFC = 2 partially cloudy and CFC = 3 overcast. HCC flags for surface radiation measurements are missing for the time being, because the retrieval of high cloud coverage needs further investigation.
For validation purposes the CFC cloud classes 1-3 as defined in Eqs. (24) and (27) are linearly transformed to 0-1 to compare with the results published in Reuter et al. (2009), where CFC = 0 indicates cloud free, CFC = 0.5 broken clouds and CFC = 1 overcast conditions at the surface site.
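The linear transformation of the cloud classes can be written as a one-line mapping. The function name is hypothetical, and returning None for the undefined class 255 is an assumption about how such pixels are excluded.

```python
def cfc_class_to_fraction(cfc_class):
    """Map CFC classes 1, 2, 3 linearly to 0, 0.5, 1.

    Returns None for any other value, e.g. 255 (undefined).
    """
    if cfc_class not in (1, 2, 3):
        return None
    return (cfc_class - 1) / 2.0
```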
For the comparison of instantaneous CFC products from satellites with surface observations, the nearest neighbor pixel values both in space and time were applied. Therefore the maximum time difference amounts to 5 min, and the maximum spatial difference roughly amounts to half of the spatial resolution of the satellite product.
All satellite cloud products from HelioFTH, CM SAF and ISCCP-DX are provided on different grids. For intercomparison purposes the HelioFTH and CM SAF products are reprojected to a regular latitude/longitude grid with 0.1° resolution, where the values at the grid points are selected with the nearest neighbor method. For the comparison of HelioFTH and CM SAF with ISCCP-DX the grid resolution is reduced to 0.5° to account for the coarse spatial resolution of the ISCCP-DX products.

Results
In this section the validation results of the HelioFTH CFC with surface measurements from 3 ASRB and 3 BSRN sites are presented. Further, HelioFTH CTP, CFC and HCC are intercompared with the corresponding ISCCP-DX and CM SAF products. All presented results are based on one month of data from April 2004.

Validation of the CFC product
We used the statistical quantities defined in Appendix A, as suggested by Reuter et al. (2009, see Sect. 5), who compared CM SAF CFC with synoptic reports (SYN) for the year 2006. Their results (see Table 2; Reuter et al., 2009) are reproduced as the CM SAF-SYN results in this study.
The number of available PCA observations, e.g., for daytime, is 2-8 times higher than for SYN reports. Thus, due to the high temporal resolution during day- and nighttime, PCA observations are an effective means for the statistical evaluation of satellite cloud retrievals. On the other hand, the low temporal resolution and the smaller number of available SYN observations, especially during nighttime, should always be kept in mind when evaluating the validation results. The accurate observation of clouds during nighttime is dependent on the sky illumination, which significantly influences the observed cloud amount (Hahn et al., 1995).
The CM SAF-SYN results in Table 2 are in accordance with previous validation results by Reuter et al. (2009, see Table 2). Overall and site-specific HelioFTH-PCA and HelioFTH-SYN comparisons show consistent results with
These pixels are subsequently labeled as cloud free, resulting in an underestimation of the cloudy conditions. This also leads to a negative CFC bias for Alpine sites. The low probability of detecting clouds in Sede-Boqer is the result of a misrepresented diurnal cycle, as will be shown in Sect. 4.1.2. The average accuracy or fraction correct (FC) for HelioFTH is lower than for CM SAF, but FC is increased in the order of 0.10 compared to ISCCP-DX. With the exception of the Alpine sites Davos and Jungfraujoch, HelioFTH CFC reveals a systematic positive bias in the order of 0.15-0.25, which is comparable to ISCCP-DX.

Validation of the diurnal cycle of the CFC product
The different CFC satellite products were separately validated with PCA and SYN observations in (Hahn et al., 1995). The comparison for the Alpine valley site Davos in Fig. 2 (middle panel) shows good agreement of PCA, SYN and HelioFTH based CFC, but a strong positive bias of CM SAF during daytime and over the whole day for ISCCP-DX. The comparison for the high Alpine site Jungfraujoch in Fig. 2 (bottom panel) indicates a systematic negative bias for HelioFTH. This behavior is connected to the misinterpretation of clouds as snow, which was already mentioned in Sect. 4.1.1. CM SAF, on the other hand, again shows a strong positive bias during daytime due to snow cover misinterpreted as clouds as seen by the SEVIRI visible channels.
The ISCCP-DX product indicates overcast conditions during the whole day, thus, it is very likely that the cold snow surface is misinterpreted as clouds in the ISCCP-DX IR cloud retrieval.This also leads to a systematic underestimation of the day-to-day variability in CM SAF (daytime) and ISCCP-DX (whole day).
Figure 3 shows the mean diurnal cycles and the standard deviations (STD) for the BSRN stations Carpentras (top), Sede-Boqer (middle) and De-Aar (bottom). At Carpentras (top), HelioFTH, CM SAF and ISCCP-DX show a similar course of the diurnal cycle, but HelioFTH has an obvious positive offset. The day-to-day variability is well captured by HelioFTH. However, compared to the BSRN measurements, all three satellite products overestimate CFC in the afternoon and evening. For the semi-arid site Sede-Boqer (Israel), which has large VZA values, the diurnal cycle in Fig. 3 (middle panel) is captured neither by CM SAF and ISCCP-DX nor by HelioFTH. CM SAF tends to underestimate cloudiness during the morning, whereas HelioFTH and ISCCP-DX overestimate CFC. This raises the question of whether these discrepancies are a problem of the surface measurements and the applied PCA algorithm or whether the diurnal cycle is misrepresented in all three satellite products. The SYN report at 06:00 UTC reveals a large gap to the BSRN PCA value. The day-to-day variability differs considerably between all the products. The comparison for the De-Aar site (South Africa) in Fig. 3
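The mean diurnal cycle and its STD, as discussed for Figs. 2 and 3, can be illustrated with a minimal sketch. It assumes that each sample is a pair of UTC hour and CFC value (0-1) collected over all days of the month, and that the STD is the population standard deviation across days at a given hour; the function name `mean_diurnal_cycle` and this choice of STD are assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def mean_diurnal_cycle(samples):
    """Return {hour: (mean CFC, STD of CFC)} from (hour_utc, cfc) pairs.

    The STD over all days at a fixed hour is used here as a simple
    measure of the day-to-day variability (assumption: population STD).
    """
    by_hour = defaultdict(list)
    for hour, cfc in samples:
        by_hour[hour].append(cfc)
    # Sort by hour so the result can be plotted directly as a diurnal course.
    return {h: (mean(v), pstdev(v)) for h, v in sorted(by_hour.items())}
```

Applying the same function to the surface and the satellite series yields two curves per site: an offset between the mean values corresponds to a bias, and a smaller STD in the satellite series corresponds to an underestimated day-to-day variability.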

Intercomparison of CTP, CFC and HCC products
The CTP, CFC and HCC products of the three satellite datasets HelioFTH, CM SAF and ISCCP-DX are compared to each other. Tables 4, 5 and 6 show the results of the CTP, CFC and HCC intercomparison for the three satellite datasets over three different regions (see Sect. 2.1), over land pixels only, and over the validation sites (Table 1). Table 4 shows the bias and median difference for the CTP product intercomparisons. On the full disk the mean bias between HelioFTH and CM SAF is 163 hPa and the median difference is 90 hPa. The CTP differences for the EU and SA regions are less pronounced. The ISCCP-DX product also shows systematically higher CTP values compared to CM SAF. Thus, ISCCP-DX cloud tops tend to be at a much lower altitude compared to CM SAF. Results published by Stubenrauch et al. (2013) indicate that ISCCP CTP seems to miss many cases with cirrus clouds when compared, e.g., to the CALIPSO cloud lidar. The HelioFTH scheme, which is based on a single IR channel, is probably affected by the same problems. The HelioFTH and ISCCP-DX bias compared to CM SAF is not consistent for pixels over land only: for Europe (EU-L) the bias is stable compared to the whole area including pixels over water, but over South Africa (SA-L) the bias reaches 246 hPa for HelioFTH and 191 hPa for ISCCP-DX. The SA-L area is dominated by the intertropical convergence zone, where the CM SAF cloud retrieval detects many more cirrus clouds with low CTP values than HelioFTH or ISCCP-DX.
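The two CTP statistics reported in Table 4 can be sketched as follows. The sketch assumes collocated CTP values in hPa, with `None` marking pixels where one of the products has no cloud-top retrieval; the function name and the handling of missing values are illustrative assumptions.

```python
from statistics import mean, median


def ctp_bias_and_median(ctp_test, ctp_ref):
    """Return (bias, median difference, N) of test minus reference CTP in hPa.

    Pairs in which either value is None (no retrieval) are skipped,
    so N counts only the compared values.
    """
    diffs = [t - r for t, r in zip(ctp_test, ctp_ref)
             if t is not None and r is not None]
    if not diffs:
        raise ValueError("no collocated CTP pairs to compare")
    return mean(diffs), median(diffs), len(diffs)
```

A mean bias that is clearly larger than the median difference, as for the full disk (163 hPa versus 90 hPa), suggests that a subset of pixels with very large positive differences pulls the mean upward; this would be in line with missed cirrus cases, although the statistic alone cannot confirm the cause.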
In Table 5, $\mathrm{POFD_{cf}}$ for the HelioFTH CFC product over the full disk is 6 % lower than for ISCCP-DX, both compared to CM SAF. FC for the CFC product is similar (81-83 %) for all three intercomparisons, but 0.02-0.06 % lower for the comparison over land surfaces only, where the amount of false cloudy CFC pixels, indicated by the difference between $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}})$ and $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$, is considerably increased. The KSS is highest for the intercomparison between ISCCP-DX and CM SAF. KSS for HelioFTH compared to CM SAF for the different validation sites is considerably higher except for SBO and JFJ. For the mountainous sites DAV and JFJ the probability of false detection of cloud-free pixels ($\mathrm{POFD_{cf}}$) is clearly higher due to the misinterpretation of snow-covered areas.
In Table 6, $\mathrm{POFD_{cf}}$ for the HelioFTH HCC product is on the order of 8 % higher than for ISCCP-DX, both compared to CM SAF on the full disk. The FC for the HelioFTH and ISCCP-DX HCC products compared to CM SAF is in the range of 80-82 % for all three intercomparisons. For land pixels only, FC generally agrees better by 0-6 %, and KSS is even increased by 11-14 %. This indicates that the CM SAF HCC retrieval is able to detect many more thin-cirrus pixels over water.
Figure 4 shows HelioFTH CTP, CFC and HCC over the full disk during daytime compared to the respective fields from CM SAF. The CTP anomaly map (top right) shows that HelioFTH produces larger CTP over the tropical regions (i.e., lower cloud tops) and gives reasonable CTP over the higher latitudes. The mismatch of HelioFTH CFC cloud-free pixels (middle right panel) compared to cloudy CM SAF pixels (light green pixels) is 8.7 % and is mainly concentrated over the Atlantic, which is discussed in more detail in Sect. 4.2.2. The mismatch of HelioFTH CFC cloudy pixels and CM SAF cloud-free pixels (light purple pixels) is 6.9 % and is more pronounced over land areas.
The mismatches in HCC (lower right panel) occur mainly in the higher latitudes, where the higher cloud tops (lower CTP) in HelioFTH lead to a positive cloud detection whereas CM SAF indicates clear sky because of the lower cloud tops (higher CTP) (light purple pixels). This happens in a total of 0.9 % of the pixels. The mismatch of clear cases in HelioFTH and cloudy cases in CM SAF (light green pixels) occurs in 21.6 % of all pixels. These mismatches are located predominantly over water.
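The mismatch percentages quoted for the categorical difference maps can be reproduced in principle with a sketch like the one below. It assumes two binary masks on the same pixel grid, flattened to sequences of booleans with `True` meaning cloudy; the function name and the flattening are illustrative assumptions, and the step that turns CTP into an HCC mask is not shown because its threshold is not given in this section.

```python
def mismatch_fractions(mask_test, mask_ref):
    """Return the two mismatch shares in percent of all compared pixels.

    First value:  test product cloud free, reference cloudy
                  (the light green pixels in the difference maps).
    Second value: test product cloudy, reference cloud free
                  (the light purple pixels in the difference maps).
    """
    n = clear_vs_cloudy = cloudy_vs_clear = 0
    for test_cloudy, ref_cloudy in zip(mask_test, mask_ref):
        n += 1
        if not test_cloudy and ref_cloudy:
            clear_vs_cloudy += 1
        elif test_cloudy and not ref_cloudy:
            cloudy_vs_clear += 1
    if n == 0:
        raise ValueError("no pixels to compare")
    return 100.0 * clear_vs_cloudy / n, 100.0 * cloudy_vs_clear / n
```

Restricting the input to land or sea pixels before calling the function would give the regional contrast described in the text.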

Intercomparison of the diurnal cycle of HCC over South Africa
HelioFTH, ISCCP-DX and CM SAF HCC were compared over South Africa for 3 April at 03:00 UTC (nighttime) and 15:00 UTC (daytime).
During nighttime, the difference between HCC from HelioFTH and CM SAF (Fig. 5   cloudy-cloud free mismatch of 0.7 % (light purple pixels) at the border of large cloudy areas. As stated above, those are most likely due to the higher cloud tops in HelioFTH. Mismatches of the other kind (cloud free-cloudy, light green pixels) are apparent over the sea and are due to the higher cloud tops in CM SAF than in HelioFTH. The mismatches are more pronounced over land for daytime differences (Fig. 5, top right panel).
Comparing HelioFTH HCC to ISCCP-DX HCC in Fig. 5 shows similar patterns for the cloudy-cloud free mismatches (light purple pixels) for nighttime (middle left panel) and daytime (middle right panel). However, the cloud free-cloudy mismatches over the sea occur less often in the intercomparison with ISCCP-DX. This is due to the lower cloud-top heights in ISCCP-DX compared to CM SAF. This is supported by the comparison of HCC from ISCCP-DX with HCC from CM SAF (Fig. 5, bottom left and bottom right). Large areas over the sea show a cloud free-cloudy mismatch (light green pixels), indicating that ISCCP-DX does not detect high clouds where CM SAF does.
The comparison of HCC between night- and daytime shows notable differences between HelioFTH/ISCCP-DX and CM SAF over the sea, which may be explained by the inclusion of more spectral (day/nighttime) and visible (daytime) information from SEVIRI in the CM SAF cloud retrieval algorithm. Therefore, the simple IR-based HelioFTH products are probably less affected by discontinuities between land and open water, and between day- and nighttime, than the CM SAF products.

Future plans
CM SAF products and their documentation, in particular the validation report, are subject to external reviews. The validation report will include a section on assumptions and limitations. Here, based on the long-term HelioFTH record, problematic areas and periods will be discussed, together with recommendations on utilization.
The positive HelioFTH CTP bias leads to a systematic underestimation of HCC, especially over the sea. Further investigations are needed to find an improved formulation of CTP based on LCI.
Snow cover information retrieved by a Heliosat retrieval scheme could be used for higher latitudes and mountainous regions in a future version of HelioFTH to improve cloud masking over these regions.
The current HelioFTH scheme overestimates LCI for stratiform and single-layer middle- and high-level clouds such as alto- or cirrostratus. A future version of the modified SPARC algorithm shall be able to separate scenes with single-layer stratiform clouds to some extent.
The validation site Sede-Boqer is located at the border of the full disk. Results presented in Fig. 3 (middle panel) indicate large differences between the different datasets. The effect of high VZA on the results could be investigated with HelioFTH products based on Meteosat-6 data over the Indian Ocean (63° E).
In analogy to the Heliosat method, the retrieval of SDL and of the long-wave cloud effect (Philipona et al., 2004) shall be investigated based on LCI and surface measurements or long-term reanalysis data of 2 m air temperature and relative humidity.

the HelioFTH scheme to retrieve cloud-free pixels in areas for which its parameters were not trained, such as deserts, snow or open sea. The problems are more pronounced for the CFC than for the HCC product due to the presence of low clouds near the surface. Thus, some differences between the modeled diurnal course of $C_{\mathrm{max}}$ and the diurnal course of the surface brightness temperature measured by the satellite are misinterpreted as low clouds in the current version of HelioFTH.
The internal HelioFTH CTP product shows a systematic positive bias, which leads to an underestimation of HCC compared to CM SAF. During daytime, using only the IR channel has advantages over snow-covered areas, where CM SAF misclassifies snow patches as clouds. However, HelioFTH has to deal with the opposite problem of misinterpreting clouds as snow. This misinterpretation is not bound to a particular time of day, so day-night biases as in CM SAF do not occur. Furthermore, the CM SAF HCC product detects more clouds during daytime, when spectral information from the visible SEVIRI channels is applied. This effect is most pronounced over the sea. The probability of false detection of cloud-free HCC pixels is comparable to ISCCP-DX. Both HelioFTH and ISCCP-DX likely fail to detect thin cirrus clouds since they use a single IR channel only. The validation results further indicate that the daytime-nighttime CFC differences of CM SAF, especially over snow and other bright surfaces, need to be analyzed in more detail with regard to climate monitoring needs.
The results and conclusions are based on a preliminary analysis using only one month of data. Within the CM SAF framework, HelioFTH will now be extended to optionally use visible channel data during daytime and to employ inter-calibrated radiances for Meteosat First and Second Generation. A continuous climate data record of cloud physical products will then have to be validated for consistency and homogeneity and intercompared for the full Meteosat record.

Statistical measures
The Kuiper skill score (KSS; Hanssen and Kuipers, 1965) determines the probability that a predicted event occurs, relative to its random occurrence. Here, we apply it to satellite measurements (the predicted value; sat) and either ground-based observations (the surface data) for surface validations or satellite measurements (the reference satellite data) for satellite inter-comparisons, both of which are referred to as the reference dataset (rd) hereinafter. We use a contingency table (Table A1) that contains the number of observations derived from rd-sat being cloud free-cloud free, cloud free-cloudy, cloudy-cloud free, and cloudy-cloudy. Note that the contingency table for the surface validations contains only results from unambiguous synoptic observations, i.e., 0, 1, 7, and 8 octa.
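As an illustration of how such a contingency table could be filled for a surface validation, the following sketch counts the four rd-sat combinations. Several points are assumptions made only for this example and are not stated in the text: 0 and 1 octa are treated as cloud free and 7 and 8 octa as cloudy, the satellite CFC (0-1) is converted to a binary class with a hypothetical threshold `sat_threshold`, and the counts are labeled a to d in the order in which the combinations are listed above.

```python
def contingency_from_octa(octa_rd, cfc_sat, sat_threshold=0.5):
    """Count (a, b, c, d) from synoptic octa reports and satellite CFC.

    a: rd cloud free, sat cloud free    b: rd cloud free, sat cloudy
    c: rd cloudy,     sat cloud free    d: rd cloudy,     sat cloudy
    Ambiguous reports (2 to 6 octa) are skipped. The octa-to-class mapping
    and sat_threshold are assumptions made for this sketch.
    """
    a = b = c = d = 0
    for octa, cfc in zip(octa_rd, cfc_sat):
        if octa not in (0, 1, 7, 8):
            continue  # keep only unambiguous synoptic observations
        rd_cloudy = octa >= 7
        sat_cloudy = cfc >= sat_threshold
        if not rd_cloudy and not sat_cloudy:
            a += 1
        elif not rd_cloudy and sat_cloudy:
            b += 1
        elif rd_cloudy and not sat_cloudy:
            c += 1
        else:
            d += 1
    return a, b, c, d
```

Skipping the 2 to 6 octa reports reduces the sample to the subset counted as $N_{\mathrm{cf,cc}}$ in the tables, which is why that number is smaller than the number of all available surface values.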
3.5), $C_{\mathrm{offs}}$ is the offset and $C_{\mathrm{scale}}$ is the scale factor for C. $C_{\mathrm{offs}}$ and $C_{\mathrm{scale}}$ are obtained from a linear discriminant analysis (LDA) by use of a training dataset: over land, the two lowland ASRB sites Locarno-Monti and Payerne were used for October 2004-September 2005.
975 is the maximum of the distribution of all F values from the training dataset for the ASRB sites Locarno-Monti and Payerne. The corresponding value for pixels over water is $F_{\mathrm{lim}} = -0.775$. c = 1 means cloud free, c = 0 means overcast, and all values in between mean partially cloudy.

Table 3. Mean results of the comparison of different satellite (sa) products with surface (rd) CFC observations for all investigated sites for April 2004 [Prod = product origin, S = scenario (D = day, N = night, T = twilight), obs = reference cloud observation at the surface (PCA = partial cloud amount, SYN = synoptic observation), N = number of all available surface values, $N_{\mathrm{cf,cc}}$ = number of only cloud-free or overcast surface values, FC = fraction correct, KSS = Kuiper skill score and bias = mean (satellite − surface)].

Fig. 3. Mean diurnal cycle of CFC and its standard deviation (STD) for BSRN sites for April 2004.

Fig. 4. 3 April 2004, 15:00 UTC: cloud-top pressure (CTP), cloud fractional coverage (CFC) and middle- and high-level cloud coverage (HCC) for HelioFTH (left-hand side). Categorical differences of HelioFTH products to CM SAF CTP, CFC and HCC (right-hand side) over the full disk.

- $\mathrm{KSS} = \dfrac{a\,d - c\,b}{(a + b)\,(c + d)}$;
- Conditional probabilities:
  - $P(\mathrm{cf_{sa}} \mid \mathrm{cf_{rd}}) = a/(a + b)$: the conditional probability of the satellite cloud detection classifying a scene as cloud free, given a cloud-free observation from the reference dataset;
  - $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}}) = d/(c + d)$: the conditional probability of the satellite cloud detection classifying a scene as cloud covered, given a cloud-covered observation from the reference dataset;
  - $P(\mathrm{cf_{rd}} \mid \mathrm{cf_{sa}}) = a/(a + c)$: the conditional probability of a cloud-free observation from the reference dataset, given a cloud-free satellite classification;
  - $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}}) = d/(b + d)$: the conditional probability of a cloud-covered observation from the reference dataset, given a cloud-covered satellite classification;
- accuracy or fraction correct (FC; referred to as hit rate by Reuter et al., 2009, see Sect. 5): $\mathrm{FC} = \dfrac{a + d}{a + b + c + d}$;
- probability of false detection of cloud-free pixels ($\mathrm{POFD_{cf}}$): $\mathrm{POFD_{cf}} = c/(c + d)$.
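The measures listed above can be evaluated directly from the four contingency counts. The sketch below is only an illustration: the class and function names are hypothetical, and the meaning of the cells a to d is inferred from the conditional probabilities, since the content of Table A1 is not reproduced in this text. No guard against empty rows or columns (zero denominators) is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    a: int  # rd cloud free, sat cloud free
    b: int  # rd cloud free, sat cloudy
    c: int  # rd cloudy, sat cloud free
    d: int  # rd cloudy, sat cloudy


def skill_scores(t: Contingency) -> dict:
    """Evaluate KSS, the conditional probabilities, FC and POFD_cf."""
    n = t.a + t.b + t.c + t.d
    return {
        "KSS": (t.a * t.d - t.c * t.b) / ((t.a + t.b) * (t.c + t.d)),
        "P(cf_sa|cf_rd)": t.a / (t.a + t.b),
        "P(cc_sa|cc_rd)": t.d / (t.c + t.d),
        "P(cf_rd|cf_sa)": t.a / (t.a + t.c),
        "P(cc_rd|cc_sa)": t.d / (t.b + t.d),
        "FC": (t.a + t.d) / n,
        "POFD_cf": t.c / (t.c + t.d),
    }
```

With these definitions KSS equals $P(\mathrm{cf_{sa}} \mid \mathrm{cf_{rd}}) - \mathrm{POFD_{cf}}$, which follows algebraically from the formulas. This makes explicit why a product with many false detections can keep a fairly high FC while its KSS drops.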

Table 2. Average station-based results of HelioFTH (sa) and surface CFC observations (rd) for April 2004 [obs = reference cloud observation at the surface (PCA = partial cloud amount, SYN = synoptic observation), N = number of all available surface values, $N_{\mathrm{cf,cc}}$ = number of only clear or cloudy surface values, FC = fraction correct, KSS = Kuiper skill score and bias = mean (satellite − surface)]. For comparison the mean results for CM SAF and ISCCP-DX CFC products are also shown.
Column headers: Site, Obs, N, $N_{\mathrm{cf,cc}}$, $\mathrm{CFC_{rd}}$, $P(\mathrm{cf_{sa}} \mid \mathrm{cf_{rd}})$, $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$, $P(\mathrm{cf_{rd}} \mid \mathrm{cf_{sa}})$, $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}})$.

$P(\mathrm{cf_{rd}} \mid \mathrm{cf_{sa}})$ is lower than $P(\mathrm{cf_{sa}} \mid \mathrm{cf_{rd}})$ but false cloud pixels are indicated if $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}})$ is lower than $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$. Thus, HelioFTH detects more cloudy cases than the surface observations show (i.e., it is clear-sky conservative), especially for the semi-arid sites De-Aar and Sede-Boqer, but not for the mountainous sites Davos and Jungfraujoch. At these two sites the conditional probabilities of detecting cloudy situations by the satellite, $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$, are on the order of 0.7 due to the misinterpretation of cloudy pixels as snowy surface in the current version of HelioFTH, which has no snow detection implemented yet.

B. Dürr et al.: Infrared-based cloud masking. Atmos. Meas. Tech., 6, 1883-1901, 2013, www.atmos-meas-tech.net/6/1883/2013/

Table 3 for day, night and twilight conditions. Compared to PCA, HelioFTH shows the best performance during day (FC and KSS highest) and notably lower performance during night and twilight. Compared to SYN, night and twilight yield better agreement. HelioFTH detects more false cloud pixels during nighttime, i.e., the difference of $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}})$ to $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$ is larger. For HelioFTH-SYN, however, $P(\mathrm{cc_{sa}} \mid \mathrm{cc_{rd}})$ and $P(\mathrm{cc_{rd}} \mid \mathrm{cc_{sa}})$ are above 0.95, because the nighttime SYN comparison is dominated by observations of the training site Payerne. The validation results of CM SAF data with PCA and SYN show an overestimation of cloudy cases during day (only SYN) and night (PCA and SYN) but an overestimation of clear cases during twilight (PCA and SYN). These features are again more pronounced in comparison with SYN. ISCCP-DX shows an extreme overestimation of cloudy cases and, consequently, an extreme underestimation of clear cases during the whole day, with a maximum at twilight. The KSS is, therefore, considerably lower than for HelioFTH and CM SAF. The FC, however, is only slightly lower because the large number of false detections is not taken into account in FC as it is in KSS.

Figure 2 shows the mean diurnal cycles and the standard deviations (STD) for the ASRB sites Payerne, Davos and Jungfraujoch. At Payerne (top panel), ISCCP-DX cloudiness is overestimated and the day-to-day variability considerably underestimated. The HelioFTH cloudiness shows a systematic positive bias, but the day-to-day variability is well captured compared to SYN and PCA observations. The CM SAF and SYN cloudiness fit well, but there is a systematic positive bias of the day-to-day variability of CM SAF compared to SYN during daytime at Payerne. The PCA-based cloudiness shows a negative bias compared to SYN during daytime. SYN observations contain a reasonable amount of thin, high clouds or clouds very close to the horizon, which cannot be captured by the ASRB PCA but are apparently represented by CM SAF. However, the effect is less pronounced during nighttime, where the quality of SYN observations depends on the sky illumination.

Table A1. Contingency table of satellite and synoptic/reference satellite (reference dataset) observations.

Table 4. Bias = mean (satellite − satellite) and median = median (satellite − satellite) of the CTP product inter-comparison, given in hectopascal (hPa), for full disk (FD), Europe (EU), South Africa (SA) and for all sites with synoptic observations from Table 1 for April 2004. The suffix "L" indicates pixels over land only, and N the number of compared values.
