The CM SAF SSM/I-based total column water vapour climate data record: methods and evaluation against re-analyses and satellite


Introduction
Water vapour plays a central role in the Earth's energy budget and water cycle and is the most effective greenhouse gas, making it a key variable for climate analysis. About 60 % of the natural greenhouse effect can be explained by water vapour opacity in the atmosphere (Kiehl and Trenberth, 1997). Water vapour plays an amplifying role in global warming through a strong positive climate feedback loop, as evident in climate predictions (Held and Soden, 2000). The water vapour feedback in turn interacts with the cloud and ice albedo feedbacks and is dominated by water vapour in the tropical free troposphere (Held and Soden, 2000; IPCC, 2007). With increasing temperature the water vapour content in the lower troposphere increases. This results in changes of the hydrological cycle, in particular in an increased likelihood of more intense precipitation events (IPCC, 2007; Allan et al., 2010). The spatial distributions of WVPA (total column water vapour path) and precipitation and their seasonal cycles exhibit clear similarities, at least in the tropics (Trenberth, 2011). With increasing temperature and unchanged winds, wet areas will get wetter and dry areas will get drier, as noted, e.g., in Chou and Neelin (2004) and Allan and Soden (2007). This likely leads to intensified droughts in the divergence zones of the subtropics and floods in the convergence zones of the tropics. Eventually, the combined effect of more intense precipitation events and increased WVPA will affect atmospheric circulation. More details can be found in, e.g., Trenberth (2011) and Sherwood et al. (2010).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Analysing the global water vapour distribution and its changes over recent decades is expected to help extend our understanding of the climate system and how it responds to increasing greenhouse gas concentrations. In addition, the analysis of synoptic-scale water vapour transport can yield valuable insights into the dynamics of the atmosphere and its evolution.
The Global Climate Observing System (GCOS) introduced and defined Essential Climate Variables (ECVs), one of them being WVPA. Requirements for accuracy (stated as bias) and stability, which is the temporal variation of the bias, are provided for WVPA by GCOS and published in GCOS-107 (2011): 1 % (bias) and 0.3 % per decade (stability). Current CDRs (climate data records) are mainly based on observations from operational satellite systems that were primarily built to support short-term weather and environmental prediction applications. Many problems are associated with the use of operational satellites for climate monitoring due to, among other reasons, instrument changes and changes in calibration approaches. In consequence, significant efforts are needed to homogenise and inter-calibrate radiance data records for input to consistently applied retrieval schemes that derive ECVs such as WVPA. The Sustained and Coordinated Processing of Environmental Satellite Data for Climate Monitoring (SCOPE-CM) was initiated by the WMO (World Meteorological Organization) in order to establish a network of facilities that provide high-quality satellite-based ECVs in a continuous and sustained environment. One of SCOPE-CM's pilot projects, led by CM SAF (Satellite Application Facility on Climate Monitoring), focuses on ECV retrievals from SSM/I (Special Sensor Microwave Imager) observations and therefore indirectly also on the quality assessment of the underlying radiance record.
Long time series of WVPA based on recalibrated and homogenised radiance data records from SSM/I observations are developed and processed at Remote Sensing Systems (RSS; Wentz, 1997) and can be accessed via http://www.ssmi.com/. Another global water vapour climatology data set, from the NASA (National Aeronautics and Space Administration) Water Vapour Project (NVAP), is based on a combination of SSM/I, TOVS (TIROS Operational Vertical Sounder) and radiosonde data for the years 1988-1999 (Randel et al., 1996) and is being reanalysed and extended to cover the period 1988-2009 (NVAP-M, as part of NASA's MEaSUREs (Making Earth System Data Records for Use in Research Environments) programme; Vonder Haar et al., 2012). A second project on water vapour within the MEaSUREs programme focuses on the generation of a multi-sensor water vapour climate data record using cloud classification from A-Train measurements; it has recently been released under http://disc.sci.gsfc.nasa.gov/.
Employing the Clausius-Clapeyron relationship between temperature and the approximate response of saturation vapour pressure leads to a theoretical change in atmospheric water vapour of 7 % K⁻¹ (IPCC, 2007). Trenberth et al. (2005) analysed decadal trends in WVPA from SSM/I over open oceans and found an average trend of ∼1.3 % per decade or 8.87 % K⁻¹ relative to observed changes in sea surface temperature. Considering the tropical belt, a value of 7.8 % K⁻¹ has been observed, which is slightly larger than the theoretically expected value.
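The order of magnitude of the theoretical scaling can be checked with a few lines of Python. This is a minimal sketch, assuming the simplified Clausius-Clapeyron form d(ln e_s)/dT = L_v / (R_v T²) with a constant latent heat of vaporisation; the constants and the function name are illustrative textbook choices and are not taken from the works cited above.

```python
# Fractional change of saturation vapour pressure per kelvin, using the
# simplified Clausius-Clapeyron relation d(ln e_s)/dT = L_v / (R_v * T^2).
# Assumption: constant latent heat of vaporisation (textbook values below).

L_V = 2.5e6   # latent heat of vaporisation in J kg^-1 (assumed constant)
R_V = 461.5   # specific gas constant of water vapour in J kg^-1 K^-1


def cc_rate_percent_per_kelvin(temperature_k: float) -> float:
    """Return the relative increase of saturation vapour pressure in % K^-1."""
    return 100.0 * L_V / (R_V * temperature_k ** 2)


if __name__ == "__main__":
    for t in (273.15, 288.15, 300.0):
        rate = cc_rate_percent_per_kelvin(t)
        print(f"T = {t:6.2f} K: {rate:.2f} % per K")
```

Under these assumptions the rate lies between roughly 6 and 7 % K⁻¹ for typical near-surface temperatures and decreases with increasing temperature.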
In this paper we introduce the CM SAF total column water vapour climate data record (Jonas et al., 2009) derived from SSM/I observations onboard the Defense Meteorological Satellite Program (DMSP) platforms using the HOAPS (Hamburg Ocean-Atmosphere Fluxes and Parameters from Satellite) algorithm for WVPA (Schlüssel and Emery, 1990). The HOAPS algorithm package has been developed at the Max-Planck Institute for Meteorology (MPI) and the University of Hamburg (UHH) (Schulz et al., 1998; Jost et al., 2002; Andersson et al., 2010). The HOAPS algorithm and associated products are widely used in the scientific community, e.g. in Chou et al. (2004), Curry et al. (2004), Gershunov and Roca (2004), Kumar and Schulz (2002), Röske (2006) and Sohn et al. (2004), and have been positively evaluated in an independent retrieval assessment by Sohn and Smith (2003).
The WVPA product has further been post-processed by applying an objective analysis for interpolation, namely kriging. Kriging has the advantages of allowing gap filling and of providing an uncertainty estimate on a grid basis. The CDR, the product requirements on WVPA, the theoretical basis for the retrieval, validation results, a user guide and a description of the processing chain are available free of charge at the CM SAF webpage (http://www.cmsaf.eu). The document package fully describes the processing and the product, ensuring traceability and reproducibility.
The structure of this paper is as follows: after a description of the retrieval and homogenisation schemes as well as the kriging approach (Sect. 2), the validation setup and results are presented in Sect. 3. Finally, Sect. 4 summarises the results and gives conclusions.

Total column water vapour from SSM/I
The WVPA climatology is based on data from six SSM/I instruments on the DMSP F-08, F-10, F-11, F-13, F-14 and F-15 platforms. As the SSM/I instruments on the F-09 and F-12 satellites have failed, the climatology is derived using the remaining six instruments. Details on the SSM/I instrument characteristics can be found in Hollinger et al. (1987).
Each DMSP satellite has a temporal overlap with at least one consecutive satellite, which makes the homogenisation of the measured brightness temperatures between different SSM/I instruments feasible. For the homogenisation, the SSM/I on F-11 has been chosen as the reference instrument, because the F-11 observation period has maximum overlap with other DMSP satellite observations. Using the overlap between the DMSPs, probability density functions (PDFs) of the brightness temperatures based on ten days of data have been calculated and statistically matched in each channel. The matching coefficients can then be used to homogenise the measurements of different SSM/I instruments. Details on the homogenisation scheme are described in Fennig (2001) and Andersson et al. (2010).
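The idea of matching brightness-temperature distributions during an overlap period can be sketched as follows. This is an illustration only: it assumes a linear correction fitted to matching quantiles, which is not necessarily the scheme of Fennig (2001), and the function names and synthetic numbers are hypothetical.

```python
# Sketch of inter-sensor homogenisation by matching brightness-temperature
# distributions. Assumption: a linear correction tb_ref = offset + slope * tb
# fitted to matching quantiles; the operational scheme may differ.
import random
import statistics


def matching_coefficients(tb_target, tb_reference, n_quantiles=100):
    """Fit offset and slope that map target quantiles onto reference quantiles."""
    q_tgt = statistics.quantiles(tb_target, n=n_quantiles)
    q_ref = statistics.quantiles(tb_reference, n=n_quantiles)
    mean_t = statistics.fmean(q_tgt)
    mean_r = statistics.fmean(q_ref)
    # Ordinary least squares of reference quantiles on target quantiles.
    sxx = sum((t - mean_t) ** 2 for t in q_tgt)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(q_tgt, q_ref))
    slope = sxy / sxx
    offset = mean_r - slope * mean_t
    return offset, slope


def homogenise(tb_target, offset, slope):
    """Apply the matching coefficients to the target sensor's measurements."""
    return [offset + slope * tb for tb in tb_target]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic overlap period: both sensors sample the same scene statistics,
    # but the target sensor has a hypothetical calibration offset and gain.
    reference = [rng.gauss(200.0, 15.0) for _ in range(5000)]
    target = [(rng.gauss(200.0, 15.0) - 1.5) / 1.02 for _ in range(5000)]
    offset, slope = matching_coefficients(target, reference)
    corrected = homogenise(target, offset, slope)
    print(f"offset = {offset:.2f} K, slope = {slope:.3f}")
    print(f"mean before = {statistics.fmean(target):.2f} K, "
          f"after = {statistics.fmean(corrected):.2f} K")
```

Matching quantiles rather than collocated pixels means the two sensors do not need to observe the same location at the same time, only the same distribution of scenes during the overlap.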
The data set has global coverage, i.e. within ±180° longitude and ±80° latitude, and is defined for the ice-free oceanic surface with a minimum distance of 50 km to land surfaces. The data are available as daily and monthly averages. The temporal coverage ranges from 9 July 1987 to 31 August 2006. The data set is available from http://www.cmsaf.eu/.

Retrieval
Schlüssel and Emery (1990) developed four retrieval schemes for WVPA utilising different combinations of SSM/I channels. These schemes are of semi-physical nature, i.e. they are based on forward radiative transfer calculations on a set of atmospheric profiles followed by a statistical inversion using linear regression. All four variants of the total column water vapour retrieval depend purely on SSM/I brightness temperatures and regression coefficients derived from the initial set of atmospheric profiles. The retrievals do not depend on any additional information.
For long-term applications the availability of the channels used by the retrieval is a crucial limitation. In view of the failure of the 85 GHz channel on F-08, the Schlüssel and Emery (1990) retrieval variant employing the 22 and 37 GHz channels has been implemented for long-term application, although the use of the 85 GHz channel data might improve accuracies slightly when present. The continuous availability and homogeneity of the input data source is clearly an advantage of the two-channel retrieval scheme. With the two-channel variant, 98.9 % of the variance within the training data set could be explained by the regression formula. The remaining uncertainty, which may be interpreted as the accuracy of the statistical retrieval, is given by the authors as 1.5 kg m⁻². A detailed description of the retrieval schemes and the statistical inversion technique can be found in Schlüssel and Emery (1990).

Objective analysis for interpolation
After the retrieval of WVPA on the satellite swath, an objective analysis technique called kriging is applied to map the retrievals in space and time to the product grid. The principle is that an estimate or prediction for an unobserved location is performed by using the observations from locations in the vicinity. The optimal estimate at each grid point is found by a weighted average of the information from the surrounding points. The challenge is to determine the optimal weights for each of the used observations. These depend on two parameters: the distance-dependent spatial correlation function and the error of the used observation. Kriging provides optimum fields of the considered parameters together with an error map and allows the generation of fully covered fields, giving values even in areas with a very low data density.
Kriging can be regarded as a prediction of a value $x_0$ at a location $P_0$ where no measurement has been carried out, by using information at surrounding positions $P_i$:

$$x_0 = \sum_{i=1}^{n} \lambda_i \, (x_i + \Delta x_i), \qquad (1)$$

where $x_0$ is the value which has to be predicted, $x_i$ denotes the available measurements and $\Delta x_i$ their errors. The task is to determine the weights $\lambda_i$. Obviously a solution is not possible for a single case. But if a time series of $m$ measurements at each location $P_i$ is available, a reasonable constraint is that the mean squared deviation between predictions and truth is minimal:

$$\sum_{m} \Big( x_0 - \sum_{i=1}^{n} \lambda_i \, (x_i + \Delta x_i) \Big)^2 = \min. \qquad (2)$$

When searching for an optimum set of weights $\lambda_i$, Eq. (2) has to be differentiated with respect to all $\lambda_i$. This leads to a set of $n$ linear equations, which can be written as a matrix equation (Eq. 3). For the sake of clarity, the temporal summation over $m$ is abbreviated by brackets $[\ldots]$. The errors are assumed here to be random, so that all mixed terms $[x_i \Delta x_i]$ are zero (no tendency of overestimating high and underestimating low values). This assumption seems to be valid, as validations of the retrieval scheme have not shown such systematic errors (Sohn and Smith, 2003). Furthermore, all error covariances $[\Delta x_i \Delta x_j]$ (with $i \neq j$) are zero if different observations are considered (no tendency to overestimate when the neighbour is overestimating). For satellite observations it can be expected that kriging error estimates will increase when error covariances are also considered. However, the random error assumption is kept:

$$\begin{pmatrix} [x_1 x_1] + [\Delta x_1 \Delta x_1] & [x_1 x_2] & \cdots & [x_1 x_n] \\ [x_2 x_1] & [x_2 x_2] + [\Delta x_2 \Delta x_2] & \cdots & [x_2 x_n] \\ \vdots & \vdots & \ddots & \vdots \\ [x_n x_1] & [x_n x_2] & \cdots & [x_n x_n] + [\Delta x_n \Delta x_n] \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix} = \begin{pmatrix} [x_1 x_0] \\ [x_2 x_0] \\ \vdots \\ [x_n x_0] \end{pmatrix}. \qquad (3)$$

The result is a linear set of equations containing the covariance matrix between the data points $P_i$ with the error variances on its diagonal, and a vector giving the covariance between the predicting point $P_0$ and the locations $P_i$ where measurements are available.
The minimised expression in Eq. (2) is equal to the error of the predicted value, often called the kriging error ($\sigma_{\mathrm{krig}}^2$). Transformation of that expression leads to

$$\sigma_{\mathrm{krig}}^2 = [x_0 x_0] - 2 \sum_{i=1}^{n} \lambda_i \, [x_i x_0] + \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j \, [x_i x_j] + \sum_{i=1}^{n} \lambda_i^2 \, [\Delta x_i \Delta x_i]. \qquad (4)$$

The first term of the kriging error denotes the variance at $P_0$.
The second term in Eq. (4) contains the covariances between the predicting point $P_0$ and the data points $P_i$, and represents that part of the kriging error which can be called information. Its negative sign indicates that the error decreases with increasing information. But two points in close vicinity can contain redundant information, which is represented in the third term. The associated covariance between these data points increases the kriging error. The fourth term describes the effects of the individual errors of the used data points. The kriging error may be interpreted as the unexplained variance at the prediction point. If anomalies, determined by subtracting the monthly mean $\bar{x}$ from the observations, are normalised with the standard deviation $\sigma_x$ of daily means within a month, the time series at all grid points have a standard deviation of 1. Consequently, correlation and covariance are identical.
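The linear system and the kriging error described above can be illustrated with a small, self-contained Python sketch. It assumes normalised anomalies (unit variance, so covariance equals correlation), an exponential correlation function with a hypothetical length scale, and uncorrelated observation errors; the function names are illustrative and this is not the operational CM SAF routine.

```python
# Minimal kriging sketch for normalised anomalies (unit variance, so that
# covariance equals correlation). Assumptions: exponential correlation
# r(d) = exp(-d / length_scale) and uncorrelated observation errors.
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        partial = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - partial) / aug[row][row]
    return x


def krige(target, points, values, error_vars, length_scale):
    """Return (estimate, kriging error variance, weights) at `target`."""

    def corr(p, q):
        return math.exp(-math.dist(p, q) / length_scale)

    n = len(points)
    # Covariance matrix between data points, error variances on the diagonal.
    lhs = [[corr(points[i], points[j]) + (error_vars[i] if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    # Covariance vector between the prediction point and the data points.
    c0 = [corr(target, p) for p in points]
    weights = solve_linear(lhs, c0)
    estimate = sum(w * v for w, v in zip(weights, values))
    # With optimal weights, the four-term error expression collapses to
    # (variance at target) - sum(weights * c0); the variance at target is 1.
    krig_var = 1.0 - sum(w * c for w, c in zip(weights, c0))
    return estimate, krig_var, weights


if __name__ == "__main__":
    pts = [(1.0, 0.0), (0.0, 2.0), (-3.0, 1.0)]   # hypothetical positions
    vals = [0.8, 0.3, -0.5]                        # normalised anomalies
    errs = [0.1, 0.1, 0.4]                         # error variances
    est, var, w = krige((0.0, 0.0), pts, vals, errs, length_scale=2.0)
    print(f"estimate = {est:.3f}, kriging variance = {var:.3f}")
    print("weights:", [round(x, 3) for x in w])
```

The kriging variance approaches 1 (the full normalised variance) when all observations are far away and drops towards 0 when an error-free observation coincides with the prediction point.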
The kriging is performed for daily averages. These daily averages are calculated from the satellite raw data in two steps. First, neighbouring pixels of the same overpass and satellite are averaged to intermediate means at the time of overpass on a rectangular grid of 0.5° × 0.5°. These intermediate means can be considered independent of each other because they are based on different satellite overpasses during the day, which are hours apart. In a second step, the intermediate means are averaged once again to obtain the final daily mean. This way equal weight is given to each temporal contribution. The underlying variance of the intermediate means is used to estimate the error variance of that specific daily mean; in this estimate, $x_{ki}$ denotes the values within the $k$th day, $n_k$ the number of observations of the $k$th day and $\bar{x}_k$ the corresponding daily mean. The covariance of daily means is calculated as a pure function of distance. This is a further reason for normalising the values: otherwise, regions of high variability would dominate the results. In practice, an exponential function is fitted to the correlations. More details on the kriging routine are described in Lindau and Schröder (2010).
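The two-step averaging can be sketched as follows. The normalisation of the error variance is an assumption here: the sketch uses the squared standard error of the intermediate means (sample variance divided by their number), which is one plausible reading of the description above and may differ from the expression used in the processing chain. The function name and input layout are hypothetical.

```python
# Two-step daily averaging for one grid box. Assumption: the error variance
# of the daily mean is taken as the squared standard error of the
# intermediate (per-overpass) means; the exact normalisation used
# operationally is not reproduced here.
from statistics import fmean, variance


def daily_mean_with_error(overpasses):
    """overpasses: list of lists with the pixel values of each overpass.

    Returns (daily mean, error variance); the error variance is None when
    fewer than two overpasses are available.
    """
    # Step 1: average the pixels of each overpass to an intermediate mean.
    intermediate = [fmean(pixels) for pixels in overpasses if pixels]
    n_k = len(intermediate)
    if n_k == 0:
        return None, None
    # Step 2: average the intermediate means with equal weight per overpass.
    daily = fmean(intermediate)
    if n_k < 2:
        return daily, None  # spread cannot be estimated from one overpass
    error_var = variance(intermediate, daily) / n_k
    return daily, error_var


if __name__ == "__main__":
    # Three hypothetical overpasses with different pixel counts (kg m^-2).
    mean, err = daily_mean_with_error([[20.0, 22.0], [30.0], [25.0, 27.0, 29.0]])
    print(f"daily mean = {mean:.1f} kg m^-2, error variance = {err:.1f}")
```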

Climatology
The output fields from the kriging are the gridded WVPA, the number of valid observations and uncertainty information. The arithmetic average per grid box of the defined data values of the full time series is displayed in the top panels of Fig. 1. The average WVPA field exhibits maxima in the tropics, in particular over the warm pool region, regional minima at the continental west coasts and a general decrease towards the poles. The standard deviation shows maxima where spatial gradients in WVPA are largest, associated with the annual north-south migration of the intertropical convergence zone (ITCZ), and in storm track regions, as well as minima at the continental west coasts and in the polar regions. An exemplary daily average and its associated kriging error are shown in the bottom panels of Fig. 1. General features are similar to those in the top left panel but exhibit much more fine structure. For example, an atmospheric river can be observed in the South Pacific, which transports moisture from the tropics to the extra-tropics. The kriging error (bottom right panel) exhibits pronounced maxima in data-void areas, which is a direct consequence of the decreased confidence with distance to valid observations.

Approach
The evaluation setup is outlined and an in-depth discussion of the results is given in the following sections. A thorough evaluation of the WVPA products versus radiosondes or other ground-based instruments is hampered by the near absence of accurate in situ observations over ocean areas. Thus, the evaluation is performed as follows: a theoretical assessment of the algorithm accuracy has been carried out in a study by Sohn and Smith (2003), who compared several WVPA retrieval schemes using SSM/I, including the Schlüssel and Emery (1990) scheme. The product is then compared against the previous HOAPS version, against WVPA from RSS and against several reanalyses.

Algorithm evaluation
Sohn and Smith (2003) carried out a study to compare various statistical and physical water vapour retrieval schemes based on SSM/I observations. Their primary goal was the identification of differences between those algorithms and their dependence (a) on various "tangential environmental factors" (such as SST, cloud water path, surface wind speed) and (b) on the training data set used for the derivation of the regression coefficients in the case of the retrievals using statistical inversion. In the study, the Schlüssel and Emery (1990, S&E) algorithm as used in this CM SAF product has been compared to four statistical and two physical retrieval schemes (see Sohn and Smith, 2003, for references). Additionally, a previously developed "optimum statistical retrieval" (WOS) based on an earlier comparison exercise carried out by Wentz (1995) has also been used. The results of their study can be summarised as follows:
Within both retrieval types (statistical and physical), the algorithms agree quite well. Nevertheless, the physical retrievals show a significant positive bias and greater RMSE compared to the statistical retrievals, which are based on a temporally and geographically limited selection of radiosonde soundings.
Within the statistical retrievals, the WOS retrieval was considered a benchmark. The S&E algorithm applied here has been found to be as close as 0.5 kg m⁻² to this benchmark throughout the year and in most regions. The variability in the water vapour column was represented almost identically.

Comparison against previous HOAPS WVPA product
The CM SAF WVPA product (CM SAF HOAPS v3.1) has been compared to its previous version, HOAPS v3, which is accessible via http://www.hoaps.zmaw.de. The average absolute (left panel) and relative (right panel) differences between both data sets are shown in Fig. 2. Absolute and relative differences are generally positive, with global mean values of 0.3 kg m⁻² and 1.1 %, respectively, and with maximum values of ∼1.5 kg m⁻² and ∼3 %, respectively. Local maxima in absolute differences are correlated with maxima in WVPA, that is, over the tropics and warm ocean currents. The local maxima in relative difference are observed over storm track regions.
To understand the differences it is worth recalling the need for independent observations in the kriging procedure (see Sect. 2.2). The systematic difference is therefore potentially caused by the different averaging approaches in the determination of monthly means: in HOAPS v3 all values within a month and grid box are summed and normalised by the number of observations. In CM SAF HOAPS v3.1, observations from individual overpasses are averaged first and then used to compute monthly means. From a mathematical viewpoint, the two approaches need not give identical results because the number of observations is not constant over time. However, the systematic nature and the magnitude of this effect are surprisingly large.
An overpass with only a few observations more strongly impacts the monthly average in the CM SAF HOAPS v3.1 data set because it has a larger weight after the sub-setting than within the "all-pixel" approach used for HOAPS v3. Furthermore, a reduced number of observations is mainly caused by rain screening, and the remaining valid observations in the vicinity of heavy rain likely exhibit large WVPA values. Therefore, larger WVPA values in combination with smaller numbers of valid observations within an overpass can lead to a systematic bias in WVPA.
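The weighting effect can be made concrete with a toy example. The numbers are hypothetical and chosen only to illustrate the mechanism: one overpass with many valid pixels in moderately moist conditions and one overpass where rain screening leaves only a few, moister pixels.

```python
# Toy comparison of the two averaging approaches for one grid box.
# All pixel values are hypothetical and serve only to show the mechanism.
from statistics import fmean


def all_pixel_mean(overpasses):
    """Pool all pixels and average them (approach described for HOAPS v3)."""
    return fmean([value for pixels in overpasses for value in pixels])


def overpass_first_mean(overpasses):
    """Average each overpass first, then average the overpass means
    (approach described for CM SAF HOAPS v3.1)."""
    return fmean([fmean(pixels) for pixels in overpasses if pixels])


if __name__ == "__main__":
    dry_overpass = [30.0] * 100   # many valid pixels, 30 kg m^-2
    moist_overpass = [50.0] * 5   # few pixels left after rain screening
    data = [dry_overpass, moist_overpass]
    print(f"all-pixel mean:      {all_pixel_mean(data):.2f} kg m^-2")
    print(f"overpass-first mean: {overpass_first_mean(data):.2f} kg m^-2")
```

In this example the sparsely sampled moist overpass contributes half of the overpass-first mean but less than 5 % of the all-pixel mean, which produces a positive difference of the kind discussed above.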
Void data points due to rain are filled by the kriging scheme. Given the tendency towards large WVPA values in such areas, the rain gaps will also be filled with large WVPA values. This reasonable approach leads to an increased number of large WVPA values and can therefore explain the systematic difference.
These effects likely produce the patterns of absolute and relative difference seen in Fig. 2, because the absolute maxima in difference strongly correlate with maxima in the spatial distribution of precipitation, and the relative differences are mainly observed in storm track regions, in which rain events are frequent.

Comparison to WVPA from RSS
As mentioned above, RSS also provides WVPA from SSM/I and from TMI (TRMM Microwave Imager) measurements. The RSS algorithm is a physical retrieval scheme that derives a set of variables (including total column water vapour) simultaneously from all channels of the SSM/I instrument. The retrievals are obtained via inverse radiative transfer modelling based on a parameterised solution of the forward radiative transfer equations. Details on the algorithm can be found in Wentz (1997). Version 6 of the RSS SSM/I WVPA data record and version 4 of the TMI WVPA data record were utilised in this comparison. The homogenisation techniques, the retrieval algorithms including auxiliary data, and the temporal and spatial gridding are completely different between RSS and CM SAF and will be explained below.

Systematic and random deviations
Monthly mean WVPA data for each instrument from RSS have been interpolated to the CM SAF product grid for an easier comparison. Figure 3 shows the bias and root mean square difference of each individual SSM/I and the TMI water vapour product from RSS versus the homogenised water vapour data set from CM SAF.
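For reference, bias and root mean square difference between two fields on a common grid can be computed as in the following sketch. It assumes that missing values are marked with `None` and that all grid cells are weighted equally; whether an area weighting was applied in the comparison is not stated here, so the equal weighting is an assumption, and the function name is illustrative.

```python
# Bias and root mean square difference between two gridded fields that
# share the same grid. Assumptions: missing values are marked with None
# and all grid cells carry equal weight (no area weighting).
import math


def bias_and_rmsd(field_a, field_b):
    """Return (mean of a - b, root mean square of a - b) over cells valid in both."""
    diffs = [a - b for a, b in zip(field_a, field_b)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no grid cells with valid data in both fields")
    bias = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmsd


if __name__ == "__main__":
    # Hypothetical monthly mean values in kg m^-2 on a flattened grid.
    field_a = [30.0, 20.0, None, 41.0]
    field_b = [31.0, 20.0, 25.0, 43.0]
    bias, rmsd = bias_and_rmsd(field_a, field_b)
    print(f"bias = {bias:.2f} kg m^-2, RMSD = {rmsd:.2f} kg m^-2")
```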
The overall bias between the two SSM/I-based water vapour retrievals is found to be on the order of 0.5 kg m⁻², where the CM SAF product tends to be drier than the RSS product, which is consistent with the findings by Sohn and Smith (2003) mentioned above. The bias against the TMI product is slightly larger, at about 0.6 kg m⁻². This is mainly caused by the limited coverage of the comparison within ±40° latitude due to the TMI orbit. On the other hand, the bias is surprisingly small, as both algorithms differ significantly concerning the homogenisation techniques, retrieval schemes and precipitation filtering approach used. Two different levels of RMSE are seen in the lower panel, where from 1991 onwards the RMSE steadily decreases. Also evident from Fig. 3 is the stable bias after 1991 and the high correlation of the bias and RMSE between individual SSM/I products from RSS and the homogenised data set from CM SAF. From this high correlation in the overlap periods, one can conclude that the homogenisation scheme used in the production of the CM SAF data set is working well, as no significant differences occur during the overlap periods compared to the RSS individual satellite products.
The peak in the RMSE of RSS's F-10 product versus the CM SAF data set at the beginning of F-10's data period is due to a different starting date: RSS uses observations from the first day an instrument is operational, whereas the CM SAF product always introduces new satellites at the beginning of a month. This introduces a different temporal sampling in the two data sets, leading to the RMSE peak in January 1991. Peaks in RMSE also occur later in the data period, but at these times they are well correlated among the satellites. These peaks are caused by a different recognition of spatial patterns in the water vapour fields due to different boundary conditions (e.g. the sea surface temperature data sets used) that influence RSS's simultaneous retrieval of all variables, whereas they do not influence the purely statistical retrieval of the CM SAF data set, which does not use sea surface temperature data.

Trend comparison
To be suitable for trend monitoring, the stability over time is another important issue to be considered. For total column water vapour, Ohring et al. (2005) provide a decadal stability requirement of 0.26 % per decade. Providing evidence for decadal stability is very challenging because global long-term reference observations with sufficient spatial and temporal sampling and known uncertainty characteristics are not available over open oceans. Having two data sets derived from the same source only allows the comparison of trends and an analysis of the relative stability of the data sets, which may increase confidence in using them for trend monitoring.
After removing the seasonal cycle from both data sets, linear trends over the years 1991-2006 are determined. Global maps of the resulting trends for both data sets are shown in Fig. 4. The spatial trend patterns agree well, and very small differences are visible close to sea ice edges, which are most likely caused by the different ice coverage data sets used during mapping in both data sets. The global mean decadal trend is 0.45 kg m⁻² or 1.67 % decade⁻¹ for the CM SAF data set, and 0.44 kg m⁻² or 1.62 % decade⁻¹ for the RSS data set. Thus, the difference of the trends is only 0.05 % decade⁻¹, which is significantly smaller than the estimated trend uncertainties. The relative stability therefore fulfils the requirement stated in Ohring et al. (2005). This increases our confidence in using both data sets for trend monitoring. However, both trend estimates are slightly larger than those given in Trenberth et al. (2005) and Mears et al. (2007), who also utilise SSM/I data but consider different time periods than used in this study (Trenberth et al., 2005; Mears et al., 2007) and, more importantly, only consider the tropics (Mears et al., 2007).
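The trend estimation can be sketched as a two-step procedure: subtract the mean annual cycle from the monthly means, then fit a straight line by least squares. The sketch below uses a synthetic time series with a prescribed trend; the function names and all numbers are illustrative assumptions, not the processing used for Fig. 4.

```python
# Decadal trend from deseasonalised monthly means (sketch on synthetic data).
import math


def deseasonalise(monthly):
    """Subtract the mean annual cycle (one climatological value per calendar month)."""
    clim = [sum(monthly[m::12]) / len(monthly[m::12]) for m in range(12)]
    return [value - clim[i % 12] for i, value in enumerate(monthly)]


def linear_trend_per_decade(anomalies):
    """Ordinary least-squares slope of the anomalies, in units per decade."""
    n = len(anomalies)
    time = [i / 120.0 for i in range(n)]  # months converted to decades
    mean_t = sum(time) / n
    mean_y = sum(anomalies) / n
    sxx = sum((t - mean_t) ** 2 for t in time)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(time, anomalies))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic 16-year series: mean 27 kg m^-2, trend 0.45 kg m^-2 per
    # decade, and an annual cycle with 2 kg m^-2 amplitude (all hypothetical).
    n_months = 16 * 12
    series = [27.0 + 0.45 * i / 120.0 + 2.0 * math.sin(2.0 * math.pi * i / 12.0)
              for i in range(n_months)]
    trend = linear_trend_per_decade(deseasonalise(series))
    relative = 100.0 * trend / (sum(series) / n_months)
    print(f"trend = {trend:.3f} kg m^-2 per decade ({relative:.2f} % per decade)")
```

Note that subtracting a monthly climatology from a trending series also removes the small part of the trend that falls within each calendar year, so the recovered slope is marginally smaller than the prescribed one; for a 16-year series this effect is well below 1 %.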

Comparison against reanalyses
Assimilation of radiances or total column water vapour estimates into numerical weather prediction (NWP) models is another way of producing a global water vapour data record, with the advantage of being physically consistent within the constraints of the model physics. To achieve temporal homogeneity with respect to model versions and assimilation systems, NWP centres perform reanalyses, which also allow for the use of more input data and better quality control. Within this study the following reanalysis data sets are used:
- ECMWF (European Centre for Medium-Range Weather Forecasts) ERA40 re-analysis for 1987-2002 (Uppala et al., 2005);
- ECMWF ERA40 re-analysis (forecast step +24 h) for 1987-2002;
- ECMWF ERA INTERIM re-analysis for 1989-2006 (Dee et al., 2011);
- JMA (Japan Meteorological Agency) JCDAS-25 reanalysis for 1987-2006 (Onogi et al., 2007).
In order to assess the impact of model changes, the ECMWF operational analysis for 1995-2006 is additionally compared to the CM SAF data set.
The earlier ERA40 re-analysis data set has some known deficiencies in the global water cycle, as discussed in Hagemann et al. (2005), who found that the ERA40 water cycle changed in several respects compared to earlier re-analyses and that the ERA40 precipitation has several deficiencies. For data after 1972, the global water budget was found to be unbalanced and the freshwater flux over the ocean was negative in the long-term mean.
In all analysis data sets SSM/I information is assimilated, either by using retrievals of total column water vapour (ERA40 and JCDAS-25) or by directly using the SSM/I radiances (ERA INTERIM and ECMWF operational analysis). Thus, the compared data sets are not independent of each other. However, the analyses assimilate not only SSM/I data but also a wealth of other water vapour information from radiosondes and other satellite instruments. Given this fact and the complex interactions between variables incorporated in forecasting systems, the analysis represents a best fit through all observations as a whole, but not necessarily for an individual variable or instrument. From this point of view, data sets derived purely from satellite measurements still provide a good basis for re-analysis assessment.
Re-analysis fields are provided at 00:00, 06:00, 12:00 and 18:00 UTC, so that a daily mean can be derived from four equally distributed fields. The CM SAF data set uses the unequally distributed orbits to construct daily means. This introduces additional sampling uncertainty, which is assumed to be small for the global averages considered in this comparison. All reanalysis data sets have been interpolated to the spatial resolution of the CM SAF product.
The monthly uncertainty given in Fig. 5 has been split up regionally and in terms of the PDF. The results are shown in Fig. 6. The lower panel illustrates the fractional contribution to the total bias and its latitudinal dependence. Intense grey shading shows regions that contribute much to the total bias, i.e. regions where CM SAF WVPA and ERA INTERIM values differ most. The following patterns can be identified: in both hemispheres, the storm track regions show higher differences, predominantly in the respective summer-fall season. This may be due to the different temporal sampling in both data sets. A second region of higher differences is the ITCZ, evident as a band-like structure that can be seen throughout the time series, including its inter-annual variability. This may be explained by the SSM/I signal saturating at very high water vapour values, where the additional information from other observations in the re-analysis data set still provides reliable values for WVPA.
The upper panel of Fig. 6 shows the differences in WVPA split up in terms of the PDF. At the dry end of the PDF, i.
Considering RMSE values, ERA INTERIM gives the best agreement with the CM SAF product over time, with values around 3 kg m⁻², followed by JCDAS-25 and (starting in 2002) the operational analysis from ECMWF. ERA40 exhibits RMSE values similar to those of the operational analysis from ECMWF for the overlapping period, for reasons outlined above. Lastly, the ERA40 forecast step +24 h not surprisingly shows the largest RMSE values, as (a) small-scale structures get smeared out with increasing forecast time and (b) only two forecast base times per day, at 00:00 and 12:00 UTC, are available. From that, a complex difference in temporal sampling between model and satellite arises.
In general, both bias and RMSE values decrease in all cases at the beginning of 1991, which is the time when the second DMSP satellite started to be used in the CM SAF product (see Fig. 3 for SSM/I temporal coverage). The lack of information is more severe for the CM SAF product, as the reanalyses also use additional information for water vapour. The sudden decrease in bias in 2002 is related to the introduction of a temporally variable bias correction scheme, which is preferable to the previously used temporally fixed bias correction.

Conclusions
A Climate Data Record for total column water vapour derived from SSM/I observations based on the Schlüssel and Emery (1990) retrieval has been produced at and released by CM SAF (Jonas et al., 2009).It covers the global ice-free oceans and ranges in time from 1987 to 2006.Monthly and daily means with a spatial resolution of 0.5 × 0.5 degrees are available.An objective interpolation scheme has been developed and applied in order to allow full spatial data coverage and to also provide uncertainty estimates on grid basis.The CDR is fully transparent and is, together with its documentation, freely available from CM SAF.The major objectives for the development and release of this CDR are (a) to ensure continued and sustained processing and development capabilities for a mature and widely used WVPA CDR and (b) to establish another WVPA CDR from SSM/I in order to decrease structural uncertainty.
The quality of the data set was assessed in terms of bias, RMSE and decadal stability by comparison to WVPA data sets provided by Remote Sensing Systems and to various (re-)analyses as well as the prior version of the HOAPS water vapour data set.In addition, results from previous water vapour retrieval comparisons were considered.The evaluation results can be summarized as follows: The retrieval algorithm itself has been found to differ from reference "optimum statistical retrievals" less than 0.5 kg m −2 as far as the systematical difference (bias) is concerned.The RMSE differences between the Schlüssel and Emery (1990) algorithm underlying the CM SAF data set and the benchmark "optimum statistical" retrieval are negligible (Sohn and Smith, 2003).
Based on a comparison to the ERA INTERIM reanalysis data set, the CM SAF water vapour data set from SSM/I shows a global mean bias below 0.5 kg m −2 for the monthly means, with an RMSE of less than 2 kg m −2 after 1991 (in 1987-1990 the RMSE is slightly higher than 2 kg m −2 owing to the reduced temporal sampling). For the daily means, similar values are found in either case for the bias, but some outliers of the daily RMSE are larger than 2 kg m −2 .
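The two scores quoted above can be illustrated with a minimal sketch. It assumes that bias is the mean difference and RMSE the root of the mean squared difference over all grid cells valid in both fields; any area weighting or masking used for the actual evaluation is not reproduced here. The function name `bias_and_rmse` and the sample values are hypothetical and serve only as an illustration.

```python
import numpy as np


def bias_and_rmse(candidate, reference):
    """Return (bias, rmse) of candidate minus reference, skipping NaN cells."""
    diff = np.asarray(candidate, dtype=float) - np.asarray(reference, dtype=float)
    valid = ~np.isnan(diff)          # cells present in both fields
    bias = diff[valid].mean()        # systematic difference
    rmse = np.sqrt((diff[valid] ** 2).mean())
    return bias, rmse


# Synthetic WVPA fields in kg m-2 (illustrative values, not real data).
cm_saf = np.array([[20.0, 31.0], [np.nan, 44.0]])   # NaN marks a missing cell
reference = np.array([[21.0, 30.0], [12.0, 42.0]])

bias, rmse = bias_and_rmse(cm_saf, reference)
print(f"bias = {bias:.3f} kg m-2, RMSE = {rmse:.3f} kg m-2")
```

Missing cells (for example land or sea ice in an ocean-only product) are excluded from both scores, so the result depends only on the common coverage of the two fields.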
The comparison to RSS results revealed bias and RMSE results very similar to corresponding results based on ERA INTERIM. The trends derived from the RSS and the CM SAF data set based on a least squares regression analysis of the deseasonalised monthly means show excellent agreement of the geographical trend patterns. Also, decadal trend values agree very well. The small difference between the decadal trends of 0.05 % decade −1 is an indicator of small structural uncertainty and of high stability, which gives confidence in the observed trends.
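A trend estimate of this kind can be sketched as follows. The sketch assumes that deseasonalising means subtracting the mean of each calendar month and that the trend is the slope of an ordinary least-squares line fitted to the resulting anomalies; the exact procedure applied to the CM SAF and RSS data may differ. The function name `decadal_trend` and the synthetic series are hypothetical.

```python
import numpy as np


def decadal_trend(monthly_means):
    """Least-squares trend (units per decade) of a deseasonalised series.

    monthly_means: 1-D sequence covering whole years, starting in January.
    """
    x = np.asarray(monthly_means, dtype=float)
    years = x.reshape(-1, 12)               # rows: years, columns: months
    climatology = years.mean(axis=0)        # mean annual cycle
    anomalies = (years - climatology).ravel()
    t = np.arange(x.size) / 120.0           # time axis in decades
    slope, _intercept = np.polyfit(t, anomalies, 1)
    return slope


# Synthetic 16-year series: annual cycle plus 0.3 kg m-2 per decade.
months = np.arange(16 * 12)
series = 25.0 + 3.0 * np.sin(2 * np.pi * months / 12) + 0.3 * months / 120.0

trend = decadal_trend(series)
print(f"trend = {trend:.3f} kg m-2 per decade")
```

Removing the monthly climatology also removes the part of the trend that falls within each year, so the fitted slope is slightly below the prescribed 0.3 kg m −2 per decade for this short synthetic series.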
Finally, it needs to be noted that the validation of WVPA over ocean remains a challenging task due to missing reference observations with sufficient sampling and temporal coverage. The consistency among the comparison results using a highly diverse ensemble of data sets shows high quality and homogeneity of all considered WVPA data sets (with the exception of operational analysis). Thus, it can be concluded that trends in total column water vapour over ocean can be monitored.
Fig. 1. WVPA (top left) and extra daily standard deviation (top right) from SSM/I averaged over the period 1987-2006. The bottom panels show WVPA and kriging error for an exemplary day in April 2001.


Fig. 3. Time series of bias and RMSE between water vapour products from CM SAF and RSS. RSS products are available for each satellite; the CM SAF product is based on inter-satellite homogenised SSM/I brightness temperatures. The temporal coverage of each instrument is given by the coloured bars.

Fig. 4. Regional trends of monthly mean water vapour derived from deseasonalised CM SAF (left) and RSS (right) data sets for 1991-2006. Shown is the monthly trend of the regression analysis in kg m −2 .
e. at values of less than 6 kg m −2 , the bluish colours show that the CM SAF data set contains fewer grid points with such WVPA values compared to ERA INTERIM. This is compensated by more grid points in the CM SAF data set at values of 6-12 kg m −2 (reddish colours). This may point to a problem

Fig. 6. Contributions to the total monthly uncertainty (bias) for the CM SAF WVPA data set assessed versus the ERA INTERIM re-analysis. Values are fractional contribution per latitude bin (lower panel) and absolute difference between PDF values for binned WVPA (upper panel).