Continuous quality assessment of atmospheric water vapour measurement techniques: FTIR, Cimel, MFRSR, GPS, and Vaisala RS92

At the Izaña Observatory, water vapour amounts have been measured routinely by different techniques for many years. We intercompare the total precipitable water vapour (PWV) amounts measured between 2005 and 2009 by a Fourier Transform Infrared (FTIR) spectrometer, a Multifilter Rotating Shadow-band Radiometer (MFRSR), a Cimel sunphotometer, a Global Positioning System (GPS) receiver, and daily radiosondes (Vaisala RS92). The long-term character of our study allows a reliable and extensive empirical quality assessment of long-term validity, which is an important prerequisite when applying the data to climate research. We estimate a PWV precision of 1% for the FTIR, about 10% for the MFRSR, Cimel, and GPS (when excluding rather dry conditions), and significantly better than 15% for the RS92 (the detection of different airmasses prevents a better constrained estimation). We show that the MFRSR, Cimel and GPS data quality depends on the atmospheric conditions (humid or dry) and that the restriction to clear-sky observations introduces a significant dry bias in the FTIR and Cimel data. In addition, we intercompare the water vapour profiles measured by the FTIR and the Vaisala RS92, which allows the conclusion that both experiments are able to detect lower to upper tropospheric water vapour mixing ratios with a precision of better than 15%.


Introduction
Correspondence to: M. Schneider (matthias.schneider@kit.edu)

In the troposphere, water vapour is the most important trace gas. It is a key factor in governing tropospheric dynamics and it is a powerful greenhouse gas. Observing and analysing its evolution is needed for a better understanding of weather and of past and future climate. Long-term middle/upper tropospheric observations are of particular interest for the climate change research community, since at these altitudes water vapour acts very effectively as a greenhouse gas (e.g., Spencer and Braswell, 1997; Held and Soden, 2000).
Concerning total precipitable water vapour (PWV) measurements, there are some widely automated techniques, like sunphotometers and GPS (Global Positioning System) receivers, which offer good global coverage. For operational and, in particular, for research applications, it is essential to know the long-term quality of these measurements, since the expected trends in the PWV values due to global warming are on the order of a few tenths of a mm per decade (Trenberth et al., 2005). Great effort has been put into theoretically and empirically estimating the quality of these automated techniques (e.g., Revercomb et al., 2003; Van Baelen et al., 2005; Sapucci et al., 2007; Bokoye et al., 2007; Wang et al., 2007; Alexandrov et al., 2009). However, most of the empirical intercomparison studies are limited to intensive campaign periods.
Upper tropospheric water vapour profiles are traditionally measured by operational radiosondes. Efforts have also been made to reduce the uncertainties and document the quality of these measurements (e.g., Turner et al., 2003; Vömel et al., 2007; Miloshevich et al., 2009). However, similar to PWV quality assessments, these studies are often limited to campaigns. In our opinion, the long-term quality of tropospheric water vapour measurements, performed under routine conditions, is not sufficiently documented, which hinders their use for climate research.
The ground-based FTIR (Fourier Transform Infrared) experiments of NDACC (Network for Detection of Atmospheric Composition Change, Kurylo and Zander, 2000) have measured high quality solar absorption spectra for many years, which allows monitoring of a large variety of atmospheric trace gas column amounts and profiles, including water vapour at a very high precision (Schneider et al., 2006; Palm et al., 2010; Schneider and Hase, 2009; Sussmann et al., 2009). We think that long-term intercomparisons with FTIR measurements can significantly improve the quality assessment of different water vapour sensors.

Published by Copernicus Publications on behalf of the European Geosciences Union.
At the Izaña Observatory, Cimel and MFRSR (Multifilter Rotating Shadow-band Radiometer) sunphotometers, Vaisala RS92 radiosondes and ground-based FTIR water vapour measurements have been performed simultaneously for more than four years. In addition, GPS measurements started in June 2008. In this paper, we use this unique long-term dataset of coincident water vapour measurements to empirically estimate the quality and limitations of the different techniques. The following section briefly describes the five different experiments. In Sect. 3, we intercompare the routinely measured PWV amounts, discuss the observed disagreements and assess the data quality, its long-term stability, and its dependence on atmospheric conditions and observation geometry. In Sect. 4, we compare tropospheric water vapour profiles measured routinely by the Vaisala RS92 radiosonde and the FTIR experiment and discuss their quality. The most important results of our study are summarized in Sect. 5.

The water vapour instrumentation at Izaña
The Izaña Observatory is located on the Canary Island of Tenerife, 300 km from the African west coast at 28°18′ N, 16°29′ W at 2370 m a.s.l. It unites a huge variety of different atmospheric measurement techniques, among which are five capable of detecting upper-air water vapour. These five are briefly described in the following (for more details please refer to Romero et al., 2009). The precision with which these techniques are expected to measure PWV is given in Table 1.

Ground-based FTIR
Izaña's FTIR activities started in March 1999. They form part of the Network for Detection of Atmospheric Composition Change (NDACC). There are about 25 ground-based FTIR experiments performed within NDACC, mostly in northern mid-latitudes and in polar regions. For several decades, the NDACC FTIR experiments have been essential for studying stratospheric ozone chemistry by providing a long-term dataset of different ozone-relevant trace gases (e.g., Rinsland et al., 2003; Vigouroux et al., 2008). Due to its versatility, a ground-based FTIR instrument is a key experiment of an NDACC station. It measures spectra of the direct solar light beam using a high-resolution Fourier Transform Spectrometer. Figure 1 shows a spectrum for the 700-1350 cm⁻¹ (7.4-14.3 µm) region. The bottom panel gives an impression of the huge amount of information present in these high-resolution spectra. It shows two spectral microwindows with the wavenumber scale being expanded by a factor of 200. Individual rotational-vibrational lines of different absorbers (O₃, H₂O, HDO, CH₄, etc.) are discernible. The high spectral resolution allows measurements of the pressure-broadening effect, i.e., the line shape depends on the pressure at which the absorption takes place (e.g., compare the widths of the lines of H₂O, which absorbs mainly in the lower troposphere, with the widths of the lines of O₃, which absorbs mainly in the stratosphere). The high-resolution spectra therefore disclose not only the total column amount of the absorber but also some information about its vertical distribution.
The inversion problems faced in atmospheric remote sensing are, in general, ill-determined and the solution has to be properly constrained. An extensive treatment of the topic is given in the textbook of C. D. Rodgers (Rodgers, 2000). In recent years, the NDACC-FTIR community has increased its efforts to monitor the tropospheric distribution of greenhouse gases, including water vapour. The inversion of atmospheric water vapour amounts from ground-based FTIR spectra is far from being a typical atmospheric inversion problem and, due to the large vertical gradient and variability of water vapour, standard retrieval methods are not appropriate. During the last several years, the ground-based FTIR group of the Institute for Meteorology and Climate Research (department of Trace Constituents in the Stratosphere and Tropopause Region; German abbreviation: IMK-ASF), Karlsruhe, Germany, developed an appropriate water vapour retrieval method (Hase et al., 2004; Schneider et al., 2006; Schneider and Hase, 2009), which is applied in this study. An extensive description of this method is given in Schneider and Hase (2009).
In Schneider et al. (2006), the FTIR's PWV precision is estimated to be 4%. This is a rather conservative estimate since the analysis method has been further refined. In addition to highly precise PWV data, the ground-based FTIR technique can provide tropospheric water vapour profiles with a precision of about 15% and a vertical resolution of 2 km in the lower troposphere and 6 km in the upper troposphere (Schneider and Hase, 2009). Furthermore, the technique is able to detect profiles of water vapour isotopologue ratios, which is very useful for investigating the atmospheric water cycle (Schneider et al., 2009).

Cimel sunphotometer
The Cimel sunphotometer is an automated sun and sky scanning filter radiometer. At Izaña, the first Cimel measurements were made in 1997 and they have been continuously performed since 2004. The Cimel sunphotometer measures at eight different passband filters between 340 nm and 1020 nm. Its field-of-view is 1.2°. The pointing of the instrument is controlled by astronomical calculations. For the direct sun measurements, the tracking is assisted by a four-quadrant detector. Direct sun measurements are made typically every 10 min. The sky is scanned many times at different angles with respect to the sun, which allows the determination of many different aerosol properties (theory of Mie scattering). The Cimel measurements are performed at several hundred globally distributed sites within AERONET (Aerosol Robotic Network, Holben et al., 1998).
The PWV is calculated from the direct sun observations of the 940 nm passband. At Izaña, we determine the water vapour columns by the modified Langley plot method. For this purpose, the relation between the slant optical depth and the water vapour slant column amounts is approximated by a power law parameterisation (e.g., Bruegge et al., 1992; Schmid et al., 2001). Uncertainties in this parameterisation and the Langley regression (due to variable atmospheric water vapour amounts) as well as deficits in the filter characterisation are the leading error sources. Alexandrov et al. (2009) estimate a PWV precision of about 10%.
In this paper, we use AERONET level 1.5 data, which are automatically cloud screened by a method described in Smirnov et al. (2000). Romero et al. (2009) show that there is no significant difference between the Cimel PWV AERONET level 1.5 and level 2.0 data.

MFRSR sunphotometer
An MFRSR sunphotometer has detected irradiances at Izaña since 1996. It measures the global horizontal, the diffuse horizontal and the direct normal irradiances in six narrow wavelength passbands between 410 nm and 940 nm. The first is measured directly, whereas the latter two are calculated from a sequence of three measurements. For the middle measurement, a shadowing band blocks a strip of the sky where the Sun is located and for the other two the shadowing band blocks strips of the sky 9° to either side. These side measurements permit a correction of the excess sky blocked during the middle (Sun-blocking) measurement, which is necessary to determine the diffuse horizontal irradiances. The direct normal irradiances are then calculated by subtracting the diffuse horizontal from the global horizontal irradiances. For more details, please refer to Harrison et al. (1994). The MFRSR sensors have very good temporal and reasonable spatial coverage, since they measure automatically at many stations of the Baseline Surface Radiation Network (BSRN; Ohmura et al., 1998).
As for the Cimel, the MFRSR PWV is calculated from the 940 nm passband direct normal irradiances applying the modified Langley technique. The precision is estimated to be 10%. It is mainly limited by uncertainties involved in the calibration process and the filter characterisation (Alexandrov et al., 2009).
A very critical aspect of automated radiation measurements is cloud screening. The huge number of measurements requires the application of an automated procedure to separate cloud-affected data from clear sky data. Our automated cloud screening is based on iterative Langley plots. It considers outliers as cloud-affected measurements. For more details please refer to Romero et al. (2009).
In addition, we perform a data post-processing to screen low quality measurements. It is similar to the method applied by Alexandrov et al. (2004) for automated cloud screening of the MFRSR irradiance measurements. It consists of analysing the inhomogeneity of the atmospheric water vapour field as determined by the MFRSR. For this purpose, we calculate the parameter $\epsilon = 1 - \exp(\overline{\ln \mathrm{PWV}}) / \overline{\mathrm{PWV}}$. Here the overbar indicates a moving average over one hour. For a homogeneous dataset, the value of $\epsilon$ is close to 0, for an extremely inhomogeneous dataset it is close to 1. Figure 2 shows a histogram for the values of $\epsilon$ encountered in the MFRSR PWV data between 2005 and 2009. The peak at $10^{-2.7}$ represents the typical atmospheric water vapour inhomogeneity, whereas the second peak close to 1 is caused by sudden erroneous changes in the MFRSR PWV due to inefficient cloud screening, an incompletely blocked Sun, incorrectly estimated total horizontal irradiances, etc. We put the threshold at an $\epsilon$ of $10^{-1.6}$, i.e., we consider that only MFRSR PWV values with $\epsilon < 10^{-1.6}$ are reliable.
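The screening described above can be sketched in a few lines. This is a minimal illustration and not the operational processing: the function names, the centred one-hour window and the brute-force loop are assumptions made here for clarity; only the definition of the inhomogeneity parameter and the threshold of 10^-1.6 are taken from the text.

```python
import numpy as np

def inhomogeneity(times_s, pwv, window_s=3600.0):
    """Return eps = 1 - exp(mean(ln PWV)) / mean(PWV) for each sample.

    The means are taken over all samples within a centred window of
    window_s seconds (assumption: the text only states "a moving
    average over one hour").
    """
    times_s = np.asarray(times_s, dtype=float)
    pwv = np.asarray(pwv, dtype=float)
    eps = np.empty_like(pwv)
    half = window_s / 2.0
    for i, t in enumerate(times_s):
        sample = pwv[np.abs(times_s - t) <= half]
        # Ratio of geometric to arithmetic mean: 1 for constant data,
        # smaller the more the data vary within the window.
        eps[i] = 1.0 - np.exp(np.mean(np.log(sample))) / np.mean(sample)
    return eps

def reliable_mask(eps, threshold=10.0 ** -1.6):
    """True where the measurement passes the inhomogeneity screening."""
    return np.asarray(eps) < threshold
```

Because the ratio of geometric to arithmetic mean equals 1 for a constant series and decreases as the values within the window diverge, a single spurious jump in an otherwise smooth hour pushes the parameter well above the threshold for all samples sharing that window.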

GPS receiver
Due to refraction in the atmosphere, the radio signals emitted by the GPS (or GLONASS, the Russian global positioning system) satellites are delayed. The Zenith Total Delay (ZTD) is the sum of the Zenith Hydrostatic Delay (ZHD) associated with induced dipole moments of the atmospheric molecules and the Zenith Wet Delay (ZWD) related to the permanent dipole moments of the water vapour molecules. Absolute ZTD values can only be determined if the GPS receiver is operated within a network of reasonable spatial coverage (the same satellite must be seen at different GPS stations from different elevation angles, Duan et al., 1996). Stationed at Izaña is a Leica GRX 1200GG pro GPS/GLONASS receiver, which has been operated within the European Reference Frame network (EUREF, Bruyninx, 2004) since June 2008. The GPS instrument is the property of the Spanish National Geographic Institute (in Spanish: Instituto Geográfico Nacional, IGN), which provides us with 15-min mean ZTD values. They are calculated by applying the Bernese software (Rothacher, 1992). We separate the ZHD and ZWD (the quantity of interest). The ZHD is calculated with the actual surface pressure at Izaña. The ZHD is typically one order of magnitude larger than the ZWD, and consequently precise measurements of surface pressure are essential for a ZWD determination. The ZWD is then converted to PWV using the refraction constants of water vapour (for more details please refer to Romero et al., 2009).
Ground-based GPS measurements offer good global coverage (International GNSS Service (IGS) network, Dow et al., 2005) and can provide a valuable dataset for climate research. The main error sources are ZTD uncertainties (due to receiver noise, multipath and antenna phase delays, satellite orbit errors, ionospheric corrections, elevation cutoff angles, etc.) and surface pressure uncertainties. At Izaña, we apply the actual pressure measured by a highly precise manometer (SETRA 470) close to the GPS receiver, and we can therefore neglect the surface pressure uncertainty. The total PWV random error is then estimated to be 0.7 mm (Wang et al., 2007).
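The chain from ZTD to PWV described in this subsection can be illustrated as follows. This is a sketch under stated assumptions and not necessarily the processing applied at Izaña: the Saastamoinen-type expression for the ZHD, the refraction constants (values commonly quoted in the GPS meteorology literature), the weighted mean temperature of 270 K and all function names are assumptions introduced here.

```python
import math

def zenith_hydrostatic_delay(p_hpa, lat_deg, height_km):
    """ZHD in metres from surface pressure (hPa), Saastamoinen-type form."""
    f = (1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
         - 0.00028 * height_km)
    return 0.0022768 * p_hpa / f

def pwv_from_ztd(ztd_m, p_hpa, lat_deg, height_km, tm_k=270.0):
    """PWV in mm from ZTD (m) and surface pressure (hPa).

    tm_k is an assumed weighted mean temperature of the atmosphere.
    """
    zwd_m = ztd_m - zenith_hydrostatic_delay(p_hpa, lat_deg, height_km)
    k2_prime = 0.221   # K/Pa   (22.1 K/hPa, assumed literature value)
    k3 = 3739.0        # K^2/Pa (3.739e5 K^2/hPa, assumed literature value)
    rho_w = 1000.0     # density of liquid water, kg/m^3
    r_v = 461.5        # specific gas constant of water vapour, J/(kg K)
    # Dimensionless conversion factor between ZWD and PWV (about 0.15).
    pi_factor = 1.0e6 / (rho_w * r_v * (k3 / tm_k + k2_prime))
    return 1000.0 * pi_factor * zwd_m

# Illustrative values for a high mountain site (hypothetical numbers).
zhd = zenith_hydrostatic_delay(770.0, 28.3, 2.37)
pwv = pwv_from_ztd(zhd + 0.020, 770.0, 28.3, 2.37)
pwv_biased = pwv_from_ztd(zhd + 0.020, 771.0, 28.3, 2.37)
```

With these assumed constants, a wet delay of 20 mm corresponds to roughly 3 mm of PWV, and a pressure error of only 1 hPa shifts the retrieved PWV by roughly 0.35 mm, which illustrates why a precise surface pressure measurement matters at a dry site.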

Meteorological radiosonde (Vaisala RS92)
On Tenerife Island, meteorological radiosondes have been launched twice daily (at 11:15 UT and 23:15 UT) since the 1970s, from a site situated at the coastline, approximately 15 km to the south of Izaña (WMO station #60018). Until June 2005 the Vaisala RS80 radiosonde was employed as the operational radiosonde. Since then, the Vaisala RS92 sondes have been used. We corrected the temperature and radiation dependence (in the case of daytime soundings) of the RS92 sensor as suggested by Vömel et al. (2007), which does not consider the importance of solar elevation angle or clouds when calculating the radiation correction. Miloshevich et al. (2009) performed an extensive empirical error study for the RS92 sensor. When applying an ultimate correction strategy, they estimated a precision of 5% for the PWV and for the lower and middle tropospheric mixing ratios. In the upper troposphere and for very dry conditions, it is poorer (about 10-20%). The main error sources are sensor manufacturing variability (Turner et al., 2003) and improper operating procedures. For dry conditions or at higher altitudes, the effects of clouds on the radiation correction and roundoff errors (in the standard RS92 processing relative humidity is reported as an integer) become important.
The Vaisala RS92 radiosonde is used at many sites throughout the globe within WMO's upper air meteorological network. The RS92 humidity data are an important input for weather forecast models. Furthermore, they are often used for research and for the validation of ground- and space-based remote sensing techniques.

Assessment of PWV data quality

The dataset
We compare the data measured since 2005, when the last major changes to Izaña's water vapour instrumentation took place: in January 2005 the Bruker FTS 120M was replaced by a Bruker 120/5HR and in June 2005 the Vaisala RS92 sonde replaced the RS80 as the operational radiosonde. Figure 3 depicts the PWV time series as measured by the FTIR instrument between 2005 and 2009. It documents the typical high variability of atmospheric water vapour amounts. On dry days, the water vapour column is close to 0.3 mm and on wet days it can reach 30 mm, i.e., it spans two orders of magnitude.
Table 2 documents the data availability for the different experiments between 2005 and 2009. The FTIR instrument measures on about 3 days per week and for the analysed period there are 845 water vapour observations available. The Cimel has measured continuously since 2005 with the exception of the period from April to September 2008, when there are only level 1.0 (not cloud screened) data in the database. The MFRSR measured continuously during the four years considered. There are only some short periods without data in 2005. The Cimel and MFRSR produce water vapour data with a high temporal resolution whenever the line between the instruments and the sun is cloud-free (Cimel data are produced every 10 min and MFRSR data every minute), which explains the large number of available measurements. The GPS receiver was installed in spring 2008 and provides data from mid-July 2008. The software is configured to estimate ZTD and, thus, PWV data every 15 min. Finally, the RS92 sonde has been Tenerife's operational meteorological sonde since June 2005 and provides data twice daily (00:00 and 12:00 UT).

Coincidence criteria
When comparing different measurements, we must ensure that the same airmasses are detected. This is particularly important for atmospheric water vapour due to its high temporal and spatial variability. Figure 4 depicts the 1σ standard deviation of the difference (the scatter) between FTIR and Cimel PWV data as a function of the coincidence interval. If we compare each FTIR measurement with all Cimel measurements taken within an interval of 8 h, we observe a scatter of 23%, which strongly decreases when reducing the coincidence interval. Apparently most variability takes place on time scales larger than 1 h, and we choose 1 h as the temporal coincidence criterion for the comparisons. The definition of this coincidence criterion is straightforward when comparing the remote-sensing measurements of Cimel, MFRSR, GPS and FTIR, since their measurements take only some seconds (Cimel, MFRSR) or less than 15 min (GPS, FTIR). When a measurement of instrument X coincides with several measurements of instrument Y within 1 h, we use exclusively the coincidence with the minimal time difference. Thereby, each measurement is only compared once and all the pairs of coincident measurements are fully independent. Concerning the radiosonde measurements, the definition of temporal coincidence is more difficult since a radiosonde measurement takes approximately one hour (the time the sonde needs to travel between Izaña and 15 km altitude). In this case, we take the time when the sonde reaches 4 km as a reference time for the 1 h temporal coincidence criterion, since the layer between the altitude of Izaña (2.37 km) and the altitude of 4 km typically contains 50% of all the PWV above Izaña.
Spatial coincidence is no problem for FTIR, Cimel and MFRSR. All these techniques observe the airmass between the sun and the instruments, which are located at Izaña within a radius of less than 50 m. However, the GPS measures the water vapour amount between the receiver and a set of satellites, and the radiosonde measures the amount in situ at its location. Both instruments detect different airmasses than the FTIR, Cimel and MFRSR. This aspect has to be considered when discussing the comparisons with the GPS and RS92.
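The temporal pairing rule (closest coincidence within 1 h, each measurement used only once) could be implemented along the following lines. The greedy strategy, the function name and the interpretation of "within 1 h" as an absolute time difference of at most 3600 s are assumptions of this sketch; the text does not specify how ties or competing candidates are resolved.

```python
def match_coincidences(times_x, times_y, max_dt=3600.0):
    """Pair measurements of X and Y by minimal time difference.

    times_x, times_y: measurement times in seconds.
    Returns a sorted list of index pairs (i, j); every index appears
    in at most one pair, so the pairs are mutually independent.
    """
    # Collect every candidate pair that satisfies the coincidence window.
    candidates = []
    for i, tx in enumerate(times_x):
        for j, ty in enumerate(times_y):
            dt = abs(tx - ty)
            if dt <= max_dt:
                candidates.append((dt, i, j))

    # Accept the closest pairs first and skip measurements already used.
    candidates.sort()
    used_x, used_y, pairs = set(), set(), []
    for dt, i, j in candidates:
        if i not in used_x and j not in used_y:
            used_x.add(i)
            used_y.add(j)
            pairs.append((i, j))
    return sorted(pairs)

# Hypothetical example: two X measurements, four Y measurements.
pairs = match_coincidences([0.0, 10000.0], [-600.0, 300.0, 9000.0, 20000.0])
```

The brute-force search over all pairs is quadratic in the number of measurements, which is acceptable for an illustration; for the minute-resolved MFRSR series a sorted-array search would be the more practical choice.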

Empirical error quantification
In this subsection, we give an overview of the mean difference and the 1σ standard deviation of the differences (the scatter) between the measurement techniques. Figure 5 shows the correlations for all the data that fulfil the 1 h coincidence criterion. We choose a logarithmic scale due to the large variability of the water vapour amounts. The total water vapour amounts span two orders of magnitude and can be approximated by a log-normal frequency distribution; consequently a presentation on a logarithmic scale is more appropriate than a presentation on a linear scale. A linear scale presentation would give too much weight to the rarely occurring large water vapour amounts, whereas a log-scale presentation adequately reveals how the different techniques compare given the huge dynamic range in atmospheric water vapour amount. Cimel, MFRSR and GPS measure at intervals of 1 to 15 min, which explains the large number of coincidences for comparisons which involve these data (although the GPS has only been operating since July 2008). Generally the data of the different sensors correlate quite well. The correlation coefficient ρ is above 0.92 (with the exception of the GPS versus Cimel correlation where ρ is 0.845). For all instruments, the correlation is best with the FTIR data. Whenever FTIR data are involved, the respective correlation coefficient ρ is above 0.95. Among the correlations that do not involve FTIR data, only the correlations between Cimel and MFRSR (both are very similar techniques) and between Cimel and RS92 lead to a coefficient ρ above 0.95. The relatively poor correlation between GPS and Cimel can be explained by the prevailing dry conditions during the coincidence period (autumn and winter: October 2008 - January 2009). Under dry conditions, the GPS data are known to be less precise (Wang et al., 2007).
Table 3 gives the mean and 1σ standard deviation for the differences between the experiments, i.e., it reveals biases and scatter between the different measurement techniques. The values are given in absolute water amount (in mm) and in percent. The lowest scatter is achieved when FTIR data are involved. The scatter between FTIR and Cimel of 13% is very close to Cimel's estimated precision of 10% (see Table 1). We observe very large systematic differences relative to the MFRSR data: the MFRSR overestimates the PWV compared with all other experiments. A bias is also observed for the Cimel data, but of an opposite sign: Cimel systematically underestimates the PWV compared with the other experiments. This huge systematic difference between Cimel and MFRSR (62%) is very surprising since the techniques are very similar (observation of the slant optical depth in the 940 nm region). It suggests that during the calibration procedures of the MFRSR and Cimel filter radiometers, different radiative transfer models and/or spectroscopic parameters are applied or that there are errors in the assumed filter characteristics. It seems that the water vapour data obtained by filter radiometer techniques are very sensitive to the calibration procedures involved. The systematic differences between GPS and FTIR, RS92 and FTIR, and the GPS and RS92 data are rather small. Given the large number of coincidences (more than 100), this observation is very robust evidence of good agreement between the water vapour scales of these three techniques. When calculating the mean and standard deviation, as collected in Table 3, we use all available coincidences of the two experiments that are compared. This strategy assures maximal validity of each comparison, but it means that the different comparisons are not representative of the same atmospheric conditions. For instance, the 112 FTIR-GPS coincidences represent rather dry conditions, whereas a lot of the 17 951 Cimel-MFRSR coincidences are for partly cloudy sky, i.e., more humid conditions. In order to overcome this [...] can be interpreted as the root-square-sum of the uncertainties of all three experiments (RS92, Cimel and 2 × FTIR). It is very close to (even lower than) the root-square-sum of the uncertainty of the two experiments Cimel and RS92 of ±20.7%, indicating that the uncertainty in the FTIR data must be very small. Table 4 allows for the conclusion that the precision of the FTIR PWV data is in the percent range and much better than the precision of the other experiments. The FTIR data can serve as a reference for an empirical estimation of the precision of the other techniques.
We would like to remark that our study documents the quality of temporally highly resolved data (1 min in the case of the MFRSR, 10 min in the case of the Cimel and the FTIR, and 15 min in the case of the GPS). We do not average the data over longer time periods. This has to be considered when comparing our results to other studies, which occasionally analyse hourly or daily mean data.
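The bias and scatter statistics discussed in this subsection, and the root-square-sum argument used to constrain the FTIR precision, can be illustrated with a short sketch. It assumes that relative differences are expressed symmetrically as 2(Y − X)/(X + Y), the form used for the PWV differences in Fig. 6, and that the scatter is the sample standard deviation; the function names and all numbers in the example are hypothetical and are not results of this study.

```python
import numpy as np

def relative_difference_percent(x, y):
    """Symmetric relative difference 2 (Y - X) / (X + Y), in percent."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return 100.0 * 2.0 * (y - x) / (x + y)

def bias_and_scatter(x, y):
    """Mean (bias) and 1-sigma sample standard deviation (scatter)."""
    d = relative_difference_percent(x, y)
    return float(d.mean()), float(d.std(ddof=1))

def root_sum_square(*sigmas):
    """Combined uncertainty of independent error contributions."""
    return float(np.sqrt(np.sum(np.square(sigmas))))

# Hypothetical precisions (percent): two less precise sensors, and the
# same pair combined with two contributions from a very precise sensor.
two_sensors = root_sum_square(10.0, 15.0)
with_reference = root_sum_square(10.0, 15.0, 1.0, 1.0)
```

In this made-up example the combined value barely changes when the precise reference enters twice, which is the logic of the argument above: if the combined scatter of two comparisons against a common reference is no larger than the scatter expected from the two other instruments alone, the reference can contribute only a small uncertainty.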

Empirical error characterisation
In this subsection, we examine in detail the differences between the experiments and thereby provide an empirical error characterisation of the different techniques. We examine whether the data quality depends on the atmospheric conditions (dry or humid) and on the observation geometry, and document the long-term stability of the data quality. In the case of FTIR, Cimel, and MFRSR data, we examine whether their limitation to clear sky observations introduces a bias in the dataset and in the case of GPS and RS92 we analyse whether there are differences between day- and night-time measurements.

Observation geometry
The observation geometry may be important for the FTIR, Cimel and MFRSR experiments, which measure direct sunlight. The actual solar elevation may affect the quality of these measurements. This aspect is examined by taking the RS92 and GPS data as reference (both RS92 and GPS are independent of the solar elevation angle). The left panels of Fig. 6 document that Cimel − RS92, Cimel − GPS, FTIR − RS92, and FTIR − GPS do not significantly depend on the solar elevation angle, demonstrating that the quality of Cimel and FTIR data is independent of the observation geometry. In contrast, when referencing the MFRSR data to RS92, FTIR and Cimel, we observe a significant dependency on the solar elevation angle, suggesting inconsistencies in the MFRSR data: for high solar elevation angles the measured PWV is about 40% larger than for low solar elevation angles. Such a dependency is typical for errors in the Langley calibration method.

Atmospheric conditions
The right panels of Fig. 6 examine the dependency on the actual atmospheric water vapour content. The right top panel documents that the difference between FTIR and RS92 does not significantly depend on PWV and suggests that the quality of both experiments is consistent for low and high PWV. The situation is different for the Cimel instrument (right second panel from the top), which, for low PWV, increasingly underestimates the FTIR and RS92 data. Furthermore, the scatter between Cimel and the other experiments is larger for smaller PWV. For PWV above 7 mm, the scatter between Cimel and FTIR data reduces to ±6.7% compared to ±12.7% as listed in Table 3 for the whole ensemble. For the MFRSR and the GPS experiments, we make similar observations: increasing underestimation and more scatter at small PWV. The scatter between MFRSR and FTIR is ±17.2% for the whole ensemble (see Table 3) but reduces to ±11.0% if we limit to PWV above 7 mm. For the GPS data, this PWV dependency is very pronounced. For PWV smaller than 3-4 mm, the GPS strongly underestimates the PWV if compared to the other experiments and the scatter increases significantly. We can consider a PWV of 3.5 mm as the detection limit of the GPS experiment. Such an increased relative uncertainty of the GPS PWV is in agreement with Wang et al. (2007). For low water vapour amounts, the ZTD is almost completely due to the ZHD. Therefore, small relative errors in these amounts produce a large relative error in their difference, i.e., in the ZWD and consequently in the retrieved PWV. The dry conditions at Izaña provide a very demanding test of the sensitivity of the GPS technique.

Fig. 6. Characterisation of PWV differences ($2 \times \frac{Y - X}{X + Y}$), with X: RS92 (open blue squares), FTIR (solid black squares), and Cimel (red crosses), and with Y: FTIR (top panels), Cimel (second row of panels), MFRSR (third row of panels), GPS (bottom panels). Left panels: difference versus solar elevation angle; right panels: difference versus PWV of Y.

Temporal stability
Figure 7 depicts the time series of the differences between the FTIR data and the data measured simultaneously by the other experiments. This plot documents well the long-term stability of the different techniques. There is no long-term trend in the differences. However, concerning the Cimel, we observe some steps in the time series: for instance, in May 2005 the typical difference with respect to the FTIR changes abruptly from −7% to −25%, and during some weeks in March and April 2006 the difference is −8%, whereas before and after that period it is about −30%. We think that these steps are produced by changes in the calibration parameters on those dates. Concerning the MFRSR, we observe a clear annual cycle in the difference relative to the FTIR. The difference is especially large and positive (MFRSR overestimates FTIR values) in summer, and close to zero in winter. This can be explained by the solar elevation angle dependency of the MFRSR's PWV data (see Fig. 6) and indicates errors involved in the calibration procedure.

Clear sky bias
The FTIR, Cimel and MFRSR only provide water vapour data if the line between the instrument and the Sun is cloud free. It seems likely that this restriction introduces a dry bias in the datasets. Such a potential clear sky bias is an important drawback of visible and infrared water vapour remote sensing techniques (e.g., Lanzante and Gahrs, 2000). Gaffen and Elliot (1993) estimated the clear sky dry bias from a set of radiosonde observations performed in the period 1988-1990 at 15 different Northern Hemispheric sites. They found a significant dry bias, which strongly depends on latitude. It reaches +50% at high latitudes, whereas it is below +10% for the tropics. They defined the dry bias B as:

$B = 1 - \frac{\overline{\mathrm{PWV}_a}^{-1}}{\overline{\mathrm{PWV}_c}^{-1}} = 1 - \frac{\overline{\mathrm{PWV}_c}}{\overline{\mathrm{PWV}_a}}$ (1)

Here the overbar indicates mean values, PWV_a are all PWV values and PWV_c those obtained at clear sky conditions.
We derive the clear sky bias (B) from the RS92 measurements, which are available for cloudy and clear sky conditions. PWV_c are the PWV values measured by the RS92 when it coincides with an FTIR, Cimel or MFRSR measurement and PWV_a are all RS92 PWV measurements. The B values for each instrument are listed in Table 5. DJF, MAM, JJA and SON represent ensembles for winter (December, January, and February), spring (March, April, and May), summer (June, July, and August), and autumn (September, October, and November), respectively. The row "year" shows all-season values. The different ensembles are sufficiently large for a reliable estimation of B (the smallest ensemble is the DJF FTIR ensemble, which consists of 33 RS92 observations). The Cimel and, in particular, the FTIR PWV data have a significant clear sky dry bias. It is larger in winter than in summer. This is in good agreement with the latitudinal dependence as observed by Gaffen and Elliot (1993), since in winter Izaña's atmosphere has mid-latitudinal and in summer subtropical characteristics.

[Figure caption fragment: ... (X+FTIR)), where X is Cimel, MFRSR, GPS and RS92, as given in the panels, respectively.]
A seasonality is also observed in the MFRSR B values. However, the MFRSR clear sky bias is not significant. There is a dry bias in winter, but in summer the MFRSR PWV data are wet-biased. In this context, it is important to mention that the clear sky bias is exclusively produced by the atmospheric conditions that are prevailing when performing the measurements. It is not correlated with deficits in the FTIR, Cimel or MFRSR experiments but with the atmospheric conditions that are required to conduct the respective experiment (or with deficits in the RS92 experiment used for deriving the clear sky bias, see Eq. 1). We think that the atmospheric conditions that are prevailing for MFRSR observations are slightly different from the conditions required for FTIR and Cimel observations: the high aerosol loadings in summer, which are correlated with a particularly dry atmosphere, are filtered out by the MFRSR cloud or post-processing data screening, and disregarding these dry days counterbalances the clear sky dry bias.
Like the RS92, the GPS instrument also measures under cloudy and clear sky conditions. The FTIR, Cimel and MFRSR PWV clear sky bias derived from GPS measurements is similar to the bias derived from RS92 measurements. However, it is less reliable, since the GPS analysis is only possible for an eight-month period.

Night-day differences
Both RS92 and GPS measure during the day and night, and we examine the daytime bias of these instruments (defined as $1 - \overline{\mathrm{PWV}_{\mathrm{day}}} / \overline{\mathrm{PWV}_{\mathrm{day+night}}}$). There is a significant day-night difference in the RS92 data. If applying a temperature correction but no radiation correction to the RS92 data, we observe the known daytime dry bias of about +4%. After applying a temperature and radiation correction (as suggested by Vömel et al. (2007)), we get a daytime wet bias of about −3%, which is mainly caused by an excessive radiation correction: the correction of Vömel et al. (2007) was determined for a tropical site. At Izaña, the RS92 radiation correction should be weaker due to generally lower solar elevation angles than at tropical sites (Miloshevich et al., 2009). We observe no significant night-day differences in the GPS data.

Assessment of water vapour profile quality
We compare water vapour profiles measured routinely at the Izaña Observatory by two different techniques: the Vaisala RS92 in situ sensor and the ground-based FTIR system. The latter technique only provides reasonable water vapour profiles if the developments of the IMK-ASF water vapour analysis algorithm are applied (Schneider and Hase, 2009).
Atmospheric profiles remotely sensed by the ground-based FTIR technique offer, compared to in situ measurements, a limited vertical resolution. The vertical structures that are detectable are documented by the averaging kernels. A typical set of FTIR averaging kernels for water vapour when applying the IMK-ASF inversion algorithm is shown in Fig. 8. The kernels are for the logarithm of the volume mixing ratios (VMR) since the variability of ln(VMR) is similar throughout the troposphere, allowing straightforward interpretation of ln(VMR) kernels. On the contrary, VMR kernels would be difficult to interpret since the VMR variability decreases over several orders of magnitude from the lower to the upper troposphere. The FTIR system is able to detect 2 km thick layers in the lower troposphere, 3-4 km layers in the middle troposphere and 6 km layers in the upper troposphere. The averaging kernels for 3, 5, and 8 km (representative for the lower, middle and upper troposphere) are highlighted in red, blue and green, respectively. The sum along the rows of the averaging kernel matrix documents the sensitivity of the remote-sensing system (thick black line). It is almost optimal (close to unity) throughout the whole troposphere, which means that the FTIR system is well able to detect the atmospheric variability between the surface and an altitude of about 10 km, where the sensitivity starts to decrease.
When comparing the FTIR profiles with the in situ RS92 profiles, it is important to account for the inherent vertical resolution of the FTIR data. For an adequate comparison, we have to adjust the vertical resolution of the vertically highly resolved data to that of the vertically poorly resolved data. Therefore, we convolve the vertically highly resolved RS92 profiles ($x_{\mathrm{RS92}}$) with the FTIR averaging kernels $\hat{A}$:

$$\hat{x}_{\mathrm{RS92}} = \hat{A}\,(x_{\mathrm{RS92}} - x_a) + x_a \qquad (2)$$

The result is a smoothed RS92 profile ($\hat{x}_{\mathrm{RS92}}$) with the same vertical resolution as the FTIR profile ($x_a$ in Eq. 2 stands for the a priori climatological mean profile).
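The smoothing step can be sketched in a few lines of Python. The sketch assumes the standard averaging-kernel form, smoothed = x_a + A (x − x_a), applied to ln(VMR) profiles that have already been interpolated to the FTIR retrieval grid. The 3-level kernel matrix and the profile values are toy numbers chosen for illustration, not retrieval output, and the function names are hypothetical.

```python
import math

def smooth_with_kernel(kernel, x_highres, x_apriori):
    """Return x_a + A (x - x_a); the rows of `kernel` are averaging kernels."""
    n = len(x_apriori)
    delta = [x_highres[j] - x_apriori[j] for j in range(n)]
    return [
        x_apriori[i] + sum(kernel[i][j] * delta[j] for j in range(n))
        for i in range(n)
    ]

def sensitivity(kernel):
    """Row sums of the averaging kernel matrix (close to 1 = well constrained)."""
    return [sum(row) for row in kernel]

# Toy 3-level example on a hypothetical grid; profiles are ln(VMR in ppm).
A = [[0.6, 0.4, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.4, 0.6]]
x_a = [math.log(v) for v in (4000.0, 1000.0, 200.0)]    # a priori profile
x_sonde = [math.log(v) for v in (8000.0, 1000.0, 200.0)]  # sonde, lowest level doubled

x_smoothed = smooth_with_kernel(A, x_sonde, x_a)
print([round(math.exp(v)) for v in x_smoothed])  # back to ppm for inspection
print(sensitivity(A))
```

In this toy case the anomaly at the lowest level is partly spread into the level above, which is the loss of vertical resolution that the comparison has to account for.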
In the following, we compare FTIR and RS92 mixing ratios measured at altitudes of 3 km, 5 km and 8 km (representing the lower, middle and upper troposphere, respectively). These are the altitudes whose typical averaging kernels are highlighted in Fig. 8. As reference for the 1 h coincidence criterion, we take the time when the sonde reaches the altitude of 3, 5 and 8 km, assuring that the temporal coincidence criterion is of similar stringency for all altitudes.
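A coincidence search of this kind could look like the sketch below. It assumes that, for each sonde, the time at which it reaches the target altitude is already known, and it pairs that time with the closest FTIR measurement if the gap is at most 1 h. Choosing the closest measurement (rather than, say, all measurements within the window) is an assumption of this sketch, and the timestamps are invented.

```python
from datetime import datetime, timedelta

def match_coincidences(ftir_times, sonde_times, max_gap=timedelta(hours=1)):
    """Pair each sonde time with the closest FTIR time within max_gap.

    sonde_times: times at which each sonde reaches the target altitude.
    Returns a list of (ftir_time, sonde_time) tuples.
    """
    pairs = []
    for ts in sonde_times:
        closest = min(ftir_times, key=lambda tf: abs(tf - ts), default=None)
        if closest is not None and abs(closest - ts) <= max_gap:
            pairs.append((closest, ts))
    return pairs

# Hypothetical timestamps (UT), for illustration only.
ftir_times = [datetime(2007, 6, 1, 11, 0), datetime(2007, 6, 1, 14, 0)]
sonde_at_3km = [datetime(2007, 6, 1, 11, 40), datetime(2007, 6, 1, 17, 0)]

matched = match_coincidences(ftir_times, sonde_at_3km)
print(len(matched), "coincidence(s)")
```

Running the search separately with the arrival times at 3, 5 and 8 km gives slightly different ensembles per altitude, which is consistent with the differing numbers of coincidences reported for each level.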
Figure 9 depicts the time series of data that fulfill the 1 h coincidence criterion for an altitude of 3 km (198 coincidences). The upper panels show the water vapour mixing ratios as measured by the FTIR and the bottom panels the relative differences between FTIR and RS92. The bottom right panel shows a correlation plot between FTIR and RS92 data. In the lower troposphere, the mixing ratios measured in coincidence vary between 250 ppm and 12000 ppm, i.e., cover almost two orders of magnitude and are well representative of the huge atmospheric water vapour variability. As a mean, the FTIR overestimates the RS92 values by 21.8%. The scatter between FTIR and RS92 is 28.7%. The 199 coincident measurements of the mixing ratios of middle tropospheric water vapour (Fig. 10) vary between 100 ppm and 6000 ppm. The mean difference and scatter between the FTIR and RS92 data is −15.4 ± 22.3%. Compared to the lower troposphere, the scatter is reduced by more than 6 percentage points. The scatter is partly due to the detection of different airmasses (the RS92 detects the airmass at the sonde's location and the FTIR the airmass between the spectrometer and the Sun). We think that the reduced scatter reflects the larger stability of the middle tropospheric water vapour fields compared to the more variable lower tropospheric fields.
www.atmos-meas-tech.net/3/323/2010/ Atmos. Meas. Tech., 3, 323-338, 2010
In the upper troposphere (Fig. 11) the mixing ratios within the ensemble of the 194 coincident measurements vary between 40 ppm and 1200 ppm. The mean difference and scatter is −3.1 ± 19.7%. The scatter is further reduced, compared to the lower and middle troposphere, which indicates a further reduction of the temporal and spatial water vapour variability at these altitudes.
The agreement between the RS92 and FTIR profiles is very satisfactory. We think that the higher scatter between FTIR and RS92 at lower altitudes can be explained by the detection of different airmasses and an increased spatial and temporal variability in the lower troposphere, suggesting that the combined FTIR and RS92 errors are very likely smaller than 20% throughout the troposphere. This value is in good agreement with the estimated RS92 precision of 5% in the lower and middle troposphere and 20% in the upper troposphere. We conclude that the FTIR technique offers precise tropospheric water vapour profiles (a precision of better than 15%), with a vertical resolution of 2, 4 and 6 km in the lower, middle and upper troposphere, respectively. Furthermore, we observed no trend in the difference between FTIR and RS92, which documents the feasibility of the techniques for studying the long-term evolution of the vertical distribution of tropospheric water vapour.
The RS92 measurements allow derivation of vertical profiles of the FTIR's clear sky bias (defined by Eq. 1). It is depicted in Fig. 12. In particular, in summer, there is a slight maximum bias around 8 km. Except for winter, the clear sky bias decreases rapidly above 10 km. Generally it is rather small above 12 km, indicating that clouds do not significantly affect the humidity at these altitude levels.
We also estimate the night-day differences of the RS92 profile measurements. Without radiation corrections, we observe the known altitude dependence of the radiation dry bias of +3% in the lower troposphere and +10% at 10 km. As mentioned in Sect. 3.4, a radiation correction with the Vömel et al. (2007) formula is excessive: it produces a wet bias of −2% in the lower and −9% in the upper troposphere.

Conclusions
We present an extensive long-term intercomparison of five different upper-air water vapour measurement techniques: FTIR, Cimel, MFRSR, GPS and RS92. All five techniques are able to measure PWV. Our empirical PWV quality assessment reveals the following (see also Table 6):
- FTIR: It is the most precise technique (precision of about 1%) and shows no significant dependency on observation geometry and atmospheric conditions. We can use it as a reference when assessing the accuracy of the other techniques; however, we have to be aware of the FTIR's significant clear sky bias.
- Cimel and MFRSR: The precision of the filter radiometer techniques depends on the atmospheric conditions (dry or humid). For PWV > 7 mm it is 7% (for the Cimel) and 11% (for the MFRSR), whereas under very dry conditions (PWV ≤ 2 mm) it is only about 25%. Furthermore, there is a tendency towards an increased underestimation of the PWV. In addition, the bias in the MFRSR data depends on the solar elevation angle: at high solar elevation the PWV is about 40% larger than at low solar elevation. Although Cimel and MFRSR are based on the same measurement principle, we observe a large systematic difference between both (of 62%), suggesting that the filter radiometer technique is very sensitive to the calibration procedure. Furthermore, the Cimel data are significantly clear sky biased.
- GPS: For PWV > 3.5 mm, it has a precision of better than 10% and a very small bias with respect to the FTIR data. We can define a PWV of 3.5 mm as the GPS's detection limit, since for PWV < 3.5 mm the precision is relatively poor (about 20%). Furthermore, for PWV < 3.5 mm, the GPS systematically underestimates the atmospheric water vapour content. Due to the small bias between GPS and FTIR, a combined sensor would be a very promising development. It could provide high quality data for cloudy as well as extremely dry conditions and during day and night.
- RS92: The quality of the radiosonde data is independent of atmospheric conditions. From the comparison to the FTIR, we estimate the RS92's PWV precision to be better than 15%, which is, however, a rather conservative estimate, since RS92 and FTIR detect different airmasses.
- Long-term stability: We analyse FTIR, Cimel, MFRSR and RS92 data for a four-year period (2005-2009) and we observe no significant long-term trend in the biases, which indicates long-term validity of our results. However, Cimel and MFRSR biases show abrupt changes and annual cycles, revealing a strong sensitivity to their respective calibration procedures, an issue which has to be kept in mind when applying these data for climate research.
- Confirmation of theoretical studies: Our results are in good agreement with theoretical error estimations (listed in Table 1).
In addition to PWV, the FTIR and RS92 experiments measure water vapour profiles between the Research Centre and an altitude of approx. 15 km. Their comparison documents that both techniques provide data of good quality (the precision is empirically estimated to be better than 15% for the lower, middle and upper troposphere). However, radiosondes have only used the RS92 humidity sensor since 2004/2005, and extending the time series with historic radiosonde measurements made with different sensor types might degrade the consistency of the dataset. A highly consistent radiosonde time series is restricted to a few years only, which limits its use for climate change studies. The ground-based FTIR measurements, on the other hand, have been performed within the NDACC for up to two decades and with the same instrument type. Reprocessing these historic measurements by applying recent inversion algorithm developments would produce a consistent long-term dataset of lower to upper tropospheric water vapour amounts with a vertical resolution of 2 to 6 km, respectively. These data would allow long-term studies of the middle/upper tropospheric water vapour amounts. Furthermore, the FTIR data can be used to document the long-term consistency of radiosonde measurements and detect abrupt changes, e.g., when changing the radiosonde's sensor type. Combining the radiosonde and the FTIR technique may allow the production of a consistent long-term dataset of vertically highly resolved tropospheric water vapour profiles.
The FTIR provides very precise tropospheric water vapour data. However, depending on the application of the data, other experiments may be of more interest. For instance, when area-wide coverage and real-time data availability are important, the GPS and the RS92 data are more appropriate, since the ground-based FTIR measurements are only performed at about 25 globally distributed sites and the data are not available in real-time. Furthermore, it is important to be aware of the FTIR's significant clear sky dry bias.

Fig. 1. Upper panel: Spectrum measured by the FTIR with the 700-1350 cm$^{-1}$ filter setting and an integration time of 8 min. Bottom panels: Zoomed-in spectral microwindows containing H$_2$O and HDO signatures. The spectrum was recorded on 25 July 2005 at 11h30 UT (local noon is at 13h10 UT), with 0.005 cm$^{-1}$ spectral resolution, for 47° solar elevation, and 4.5 mm PWV.

Fig. 2. Histogram for the values of as determined from all MFRSR PWV data of 2005-2009, which is used for data postprocessing. We define the data with > $10^{-1.6}$ as not reliable.

Fig. 4. 1σ standard deviation of the PWV differences (scatter) between Cimel and FTIR as a function of temporal coincidence. Blue stars and left y-axis for Cimel−FTIR, black squares and right y-axis for $2 \times \frac{\mathrm{Cimel} - \mathrm{FTIR}}{\mathrm{Cimel} + \mathrm{FTIR}}$. Indicated are the numbers of Cimel-FTIR coincidences.

Fig. 5. Correlation of PWV measured by FTIR, Cimel, MFRSR, GPS and Vaisala RS92.The number of coincidences N and the correlation coefficients ρ are given in each panel.The blue line is the diagonal and the red dotted line is the linear regression line.

Fig. 7. Time series of the difference between FTIR and the other techniques ($2 \times \frac{X - \mathrm{FTIR}}{X + \mathrm{FTIR}}$), where X is Cimel, MFRSR, GPS and RS92, as given in the panels, respectively.

Fig. 8. Typical averaging kernels for ground-based FTIR remote sensing of water vapour. The kernels for 3, 5, and 8 km are highlighted in red, green, and blue, respectively. The sensitivity ($\sum_{\mathrm{row}}$) is depicted as a thick black line.

Fig. 12. Vertical profiles of the FTIR clear sky bias B, for winter (black line), spring (red line), summer (green line), autumn (blue line), and all seasons (thick grey line).

Table 2 .
Availability of PWV data from Izaña's FTIR, Cimel, MFRSR, GPS, and RS92 experiments.Covered period, duration of a single measurement, measurement frequency, and total number of available measurements between 2005 and 2009.

Table 3 .
Results of intercomparison of different sensors: Number of coincidences (N ), mean difference and standard deviation of difference in mm and % (2

Table 4 .
Same as Table 3 but for the 101 occasions on which all four experiments (FTIR, Cimel, MFRSR and RS92) coincide within 1 h. Table 4 collects the mean and 1σ standard deviation of the differences. The values are very similar to Table 3: the smallest scatter is found when FTIR data are involved, large bias between Cimel and MFRSR, etc. FTIR, RS92, and Cimel/MFRSR are rather different measurement techniques and their errors should be uncorrelated. We can use the scatter values of Table 4 to estimate the techniques' precision: the root-square-sum of the scatter FTIR versus RS92 and FTIR versus Cimel (

Table 5 .
Clear sky bias in PWV of FTIR, Cimel and MFRSR observations determined from RS92 measurements (expressed as

Table 6 .
Empirical estimation of the PWV data precision (GPS and RS92 values are conservative estimates since both experiments detect different airmasses than the FTIR experiment).