Retrieval of tropospheric water vapour by using spectra of a 22 GHz radiometer

Abstract. In this paper, we present an approach to retrieve tropospheric water vapour profiles from pressure-broadened emission spectra at 22 GHz, measured by a ground-based microwave radiometer installed in the south of Bern at 905 m. Classical microwave instruments concentrating on the troposphere observe several channels in the center and the wings of the water vapour line (20–30 GHz), whereas our retrieval approach uses spectra with a bandwidth of 1 GHz and a high resolution around the center of the 22 GHz water vapour line. The retrieval is sensitive up to 7 km with a vertical resolution of 3–5 km. Comparisons with profiles from operational balloon soundings, performed at Payerne, 40 km away from the radiometer location, showed good agreement up to 7 km with a correlation above 0.8. The retrievals show a wet bias of 10–20 % compared to the soundings.


Introduction
Water vapour, as the most important natural greenhouse gas, plays a key role in the global radiation budget. The strong temperature dependence of the saturation vapour pressure according to the Clausius-Clapeyron equation (up to +7 % per Kelvin) leads to a strong positive feedback effect of tropospheric water vapour on global warming (e.g. Schneider et al., 2010, and references therein). It is therefore essential to understand the effects of increasing water vapour on circulations on all scales, which first of all requires extending the knowledge about the actual humidity distribution and circulation processes (Sherwood et al., 2010).
Correspondence to: R. Bleisch (rene.bleisch@iap.unibe.ch)
This makes the observation of the distribution of water vapour in the atmosphere all the more important. Microwave radiometry offers the opportunity of continuous observations over a large altitude range from the surface to the mesosphere (with a gap in the UT/LS range) and under almost all conditions, except during precipitation (e.g. Westwater, 1993).
The pressure-broadened 22 GHz water vapour emission line (a simulated spectrum is shown in Fig. 1) is most suitable for ground-based remote sensing of water vapour profiles and content. Classical microwave instruments concentrating on the troposphere usually observe in several channels between 20 and 30 GHz at the center and the wings of the water vapour line (e.g. Peter and Kämpfer, 1992; Solheim and Godwin, 1998; Crewell et al., 2001; Ware et al., 2003; Rose et al., 2005).
Further, there exist several radiometers designed to retrieve middle atmospheric water vapour, which measure at high frequency resolution in a narrow range around the line center (e.g. Nedoluha et al., 1999; Forkman et al., 2003; Deuber et al., 2004; De Wachter et al., 2011; Straub et al., 2010). Earlier work showed that measurements of such radiometers can also be used to retrieve tropospheric water vapour profiles; this was supported by simulated retrievals and by retrievals with real data from the MIAWARA radiometer, operated by the Institute of Applied Physics, University of Bern.
This paper takes up this approach and presents extended evaluations and analyses of retrievals of tropospheric water vapour profiles using the three years of data now available from the MIAWARA instrument.
Published by Copernicus Publications on behalf of the European Geosciences Union.

Fig. 1. Simulated absorption coefficients between 10 and 40 GHz at 850 hPa for water vapour and the other major contributors in this frequency range, which are oxygen (mainly the wing of the strong 60 GHz line) and cloud liquid water (CLW). The shaded area marks the bandwidth of the MIAWARA radiometer (from ).

Fig. 2. The MIAWARA instrument on the roof of the ExWi building: the front-end of the instrument basically consists of a rotatable mirror, which reflects the radiation into the horn antenna. A reference absorber is mounted next to the mirror as hot load for the tipping calibration, and a reference absorber bar is attached on top of the instrument for the balancing calibration. Further, a pan for the liquid nitrogen is included.

Sections 2 and 3 describe the instrument and the retrieval algorithm. Section 4 contains an analysis of the averaging kernel matrix, which is a central quantity of a retrieval, relating the true to the retrieved profile. Further, the question of what information we gain from the retrieval is addressed. Finally, comparisons with balloon sounding data and with other remote sensing instruments are presented.

Instrument and calibration
The Middle Atmosphere Water Vapour Radiometer MIAWARA (Fig. 2) was designed by the Institute of Applied Physics, University of Bern, to observe middle atmospheric water vapour (Deuber et al., 2004, 2005a). MIAWARA is an NDACC instrument and has been in operation since 2002. It was first located on the roof of the ExWi building (University of Bern); in 2007 it was moved to the Zimmerwald facility in the south of Bern (905 m a.s.l.). It is now equipped with an ACQIRIS-FFT spectrometer, having 16385 channels covering a 1 GHz band around the center frequency of 22.235 GHz. Calibration is done by a combination of balancing calibration (instrument properties) and tipping calibration (influence of the troposphere). The balancing calibration uses the reference load and the sky at ∼20° elevation. This delivers difference spectra (line spectra minus reference spectra), as used for the middle atmosphere retrieval.
Further, tipping calibrations (according to Han and Westwater, 2000) are performed at regular intervals (1 h until autumn 2007, 30 min until April 2010 and 15 min since then). For the tipping curves, the instrument observes under six different angles: four line angles with elevations between 30° and 50°, a reference absorber at ambient temperature as hot load and the sky at 60° elevation as cold load (see also Fig. 2). To detect instrumental drifts, calibrations with liquid nitrogen are performed monthly.
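The principle behind a tipping curve can be illustrated with a short sketch. It is not the calibration scheme of Han and Westwater (2000) as implemented for MIAWARA; it only assumes a plane-parallel, isothermal atmosphere with an effective mean temperature `t_mean`, a cosmic background of 2.7 K, and already calibrated brightness temperatures. All numerical values and function names are illustrative assumptions.

```python
import numpy as np

def opacity_from_tb(tb, t_mean, t_bg=2.7):
    """Slant opacity from brightness temperature (isothermal atmosphere assumed)."""
    return np.log((t_mean - t_bg) / (t_mean - tb))

def zenith_opacity(elev_deg, tb, t_mean, t_bg=2.7):
    """Least-squares slope through the origin of slant opacity versus air mass."""
    airmass = 1.0 / np.sin(np.radians(elev_deg))  # plane-parallel approximation
    tau = opacity_from_tb(np.asarray(tb), t_mean, t_bg)
    return float(np.sum(airmass * tau) / np.sum(airmass ** 2))

# Synthetic check: simulate brightness temperatures from a known zenith opacity.
elev = np.array([30.0, 36.0, 43.0, 50.0, 60.0])  # illustrative elevation angles
tau_true, t_mean = 0.08, 270.0                   # assumed values, not from the paper
slant = tau_true / np.sin(np.radians(elev))
tb = 2.7 * np.exp(-slant) + t_mean * (1.0 - np.exp(-slant))
tau_fit = zenith_opacity(elev, tb, t_mean)
```

In a real tipping calibration the brightness temperatures themselves depend on the calibration, so the fit is repeated iteratively until the opacity is consistent across all angles.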

The tropospheric retrieval algorithm
Retrievals of water vapour profiles from the 22 GHz emission line are possible because the linewidth is altitude dependent due to pressure broadening (shown in Fig. 3).
For the operational middle atmosphere retrieval (whose results go into the NDACC database), only 300 MHz around the line center are used, as the information about the middle atmosphere is mainly in the center of the line. The spectral resolution used is very high (610 kHz, or 61 kHz at the center) to retrieve information from the mesosphere. This setup leads to a lower altitude limit of 20 km due to the bandwidth, and an upper limit of 80 km due to the spectral resolution and the influence of Doppler broadening (Fig. 3). However, instrumental artifacts currently restrict the altitude range accessible for middle atmospheric retrievals to approx. 30 km to 75 km.
As the linewidth increases rapidly to 1 GHz and above from the tropopause towards the surface (Fig. 3), the entire bandwidth of 1 GHz (or theoretically even more) is needed to retrieve the troposphere, but the spectral resolution can be reduced to achieve a low noise level (20 MHz are used).
In our retrieval approach, the measured spectra are assumed to be a combination of contributions from water vapour, liquid water and other species (O2, CO2, N2). The contributions of O2, CO2 and N2 are part of the forward modelling and are removed before the retrieval itself.
The liquid water contribution is modelled as a first-order polynomial, and its offset and slope are retrieved together with the water vapour profile. This differs from other approaches, e.g. the middle atmospheric retrieval described in Nedoluha et al. (2011), where the tropospheric a priori profile is modelled using the opacity from tipping calibrations. Here, not only the opacity but total power spectra are needed, which contain all available information about the troposphere.
Total power spectra are delivered only by the tipping calibration, where the voltages measured by MIAWARA are converted into absolute brightness temperatures using tipping curves. Hence, the temporal resolution is limited to one spectrum every 15 min. Usually 8 to 16 spectra are averaged (corresponding to 2-4 h), as a small bandwidth is used and the instrument performs the tipping curves only with second priority. Further, the natural variability of water vapour in the altitude range of the highest sensitivity of our retrieval (∼4-5 km) is not as large as in the lower troposphere. Fig. 4 shows the spectrum and residuals for an example case (2 h averaged spectrum).
The water vapour volume mixing ratio (VMR) profile is retrieved from the spectra according to the optimal estimation method (Rodgers, 2000) using the software packages ARTS and QPack.
The pressure-broadened spectrum ($y$) is a function of the profile of the species to be retrieved ($x$); the parameter $b$ accounts for the remaining information about the atmospheric state that could influence the measured spectrum, and $\varepsilon$ is the measurement noise (Eq. 1).

$$y = F(x, b) + \varepsilon \qquad (1)$$

The radiative transfer is modelled with the PWR98 model (Rosenkranz, 1998), embedded in QPack. Pressure and temperature are taken from the ECMWF reanalysis, combined with surface data from the Zimmerwald meteo station. The inversion of the forward model $F(x)$ is not unique; to constrain the solution to realistic profiles, a priori knowledge is used. As $F(x)$ is nonlinear in the troposphere, the solution is searched iteratively using the Levenberg-Marquardt approach (Levenberg, 1944; Marquardt, 1963):

$$x_{i+1} = x_i + \left[(1+\gamma)\,S_x^{-1} + K_x^T S_e^{-1} K_x\right]^{-1} \left\{K_x^T S_e^{-1}\,[y - F(x_i, b)] - S_x^{-1}\,[x_i - x_a]\right\} \qquad (2)$$

where $x_i$ is the state vector at iteration $i$, $x_a$ the a priori state vector, $y$ the measured spectrum, $F(x_i, b)$ the spectrum calculated with the forward model, $S_e$ the error covariance matrix, $S_x$ the a priori covariance matrix, $K_x$ the weighting function matrix evaluated at $x_i$ and $\gamma$ a trade-off parameter.
The iteration is initiated by setting $x_a$ as the state vector and continued until the cost function ($\chi^2$), derived from Bayes' probability theorem, is minimized:

$$\chi^2 = [y - F(x, b)]^T S_e^{-1}\,[y - F(x, b)] + [x - x_a]^T S_x^{-1}\,[x - x_a] \qquad (3)$$

A new state $x_{i+1}$ is accepted only if it lowers the cost function. Otherwise, $x_{i+1}$ is rejected and the iteration is retried with an increased $\gamma$. To save computing time, the number of iterations is limited to 10. As a priori guess ($x_a$), a monthly climatology of the balloon soundings launched at Thun (20 km from the instrument location) is used; below 650 hPa it is interpolated linearly to the actual surface value from the Zimmerwald meteo station. For the a priori covariance $S_x$, Gaussian statistics are assumed with a correlation length of 2 km (approx. one scale height of water vapour) and a standard deviation of 10 % at the surface, increasing to 80 % at 500 hPa and staying constant above.
In a first retrieval step, the measurement covariance $S_e$ is calculated assuming a standard deviation of 0.01 K; for the second step, this value is replaced by the standard deviation of the difference between the forward model spectrum $F(x)$ and the measured spectrum $y$.
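As an illustration of how an a priori covariance with these properties could be assembled, the following sketch builds a matrix from a relative standard deviation profile and a Gaussian correlation function. Several choices are assumptions and not taken from the text: the standard deviation is interpolated linearly in altitude, the 500 hPa level is placed at 5.5 km, the correlation is written as exp(-(dz/l)^2) (other conventions for the correlation length exist), and the a priori profile is a made-up exponential.

```python
import numpy as np

def apriori_covariance(z_km, x_a, z_sfc=0.9, z_500=5.5, corr_len=2.0):
    """A priori covariance from relative std (10 % to 80 %) and Gaussian correlation."""
    # np.interp holds the end values constant outside [z_sfc, z_500].
    rel_std = np.interp(z_km, [z_sfc, z_500], [0.10, 0.80])
    sigma = rel_std * x_a                       # absolute standard deviation
    dz = z_km[:, None] - z_km[None, :]          # pairwise level separations
    corr = np.exp(-(dz / corr_len) ** 2)        # assumed Gaussian form
    return sigma[:, None] * corr * sigma[None, :]

z = np.arange(1.0, 11.0, 1.0)                   # hypothetical 1 km retrieval grid
x_a = 1e-2 * np.exp(-(z - 1.0) / 2.0)           # made-up a priori VMR profile
S_x = apriori_covariance(z, x_a)
```

The diagonal of `S_x` holds the squared absolute standard deviations, while the off-diagonal elements couple levels that are closer together than a few correlation lengths.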
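The iterative scheme described in this section can be sketched in a few lines. This is a generic Levenberg-Marquardt optimal estimation loop in the form given by Rodgers (2000), not the QPack implementation: the forward model is a toy function, the factor of 10 used to adapt the trade-off parameter is an assumption, and all variable names are hypothetical. Only the limit of 10 iterations follows the text.

```python
import numpy as np

def lm_retrieve(y, forward, jacobian, x_a, S_x, S_e, gamma=1.0, max_iter=10):
    """Levenberg-Marquardt optimal estimation; returns (state, final cost)."""
    Sx_inv, Se_inv = np.linalg.inv(S_x), np.linalg.inv(S_e)

    def cost(x):
        dy, dx = y - forward(x), x - x_a
        return dy @ Se_inv @ dy + dx @ Sx_inv @ dx

    x, chi2 = x_a.copy(), cost(x_a)
    for _ in range(max_iter):
        K = jacobian(x)
        lhs = (1.0 + gamma) * Sx_inv + K.T @ Se_inv @ K
        rhs = K.T @ Se_inv @ (y - forward(x)) - Sx_inv @ (x - x_a)
        x_new = x + np.linalg.solve(lhs, rhs)
        chi2_new = cost(x_new)
        if chi2_new < chi2:   # accept the step and relax the damping
            x, chi2, gamma = x_new, chi2_new, gamma / 10.0
        else:                 # reject the step and increase the damping
            gamma *= 10.0
    return x, chi2

# Toy nonlinear forward model (assumption): saturating emission in 4 channels.
W = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.0, 0.3],
              [0.0, 0.3, 1.0],
              [0.5, 0.5, 0.5]])
forward = lambda x: W @ (1.0 - np.exp(-x))
jacobian = lambda x: W * np.exp(-x)[None, :]   # dF_j/dx_k = W_jk * exp(-x_k)

x_true = np.array([0.5, 0.3, 0.1])
x_a = np.array([0.3, 0.3, 0.3])
S_x = np.diag(np.full(3, 0.2 ** 2))
S_e = np.diag(np.full(4, 0.01 ** 2))
y = forward(x_true)                            # noise-free synthetic measurement
x_hat, chi2 = lm_retrieve(y, forward, jacobian, x_a, S_x, S_e)
```

Because the synthetic measurement is noise-free and the assumed measurement error is small, the retrieved state ends up close to `x_true`; with noisier data the a priori term would pull the solution further towards `x_a`.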

Limitations from the averaging kernel
The averaging kernel matrix $A$ characterises the response of the retrieved VMR profile ($\hat{x}$) to a perturbation in the "true" profile ($x$) and accounts for the sensitivity and limited vertical resolution of a retrieval (Rodgers, 2000):

$$A = \frac{\partial \hat{x}}{\partial x} = D_y K_x \qquad (4)$$

where $D_y = \partial \hat{x} / \partial y$ is the contribution function matrix and $K_x = \partial F / \partial x$ the weighting function matrix. The rows of $A$ are usually just called "averaging kernels" (AVK). An example is shown in Fig. 6. The area of an AVK is a measure of the sensitivity of the retrieval to perturbations at a certain altitude. In the literature, this quantity is often called measurement response or measurement contribution, in contrast to the a priori contribution, indicating the contributions of measurement and a priori to the retrieved profile at a certain altitude.
For the tropospheric retrieval in particular, however, this concept falls short because the measurement response exceeds 100 % at middle tropospheric altitudes (as in the example in Fig. 6). Mathematically, the measurement response is not limited to 100 %. A possible interpretation of this effect is an oversensitivity of the retrieval to perturbations in the specific altitude range. Herein, the less stringent term sensitivity will be used instead, and enhanced sensitivity is assumed for values above 0.8 (80 %). For the tropospheric retrieval, this is true between about 3 and 7 km a.s.l.
Further, the full width at half maximum of the AVKs is an indicator of the smoothing and the real vertical resolution of the retrieved information, which has to be distinguished from the resolution of the retrieval altitude grid. The vertical resolution of the retrieval degrades with altitude from 2-3 km in the lower troposphere to 4-5 km in the upper troposphere (retrieval grid resolution ∼1 km in our case).
The peaks of the AVKs indicate how information from "true" altitude levels is attributed to the altitude levels of the retrieval. In the tropospheric retrieval, the attribution deviates significantly from the ideal case (Fig. 7): information from "true" levels is mostly attributed to retrieval levels that are too low.
In summary, from the point of view of the averaging kernel matrix, the tropospheric retrieval is sensitive to perturbations from 2 to 7 km, but over-sensitive to middle altitudes (4-6 km). The vertical resolution is coarse and the information from the measurement is attributed to too low altitudes in the retrieved profile.
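The three diagnostics used in this section (sensitivity, vertical resolution and peak altitude of the averaging kernels) can be computed from the averaging kernel matrix as sketched below. The example uses the linear optimal estimation expression for the contribution function matrix and a made-up weighting function matrix; the grid, the covariances and the crude grid-resolution estimate of the full width at half maximum are all assumptions for illustration.

```python
import numpy as np

def gain_matrix(K, S_x, S_e):
    """Contribution function matrix D_y of a linear optimal estimation retrieval."""
    Se_inv = np.linalg.inv(S_e)
    return np.linalg.solve(K.T @ Se_inv @ K + np.linalg.inv(S_x), K.T @ Se_inv)

def avk_diagnostics(A, z_km):
    """Row sum (sensitivity), peak altitude and crude FWHM of each averaging kernel."""
    sensitivity = A.sum(axis=1)
    peak_alt = z_km[np.argmax(A, axis=1)]
    fwhm = np.full(len(z_km), np.nan)
    for i, row in enumerate(A):
        if row.max() <= 0.0:
            continue                              # no meaningful peak in this row
        above = z_km[row >= 0.5 * row.max()]      # levels at or above half maximum
        fwhm[i] = above.max() - above.min()       # extent at grid resolution only
    return sensitivity, peak_alt, fwhm

z = np.arange(1.0, 11.0)                          # hypothetical 1 km grid
scale = np.linspace(1.0, 8.0, 20)                 # made-up channel scale heights
K = np.exp(-z[None, :] / scale[:, None])          # toy weighting functions (20 x 10)
S_x = np.diag(np.full(10, 0.5 ** 2))
S_e = np.diag(np.full(20, 0.01 ** 2))
A = gain_matrix(K, S_x, S_e) @ K
sensitivity, peak_alt, fwhm = avk_diagnostics(A, z)
```

Comparing `peak_alt` with `z` shows whether information is attributed to the correct level, and rows with a large `fwhm` indicate altitudes where the retrieval smooths strongly.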

Error budget of the retrieval
The main components of the total random error of a retrieval method are the observation error, caused by the measurement noise and uncertainty, and the smoothing error, caused by the vertical smoothing of the retrieval method. The covariances of the observation error ($S_{obs}$) and the smoothing error ($S_{smo}$) can be estimated as follows:

$$S_{obs} = D_y S_e D_y^T \qquad (5)$$

$$S_{smo} = (A - I)\,S_x\,(A - I)^T \qquad (6)$$

where $I$ is the identity matrix. The absolute errors are then derived as the square roots of the diagonals of $S_{obs}$ and $S_{smo}$, respectively.
In the case of the tropospheric retrieval, the total error is dominated by the smoothing error. The smoothing error increases from 10 % at the surface to 80 % in the upper troposphere, whereas the observation error stays between 0 and 10 % throughout the entire troposphere (Fig. 8). The total error lies significantly below the natural variability (dashed line in Fig. 8), except in the upper troposphere where both are in the same range.
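A minimal numerical sketch of this error budget for a linear retrieval is given below. The weighting function matrix is random and the covariances are invented, so the numbers carry no physical meaning; the point is only to show how the observation and smoothing error covariances follow from the contribution function matrix and the averaging kernel matrix, and that in the linear case their sum equals the covariance of the retrieved state.

```python
import numpy as np

def error_budget(K, S_x, S_e):
    """Observation and smoothing error covariances of a linear retrieval."""
    Se_inv = np.linalg.inv(S_e)
    S_hat = np.linalg.inv(K.T @ Se_inv @ K + np.linalg.inv(S_x))  # retrieval covariance
    D = S_hat @ K.T @ Se_inv          # contribution function matrix
    A = D @ K                         # averaging kernel matrix
    I = np.eye(A.shape[0])
    S_obs = D @ S_e @ D.T
    S_smo = (A - I) @ S_x @ (A - I).T
    return S_obs, S_smo, S_hat

rng = np.random.default_rng(0)
K = rng.normal(size=(6, 4))                    # toy weighting functions (assumption)
S_x = np.diag([0.1, 0.2, 0.4, 0.8]) ** 2       # a priori variances, growing with level
S_e = 0.05 ** 2 * np.eye(6)                    # uncorrelated measurement noise
S_obs, S_smo, S_hat = error_budget(K, S_x, S_e)
err_obs = np.sqrt(np.diag(S_obs))              # absolute observation error per level
err_smo = np.sqrt(np.diag(S_smo))              # absolute smoothing error per level
```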

Information content of spectrum and retrieval
How much additional information do we really get from the retrieval? This is probably the most important question concerning a specific retrieval algorithm. In a pessimistic scenario, the retrieved profile could mainly be an exponential profile, based on the surface value of humidity and scaled to the measured opacity. In other terms, there would be only one independent piece of information in the retrieved profile.

Test: what information do we gain from the retrieval?
To investigate this crucial question, a test was performed using retrieved profiles from April to September 2010 (2 h averaged spectra). For each retrieved profile, a profile was modelled in the following way: a first-guess profile of water vapour density ($\rho_w$) was calculated from surface measurements (from the Zimmerwald meteo station) after Eq. (7) (exponential decrease with height):

$$\rho_w(z) = \rho_{w,0}\,\exp\left(-\frac{z - z_0}{H}\right) \qquad (7)$$

where $\rho_{w,0}$ is the water vapour density measured at the surface (altitude $z_0$) and $H$ the water vapour scale height.
Based on such an exponential humidity profile, the resulting opacity at 22 GHz was derived using the PWR98 model. The calculation was initiated assuming a water vapour scale height $H$ of 1000 m and continued with increasing scale height until the difference between calculated and measured opacity (2 h mean from the tipping calibration, averaged over the entire 1 GHz frequency band) was minimized. The final scale height (leading to the opacity closest to the measured value) varies between 1500 and 4500 m and is on average 2500 m. This value is close to a mean scale height calculated from IWV (derived from GPS wet delay) and relative humidity measured at the surface.
Comparing modelled with retrieved and sounding profiles (Fig. 9), both retrieved and modelled profiles follow the variations of the sonde data quite well, but the modelled profiles are mostly too wet, above all in the upper troposphere. At 4.8 km (highest sensitivity of the retrieval), the model delivers values that are on average about 30 % wetter than the soundings and 10-15 % wetter than the retrieval. The wet bias between retrieval and model is higher for cloudy/rainy conditions (high humidity near the surface), thus the resulting scale height is too high and the modelled profile too steep. Further, Fig. 10 shows that the correlations between retrieval and sounding and between model and sounding are nearly identical up to 4 km, most likely due to the influence of the surface humidity measurements, whereas above 4 km the correlation between model and sounding decreases rapidly, in contrast to the correlation between retrieval and sounding. We thus conclude that the retrieval delivers more than only the surface value scaled with opacity.

Figure caption (number not preserved): Time series of H2O VMR for different altitude levels. Red: retrieved from MIAWARA (2 h averaged spectra); blue: corresponding sounding data from Payerne (regridded to the MIAWARA grid); dashed: a priori guess. The data shown are restricted to cases with near time-coincident sounding and retrieval profiles (mostly two cases per day).
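The scale-height search described here can be sketched as a simple stepwise loop. The PWR98 radiative transfer calculation is replaced by a hypothetical stand-in, `toy_opacity`, which assumes that the opacity is linear in the integrated water vapour of an exponential profile; its coefficients `a` and `b`, the step size and the surface density are invented for illustration and are not values from this study.

```python
def toy_opacity(rho_sfc, H, a=0.01, b=0.006):
    """Hypothetical stand-in for the radiative transfer model: opacity linear in IWV."""
    iwv = rho_sfc * H   # kg m^-2: column of rho_sfc * exp(-z / H) above the surface
    return a + b * iwv

def fit_scale_height(rho_sfc, tau_meas, h_start=1000.0, h_step=10.0, h_max=10000.0):
    """Increase H from h_start until the opacity mismatch stops decreasing."""
    best_h = h_start
    best_diff = abs(toy_opacity(rho_sfc, h_start) - tau_meas)
    h = h_start + h_step
    while h <= h_max:
        diff = abs(toy_opacity(rho_sfc, h) - tau_meas)
        if diff >= best_diff:   # mismatch no longer shrinking: minimum passed
            break
        best_h, best_diff = h, diff
        h += h_step
    return best_h

rho_sfc = 0.008                               # surface vapour density in kg m^-3 (made up)
tau_meas = toy_opacity(rho_sfc, 2500.0)       # synthetic "measured" opacity
H_fit = fit_scale_height(rho_sfc, tau_meas)   # recovers the 2500 m used above
```

With a real radiative transfer model the loop stays the same; only `toy_opacity` would be replaced by the model call for the exponential profile.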

Theoretical concepts
After testing in practice, the question arises whether it is possible to determine the information content of a retrieval method from the algorithm itself. For this, Rodgers (2000) proposes several concepts, which are based on information theory. These concepts were derived for linear retrievals, but should in principle also be valid for slightly non-linear retrievals such as the tropospheric retrieval algorithm.
The degrees of freedom for signal give the number of useful independent quantities in the signal (here the measured spectrum). This quantity can be determined as the trace of the averaging kernel matrix $A$.
Another concept to obtain the number of independent quantities is the effective rank of the retrieval, which is the number of singular values of $S_e^{-1/2} K_x S_x^{1/2}$ greater than unity (Rodgers, 2000).
Further, $A$ can be used to determine the information content of the retrieval after Shannon and Weaver (1949) (Shannon information content, $H$), which gives the number of bits of information contained in the retrieval (Rodgers, 2000):

$$H = -\frac{1}{2}\log_2 |I - A| = -\frac{1}{2}\sum_i \log_2 (1 - \lambda_i)$$

where $I$ is the identity matrix and $\lambda_i$ are the eigenvalues of $A$. Table 2 shows a statistical overview of these parameters, calculated for tropospheric retrievals from March 2007 until now. It can be seen that, in general, the information content and the amount of independent information of the tropospheric retrieval are lower than those of the stratospheric retrieval. The degrees of freedom and the Shannon information content of the tropospheric retrieval show a high retrieval-to-retrieval variability and further a high correlation with the tropospheric opacity (r = 0.8), which leads to an annual cycle with its minimum in winter (Fig. 11). The more humidity is present (the higher the opacity), the more information the retrieval can extract from the spectrum. As the damping of the total signal also increases with opacity, the relation between opacity and information content has a maximum at a certain opacity, typically near ∼30 %, and an increase of the opacity above this value leads to a decrease in information content.
The Shannon information content of the retrieval lies between 1 and 4 bits, so the retrieval is able to deliver between 1 and 4 independent pieces of information about the atmospheric state. This corresponds quite well with the vertical resolution of 2-5 km, which also leads to at most 3-4 independent pieces of information throughout the troposphere.

Fig. 15. [...] and regridded to the retrieval grid using the Curtis-Godson approach (snd regr.). Further plotted are the a priori profiles (MIA a priori). The left plot (8 September 2010) shows a situation of good congruence. On the right plot (16 September 2010), the balloon sonde detects a very dry layer between 3 and 6 km. The retrieval, not able to resolve this feature, smooths strongly between the dry layer and the air above, which leads to far too high values below 6 km and too low values above.
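The three measures discussed in this section can be evaluated together from the singular values of the scaled weighting function matrix, as in the following sketch for a linear retrieval. Cholesky factors are used in place of symmetric matrix square roots, which leaves the singular values unchanged. The matrices are invented toy values and the function name is hypothetical.

```python
import numpy as np

def information_content(K, S_x, S_e):
    """Degrees of freedom, Shannon information (bits) and effective rank."""
    L_e = np.linalg.cholesky(S_e)               # S_e = L_e L_e^T
    L_x = np.linalg.cholesky(S_x)               # S_x = L_x L_x^T
    K_tilde = np.linalg.solve(L_e, K) @ L_x     # same singular values as S_e^-1/2 K S_x^1/2
    s = np.linalg.svd(K_tilde, compute_uv=False)
    lam = s ** 2 / (1.0 + s ** 2)               # non-zero eigenvalues of A
    dof = float(lam.sum())                      # equals trace(A)
    shannon_bits = float(-0.5 * np.sum(np.log2(1.0 - lam)))
    eff_rank = int(np.sum(s > 1.0))
    return dof, shannon_bits, eff_rank

rng = np.random.default_rng(1)
K = rng.normal(size=(6, 4))                     # toy weighting functions (assumption)
S_x = np.diag([0.1, 0.2, 0.4, 0.8]) ** 2
S_e = 0.5 ** 2 * np.eye(6)
dof, shannon_bits, eff_rank = information_content(K, S_x, S_e)

# Cross-check against the averaging kernel matrix computed directly.
Se_inv = np.linalg.inv(S_e)
A_direct = np.linalg.solve(K.T @ Se_inv @ K + np.linalg.inv(S_x), K.T @ Se_inv @ K)
```

Singular values below unity correspond to directions in state space where the measurement noise exceeds the a priori variability, which is why only values above unity count towards the effective rank.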

Comparison with in-situ measurements
The only available in-situ measurements of water vapour throughout the troposphere are balloon soundings. For the a priori profiles, balloon soundings launched in Thun were used. As those were performed only until autumn 2008, other comparison data had to be found. Payerne (40 km from the site) is the closest station where operational balloon soundings using SRS 400 sondes (with hygristors) are performed twice a day. The sounding profiles were corrected according to Miloshevich et al. (2009) to take into account the daytime radiation dry bias of sensors of this type. For the comparison, the profiles were regridded to the MIAWARA altitude grid using the Curtis-Godson approach (e.g. Andrews, 2010, p. 94ff), preserving the integrated water vapour amount. The comparison showed that the retrieval is able to reproduce the sonde data well up to 7 km (Figs. 12, 13 and 15, left), which corresponds well with the upper limit of enhanced sensitivity from the AVKs. The correlations are above 0.8 for this altitude range. The retrievals deviate from the soundings by 30 % to 40 % on average up to 8 km. The standard deviation increases from 20 % at the surface to more than 100 % in the upper troposphere (Fig. 14).
The quite high average deviation originates at least partly from the high sensitivity of the mean to outliers; the median, which is less sensitive to outliers, lies at only 10-20 % (Fig. 14).
Further, we have to bear in mind the limited vertical resolution of the retrieval, in contrast to the very high resolution of the soundings (see Fig. 15 and Table 1). The retrieval is certainly not able to reproduce small-scale variability as often seen in the soundings. An example of such a situation is shown in Fig. 15 (right), where the sonde detects a very dry layer between 3 and 6 km, whereas the retrieved profile stays close to the a priori. Obviously, such situations can lead to tremendous outliers in the percentage deviation (e.g. 479 % at 4.9 km in the example profile).
A major problem in this comparison is the lateral distance between the observation site and the sounding site (Payerne). In addition, the sonde moves in space, so its profile is in reality not measured over one single point, a problem which is relevant for most intercomparisons between balloon soundings and ground-based instruments (e.g. Sussmann et al., 2009; Vogelmann et al., 2011).
Simulations with a simple trajectory model based on wind velocity and direction measured during soundings showed that, due to the predominant SW to NE wind pattern in these parts of the Swiss plains, the sonde is often driven towards Zimmerwald observatory, but seldom passes the surroundings of the observatory before crossing the tropopause (Fig. 16). In summary, the retrievals are compared with air masses mostly more than 30 km away, which is quite a lot in the troposphere. Tests limiting the comparison to retrieval-sounding pairs for which the sonde was driven in the direction of Zimmerwald observatory show a significant decrease in standard deviation and a small decrease in bias (Fig. 14).
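The different behaviour of mean and median can be made concrete with a small numerical example. The values below are invented and only mimic the situation described in the text, where a single very dry sonde level produces a relative deviation of several hundred percent.

```python
import numpy as np

def relative_deviation_stats(retrieved, sonde):
    """Mean and median of the relative deviation (in percent) of retrieval from sonde."""
    rel = 100.0 * (retrieved - sonde) / sonde
    return float(rel.mean()), float(np.median(rel))

# Illustrative mixing ratios (arbitrary units); the fourth sonde value is very dry.
sonde = np.array([2.0, 1.5, 1.8, 0.1, 2.2])
retrieved = np.array([2.2, 1.7, 2.0, 0.6, 2.4])
mean_dev, median_dev = relative_deviation_stats(retrieved, sonde)
```

Four of the five pairs deviate by roughly 10 %, yet the single dry-layer case pulls the mean far above that, while the median remains near 11 %.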

Comparison with other instruments
No other instrument that provides humidity profile data is located near Zimmerwald observatory. The closest remote sensing instruments are situated in Payerne (40 km from the Zimmerwald site), where MeteoSwiss operates an RPG-HATPRO and a Raman lidar instrument (RALMO).

RPG-HATPRO and RALMO-Lidar at Payerne
In contrast to MIAWARA, which detects the peak of the 22 GHz line with high spectral resolution, RPG-HATPRO ("Humidity and Temperature profiler") is a multichannel microwave radiometer with several channels at around 20, 30 and 50 GHz. The RPG-HATPRO instrument was developed by Radiometer Physics GmbH and delivers tropospheric humidity profiles and columns (Rose et al., 2005), operating continuously and in principle under all weather conditions.
Raman lidar ("Light detection and ranging") spectroscopy uses the effect of vibrational Raman scattering of a laser beam by N2 and H2O molecules. Scattered radiation is measured by a collocated detector, from which the water vapour profile can be derived (e.g. Lazzarotto et al., 1998; Weitkamp, 2005). This technique has the potential to deliver profiles up to the lower stratosphere, but is heavily limited by cloud occurrence. Therefore the measurements are continuous, but with a strongly varying upper altitude limit and a lot of gaps.
Figure 17 shows time series of RPG-HATPRO and RALMO measurements compared with MIAWARA retrievals for four altitude levels. In the lower and middle troposphere, the temporal variations of MIAWARA and RPG-HATPRO are quite similar, but MIAWARA has a certain wet bias compared to both other instruments, which is more distinct in very humid situations. The measurements of the RALMO lidar also look similar at these altitudes, but under dry conditions the RPG-HATPRO instrument tends to have a wet bias.
In the upper troposphere, the RPG-HATPRO shows more variations than MIAWARA, but the difference is not as large as when comparing MIAWARA with balloon sounding data. The RALMO lidar often delivers much higher values than both radiometers when it reaches this altitude range.
In summary, all three instruments agree quite well at lower altitudes. At least a part of the differences between MIAWARA and RPG-HATPRO/RALMO likely originates from the large lateral distance between the sites. Further, there are substantial differences in vertical resolution (see Table 1 for an overview).

Further comparisons
Additionally, the MIAWARA retrievals were compared with ECMWF reanalysis data. This comparison suffers from the poor horizontal resolution of the reanalysis data (1.125°, corresponding to ∼120 km N-S and ∼80 km E-W for central Europe), which scarcely resolves the Alps. Thus the profiles sometimes correspond quite well and sometimes not at all, leading to no clear conclusion.
Further, there is an ongoing study comparing MIAWARA profiles with profiles from the Fourier Transform Infrared Spectrometer (FTIR), operated at Jungfraujoch observatory by the Institute of Astrophysics and Geophysics of the University of Liège. However, this comparison is difficult as well, because the Jungfraujoch observatory is located at 3580 m a.s.l. in a high-alpine environment, 60 km away from Zimmerwald. Nevertheless, the study could deliver some hints regarding the performance of the MIAWARA retrieval in the upper troposphere.

Clouds/liquid water
In the range of the 22 GHz line, the influence of cloud liquid water can be approximated by a linear term, as shown in Fig. 1. Thus, as already mentioned in Sect. 3, a polynomial of order 1 is included in the forward model and its offset and slope are part of the state vector of the retrieval. First retrieval simulations showed a good correlation between the retrieved slope and offset and the liquid water content. In the frame of investigations concerning the estimation of the integrated water vapour content (IWV) from the opacity measured by MIAWARA, comparisons with IWV derived from GPS wet delay showed a tendency to overestimate the IWV from the MIAWARA opacity when clouds are present. As the IWV is estimated from opacity using a more or less linear relationship (Deuber et al., 2005b), this is likely due to opacity measurements that are too high. This could in turn have consequences for the retrieval, mainly leading to an overestimation of the humidity over the entire profile. In principle, this should be handled by including the linear term in the retrieval. As clouds are very irregular in space and time, it is very difficult to estimate their influence on the retrieval. An IR sensor has recently been installed next to the MIAWARA horn antenna (looking in the same direction) as a kind of cloud detector, providing information that could later be used to optimize the retrieval.

Conclusions and outlook
We have shown that the tropospheric retrieval algorithm is able to deliver realistic tropospheric water vapour profiles up to 7 km with enhanced sensitivity between 3 and 7 km (maximum at 4-5 km), with a coarse vertical resolution of 3-5 km and a random error between 10 % at the surface and 80 % at 10 km.
The resulting profiles correspond well with balloon soundings up to 7 km, with an average deviation of 30-40 % (median 10-20 %) and a correlation of more than 0.8 in the middle troposphere.
These results are significantly influenced by the lateral distance between soundings and instrument location. A comparison with more closely collocated measurements would perhaps deliver better results.
Large vertical gradients in the true profile (e.g. layers of very dry air) lead to large differences between retrieval and in-situ measurements due to the coarse vertical resolution and strong vertical smoothing.
Further, the performance of the retrieval is strongly correlated with opacity and thus with the total water vapour amount (IWV). The information content reflects a balance: there must be sufficient water vapour to have a detectable effect on the signal, but not so much that the total signal is damped too strongly. The maximum of information content is probably reached at an opacity near 30 %.
The next step will be to combine the tropospheric MIAWARA retrieval with the operational middle atmospheric MIAWARA retrieval into an integrated retrieval approach.
A further goal would be to include spectra of other instruments or in-situ observations, leading to an integrated profiling technique as presented in Löhnert et al. (2004).