Introduction
Atmospheric water vapor is a key parameter of the climate system, and
understanding its variation under a changing climate relies on a thorough
documentation of its horizontal and vertical distributions. It is a major
greenhouse gas, part of a strong positive feedback that amplifies the
warming caused by increases of greenhouse gases in the atmosphere, and,
because of its short life cycle compared to other species, its distribution
is mainly influenced by natural processes occurring at all scales, from the
large-scale cells of the atmospheric circulation down to the scale of the
hydrometeor.
While direct measurements by radiosondes are the simplest way to examine
the vertical structure of the relative humidity (RH) field, the network of
stations (permanent or not) is unequally distributed between the two
hemispheres and there is a clear gap of data over the oceans. The climate
record built by aggregating the observations
from the various operational sensors used worldwide (e.g., Vaïsala,
MEISEI, IM-MK3, MODEM) requires regular intercomparison campaigns, such as
those organized by the World Meteorological Organization, and the
development of dedicated correction schemes in order to correct most of the
observational errors (such as the drying effect of radiative heating on the
Vaïsala sensor or the insensitivity of the MEISEI system under dry
conditions). Recent work has summarized the
systematic instrumental biases between several versions of the Vaïsala
system that, if uncorrected, would affect analyses of the global moisture
field. An alternative is the fleet of space-borne radiometers with channels
located in spectral bands sensitive to the absorption by water vapor. Such
instruments provide a more global sight of the distribution of the water
vapor field since the late 1970s, in the thermal infrared (IR) (in the 6.3 µm
band) and in the microwave (MW) domain (in the 183.31 GHz absorption
line). One can mention, among others, the successive imagers of METEOSAT
(Meteorological satellite, EUMETSAT) and of GOES (Geostationary Operational
Environmental Satellite, NOAA); the sounders HIRS (High resolution Infrared
Radiation Sounder, NOAA), AIRS (Atmospheric Infrared Sounder, NASA), IASI
(Infrared Atmospheric Sounding Interferometer, EUMETSAT and CNES) and CrIS
(Cross-track Infrared Sounder, NASA); and the microwave sounders AMSU-B (Advanced
Microwave Sounding Unit-B, NOAA), MHS (Microwave Humidity Sounder, EUMETSAT),
MWHS (Microwave Humidity Sounder, CMA) and ATMS (Advanced Technology
Microwave Sounder, NASA). One can browse the OSCAR web page (Observing
Systems Capability Analysis and Review tool) of the WMO (World Meteorological
Organization) for an exhaustive list of the past, current and planned
missions (http://www.wmo-sat.info/oscar/). However, these so-called “water
vapor” channels provide indirect estimations of the RH since they
measure the upwelling radiation. Estimations of the RH from these measurements
are thus strongly linked to the constraints of the underlying inverse problem
(RH = f(radiation)).
Upper tropospheric humidity (UTH) can be one way to interpret these “water
vapor” measurements. The retrieval of UTH was first developed for
observations in the 6.3 µm band and later successfully applied to
183.31 GHz measurements. The
logarithmic transformation of the BT into UTH is quite simple and elegant. It
relies on a large training data set that provides the parameters of the
transformation and on a precise definition of UTH: a mean RH value
vertically weighted by a dedicated function (a so-called sensitivity
function) that is related to the transmission of the atmosphere in the
spectral domain. The well-known drawback of this method is that the weighting
operator used to define the UTH has a width and altitude of peak that depend
on both the absorber amount and the temperature profile: the drier the
atmosphere (i.e., the higher the BTs), the thicker the layer and the lower
the altitude of the peak of maximum sensitivity. Therefore, there is no
pressure attribution of the area of the troposphere under consideration.
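For reference, the logarithmic transformation mentioned above is commonly written in the literature in a form like the following; the coefficients and the exact variant depend on the channel and the training data set, so this is a generic sketch rather than the specific parameterization used here:

```latex
\ln\!\left(\frac{\mathrm{UTH}}{\cos\theta}\right) = a + b\,T_\mathrm{b},
\qquad a > 0,\; b < 0,
```

where $T_\mathrm{b}$ is the measured BT, $\theta$ the viewing angle and $(a, b)$ regression coefficients fitted on the training data set; the negative slope $b$ expresses that a drier upper troposphere produces higher BTs.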
The first aim of this study is to perform an analysis of the
contribution of the two microwave instruments of the Megha-Tropiques mission,
operating since October 2011, for the retrieval of layer-averaged RH
profiles. The SAPHIR sounder and the MADRAS imager are both dedicated to
improving the documentation of the atmospheric water cycle. A previous
study showed the expected improvements for the estimation of the RH profiles
thanks to the combination of those two instruments, highlighting the gain of
information at both ends of the troposphere when a subset of the channels of
MADRAS is combined with SAPHIR measurements. Despite the short lifetime of
MADRAS, the availability
of a few months of measurements constitutes a test bed for future missions,
such as the Second Generation of the Meteorological Operational satellite
program (MetOp-SG, EUMETSAT Polar Satellite) planned for launch in 2020.
Indeed, the Microwave Sounder (MWS) and the Microwave Imager (MWI) of
MetOp-SG have channels very close to those of SAPHIR and MADRAS.
The second objective is to demonstrate the potential of purely statistical
methods in the following problem: given a set of brightness temperatures
(BTs) provided by a space-borne radiometer, what is the vertical distribution
of RH and what are the expected limits of such an approach? Many retrieval
approaches exist; however, to our knowledge, few of them estimate the RH
profile from a simple input data set restricted to the BTs. Indeed, most of
the approaches are physically based iterative techniques, such as an
n-dimensional variational algorithm that converges to the least biased
profile using other inputs as prior knowledge of the system under study
(such as surface
emissivity, temperature profile and sometimes a prior water vapor profile for
BT simulations). These variational techniques are well established
and it would be unnecessary to reinvent a similar algorithm.
Here, the selected approach is to learn the relationship between the inputs
(i.e., the BTs) and the output (i.e., the averaged RH in a specific atmospheric
layer) directly from a training set that implicitly contains all the relevant
information such as the statistical distribution of the atmospheric RH or the
radiative transfer equation from the set of BTs. We chose not to discuss the
a priori constraints that could improve the retrieval, nor the choice of a
relaxation scheme.
The current operational retrieval (version 6, released in 2013) of water
vapor profiles (layer and level products) from the instruments of the Aqua
mission (namely AIRS, AMSU and HSB (Humidity Sounder for Brazil, INPE))
differs from these approaches: a stochastic approach combined with a neural
network defines the first guesses of clear-air temperature and humidity
profiles, instead of the climatology used in the previous version. Above
118 hPa, the water
vapor profiles are filled with a climatology from the ECMWF IFS model
(European Center for Medium-Range Weather Forecasts/Integrated Forecasting
System). The final profiles are obtained from a physically based iterative
procedure that adjusts the transmittance of the radiative transfer model.
This algorithm requires either both IR and MW measurements (AIRS + AMSU) or
IR-only measurements (AIRS) and forecast surface pressures, which are taken
from the ECMWF forecasts. This last version is currently under evaluation,
but first performance analyses using radiosonde measurements as a reference
show an improvement of the estimation of water vapor in the mid and lower
troposphere that is related to the new definition of the first guesses and to
the cloud-clearing methodology .
The retrieval technique is based on the
Generalized Additive Model (hereafter GAM) and its ability
to model multi-variate and non-linear relationships. The choice of GAM over
other retrieval techniques is somewhat subjective, so to ensure that the
main patterns are independent of the choice of the statistical model, a
comparison against two other models is done. We consider two other machine
learning regression methods based on different design algorithms and
different learning techniques: a multi-layer perceptron (MLP), which is a
neural network, and a least squares support vector machine (LS-SVM), which
is a kernel method. The MLP is generally considered a reference because it
is the most common approach to develop non-parametric and non-linear
regression in various application domains. MLPs have been successfully
applied in remote sensing applications, with or without prior information.
The second is the LS-SVM, which belongs to the family of kernel methods.
LS-SVMs are models with high generalization capabilities, and numerous
analyses involving real data in other areas have shown that SVM-based
techniques are comparable in efficiency to MLPs.
The data at hand and the context of the work are described in
Sect. 2. The three non-linear models, GAM, MLP and LS-SVM, and their design
for the study are detailed in Sect. 3. Section 4 is dedicated to the
evaluation of the estimations over a realistic data set in order to have a
large evaluation sample. The application to Megha-Tropiques measurements
is discussed in Sect. 5 with a comparison to radiosonde measurements.
Section 6 finally draws a conclusion on the study and discusses the ongoing
work.
Description of the non-linear models
General aspects
To ensure the consistency between the mathematical descriptions of the three
statistical models, the notation will be as follows: the estimation of the
RHi of a specific layer i (i∈[1;7]), namely the output, is
performed from a vector of BTs, the inputs, which is a p-dimensional
covariate noted BT (p∈[1;15]). Thus, for each layer i the
training data set is made of (p+1)-tuples {(BTk, RHi,k)}, k = 1, …, N,
where the cardinality N of the set is 16 310 (1631 profiles × 10 noisy
reproductions).
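As an illustration of how such a training set can be assembled, the sketch below builds the N = 16 310 tuples from synthetic data; the BT ranges and the 0.5 K noise level are placeholder values for the example, not the instrumental specifications:

```python
import numpy as np

rng = np.random.default_rng(0)

n_profiles, n_channels, n_noisy = 1631, 15, 10
bt_clean = rng.uniform(240.0, 290.0, size=(n_profiles, n_channels))  # synthetic clean BTs (K)
rh = rng.uniform(5.0, 95.0, size=(n_profiles, 7))                    # layer-averaged RH (%RH)

# Replicate each profile 10 times and add zero-mean Gaussian radiometric
# noise to the BTs (0.5 K is an illustrative, not the instrumental, value).
bt_train = np.repeat(bt_clean, n_noisy, axis=0)
bt_train += rng.normal(0.0, 0.5, size=bt_train.shape)
rh_train = np.repeat(rh, n_noisy, axis=0)

print(bt_train.shape)  # (16310, 15)
```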
The GAM, MLP and LS-SVM models are built with three different statistical
supervised learning techniques. Overall, the learning phase consists of using
a set of training examples to produce an inferred function. Each example is a
pair made of an input vector (BT) and a desired output value
(RHi of layer i), without other a priori information. The nonlinearity
between the input vector BT and the RHi is more or less strong
depending on the channel of observation and the atmospheric layer
. This is especially true for upper tropospheric
channels, as illustrated on Fig. for the BT of the 183.31±1.1 GHz channel of SAPHIR and the RH3 and RH4 (taken from the synthetic
base). Therefore the approach chosen is to adjust and optimize the
BT-to-RHi relationships separately for each of the seven layers.
The data set described in Sect. is randomly divided into two
subsets: a subset of 2/3 of the N samples (∼11 000 samples) is
dedicated to the training and to the validation of the models while the
remaining 1/3 forms the test set (∼5000 samples). Some parameters of
the three modeling methods have to be adjusted and the selected models are
those with the best generalization capabilities. These parameters are tuned
to minimize the validation error which is an empirical estimation of the
generalization error. The selection of the models thus relies on an
efficient validation method. Various validation techniques
exist in the literature . The most popular techniques are
probably the cross-validation method and the leave-one-out (LOO) technique,
which are implemented according to the modeling method. Note that since the
three modeling methods will be compared, we focus on efficient validation
techniques and pay less attention to the computational burden they involve.
The input vector BT is normalized (zero mean and unit variance).
While such normalization does not affect the estimation provided by GAM (but
only the relative weight of each predictor in the fit), the normalized input
data set is the same for all models in order to simplify the process. A principal component analysis (PCA) is also implemented on the BT
to feed each statistical model with uncorrelated and linearly independent
data. Indeed, the weighting functions of the six channels of SAPHIR slightly
overlap each other to cover the entire absorption line. As a result, while
each channel receives mainly the radiation emitted by a given layer of the
atmosphere, contributions from layers above and below are not negligible,
yielding some interdependencies between the channels. Finally, in order to
account for the known exponential relationship between the BT in the
183.31 GHz line and the atmospheric RH (for instance at 183.31±1.0 GHz),
the use of the exponential function is also considered, which also has the
advantage of ensuring the retrieval of positive values. The effects of the
PCA and of the exponential function have been evaluated for each statistical
model and each layer i. The configuration with the smallest validation
error was selected.
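A minimal sketch of this pre-processing chain (standardization followed by PCA) is given below; the data are synthetic and the implementation is illustrative, not the one used in the study:

```python
import numpy as np

def standardize_and_rotate(bt_train, bt_test):
    """Standardize BTs with training statistics, then rotate onto the
    principal components of the training set (a plain PCA)."""
    mean, std = bt_train.mean(axis=0), bt_train.std(axis=0)
    z_train = (bt_train - mean) / std
    z_test = (bt_test - mean) / std
    cov = np.cov(z_train, rowvar=False)
    _, vecs = np.linalg.eigh(cov)           # eigenvectors of the covariance
    return z_train @ vecs, z_test @ vecs

rng = np.random.default_rng(1)
bt = rng.normal(260.0, 10.0, size=(200, 6))   # synthetic 6-channel BTs
p_train, p_test = standardize_and_rotate(bt[:150], bt[150:])

# The rotated training predictors are uncorrelated:
c = np.cov(p_train, rowvar=False)
off_diag = c - np.diag(np.diag(c))
print(np.abs(off_diag).max() < 1e-8)  # True
```

The exponential link can then be applied on top of this, e.g. by regressing a log-transformed RH and exponentiating the prediction, which guarantees positive retrieved values.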
Generalized additive model
GAMs have recently started to be used in environmental studies as a surrogate
to traditional MLP thanks to their ability to model nonlinear behaviors while
providing a control of the physical content of the statistical relationships
. Among recent works, one can cite the use of GAM to perform statistical
downscaling of precipitation, to analyze time series and, more recently, to
solve inverse problems. A number of papers provide in-depth descriptions of
the GAM algorithm, including the background and implementation issues of
such a model. We provide here only a brief account of its main
characteristics. A GAM infers the possible nonlinear effect of a set of p
predictors (BT1, …, BTp) on the expectation of the predictand RHi. It is
expressed as follows:
$$ g\left(E(\widehat{RH}_i \mid \mathbf{BT})\right) = \epsilon_i + f_1(BT_1) + f_2(BT_2) + \ldots + f_p(BT_p), $$
where g is a linearizing link function between the expectation of
RH^i given BT and the additive predictors
fj(BTj), which are smooth and generally non-parametric functions of
the covariates BT1,…,BTp. Finally ϵi is the residual
that follows a normal distribution. Here, penalized regression cubic splines
are used as the smoothing functions and are estimated independently of the
other covariates using the “back-fitting algorithm”. Part of the
model-fitting process is to determine the appropriate degree of smoothness
of the regression splines, which is done through a penalty term in the model
likelihood, controlled by a smoothing parameter λ: λ determines the
trade-off between the goodness of fit of the model (λ→0 gives a wiggly
function) and its smoothness (λ→∞). λ is adjusted to minimize the
generalized cross-validation (GCV) score.
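To make the back-fitting idea concrete, the toy sketch below fits an additive model with a crude bin-average smoother standing in for the penalized cubic splines; it is purely illustrative (a real GAM implementation, e.g. R's mgcv, also handles the penalty term and the selection of λ):

```python
import numpy as np

def bin_smoother(x, r, n_bins=20):
    """Crude univariate smoother: average the partial residuals r in
    equal-width bins of x (stands in for the penalized cubic splines)."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    means = np.array([r[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return means[idx]

def backfit(X, y, n_iter=20):
    """Back-fitting for an additive model y ~ alpha + sum_j f_j(x_j)."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove the current fit of all other terms.
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = bin_smoother(X[:, j], partial)
            f[:, j] -= f[:, j].mean()       # identifiability constraint
    return alpha + f.sum(axis=1)

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.05, 2000)
fit = backfit(X, y)
print(np.mean((fit - y) ** 2) < 0.02)  # the additive structure is recovered
```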
Multilayer perceptron algorithm
An artificial neural network is an interconnection of simple computational
elements (nodes or neurons) using functions that are usually non-linear,
monotonically increasing and differentiable . The
multilayer perceptron (MLP) algorithm belongs to the family of artificial
neural networks . MLPs are attractive candidates thanks
to various well known properties. For instance, an MLP is a universal function
approximator and thus can represent any arbitrary functions
, so they are widely used for the approximation of
non-linear transfer functions. Moreover MLPs have been shown to be able to
deal with noisy data. In our case, defining the architecture of the MLP
consists of (i) selecting the relevant input variables and (ii) setting the
number of neurons in the hidden layer. A fixed architecture defines a
function family F(⋅), in which we seek the best function allowing us to
invert BTs. It is possible to express this MLP model in a mathematical way
as
$$ \widehat{RH}_i = F(\mathbf{W}, \mathbf{BT}), $$
where F(⋅) and W correspond respectively to the transfer
function and the synaptic weights matrix of the model. The main critical
point with the MLP method is the way to choose the optimal architecture and
to adjust the corresponding internal parameters (the weights). These
parameters are determined so as to minimize the mean quadratic error computed
on the training data set. As our goal is to create a nonlinear model with
good generalization capabilities, the problem of overfitting must be
considered: the LOO validation method is implemented to detect overfitting
and to select the model parameters that minimize the validation error.
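The sketch below shows the kind of one-hidden-layer MLP involved, trained by plain batch gradient descent on a synthetic regression; it is only a schematic illustration (the data, architecture and learning rate are placeholder choices, not those of the study):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for the BT -> RH_i regression (synthetic data, not SAPHIR).
X = rng.uniform(-2, 2, size=(1000, 3))
y = np.tanh(X @ np.array([1.0, -0.5, 0.3]))[:, None]

h = 8                                         # number of hidden neurons
W1, b1 = rng.normal(0, 0.5, (3, h)), np.zeros(h)
W2, b2 = rng.normal(0, 0.5, (h, 1)), np.zeros(1)

def forward(X):
    return np.tanh(X @ W1 + b1) @ W2 + b2

mse0 = np.mean((forward(X) - y) ** 2)         # error before training

lr = 0.05
for _ in range(2000):                         # plain batch gradient descent
    A = np.tanh(X @ W1 + b1)                  # hidden activations
    err = A @ W2 + b2 - y
    dW2, db2 = A.T @ err / len(X), err.mean(axis=0)
    dA = err @ W2.T * (1 - A ** 2)            # gradient through the tanh units
    dW1, db1 = X.T @ dA / len(X), dA.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

mse = np.mean((forward(X) - y) ** 2)
print(mse < 0.5 * mse0)  # training reduces the quadratic error
```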
Least squares support vector machine
SVMs are kernel methods . They are attractive
candidates for nonlinear modeling from data. Thanks to various desirable
properties, they have the ability to build models with high generalization
capabilities by avoiding overfitting and controlling model complexity. A
least squares formulation of SVM called LS-SVM was proposed to make the SVM
approach for modeling more generally applicable, such as for dynamic modeling
or for implementing sophisticated validation techniques
. The SVM technique and its derived formulations have
found applications in atmospheric sciences, such as in statistical
downscaling of precipitation , in
regression problems or in classification from remote sensing
measurements .
The LS-SVM training procedure consists of estimating the set of adjustable
parameters w and b by the minimization of the cost function:
$$ J(\mathbf{w}, e) = \frac{1}{2}\mathbf{w}^T\mathbf{w} + \frac{1}{2} C \sum_{k=1}^{N} e_k^2, $$
with ek the prediction error for example k and N the size of the
training set. C is an hyperparameter that controls the tradeoff between the
prediction error and the regularization. This optimization problem can be
cast into a dual form with unknown parameters α and
b, α being the vector of the Lagrange
multipliers. Thus, the parameters can be computed by resolving a set of
(N+1) linear equations.
Since LS-SVM models are linear in their parameters, the solution of
the training phase is unique and can be computed straightforwardly, using the
set of (N+1) linear equations as stated above. Here the validation error is
estimated using the virtual LOO (or VLOO) method. This method, first proposed
for linear models and later extended to nonlinear models
, makes it possible to estimate the validation error by performing a single
training involving the whole available data set. This estimation is exact
when dealing with linear-in-their-parameters models, such as LS-SVM models,
while it remains an approximation for models which are nonlinear with respect
to their parameters. More recently, a framework described by
implements the VLOO method for LS-SVM models. This method
gives a fast and exact estimation of the validation error, which greatly
reduces the computational burden involved by other validation techniques
such as the cross-validation method.
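The dual system described above is small enough to write down directly; the following sketch (RBF kernel, synthetic data, hypothetical hyperparameter values) solves the (N+1)×(N+1) linear system for b and the multipliers α:

```python
import numpy as np

def lssvm_fit(X, y, C=10.0, gamma=1.0):
    """Solve the LS-SVM dual: one (N+1)x(N+1) linear system gives the
    intercept b and the Lagrange multipliers alpha (RBF kernel assumed)."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                  # kernel (Gram) matrix
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / C            # regularized kernel block
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:], X, gamma         # b, alpha, training inputs

def lssvm_predict(model, Xnew):
    b, alpha, Xtr, gamma = model
    d2 = ((Xnew[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2) @ alpha + b

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(2 * X[:, 0]) * X[:, 1]
model = lssvm_fit(X, y, C=100.0, gamma=2.0)
mse = np.mean((lssvm_predict(model, X) - y) ** 2)
print(mse < 1e-2)  # smooth target is fitted closely on the training set
```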
Performance over the synthetic data set
The retrievals of layer-averaged RH profiles provided by GAM, MLP and LS-SVM
are compared for the two schemes. The following criteria are computed over
the test set (∼5000 samples) for each atmospheric layer i: the mean
error (referred to as the “bias”), the standard deviation of the error
(SD) and the Pearson's correlation coefficient (R) between the estimated
RH^ and the reference RH, using the variance-covariance matrix
(cov):
$$ \mathrm{bias}_i = \frac{1}{N}\sum_{k=1}^{N}\left(RH_{i,k} - \widehat{RH}_{i,k}\right), \qquad \mathrm{SD}_i = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(RH_{i,k} - \widehat{RH}_{i,k}\right)^2}, \qquad R_i = \frac{\mathrm{cov}(RH_i, \widehat{RH}_i)}{\sqrt{\mathrm{SD}^2(RH_i)\,\mathrm{SD}^2(\widehat{RH}_i)}}. $$
Moreover, the notation %RH will be used to ease the distinction between
relative units (in %) and RH units (in %RH). The size of the test set
allows us to consider all the results significant at the 99.9 % level
of confidence.
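These criteria translate directly into code; a small sketch with synthetic values (numpy conventions) follows:

```python
import numpy as np

def scores(rh_true, rh_est):
    """Bias, standard deviation of the error and Pearson correlation,
    as used to evaluate each layer over the test set."""
    err = rh_true - rh_est
    bias = err.mean()
    sd = err.std(ddof=1)
    r = np.corrcoef(rh_true, rh_est)[0, 1]
    return bias, sd, r

rng = np.random.default_rng(5)
rh = rng.uniform(0, 100, 5000)                 # reference RH (%RH)
rh_hat = rh + rng.normal(1.0, 4.0, 5000)       # estimator with a 1 %RH offset
bias, sd, r = scores(rh, rh_hat)
print(abs(bias + 1.0) < 0.5, abs(sd - 4.0) < 0.5, r > 0.98)
```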
On the optimization of the models
As mentioned in Sect. , it is important to underline that the
optimized models are different for each atmospheric layer. Indeed, in the
case of the SAPHIR-MADRAS scheme a selection of the relevant channels is
performed using the GSO procedure. The GSO procedure helps to reduce the
complexity of the algorithms by reducing the number of inputs of the
available set of data. It is implemented in the present case with a
reasonable threshold of 10 % on the variation of the variance: inputs whose
contribution changes the error variance by less than 10 % are considered
irrelevant. Of course the same inputs could be used for the different
models with small deterioration. For example, for layer 4 (425–650 hPa),
a sensitivity analysis has shown that, when using GAM, the best set of inputs
is {S3, S4, S5, S6, M3, M4} and if M9 is added, the SD decreases from 4
to 3.8 %RH. When the RH retrieval is based on the MLP approach, the SD
increases from 2.8 to 3 %RH. In these two cases the difference is
relatively small. In fact, an in-depth study of the relevancy of the channels
reveals that the selected inputs are only weakly dependent on the retrieval
model but are highly dependent on the atmospheric layer.
For the SAPHIR-only scheme, all channels are used.
An impact study of the pre-processing of the data on the accuracy of RH
retrieval shows that, whatever the atmospheric layer or algorithm considered,
the improvement obtained with PCA is negligible (<3 % of the error
variance). The use of uncorrelated inputs is thus not necessarily required
for the considered models. Finally, the linearization of the problem with the
exponential function is beneficial only for the MLP: in this case it leads
to a decrease of the error variance of about 50 %, while no significant
improvement is observed for LS-SVM and GAM (<3 % of the error variance).
Performance of GAM against the two other models
From here on, noise-free BTs are considered in order to assess only the
statistical approaches; the radiometric noise of the two instruments is
introduced later in the evaluation of the RH retrieval against reference
profiles. Vertical profiles of mean biases, SD and R between the observed RH
and the estimated RH are presented on Fig. .
At first sight, the analysis of one layer at a time clearly
shows that the overall quality of the retrieval is layer-dependent, meaning
that it is strongly constrained by the physical limits of the inverse
problem. Thus, the layers covering the free troposphere (layers 2 to 6) are
quite well modeled, with small SD reaching values between 2.6 %RH and
7.8 %RH, and are characterized by a small scatter, with R lying in the 0.85–0.97
interval. The combined use of SAPHIR and MADRAS BTs is enough to explain
more than 70 % of the variability of the RH at these layers. The retrieval of
the RH of the extreme layers (layer 1 for the top of the atmosphere, layer 7
for the surface) seems more delicate and is clearly limited by the inputs at
hand: as illustrated on Fig. a, the six channels of SAPHIR observe the
emitted radiation roughly between 150 and 850 hPa, and although
MADRAS brings some additional relevant measurements, other information such
as the surface emissivity might contribute significantly to better constrain
the retrieval near the surface.
Vertical profiles of R (left), biases (center, in %RH) and SD
(right, in %RH) for the MLP (solid line), the GAM (dashed line) and the
LS-SVM (dotted line) models, trained on noise-free SAPHIR and MADRAS data.
Mean bias (in %RH), standard deviation (SD, in %RH) and
correlation coefficient (R) for the seven layers, and defined between the
observed RH and the estimated RH. The estimated RH is obtained using the GAM
approach from the two configurations: SAPHIR-MADRAS joint measurements and
SAPHIR-only measurements. For the SAPHIR-MADRAS configuration, the relevant
channels selected using the GSO procedure are listed using the labels Si
and Mj indicated in Table .
For each layer, the first value corresponds to the SAPHIR & MADRAS
configuration (with the relevant channels listed) and the second to the
SAPHIR-only configuration (all channels); bias and SD in %RH.

#1 (85–100 hPa)    S1, S2, S3, S5; M1, M2, M3, M4, M5, M6, M7, M8, M9
                   bias:  1.98 /  2.36    SD:  8.92 /  9.91    R: 0.67 / 0.57
#2 (130–250 hPa)   S1, S2, S3; M1, M2, M3, M4, M5, M6, M7
                   bias: -0.01 / -0.09    SD:  5.96 /  6.02    R: 0.92 / 0.91
#3 (275–380 hPa)   S1, S2, S3, S4, S5, S6; M1, M2, M3, M5, M6
                   bias:  0.48 /  0.48    SD:  3.67 /  3.79    R: 0.95 / 0.94
#4 (425–650 hPa)   S1, S3, S4; M1, M3, M5, M7
                   bias:  0.45 /  0.08    SD:  3.56 /  4.72    R: 0.97 / 0.95
#5 (725–850 hPa)   S1, S3, S5, S6; M3, M4, M7, M9
                   bias:  0.95 /  2.69    SD:  8.55 / 11.68    R: 0.91 / 0.83
#6 (900–955 hPa)   S1, S3, S4, S5, S6; M1, M3, M4, M5, M6, M7, M8, M9
                   bias:  0.11 / -1.53    SD:  6.72 / 11.65    R: 0.91 / 0.70
#7 (1013 hPa)      S1, S2, S3, S4, S5, S6; M1, M2, M3, M4, M5, M6, M7, M8, M9
                   bias:  0.36 / -0.02    SD:  8.69 /  9.67    R: 0.54 / 0.34
The LS-SVM technique provides overall the best results, with the highest
correlation coefficients and the lowest variance for five layers over the
seven considered in this study. In fact, theoretically, these three learning methods
are equivalent, but the conditions of their implementation are
somewhat
different. First, since the LS-SVM are linear-in-their-parameters models, an
exact validation method was implemented. The resulting procedure of selection
of the relevant inputs is quite efficient. In addition, MLP models are
nonlinear with respect to the adjusted parameters, and their training amounts
to a nonlinear optimization. Several trainings with different initializations
must be performed with no guarantee to achieve the best generalization
capability given a network architecture. From this point of view, the LS-SVM
approach is thus more successful. Finally, concerning the GAM approach, the
smoothing splines used guarantee a nonlinear behavior, continuity and
smoothness which are important characteristics in a learning algorithm.
Another convenient characteristic of splines is that the back-fitting
algorithm can estimate parametric and non-parametric components of the
model simultaneously.
Overall, however, the three methods perform comparably: R and SD are very close to each other.
The MLP approach provides slightly more biased estimations of the RH
throughout the troposphere while the GAM and LS-SVM methods are centered.
This distinction is more pronounced for the surface layer with retrievals of
RH characterized with a 6.9 %RH bias when using the MLP, whereas the bias is
0.06–0.07 %RH with GAM and LS-SVM. A sample of layer-averaged profiles is
presented on Fig. , with the observed relative humidity
and the three estimations using the three approaches. As discussed above,
the top layer is the least well retrieved from the set of BTs, whatever the
approach, while the mid-tropospheric layers (3 to 6, i.e., 350 down to
950 hPa) are quite well estimated.
Examples of three estimations of RH profiles (in %RH) extracted from
the database using the SAPHIR-MADRAS configuration. The observed profile is
the thick gray line and the three estimations (plain, dashed, dots,
respectively, for MLP, GAM and LS-SVM) are in black.
RH and the associated errors (both in %RH) projected on the
10×10-neuron self-organizing maps obtained from the step of clustering
of the original RH profiles (see Sect. 2.1): the upper row shows the mean
RH for the seven layers, and the lower row shows the errors of estimation using
GAMs. Note that the color scales of the maps representing the 1013 hPa layer
and the error estimated for layer 1 are adjusted.
Scatter-plots of the observed RH versus the estimated RH (in %RH)
for layer 4 (top row) and layer 6 (bottom row). The estimations are done
using GAMs trained from SAPHIR-only BTs (left-hand side column) and from
SAPHIR and MADRAS BTs (right-hand side column). The dashed line is y=x line
and the solid line represents the linear regression. The correlation
coefficient (R) and the standard deviation of the error (SD) are provided
within each panel.
The errors obtained from the GAM estimation are projected on the 10×10
Kohonen maps that were obtained during the stage of clustering of the
atmospheric layers (Sect. ) and give a structural view of the
errors. The projections are shown on Fig. . This
allows the retrieval errors to be analyzed with respect to the clusters of
RH revealed by the maps, allowing for a deeper analysis related to
meteorological situations than the global biases and SD. A pattern of a
large bias (∼44 %RH) clearly stands out of the map of layer 1 (near the
tropopause); this bias is associated with the neurons related to a moist
structure at this top layer. This suggests that GAM has difficulties when
dealing with a moist upper troposphere, which could be due to an
under-representation in the training set. A similar statement can be made
for the 6th layer, where the neurons associated with the largest (negative,
in this case) bias in the upper left corner are the driest neurons of this
layer. No clear pattern stands out for the remaining layers, even for the
surface layer, meaning that the errors are uniformly distributed.
Performance for the two instrumental schemes
In the following, noisy BTs are used in order to discuss the results over the
realistic instrumental configurations. Two GAMs are optimized for each
atmospheric layer, one for each instrumental scheme: a SAPHIR-MADRAS scheme
and a SAPHIR-only scheme. The evaluations over the validation set are
summarized on Table , with biases, SD and R. An illustration
of the scatter is given with Fig. for two atmospheric
layers: layer 4 (∼425–650 hPa) and layer 6 (∼900–955 hPa). These
statistics allow for a discussion on the influence of MADRAS BTs on the
quality of retrieval of the RH. MADRAS channels are an asset for the
estimation of the RH profile since their use reduces the scatter (improvement
of R and reduction of SD). The pattern of scatter follows the distribution of
the weighting functions of SAPHIR: the best estimations are obtained for the
mid-tropospheric layers (R = 0.83 to 0.97, over layers 2 to 5) where the
functions strongly overlap, and the quality of the estimations decreases
towards the edges. One can also note that the retrieval model of the 7th
layer uses all 15 BTs of the microwave payload, but this does not allow for a
robust estimation of the RH (R = 0.54, corresponding to an R² value of 0.29).
An estimation of the RH profile down to 955 hPa seems reasonable if no other
constraint is added to the model.
When the SAPHIR-only scheme is used, such a statement can be extended to the
top layer (R = 0.57, R² = 0.32), thus limiting the estimation of RH from layer 2
to layer 6. For these atmospheric layers, the biases are small and range
between 2.69 and -1.53 %RH. The impact of MADRAS BTs on the retrieval of
RH will be important to keep in mind when specific analyses of the temporal
and spatial variations of the RH field are performed over the MT
(Megha-Tropiques) lifetime.
Application to Megha-Tropiques measurements
Some considerations on the Megha-Tropiques observations
As for other similar radiometers with varying viewing geometries, SAPHIR
observations are subject to the so-called “limb effect”. This means that, at
SAPHIR frequencies, the pixels on the edge of the swath have BTs
artificially lower than the pixels located in the center, the atmosphere of
the former having a larger optical depth than the latter. For the same
thermodynamical profile, this limb effect shifts the sounding altitude of
the outermost pixels upward. Of course this needs to be taken into account
in any retrieval process. Possibilities are (i) to have one dedicated model
per viewing angle, (ii) to include the viewing angle explicitly in the
retrieval method, as traditionally done in iterative schemes, or (iii) to
apply a correction that brings all the viewing angles to an equivalent
nadir position before the retrieval itself. Here, the GAMs
have been optimized using the nominal viewing angle of MADRAS
(53.5∘) and limited to the nadir geometry of SAPHIR. In fact, the
observed relationship between the BT and the viewing angle can be accurately
approximated by a multi-variate linear function, as noticed by
and . Knowing the means and variances
of this relationship for each angle is enough to incorporate it into
the normalization method, which is based on standard scores. These have been
computed every 2∘ from nadir to 52∘ (the maximum viewing
angle of SAPHIR is 50.7∘) using the training database.
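As an illustration, the angle-dependent standard-score normalization can be sketched as follows. The per-angle means and standard deviations below are placeholders; the real values are tabulated from the training database every 2∘ from nadir to 52∘:

```python
import numpy as np

# Hypothetical per-angle statistics: mean and standard deviation of a
# channel's BT, tabulated every 2 deg from nadir to 52 deg (placeholder
# values; the real tables come from the training database).
angles = np.arange(0, 53, 2)                 # viewing angles (deg)
mu = np.linspace(250.0, 245.0, angles.size)  # placeholder means (K)
sigma = np.full(angles.size, 5.0)            # placeholder std devs (K)

def normalize_bt(bt, view_angle):
    """Convert a BT to a standard score using the statistics of the
    nearest tabulated viewing angle, which absorbs the limb effect."""
    i = np.argmin(np.abs(angles - view_angle))
    return (bt - mu[i]) / sigma[i]
```

After this normalization, pixels observed at different viewing angles can be fed to a single model trained on standardized inputs.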
Comparison to radiosonde measurements: the CINDY/DYNAMO/AMIE data set
Observed RH profiles gathered from the CINDY/DYNAMO/AMIE international field
experiment are used to evaluate the estimated RH profiles. Since the first
orbit of Megha-Tropiques was completed on 13 October 2011, this large-scale
campaign is ideally timed for such an exercise. It took place over the October 2011–March 2012 period in the Indian Ocean and was dedicated to better
understanding the processes involved in the initiation of the Madden–Julian
Oscillation and to improve its simulation and prediction (Cooperative Indian
Ocean Experiment on Intraseasonal Variability in the Year 2011/Dynamics of
the Madden–Julian Oscillation/ARM Madden–Julian Oscillation Investigation
Experiment, hereafter C/D/A). Measurements related to the atmospheric and
oceanic states have been collected from radars, microphysics probes, a
mooring network and an upper air sounding network. One can refer to
for a discussion on the quality of the RH profiles and their
use in the context of the evaluation of SAPHIR measurements. Here we focus on
the oceanic sites and on the October–December 2011 period to evaluate the
RH estimations; over that period, MADRAS performed optimally.
found a systematic bias in the BT space that increases with
the distance of the observing channel from the central frequency. Such biases
are eliminated by the normalization procedure of the retrieval scheme.
Overall, among the 10 000 high-resolution soundings collected during the
campaign , only about 50 profiles match our
collocation criteria: Δt≤±45 min and Δx≤50 km.
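A minimal sketch of this collocation test, assuming a simplified record structure (the dictionary field names are hypothetical, not the actual archive format):

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (degrees in, km out)."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def is_collocated(sonde, pixel, dt_max_min=45.0, dx_max_km=50.0):
    """Apply the |dt| <= 45 min and dx <= 50 km criteria.
    `sonde` and `pixel` are dicts with 'time' (minutes), 'lat' and
    'lon' keys -- a deliberately simplified data model."""
    dt = abs(sonde["time"] - pixel["time"])
    dx = haversine_km(sonde["lat"], sonde["lon"], pixel["lat"], pixel["lon"])
    return dt <= dt_max_min and dx <= dx_max_km
```

Both criteria must hold simultaneously, which explains why so few of the roughly 10 000 soundings survive the matching.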
Since the training of the GAMs is restricted to clear-sky conditions,
a cloud mask is required. Therefore, cloud-free cases are detected from the radiosounding
record itself (RH limited to 100 %RH), combined with the
method to detect precipitating scenes (i.e., convective
overshoots) from the SAPHIR observations, a threshold method based on the BT
depression induced by the scattering of the microwave radiation by the
precipitating particles. One point of concern here is the availability
of the Megha-Tropiques archive over this period, which is not 100 %, with a
lower availability for MADRAS. Completing this archive back to the date
of launch remains a major point of concern for the two space agencies, CNES
and ISRO, in order to maximize the size of the MADRAS record.
For each of the seven layers, the observed RH is defined as the mean of the
measurements that fall within the pressure boundaries, assuming that this mean
is representative of the layer. This is a simplistic assumption,
especially since the tropospheric RH is characterized by strong vertical
gradients induced by complex transport and thermodynamic processes
e.g.,. However, a comparison (not
shown) between such a smooth mean and a discrete mean as defined from the
training profiles shows no systematic differences. Figure
shows the comparison between the observed and estimated RH using profiles of
R and biases, for the two instrumental configurations. Figure
summarizes the results. Since the sample size is quite small
(N-2=48 degrees of freedom), a Student t test is
performed to assess the significance of the correlations, assuming that the
samples follow Gaussian distributions. The 99.9 % confidence level is indicated in
Fig. and t values below this level are not given.
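The significance test can be sketched as follows; with 48 degrees of freedom, the two-tailed critical t value at the 99.9 % confidence level is approximately 3.51, so correlations above roughly 0.46 are significant at that level:

```python
import math

def corr_t_value(r, n):
    """t statistic of a Pearson correlation r from n samples (n-2 dof)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Approximate two-tailed critical t value at the 99.9 % confidence level
# for 48 degrees of freedom (from standard t-distribution tables).
T_CRIT_999 = 3.51

def significant(r, n=50):
    """True if the correlation is significant at the 99.9 % level."""
    return corr_t_value(r, n) > T_CRIT_999
```

This is the standard test of a Pearson correlation against zero under a bivariate Gaussian assumption.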
Box-and-whiskers diagrams are used to represent the distributions of the
differences and show the similarity/differences of the estimations when using
both SAPHIR and MADRAS or only SAPHIR. As expected from the synthetic data
analysis, the mid-tropospheric layers 2 to 5 are very well retrieved, with
quite good correlations (0.6–0.92) when SAPHIR and MADRAS are combined.
Additional analyses show that the SD of the differences reaches a maximum of
10 %RH (layer 4). The removal of MADRAS clearly affects the estimation of RH
for most layers, while for layer 3 there is no significant effect. This is
expected from the distribution of the weighting functions that present a
large overlap around 300 hPa (see also Fig. ). Our results are
consistent with the findings of dedicated to the evaluation
of the RH profile retrieval designed by the Indian team involved in the
Megha-Tropiques mission. The layer-averaged relative humidity (LARH)
retrieval technique differs from the present approach by
its dependence on outputs from the National Center for Environmental
Prediction/National Center for Atmospheric Research (NCEP/NCAR) re-analyses.
This explains why the LARH estimated from SAPHIR is closer to the NCEP/NCAR
RH profiles than to other models (e.g., ERA-Interim), as
found by .
Vertical profiles of R (left) and differences (right, in %RH) for
the SAPHIR-only (red) and the SAPHIR-MADRAS (blue) retrievals, computed over
the subset of 50 radiosonde RH profiles from the CINDY/DYNAMO/AMIE campaign. For the profiles of
the differences between the observed and estimated RH, the box and whiskers
diagram indicates for each layer the median (the central vertical line) and
the lower and upper quartiles (left and right edges of the box). The whiskers
indicate the lower and upper limits of the distribution within 1.5 times the
interquartile range from the lower and upper quartiles, respectively.
Relative humidity (in %) of the layer 400–600 hPa as estimated from
Megha-Tropiques/SAPHIR measurements (top) and by the ERA-Interim reanalysis
(bottom) for 14 November 2011. For the map of ERA-Interim RH, the
black contour delineates the clear sky and the grey contour delineates the
areas with low-level clouds, while the dotted areas are covered with high or
mid-level clouds.
Land surfaces
The approach has been adapted to continental cases, where the influence of
the surface emissivity on the measured brightness temperature at the top of
the atmosphere needs to be taken into account , even for
SAPHIR channels. have shown that for relatively humid
columns, defined with a proper filter on the precipitable water vapor (PWV)
given by radiosoundings, AMSU-B water vapor channels (similar to channels S2,
S3 and S5 of SAPHIR) are barely sensitive to the surface. The emission by the
surface affects the measured BT in the 183.31±7.0 GHz (equivalent to
channel S5) when the PWV is lower than 30 kgm-2. The
study focuses on polar atmospheres, and AMSU-B channels are much less
affected by the surface when observing tropical situations
. However, to limit the errors introduced by a possible
contribution of the surface, the use of realistic surface emissivities
is an asset when defining the training
set. In the current study, we use the emissivity atlas of
as an additional input of the radiative transfer model for the simulation of
the BTs. A GAM is trained for each layer following the same method as for
the oceanic conditions.
Comparisons (not shown) to radiosoundings launched from a continental site in
Ouagadougou, Burkina Faso (a dedicated field campaign during the summer of 2012)
reveal similar performance in the mid-tropospheric layers, with the surface
layers being slightly better estimated.
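A minimal sketch of how a gridded emissivity atlas could be queried before feeding the radiative transfer simulation; the grid, resolution and constant emissivity value below are placeholders, not the actual atlas:

```python
import numpy as np

# Hypothetical 0.25-deg gridded surface emissivity atlas over the tropical
# belt (filled with a constant as a placeholder; a real atlas would be
# read from file and be channel- and surface-type-dependent).
LATS = np.arange(-30.0, 30.25, 0.25)
LONS = np.arange(-180.0, 180.25, 0.25)
ATLAS = np.full((LATS.size, LONS.size), 0.90)

def surface_emissivity(lat, lon):
    """Nearest-neighbour lookup of the surface emissivity, intended to be
    passed as an extra input to the radiative transfer simulation."""
    i = np.argmin(np.abs(LATS - lat))
    j = np.argmin(np.abs(LONS - lon))
    return float(ATLAS[i, j])
```

The point of the lookup is only to make the training set realistic over land; over the ocean the emissivity is computed by the radiative transfer model itself.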
Insight into large-scale structures
Figure shows an example of RH estimation using the SAPHIR-only
scheme, for the 4th atmospheric layer (425–650 hPa) observed on 14 November 2011
(observing time 17:00–18:55 UT). The RH of ERA-Interim of the
same date at 18:00 UT is also presented. The large-scale patterns are clearly
identical in the two maps, such as the large dry area over West Africa, the
moist and thin filamentary structure northwest of it, or the moist area over
Central America. The amplitudes of the two fields present some discrepancies,
but it is important to focus specifically on the cloud-free zones. In the
ERA-Interim map, the high- and mid-level clouds are indicated by the dotted
areas, while the low clouds are delimited by the grey contour. Over the
cloud-free areas the amplitudes are similar, with minima of RH around 10 %RH.
Note that no cloud mask is yet available for the Megha-Tropiques observations;
a current effort focuses on adjusting to Megha-Tropiques the cloud mask and
cloud classification developed by the SAFNWC (Satellite Application Facility
of EUMETSAT) and applied to the belt of geostationary satellites.
Conclusions
Microwave observations from the SAPHIR and MADRAS microwave radiometers of
the Megha-Tropiques satellite are used to retrieve seven-layer RH profiles. For
this purpose, optimized GAMs were trained for each atmospheric layer over a
realistic set of synthetic observations. This set is composed of 18 years of
radiosonde profiles covering the tropical belt (±30∘), sampled
from the ARSA database, used in combination with a radiative transfer model
(RTTOV v9.3) to get the associated synthetic BTs. Our approach uses only the
satellite measurements as inputs to the retrieval method; the training phase
of the model implicitly accounts for the roles of the temperature, humidity
and surface characteristics of the tropical atmosphere.
To assess the performance of the GAM, two other algorithms based on supervised
learning, namely an MLP and an LS-SVM, have also been trained and optimized
using adapted validation methods. To our knowledge, the LS-SVM modeling
technique has never been applied to remote sensing retrievals, even though it
avoids the major problem of local minima, a common pitfall when using neural
networks (such as the MLP). While the three modeling methods come from different
theoretical backgrounds, they achieve roughly the same performance, with the
LS-SVM approach providing slightly better results. We assume that this
improvement comes from its built-in regularization mechanism, but it is
associated with a heavy computational burden that compromises its
implementation for large data sets (such as satellite measurements).
The intercomparison of the three models points to the importance of how the
problem is defined given the inputs at hand. The combination of SAPHIR and
MADRAS, or the use of SAPHIR only, makes it possible to perform a robust
estimation of RH in the 150–950 hPa part of the troposphere with a small
error (absolute maximum bias of 1.53 %RH) and scatter (minimum correlation
of 0.49). Near the tropopause and at the surface, the retrieval capacity is
clearly constrained by the information content brought by the inputs,
whatever the configuration. Of
course, the use of a retrieval technique (e.g., neural network or
1-D-variational) using prior physical information should further improve the
estimation: for instance, the surface layer should clearly benefit from prior
knowledge of the surface temperature and total water vapor content. In fact,
a comparison with existing works based on methods combining physical
constraints with statistical tools applied to similar
radiometers with fewer channels in the 183.31 GHz line, such as AMSU-B or MHS,
shows that the current approach gives similar performance (root mean square
errors of about 10 %RH in the mid-troposphere). It is also consistent with
the layer-averaged RH profiles estimated by the Indian team involved in the
Megha-Tropiques mission, although that retrieval is further constrained by
NCEP/NCAR outputs . A 1-D-variational technique
exploring SAPHIR data should further improve the estimation of RH.
Following this work, our current efforts focus on the estimation of the
conditional error associated with the retrieval itself. Indeed, because the
widths and altitudes of the weighting functions of SAPHIR are strongly
dependent on the thermodynamical state of the atmosphere (the drier the
atmosphere, the wider the layer; the maximum of sensitivity shifting from the
upper troposphere towards the mid-troposphere), it is clearly expected that
the robustness of the RH estimation will be conditioned by the state of the
atmosphere. The aim will be to provide the probability density function of
the relative humidity given the BTs (i.e., a given state of the atmosphere) and
thus address the issue of the non-Gaussian distribution of the relative humidity at a
given height. The knowledge of such information fits into the current work
done within the Global Energy and Water Cycle Experiment (GEWEX) Water Vapor
Assessment (G-VAP: http://www.gewex-vap.org) to better characterize
the observational records, together with their uncertainties.
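A simple non-parametric way to approach such a conditional density, sketched below with placeholder binning conventions, is to normalize a joint BT–RH histogram along the RH axis within each BT bin:

```python
import numpy as np

def conditional_rh_pdf(bt, rh, bt_bins, rh_bins):
    """Estimate p(RH | BT) by normalizing a 2-D histogram along the RH
    axis within each BT bin -- a simple, non-parametric way to expose
    the non-Gaussian shape of the RH distribution at a given height."""
    h, _, _ = np.histogram2d(bt, rh, bins=(bt_bins, rh_bins))
    widths = np.diff(rh_bins)
    totals = h.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        pdf = h / (totals * widths)  # integrates to 1 over RH per BT bin
    return np.nan_to_num(pdf)       # empty BT bins yield all-zero rows
```

Applied to the training database, each row of the result is an empirical density of RH conditioned on a BT bin, from which quantiles or spread measures of the retrieval error can be derived.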