Random uncertainties and vertical error correlations are estimated for
three independent data sets. The three collocated data sets are
(1) refractivity profiles of radio occultation measurements retrieved
from the Metop-A and B and COSMIC-1 missions, (2) refractivity derived
from GRUAN-processed RS92 sondes, and (3) refractivity profiles derived
from ERA5 forecast fields. The analysis is performed using
a generalization of the so-called three-cornered hat method to include
off-diagonal elements such that full error covariance matrices can be
calculated. The impacts from various sources of representativeness
error on the uncertainty estimates are analysed. The estimated
refractivity uncertainties of radio occultations, radiosondes, and
model data are stated with reference to the vertical representation of
refractivity in these data sets. The existing theoretical estimates of
radio occultation uncertainty are confirmed in the middle and upper
troposphere and lower stratosphere, and only little dependence on
latitude is found in that region. In the lower troposphere,
refractivity uncertainty decreases with latitude. These findings have
implications for both retrieval of tropospheric humidity from radio
occultations and for assimilation of radio occultation data in numerical weather prediction models and reanalyses.
Introduction
In variational estimation of geophysical parameters from satellite
observations, the obtained accuracy relies on the validity of the
underlying uncertainty and error correlation assumptions of the
observation and of the model background fields.
The three-cornered hat (3CH) method
provides an empirically based uncertainty estimate of three independent data
sets, all representing a series of measurements of the same physical
property.
A historical overview of the applications of 3CH, and related methods,
is given by . The 3CH method
was introduced independently by multiple authors; the earliest was by
and (often referenced)
. The method has in several
cases been used for meteorological applications, sometimes under
other names, see, e.g. .
In numerical weather prediction (NWP), the method developed by
is being widely
adopted for empirically based adjustment of observation error
covariance matrices, e.g.
. However, the 3CH method has not
been adopted as a tool in operational assimilation of satellite data
into NWP models. This is probably because in NWP data assimilation, all
the model representativeness errors, including forward modelling
errors, are considered a part of the observation error. The 3CH
method is not targeted specifically at NWP applications. This means
that all three data sets involved are treated equally as a start, thus
they are all assumed to contain representativeness errors with
respect to the underlying truth. In order to use results obtained from
the 3CH analysis it is necessary to consider, for each particular
application, how representativeness errors are distributed among the
involved data sets, and this is not always possible to find out.
To distinguish error correlations between data sets from vertical
error correlations within each data set, we will refer to the former as
error cross correlations. Such error cross correlations can, for
instance be due to similarities in measurement methods and processing
or they can, for example arise as a result of similarity in resolution
among the data sets. Error cross correlations can cause the 3CH
method to misrepresent uncertainties
.
If error cross correlations
and representativeness issues are properly considered and accounted
for, the 3CH method can serve as an alternative to – or a validation
reference for – uncertainty estimates based on instrument
characteristics and measurement geometry.
Recently applied the 3CH
method to refractivity, temperature, and humidity profiles from radio
occultations (RO), combined with radiosondes and model analysis. The
results of that study yielded relatively large uncertainty estimates
(see Discussion, Sect. ), and the study leaves the problem
of error cross correlations unresolved.
In this paper the 3CH method is generalized to include off-diagonal
elements of the error covariance matrices. We apply the generalized
3CH (G3CH) to three data sets where the random error components can
be assumed not to be interdependent, meaning that their error cross
correlations are assumed to be negligible. The refractivity error covariance
matrices of RO measurements are estimated and compared with current
vertical correlation assumptions, used in 1D-Var retrieval of specific
humidity and temperature from RO refractivity. The main objective of
this study is to assess refractivity random uncertainty and vertical
error correlations, expressed as the refractivity error covariance
matrix, to be used in 1D-Var retrieval of temperature and specific
humidity
. The
three data sets are treated on equal terms such that none of them are
considered more or less representative for the truth a priori, thus
the analysis will also provide estimates of the ERA5 refractivity
error covariance matrix and of the GRUAN processed RS92 refractivity
error covariance matrix.
The remainder of the paper is organized as follows: Sect. and contain
definitions of the terminology used throughout the paper. Next, the
three data sets are introduced in Sect. , and the G3CH
method is presented in Sect. , which includes a derivation
of the G3CH equations. Results are presented in Sect.
along with interpretation of the different collocation and filtering
experiments. In the Discussion, Sect. , the results
are related to previous studies and applications. The results are
finally collected in the Conclusions, Sect. .
Definitions
The terms random uncertainty and systematic uncertainty
are used as defined in the Guide to the Expression of Uncertainty in Measurement (GUM;
). We adopt the concept of error covariance (matrix) and error correlation (matrix) from NWP terminology
to describe vertically
correlated random uncertainties, and we use the terms error variance and error standard deviation to refer to the
diagonal of an error covariance matrix and its square root.
The term vertical footprint of a data set is used here in the
same way as in :
The vertical scale that an observation value represents. The
word resolution may be used to describe this property, but we
shall avoid this term because in the NWP community it is used in the
meaning of sampling density – the number of data points per spatial
interval (e.g. in ).
The vertical footprint will typically be larger than the distance
between the vertical height levels which the data values refer to.
Error components
For a given refractivity data profile, x, we consider the
observation error ε as the deviation from the unknown truth
t, ε=x-t. The G3CH does not make any assumptions
about exactly what the true profile t is. t may be
thought of as defined with respect to a given but unknown finite
vertical footprint, which may differ from the vertical footprints of
all three data sets.
The quantity ε is a sum of the measurement error εI and a
representativeness error εR. Both terms may contain random and
systematic error components, but for each subset of collocated
triplets being analysed, we remove systematic error differences
between the three involved data subsets prior to the analysis. The
measurement error εI acts as a superimposed noise, possibly
correlated in space and time. εI may for instance include
instrument errors, radio noise from external sources, and also some
errors arising during data processing steps. The εR component
represents the distortion of the underlying truth in a data set, as it
is being mapped to the vertical observation grid. εR contains errors
associated with, for instance sampling, interpolation, and mismatch
between the observation grid and measurement resolution in time and
space. Especially εR contains a geometric error component,
εG, representing the departure of the ERA5, RO, and RS92 vertical
or skewed profiles in time and space from the unknown true profile at
the RO reference coordinates. The RO reference coordinate is the point
at which a straight line between the GNSS satellite and the receiving
Low Earth Orbiter tangents the Earth ellipsoid. The ERA5 profile is
strictly vertical, interpolated to the RO reference time and position,
while the RO profile is a weighted average of the 3D atmosphere in the plane of occultation
, and the radiosonde
follows the balloon trajectory. The forward operator used estimates
refractivity along a 1D assumed vertical line, and this has
an impact on the uncertainty estimates. Thus, the RO observation
errors estimated by the 3CH method in this paper are applicable for
variational assimilation with a 1D operator, but not for 2D or 3D
operators. The time scale of an RO profile is in the order of 1 min and the timescale of an radiosonde profile is in the order of
1 h. The skew trajectories of the RO tangent points and RS92
balloons are assumed not to be correlated with each other. Hence the
εG term can be assumed to contain no cross correlations.
However, there are potentially error cross-correlation components
arising from spatial correlations between the data sets, which we
cannot assess. This could, for example be the case for ERA5 and RO,
because these are sampled on similar horizontal scales.
Given the definition of t to be the actual profile at the RO
reference location and time, there are, in addition to εR and
εI, errors induced by the methods applied in this paper. These
are a collocation error, εC, due to the distance in time and
space between the radiosonde and the reference coordinates, and a
cross-correlation error, εX, representing error cross
correlations induced by the finite vertical footprints of the three data sets.
The raw G3CH uncertainty estimate for one of the data sets
will not represent the observation error, but it will represent a
combination of the observation error ε and error components added
by the G3CH:
εG3CH=εI+εR+εC+εX.
We are able to remove εC and the εX components of the three
data sets, by adding the following additional analysis steps to the G3CH. The
εC covariance matrix, CC, is eliminated by first calculating
G3CH estimated covariance matrices Ci for a series of collocation
subsets, sampled from areas of decreasing size around the RO reference
coordinates. Next, the sequence of decreasing covariance estimates is
extrapolated to the virtual zero-area case C0, by assuming that CiC=Ci-C0.
Subsequently the εX covariance matrix, CX, is eliminated by
smoothing all three data sets such that they have the same vertical
footprint, and then for each data set a covariance matrix
Cs is calculated with G3CH from the smoothed data sets, such that CX=C0-Cs. Thus, the
observation error covariance matrices that we estimate in the end include only
measurement error εI and representativeness error εR.
ε=εI+εR.
The final estimate of ε will be stated with reference to a
common vertical footprint of the three data sets, which is determined
by the data set with the largest vertical footprint, ERA5.
These general definitions of measurement error and representativeness
error are thought to be applicable for all three data sets.
Data
Three data sets are combined in the analysis. The RO data set includes refractivity profiles from the Metop and
COSMIC-1 missions ,
interpolated to 247 levels. These are downloadable as part of the
ROM SAF CDR v1.0 and ICDR v1 data sets. The CDR v1.0
consists of RO data from several
satellite missions, which have been reprocessed by the ROM SAF,
using lower-level input data from both EUMETSAT and UCAR as input. The
ICDR v1 consists of RO data from the
Metop mission, which have been reprocessed by the ROM SAF, using input
data from EUMETSAT. Secondly the radiosondes (RS92) are taken from the
RS92-GDP.2 data set, provided by the GCOS Reference Upper-Air Network,
GRUAN,
. From
these two data sets a collocated subset has been selected, from the
criterion that the GRUAN central time and position must be within
3 h and 300 km of the RO reference point. In
effect this ensures that the location criteria are met in the upper
troposphere while measurements can be sampled further apart at both
higher and lower altitude. The RO data have been subject to the ROM SAF
quality control described in
, and the GRUAN data
have been pruned for a few extreme outliers. The third data set (ERA5)
contains model forecast from the ERA5 data set
on model levels, retrieved
from the ECMWF MARS archive. The forecast verification time has in
each case been chosen such that the RO has not been
within the assimilation window used for initialization of the given
forecast. Effectively this implies that the verification times used run
from 3 to 15 h, and the ERA5 uncertainty is assumed to be constant
in this time range.
The ERA5 forecast is prepared at model levels and
interpolated in time (3 h grid) and horizontal space (1×1∘ grid) to the RO reference points. These interpolated ERA5 profiles are
also provided as part of the ROM SAF CDR v1.0 and ICDR v1 data sets.
The data span a time interval from 2006 to 2020. A total of 15 597 collocations were found for this analysis. The RS92 temperature,
humidity, and pressure variables have been interpolated with cubic splines
to the 137 ERA5 model levels, hereafter the ERA5 and RS92 variables have been forward
modelled to refractivity at the RO vertical grid of 247 levels . The refractivity calculation is
made with the method described in the ROPP user guide
.
MethodThe generalized three-cornered hat method
The 3CH method has historically been applied to triplets of data
without considering vertical error correlations, meaning that the data
sets have effectively been treated as scalar properties
. A straightforward
generalization of the method allows us to also infer internal error
correlations for each data set. In the generalized 3CH (G3CH) it is
assumed that we have three independent variables x,y, and
z, which are composed of four stochastic vectors; the truth
t, and three independent error terms εx, εy, and εz,
such that
x=t+εx,y=t+εy,z=t+εz.
In the present paper x,y, and z may represent atmospheric
refractivity profiles obtained from different sources. εx,εy, and
εz represent the random observation error vectors. In the following
the bracket notation, 〈〉, is used to denote
expectation values.
The error vectors εx,εy, and εz may have internal correlations,
expressed as error covariance matrices X=〈εxεxT〉, Y=〈εyεyT〉, and Z=〈εzεzT〉, but we assume no cross-correlation components; i.e.
〈εxεyT〉=〈εxεzT〉=〈εzεyT〉=0. We may allow that the error is correlated with the
physical property t; e.g. 〈tεxT〉≠0. Besides this, no assumptions are made about the particular shape of
error distribution functions. In the present paper we only estimate
the random uncertainties. In practice we remove biases in each subset
of collocations where G3CH is to be applied by subtracting the subset
mean of each of the three data sets prior to the analysis. Thus, in the
following derivation we can assume that all data are bias-free. In
the absence of bias, the covariance matrices of each subtraction pair
can be written as
〈x-yx-yT〉=〈xxT+yyT-xyT-yxT〉,〈x-zx-zT〉=〈xxT+zzT-xzT-zxT〉,〈y-zy-zT〉=〈yyT+zzT-yzT-zyT〉.
Expanding the right-hand side of, for instance the first line of
Eq. () we obtain
〈x-yx-yT〉=〈εxεxT-εxεyT-εyεxT+εyεyT〉.
If we keep in mind that error cross correlations
between data sets are set to zero,
the three subtraction pair covariances reduce to
〈x-yx-yT〉=〈εxεxT+εyεyT〉,〈x-zx-zT〉=〈εxεxT+εzεzT〉,〈y-zy-zT〉=〈εyεyT+εzεzT〉.
Finally, by solving these three equations for the error covariance matrices X=〈εxεxT〉, Y=〈εyεyT〉, and Z=〈εzεzT〉 for the variables x,y,
and z, we get
X=〈εxεxT〉=12〈x-yx-yT+x-zx-zT-z-yz-yT〉,Y=〈εyεyT〉=12〈y-xy-xT+y-zy-zT-x-zx-zT〉,Z=〈εzεzT〉=12〈z-xz-xT+z-yz-yT-x-yx-yT〉.
The above G3CH model is applied to the three data sets described in
Sect. . In this analysis the mean is subtracted from each data
set prior to applying the G3CH. The biases are not the focus
here, but for reference the global means of RS92 and RO refractivity
differences to ERA5 for all collocations used in the analysis are
plotted in Fig. .
Handling collocation uncertainty
In order to compensate for the impact of collocation uncertainty εC on the
obtained refractivity error covariance matrices, the G3CH analysis is
applied to a series of data subsets with increasing collocation
distances between 50 and 300 km. The collocation uncertainty is removed
from the uncertainty estimates by extrapolating the covariance matrices to
zero collocation distances. This procedure, which is also performed by
in another
context, also allows one to track how the G3CH method partitions the
collocation uncertainty among the three data sets (see Sect. ).
Error correlations between data sets
The 3CH algorithm cannot distinguish between true physical variability
and mutual positive error correlations
. In cases where errors of
two data sets (x and y) are positively correlated, the
discrepancy between the third data set, z and (x,y) will
be attributed as an uncertainty of z, because the term 〈(x-y)(x-y)T〉 would be reduced in such cases.
In this study the measurement error cross correlations between the
chosen data sets are assumed to be negligible, since the three data
sets at hand are obtained by completely independent techniques. In
particular the ERA5 model forecast data are chosen such that no
information from either a given RO or an RS92 profile can have been
passed to the forecast being used in a given collocation triplet.
However, if two data sets have similar vertical footprints, or if
they are sampled at similar horizontal scales, these two
data sets may have cross-correlated errors, and possibly biases. All
biases are removed prior to application of G3CH, but the error cross
correlations introduced by finite vertical footprints or similar
horizontal scale may influence the result of G3CH.
Global biases of RS92 and RO refractivity,
with ERA5 used as reference, based on all collocation triplets
(collocated ERA5, RO, and RS profiles) used in this study,
evaluated at the RO reference location. See also Sect. .
Percentages are calculated relative to the mean of ERA5 forecasts.
Handling differences in vertical footprints
The three data sets differ in their vertical footprints. The RS92
radiosonde has a vertical footprint of around 50 m
. This vertical footprint is
increased through the interpolation to the common grid , and through
the procedure for correcting for collocation error. The RO refractivity has been shown to have a vertical footprint
of about 200 m under optimal conditions in the lower troposphere
. In the RO data used
here, the processing has removed some small-scale information, and thus the
RO vertical footprint is expected to be larger than 200 m. In
Fig. two examples of refractivities of
triple collocations are shown. The plots illustrate the ability of
resolving vertical structures in the middle troposphere and lower
stratosphere of the three data sets. Even though the highly resolved
ERA5 has 137 vertical levels, shown on the right vertical axis, it
provides a somewhat smoother representation of the vertical
structures compared to the RO. The RO profiles and
RS92 profiles show more vertical structure than ERA5.
Refractivity of two selected triple collocations
exemplifying differences in vertical footprint. (a) ERA5, RS92,
and COSMIC-1 RO (40.1∘ N, collocation distance: 18.6 km (0.2 h)) and (b) ERA5, RS92, and Metop RO (52∘ N, collocation
distance: 25.5 km (1.6 h)). RS92 and ERA5 have been interpolated to
the 247 RO height levels. The refractivity has been
normalized by division with the mean of the ERA5 data set
refractivity. The thinned refractivity levels and the 137 ERA5
model levels are printed on the right vertical axis.
Uncertainty estimates for any vertically represented variable must
refer to a specified vertical footprint to be meaningful. Thus, the
G3CH analysis has to be accompanied by an assessment of the vertical
footprints of the data sets. In our approach the data set with the
largest vertical footprint determines the common vertical footprint to
be used for all three data sets. In other words, if one of the
data sets does not contain information below a certain length scale,
there is not enough information in the data triplet to apply the G3CH
method to estimate uncertainties related to variability below that
length scale.
Because ERA5 is missing some fine-scale physical features, seen in the
better resolved RO and RS92 data set, we are forced to state the
uncertainty on the common scale determined by ERA5. This means that
the RO and RS92 data must be smoothed to match the ERA5 vertical footprint
prior to the G3CH analysis. If this smoothing is omitted, the G3CH may
give a biased estimate of uncertainties. By smoothing the data sets to
a common scale we remove both the physical features and the errors on scales
shorter than the common vertical footprint. Therefore the estimated
uncertainties of RO and RS92, which are correct on the common
scale found, may be viewed as lower uncertainty boundaries for these
variables on their native scales. The vertical footprints of the
three data sets are examined in Sect. .
Results
In this section the G3CH results are presented, first as raw
unfiltered uncertainty estimates, then with corrections for
collocation mismatch (εC terms) and corrections for cross
correlations due to finite vertical footprints (εX terms), to
assess the uncertainty limits for each data set.
Raw uncertainty estimates
Figure shows the raw estimates of the mid-latitude
refractivity uncertainty expressed as error standard deviation of the
three data sets, obtained by applying the G3CH directly to the raw
data sets. Generally the G3CH attributes a big part of the collocation
error (εxC) to the RS92 uncertainty. The reason is that the
collocation is performed by interpolating ERA5 to the RO reference
point, such that ERA5 and RO are closely collocated, while RS92 is chosen such that it is within 300 km of the RO reference
point; thus, naturally RS92 will stand out from the two other data sets
in many cases.
Raw estimate of refractivity random
uncertainty (standard deviation) of ERA5, RO, and RS92 at middle
latitudes.
Collocation uncertainty
The most striking feature in Fig. is the bulge of
RS92 around the tropopause. The main part of this bulge is removed
along with the collocation uncertainty by the procedure described in
Sect. . We calculate the G3CH estimates
of covariance matrices for a sequence of collocation criteria (between
50 and 300 km) and use these to extrapolate all covariance matrices
to 0 km collocation distance, with a linear fit to the full covariance
matrices as a function of the squared collocation distance. The
effect of varying the temporal collocation window is small, and therefore we
have excluded that from the analysis.
The impact on RS92 of changing
collocation distance is shown as an example in Figs. and where a few examples of extrapolations are
presented. The result of this procedure is summarized for all three data
sets in Fig. . The RS92 uncertainty estimate is reduced
considerably, while the uncertainty estimates for the two other data
sets are slightly changed.
In the subsequent analysis the 0 km estimates of covariances are used
for evaluation of covariance matrices and difference terms in the G3CH
equations.
Estimate of refractivity uncertainty
(standard deviation as percent of ERA5 mean refractivity) of RS92
for a series of collocation criteria. The dashed black line shows the
STDV obtained by extrapolation of the variance to zero collocation
distance.
Examples of extrapolation of refractivity
error variance of RS92, i.e. diagonal elements of covariance
matrix, to zero collocation criterion at five different
altitudes. The variances are divided by the mean ERA5 refractivity.
3CH estimates of refractivity error
standard deviations are shown for 300 and 0 km collocation
criterion for each of the ERA5, RS92, and RO data sets at low (a), middle (b), and high (c) latitudes. Smoothing has not been
applied.
Vertical filtering
In Fig. the impact of smoothing on error
standard deviations estimated with G3CH (Eq. ) is shown
for middle latitudes. The smoothing is applied as a sequence of
Gaussian filters of increasing widths. For each data set, filtering has
been performed, not on the data set itself, but on the two other
complementary data sets (see figure legends). The idea is
basically to probe the vertical footprint of one data set with two
other data sets of varying vertical footprint. The G3CH analysis was performed at the sequence of such prepared triplets of data sets
with increasing filter width. The impact of applying sequences of
Gaussian filters is best viewed near the tropopause. We note that all
variances eventually start to grow at some filter width, but the ERA5
error standard deviation drops in most cases at small filter widths,
and does not start to increase until the width of the filter, applied
on the RO and RS92 data, exceeds a certain threshold. We interpret
this threshold as the ERA5 vertical footprint.
The ERA5 vertical footprint was estimated for each altitude, as the minimum of a second-order polynomial, fitted to σERA5 as a function
of filter width. These vertical footprints are plotted in
Fig. , for middle and high latitudes. At low
latitudes the result is unstable, and therefore that plot has been omitted. We use these results to identify a common ERA5 vertical footprint to be applied
globally as the mean of the middle- and high-latitude vertical footprints, shown
as a dashed line in Fig. .
Effect of smoothing on error
standard deviations of ERA5 (a), RO (b), and RS92 (c) estimated
by 3CH. Note that the x axis is shifted by 0.2 % in panel (c). All plots are for middle latitudes. The standard
deviations are divided by the mean ERA5 refractivity. The
filtering width is defined as 2 times the standard deviation of
the Gaussian filter function. The single curves are most easily
identified at a given altitude by counting the curves from one
side. For each data set, the smoothing is only applied on the two
other data sets.
Estimates of the ERA5 vertical footprint
for middle and high latitudes. The effect of filtering of RO and
RS92 on the estimated ERA5 error standard deviation,
σERA5, is shown in
Fig. a. The ERA5 vertical footprint is found for
each altitude as the filter width which minimizes
σERA5.
A similar analysis cannot be performed for the RO or RS92 data sets,
because these appear to have small vertical footprints which happen to lie
close to each other. There is not a finite filter length which
minimizes the refractivity error standard deviation for RO and RS92
(the filters being applied to the complementary data sets in each
case). Therefore the RO and RS92 vertical footprints cannot be inferred from
these three data sets alone, but it can be concluded that their
vertical footprints are smaller than the ERA5 vertical footprint since the ERA5 estimated
error standard deviation decreases if either the RO or RS92 is
smoothed. This is illustrated in Fig. : The impact of
smoothing RS92 on the ERA5 variance is shown in Fig. a.
Generally σERA52 decreases as the RS92 data are
brought closer to the ERA5 data by smoothing, consistent with RS92
having a smaller vertical footprint than ERA5. The RO error variance,
σRO2, on the other hand increases as a result of
smoothing the RS92 data (see Fig. b). This is consistent
with the RS92 vertical footprint being close to the RO vertical footprint and the RS92 data
moving closer to the ERA5 data as smoothing is applied to RS92. In
Fig. c σERA52 is seen to decrease as
the RO refractivity is brought closer to ERA5 refractivity, as
smoothing is applied to RO.
Estimates of middle-latitude refractivity
error variances based on smoothed data divided by refractivity
error variance based on un-smoothed data; (a) ERA5 error variance
with smoothed RS92, (b) RO error variance with smoothed RS92, and
(c) ERA5 error variance with smoothed RO. The legends show the
width of the different vertical Gaussian filters applied to the
refractivity profiles mentioned in the legend title; e.g.
σERA5/σERA5,0 in panel (a) means uncertainty estimate of ERA5, given filtering of
the data set mentioned in the legend – in this case RS92,
divided by the uncertainty estimate of ERA5, obtained without
filtering. A total of 9722 collocated data triplets were used in the G3CH
analysis for these error covariance estimates.
To estimate the final G3CH uncertainties with reference to the common
vertical footprint determined by ERA5, all three raw data sets were
smoothed with a Gaussian filter with the width of the ERA5 vertical footprint
prior to the G3CH analysis. In Fig. the final G3CH
inferred uncertainties are shown for each data set for low, middle,
and high latitudes. For all data sets the unfiltered uncertainty
is also plotted, for later discussion.
Best estimate of refractivity
uncertainties, shown as standard deviations in percent of the
ERA5 refractivity for ERA5, RO, and RS92 at low (a), middle (b),
and high (c) latitudes. For all data sets the uncertainty is
given with reference to the ERA5 vertical footprint, which is achieved by
filtering all data sets to match the ERA5 vertical footprint (thick
curves). The uncertainties based on un-smoothed
data are also shown (thin curves). The standard deviation estimates found have been vertically smoothed with a
10 grid points box filter.
Error covariances
In Figs. and the G3CH-based error
covariance matrices for ERA5, RO, and RS92 are shown for rising and
setting occultations for middle and high latitudes. These matrices have been
calculated without any vertical filtering applied. The tropics are not shown
because of insufficient number of data in that region. The fine-scale
off-diagonal structures must be attributed to statistical noise, but
there are certainly larger-scale vertical correlation structures especially in the
RO and RS92 data.
Generally the vertical correlations are divided into two separable
regimes: Close to the diagonal we see a short-range correlation with
standard deviation of approximately 0.5 km, and a long-range
correlation component of varying shape and amplitude. The short-scale
vertical correlations are very similar for all data sets. Rising
occultations are found to have larger vertical error correlations (and
slightly larger standard deviation where correlations are broader)
than setting occultations in this data set, which is seen when
comparing panel (b) with panel (e) in Fig. and panel (b)
with panel (e) in Fig. . This is believed to be due to
the ionospheric correction in the RO processing for rising
occultations, where the L2 GPS signal is often not available below 20 km, and extrapolation from above is necessary. In the CDR v1.0 data
set, it is in particular the rising occultations for Metop after
instrument firmware upgrades in 2013 that suffer from missing L2 data
below 20 km , and consequently
there are broader vertical error correlations in the retrieved
refractivity profiles for Metop after 2013 (not shown).
G3CH estimate of ERA5 (a, d), RO (b, e), and RS92 (c, f) refractivity vertical error covariance
matrices at middle latitude. Rising occultations (a–c) and
setting occultations (d–f). The covariance matrices are plotted
as correlation matrices (diagonal =1) with superimposed standard
deviation as a function of height (black line), plotted on the left
and upper axes. The standard deviation is given as percent of the
mean ERA5 forecast refractivity. The covariance matrices have
been truncated at 30 km where the RS92 data sparseness starts to
destabilize the results. A total of 9722 collocated profile triplets were
used for these covariance estimates. For these estimates no
smoothing was applied on any of the data sets.
Same as in Fig. , but for high latitudes.
It is worth noting that the estimated vertical correlations of RS92
are larger for setting than for rising RO at high latitudes,
especially between 6 and 22 km. Thus the G3CH fails to give an
independent estimate of the RS92 correlations. The RS92 is expected
to have long-ranging vertical correlations due to corrections
implemented in the GRUAN processing, but the G3CH fails to attribute
these correctly when strong long-range correlations are also present
in the RO data. The estimated RS92 diagonals (standard deviations
superimposed vertically on correlation matrices) seem reasonably
consistent for rising and setting occultations.
The relative magnitude of the off-diagonal covariance components can
also be viewed in Fig. : Here the vertical error
correlation function of RO refractivity is exemplified for two
heights, approximately 5 and 20 km, at low, middle, and high
latitudes. The correlation functions are slices of the RO refractivity
error correlation matrix at these altitudes. For instance at high
latitude there are pronounced long-range correlations at these two
heights. In the tropics the data are too sparse to get an estimate of
the correlation function. In Fig. the dashed curves
show the three kilometre exponential correlation which is assumed in the
current ROM SAF 1D-Var analysis . Given that the
finer correlation structures, around the 1 km scale, are influenced by
sparseness of data, the current correlation function appears to
be reasonably adequate at high latitudes at the selected altitudes. At
middle latitudes there is a potential for decreasing the error
correlation length in future applications.
Estimate of RO vertical error
correlations at approximately 5 and 20 km for low, middle, and high
latitudes (full curves). The dashed lines show exponential
correlation with three kilometre decay length. No smoothing was
applied on any of the data sets before the G3CH was
applied.
Discussion
It is evident from the results in Sect. that
uncertainty estimates for a data set must be stated with reference to
scale in space and time. We choose for all data sets to report the
estimated uncertainty boundaries with reference to the estimated
vertical footprint of the ERA5 data set. We identified the vertical
footprint of ERA5 for a range of altitudes, and used these to find RO
uncertainties in Fig. .
The unfiltered G3CH RO uncertainty estimates, also seen in
Fig. , include uncertainty associated with
fluctuations on shorter scales than the ERA5 vertical footprint. The
unfiltered uncertainties are overestimated because they will include
physical variability, falsely attributed as errors. We cannot quantify
the native vertical footprints of RO and RS92, but it can be assumed
that the unfiltered uncertainty estimates mark upper boundaries for their
uncertainties evaluated with reference to the native vertical
footprint.
The estimated uncertainties of RO in the lower stratosphere
are only slightly above 0.2 %, the theoretical estimates found in
.
Empirical uncertainty estimates have previously been analysed, for
instance by
and . In the
empirical uncertainty analysis study by
, the
refractivity uncertainty is found to be 0.35 % in the
lower stratosphere. used RO data in combination with ECMWF analysis, and were therefore
confined to make assumptions about mutual correlation and partitioning
of uncertainties among the two data sets. By combining three
independent data sets in a 3CH analysis such assumptions may be
avoided, and consequently one is able to decrease the
uncertainty estimates.
The RO uncertainty in the Upper Troposphere–Lower Stratosphere (UTLS),
where the uncertainty estimates are at a minimum, does not vary much
with latitude, except for a small increase at the tropopause. The
structure of the uncertainty profiles is quite similar for all
latitudes. The increase of uncertainty in the troposphere is smaller
at high latitude, but the crossover between high and low uncertainty
happens at approximately the same altitude, 5–7 km, for all
latitudes. For the tropics, the noisy uncertainty estimates, due to an
insufficient number of data, do not allow us to safely read a minimum
above the tropopause from Fig. , but there is an
indication of uncertainty almost down to 0.2 % even between
7 and 10 km, well below the tropical tropopause layer.
In the 3CH is applied to
multiple triplets of RO data and different model refractivity
data. Overall their uncertainty estimates are much higher, for
instance 0.55 % in the UTLS. This may be partly due to the high noise
level of the analysed RO data (COSMIC-1 and COSMIC-2). But the data
sets used by are less well
suited to 3CH analysis than the data sets used here, because the
errors of the ERA5 analysis fields used must be expected to be
correlated with errors of other data sets, leading to a biased
estimate of 3CH uncertainties. The models applied have larger vertical
footprint than the RS92 radiosonde data used here, and this will tend
to yield an overestimate of RO uncertainties, which is indeed seen in
. Their results show
apparent uncertainty maxima near the tropopause, as must be
anticipated according to our analysis of the impact of vertical footprints
in Sect. .
In the theoretical uncertainty analysis study by
the refractivity
uncertainty in the lower troposphere tangents is 1 %, which is a little
lower than the G3CH estimate at middle latitudes, but quite consistent
with the G3CH estimate at high latitudes. The
estimate is dominated by
the contribution from representativeness uncertainty arising from
horizontal gradients along the ray path up to 30 km, i.e. the
contribution from the spherical symmetry assumption to the
uncertainty. Retrospectively seen,
may have underestimated
the horizontal gradients in the troposphere since the effect of
horizontal variability was estimated from a model with 40 km
horizontal resolution. In a later study by
, even lower refractivity
uncertainty estimates are obtained, with a coarser model (60 km
resolution and 50 vertical levels).
find down to 0.1 % refractivity
uncertainty in the UTLS.
The estimation of error correlation matrices with G3CH is a novelty
introduced in the present study. The method is able to detect expected
differences between rising and setting RO long-range vertical error
correlations, and long-range correlations are also seen in RS92
data. The vertical correlation estimates have limitations since the
estimate of long-range vertical error correlations of RS92 seems to be
dependent on whether the RO data used represent rising or setting
occultations. However, this inaccuracy seems relatively small compared
to the difference between the RO vertical error correlation
estimates found and the vertical error correlation estimates currently used in RO 1D-Var retrievals, and therefore the correlation estimates will be
useful in this context. Especially at middle latitudes there is a
potential for decreasing the vertical error correlation length in
future applications.
The analysis of G3CH presented here may have consequences for uncertainty
parameterization in retrieval and assimilation of RO refractivity. In
particular the estimates of RO refractivity uncertainty reveal a
potential for deflating tropospheric refractivity uncertainty in the
ROM SAF 1D-Var configuration. A reduction of assumed refractivity
uncertainty is of particular interest in the troposphere, where it can
improve the information content of water vapour retrievals. There is
a need to establish tropospheric water vapour climate data records
for climate research, for instance as expressed in the
objectives of the GEWEX water vapour assessment (G-VAP;
). The results presented here
promise a reduction in the uncertainty of RO-based tropospheric water
vapour retrieval.
Conclusions
The collocation-corrected G3CH random uncertainty estimate provides
full refractivity error covariance matrices for three independent
data sets. The method was applied to collocated refractivity
profiles from ERA5 forecasts, RO, and radiosondes.
The RO refractivity uncertainty is between 0.2 % and 0.6 % in
the UTLS at between 8 and 25 km at middle and high latitudes, and between
0.2 % and 1.4 % below 8 km. The G3CH method presented here
also yields estimates of the vertical error covariance matrices for
refractivity.
The achieved refractivity uncertainty estimates are lower than
empirically determined uncertainties previously reported in the
literature. The results can be used to model uncertainty assumptions
used in NWP data assimilation and in 1D-Var calculations of
atmospheric temperature and specific humidity based on RO refractivity
data.
Code and data availability
The analysis was performed in Jupyter Notebook. The code is kept in an internal git repository at the Danish Meteorological Institute. It is available from the corresponding author upon request.
The RO refractivity profiles and interpolated ERA5 temperature and humidity profiles on model levels (including surface pressure) are available at the ROM SAF web-page https://www.romsaf.org (last access: October 2022; ). The GRUAN atmospheric profiles are available at the GRUAN web-page ().
The supplement related to this article is available online at: https://doi.org/10.5194/amt-15-6243-2022-supplement.
Author contributions
JKN put together (and partly invented) the method, performed the analysis, and is the main author of the paper. SS, KBL, and HG contributed with substantial reformulations of every part of the manuscript, through multiple iterations, which led to changes in the basic rationale of the paper.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Special issue statement
This article is part of the special issue “Analysis of atmospheric water vapour observations and their uncertainties for climate applications (ACP/AMT/ESSD/HESS inter-journal SI)”. It is not associated with a conference.
Acknowledgements
This work was carried out as part of EUMETSAT's Radio Occultation Meteorology Satellite Application Facility (ROM SAF) which is a decentralised operational RO processing center under EUMETSAT.
Johannes K. Nielsen, Hans Gleisner, Stig Syndergaard, and Kent B. Lauritsen are members of the ROM SAF. We wish to thank the reviewers John Eyre, Paul Poli and the third anonymous reviewer for helping to
improve the paper.
Financial support
This research has been supported by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT).
Review statement
This paper was edited by Chunlüe Zhou and reviewed by John Eyre, Paul Poli, and one anonymous referee.
ReferencesBarnes, J.: Atomic Timekeeping and the Statistics of Precision Signal
Generators, Proc. IEEE, 54, 207–220,
10.1109/PROC.1966.4633, 1966.Bormann, N., Bonavita, M., Dragani, R., Eresmaa, R., Matricardi, M., and
McNally, A.: Enhancing the Impact of IASI Observations through an Updated
Observation-Error Covariance Matrix, Q. J. Roy. Meteor. Soc., 142, 1767–1780, 10.1002/qj.2774, 2016.Desroziers, G., Berre, L., Chapnik, B., and Poli, P.: Diagnosis of Observation,
Background and Analysis-Error Statistics in Observation Space, Q. J. Roy. Meteor. Soc., 131, 3385–3396,
10.1256/qj.05.108, 2005.Dirksen, R. J., Sommer, M., Immler, F. J., Hurst, D. F., Kivi, R., and Vömel, H.: Reference quality upper-air measurements: GRUAN data processing for the Vaisala RS92 radiosonde, Atmos. Meas. Tech., 7, 4463–4490, 10.5194/amt-7-4463-2014, 2014.Gleisner, H., Lauritsen, K. B., Nielsen, J. K., and Syndergaard, S.: Evaluation of the 15-year ROM SAF monthly mean GPS radio occultation climate data record, Atmos. Meas. Tech., 13, 3081–3098, 10.5194/amt-13-3081-2020, 2020.Gleisner, H., Bækgaard Lauritsen, K. B., Nielsen, J. K., and Syndergaard, S.: ROM SAF Radio Occultation Climate Data Record v1.0, GRM-29-R1 (2019),
for 2006 to 2016, EUMETSAT [data set],
10.15770/EUM_SAF_GRM_0002, 2019.Gleisner, H., Bækgaard Lauritsen, K. B., Nielsen, J. K., and Syndergaard, S.:
ROM SAF Radio Occultation Interim Climate Data Record v1.0, GRM-29-I1
(2019), for 2017 to 2020, EUMETSAT [data set],
10.15770/EUM_SAF_GRM_0006, 2022.Gray, J. and Allan, D.: A Method for Estimating the Frequency
Stability of an Individual Oscillator, in: 28th Annual Symposium on
Frequency Control, Atlantic City, NJ, USA, 29–31 May 1974, 243–246, 10.1109/FREQ.1974.200027, 1974.
Grubbs, F. E.: On Estimating Precision of Measuring Instruments and
Product Variability, J. Am. Stat. Assoc., 43, 243–264,
1948.Healy, S. B. and Eyre, J. R.: Retrieving Temperature, Water Vapour and Surface
Pressure Information from Refractive-Index Profiles Derived by Radio
Occultation: A Simulation Study, Q. J. Roy. Meteor. Soc., 126, 1661–1683, 10.1002/qj.49712656606, 2000.Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A.,
Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D.,
Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P.,
Biavati, G., Bidlot, J., Bonavita, M., Chiara, G. D., Dahlgren, P., Dee, D.,
Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer,
A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M.,
Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P.,
Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5
Global Reanalysis, Q. J. Roy. Meteor. Soc.,
146, 1999–2049, 10.1002/qj.3803, 2020.Hollingsworth, A. and Lönnberg, P.: The Statistical Structure of
Short-Range Forecast Errors as Determined from Radiosonde Data. Part I:
The Wind Field, Tellus A, 38A, 111–136,
10.1111/j.1600-0870.1986.tb00460.x, 1986.Joint Committee for Guides in Metrology: JCGM 100:2008,
GUM 1995 with minor corrections,
Evaluation of measurement data – Guide to the expression of uncertainty
in measurement,
https://www.bipm.org/documents/20126/2071204/JCGM_100_2008_E.pdf, last access: September 2008.Kursinski, E. R., Hajj, G. A., Schofield, J. T., Linfield, R. P., and Hardy,
K. R.: Observing Earth's Atmosphere with Radio Occultation Measurements
Using the Global Positioning System, J. Geophys. Res.-Atmos., 102, 23429–23465, 10.1029/97JD01569, 1997.Kursinski, E. R., Healy, S. B., and Romans, L. J.: Initial Results of Combining
GPS Occultations with ECMWF Global Analyses within a 1DVar
Framework, Earth Planets Space, 52, 885–892, 10.1186/BF03352301,
2000.Levine, J.: Introduction to Time and Frequency Metrology, Rev. Sci. Instrum., 70, 2567, 10.1063/1.1149844, 1999.Lewis, H.: GRAS SAF Report 08 ROPP Thinner Algorithm, EUMETSAT, Tech. Rep. SAF/GRAS/METO/REP/GSR/008, https://www.romsaf.org/general-documents/gsr/gsr_08.pdf, last access: June 2009.Merchant, C. J., Holl, G., Mittaz, J. P. D., and Woolliams, E. R.: Radiance
Uncertainty Characterisation to Facilitate Climate Data Record
Creation, Remote Sens., 11, 474, 10.3390/rs11050474, 2019.O'Carroll, A. G., Eyre, J. R., and Saunders, R. W.: Three-Way Error
Analysis between AATSR, AMSR-E, and In Situ Sea Surface
Temperature Observations, J. Atmos. Ocean. Tech.,
25, 1197–1207, 10.1175/2007JTECHO542.1, 2008.Rieckh, T. and Anthes, R.: Evaluating two methods of estimating error variances using simulated data sets with known errors, Atmos. Meas. Tech., 11, 4309–4325, 10.5194/amt-11-4309-2018, 2018.Rieckh, T., Sjoberg, J. P., and Anthes, R. A.: The Three-Cornered Hat
Method for Estimating Error Variances of Three or More
Atmospheric Datasets. Part II: Evaluating Radio Occultation and
Radiosonde Observations, Global Model Forecasts, and Reanalyses,
J. Atmos. Ocean. Tech., 38, 1777–1796,
10.1175/JTECH-D-20-0209.1, 2021.ROM SAF: The Radio Occultation Processing Package (ROPP) Forward
Model Module User Guide, EUMETSAT ROM
SAF, Tech. Rep. SAF/ROM/METO/UG/ROPP/006, https://www.romsaf.org/romsaf_ropp_ug_fm.pdf, last access: December 2021a.ROM SAF: Algorithm Theoretical Baseline Document: Level 2B and 2C
1D-Var Products, Version 4.3, EUMETSAT ROM SAF, Tech. Rep. SAF-/ROM-/DMI-/ALG-/1DV-/002,
http://www.romsaf.org (last access: October 2021),
2021b.Scherllin-Pirscher, B., Steiner, A. K., Kirchengast, G., Kuo, Y.-H., and Foelsche, U.: Empirical analysis and modeling of errors of atmospheric profiles from GPS radio occultation, Atmos. Meas. Tech., 4, 1875–1890, 10.5194/amt-4-1875-2011, 2011.Schröder, M., Lockhoff, M., Fell, F., Forsythe, J., Trent, T., Bennartz, R., Borbas, E., Bosilovich, M. G., Castelli, E., Hersbach, H., Kachi, M., Kobayashi, S., Kursinski, E. R., Loyola, D., Mears, C., Preusker, R., Rossow, W. B., and Saha, S.: The GEWEX Water Vapor Assessment archive of water vapour products from satellite observations and reanalyses, Earth Syst. Sci. Data, 10, 1093–1117, 10.5194/essd-10-1093-2018, 2018.Semane, N., Anthes, R., Sjoberg, J., Healy, S., and Ruston, B.: Comparison of
Desroziers and Three-Cornered Hat Methods for Estimating COSMIC-2
Bending Angle Uncertainties, J. Atmos. Ocean. Tech.,
39, 929–939, 10.1175/JTECH-D-21-0175.1, 2022.
Sjoberg, J. P., Anthes, R. A., and Rieckh, T.: The Three-Cornered Hat
Method for Estimating Error Variances of Three or More
Atmospheric Data Sets – Part I: Overview and
Evaluation, J. Atmos. Ocean. Tech., 38, 555–572,
10.1175/JTECH-D-19-0217.1, 2020.Sommer, M., Dirksen, R., and Immler, F.: RS92 GRUAN Data Product Version 2 (RS92-GDP.2), GRUAN [data set], 10.5676/GRUAN/RS92-GDP.2, 2012.Steiner, A. K. and Kirchengast, G.: Error Analysis for GNSS Radio
Occultation Data Based on Ensembles of Profiles from End-to-End Simulations,
J. Geophys. Res.-Atmos., 110, D15307,
10.1029/2004JD005251, 2005.Steiner, A. K., Ladstädter, F., Ao, C. O., Gleisner, H., Ho, S.-P., Hunt, D., Schmidt, T., Foelsche, U., Kirchengast, G., Kuo, Y.-H., Lauritsen, K. B., Mannucci, A. J., Nielsen, J. K., Schreiner, W., Schwärz, M., Sokolovskiy, S., Syndergaard, S., and Wickert, J.: Consistency and structural uncertainty of multi-mission GPS radio occultation records, Atmos. Meas. Tech., 13, 2547–2575, 10.5194/amt-13-2547-2020, 2020.Syndergaard, S., Kursinski, E. R., Herman, B. M., Lane, E. M., and Flittner,
D. E.: A Refractive Index Mapping Operator for Assimilation of
Occultation Data, Mon. Weather Rev., 133, 2650–2668,
10.1175/MWR3001.1, 2005.Xie, F., Wu, D. L., Ao, C. O., Mannucci, A. J., and Kursinski, E. R.: Advances and limitations of atmospheric boundary layer observations with GPS occultation over southeast Pacific Ocean, Atmos. Chem. Phys., 12, 903–918, 10.5194/acp-12-903-2012, 2012.