An extensive validation of line-of-sight tropospheric slant total
delays (STD) from Global Navigation Satellite Systems (GNSS), ray tracing in
numerical weather prediction model (NWM) fields
and microwave water vapour radiometer (WVR) is presented. Ten GNSS reference
stations, including collocated sites, and almost 2 months of data from 2013,
including severe weather events, were used for comparison. Seven institutions
delivered their STDs based on GNSS observations processed using 5 software
programs and 11 strategies, making it possible to compare rather different solutions and
to assess the impact of several aspects of the processing strategy. STDs from
NWM ray tracing came from three institutions using three different NWMs and
ray-tracing software. Inter-techniques evaluations demonstrated a good mutual
agreement of various GNSS STD solutions compared to NWM and WVR STDs. The
mean bias among GNSS solutions not considering post-fit residuals in STDs was

Tropospheric slant total delay (STD) represents the total delay that the GNSS radio signal undergoes due to the neutral atmosphere along the path from a satellite to a ground receiver antenna. This total delay can be separated into the hydrostatic part, caused by the dry atmospheric constituents, and the wet part, caused specifically by water vapour. By quantifying the total delay, and by separating the hydrostatic and wet parts, it is possible to retrieve the amount of water vapour in the atmosphere along the path followed by the GNSS signal.

During the processing of GNSS observations only the total delay in the
zenith direction (zenith total delay, ZTD) above the GNSS antenna can be
estimated for each epoch or for a time interval. ZTDs from GNSS reference
stations have been operationally assimilated into numerical weather prediction models (NWMs)
for almost a decade (Bennitt and Jupp, 2012; Mahfouf et al., 2015). In
Europe, this activity is coordinated mainly in the framework of the EUMETNET
EIG GNSS Water Vapour Programme (E-GVAP, 2005–2017, phases I–III,

Validation of GNSS slant delays with independent measurements is not a new research topic. GNSS slant delays were validated against water vapour radiometer (WVR) measurements in Braun et al. (2001, 2002) and Gradinarsky (2002). First attempts to derive slant delays from NWM fields and to compare them with GNSS STDs were carried out by De Haan et al. (2002) and Ha et al. (2002). Additional efforts to evaluate GNSS slant delays using WVR and NWM data were made at GFZ Potsdam over the last few years. Bender et al. (2008) showed a high correlation among the three sources (GPS, WVR, NWM) of slant wet delays (SWDs) and tried to quantify the effect of removing multipath from GPS post-fit residuals using a stacking method, which was also done by Kačmařík et al. (2012). Deng et al. (2011) validated tropospheric slant path delays derived from single- and dual-frequency GPS receivers with NWM and WVR data. Shangguan et al. (2015) compared GPS versus WVR slant IWV values (SIWVs) using a 184-day dataset. They also analysed the influence of the elevation angle setting and the meteorological parameters (used for the conversion to IWV) on the comparison results. More recently, a validation of multi-GNSS slant total delays retrieved in real time from GPS, GLONASS, Galileo and BeiDou was presented by Li et al. (2015a) using WVR and NWM as independent techniques for the assessment. Using multiple GNSS constellations brought a visible advantage, in terms of not only the number of available slants but also their higher accuracy and robustness.

Nevertheless, most of the studies presented thus far were limited to only a single strategy for obtaining GNSS STDs and usually restricted to a limited set of stations and/or a relatively short time period. The main purpose of this study is an extensive comparison of various solutions from GNSS processing, NWM ray tracing and WVR measurements using one common dataset as well as a comparison of results from collocated stations. The GNSS solutions evaluated in this work used 5 different software programs and 11 strategies and exploited the GNSS4SWEC benchmark dataset (Douša et al., 2016). Then, the paper studies the impact of various approaches on STD estimates and aims to find the most suitable strategy for estimating the GNSS-based STDs.

Section 2 briefly introduces the validation study dataset, and Sect. 3 describes the process of retrieving GNSS STDs including an overview of the different GNSS solutions. Section 4 provides a description of STDs generated from NWMs, and Sect. 5 summarizes WVR principles and WVR-based STD solutions. Section 6 introduces the methodology used in the validation of STDs, and Sects. 7 and 8 study the results achieved at single GNSS reference stations and at closely collocated stations, respectively.

The presented work has been carried out in the context of the EU COST Action
ES1206 “Advanced Global Navigation Satellite Systems tropospheric products
for monitoring severe weather events and climate” (GNSS4SWEC;

From the complete benchmark dataset, we selected a subset of 10 GNSS reference stations situated at six different locations (Table 1). The selection was based on the following requirements: (1) long-term quality of observations and its stability, (2) availability of another GNSS reference station in the site vicinity, (3) availability of another instrument capable of STD measurements in the site vicinity and (4) the location of the station with respect to its altitude and the weather events which occurred during the evaluation period. The subset also includes collocated (dual) GNSS stations that played an important role in the validation. The collocated stations observed GNSS satellites with the same azimuth and elevation angles, so that they should theoretically deliver the same or very similar tropospheric parameters – ZTD, linear horizontal gradients and slant delays. Post-fit residuals of carrier-phase observations at the collocated stations should represent common effects due to the local tropospheric anisotropy, while systematic differences could remain due to instrumentation and environmental effects such as antenna and receiver characteristics and multipath. Only STDs from the WVR at Potsdam, collocated with the GNSS stations POTM and POTS, were available for this study because the second WVR, located at Lindenberg and collocated with the GNSS stations LDB0 and LDB2, was operated only in the zenith direction during the period of the study.

Characteristics of 10 GNSS reference stations.

The STD cannot be estimated directly from
GNSS data since the total number of unknown parameters in the solution would
be higher than the number of observations. Instead, the total delays in the
zenith direction above the GNSS station (i.e. ZTD) are adjusted together
with, optionally, total tropospheric linear horizontal gradients (

In practice, the ZTD is decomposed into an a priori model, usually by
introducing the zenith hydrostatic delay (ZHD; see Saastamoinen, 1972), and
the estimated corrections, representing (mainly) the zenith wet delay (ZWD).
Similarly, the STD is decomposed into the ZHD, ZWD,

The first-order horizontally asymmetric delay

Additionally, post-fit residuals RES may contain un-modelled tropospheric
effects not covered by the estimated tropospheric parameters. Such remaining
effects are supposed to be caused mainly by higher spatial and temporal
variations of the humidity or its significant horizontal asymmetry in the
troposphere. Obviously, residuals also contain other un-modelled effects such
as multipath, errors in antenna-phase centre variations or satellite clocks.
To eliminate such systematic effects, cleaning of post-fit residuals is
applied by generating elevation- and azimuth-dependent correction maps as
described by Shoji et al. (2004). For each solution and each station, we thus
computed mean values of post-fit residuals in 1

For the analysis of GNSS L1 and L2 carrier-phase observations, the
least-squares adjustment or Kalman-filter approach was applied to estimate
the ZWDs and the two horizontal gradient components

Information about individual GNSS-based STD solutions used in the validation.

GOP delivered two solutions based on the Precise Point Positioning (PPP) technique (Zumberge et al., 1997) and using the in-house developed application Tefnut (Douša and Václavovic, 2014) derived from the G-Nut core library (Václavovic et al., 2013). Considering all available GNSS solutions, only GOP used a stochastic modelling approach to estimate all parameters. Additionally, GOP provided two solutions: (1) GOP_F using Kalman filter (forward filter only), i.e. capable of providing ZTD, tropospheric gradients and STDs in real time; and (2) GOP_S applying the backward smoothing algorithm (Václavovic and Douša, 2015) on top of the Kalman filter in order to improve the quality of all estimated parameters during the batch-processing interval and to avoid effects such as the PPP convergence or re-convergence.

Some institutions also delivered two STD solutions which differ in a single
processing setting. The aim was to evaluate their impact on STDs: (a) TUO_G
and TUO_R exploit GPS-only and
GPS

In total, we validated 11 solutions computed with five different GNSS processing software packages. Five of the solutions used GPS and GLONASS observations and six solutions used GPS-only observations; five of them are based on DD observations and six of them are computed using zero-difference data in PPP analysis. More information about TUW solutions can be found in Möller et al. (2016), about GFZ in Bender et al. (2009, 2011) and Deng et al. (2011) and about CNAM in Morel et al. (2014). For ROB, TUO and WUE solutions we refer the reader to Dach et al. (2015).

Simulating STDs in NWMs consists of integrating the atmospheric
refractivity along the path followed by GNSS signals. STDs have been
simulated using three different NWMs: ALADIN-CZ (4.7 km resolution
limited-area hydrostatic model, operational analysis in 6 h interval with
forecasts for 0, 1, 2, 3, 5, 6 h;

The ERA-Interim and NCEP-GFS STD solutions by GFZ are based on “assembled”
STDs. At first, for the considered station and epoch, a set of ray-traced
STDs (various elevation and azimuth angles) is computed using the technique
described in Zus et al. (2014). Secondly, from this set of ray-traced STDs,
the tropospheric parameters (i.e. zenith delays, mapping function
coefficients, first- and higher-order gradient components) are determined.
Finally, for the required azimuth and elevation angle the STD is “assembled”
using the tropospheric parameters. For a detailed description of the
tropospheric parameter determination the reader is referred to Douša et al. (2016). The
differences between the “assembled” and ray-traced STDs are
sufficiently small in particular for elevation angles above 10

To compute STDs from ALADIN-CZ, a simplified strategy has been used to model
the curved path followed by GNSS signals through the neutral atmosphere, as
suggested by Saastamoinen (1972). The delays simulated with this strategy
show small differences in comparison to straight-line simulations
(differences of about 4, 5 and 10 mm, respectively, at 15,
10 and 5

For each latitude–longitude grid point and each level of ALADIN-CZ model,
the NWM outputs considered to compute STDs are geopotential height
(geopotH in m), pressure (

Using the hypsometric equation, the ground pressure and the pressure of each level are considered to estimate the altitude for the different levels. In total ALADIN-CZ outputs provide 87 levels up to an altitude of about 55 km. However, to assess STDs from ALADIN-CZ, the integration was stopped at 15 km since the contribution of water vapour above this altitude is negligible. An adaptive step is considered (100, 200, 250, 500, 1000 m, respectively, for vertical altitudes between 0 and 1, 1 and 3, 3 and 5, 5 and 10, and 10 and 15 km). Bi-linear interpolations of ALADIN-CZ parameters at the altitude of the GNSS station and for each step of the integration were performed. Note that there is no station selected for the validation located below the first layer of ALADIN-CZ.
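The adaptive vertical stepping described above can be illustrated with a minimal sketch. The layer boundaries and step sizes are taken from the text; the function name, the use of metres, and the assumption that the altitudes start at 0 m are illustrative choices, not part of the original processing.

```python
# Layer boundaries and step sizes (metres) as listed in the text.
# The table layout and the helper name are hypothetical, for illustration only.
ADAPTIVE_STEPS_M = [
    (0, 1000, 100),       # 0-1 km in 100 m steps
    (1000, 3000, 200),    # 1-3 km in 200 m steps
    (3000, 5000, 250),    # 3-5 km in 250 m steps
    (5000, 10000, 500),   # 5-10 km in 500 m steps
    (10000, 15000, 1000), # 10-15 km in 1000 m steps
]


def integration_altitudes(steps=ADAPTIVE_STEPS_M):
    """Return the altitudes (m) that bound each integration step."""
    altitudes = [steps[0][0]]
    for bottom, top, step in steps:
        z = bottom
        while z < top:
            z += step
            altitudes.append(z)
    return altitudes


levels = integration_altitudes()
print(len(levels) - 1, "integration steps up to", levels[-1], "m")
```

With these settings the integration uses far fewer steps than a uniform 100 m grid up to 15 km would, while keeping the finest sampling in the lowest kilometre.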

The expression of simulated STDs from ALADIN-CZ is the summation of these
four contributions:

STD

The ray-traced tropospheric delays for WUELS' solution are based on
piece-wise bent-2-D model propagation. In contrast to straight-line models, the exact
trajectory is therefore not known in advance and must be
solved iteratively based on the refractive index of the preceding ray piece. Similar examples
are given by Böhm and Schuh (2003) and Hobiger et al. (2008). We assume
the ray path does not leave the plane of constant azimuth for a given
elevation angle to a satellite. The out-of-plane contribution to the delay
is thus neglected, making the propagation two-dimensional (hence 2-D). The real
ray path is then approximated by a finite number of linear ray pieces in
WGS84 coordinates using Euler's formula for the Earth radius:

ALADIN-CZ NWM has been used to estimate the hydrostatic, wet and hydrometeor
contributions to slant delays. During the whole period of the benchmark
campaign, the maximum contribution of hydrometeors reached 17 mm at the
zenith during the extreme weather events on 20–23 June (Douša et al.,
2016). The 2-D fields of ZTD, ZHD, ZWD and ZHMD (zenith hydrometeor delays)
are presented in Fig. 1. They illustrate the
large-scale convection with the presence of hydrometeors along the
convergence line associated with a strong contrast of dry and wet air
masses. The contribution of hydrometeors to ZTD reached up to 7 mm (as
scaled in the zenith direction) for the stations POTS and POTM at 15:00 UTC
on 23 June 2013 (see Fig. 1d). According to
satellite trajectories at this time for the station POTS, a maximum SHMD of
25.6 mm is observed for a satellite at 22

Simulation of ZTD, ZHD, ZWD and ZHMD at 15:00 UTC on 23 June 2013. Each black dot represents a GNSS station included in the benchmark dataset. For stations included in this STD validation study their names are given.

Figure 2 shows simulated differential STDs for a
cone with a 10

Skyplot of differential slant delays simulated at 10

To complement the snapshot of Fig. 2 the
time evolutions of SHD, SWD, SHMD and STD in the direction of all observed
GNSS satellites for the station POTS are presented in
Fig. 3. Slant delays have been simulated in the
direction of observed satellites (hydrostatic, wet, hydrometeor and total
contributions) and, to avoid the effect of the elevation and to look at the
same order of magnitude of delays, corresponding delays in the zenith
direction have been computed and mapped using mapping functions presented in
Eq. (1) (mf

Time series of slant delays (STD, SHD, SWD and SHMD) differences (in direction of all GNSS visible satellites, then mapped in the zenith direction) during the whole period of the benchmark campaign for the station POTS.

During the benchmark period, the WVR located at GFZ Potsdam operated in a mode that scanned the atmosphere at selected elevation and azimuth angles. The instrument is situated on the same roof as the GNSS reference stations POTM and POTS. All three devices are within 10 m from each other. The HATPRO WVR from Radiometer Physics was set up to scan the atmosphere to extract profiles of atmospheric temperature, water vapour and liquid water using frequencies between 22.24 and 27.84 GHz and a window channel at 31.4 GHz. The WVR switches between “zenith mode” when it measures IWV and “slant mode” when it tracks GPS satellites using an in-built GPS receiver. In the latter case, SIWV values are delivered for the direction of satellites. Since the instrument can track only one satellite at one moment the number of observations is quite limited compared to slants from GNSS that are simultaneously observed from several GNSS satellites.

Our study focuses on the comparison of STDs, not SIWV. It was thus necessary
to convert the WVR SIWV into STDs. Firstly, WVR observations with rain flag
and atmospheric liquid water (ALW) values exceeding 1 kg m

This section describes the specifics of each type of inter-technique comparison. Since NWM outputs are restricted to the time resolution of their predictions (typically 1, 3 or 6 h) and since WVR is able to track only one satellite at one moment, all three sources provide different numbers of STDs per day. Therefore, three different comparisons are presented: (1) results for GNSS versus GNSS comparisons, (2) results for GNSS versus NWM comparisons and (3) results for GNSS versus WVR comparisons. Section 7 presents the validation at individual stations and Sect. 8 intercompares results obtained at GNSS dual stations. All the given results are obtained over the whole benchmark period. No outlier detection and removal procedure was applied during the statistics computation within the study.

Two variants of the comparisons are presented: “ZENITH” and “SLANT”.
“ZENITH” stands for original STDs mapped back to zenith direction using
1/sin(

Presented values of biases and standard deviations were computed directly from all STDs within the processed benchmark campaign period, and therefore they are not based on any kind of daily or other averaging. In some tables, only median values of bias and standard deviation over all GNSS STD solutions (Tables 5, 7 and 8) or over all processed stations (Tables 3 and 4) are given to consolidate the presentation of validation results. Median was used as a parameter minimally affected by outliers.
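The statistics described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the zenith projection is taken to be a multiplication by the sine of the elevation angle (the inverse of a 1/sin mapping), the sample standard deviation is assumed, and all function names and the input layout are hypothetical.

```python
import math
import statistics


def to_zenith(std_mm, elevation_deg):
    """Project a slant delay to the zenith, assuming a simple 1/sin(e) mapping."""
    return std_mm * math.sin(math.radians(elevation_deg))


def bias_and_sd(differences):
    """Bias (mean) and sample standard deviation over all individual differences.

    No daily or other averaging is applied: every STD pair contributes once.
    """
    return statistics.mean(differences), statistics.stdev(differences)


def summarize(per_solution_diffs):
    """per_solution_diffs: dict mapping a solution name to its differences (mm).

    Returns per-solution (bias, SD) plus the medians over all solutions.
    """
    stats = {name: bias_and_sd(d) for name, d in per_solution_diffs.items()}
    median_bias = statistics.median(b for b, _ in stats.values())
    median_sd = statistics.median(s for _, s in stats.values())
    return stats, median_bias, median_sd
```

Taking the median over solutions (or over stations) keeps a single poorly performing solution or station from dominating the summary value, which matters here because no outlier removal was applied.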

Statistics from comparisons of individual GNSS STDs (projected in the zenith direction) while using none, raw and clean residuals; median values of biases and standard deviations (SD) calculated over all stations with an exception of LDB0 station are given.

Impact of selected strategy modifications assessed via comparing individual STDs solution variants. Median values of biases and standard deviations (SD) calculated over all stations with an exception of LDB0 station using the estimated model only (without residuals) are given.

Medians of bias and standard deviation values of differences between all GNSS solutions and a particular NWM-based solution at each reference station, expressed in the zenith direction.

In the case of individual inter-GNSS solution validation, the situation was straightforward and neither interpolation nor any specific hypothesis was necessary: the comparisons were done on a direct point-to-point basis of observations coming from identical azimuth and elevation directions.

To find pairs of STD observations between WVR and GNSS, the following rules
were used: (1) the time difference between both observations had to be
shorter than 120 s and (2) the difference between both azimuth and elevation
angles had to be smaller than 2.5 and 0.25
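The pairing rules for WVR and GNSS observations can be sketched as a simple predicate. The thresholds (120 s, 2.5 and 0.25) come from the text; interpreting the angular thresholds as degrees, handling the 0/360 azimuth wrap-around, and the tuple layout of an observation are assumptions made for this illustration.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two azimuths (handles the 0/360 wrap)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def is_pair(gnss, wvr, max_dt_s=120.0, max_daz_deg=2.5, max_del_deg=0.25):
    """Decide whether a GNSS and a WVR observation form a valid pair.

    Each observation is a hypothetical (time_s, azimuth_deg, elevation_deg) tuple.
    """
    dt = abs(gnss[0] - wvr[0])
    daz = angular_difference(gnss[1], wvr[1])
    dele = abs(gnss[2] - wvr[2])
    return dt < max_dt_s and daz < max_daz_deg and dele < max_del_deg
```

The much tighter limit in elevation than in azimuth is consistent with the strong dependence of slant delays on elevation angle.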

Given the very small distances between collocated antennas and the coarse resolution of the global NWM models, STDs from NWM ray tracing using the ERA-Interim and the NCEP GFS models were derived only for one of the collocated stations. The same set of NWM-derived STDs was then used for the validation of the results at the collocated receivers.

The total number of STD pairs available for this part of the validation is roughly 1.7 million and varies from 140 987 to 206 320 according to the station.

Individual GNSS solutions were first compared to the GFZ solution in the
zenith direction (ZENITH). We chose the “GFZ” solution as the reference
because GFZ Potsdam has long-term experience in producing GPS slant delays
and because the GFZ near-real-time solution for German GNSS reference
stations is already being operationally delivered to the Deutscher
Wetterdienst (German Meteorological Service) for NWM assimilation
testing purposes (Bender et al., 2016). Figure 4
shows all the solutions using STDs calculated from the estimated ZTD and
horizontal gradient parameters, i.e. without adding post-fit residuals.
Adding raw or clean residuals, applied consistently to both compared and
reference solutions, provided very similar graphs (not displayed). Colours
in Fig. 4 indicate the processing software used
in individual solutions. Medians of all solutions (dotted lines in each bin)
are displayed for each station in order to highlight differences among the
stations. These were observed mainly as biases ranging from

Comparison of individual GNSS STD solutions against GFZ solution,
all without using residuals (nonRES) and projected in the zenith direction:
bias

All individual GNSS STD solutions were compared independently using none (nonRES), raw (rawRES) and clean (clnRES) residuals. The comparison aimed to assess the impact of different strategies for reconstructing GNSS STDs. Figure 5 displays biases and standard deviations for all solutions when comparing STDs with and without raw residuals. Similarly, Fig. 6 shows results for STDs with and without clean residuals. Both comparisons demonstrate biases at a sub-millimetre level over all stations and solutions. Smaller biases are, however, observed in the latter case (clnRES), which demonstrates the presence of station-specific systematic errors in raw residuals (over all days of the benchmark campaign) projected into zenith directions. Although the decrease of biases is visible for all solutions, several solutions (GFZ, GOP, WUE) resulted in almost zero values over all the stations. It could be attributed to easier removal of systematic effects in PPP as absolute residuals are accessible directly. This is in contrast to the DD solutions by ROB with ZD residuals reconstructed using relative information in original values. Interestingly, the TUW PPP solutions seem to perform similarly to the ROB DD solution in this case.

Comparison of individual GNSS STD solutions without residuals
(nonRES) and with raw residuals (rawRES); statistics are projected in the
zenith direction: bias

Comparing standard deviations in both figures demonstrates that
cleaning the residuals reduced the standard deviations by a factor
of 1.2–1.5 over all stations and solutions, namely reaching 2.5–4.5 mm for
clean residuals compared to 3.0–6.5 mm resulting from raw residuals. The
station-specific behaviour is more obvious for the latter rather than for
the former and, generally, the relative performance over all stations is in
a good agreement among different solutions applying clean residuals (see
Fig. 6). In particular, LDB0 and LDB2 stations
show high discrepancies for raw residuals (see
Fig. 5), while their standard deviations were
significantly reduced after cleaning the residuals becoming more homogeneous
with other stations. In this context it should be noted that the station
LDB0 is missing in both ROB solutions since it has been excluded from the
network solution during the pre-processing phase due to a lower quality of
observations. Besides the GOP_F solution, which demonstrates a simulated
real-time solution and shows about 25 % worse standard deviations compared
to other solutions in Fig. 6, we can also observe
a 12 % worse performance of the GOP_S solution using
forward filtering and backward smoother. Both can be attributed to the
stochastic model applied in the GOP software with epoch-wise parameter
estimation and partly also to remaining deficiencies in implementations of
all applied models: this is the only in-house software, it was developed from
scratch recently and, in contrast to the others, it has not yet been extensively
used in a variety of applications. Finally, there are rather small
differences observed due to the applied strategy, namely forward versus
backward filtering, GPS versus GPS

Comparison of individual GNSS STD solutions without residuals
(nonRES) and with clean residuals (clnRES); statistics are projected in the
zenith direction: bias

Table 3 summarizes statistics related to the
figures providing medians and standard deviations over all stations.
Notably, biases of STDs (over all stations) expressed in the zenith
direction are negligible in all solutions, i.e. not affected by adding raw
or clean residuals. The impact of adding raw residuals to the estimated
model can be characterized by the median
standard deviation of 3.9 mm (first two data
columns), which may vary for different stations, e.g. as evident for stations
LDB0 and LDB2 in Figs. 5 and 6. Adding cleaned residuals shows an overall
impact of 2.8 mm (middle data columns) corresponding to the reduction of 29 %
compared to raw residuals and up to 50 % for problematic stations
such as LDB0 and LDB2. The comparison is understood as the impact of
removing systematic errors from the residuals – in other words, as a
degradation of STD quality when applying uncleaned residuals due to the
contamination by systematic errors. For this reason, we would not recommend
adding uncleaned (raw) residuals, but cleaned only, when providing STDs from
GNSS. However, this comparison does not suggest any preference for
using the estimated model without residuals or for adding clean
residuals to reconstruct STDs. Both approaches still comprise various
errors due to approximations, local environmental effects, instrumentation
effects or applied models. Additionally, the impact of cleaning the post-fit
residuals for the reconstruction of STDs can be characterized by a median
standard deviation of 2.6 mm when projected into the zenith direction,
roughly 25 mm at the elevation of 6

Individual GNSS solutions also provided variants using the same software and
strategy but with modified settings. This allows us to assess their impact on
the estimated parameters; see Table 4.
Consequently, we evaluated STDs calculated without residuals expecting the
impact (mainly) on estimated ZTDs and horizontal gradients. Biases reached a
sub-millimetre level and were almost insignificant, with the exception of
using GMF versus VMF1 mapping function resulting in a positive bias of

Figure 7 provides an evaluation of the STDs at
their original elevation angles for the station POTS. Four individual panels
show bias (top left), normalized bias (NBIAS, top right), standard deviation
(bottom left) and normalized standard deviation (NSD, bottom right).
Normalized bias and normalized standard deviation were computed to see the
dependence of relative errors in STDs on elevation. For their
computation, absolute differences of STDs from two solutions were divided by
the STD values from the reference solution. For example, when the solution
from GFZ (taken here as the reference) was compared against TUO, the
standard deviation was computed from all valid absolute differences given as

Comparison of individual GNSS STD solutions against GFZ STD solution at station POTS, in slant directions.

In terms of standard deviation, the presumption about the dependency of
statistics on the elevation angle is clearly visible in the increasing
errors with the decreasing elevation angles (Fig. 7) while following an
exponential decay up to 45 mm at 7

STDs from four individual NWM ray-tracing solutions delivered by three different institutions entered the validation (see Sects. 4.1–4.3 for more information). Even though the time resolution of NWM is not continuous (only NWM-based results given at 00:00, 06:00, 12:00 and 18:00 UTC were used), the comparison with GNSS STD measurements can be used to estimate the quality of the weather prediction. However, when the meteorological situation is well simulated by NWM, it is relevant for this study to compare the model with GNSS observations. To ensure the consistency of the comparison, only epochs for which STD values were available in all GNSS solutions were considered; i.e. if a single STD value was missing in any GNSS solution, then the STD values at the same epoch were also removed from all other GNSS solutions. This selection of observations and the low time resolution of the NWM models (6 h) led to a restricted set of STDs available for the validation consisting of 9866 observations in total.
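The consistency filter described above amounts to keeping only the observations present in every GNSS solution. A minimal sketch follows, assuming each solution is stored as a mapping from an observation key to an STD value; the function name and the data layout are hypothetical, and whether the key is the epoch alone or the epoch combined with the satellite is left to the caller.

```python
def common_observations(solutions):
    """Keep only observations whose key is present in every solution.

    solutions: dict mapping a solution name to a dict {key: std_mm}.
    Returns a new dict with the same names, restricted to the shared keys.
    """
    key_sets = [set(obs) for obs in solutions.values()]
    common = set.intersection(*key_sets) if key_sets else set()
    return {
        name: {key: obs[key] for key in sorted(common)}
        for name, obs in solutions.items()
    }
```

After filtering, every solution contributes exactly the same set of observations, so differences in the resulting statistics cannot be caused by differences in data availability.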

Figure 8 presents the comparison of individual NWM
STDs and GNSS STDs (without residuals) expressed in the zenith direction.
From top to bottom, plots show biases (left) and standard deviations (right)
for ALA/BIRA, ERA/GFZ, GFS/GFZ and ALA/WUELS. For most stations, the bias
varies between

Comparison of individual GNSS STD solutions without residuals
(nonRES) against NWM solutions ALA/BIRA, ERA/GFZ, GFS/GFZ and ALA/WUELS (from
top to bottom), projected in the zenith direction: bias

Comparison of NWM-based solutions (ALA/BIRA, ERA/GFZ and GFS/GFZ) against GNSS GFZ solution at station POTS, in the slant direction.

Standard deviations between GNSS STDs and ALA/BIRA, ERA/GFZ and GFS/GFZ solutions are usually around 10 mm when projected into the zenith. Generally, they are higher than the comparison of individual GNSS solutions presented in Sect. 7.1 and they are also more station dependent. Degradations can be observed at mountainous stations KIBG and SAAL for the ERA/GFZ, GFS/GFZ and ALA/BIRA STDs, reaching standard deviations up to 18 mm in the case of the ERA-Interim NWM.

The solution of ALA/WUELS performed differently compared to all other NWM
solutions. It is biased against GNSS solutions, with biases ranging from

Finally, comparisons between the three versions of GNSS solutions (nonRES, clnRES, rawRES) and the ALA/BIRA, ERA/GFZ and GFS/GFZ NWM solutions were done to test the influence of post-fit residuals on GNSS STDs. The ALA/WUELS solution was excluded from this comparison because of the lower quality of its STDs. All GNSS solutions without post-fit residuals reached slightly lower standard deviation values than the solutions which included either raw or cleaned post-fit residuals, while differences in biases were negligible (not displayed). An average increase of standard deviation was 4.5 % for clean residuals and 8.3 % for raw residuals. Indeed, because of their low horizontal and time resolution, the used NWMs can barely capture the very fine-scale tropospheric structures which are supposed to be included in the GNSS residuals. As a consequence, this comparison does not allow us to draw a clear conclusion about the potential benefits of post-fit residuals in the reconstruction of the GNSS STDs.

Statistics from the comparison of ALA/BIRA, ERA/GFZ and GFS/GFZ against all three versions of GNSS GFZ solution expressed at original elevation angles of slant delays are presented for the station POTS in Fig. 9. Significantly higher biases can be found at the lowest-elevation bin in all three solutions and at all stations (not displayed). At some stations, sudden increases of bias at individual elevation bins were observed. They happened at any elevation angle (different for each NWM STD solution) and were particularly visible in terms of normalized bias. These sudden increases of the bias might be either because the model sometimes cannot render the tropospheric structures at their exact locations (unexpected location of high/low values of water vapour partial pressure) or because models running at these resolutions have a tendency to smooth out such tropospheric heterogeneities. Comparing with a model running at convective-permitting scale (e.g. 1 to 4 km) would help to sort out if the origin of such behaviours is the NWM STD or the GNSS STD.

For all stations, standard deviations present the shape with significantly
higher values at elevations below 30

Results from the GNSS versus ALA/WUELS solutions (not displayed) show an
enormous increase of both absolute and normalized bias and standard
deviations at low-elevation angles below 25

A summary of the GNSS versus NWM validation is presented in Table 5. For each reference station a median of bias and a median of standard deviation in the zenith direction between all GNSS solutions and a particular NWM-based solution are given. If we consider ALA/BIRA and ERA/GFZ only, without the two mountainous stations KIBG and SAAL, absolute biases between NWM and GNSS solutions stay mostly below 3 mm, which represents a very good agreement between these independent sources used for retrieving slant delays. Standard deviations generally range from 8 to 12 mm in the zenith projection, with the exception of ALA/WUELS, which shows lower precision by a factor of 2.5. Statistics stem from the complete benchmark period, and it should be noted that the daily variation of GNSS STDs was much lower than that of NWM ray-traced STDs. Significantly higher values of biases and standard deviations were observed on particular days for NWM solutions. A detailed evaluation of daily statistics with respect to the extreme weather conditions is one of the topics that we will study in the future.

Figure 10 compares GNSS and WVR solutions at
stations POTM and POTS, in the zenith direction. The number of slant
observations which entered the comparison was 32 794 at station POTM and
36 070 at station POTS. Two remarks can be made on the evaluation of biases.
Firstly, an overall bias of about 4 mm between the stations POTM and POTS,
visible for all GNSS solutions already in Fig. 8,
indicates a common issue with the GNSS data processing at the station POTM.
It was particularly increased for GOP_F, GOP_S
and GFZ PPP solutions. Secondly, a bias of about 5.5 mm in the zenith
direction can be found between WVR and GNSS solutions even at station POTS.
This bias roughly corresponds to 1 kg m

Comparison of individual GNSS STD solutions for stations POTM and
POTS versus WVR measurements, expressed in the zenith direction,
bias

Values of standard deviation, resulting mostly in 12 mm, are higher than
those observed in GNSS versus GNSS comparisons (Sect. 7.1) and slightly
higher than from GNSS versus NWM comparisons (Sect. 7.2). A cut-off elevation
angle of 15

The GNSS versus WVR validation at the station POTS using original elevation
angles is displayed in Fig. 11. Although some
differences between GNSS solutions are visible, all of them performed in a
very similar manner. The values of all four statistical parameters decrease
strongly with increasing elevation angle and, generally, this decrease is
steeper than the elevation dependency found for GNSS versus NWM. It indicates that
slant delays from WVR below 40

Comparison of WVR against individual GNSS STD solutions at station POTS, in the slant direction.

Generally, standard deviations for all solutions using cleaned residuals (raw residuals) are on average 1.7 % (3.8 %) higher than for the solutions without residuals. Differences between solution variants are smaller due to an overall higher uncertainty of WVR observations, but the results are in good agreement with those obtained for GNSS versus NWM comparisons presented in Sect. 7.1.

Two techniques for STD retrievals, each subject to its own errors, have been compared in previous sections (GNSS vs. NWM, GNSS vs. WVR) without knowing the true reference. The errors stem from the observation noise on the one hand and from the processing models, including the model for adjusted parameters, on the other hand. From this perspective, the higher standard deviations for GNSS STD solutions applying clean residuals compared to those using adjusted GNSS parameters only (without residuals) do not necessarily imply a lower quality of the former. GNSS and NWM models with limited temporal and spatial approximations are not able to represent true signal tropospheric delays between a receiver and all visible satellites. The simplifications certainly result in better agreement of STDs without residuals in Eq. (1), but they hardly represent the true tropospheric path delays, deviating particularly during events with high spatiotemporal variations in the troposphere.

For this reason, we assessed all GNSS solutions at the collocated (dual)
stations because for such a configuration we are able to provide troposphere-free
differences of STDs to evaluate the noise of GNSS STD retrievals. We
particularly focused on days with a high variability in the troposphere
selected from the benchmark period. Dual stations were available in the
benchmark campaign at three different locations in Germany. The first two
sites collocate twin GNSS reference stations (LDB0

Characteristics of individual dual stations.

Comparison of GNSS STDs from the elevation angles ranging
from 7 to 15

Comparison of GNSS STDs from the elevation angles ranging
from 15 to 90

STD validations in this paper were done for 2 months of the benchmark period during which heavy rain events occurred for some days, particularly 31 May–3 June, 9–11 and 21–26 June, all causing severe flooding in central Europe. During normal weather conditions, the tropospheric variation is reasonably smooth, meaning it can be well represented by GNSS STDs reconstructed from ZTDs and horizontal gradients. However, during high temporal or spatial variabilities in the troposphere, post-fit residuals certainly contain tropospheric signals which were not modelled. If they surpass the observation noise and other residual errors from GNSS models, cleaned residuals should be considered in the GNSS STD model as described in Eq. (1).
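
The three STD variants discussed above can be illustrated with a minimal sketch. It assumes that Eq. (1) has the commonly used form of mapped zenith delays plus a mapped horizontal gradient term plus an optional post-fit residual. The function name `reconstruct_std`, the `1/sin(e)` mapping and the gradient mapping `mf * cot(e)` are simplifying assumptions made for illustration only; they are not the mapping functions applied by the contributing institutions.

```python
import math


def reconstruct_std(zhd, zwd, grad_n, grad_e, elev_deg, azim_deg, residual=0.0):
    """Return a slant total delay in metres (illustrative sketch only).

    zhd, zwd       : zenith hydrostatic and wet delays [m]
    grad_n, grad_e : north and east horizontal gradients [m]
    elev_deg       : elevation angle of the satellite [deg]
    azim_deg       : azimuth of the satellite [deg]
    residual       : post-fit residual [m]; 0.0 gives the variant
                     without residuals
    """
    e = math.radians(elev_deg)
    a = math.radians(azim_deg)

    # Assumption: one simple mapping function 1/sin(e) for both the
    # hydrostatic and the wet part.
    mf = 1.0 / math.sin(e)

    # Assumption: gradient mapping approximated as mf * cot(e).
    mf_grad = mf / math.tan(e)

    gradient_term = mf_grad * (grad_n * math.cos(a) + grad_e * math.sin(a))
    return mf * (zhd + zwd) + gradient_term + residual


# Same epoch without and with a (hypothetical) cleaned residual of 2 mm.
std_non_res = reconstruct_std(2.3, 0.1, 0.001, 0.0, 30.0, 0.0)
std_cln_res = reconstruct_std(2.3, 0.1, 0.001, 0.0, 30.0, 0.0, residual=0.002)
```

Under these assumptions the variants with raw and with cleaned residuals differ only in which residual is passed, so any tropospheric signal that is not captured by ZTDs and gradients can enter the STD solely through that last term.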

In order to initially address optimal STD modelling under different weather
conditions within the benchmark period, we tried to identify days with a high
variability in the troposphere. Daily standard deviations of cleaned
post-fit residuals were computed individually for each day of the benchmark period,
for every station and GNSS solution for 1

Three such days were identified at LDB0, LDB2, POTM and POTS stations
(31 May, 20 June, 23 June) and 2 days at WTZR and WTZS stations (19 and
20 June). They all correspond very well to the days on which heavy
precipitation began in the domain (Douša et al., 2016). Typical differences
between raw and clean residuals are displayed in Fig. 12 for all elevations
during the normal day (19 June, DOY 170) and the day with high variability in
the troposphere (DOY 171, 20 June) for LDB0, LDB2, POTM and POTS stations
using GFZ solution. Obviously, the variability of clean residuals (black
dots) and their 2

Elevation-dependent variability of clean residuals (black dots)
and their 2

Elevation-dependent differences of STDs using clean residuals (black dots)
are displayed in Fig. 13 for the same days as in
Fig. 12, selecting GFZ solution and station pairs
WTZS–WTZR and LDB0–LDB2. Additionally, 2

Firstly, we note that STD differences are more or less similar for both days, i.e. not significantly different between days with normal and high variations in the troposphere, which is also found for other days of the benchmark period. This suggests that the increased residuals in Fig. 12 for DOY 171 contain strong contributions from the tropospheric effect that could not be absorbed into ZTDs and tropospheric horizontal gradients. An alternative explanation would be a contribution of satellite-specific errors common to both receivers, which are thus easily eliminated in STD differences at the dual stations. However, systematic errors at satellites are well absorbed by initial phase ambiguities in PPP, and short-term or random errors, e.g. due to satellite clocks, are eliminated in this study by the use of final products, which are stable enough to rule out the observed day-to-day variability in cleaned residuals. DOY 171 thus shows a situation in which cleaned residuals contain a tropospheric signal that should be added to the STD retrievals. In the case of GFZ, the contribution from residuals is particularly important because of local troposphere variation in time when using a piece-wise constant model with a 15 min time resolution for ZTD and 60 min for horizontal gradients. It is not so obvious in the case of a stochastic process used for epoch-wise estimates of all tropospheric parameters. However, the uncertainty of the estimated parameters is then higher than with the deterministic model, which makes it more difficult to separate errors in the estimated parameters from errors due to the insufficiency of the linearized tropospheric model in time.

Secondly, we can see that envelopes of differences using raw residuals are
always the largest ones. Raw residuals vary more with the elevation angle,
which is particularly visible for differences between LDB0 and LDB2. Obviously, it is
due to the large systematic errors at LDB0 station and additional
contribution from LDB2 errors observed at 35–55

Finally, we can consider error contribution from both stations to STD
differences at dual stations equal, i.e.

Figure 14 displays results for comparisons of individual dual stations in slant directions calculated from all days of the benchmark period. The same statistics and plots (not displayed) were also prepared for days identified with “severe” weather conditions, but only minor differences were observed. Strong variations are observed mainly in normalized biases over all elevation angles for the solutions using raw post-fit residuals (rawRES), regardless of weather conditions. These are clearly related to local effects such as multipath or the modelling of instrument-related effects (phase centre offsets and variations) and disappear after using the cleaned residuals (clnRES). The standard deviations and normalized standard deviations at all stations are clearly the lowest for variants without post-fit residuals (nonRES), slightly higher when using cleaned residuals and significantly higher when using raw residuals, which corresponds to the inter-technique validations performed above.
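
The elevation-dependent statistics used in such slant comparisons can be sketched as follows. This is an illustrative example under stated assumptions: the function name `binned_stats`, the 5° bin width and the definition of the normalized quantities (differences expressed in percent of the reference STD) are our assumptions and may differ from the exact definitions used for the figures.

```python
import statistics
from collections import defaultdict


def binned_stats(std_a, std_b, elev_deg, bin_width=5.0):
    """Bias and standard deviation of STD differences per elevation bin.

    std_a, std_b : paired slant total delays [m]; std_b is the reference
    elev_deg     : elevation angle of each pair [deg]
    Returns {bin_start_deg: (bias, nbias, sdev, nsdev)}, where nbias and
    nsdev are computed from differences relative to std_b, in percent.
    """
    bins = defaultdict(list)
    for a, b, e in zip(std_a, std_b, elev_deg):
        key = int(e // bin_width) * bin_width  # lower edge of the bin
        diff = a - b
        bins[key].append((diff, 100.0 * diff / b))

    result = {}
    for key in sorted(bins):
        abs_diffs = [d for d, _ in bins[key]]
        rel_diffs = [r for _, r in bins[key]]
        result[key] = (
            statistics.fmean(abs_diffs),   # bias [m]
            statistics.fmean(rel_diffs),   # normalized bias [%]
            statistics.pstdev(abs_diffs),  # standard deviation [m]
            statistics.pstdev(rel_diffs),  # normalized standard deviation [%]
        )
    return result


# Hypothetical STD pairs from a dual station (values are made up).
stats = binned_stats([10.02, 10.04, 2.41], [10.0, 10.0, 2.40], [12.0, 14.0, 88.0])
```

The population standard deviation (`pstdev`) is used here only so that bins with a single observation do not raise an error; a sample standard deviation would be an equally valid choice for well-populated bins.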

Elevation-dependent variability in STD differences of clean
residuals (black dots) and their 2

Tables 7 and 8 show the statistics expressed in the zenith direction for observations
ranging in elevation angles from 7 to 15 and from 15 to 90

We presented results of validating tropospheric slant total delays obtained from GNSS data processing against those obtained from NWM ray tracing, WVR measurements and collocated GNSS stations, in search of the optimal method for estimating GNSS STDs. Ten GNSS reference stations were selected, exploiting data from the 56-day COST ES1206 benchmark campaign. Eleven GNSS solutions, four NWM-based solutions and one WVR-based dataset entered this validation study. Eight out of 11 GNSS solutions delivered STDs in three variants: (1) without post-fit residuals, (2) with raw post-fit residuals and (3) with cleaned post-fit residuals. The comparisons were carried out in two scenarios, firstly for STDs at their true elevation angles and secondly for STD differences mapped into the zenith direction using a simple mapping function.
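
The second scenario can be sketched in a few lines. The example assumes that the simple mapping function is `1/sin(e)`, so that a slant difference is brought to the zenith by multiplying it by `sin(e)`; both this choice and the function name `zenith_mapped_stats` are assumptions made for illustration.

```python
import math
import statistics


def zenith_mapped_stats(std_a, std_b, elev_deg):
    """Bias and standard deviation of STD differences mapped to the zenith.

    std_a, std_b : paired slant total delays [m]
    elev_deg     : elevation angle of each pair [deg]
    Assumption: mapping function mf(e) = 1/sin(e), hence the zenith
    equivalent of a slant difference is diff * sin(e).
    """
    mapped = [
        (a - b) * math.sin(math.radians(e))
        for a, b, e in zip(std_a, std_b, elev_deg)
    ]
    bias = statistics.fmean(mapped)
    sdev = statistics.stdev(mapped)  # sample standard deviation
    return bias, sdev


# Two hypothetical STD pairs, at 30 and 90 degrees elevation.
bias, sdev = zenith_mapped_stats([4.81, 2.41], [4.80, 2.40], [30.0, 90.0])
```

Mapping the differences rather than the delays themselves keeps the statistics comparable across elevation angles, at the cost of ignoring any elevation dependence that the simple mapping function does not capture.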

Comparison of GNSS STDs at dual stations computed over the whole benchmark period from individual GNSS solutions in the slant direction for dual stations from left to right: LDB0-LDB2, POTM-POTS and WTZR-WTZS. Statistical parameters from top to bottom: bias, normalized bias, standard deviation and normalized standard deviation.

Comparisons of STD solutions without residuals, with raw or with cleaned
residuals were used to study the impact of different strategies for
optimally retrieving STDs from GNSS. The impact of cleaning residuals led to
the standard deviations reduced by a factor of 1.2–1.5 over all stations
and solutions, namely reaching 2.5–4.5 mm in the zenith direction for clean
residuals compared to 3.0–6.5 mm for raw residuals, the latter
also being highly dependent on the station. The impact of adding raw or cleaned
residuals was practically negligible in terms of biases, which always remained within

Biases and standard deviations between GNSS and NWM solutions depended on
the applied ray-tracing method, the NWM source and the station location. Worse results,
by a factor of 2.5 in terms of standard deviation, were observed for the
ALA/WUELS solution originating from the deficiency of the applied ray-tracing
method. Generally, biases in the zenith direction were below

Using delays simulated from the ALADIN-CZ weather model, we illustrated the impact of the hydrostatic, wet and hydrometeor contributions to zenith and slant delays. These showed strong horizontal variations that allowed relevant characterization of mesoscale meteorological situations. Visualizing the slant anisotropic variation of total, hydrostatic, wet and hydrometeor delays in a common skyplot illustrated a weak hydrostatic anisotropy (up to 5.8 mm) that was almost the same as that of the hydrometeors (up to 6 mm). The largest anisotropy was induced by water vapour (up to 20 mm), but the total anisotropy was much weaker (12 mm) due to the compensation of mean hydrostatic and hydrometeor anisotropies oriented in the opposite direction.

GNSS STDs from stations POTM and POTS were validated against collocated WVR
observations pointed to GNSS satellites. Positive biases of about 5.5 and
10 mm were observed for the POTS and POTM stations, respectively. Standard
deviations from comparisons of GNSS versus WVR STDs reached 12 mm in the
zenith direction, thus higher compared to NWM solutions. Normalized standard
deviations revealed a strong elevation dependency, indicating the WVR
observations lack this quality at low elevations, particularly below 40

Collocated GNSS stations at three different locations were used to evaluate the quality of GNSS STD retrievals by applying statistics over STD differences that are, from a theoretical point of view, troposphere-free. We could observe strong systematic errors in raw residuals at any elevation angle, particularly at stations without a choke ring antenna, such as LDB0 and POTM. We found a strong elevation dependency of bias when using raw residuals, which almost vanished when cleaning the residuals from visible systematic errors. This suggests that the use of raw residuals should not be recommended, at least not without any information about possible systematic errors. Although the simplified STDs reconstructed from the estimated GNSS tropospheric parameters performed the best in all the comparisons, they obviously missed part of the tropospheric signals due to non-linear temporal and spatial variations in the troposphere. By identifying low and high variability in the troposphere during all days in the benchmark period, we showed that residuals contain significant tropospheric signals in addition to the simplified model, particularly during high variability in the troposphere. Additionally, we also identified tropospheric signals at low elevations due to a non-linear horizontal asymmetry in cleaned residuals regardless of the station selection and the quality of its observations. From such findings, we recommend the use of cleaned residuals for optimal STD retrievals from GNSS, at least for low-elevation angles and during high variability in the troposphere. We also have not seen any obvious degradation of STD retrievals in other conditions.

The better inter-solution and inter-technique agreements of STDs without residuals compared to those using clean residuals are attributed to an overly simple tropospheric model, which results in smooth and robust STDs that consequently do not contain all signals of interest from the troposphere. The majority of evaluated GNSS solutions used deterministic models with rather long validity of estimated tropospheric parameters, for which the residuals are important to overcome modelling deficiencies of low-resolution parameter estimates in time. Our future study will focus on the evaluation of GNSS STDs estimated using a stochastic process easily applicable in real time and on a long-term evaluation of the azimuthal dependency of post-fit residuals under severe weather conditions.

GNSS data from the EUREF Permanent Network (EPN) stations
are freely available through the anonymous FTP, e.g. from the EPN historical
data centre at

The authors declare that they have no conflict of interest.

This study has been organized within the EU COST action ES1206 (GNSS4SWEC).
The authors thank all the institutions that provided data for the benchmark
campaign on which the validation was based. In particular, we want to thank
S. Heise (GFZ) for providing the WVR data. The GFS data were provided by the
National Centers for Environmental Prediction (