Water vapour is an important constituent of the atmosphere,
but its spatial and temporal distribution is difficult to detect. Global
Positioning System (GPS) water vapour tomography, which can sense
three-dimensional water vapour distribution, has been developed as a research
area in the field of GPS meteorology. In this paper, a new water vapour
tomography method based on a genetic algorithm (GA) is proposed to overcome
the ill-conditioned problem. The proposed approach does not need to perform
matrix inversion, and it does not rely on excessive constraints, a priori
information or external data. Experiments using this approach in Hong Kong
under rainy and rainless conditions, analysed with grayscale graphs and
condition numbers, show that the tomographic matrix is seriously ill-conditioned.
Numerical results show that the average root mean square error (RMSE) and
mean absolute error (MAE) for internal and external accuracy are

Water vapour is a major component of the atmosphere, and its distribution and dynamics are the main driving force of weather and climate change. A good understanding of water vapour is crucially important for meteorological applications and research such as severe weather forecasting and warnings (Liu et al., 2005). Nevertheless, the variation of water vapour is affected by many factors, including temperature, topography and season. It changes rapidly with time and varies strongly in both the vertical and horizontal directions, which makes it difficult to monitor with high temporal and spatial resolution (Rocken et al., 1993).

Thanks to the development of GPS station networks providing atmospheric information under all weather conditions, GPS is considered a powerful technique to retrieve water vapour. Since Bevis et al. (1992) first envisioned the potential of tomography to be applied in GPS meteorology, water vapour tomography has become a promising method to improve the restitution of the spatio-temporal variations of this parameter (Braun et al., 1999; Nilsson et al., 2004; Song et al., 2006; Perler et al., 2011; Rohm, 2012; Dong and Jin, 2018).

In GPS water vapour tomography, the research area should be covered by ground-based
GPS receivers and discretized into a number of cubic closed voxels by
latitude, longitude and altitude, each of which has a fixed amount of water
vapour at a particular time (Guo et al., 2016). The observations are
GPS-derived slant water vapour data, the precipitable water in the direction of
the signal ray path, which travels through the troposphere from its top
(Zhao and Yao, 2017). After precisely measuring the distance travelled by the
signal ray in each voxel by tracing its path from receiver to
satellite, we obtain the basic equation for water vapour tomography,
which can be expressed in linear form (Flores et al., 2000; Yang et al.,
2018):

Since a GPS signal ray can only pass through a small part of the voxels in
the study area, the elements of matrix

To circumvent the ill-conditioned problem, many methods are explored in the
literature. Flores et al. (2000) added constraints on the vertical and
horizontal variability of tomography with additional top constraints to the
model. Most constraints are based on experience and difficult to match to
the actual water vapour distribution, resulting in the deviation of
tomographic results. Moreover, singular value decomposition (SVD) is
required to perform matrix inversion. Bender et al. (2011) utilized an
iterative algorithm called the algebraic reconstruction technique (ART) to solve
the observation equation. Several reconstruction algorithms of the ART
family were also implemented, e.g. the multiplicative algebraic
reconstruction techniques (MART) and the simultaneous iterations
reconstruction technique (SIRT) (Stolle et al., 2006; Liu et al., 2010). The
ART techniques are iterative algorithms that proceed observation by
observation. Only two vectors,

In the above-mentioned tomographic methods, excessive constraints on the matrix inversion, exact a priori information or external data are commonly used to overcome the ill-conditioned problem. The mandatory use of excessive constraints in tomographic experiments with a poor voxel structure induces limitations, while reliance on exact a priori information makes the tomographic solutions too similar to the a priori data and diminishes the role of the tomography technique. External data are not available for all tomographic experiments. Therefore, this paper proposes a new tomography method based on a genetic algorithm (Sect. 2). The tomography experiments and results of the analysis are presented in Sect. 3. Section 4 summarizes the conclusions.

In water vapour tomography, the observation is slant water vapour, which can be
converted from slant wet delay (SWD) by the following formula (Adavi and
Mashhadi, 2015):

After obtaining the observation equation (Eq. 2), three types of
constraints are usually added:

Flowchart of the water vapour tomography based on the genetic algorithm.

For water vapour tomography based on the genetic algorithm, the first
procedure is to construct the tomographic equation. The idea of function
optimization is then used to solve Eq. (2) (Guo and Hu, 2009; Olinsky et
al., 2004), which is similar to the principle of least squares,

1. construct the fitness function, which is converted from the tomographic equation;

2. generate some groups representing approximates of

3. select groups from the last generation of the population as parents
according to a lower-to-higher order of the groups of

4. produce offspring groups from parents by crossover and mutation to make up a new set of approximated solutions (new generation);

5. compute the fitness values of the new generation, go back to step 3 and produce the next generation of the population;

6. terminate the search when a group of approximates meets the requirements of the fitness value. (Generally, we set the stopping criteria for generation or calculation time.)
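The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the fitness function is taken to be the sum of squared residuals of the tomographic equation, the crossover is a simple blend of two parents, the mutation is a clamped Gaussian perturbation, and the search stops after a fixed number of generations. The names `fitness` and `run_ga`, the bounds and the toy matrix are all hypothetical, and the population size and elite count are scaled down from the values used in the paper.

```python
import random


def fitness(x, A, y):
    """Sum of squared residuals between A x and y (lower is better)."""
    return sum(
        (sum(a * xi for a, xi in zip(row, x)) - yi) ** 2
        for row, yi in zip(A, y)
    )


def run_ga(A, y, n_unknowns, pop_size=40, elite=2, generations=200,
           upper=20.0, mutation_rate=0.1, seed=0):
    """Minimise fitness(x) over 0 <= x_i <= upper with a basic GA."""
    rng = random.Random(seed)
    # Initial population: random candidate solutions inside the bounds.
    pop = [[rng.uniform(0.0, upper) for _ in range(n_unknowns)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank the groups from lower to higher fitness value.
        pop.sort(key=lambda x: fitness(x, A, y))
        history.append(fitness(pop[0], A, y))
        parents = pop[:pop_size // 2]
        # Elites are copied unchanged into the next generation.
        children = [p[:] for p in pop[:elite]]
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()
            # Blend crossover: a convex combination of the two parents.
            child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            # Mutation: occasional Gaussian perturbation, clamped to bounds.
            for i in range(n_unknowns):
                if rng.random() < mutation_rate:
                    child[i] = min(upper, max(0.0, child[i] + rng.gauss(0.0, 0.5)))
            children.append(child)
        pop = children
    pop.sort(key=lambda x: fitness(x, A, y))
    history.append(fitness(pop[0], A, y))
    return pop[0], history


# Toy underdetermined system: 2 rays, 3 voxels (illustrative values only).
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
y = [11.0, 6.0]
best, history = run_ga(A, y, n_unknowns=3)
```

Because the elites survive unchanged, the best fitness value recorded in `history` can never increase from one generation to the next. No matrix inversion appears anywhere in the loop; only forward products of `A` with candidate solutions are evaluated.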

Parameters of the genetic algorithm.

Geographic distribution of GPS, radiosonde stations and the horizontal structure of the voxels used in water vapour tomography. Map data ©2018 Google.

In order to conduct the tomographic experiment based on a genetic algorithm,
Hong Kong was selected as the research region. The boundary and resolution
in west–east and south–north directions were 113.87–114.35

Grayscale graph of number of signal rays passing through
each voxel and distribution of voxel with sufficient signal rays (

The GPS tropospheric parameters (zenith tropospheric delay and gradient
parameters) were estimated by the GAMIT 10.61 software based on a
double-differenced model. In order to reduce the strong correlation of
tropospheric parameters caused by the short baseline between GPS receivers
in the tomographic area, three International GNSS Service (IGS) stations
(GJFS, LHAZ and SHAO) were incorporated into the solution model. In the
processing, the sampling rate of observations was 30 s, a cut-off elevation
angle of 10

To verify the proposed method, two periods of GPS observation data, with a sampling rate of 30 s, were used in the tomography experiment. The first is from 13 to 19 August 2017 (day of year (DOY) 225 to 231, 2017), during which a spell of fine weather prevailed in Hong Kong, with a ridge of high pressure extending westwards from the Pacific to cover south-eastern China on 16–18 August. In that period, the daily rainfall was 0 mm, and the average relative humidity and SWV at the selected stations were 75 % and 79.1 mm, respectively. This period is defined as rainless days: fine weather without any rainfall and with comparatively low relative humidity and SWV. The other period is from 12 to 18 June 2017 (DOY 163 to 169, 2017), which covers the rainy days. During the selected rainy period, the weather of Hong Kong was first affected by the approach and passage of a severe tropical storm, named Merbok, with more than 150 mm of rainfall recorded on 13–14 June. Thereafter, from 15–16 June, the influence of an enhanced southwest monsoon and the development of a lingering trough of low pressure kept the weather unstable and rainy until 21 June. In this period, the maximum daily rainfall was 203.7 mm, and the average daily rainfall was 66.8 mm. The average relative humidity and SWV at the selected stations were 89 % and 112.9 mm, respectively. This period represents rainy days, with continuous rainfall and high relative humidity and SWV. Each tomographic solution covers a period of 0.5 h. The radiosonde data, collected twice daily at 00:00 and 12:00 UTC in these two periods, were treated as reference data.

According to the flowchart presented in Fig. 1, the above GPS observation
data were processed to construct the tomographic equation and further
converted into the fitness function for the optimization algorithm.
Population size is chosen based on the total number of unknown parameters
(water vapour density). The value of 200 is the default option of the
algorithm when the number of unknowns exceeds a certain amount. The
reproduction of elite count is chosen to be 10 to specify the number of
individuals that are guaranteed to survive to the next generation because it
is based on population size (0.05

Scatter diagram of the SWV residuals in different weather conditions for internal accuracy testing.

In a tomographic solution, the structure of the coefficient matrix in the observation equation depends on which voxels are crossed by SWV and the number of signal rays penetrating each voxel. Figure 3 illustrates this in the form of a grayscale graph for two different days: 13 August 2017 at 00:00 UTC, a rainless day (a), and 13 June 2017 at 12:00 UTC, a rainy day (b). In the upper panels of each sub-figure, the deepening of the grayscale refers to an increase in the number of signal rays crossing through the voxel. The closer the layer is to the ground, the more voxels are not crossed by any signal rays. Although there are few voxels with no signal rays passing through in the upper layers, many of the voxels have a lighter grayscale, which means that the voxels are crossed by fewer signal rays.
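The ray counts behind such a grayscale graph can be read directly off the coefficient matrix. The following is a minimal sketch, assuming a hypothetical matrix `A` whose rows are signal rays and whose columns are voxels, with each entry holding the distance travelled by the ray inside the voxel. The function name and the toy values are illustrative and are not taken from the experiment.

```python
def rays_per_voxel(A):
    """Count, for each voxel (column), how many rays (rows) cross it."""
    n_voxels = len(A[0])
    return [sum(1 for row in A if row[j] > 0.0) for j in range(n_voxels)]


# Toy matrix: 3 rays, 4 voxels; entries are intercepted distances.
A = [
    [1.2, 0.0, 0.8, 0.0],
    [0.0, 0.0, 1.5, 0.0],
    [0.9, 0.0, 0.0, 0.0],
]

counts = rays_per_voxel(A)                     # darker grayscale = larger count
empty_voxels = [j for j, c in enumerate(counts) if c == 0]
zero_fraction = sum(v == 0.0 for row in A for v in row) / (len(A) * len(A[0]))
```

Voxels listed in `empty_voxels` correspond to all-zero columns, which is one direct way a sparse tomographic matrix becomes rank-deficient and hence ill-conditioned.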

Note that when the signal ray passes vertically through the tomographic
region, the ray crosses the minimum number of voxels, that is, 10 in the
tomographic area. Therefore, the minimum probability that a voxel will be
crossed by a ray is 1.79 % (
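The quoted minimum probability can be checked with a short calculation. This sketch assumes that the 560 voxel results per solution mentioned in the comparison with the least squares method is the total number of voxels in the tomographic region; the variable names are illustrative.

```python
n_voxels = 560     # voxel results per tomographic solution (assumed total)
min_crossed = 10   # voxels crossed by a vertical signal ray

min_probability = min_crossed / n_voxels
print(f"{100 * min_probability:.2f} %")
```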

To better analyse the ill-conditioned nature of the observation equation in
tomography modelling, the number of zero elements in matrix

Comparison of SWV residuals in zenith direction: circles for RMSE and diamonds for MAE; blue for rainy days and red for rainless days.

To evaluate the performance of water vapour tomography based on a genetic algorithm, slant water vapour of GPS stations for the data of 13 to 19 August and 12 to 18 June 2017 were computed using the tomographic results based on the water vapour tomographic observation equation established in Eq. (1). In this process, the parameters on the right side of Eq. (1) (the distance of the signal ray in each of the voxels and the water vapour density calculated by the tomographic modelling) are taken as known quantities. Moreover, the SWV on the left is the parameter to be determined, i.e. the tomography-computed SWV. The differences against the GAMIT-estimated SWV (as a reference) were also identified.
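The evaluation described above amounts to a forward computation followed by residual statistics. The sketch below assumes that the tomography-computed SWV of a ray is the sum, over the voxels it crosses, of the intercepted distance multiplied by the voxel's water vapour density, as the description of Eq. (1) suggests. Unit conversion factors are omitted, and the function names and toy numbers are hypothetical.

```python
import math


def forward_swv(distances, densities):
    """SWV of one ray: sum over voxels of distance times water vapour density."""
    return sum(d * x for d, x in zip(distances, densities))


def rmse(residuals):
    """Root mean square error of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def mae(residuals):
    """Mean absolute error of a list of residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)


# Toy example: 3 voxels, 2 rays (illustrative values, not experiment data).
densities = [12.0, 8.0, 3.0]          # tomographic solution per voxel
rays = [[1.0, 0.5, 0.0],              # distance of ray 1 in each voxel
        [0.0, 1.0, 2.0]]              # distance of ray 2 in each voxel
reference = [17.0, 13.0]              # stand-in for GAMIT-estimated SWV

computed = [forward_swv(r, densities) for r in rays]
residuals = [c - ref for c, ref in zip(computed, reference)]
```

The same `rmse` and `mae` helpers apply to both the internal test (stations used in the modelling) and the external test (a station left out of the modelling); only the set of rays changes.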

For internal accuracy testing, 13 GPS stations used in the tomographic
modelling were adopted. The change of tomography computed vs. GAMIT-estimated
slant water vapour residuals with elevation angle is shown in Fig. 4, where
the blue and red dots represent the rainy and rainless days, respectively.
The maximum residuals for rainy and rainless scenarios are 10.74 and

Histogram for MAE

To normalize SWV residuals for their evaluation in a single unit, we mapped
the tomography-computed SWVs back to the zenith direction using
the

Comparison of SWV residuals (differences between the
tomography-computed SWV and GAMIT-estimated SWV) for the KYC1 station in
each elevation bin;

For external accuracy testing, the data from KYC1 station, which was not
included in the tomographic modelling, were used. Figure 6 shows the
histogram for MAE (upper) and RMSE (lower) of SWV residuals (differences
between the tomography-computed SWV and GAMIT-estimated SWV), in which the
blue and red bars represent rainy and rainless days, respectively. The dashed
bars are the averages for those different weather conditions. From this
figure, it can be noted that all MAE and RMSE values are below 15 mm, with average
values lower for rainless days than for rainy days, respectively

Box plots of the SWV residuals (differences between the tomography-computed SWV and the GAMIT-estimated SWV) for the KYC1 station.

To further assess external accuracy, slant water vapour outputs were grouped
into individual elevation bins of 5

Panels

In the above analysis, RMSE and MAE were used for the external accuracy
testing of the tomographic results based on the GA. Box plots are used to
explore the statistical characteristics of SWV residuals and to detect the
outliers in the tomographic errors. Five characteristic values are shown in
the box plots. Q1 and Q3 located at the bottom and top of the box represent
the first and third quartiles; the second quartile (Q2) is located inside
the box; the ends of the whiskers refer to the upper and lower bounds, which
are located at Q1

Linear regression of the water vapour density from
radiosonde and tomography based on the genetic algorithm. Panels

The water vapour density profile derived from the radiosonde data can be used as a reference value, which is well suited to evaluate the accuracy of the tomographic results based on a genetic algorithm. As the radiosondes are launched daily at 00:00 and 12:00 UTC, the tomographic results of 12 to 18 June (rainy days) and 13 to 19 August 2017 (rainless days) at these times were compared. Figure 9 shows the water vapour density comparisons between radiosonde data and tomographic results at different altitudes on individual dates (rainy period). It is clear from the profiles that the water vapour density (WVD) decreases with increasing height. The WVD profiles reconstructed by the GA tomographic solutions conform with those derived from the radiosonde data, especially in the upper troposphere in absolute terms. In relative terms, however, the errors for voxels above and below 5 km are 31 % and 15 %, respectively. The reason is that the water vapour content in the upper layers is relatively low, so even a small difference between the radiosonde and tomographic results can lead to a large relative error, whereas more than 90 % of the water vapour resides below 5 km, near the Earth's surface. In certain cases, good consistency can also be seen in the lower atmosphere. This may be because a GPS station (HKSC) used for tomography modelling is located in the voxel where the radiosonde station is situated, so that sufficient signal rays pass through the lower atmosphere there.
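The effect of a low reference value on the relative error can be illustrated with made-up numbers. In this sketch the same absolute difference of 0.3 is applied to a large and a small reference WVD; the values and the function name are hypothetical and serve only to show the arithmetic.

```python
def relative_error(tomography, radiosonde):
    """Absolute difference divided by the radiosonde reference value."""
    return abs(tomography - radiosonde) / radiosonde


# Same absolute difference (0.3) at two hypothetical WVD levels.
low_layer = relative_error(15.3, 15.0)    # moist layer near the surface
high_layer = relative_error(1.3, 1.0)     # dry layer in the upper troposphere
```

Here `high_layer` is about 15 times larger than `low_layer`, even though the absolute agreement is identical.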

Box plots of the WVD residuals, which are computed between the GA tomographic approach and radiosondes.

RMSE and MAE of the water vapour density comparison between
radiosonde and tomography based on the genetic algorithm for different
weather conditions (g m

The three-dimensional distribution of water vapour density derived from ECMWF data, the GA method and the least squares method (upper for rainless scenario and lower for rainy scenario).

To further illustrate the comparison with the radiosonde data, Table 2
lists RMSE and MAE of the WVD. In the table, the WVD in the voxels above the
radiosonde station computed by tomography and those derived from radiosonde
are counted to calculate their RMSE and MAE in each solution. Thus, the
average RMSE

Regression

To explore the overall accuracy of water vapour density reconstructed by the
GA tomography, the linear regression analysis and box plot were adopted for
different weather conditions. Figure 10 shows the linear regression of the
water vapour density for rainy days (Fig. 10a), rainless days (Fig. 10b) and their
combination (Fig. 10c), in which the scatter points of three graphs are close to
the

Water vapour density comparisons between GA and the least squares method in the selected voxels at 00:00 and 12:00 UTC from 13 to 19 August 2017 (rainless days); radiosonde and ECMWF data are used as reference.

The least squares method is most commonly used in water vapour tomography,
and numerous experiments prove that water vapour density with high accuracy
can be obtained with this method (Flores et al., 2000; Zhang et al., 2017;
Zhao et al., 2017). To verify the accuracy of the genetic algorithm, we
compared the tomographic results between the genetic algorithm and the least
squares method in this section. The specific process and introduction to
the least squares method can be found in detail in Flores et al. (2000), Guo et al. (2016) and Yang et al. (2018). Figure 12 shows the
three-dimensional distribution of water vapour density derived from
tomography based on the GA and the least squares method. The water vapour
computed by the European Centre for Medium-Range Weather Forecasts (ECMWF)
data, which provides various meteorological parameters at different pressure
levels with a spatial resolution of

Statistical results of the GA and the least squares method
comparison; ECMWF data are shown as a reference (g m

To further analyse the tomographic results of the GA and the least squares
method, regression and boxplot analyses are conducted and displayed in Fig. 13,
which covers all solutions, each of them containing 560 voxel results. In
Fig. 13a, a good linear regression relationship is shown by the
distribution of scatter points and the straight line of regression.
Specifically, the intercept and slope of the regression equation are 0.5198 and 0.9401, respectively. The right panel shows the distribution
of differences between the two types of tomographic results. The Q1 and Q3
are

Moreover, a detailed comparison between GA and the least squares method is
conducted using the voxels above the radiosonde station. Figure 14 shows the
changes in water vapour density derived from GA and the least squares method with
altitude on different days (rainless days), in which the radiosonde data
and ECMWF data are considered reference data. All the profiles derived
from the two methods decrease with increasing height and show good
consistency with the reference data. The statistical values are computed and
listed in Table 4 to illustrate the comparison of GA and the least squares
method. The RMSE and MAE indicate that both the GA and the least squares method
can achieve good tomographic results compared with the reference values
(radiosonde and ECMWF data) whether in the rainy or rainless scenario. The
GA which has an average RMSE

Statistical results of the GA and the least squares method using
radiosonde and ECMWF data as reference in the selected voxels (g m

Changes in water vapour density with altitude in different weather conditions; data are from radiosonde measurements (blue for rainy days from 12 to 18 June 2017 and red for rainless days from 13 to 19 August 2017).

In our experiments, the comparisons under various weather conditions
illustrate that the tomographic results of rainless scenarios were better than those of rainy scenarios, which is also concluded in other studies (Yao et al., 2016;
Zhao et al., 2017; Ding et al., 2017). This is because the spatial
structure of atmospheric water vapour is relatively stable in rainless
weather, whereas its spatial distribution changes faster in rainy weather.
Thus, certain limitations are imposed on tomography to obtain accurate water
vapour during unstable weather conditions. Additionally, all the water vapour
densities along the radiosonde path were collected during the experiments.
Their changes with altitude are shown in Fig. 15, in which the rainy and
rainless weather are represented by blue and red dots, respectively. The 8–12 km range is magnified to show the water vapour information outside the tomographic
region. In the figure, larger WVD values can be observed above 8 km on
rainy days than on rainless days. For the rainless situation,
the WVD within 8–12 km is small and close to zero. By contrast, the
value is generally not close to zero in the rainy situation, especially in
the range of 8–10 km, which is substantially greater than 0.5 g m

In this paper, a new tomography approach based on the genetic algorithm was proposed to reconstruct a three-dimensional water vapour field in Hong Kong under rainy and rainless weather conditions. The inversion problem was transformed into an optimization problem that no longer depends on excessive constraints, a priori information or external data. Thus, many problems do not need to be considered, including the difficulty of inverting the sparse matrix, the limitation and irrationality of constraints, the weakening of tomographic technique by prior information, and the restriction of obtaining external data. Based on the fitness function established by the tomographic equation, the water vapour tomographic solution could be achieved by the genetic algorithm through the process of selection, crossover and mutation.

Our new approach is validated by tomographic experiments using GPS data
collected over Hong Kong from 12 to 18 June (rainy days) and 13 to 19
August 2017 (rainless days). The ill-conditioning of the tomographic matrix was
discussed and analysed using the grayscale graph and condition number. In a
comparison of the SWV residuals, internal and external accuracy testing were
used for the GA tomography. The internal accuracy testing refers to computing
the differences between the tomography-computed SWV and GAMIT-estimated SWV
for the 13 GPS stations used in the tomographic modelling, whereas the
external accuracy testing denotes the differences for the KYC1 station which
is not included in the tomographic modelling. The RMSE

The GNSS observations and the corresponding meteorological data can be downloaded from the Hong Kong Satellite Positioning Reference Station Network (SatRef) (

FY, JG and JS did the conceptualization; YZ, LZ and DZ conducted the data curation; FY, JG and JS conducted the formal analysis; FY proposed the methodology; JS and XM provided the resources; FY validated the experimental results; FY wrote the original draft; FY, JG, XM, JS, LZ, YZ and DZ reviewed and edited the manuscript.

The authors declare that they have no conflict of interest.

The authors would like to thank the Lands Department of HKSAR for providing the GNSS data from the Hong Kong Satellite Positioning Reference Station Network (SatRef). The authors also thank the Chinese Scholarship Council (CSC) and the University of Nottingham for providing the opportunity for the first author to study at the University of Nottingham for 1 year. Acknowledgements are also given to the editor in charge (Roeland Van Malderen) and the first author's colleague at the University of Nottingham (Simon Roberts) for their revisions to improve the English language and style of the paper.

This research has been supported by the National Natural Science Foundation of China (grant nos. 41474004 and 41604019).

This paper was edited by Roeland Van Malderen and reviewed by Andre Sa and two anonymous referees.