Four-dimensional mesospheric and lower thermospheric wind ﬁelds using Gaussian process regression on multistatic specular meteor radar observations

. Mesoscale dynamics in the mesosphere and lower thermosphere (MLT) region have been difﬁcult to study from either ground- or satellite-based observations. For understanding of atmospheric coupling processes, important spatial scales at these altitudes range between tens to hundreds of kilometers in the horizontal plane. To date, this scale size is challenging observationally, and so structures are usually parameterized in global circulation models. The advent of multistatic specular meteor radar networks allows exploration of MLT mesocale dynamics on these scales using an increased number of detec- 5 tions and a diversity of viewing angles inherent to multistatic networks. In this work, we introduce a four dimensional wind ﬁeld inversion method that makes use of Gaussian process regression (GPR), a non-parametric and Bayesian approach. The method takes measured projected wind velocities and prior distributions of the wind velocity as a function of space and time, speciﬁed by the user or estimated from the data, and produces posterior distributions for the wind velocity. Computation of the predictive posterior distribution is performed on sampled points of interest and is not necessarily regularly sampled. The main 10 beneﬁts of the GPR method include this non-gridded sampling, the built-in statistical uncertainty estimates, and the ability to horizontally-resolve winds on relatively small scales. The performance of the GPR implementation has been evaluated on Monte Carlo simulations with known distributions using the same spatial and temporal sampling as one day of real meteor measurements. Based on the simulation results we ﬁnd that the GPR implementation is robust, providing wind ﬁelds that are statistically unbiased and with statistical variances that depend on the geometry and are proportional to the prior velocity vari- ances. A conservative and fast approach can be straightforwardly implemented by employing overestimated prior variances and distances, while a more robust but computationally intensive approach can be implemented by employing training and ﬁtting of model parameters hyperparameters. The latter GPR approach has been applied to a 24-hour data set and shown to compare well to previously used homogeneous and gradient methods. Small scale features have reasonably low statistical uncertainties, implying geophysical wind ﬁeld horizontal structures as low as 20-50 km. We suggest that this GPR approach forms a suitable method for MLT regional and weather studies.


Introduction
The mesoscale neutral dynamics of the mesosphere and lower thermosphere (MLT) region are challenging to study, despite their importance in global circulation models. Due to the lack of observations, these scales are usually parameterized in models (e.g., Liu, 2019). MLT large scale dynamics have been studied with monostatic specular meteor radars (SMRs) by providing 25 mean horizontal winds over areas with approximately 200-300 km radius at MLT altitudes, and 1-2 hour and 2-3 km temporal and vertical resolutions, respectively (e.g. Hocking et al., 2001;Holdsworth et al., 2004). These measurements have made significant contributions to community understanding of the climatological behavior of mean winds, planetary waves, and total tides over a variety of SMR monostatic sites (e.g. Mitchell et al., 1999Mitchell et al., , 2002Pancheva et al., 2002;Sandford et al., 2006;Hoffmann et al., 2010). Moreover, when the winds from more than one SMR widely separated in longitude at a similar latitude 30 are combined, spatiotemporal ambiguities of tides and planetary waves have been successfully resolved (e.g., Murphy, 2003;Murphy et al., 2006;He et al., 2018;He and Chau, 2019). Monostatic SMRs have been also used to study MLT gravity wave momentum flux with wide and narrow beam observing configurations, with the caveat that spatial and temporal contributions are combined (e.g. Hocking, 2005;Fritts et al., 2012;Andrioli et al., 2013;Placke et al., 2015).
Recently, multistatic configurations have been proposed to complement these previous studies and to allow the investigation 35 of MLT mesoscale dynamics. These configurations include the MMARIA (Multi-static Multi-frequency Agile Radar Investigations of the Atmosphere) concept (Stober and Chau, 2015;Chau et al., 2017). This concept has been further augmented by the SIMONe (Spread Spectrum Interferometric Multistatic meteor radar Observing Network) approach . By using recent technological developments in atmospheric radars, such as spread-spectrum, MIMO (Multi-input multiple-output), and compressed sensing approaches (Vierinen et al., 2016;Urco et al., 2018Urco et al., , 2019b, SIMONe allows the implementation 40 of MMARIA with several attractive qualities: it is easier, cheaper, and inherently expandable compared to original proposed configurations using traditional pulsed systems. Examples of SIMONe implementations in Germany, Peru and Argentina can be found in several studies Charuvil Asokan et al., 2020;Vargas et al., 2020;Chau et al., 2021;Conte can aggressively smooth real spatial structure and, in some cases, can introduce artificial structure, particularly in regions with sparse or noisy measurements. In recent years, a variety of analysis approaches using statistical inverse theory have been applied to these and similar problems. These studies have the goal of estimating the spatial structure of multi-point projected wind velocities and electric fields (e.g., Nicolls et al., 2014;Hysell et al., 2014;Harding et al., 2015;Stober et al., 2018). For example, a Tikhonov regularization originally developed for a optical network of Fabry-Perot Interferometers (Harding et al.,60 2015) has been adapted to yield MLT wind fields over Peru .
As in any statistical inverse theory problem, more independent samples are desirable to reduce the impact of regularization constraints and to improve the quality of the estimates. In November 2018, a short observing campaign was conducted in northern Germany, herein denoted SIMONe2018, in which six existing MMARIA links were complemented with eight additional SIMONe links. During this campaign, we obtained on average two hundred thousand meteor scatter observations per day (e.g., Vierinen et al., 2019;Charuvil Asokan et al., 2020). For reference, a monostatic SMR obtains on average ten thousand meteors per day at a comparable latitude and seasonal time.

Specular meteor radar measurements and geometry
SMRs receive echoes from meteor trails when the radar Bragg vector (k B ) points perpendicular to them. The Doppler shift (f ) of the received signal of a meteor echo at time t and location given by longitude, latitude, and altitude (Λ, Φ, z) results from 95 the projection of the atmospheric wind vector (u) in the Bragg vector k B (e.g, Hocking et al., 2001;Holdsworth et al., 2004), where k u , k v , and k w are the Bragg vector components of k B and u, v, and w are wind vector components of u in the zonal (East), meridional (North), and vertical (Up) directions, respectively. The Bragg vector is given by the difference of the scattered 100 and incident wavevectors, i.e., k B = k s − k i . Using interferometry on reception, the angle of arrival (AOA) is obtained. In the case of MIMO systems, interferometry is also implemented on transmission, allowing measurement of the angle of departure (AOD) (e.g., . By combining these angles along with range information, the meteor location (Λ, Φ, z) and Bragg wavevectors are obtained. In the reductive case of monostatic systems, k B = −2k i and its magnitude is equal to 4π/λ, where λ is the radar wavelength.

105
As mentioned above, traditionally a mean horizontal wind has been obtained from analysis that simultaneously solves N equations of the form of (1), with the assumption that the wind is constant in the observed volume (zero-order Taylor approximation or homogeneous method). The data for the N -equation set was obtained by binning desired observations with regular altitude and temporal resolutions. In general, with a sufficient number of meteors and viewing angles, the method yields spatial information of the wind inside the observed volume. For example, Chau et al. (2017) has implemented a gradient 110 method, where the wind field estimation includes the first-order Taylor expansion terms.
In multistatic geometries, both the observed volumes and separations of the multi-static links are relatively large. For this reason, it is necessary to take the Earth's geoid shape into account. Moreover, the GPR model described in the next section is directly dependent on calculating coordinate distances accurately. This implies that altitudes and horizontal distances that account for the Earth's curvature, the measurement goal, must also try to minimizing mapping distortions, particularly in 115 distance scaling. Use of a naive geometric projection such as the equirectangular projection, in which latitude and longitude are simply scaled to yield x−y coordinates in meters, does not satisfy these requirements. Therefore, in this work, we use a local azimuthal equidistant projection centered in the observing region, with Earth shape based on the well known WGS84 geoid model. This projection is used to transform longitude and latitude into local x and y coordinates, where horizontal distance in x and y reasonably approximates the true geodesic distance. Subsequently, we use these (x, y) projected coordinates in place of 120 (Λ, Φ) geodetic coordinates from (1). Note that this does not change the definitions of (u, v, w) and k B , which remain aligned with a local East-North-Up coordinate system and not, in general, with the projected x and y coordinates.
To represent a set of Doppler wind measurements, we use the following notation for the measurement equation. Let x m = (t m , z m , y m , x m ) denote the coordinates for a measurement m of M . Then the ensemble of coordinates is given by the matrix and the corresponding wind vectors are given by We group the Bragg vectors of a set of measurements by component and combine with the 1 2π scaling to give u, v, and w measurement vectors: Finally, using to denote the element-wise (Hadamard) vector product, our measurement equation following from (1) for the ensemble of Doppler measurements f is where ∼ N 0, Σ n is zero-mean Gaussian measurement uncertainty with covariance Σ n .

Estimation Problem
The estimation task is to take a set of Doppler measurements f and infer wind values u(x ), v(x ), w(x ) at a chosen location x using the measurement model from (5). We employ Gaussian process regression (GPR) to model the winds and hence Doppler measurements as a stochastic process. This approach allows estimation at arbitrary coordinates (convenient for the random meteor locations and non-gridded prediction) and produces statistical uncertainty as an output product.

140
Our GPR method is implemented as a 3-staged process. First, one defines the form for the model, which includes mean and covariance functions and their parameters :::::::::::::: hyperparameters. Then, one fully specifies the model by setting parameter Doppler measurements measurement uncertainty Test points x * = (lat * , lon * , alt * , t * )  hyperparameter values, either through prior knowledge or a separate fitting process. Finally, one applies the specified model to a set of measurements to calculate the posterior predictive distribution and make an estimate at points of interest. Figure 1 summarizes our implementation in a block diagram. In the following paragraphs, we describe the method in detail.

Gaussian process definitions
For a function f (x) drawn from a Gaussian process, we write This representation is fully defined by mean and covariance functions which describe the first-and second-order statistics: where E denotes expected value. Gaussian processes are convenient because evaluating them at a set of points leads to a Gaussian random vector which enables tractable computation. We recast this compactly using matrix notation as 155 f (X) ∼ N (m(X), K(X, X)).

Wind component prior distributions
Since we want to estimate the wind components, we model them as independent Gaussian processes:
For the covariance functions, we choose a functional form where each wind component has an independent amplitude 180 multiplying a common distance kernel: Using a common distance kernel is convenient for simplifying computations, and we expect that relaxing this assumption in 185 the future would allow for increased expressiveness at the cost of computational burden. The distance kernel κ d is chosen to be the Matérn covariance with ν = 5 2 , using length scales given by δ t , δ z , δ y , and δ x for the coordinate dimensions: where . 2 represents the Euclidean norm. Altogether, this results in a parameter ::::::::::::: hyperparameter set θ of for the GPR wind model. We chose the Matérn-5 2 covariance because it is twice-differentiable but not infinitely-differentiable, so it provides relatively smooth functions while still allowing for rapidgeophysically driven , ::::::::::::::::: geophysically-driven : changes that 195 might be expected in wind fields. It is a typical choice for physical processes for this reason across a wide series of applications (Rasmussen, 2004) ::::::::::::::::::::::::::: (Rasmussen and Williams, 2006). Jointly and in matrix notation, we then write the Gaussian random vectors for the winds at a set of points X as Note that since we have defined the wind component processes independently, the cross terms are zero in the joint covariance 200 matrix. However, this is not to say that we strictly enforce zero cross-covariance between the wind terms with this model.
Rather, it is more accurate to say that we do not require prior knowledge of the cross-covariance but also cannot benefit from the improved estimation that such knowledge would provide.

Doppler measurement prior distribution
Since we are taking the wind components as Gaussian processes, and (5) provides a linear relationship between the wind a set of measurements f corresponding to the locations X, this produces a formulation as where Note that the Gaussian process being measured is a linear composition. This is only a minor concern for our application, but it does make the formulation slightly different from the more typical examples. The following subsections provide the explicit formulas necessary to perform parameter ::::::::::::: hyperparameter fitting and wind estimation using this model.
3.4 Model parameter :::::::::::::: hyperparameter fitting 215 Fitting for the model parameters ::::::::::::: hyperparameters θ involves maximizing the likelihood function for the marginal distribution pertaining to a set of measurements. Assuming Doppler measurements f coming from the distribution defined in (22), the negative log-likelihood as a function of the parameters ::::::::::::: hyperparameters : is where C is a fixed scaling constant. Minimizing this function requires evaluating the gradient of the negative log-likelihood.

Wind estimation 235
Having defined the model parameters ::::::::::::: hyperparameters : either through fitting or prior specification, estimating the winds at a set of prediction points X * involves evaluating the posterior probability distribution given the measurements.
Similarly, we obtain an estimate of the prediction uncertainty by using the posterior variance for each wind component, given by Since the measurement covariance K f term includes the assumed measurement noise, these equations effectively propagate the Doppler uncertainty through the measurement geometry and meteor density to produce the wind estimate uncertainty. However, we note that this uncertainty estimate ignores the cross terms in the covariance, both between test locations and among the wind 265 components. These factors can also be included to give a more complete picture of how the individual estimates are correlated, at an increased computational cost. More detailed estimates could also be backed by a fully Bayesian approach that involves Markov chain Monte Carlo sampling of the posterior predictive distribution and includes full distributions for the parameters ::::::::::::: Evaluating the posterior mean and covariance is a straightforward numerical linear algebra problem. However, given the 270 potential sizes of the various covariance matrices, this can be computationally expensive. Mitigation of this implementation burden can be achieved with both matrix-free and approximate methods (e.g. Gardner et al., 2018;Wilson and Nickisch, 2015). Application of these methods are the subject of future work, but we note that their use would make practical fitting and evaluating more tractable. antennas and receiving systems located in Neustrelitz and Bornim were used to receive the coded CW signals, forming both MISO and SIMO (single-input multiple-output) links at both sites.
Using the simulated measurements, we followed the GPR method from section :::: Sect. : 3 to estimate the 4D wind field for 310 comparison to the simulated winds. We explored fitting with different cubic spline forms for the mean wind functions, and qualitatively we found that the wind estimates were not sensitive to the details of the fit as long as it was reasonable. Even using a constant mean of zero produced qualitatively similar results. Thus, to remove a confounding variable, all of the estimation results presented in this section use the exact mean functions that were used to simulate the winds, :::::: which :: in :::: turn :::: are ::: the :::: same ::::: mean :::::::: functions ::::: fitted :: to ::: the ::::::::::: SIMON2018 :::: data ::: as :::::::: described :: in ::::: Sect. :: 6. Likewise, we fit for the covariance parameters 315 ::::::::::::: hyperparameters : from the simulated measurements and found that the results were similar (within 10%) to the values used for the simulation. This was reassuring and showed that the fitting procedure works at least when the winds can be described exactly by a Matérn-covariance Gaussian process. Similar to the mean, the estimated winds showed little qualitative sensitivity to small changes in the covariance parameters ::::::::::::: hyperparameters, so for the subsequent estimation results we used (as a baseline case) the same values for the amplitudes and length scales between the simulation and estimation Gaussian processes in order 320 to remove fitting noise as a confounding variable. These comparisons should be viewed as a best-case scenario from the perspective of the model, and therefore they can be used primarily to explore the effects of meteor measurement spatial density and geometry on the quality of the wind estimates.

375
The posterior predictive uncertainties are plotted against the error variance in ::: Fig. : 6 for both the horizontal (left) and vertical (right) wind components. In the horizontal case, we show the results of the total horizontal wind speed, i.e., √ u 2 + v 2 .
Lines give the mean of the error variance distribution, while the shaded region indicates the 90% confidence interval. For green shading obscures ::::::: indicating ::::: more :::::::: confidence :: in the bias ::::: central :::: areas : where few measurements are available ::: the ::: bias ::: also :::: tends :: to :: be :: a ::: little ::::: lower. Contours on the ::: error : variance plots correspond to the ::::: sample : error variance as with ::::::: (matching the color scale :::::: coloring).  Based on these Monte Carlo simulations, we recommend one of two approaches for applying GPR depending on the re-385 quirements of precision. First, if computational speed is a constraint and relatively large uncertainties are acceptable, then using conservative overestimates of the wind variances to specify the covariance amplitudes will ::: still : yield unbiased wind estimates with uncertainties that can be treated as rough upper bounds on the error variances. Second, if more precision is needed and computational time is not a problem, then fitting on the incoming data to get more accurate estimates of the prior covariance amplitudes will yield unbiased wind estimates with more accurate uncertainties. This choice between specifying 390 the covariance parameters :::::::::::::: hyperparameters and fitting for them is a critical decision for any user of the GPR method, as seen already in the block diagram of Fig. 1.

Experimental Results
In this section we implement the proposed wind field estimator on a data set of 24-hour observations collected on November 415 5, 2018 during the SIMONe2018 campaign. After initial data quality control, almost 2 × 10 5 ::::::: 200,000 meteor detections were obtained in 24 hours. Using a conservative approach and performing further quality checks yielded 1 × 10 5 :::::: 100,000 : high quality detections. The filter criteria used in this second reduction required that detections were (a) within three standard deviations of the zero-order residuals, and (b) more than 30 • above the horizon, to ensure that good interferometric angle of arrival (AOA) or angle of departure (AOD) estimates were obtained (e.g. . ::::::: Filtering :: by :: a :::::::: minimum :::::::: elevation applied on overlapping 90-minute windows spaced at 30 minute intervals to estimate the covariance amplitudes and length scales as they varied throughout the day. With the current procedure that computes the full covariance matrix, limiting to short time intervals like this is necessary for computational feasibility. The parameters :::::::::::::: hyperparameters were found to be constant enough throughout the day that approximate overestimates would suffice and allow proceeding with a single set 430 of parameters ::::::::::::: hyperparameters. The resulting covariance parameters ::::::::::::: hyperparameters : are: σ 2 u = σ 2 v = 900 m 2 s −2 , σ 2 w = 90 m 2 s −2 , δ x = δ y = 26 km, δ z = 3 km, and δ t = 900 s. Finally, the wind estimates were produced by selecting a fixed time, gathering data from the 90 minute window around that time (more than enough given the time length scale of 15 minutes), and computing the posterior predictive values at chosen spatial points.
To get a sense of the scales resolved with the GPR method, Figure 7 shows latitude-longitude slices of wind fields at three 435 different altitudes (84, 90, and 96 km) and three different times (05, 08, and 11 UT). The presentation format is similar to Fig. 2, i.e., horizontal wind magnitudes :::::: speeds are color-coded, streamlines of the horizontal wind are indicated and color coded with the vertical wind ::: and :::::::::: streamlines :::: show ::: the :::::::: direction :: of :::: flow. Areas of large velocity variance are shaded with 50% transparency to white. The wind fields show significant complexity, much more than can be well represented by the single mean vector per plot that would be reported by a monostatic meteor radar. On simple inspection, horizontal wind structures of ∼ 20-50 km are 440 successfully resolved, which is commensurate with the horizontal length scale parameter :::::::::::: hyperparameter : of 26 km.
In Figure 8, altitude-time slices at selected latitude-longitude points are shown for both zonal (left) and meridional (right) wind components. The large-scale tidal features are in good agreement with those obtained with the homogeneous method applied to the same data (see, Vierinen et al., 2019, Figure 6). The winds show significant variation between horizontal locations as expected.

445
Although we do not have a ground truth in this analysis to validate the horizontal scales we are resolving, we conduct an additional comparison to complement earlier identification of the large scale features (i.e., tides). In Figure 9, we compare GPR wind fields with those obtained with the homogeneous method (i.e., independent of latitude and longitude), and those obtained with a gradient method. Specifically, the homogeneous method uses a zero-order Taylor expansion, while the gradient method uses a first-order Taylor expansion. Both estimates have been obtained with altitude and temporal bins of 4 km and 4 hour 450 respectively, in order to produce a good representation of large scale features. The specifics of the two methods can be found in Chau et al. (2017) and Chau et al. (2021), respectively.
The gradient wind fields are shown in the first row of Fig. 9 for three selected altitudes (84, 89 and 94 km). The arrows are color-coded with the horizontal wind speed (green tones), while the mean vertical wind from the gradient method is colorcoded with red-yellow-blue tones. In the second row the GPR 3D wind fields are displayed in a similar manner to the gradient 455 estimates in the first row. The third row shows the difference between the GPR wind fields and those from the gradient method.
Note that the arrow colors and colorbar in the third row are different from the first two rows and show the difference of the horizontal winds. In all three rows the horizontal wind from the homogeneous method is shown with a thick black arrow in the center.
The salient features of Fig. 9 are:   . Comparison of GPR to gradient and homogeneous methods. The first row shows the horizontal wind field obtained with the gradient method using 4-hr and 4-km bins; the second row shows the horizontal wind field obtained with the GPR method using fitted covariance parameters :::::::::::: hyperparameters; and the third row shows the wind field difference between the values in the second row and the mean horizontal wind indicated in all panels with a black arrow. In all cases, a normalized statistical variance is indicated as gray contour lines, while the color contour represents the vertical component from gradient method (first row), GPR method (second row), and GPR minus mean from gradient method (third row). The row two color bar corresponds to the background vertical wind coloring while the other two color bars correspond to their respective arrow colors.
that the gradient estimates have been obtained with relatively large temporal and vertical averaging, in order to produce a good representation of large-scale features.
-As expected, the GPR method provides smaller horizontal spatial structure than the gradient method (cf. second row).
-By subtracting the mean wind obtained with the gradient method (i.e., large scale features) from the GPR estimates, in 465 the third row, mesoscale structures are identified. Horizontal structures in the order to 20-50 km are clearly identified in all three altitude cuts.
Similar wind field comparisons for different times of the day can be found in supplemental material Movie S1.
In the particular case of the gradient method, Chau et al. (2017) have previously shown that the mean vertical velocity obtained with the homogeneous method, i.e., an area of ∼200 km radius, was contaminated by the mean horizontal divergence. Similar 500 effects would be expected at smaller scales.

510
These results represent just the first step toward applying GPR analysis to estimate wind fields from meteor observations.

535
Finally, we plan to apply the GPR method to selected additional datasets that use a multi-static configuration in order to further investigate the properties of the resolved 20-50 km horizontal wind structures. These investigations will cover both individual case studies and statistical studies: for the former, we expect to analyze special geophysical conditions and/or measurements that are complemented by other ground-or satellite-based instruments (e.g., Davis et al., 2018;Vargas et al., 2020); for the latter, we expect to compare the Reynolds stress tensor statistics of GPR-estimated wind fields to those obtained 540 from second-order statistics of projected wind velocities .

Conclusions
We have introduced an alternative observation method based on Gaussian process regression analysis to resolve MLT wind fields in 4D from multistatic radar observations. Based on Monte Carlo simulations of known wind field distributions, our proposed method provides unbiased mean velocity estimates and posterior velocity variances that are proportional to prior 545 velocity variances. By using an adaptive fitting procedure based on input data, unbiased posterior variances can be achieved.
This adaptive approach is currently not practical for real-time applications, but is ideal for case studies.
The horizontal regions of good GPR method performance in MLT wind determination are dependent on the meteor scatter geometric configuration. On one hand, optimal configurations should ultimately increase the number of detections. However, on the other hand, these same configurations need to provide sufficient Bragg vector diversity. For the particular SIMONe2018 550 experiment scattering geometry, these factors meant that vertical velocity estimates with relatively small variances were obtained over a much smaller horizontal area than horizontal wind estimates.