the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Sampling error in aircraft flux measurements based on a highresolution large eddy simulation of the marine boundary layer
Grant W. Petty
A highresolution (1.25 m) large eddy simulation (LES) of the nocturnal cloudtopped marine boundary layer is used to evaluate random error as a function of continuous track length L for virtual aircraft measurements of turbulent fluxes of sensible heat, latent heat, and horizontal momentum. Results are compared with the widely used formula of Lenschow and Stankov (1986). In support of these comparisons, the relevant integral length scales and correlations are evaluated and documented. It is shown that for heights up to approximately 100 m ($z/{z}_{i}=\mathrm{0.12}$), the length scales are accurately predicted by empirical expressions of the form I_{f}=Az^{b}. The Lenschow and Stankov expression is found to be remarkably accurate at predicting the random error for shorter (7–10 km) flight tracks, but the empirically determined errors decay more rapidly with L than the ${L}^{\mathrm{1}/\mathrm{2}}$ relationship predicted from theory. Consistent with earlier findings, required track lengths to obtain useful precision increase sharply with altitude. In addition, an examination is undertaken of the role of uncertainties in empirically determined integral length scales and correlations in flux uncertainties as well as of the flux errors associated with crosswind and alongwind flight tracks. It is found that for 7.2 km flight tracks, flux errors are improved by factor of approximately 1.5 to 2 for most variables by making measurements in the crosswind direction.
1.1 Background
The eddy covariance (EC) method has long been the principal means of making field measurements of turbulent fluxes of energy, momentum, and trace gases in the planetary boundary layer (PBL). While towers are commonly used to measure fluxes over longer time periods (weeks to years) at fixed locations, aircraft are the preferred platform for obtaining estimates of fluxes over larger areas, albeit for far shorter periods of time.
By their nature, EC estimates of fluxes are averages of quantities that randomly fluctuate on timescales ranging from fractions of a second to hours and over distance scales from centimeters to kilometers. Thus, in addition to the usual factors affecting measurement quality from any instrument, such as calibration, precision, time response, and site representativeness, the quality of eddy correlation estimates is critically subject to statistical sampling error as well as to potentially restrictive assumptions about temporal stationarity and/or spatial homogeneity.
In this paper, we are concerned exclusively with the problem of random sampling error in the context of aircraft flux measurements relying on in situ (e.g., gust probe) measurements of fluctuating wind vector and scalar quantities directly along the flight path. The key question that has been asked for decades is, how long of a flight path – whether consisting of a single extended leg or a series of shorter legs over a smaller region – is sufficient to obtain turbulent flux estimates of useful precision? This problem was examined by Lenschow and Stankov (1986), Lenschow et al. (1994), and Mann and Lenschow (1994), relying on statistical models of turbulence supported by observations. Mahrt (1998) further examined the sampling problem in the context of the problems posed by nonstationarity. Others took an empirical approach, comparing aircraft flux estimates with those from nearby fixed towers (Desjardins et al., 1989; Grossman, 1992; Mahrt, 1998).
Finkelstein and Sims (2001) used what was characterized as a more general yet rigorous derivation to arrive at a different expression for the flux sampling error. Based on the properties of test data sets for several variables, their computed flux errors were typically about a factor of 2 larger than those predicted by the expressions of Mann and Lenschow (1994).
In recent years, it has become possible to use large eddy simulation (LES) models to explicitly resolve turbulent fluctuations of mass, energy, and motion in the planetary boundary layer at fairly fine scales, leaving only a small fraction of the total turbulent exchange to subgridscale parameterizations, especially at levels far above the surface. Model turbulence fields can be directly compared with aircraft measurements in the same environment, as was done, for example, by Brilouet et al. (2020) for a coldair outbreak over the northwest Mediterranean, who found that the LES successfully reproduced convective structures observed by the aircraft.
Alternatively, the LES fields may serve as a virtual environment within which turbulence may be sampled in a manner consistent with the platform (Schröter et al., 2000; Sühring and Raasch, 2013; Sühring et al., 2019). An attractive feature of this second approach, which is the focus of this paper, is the unique ability to compare the flux estimate obtained from a limited sample with the known “true” domainaveraged value, in contrast to the case for realworld comparisons between inherently disparate aircraft and tower measurements of the areaaveraged flux.
A closer look at Schröter et al. (2000) is worthwhile in light of significant similarities in both their objectives and their approach to those of the present paper. They flew virtual aircraft through a 401 × 401 × 42 LES domain simulating a continental convective boundary layer at heights ranging from 175 to 1075 m. “Measured” sensible heat fluxes were compared with “true” domainaveraged fluxes, and the ensemble flux errors determined. These errors were found to compare well with expressions of Lenschow and Stankov (1986) discussed below. They found that 2000 s flight durations, corresponding to a distance of 200 km, were sufficient to achieve 10 % precision in the flux estimates.
The more recent studies by Sühring and Raasch (2013) and Sühring et al. (2019) have focused on using virtual aircraft flights through LES domains to assess the ability to resolve the effects of surface inhomogeneities on aircraft flux measurements.
Schröter et al. (2000)Sühring and Raasch (2013)Sühring et al. (2019)An important limitation of LES studies has been the tradeoff between domain size and the ability to resolve turbulence at the finest scales, a consequence of the high computational cost of combining large domains and short time steps. For this reason, even fairly recently published studies using LES to study the flux sampling problem have utilized horizontal grid dimensions of the order of 2×10^{3} or lower and horizontal grid resolutions of 7–50 m (Table 1). Coarser grid resolutions imply a significant role for parameterized subgridscale fluxes and preclude a complete examination of the flux sampling problem for low and slowflying light or ultralight aircraft (Metzger et al., 2011; Vellinga et al., 2013) and unmanned aerial vehicles (UAVs; Elston et al., 2015).
The present study is motivated by the recent availability of LES results for a 4096 × 4096 domain with 1.25 m solution, offering an unusually large range in resolvable scale. While the resulting 5.1 km domain size is still insufficient to capture some potentially important largerscale modes of variability (e.g., Brooks and Rogers, 1997; Brilouet et al., 2017), it is a sufficient improvement to motivate a fresh look at the aircraft flux sampling problem. This is especially true when the focus is expressly restricted to relatively lowlevel, purely turbulent motions having scales comparable to or less than the depth of the boundary layer.
1.2 Objectives
The central purpose of this paper is to investigate the empirical relationship between sampling error in turbulent flux measurements and the length of a continuous, ideal aircraft track through the virtual atmosphere represented by an LES. Specifically, the analysis is based on a single time step of the LES run described by Matheou (2018). Because the simulation utilized periodic lateral boundary conditions, it is possible to construct ensembles of simulated flight paths that are continuous over longer distances and yet do not resample the same locations in the domain. We are thus able to focus narrowly on the precision of aircraft flux measurements as a function of flight track length alone in the particular environment represented by this simulation without the complication of boundary effects or the averaging of shorter segments.
To put the empirical determinations of error into perspective, results are compared with the widely used formula of Lenschow and Stankov (1986). That relationship in turn depends on estimates of the socalled integral length of the fields being measured, so documenting integral lengths obtained from the LES fields and their dependence on height and wind direction relative to flight tracks is a related objective. Additionally, the relative error of crosswind versus alongwind flux measurements is examined for a single track length of 7.2 km.
As previously noted, the general methods and motivations are in many ways similar to those of Schröter et al. (2000) and Sühring and Raasch (2013). However, horizontal inhomogeneity of the surface and thus of the boundary layer forcing is eliminated as a factor in the present analysis, domain resolution is improved allowing the simulation of lowerlevel flights (down to 10 m), and the environment considered here is a midlatitude summertime marine boundary layer topped by a stratocumulus cloud deck.
1.3 Overview
In the following section, I describe the LES setup and the simulated environment, and I present selected features of the simulation results that lend confidence in the apparent realism of the LES as it relates to the present analysis. Section 3 describes the calculation of turbulent flux quantities from the LES fields and presents ogive plots depicting the dominant spectral contributions to vertical fluxes at selected heights about the ocean surface. Section 4 introduces the problem of characterizing random error in flux measurements from finite aircraft tracks and focuses in particular on the determination of integral length scales and correlations for turbulent quantities required by widely used expressions for the flux sampling error. In Sect. 5, I describe the algorithm for defining continuous, periodic flight tracks of varying lengths and for obtaining “virtual” flux measurements along those tracks. Section 6 describes key results of the analysis, including the following components: (1) empirical determinations of sampling errors as a function of track length, (2) comparisons with the widely used expressions of Lenschow and Stankov (1986), (3) an empirical examination of the potential contribution of uncertainties in estimates of integral lengths and correlations from flight tracks to the total random error, and (4) a comparison of flux errors for crosswind and parallel tracks for 7.2 km tracks. A short summary of key results is presented in Sect. 7.
2.1 LES description
In one of the largest LES model runs ever published, Matheou (2018) used a buoyancyadjusted stretchedvortex model (Chung and Matheou, 2014; Matheou and Chung, 2014) to simulate a nocturnal stratocumulus case over the Pacific Ocean southwest of Los Angeles (vicinity of 32^{∘} N, 122^{∘} W), corresponding to the 10 July 2001 research flight (RF01) of the second Dynamics and Chemistry of Marine Stratocumulus (DYCOMSII) field study (Stevens et al., 2003).
The LES grid resolution is 1.25 m in the x, y, and z directions, and the grid size is 4096 × 4096 × 1200, corresponding to a 5.12 × 5.12 × 1.5 km domain with periodic lateral boundary conditions. Because of the considerable computing resources required, it was only possible to simulate the atmospheric boundary layer for 2 h of physical time. Nevertheless, the simulation required 36 d of wallclock time on 4096 CPU cores at the NASA Advanced Supercomputing (NAS) Division at Ames Research Center.
The single final time step of the simulation, comprising a total of ∼ 800 GB of gridded data, was utilized for this study. Output included all gridresolved model variables, including temperature T, specific humidity q, wind velocity components u, v, and w, and cloud water content q_{l}.
2.2 Simulated environment
The environment was characteristic of the summertime closedcell marine stratocumulus that prevails over the ocean off the southern California coast. The operations summary for the flight day describes the case as “very homogeneous” (Earth Observing Laboratory, 2007).
Consistent with the observed environment on the flight date, the LES was initialized with a boundary layer depth z_{i}=840 m delineated by a sharp 8.5 K temperature inversion. Below z_{i}, the initial potential temperature θ was 289 K and the specific humidity q was 9 g kg^{−1}. These values did not change significantly during the simulated evolution of the boundary layer (Fig. 1). A prescribed sea surface temperature of 292.5 K was used, leading to domainaveraged sensible and latent heat fluxes of 15 and 115 W m^{−2}, respectively. The wind was initialized as geostrophic; at the conclusion of the simulation, the mean horizontal wind components ($\stackrel{\mathrm{\u203e}}{u}$ and $\stackrel{\mathrm{\u203e}}{v}$) below z_{i} were nearly uniform at 6.7 and −5.0 m s^{−1}, respectively, corresponding to a wind direction of 307^{∘} (approximately northwesterly) and speed of 8.4 m s^{−1}.
The cloud base and top, defined here as the lowest and highest levels at which nonzero cloud liquid water occurred at any point in the horizontal domain, were found at 500 and 885 m, respectively. This paper is concerned exclusively with the subcloud boundary layer at and below 400 m or 0.48z_{i}.
2.3 Turbulent structure and domainaveraged fluxes
Many details concerning the model run and the overall turbulent structure of the simulated boundary layer are given by Matheou (2018). Here, we focus on those gridresolvable turbulent properties most directly relevant to an understanding of the simulated flux measurements.
Throughout this paper, results are highlighted at four representative heights: 10, 40, 100, and 400 m. For the environment simulated, these correspond to approximately 0.01z_{i}, 0.05z_{i}, 0.12z_{i}, and 0.48z_{i}, respectively. While subjectively chosen, these heights have some correspondence to realworld aircraft observations, including Cook and Renfrew (2015), who observed marine fluxes around the British Isles at an altitude of ∼ 40 m, and the recent Chequamegon Heterogeneous Ecosystem Energybalance Study Enabled by a Highdensity Extensive Array of Detectors 2019 campaign (CHEESEHEAD19), in which the University of Wyoming King Air flew legs at 100 and 400 m (Butterworth et al., 2021). The lowest height of 10 m was included because of interest by the author in potential flux measurements using the University of Wisconsin ultralight research airplane.
2.3.1 Spectra
Power spectra were computed for vertical velocity w, temperature T, specific humidity q, and horizontal wind speed U for each of above levels. This was done by first taking the 2D Fourier power spectrum of the entire horizontal slice through the domain and then averaging radially to obtain a directionindependent 1D average power spectrum. For clarity, results are shown for only two heights (40 and 400 m) in Fig. 2.
Figure 2a shows that, at 400 m height (0.48z_{i}), the LES produces a vertical velocity spectrum approximately obeying a Kolmogorov power law for the vertical velocity field over more than two decades, from wavelengths greater than 1 km down to ∼ 8 m or about 3 times the Nyquist wavelength of 2.5 m. At this height, peak energy in the spectrum is near 1 km wavelength. At the much lower height of 40 m (0.05z_{i}), peak energy is found near 150 m, and there is substantially less energy at longer wavelengths. In both cases, the peak energy in w is thus found near ∼ 4 times the physical distance z from the lower boundary, consistent with dominant contributions from eddies of size z centered at height z.
The spectra for temperature (Fig. 2b) and specific humidity (Fig. 2c) reveal considerably more energy at the longest wavelengths, even at 40 m height, while also rolling off earlier than w at the short wavelength end. The spectrum for temperature is also somewhat flatter than for either w or q. The spectrum for horizontal wind speed U (Fig. 2d) is notable in having no significant dependence on height, presumably because proximity to the lower boundary imposes no special constraint on horizontal flow in this lowfriction, neutrally stratified environment.
Overall, the spectra of scalar variables seem reasonable and give no clear cause to doubt the overall realism of the LES for wavelengths spanning the range from ∼ 10 to ∼ 1 km. Nevertheless, the constraint on the longwavelength end due to the approximately 5 km domain size cannot be overlooked. While it is easy to discount the importance of the very short wavelengths that are artificially dissipated in the numerical model, it is more difficult to assess the influence of the domain size on turbulent structures of the order of 1 km and greater. de Roode et al. (2004) found that peak energy for vertical motion and for virtual potential temperature are found at wavelengths comparable to the boundary layer depth – which is less than 1 km in the present case – but that humidity and potential temperature can exhibit significant variations on larger scales.
2.3.2 Horizontal cross sections
Horizontal cross sections of temperature T^{′}, specific humidity q^{′}, and vertical velocity w^{′} are presented for these heights in Fig. 3, where the primed quantities indicate instantaneous, local deviations from the domainaveraged value at each height. These show an evolution from predominantly finescale structure near the surface to consolidated perturbations of order 1 km diameter at 400 m. In the w^{′} field near the surface, there is a visually obvious alignment of structures with the mean wind. This directional anisotropy is far less apparent in q^{′} and is completely absent from T^{′} at low levels.
The local contributions to the turbulent fluxes of sensible heat (SH) and latent heat (LH) are
and
where $\stackrel{\mathrm{\u203e}}{\mathit{\rho}}$ is the mean air density at level z, C_{p}=1005 J kg^{−1} K^{−1} is the specific heat capacity at constant pressure, θ^{′} is the potential temperature perturbation, and $L\approx \mathrm{2.5}\times {\mathrm{10}}^{\mathrm{6}}$ J kg^{−1} is the latent heat of vaporization of water. Note that for the low altitudes and nearstandard pressures p encountered in this simulation, ${\mathit{\theta}}^{\prime}={T}^{\prime}(\mathrm{100}\phantom{\rule{0.125em}{0ex}}\mathrm{kPa}/p{)}^{\mathrm{0.286}}\approx {T}^{\prime}$.
In the above expressions, the common notation LE for latent heat flux is avoided because it implies a product of the latent heat of vaporization L and surface evaporation rate E. At any significant distance above the surface, the actual vertical flux of water vapor measured by an aircraft may or may not equal E. All fluxes described herein should be understood as flightlevel fluxes without any assumption about their relationship to surface fluxes. To reinforce this distinction, the more generic symbols (SH and LH) have been adopted here.
The local u and v components of momentum flux are computed as
Where appropriate in the discussion below, the horizontal wind vector (u,v) is rotated into a coordinate system aligned with the mean wind direction so as to isolate the alongwind and crosswind components $({U}_{\parallel},{U}_{\u27c2})$.
Figure 4 depicts the ratio of parameterized subgridscale flux to total vertical flux (subgridscale plus resolved) as a function of height above the surface. At 10 m height, the fraction is less than 5 % for LH and SH and less than 2 % for horizontal momentum flux. Consequently, we can neglect the unresolved components of the fluxes for the purposes of this study.
Domainaveraged resolved fluxes and turbulent kinetic energy (TKE) are depicted in Fig. 5. These fluxes serve as the nominal “truth” values against which simulated aircraft flux measurements will be evaluated.
As expected, the profile of sensible heat flux (Fig. 5a) is quite linear except within cloud, where latent heating due to condensation sharply increases the positive correlation between T and w. It crosses through zero near 445 m (0.53z_{i}). The profile of latent heat flux is noticeably less smooth, as are τ_{u} and τ_{v} (Fig. 5b). Turbulent kinetic energy varies only weakly below cloud and has a sharp maximum at cloud top, as expected. (Fig. 5c).
Figure 6 presents ogive (cumulative distribution) plots of the spectral contributions to each flux. For all four flux variables, there is a pronounced shift from predominantly shortwavelength contributions at the lowest levels to much lowerfrequency contributions at 400 m. In particular, 90 % of the total sensible heat flux contribution comes from the wavelength ranges [6 m, 146 m], [8 m, 0.51 km], [12 m, 1.0 km], and [17 m, 2.6 km] at heights of 10, 40, 100, and 400 m, respectively, with qualitatively similar ranges for latent heat flux and momentum flux. Overall, these results suggest that flux error assessments for heights near or below 100 m are least likely to be seriously biased by the LES domain size of 5.12 km, while 400 km may be more susceptible.
4.1 Defining the problem
According to the eddy covariance method, an estimate ${\widehat{F}}_{\mathit{\psi}}$ of the true turbulent flux F_{ψ} is obtained from a series of closely spaced measurements of scalar variable ψ and vertical velocity w as
where the proportionality allows for appropriate multiplicative constants, and the overline indicates an average over time and/or space, with ${\mathit{\psi}}^{\prime}\equiv \mathit{\psi}\stackrel{\mathrm{\u203e}}{\mathit{\psi}}$ and ${w}^{\prime}\equiv w\stackrel{\mathrm{\u203e}}{w}$. In the present context, the average is along the length of a flight track and is considered to be effectively instantaneous.
Because of the stochastic nature of turbulence, ${\widehat{F}}_{\mathit{\psi}}$ is in general an imperfect estimate of the true flux F_{ψ}, but one that can theoretically improve with a larger sample, subject to assumptions about stationarity. For a single flight track, the error in the flux estimate is
If a large ensemble of identical tracks were flown but sufficiently displaced from one another in time and/or space so as to be statistically independent, the mean error would be expected to converge to zero, but the uncertainty associated with any single track could be characterized via the ensemble rootmeansquared error (Lenschow and Stankov, 1986, Eq. 1):
The fundamental problem is to characterize the above random uncertainty for any particular flight track, given the length of the track and appropriate assumptions about the turbulent environment being measured. An additional complicating factor is that the required true means $\stackrel{\mathrm{\u203e}}{\mathit{\psi}}$ and $\stackrel{\mathrm{\u203e}}{w}$ must themselves be estimated from the finite flight track and will therefore also be subject to random errors.
4.2 Integral length scales
4.2.1 Significance
Central to the problem of estimating random error in aircraft measurements of turbulent fluxes is the integral length scale, also known as the autocorrelation length, of the fields being measured (Lenschow and Stankov, 1986; Lenschow et al., 1994; Sühring and Raasch, 2013):
where A_{f}(r) is the autocorrelation of generic spatial function f(x,y) at displacement $r=\sqrt{{x}^{\mathrm{2}}+{y}^{\mathrm{2}}}$ in any specified direction, A_{f}(0)≡1, and r_{0} is the distance to the first zero crossing or onehalf of the domain diameter (in this case 2.56 km), whichever is smaller. In the ideal “red noise” case, ${A}_{f}\left(r\right)=\mathrm{exp}(r/{I}_{f})$, and ${A}_{f}\left({I}_{f}\right)={e}^{\mathrm{1}}$. The length scale is thus useful for characterizing the minimum track length and/or spatial separation required in order to be able to treat two sets of measurements as statistically independent.
Viewed simplistically, for a track length L=NI_{f}, the random error in ${\widehat{F}}_{\mathit{\psi}}$ as an estimate of the true ensemble value should decrease with ${N}^{\mathrm{1}/\mathrm{2}}$, consistent with the calculation of standard error for any set of N noisy measurements. Richardson et al. (2006) argues persuasively that it is more natural to focus on the absolute rather than relative random error, as the relative error becomes extremely large or even undefined in the common case that the flux approaches or crosses zero, for example, either at sunrise or sunset or near the altitude at which sensible heat flux switches signs. With that in mind, Eq. (7) of of Lenschow and Stankov (1986) is rearranged to give the following absolute random uncertainty:
where ρ_{w,ψ} is the zerolag Pearson correlation coefficient of w^{′} with ψ^{′}, and I_{wψ} is the integral scale length of ${w}^{\prime}{\mathit{\psi}}^{\prime}$.
Lenschow et al. (1994) and Mann and Lenschow (1994) refer to the difficulty of obtaining estimates of I_{wψ} and propose theoretically derived substitutions based on the more commonly available I_{w} and I_{ψ}. The problem of convergence of integral lengths under some circumstances is further discussed by Durand et al. (2000). Subsequent authors, citing Mann and Lenschow (1994), variously utilize
e.g., Sühring and Raasch (2013) or
e.g., Bange et al. (2002).
Because all of the above length scales and correlations can be computed directly from the LES fields, it will be possible to empirically assess the validity of Eqs. (8)–(10).
4.2.2 Determination from LES
For each height under consideration, the twodimensional autocorrelation of the full horizontal model domain was obtained for each turbulent field of interest. I_{f} was then determined in two orthogonal directions: one aligned with the mean wind (“parallel”) and the other perpendicular (“crosswind”). Results are shown in Fig. 7. Note that for momentum fluxes, the wind vector (u,v) was projected onto the parallel and crosswind directions to obtain the rotated vector $({U}_{\parallel},{U}_{\u27c2})$. Thus, the integral length scale labeled “U_{⟂} parallel”, for example, describes that computed for the turbulent flux of the momentum component perpendicular to mean wind measured along a track parallel to the mean wind.
A number of general observations can be made concerning the computed values of I_{f}:

For every variable, I_{f} at and below z=100 m = 0.12z_{i}, corresponding roughly to the surface layer, is well approximated by a straight line on a log–log plot, implying an empirical relationship of the form I_{f}≈Az^{b}. For some variables, such as the specific humidity (Fig. 7a) and the sensible and latent heat fluxes (Fig. 7f, g), the validity of such a relationship seems to extend considerably higher.

For most variables, I_{f} is larger for tracks parallel to the mean wind, as expected, than for crosswind tracks, often considerably larger. Two interesting exceptions are found in I_{f} for temperature above 40 m (Fig. 7b) and for U_{∥} above 100 m, where the crosswind value of I_{f} is up to 50 % greater than for the parallel direction.

For many variables (q, U_{⟂}, w, SH, and LH), the difference between the parallel and crosswind values of I_{f} is largest near the surface and decreases to near zero at higher altitudes. For temperature, however, the reverse is true (Fig. 7b).

The three turbulent wind components behave quite differently, with I_{f} for U_{∥} and U_{⟂} exhibiting a relatively weak and nonlinear dependence on height, while for the vertical wind component w, the dependence is strong. Indeed for the crosswind case, I_{f} for w is almost perfectly proportional to z.

Of all the variables considered, only the turbulent kinetic energy exhibits an integral length scale that is virtually constant with height throughout the full depth of the model domain (Fig. 7i).
The calculated I_{f} results below 100 m were used to obtain coefficients A and b for the aforementioned powerlaw fit. These coefficients are provided in Table 2. Alternatively, ${I}_{f}={A}_{\star}{z}_{\star}^{b}$, where ${z}_{\star}\equiv z/{z}_{i}$, ${A}_{\star}=A{z}_{i}^{b}$ and z_{i}=840 m. It is not known to what degree the results found here are transferable to other environments.
It is noteworthy that the exponents b found here for fluxes of sensible heat, latent heat, and horizontal momentum are quite different than the exponent of $\mathrm{1}/\mathrm{3}$ or $\mathrm{1}/\mathrm{2}$ obtained theoretically by Lenschow and Stankov (1986) for all of these (their Eq. 11), being closer to −1 for the crosswind length scales and $\mathrm{2}/\mathrm{3}$ for the alongwind length scales. It is beyond the scope of this paper to investigate the reasons for the discrepancy.
Apparent in the above results is that Eqs. (9) and (10) generally yield large overestimates of I_{wψ} for any scalar variable ψ. For example, at 100 m height and measured in the crosswind direction, I_{q}=220 m and I_{w}=73 m. $\sqrt{{I}_{q}{I}_{w}}$ is then 127 m, a factor of 2.5 larger than the directly computed I_{wq}=51 m. In general, the minimum overestimate at or below 400 m was found to be about a factor of 2 or greater, with ratios in excess of 3–5 typical at lower heights. The difference increases further if one uses the correlation coefficient ${\mathit{\rho}}_{w,\mathit{\psi}}<\mathrm{1}$ in the denominator per Eq. (10). Specific combinations of variables and heights can be easily examined using the relationships in Table 2.
Also required in the evaluation of random flux errors from Eq. (8) is the correlation ρ_{w,ψ} between the vertical velocity w^{′} and the scalar variable ψ^{′} of interest. Computed profiles of ρ_{w,ψ} are given in Fig. 8. For horizontal momentum, ${\mathit{\rho}}_{w,U}=\sqrt{{\mathit{\rho}}_{w,u}^{\mathrm{2}}+{\mathit{\rho}}_{w,v}^{\mathrm{2}}}$.
5.1 Track definitions
As previously noted, the central purpose of this paper is to investigate the empirical relationship between sampling error in turbulent flux measurements and the length of a continuous, ideal aircraft track through the virtual atmosphere represented by the LES domain. The algorithm for defining these tracks is described below.
First, each track is required to be perfectly periodic, albeit with a period L substantially longer than the dimensions of the domain. Specifically, track angles relative the x axis are defined by ϕ_{n}=arctan (n), where n is a positive integer. For ϕ_{1}=45^{∘}, the track makes a single pass through the domain before returning to its starting point, and the track length is 7.2 km. A continuation of the track would simply resample the same grid points. For ϕ_{2}=63.4^{∘}, the track makes two widely separated passes through the domain before returning to its starting point, and the nonrepeating track length is 11.4 km. The maximum value of n used was 20, for a 102.5 km continuous, nonoverlapping path.
For any given n, starting points for different tracks of the same length are uniformly distributed along the boundary of the domain, maintaining prescribed minimum separation (measured perpendicular to the paths) intended to ensure a reasonable degree of statistical independence. These minimum separations were chosen to be greater than I_{f} for all relevant variables at the level in question; 20, 50, 100, and 150 m were assumed for heights of 10, 40, 100, and 400 m, respectively.
The effect of the minimum separation requirement is to achieve a roughly constant overall sampling density for the ensemble of tracks of a given length. Thus, N_{n} is small for long tracks and larger for short tracks. By taking mirror images and 90^{∘} rotations of the tracks defined in this way, the total number of distinct tracks available is actually 4N_{n}, except for n=1, in which case it is 2N_{n} due to duplication when rotating and flipping the 45^{∘} track patterns. Figure 9 depicts an example track for n=3 (top) and the full corresponding set of 32 distinct tracks (bottom) for a relatively wide minimum separation of 200 m used for visual clarity; the number of distinct tracks is significantly greater in the actual analysis.
Note that the algorithm described above is more elaborate than the fixedangle track definitions utilized by Schröter et al. (2000) and Sühring et al. (2019). To reiterate, each track of length L_{i} is required to be periodic with period L_{i}, and individual realizations of a given track length are distributed so as to both maximize and equalize their separation, subject to the aforementioned minimum separations. This allows us to obtain an optimal set of quasiindependent tracks. Constraining the tracks to be periodic also means that there is by definition no trend or unsampled low frequency contribution to be concerned about – “measured” fluxes are the same regardless of the starting point along the track, as long as the total distance followed by the virtual aircraft is L_{i}.
An unavoidable shortcoming of the method used here is that track orientations relative to the mean wind cannot be specified separately, as the absolute orientations are uniquely determined by the chosen number of passes n though the domain and thus by the track length. As n increases, the trend is toward tracks that are oriented mostly north–south or east–west, whereas the mean wind in this simulation is from the northwest. Only for n=1 are the track directions nearly aligned with or perpendicular to the wind direction.
5.2 Virtual flux observations
The “truth” value for each comparison is the domainaveraged flux value computed at each level (Fig. 5). These values utilized the deviations (primed quantities) from the true ensemble mean values for the entire domain. As measured against the true values, aircraft measurements are subject to two statistical sources of error: (1) error in the determination of the mean values relative to which primed quantities will be calculated, and (2) insufficient sampling of the fluctuating primed quantities themselves. Our virtual measurements simulate both effects by utilizing the trackdetermined mean values rather than the domainaveraged values. Both sources of error decrease asymptotically to zero in the limit of an infinitely long flight track, and the flux estimate will converge on the true ensemble value.
In the finite 5.12 km × 5.12 km horizontal domain of our LES, there is the additional issue that a single long flight track can effectively sample the entire domain, so that the aircraft measurement can converge almost perfectly on the “true” value from measurements along a long but finite path. In principle, then, the results obtained herein for long flight tracks should be considered lower bounds on the expected random error. However, because of the minimum separation maintained between parallel segments of a given tracks, we believe this potential bias is unimportant for the flight track lengths considered here. More important may be the inability of the finite LES domain to reproduce flux variations with wavelengths greater than the domain size. As previously discussed for Fig. 6, this seems likely to be an issue mainly at heights of ∼ 400 m and above.
Finally, a notable constraint rooted in the availability of only a single time step from the LES is that the virtual aircraft effectively traverses each path instantaneously. We are in effect relying on Taylor's “frozen turbulence” hypothesis and, following Lenschow and Stankov (1986), postulate that this hypothesis is valid in the present case if I_{f}<VT_{f}, where I_{f} is the integral length scale, V is the speed of the aircraft, and T_{f} is the autocorrelation time of the flux field being measured. Unfortunately, it is not possible to determine T_{f} from the single time step available for this study, but for a realworld V≈85 m s^{−1} similar to that of the University of Wyoming King Air, one would only require T_{f}>1 s to satisfy the above inequality for flux measurements at a height of 100 m, based on the values of I_{f} seen in Fig. 7. For a much slower aircraft or UAV with an air speed of 20 m s^{−1}, the requirement for T_{f} is proportionally longer, but these have the option of flying at lower altitudes where I_{f} is smaller.
Raw sample results are presented in Fig. 10, with each blue dot representing the estimated flux from a single flight track of the indicated length. The true domainaveraged values are indicated as dashed black lines, and the rootmeansquare (rms) deviation from the true value is depicted as a dashed red line. For all flux variables, there is the expected decay in sampling error as the track length becomes longer.
Noteworthy features include the following:

Overall scatter is largest at the highest altitudes, consistent with the longer integral length scales I_{f} and with the observation by previous authors that error tends to increase with height (e.g., Grossman, 1992).

For sensible heat flux at 400 m (Fig. 10j), the magnitude of the scatter is often far in excess of the very low (1.8 W m^{−2}) mean flux itself, so that shorter track lengths cannot reliably determine even the sign of the mean flux.

Relative to the mean values, scatter is particularly large for momentum flux. Because this quantity is positive definite ($\mathit{\tau}=\sqrt{{\mathit{\tau}}_{u}^{\mathrm{2}}+{\mathit{\tau}}_{v}^{\mathrm{2}}}$), the distribution of errors at 400 m is strongly skewed and positively biased.
Figure 11 offers a closer look at the dependence of sampling error on path length and allows the results obtained to be compared with the random error predicted by Eq. (8). The red dots correspond to the empirical rms error values previously represented in Fig. 10 as dashed red curves. The dashed black lines represent fits to those values. The accompanying powerlaw expressions for each fit appears in black as well as being summarized in Table 3.
The blue lines represent Eq. (8), where both I_{f} and the correlation coefficients ρ_{w,ψ} are computed directly from the 2D LESgenerated fields at each level and represent averages of the crosswind and parallel values. As previously pointed out, most simulated flight tracks are neither parallel nor perpendicular to the mean flow, and the set of available orientations varies with track length. For both reasons, no single representation of Eq. (8) as a curve in Fig. 11 can capture the variability of I_{f} and ρ_{w,ψ}. However, especially for longer path lengths, most paths cross the mean wind at an oblique angle; thus, the use of the average (directionindependent) I_{f} and ρ_{w,ψ} seems reasonable for this qualitative comparison.
Finally, from the fits in Table 3, the minimum track lengths L_{10} required for 10 % relative accuracy are determined. These are presented in Table 4.
Noteworthy findings include the following:

Equation (8) due to Lenschow and Stankov (1986) and using the independently computed parameter values is remarkably accurate at predicting the random error for shorter track lengths.

Empirical fits generally show a steeper decrease with track length than the ${L}^{\mathrm{1}/\mathrm{2}}$ relationship predicted by Eq. (8), with exponents ranging from −0.64 to −0.70 at 10 and 40 m, to nearly −1 for sensible and latent heat fluxes measured at a height 400 m. It is not clear whether the more rapid decay in error is primarily related to the finite domain size – with this issue being most likely problematic at 400 m – or to other departures from the idealized statistical model of turbulence assumed by Lenschow et al. (1994).

For flights at and below 100 m, track lengths of approximately 30 km or less are sufficient to achieve 10 % precision in sensible and latent heat flux estimates in this particular environment. Required track lengths for momentum flux τ are considerably longer; nearly 90 km at 100 m height.

At a height of 400 m, no reasonable track length is long enough to adequately measure the low sensible heat and momentum fluxes encountered there.
6.1 Uncertainties in flux error determinations
The previous comparison with Eq. (8) assumed that the relevant integral lengths and correlations were the “true” values. In practice, it may be necessary when estimating sampling errors to determine these parameters empirically from the flight tracks themselves. These are thus subject to sampling errors of their own that in turn imply greater uncertainty in the estimation of σ.
To examine this problem, integral lengths I_{f} and correlations ρ were estimated directly from the flight tracks. Based on Eq. (8), an error bias factor was defined as
where the “est” subscripts refer to the trackestimated quantities and “true” refers to the domainaveraged quantities. Thus, the apparent sampling error is given as
where σ is the “correct” flux error based on perfect knowledge of the integral length scale and correlation. Results are depicted for LH in Fig. 12. The solid lines depict the average bias factor as a function of track length; the dashed lines depict the standard deviation of the bias about this average.
Interestingly, at the lowest levels, there is an average low bias of between 10 % and 20 %, and this bias persists for the longest track lengths. This is likely an artifact of the method used, as the track determinations of integral length and correlation reflect the actual orientation of the track relative to the wind, whereas the generic “true” values do not. At higher altitudes, the directional dependence of I_{f} disappears, as seen in Fig. 7g, and so does the apparent bias in the flux error.
At the lowest altitude of 10 m (Fig. 12a), the scatter about the mean bias quickly becomes small with increasing track length. Surprisingly, this is not the case at 100 or 400 m. Rather, the relative uncertainty in the flux error is nearly constant with path length. Why this is the case requires further investigation.
Notwithstanding the nonnegligible role of sampling uncertainties in I_{f} and ρ in estimating the flux uncertainties from Eq. (8), it is perhaps reassuring that the resulting variations are typically no more than about 10 %, though larger errors are manifestly possible.
Similar results were found at lower levels for sensible heat and momentum flux (not shown). At 400 m, the scatter in the bias term for these variables is quite large, reflecting the small cross correlations found near this level (see Fig. 8). When ρ is small, even modest sampling errors in the determination of ρ can produce large fluctuations in Φ. In short, when sampling error is poor relative to the actual magnitude of the flux, even the ability to accurately characterize that sampling error is impaired.
6.2 The effect of track orientation
As previously noted, the algorithm for track definition, which in turn is based on the requirement for periodic tracks, does not permit the choice of track orientation relative to the wind direction. For this reason, it is not generally possible to use the methods described herein to empirically determine sampling error as a function of track length for tracks nearly parallel or perpendicular to the wind direction.
Having verified that Eq. (8) does appear to give reasonable results overall for the sampling error (though possibly overestimating that error for longer track lengths), it follows that the ratio ${I}_{f}/L$ largely determines the sampling error, all other factors being equal. From the directional dependence of I_{f} alone, one may infer that flying crosswind should normally yield smaller errors for a given track length than flying parallel to the wind.
This assumption may be directly tested in one particular case. For the 7.2 km tracks, corresponding to N=1, the tracks have an orientation of ±45^{∘} and thus cross the mean wind direction of 307^{∘} at either 8 or 82^{∘}. Results for these orientations should be practically indistinguishable from true parallel and crosswind flights.
Empirical flux errors were determined separately for these two cases, and the results are given in Table 5. With the exception of sensible heat at 400 m, where sampling error is in any case unusably large relative to the true flux, all flux variables are sampled with significantly greater precision in the crosswind direction. The difference is up to about a factor of 2 for latent heat up to at least 100 m and for all variables at the lowest level.
In this paper, the highresolution large eddy simulation of the marine boundary layer by Matheou (2018) was utilized to determine random flux sampling errors by a virtual aircraft flying tracks of various lengths at heights of 10, 40, 100, and 400 m. We eliminated the need for detrending by constructing these tracks to be periodic with periods of the specified lengths.
The empirical results were compared with the theoretically derived expression of Lenschow and Stankov (1986). In support of these comparisons, we computed the required integral length scales I_{f} and found that these are well described by expressions of the form I_{f}=Az^{b} for z≤100 m, with coefficients A and b given in Table 2. The empirical exponents b are generally between about $\mathrm{2}/\mathrm{3}$ and 1 for fluxes of sensible heat, latent heat, and momentum, in contrast to Lenschow and Stankov (1986), who found exponents of $\mathrm{1}/\mathrm{3}$ to $\mathrm{1}/\mathrm{2}$. The reason for this apparent discrepancy is unknown. The commonly cited approximation for the integral length scale for fluxes given by Eq. (9) or (10) was also evaluated and found to consistently overestimate the true value by at least a factor of 2–5.
Using directly computed parameter values, Eq. (8) was shown to be remarkably accurate at predicting random errors for track lengths of approximately 10 km. However, it predicts a decay in error proportional to ${L}^{\mathrm{1}/\mathrm{2}}$, whereas our empirical results show substantially more rapid decay in many cases. It is uncertain to what degree the difference is due to the finite domain size of the LES, to departures from the idealized statistical assumptions of Lenschow and Stankov (1986), or perhaps to a combination of both. It seems less likely that the first of these issues would affect the results for measurements at lower heights, as there is then very little contribution to total fluxes by wavelengths greater than 1 km, according to Fig. 6.
In general, track lengths of ∼ 30 km or less are sufficient for measuring sensible and latent heat fluxes in this simulated environment to better than 10 % precision, provided that flights are conducted at or below 100 m. Substantially longer flight tracks (∼ 85 km) are required for momentum flux, a result traceable to the low correlation ρ_{U,w} (and thus low corresponding fluxes) between the horizontal and vertical wind components even near the surface. At 400 m height, only latent heat flux could be estimated with adequate precision using a flight leg of 35 km. On the other hand, legs in excess of 300 km would theoretically be required to estimate sensible heat flux or momentum flux with 10 % precision.
For comparison, Sühring et al. (2019) found that 200 km flights were desirable to ensure 10 % precision in sensible heat flux estimates within the lower half of the boundary layer. Their results were obtained for a deep, dry convective environment. The difference in specific findings undoubtedly reflects the larger integral length scales expected in that environment compared to the shallow, weakly forced marine boundary layer considered herein. It also highlights the difficulty of extrapolating findings from one environment to another.
Finally, it was empirically confirmed that for a relatively short, fixed track length of 7.2 km, flux sampling errors are reduced up to a factor of 2 by flying crosswind rather than parallel to the wind, especially at lower levels, consistent with the typically shorter integral length scales associated with the crosswind direction.
Data sets and Jupyter notebook files utilized in the analysis may be requested by sending a 1 TB USB drive to the corresponding author.
The author declares that there are no competing interests.
George Matheou kindly provided the LES data set without which this study would not have been possible. It was in turn created with support from the NASA HighEnd Computing (HEC) Program through the NASA Advanced Supercomputing (NAS) Division at Ames Research Center and by the University of Connecticut High Performance Computing (HPC) facility. The paper was significantly improved by comments and suggestions by Ankur Desai and by two anonymous reviewers. Partial support was provided by the Chequamegon Heterogeneous Ecosystem Energybalance Study Enabled by a Highdensity Extensive Array of Detectors (CHEESEHEAD) project.
This research has been supported by the National Science Foundation (grant no. 1822420).
This paper was edited by Thomas F. Hanisco and reviewed by two anonymous referees.
Bange, J., Beyrich, F., and Engelbart, D. A.: Airborne measurements of turbulent fluxes during LITFASS98: Comparison with ground measurements and remote sensing in a case study, Theor. Appl. Climatol., 73, 35–51, 2002. a
Brilouet, P.E., Durand, P., and Canut, G.: The marine atmospheric boundary layer under strong wind conditions: Organized turbulence structure and flux estimates by airborne measurements, J. Geophys. Res.Atmos., 122, 2115–2130, 2017. a
Brilouet, P.E., Durand, P., Canut, G., and Fourrié, N.: Organized turbulence in a coldair outbreak: Evaluating a largeeddy simulation with respect to airborne measurements, Bound.Lay. Meteorol., 175, 57–91, https://doi.org/10.1007/s10546019004994, 2020. a
Brooks, I. M. and Rogers, D. P.: Aircraft observations of boundary layer rolls off the coast of California, J. Atmos. Sci., 54, 1834–1849, 1997. a
Butterworth, B. J., Desai, A. R., Metzger, S., Townsend, P. A., Schwartz, M. D., Petty, G. W., Mauder, M., Vogelmann, H., Andresen, C. G., Augustine, T. J., Bertram, T. H., Brown, W. O. J., Buban, M., Cleary, P., Durden, D. J., Florian, C. R., Iglinski, T. J., Kruger, E. L., Lantz, K., Lee, T. R., Meyers, T. P., Mineau, J. K., Olson, E. R., Oncley, S. P., Paleri, S., Pertzborn, R. A., Pettersen, C., Plummer, D. M., Riihimaki, L., Guzman, E. Ruiz, Sedlar, J., Smith, E. N., Speidel, J., Stoy, P. C., Sühring, M., Thom, J. E., Turner, D. D., Vermeuel, M. P., Wagner, T. J., Wang, Z., Wanner, L., White, L. D., Wilczak, J. M., Wright, D. B., and T. Zheng: Connecting Land–Atmosphere Interactions to Surface Heterogeneity in CHEESEHEAD19, B. Am. Meteorol. Soc., 102, E421–E445, https://doi.org/10.1175/BAMSD190346.1, 2021. a
Chung, D. and Matheou, G.: Largeeddy simulation of stratified turbulence. Part I: A vortexbased subgridscale model, J. Atmos. Sci., 71, 1863–1879, 2014. a
Cook, P. A. and Renfrew, I. A.: Aircraftbased observations of air–sea turbulent fluxes around the British Isles, Q. J. Roy. Meteor. Soc., 141, 139–152, 2015. a
de Roode, S. R., Duynkerke, P. G., and Jonker, H. J.: Largeeddy simulation: How large is large enough?, J. Atmos. Sci., 61, 403–421, 2004. a
Desjardins, R. L., Macpherson, J. I., Schuepp, P. H., and Karanja, F.: An Evaluation of Aircraft Flux Measurements of CO_{2}, Water Vapor and Sensible Heat, in: Boundary Layer Studies and Applications, edited by: Munn, R. E., Springer, Dordrecht, https://doi.org/10.1007/9789400909755_5, 1989. a
Durand, P., Thoumieux, F., and Lambert, D.: Turbulent lengthscales in the marine atmospheric mixed layer, Q. J. Roy. Meteor. Soc., 126, 1889–1912, 2000. a
Earth Observing Laboratory: DYCOMSII C130 Mission Summaries, available at: http://catalog.eol.ucar.edu/dycoms/catalog/missions/ (last access: 17 January 2021), 2007. a
Elston, J., Argrow, B., Stachura, M., Weibel, D., Lawrence, D., and Pope, D.: Overview of small fixedwing unmanned aircraft for meteorological sampling, J. Atmos. Ocean. Tech., 32, 97–115, 2015. a
Finkelstein, P. L. and Sims, P. F.: Sampling error in eddy correlation flux measurements, J. Geophys. Res.Atmos., 106, 3503–3509, 2001. a
Grossman, R. L.: Sampling errors in the vertical fluxes of potential temperature and moisture measured by aircraft during FIFE, J. Geophys. Res.Atmos., 97, 18439–18443, 1992. a, b
Lenschow, D. H. and Stankov, B. B.: Length scales in the convective boundary layer, J. Atmos. Sci., 43, 1198–1209, 1986. a, b, c, d, e, f, g, h, i, j, k, l, m, n
Lenschow, D. H., Mann, J., and Kristensen, L.: How long is long enough when measuring fluxes and other turbulence statistics?, J. Atmos. Ocean. Tech., 11, 661–673, 1994. a, b, c, d
Mahrt, L.: Flux sampling errors for aircraft and towers, J. Atmos. Ocean. Tech., 15, 416–429, 1998. a, b
Mann, J. and Lenschow, D. H.: Errors in airborne flux measurements, J. Geophys. Res.Atmos., 99, 14519–14526, 1994. a, b, c, d
Matheou, G.: Turbulence Structure in a Stratocumulus Cloud, Atmosphere, 9, 392, https://doi.org/10.3390/atmos9100392, 2018. a, b, c, d
Matheou, G. and Chung, D.: Largeeddy simulation of stratified turbulence. Part II: Application of the stretchedvortex model to the atmospheric boundary layer, J. Atmos. Sci., 71, 4439–4460, 2014. a
Metzger, S., Junkermann, W., ButterbachBahl, K., Schmid, H. P., and Foken, T.: Measuring the 3D wind vector with a weightshift microlight aircraft, Atmos. Meas. Tech., 4, 1421–1444, https://doi.org/10.5194/amt414212011, 2011. a
Richardson, A. D., Hollinger, D. Y., Burba, G. G., Davis, K. J., Flanagan, L. B., Katul, G. G., Munger, J. W., Ricciuto, D. M., Stoy, P. C., Suyker, A. E., Verma, S. B., and Wofsy, S. C.: A multisite analysis of random error in towerbased measurements of carbon and energy fluxes, Agr. Forest Meteorol., 136, 1–18, 2006. a
Schröter, M., Bange, J., and Raasch, S.: Simulated airborne flux measurements in a LES generated convective boundary layer, Bound.Lay. Meteorol., 95, 437–456, 2000. a, b, c, d, e
Stevens, B., Lenschow, D. H., Vali, G., Gerber, H., Bandy, A., Blomquist, B., Brenguier, J.L., Bretherton, C. S., Burnet, F., Campos, T., Chai, S., Faloona, I., Friesen, D., Haimov, S., Laursen, K., Lilly, D. K., Loehrer, S. M., Malinowski, S. P., Morley, B., Petters, M. D., Rogers, D. C., Russell, L., SavicJovcic, V., Snider, J. R., Straub, D., Szumowski, M. J., Takagi, H., Thornton, D. C., Tschudi, M., Twohy, C., Wetzel, M., and M. C. van Zanten: Dynamics and chemistry of marine stratocumulus–DYCOMSII, B. Am. Meteorol. Soc., 84, 579–594, 2003. a
Sühring, M. and Raasch, S.: Heterogeneityinduced heatflux patterns in the convective boundary layer: Can they be detected from observations and is there a blending height? – A largeeddy simulation study for the LITFASS2003 experiment, Bound.Lay. Meteorol., 148, 309–331, 2013. a, b, c, d, e, f
Sühring, M., Metzger, S., Xu, K., Durden, D., and Desai, A.: Tradeoffs in flux disaggregation: A largeeddy simulation study, Bound.Lay. Meteorol., 170, 69–93, 2019. a, b, c, d, e
Vellinga, O. S., Dobosy, R. J., Dumas, E. J., Gioli, B., Elbers, J. A., and Hutjes, R. W.: Calibration and quality assurance of flux observations from a small research aircraft, J. Atmos. Ocean. Tech., 30, 161–181, 2013. a