Measuring surface fluxes using the surface renewal (SR) method requires
programmatic algorithms for tabulation, algebraic calculation, and data
quality control. A number of different methods have been published describing
automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz

Originally described by Van Atta (1977), the SR model measures
vertical flux that occurs during rapid events which manifest as coherent
structures in a turbulent flow. The physical mechanisms are statistically
distinct from those described in the eddy covariance (EC) method, which has
been established as a robust and accurate method to measure flux
(Baldocchi, 2014). The surface renewal (SR) method offers several
advantages and complements the use of EC to measure flux. While EC requires
fast (10 Hz

The SR method estimates turbulent transport rates from fast response measurements of scalar properties such as temperature or trace gas concentration. In the SR conceptual model, rapid changes in scalar concentration are associated with episodic displacement of near-surface air parcels, and the surface condition is renewed from upper air. While in proximity to the surface, the air parcels are gradually enriched or depleted in temperature or scalar concentration by diffusion (Castellví et al., 2002; Paw U et al., 1995). The majority of flux from the surface is attributed to these rapid ejections, which distinguish coherent structures in near-surface atmospheric motions (Gao et al., 1989). The duration and amplitude of these rapid fluctuations (visible as ramps in the scalar trace) are used to determine the magnitude and direction of the flux density. Because of the short duration of these events, the SR method complements spectral methods to evaluate the flux contributions made over timescales shorter than the typical 15–30 min averaging time used for EC (Katul et al., 2006; Shapland et al., 2012a, b). Rapid flux measurement will facilitate new applications, such as spatial mapping of flux using vehicle-mounted, near-surface sensors, and real-time monitoring systems. Mobile SR implementations and other novel field methods could provide new insights into the complexities of sub-basin-scale hydrology, be used to validate downscaled models, and measure the heterogeneity of flux at sub-field scales.

The implementation of SR requires a prescribed averaging time period (on the order of minutes) and ramp time duration (on the order of seconds), for which a representative and statistically robust flux magnitude can be determined. To implement SR on a moving vehicle (for instance, to map spatially variable flux), finding a minimum averaging time is desirable to increase the spatial resolution of the resulting map. The averaging time and lag time used in the SR method relate the sensitivity of the scalar measurement to the timescales at which most significant flux occurs (Shapland et al., 2014). To find the minimum measurement period, field studies were conducted in 2014 and 2015 over various types of surface conditions. This required a rapid computational method that worked over a range of different time averaging periods, and which could implement the various calibration procedures used in the SR method. Initial attempts to calculate flux followed methods as described by Paw U et al. (2005) and Snyder et al. (2008). However, implementing these methods as documented was hampered by slow computation time, which constrained the many iterations required to determine the minimum flux averaging period.

Open-source software and online forums are abound with methods that utilize advances in computing power, memory availability, and the accessibility of multithreaded processing. These methods reduce computational overhead, and can augment the SR technique to allow implementation with low-cost computers and data loggers, or where remote telemetry is required. Three example methods are shown here which streamline specific operational steps in the SR method. The first is a method adapted from signal processing to “despike” noisy data, a quality control technique commonly used in processing raw meteorological data. Second is a method to compute structure functions over multiple time lags rapidly using convolution in two dimensions. Third, an algebraic array calculation is used to find the cubic polynomials roots used to determine the SR ramp amplitude. By using more efficient algorithms, rapid iterative trials can be conducted to adjust calibration parameters, test hypotheses on the time averaging of flux calculations, and potentially measure SR flux in real time.

Advantages such as low-cost, relatively simple instrumentation, and easier field implementation are all cited as motivating factors to use the SR method (Paw U et al., 2005), yet work remains to standardize a robust method (French et al., 2012; Suvočarev et al., 2014). Because sensor cost is reduced, SR systems can be implemented to measure flux more extensively than EC, and in situations where EC is impractical. Extensive, site-specific SR estimates can augment the utility of sparsely located, permanent weather stations in mapping the heterogeneity of surface flux. Examples of situations which could benefit from low-cost flux measurements include direct crop ET monitoring, experiments at remote field sites, and developing regions. While SR may expand flux measurement applications, the method still requires standardized calibration and quality control measures to establish that SR is robust and accurate, and a critical step in developing the method is to reduce computation costs.

The example algorithms shown here improve or economize existing calculation
methods, including despiking of time series data
(Højstrup, 1993; Starkenburg et al., 2016), calculation
of structure functions (Antonia and Van Atta, 1978), and Fourier
analysis of signals, i.e., spectral analysis (Press, 2007; Stull,
1988). In each case, dramatically faster execution times were accomplished
using simple programming improvements. Most efficiency gains were a result
of code vectorization, which is the conversion of iterative looping
algorithms into array calculations. All methods described here were
implemented in the MATLAB language (The Mathworks Inc., 2016),
with the Statistics, Curve Fitting, and Signal Analysis toolboxes. MATLAB's
Profiler (^{™} i7-3720QM
processor operating at 2.60 GHz with 16 GB of RAM. Processor clock speed was
verified using MATLAB's Profiler tool at run time, and processing times
reported are described as run time (actual observed execution time) or as
total run time, which is the sum of CPU time for all calculation threads.
Example methods are indicated by

Vector and array calculations are more efficiently executed than iterative methods; vectorization of MATLAB code entails removing loops (which are not pre-compiled) and taking advantage of implicit parallel methods in MATLAB's pre-compiled functions (Altman, 2015). Other significant improvements were enabled through the fast Fourier transform by using convolution of number arrays, rather than iterative operations. In the case of determining ramp geometry in the SR method, Cardano's solution for depressed cubic polynomials (published in 1545) reduces a root-finding algorithm from an iterative numerical approximation to an exact algebraic vector calculation. While some of the implementation of these methods are particular to the MATLAB language, the general mathematical concepts are universal. Although these methods were prototyped in MATLAB, the examples shown are generally useful as solutions to challenges commonly encountered in micrometeorology.

The total computation time is the sum of CPU time spent on all calculation threads. Triangles mark the mean run time for multiple runs, which varied from 30 runs (15 min–4 h data) to 10 runs (8, 12, 24 h). The 48 h calculation is represented by one run only. Error bars represent 1 standard deviation of all runs.

Fast calculation of larger data sets is due to implicit parallel processing via the FFT, which is readily performed by multiple simultaneous threads. The efficiency of parallel processing is shown by a lower ratio of run time to total thread time.

Despiking is the removal of erroneous or extreme data points from a time series of sampled values. It is a common procedure when measuring environmental parameters, especially in challenging conditions or complex environments (Göckede et al., 2004; Starkenburg et al., 2016). The origin of spikes in a time series may be electronic or physical (sensor malfunction or actual physical non-errors); regardless of the origin, spikes can be recorded as abnormally large or small values, or may be marked by an error flag defined in the firmware. Spikes become problematic if they are not readily differentiated during automatic data imports (Rebmann et al., 2012). Spikes interfere with statistical calculations, and require some deliberate and objective method to identify, remove, and interpolate where they exist. For instance, a data logger program may record an error as “9999” or a character string, while MATLAB denotes missing values in a numerical array as NaN (“not a number”). Because normally distributed data may contain noise in a wide range of values, robustness of the despiking algorithm is complicated by the requirement to differentiate between “hard spikes” characteristic of automatic flags (such as 9999) and “soft spikes”, which are realistically valued but erroneous measurement. An objective limit for soft spikes is usually defined as appropriate for the signal-to-noise ratio of any particular data, usually in terms of variance during a defined windowing period. Clearly distinguishing errors can be achieved by a static objective criteria, by a dynamic statistic, or in a separate pre-processing operation. Previous authors have described a variety of methods including use of autocorrelation (Højstrup, 1993) and statistics within a moving window (Vickers and Mahrt, 1997). A comprehensive review of despiking methods is presented by Starkenburg et al. (2016), with emphasis on the accuracy and statistical robustness of different computation methods.

Despiking is a problem of conditional low-pass filtering; consequently this
procedure can be treated as an application of signal processing which can be
performed efficiently using convolution. Mathematically, convolution can be
understood as a multiplicative function that combines a data signal with a
filter signal. For a discrete signal, the filter is a weight array which is
multiplied (in the Fourier domain) with data inside a window. In the time
domain, the window can be visualized as moving along the data array as it is
multiplied. As examples, a filter with a weight of 2 at the center of the
window, and zeros elsewhere, would amplify the data signal by a factor of
2; a filter 10 samples wide, each weighted at 0.1, generates a running
average of the data. The computational efficiency of convolution is a
product of the fast Fourier transform (FFT), which allocates memory
efficiently by a process known as bit switching. A thorough treatment of bit
switching can be found in Chapters 12 and 13 of Press (2007). To
demonstrate the increased efficiency of the FFT, two methods were used to
despike 8.5 h of 20 Hz sonic temperature data (609 139 samples). One
method utilized a

To test computation time uniformly, identical 10 Hz data were sub-sampled to
record lengths of 0.25 to 48 h, and multiple runs were despiked with
each sample set. Raw data were checked
for hard error flags which required converting text to number values. Data were not otherwise manipulated prior to
despiking. MATLAB Profiler was used to track the run time for all threads,
using the undocumented flag “built-in” to track pre-compiled MATLAB
functions as well as user functions,

With increased computation speed, automatic and accurate despiking can be accomplished, with reduced time cost to determine any necessary calibrate for the procedure. The various methods employed to despike data are variously limited by computational inefficiency (Starkenburg et al., 2016). “Phase space thresholding”, originally described by Goring and Nikora (2002), is one such method that Starkenburg noted as being hampered by computational costs, and by a requirement for iterative applications to calibrate despiking parameters. By decreasing the execution time, a similar method was developed that allows rapid and accurate despiking of data, for the detection of both hard and soft spikes. A phase space method allows objective criteria to be calibrated for specific sensor data, and a visual diagnostic phase space diagram that allows for rapid calibration of the despiking criteria (Fig. 3). Projecting the signal into a phase space diagram reveals modes related to sensor error, response time, and other factors leading to spikes. Using convolution to determine moving window statistics (such as a moving mean, standard deviation), objective identification of behaviors characteristic to a particular sensor response. In Fig. 3, infrared gas analyzer data (in this case, signal strength) collected at 20 Hz for 17 days are projected with 1 min moving window statistics. Based on this projection, a cut-off in phase space for spike identification can be assigned, and the subsequent percentage of removed data calculated. In this case, the sensor was repeatedly affected by dust from farm operations (Fig. 4), yet only 1.5 % of the data were required to be removed as spikes due to the precision of the despiking algorithm. This procedure took less than 5 s of computation.

Phase space diagram showing moving window statistics of IRGA signal (17 days of 20 Hz data).

Time series showing data removed as spikes (bolded) from phase space criteria in Fig. 3.

Another computationally intensive process in SR is the determination of the
second-, third-, and fifth-order structure functions. Ramps are an identifiable
feature in the measured temperature trace above any natural surface, yet
determining the characteristic ramp geometry from high-frequency data
requires an efficient, robust, and preferably automated procedure. There are
several methods to determine ramp geometry, including visual detection
(Shaw and Gao, 1989), low-pass filtering (Katul
et al., 1996; Paw U et al., 1995), wavelet analysis (Gao and Li,
1993), and structure functions (Spano et al., 1997).
Structure functions in particular provide both objective criteria to detect
ramps and an efficient method to tabulate statistics of time series data,
and use of structure functions has become the predominant method used for
SR. The general form for a structure function is

For this analysis, data were used from several field experiments. The data
records used ranged in length from 8 h to over 2 months, with sampling
frequencies of 10, 20, and 100 Hz (fastest frequency for short duration
trials only). Initially, computation of structure functions with the first
method (series of nested

For a total of

Ten iterations of the structure function calculations using a range of averaging periods.

In most SR studies to determine flux, a lag time is assigned to the
structure function calculation, with only a few authors allowing for a
procedure to maximize the ratio

Because the ideal SR calculation identifies the lag which maximizes the
ratio

The performance gains using convolution are more significant for short averaging periods, regardless of maximum lag used in calculating structure functions.

For the idealized SR method, the structure functions are retained (for each
averaging period) for the lag which maximizes the ratio

Solving for the roots of this function yields a single, predominant ramp
amplitude from a given temperature trace (with units of

As with other methods for measuring flux from the surface, analytic solutions do not always translate easily into straightforward numerical computation, especially when working with large data records or when calculating in real time. In applied research, custom algorithms are often developed by individual researchers, requiring special training in programming, significant time investment, and the motivation to use sophisticated techniques that fully utilize available memory and processing power. Efforts to standardize the eddy covariance method (Aubinet et al., 2012; Baldocchi, 2014) and data quality control (Allen et al., 2011; Foken et al., 2012) have not yet been similarly applied to the SR method, although substantial work has been made to validate and unify SR methods (Castellví, 2012; Chen et al., 1997b; French et al., 2012; Suvočarev et al., 2014). By appropriating methods common in signal processing, and by sharing open-source tools on online forums, more sophisticated approaches can be implemented. In particular, reducing the computational overhead of calculating flux enables broad implementation and robust verification of the SR method. Rapid algorithms allow for automated assignment of lag time, rather than fixed assignment, and allow flux determinations while varying the length of flux averaging periods. These procedures allow for comprehensive analysis of both the physical timescales of surfaces flux, and the sensor response and uncertainty associated with the SR derived flux. Calibration of despiking criteria can be implemented quickly at low computational cost. In summary, efficient methods for computing SR flux allow implementation in novel deployments such as low-cost, continuous monitoring and on moving platforms. Future work remains to transfer efficient methods from the MATLAB development platform to open-source implementations, and to enable hardware to perform these techniques directly for real-time applications. Reducing the cost and power requirement of the required data loggers, computers, and telemetry will facilitate the extensive deployment of SR sensors to aid in describing the heterogeneity of flux across the landscape.

All data used in this analysis and scripts implementing the algorithms
described above are available online at

Abbreviated scripts for the three example methods may be found in the Supplement. Requests for phase space despiking methods can be directed to the corresponding author.

The supplement related to this article is available online at:

The authors declare that they have no conflict of interest.

The authors would like to acknowledge the contributions of Bruno Luong, whose function for Cardano's solution to depressed cubic polynomials we adapted for this analysis. Edited by: Andrew Sayer Reviewed by: three anonymous referees