Techniques for analyses of trends in GRUAN data

Introduction Conclusions References Tables Figures


Introduction
Long-term climate data records of essential climate variables such as temperature, water vapour and ozone are a prerequisite for climate change detection and attribution studies.Many decades of measurements are typically required to detect a trend at e.g. the 95 % confidence level.Not only do climate data records need to be temporally homogeneous over many decades, it is also imperative that the uncertainty on the trend can be estimated robustly so that we know what level of confidence to place on the derived trend.Having well characterized measurement uncertainties on the data being analysed, including traceability of uncertainty estimates to internationally recognized measurement standards, is essential in establishing the uncertainty on the resultant trend.
One of the core goals of the Global Climate Observing System (GCOS) Reference Upper Air Network (GRUAN; www.gruan.org) is to provide vertical profiles of reference measurements suitable for reliably detecting changes in global and regional climate on decadal scales.Reference measurements require that, at a minimum, the uncertainty of the measurement (including uncertainties arising from corrections applied) has been determined, the entire measurement procedure and set of processing algorithms are properly documented and accessible, and that every effort has been made to tie the observations to an internationally accepted traceable standard (Seidel et al., 2009).For vertical profile measurements within GRUAN, uncertainties are also required to be vertically resolved such that each datum in a profile is treated as a single measurement result requiring both the measurement and its uncertainty.Metadata describing how the measurements were made, what corrections were applied, what changes were made to the instruments over the lifetime of the measurements, and the data reduction algo-rithms used during the observation and post-observation periods, are also imperatives for reference quality observations.Immler et al. (2010) provided the theoretical basis for developing reference quality upper-air measurements in the form of GRUAN data products.These fundamental guidelines for establishing reference quality atmospheric observations are based on central concepts of metrology and, in particular, traceability.They demonstrate that the detailed analysis of the uncertainty budget of a measurement technique is the critical step for establishing reference quality observations.Detailed knowledge of the calibration procedures and data processing algorithms is required as part of determining the uncertainty budget.Finally, as highlighted by Immler et al. (2010), uncertainties introduced by correction schemes adjusting for systematic biases in the measurement system are an important component of the uncertainty budget.Dirksen et al. (2014) demonstrated the application of that theoretical basis for creating GRUAN data products based on Vaisala RS92 radiosonde measurements of temperature, pressure and humidity.The dominant source of uncertainty for radiosonde temperature measurements is solar radiation heating and Dirksen et al. (2014) report on the laboratory experiments performed to investigate and model the effects of solar radiative heating on the RS92's temperature and humidity measurements.GRUAN daytime humidity profiles show up to 15 % enhancement in humidity over Vaisala processed profiles, of which two-thirds is due to the radiation dry bias correction, and one-third is due to an additional calibration correction.Philipona et al. (2012) reported simultaneous solar short-wave radiation, thermal long-wave radiation, and air temperature measurements with radiosondes from the Earth's surface to 35 km altitude and then later demonstrated the use of these measurements during daytime and nighttime, under sun-shaded and unshaded conditions, to determine the radiation-induced error on radiosonde air temperature measurements (Philipona, 2013).They showed that, in general, solar radiation produces a radiative heating of about +0.2 K near the surface which linearly increases to about +1 K at 32 km (∼ 10 hPa).It is clear from these studies that significant time and effort has been invested in establishing reference quality measurements within GRUAN.
The goal of detecting a trend in upper-air temperature, and the uncertainty on that trend, from point source measurements from a radiosonde and their associated measurement uncertainties, can be challenging.Previous papers (Tiao et al., 1990;Weatherhead et al., 1998) and books (Kutner et al., 2005) have provided detailed statistical descriptions of how to detect trends, and their uncertainties, in geophysical time series.However, some of these studies require a higher understanding of statistics than might be available to scientists who want to quickly and reliably determine the trend and its uncertainty in a time series.The purpose of this paper is to take a pedagogical approach, leading the reader through the various perils and pitfalls that need to be surmounted to calculate a trend and its uncertainty.In this way we hope that the considerable effort that has been invested in obtaining reference quality measurements within GRUAN is exploited to the utmost.

Why not just fit a straight line?
The first question someone may ask when analysing a time series for trends is "why can I not just fit a straight line to the data and look at its slope?".If you simply want to know whether the data are trending up or down, with no interest in what is driving that change, or whether or not that change is statistically significant, this may be a legitimate approach.However, consider the synthetic data set shown in Fig. 1.The 10-year time series shown in the upper trace was generated by repeating the same annual cycle and adding some random noise.The data are purely periodic and contain no trend.And yet, if we fit a straight line to these data (dashed line in Fig. 1) we detect a negative trend.This is the case for almost all such time series generated with such random noise.Interestingly, if we derive the trend separately in each month over a large ensemble of such time series, and average those trends, the result is a zero trend.The reason for this is that a sinusoid (an "odd" function) was used to define the annual cycle, starting with a positive phase in the cycle and ending with a negative phase, thus causing a negative trend.Doubling the length of the record from 10 to 20 years reduces the magnitude of the negative trend, as expected.If we simply want to know whether the data are trending up or trending down, then this is a valid approach and the analyses indeed show that the data are trending down.But is it correct to conclude that there is a geophysical trend in these data that are constructed purely as a (noisy) repeating annual cycle?Almost certainly not.So we acknowledge that a first step in detecting a geophysical trend in a time series is to remove the mean annual cycle.This is shown by the line with filled dots in Fig. 1 with the straight line fit (dashed line).
But why stop there?There may be other cyclical signals in the time series, such as the solar cycle or El Niño-Southern Oscillation (ENSO), that should be removed to avoid their interference with the underlying geophysical trend.Therefore, a typical statistical trend model may be of the following form: where V t is the data value at time step t (typically year and month), QBO t is the quasi-biennial oscillation (Reed et al., 1961;Ribera et al., 2003) with a prescribed value at time step t, SOLAR t is the solar cycle with some prescribed value at time step t, and, in this case, ENSO t is the normalized Tahiti minus Darwin sea level pressure (southern oscillation index) at time step t, though we recognize that a range of other ENSO t basis functions are available and could be used.
The coefficients α, β, γ , δ and are fit coefficients, typically calculated using a multivariate least squares regression approach (Moore and McCabe, 2003).The fit coefficient β represents the trend in the time series.R is the residual, i.e. that part of the signal that cannot be tracked by the statistical model and is usually derived by subtracting the model fit (after it has been calculated) from the original data.It is possible, maybe even likely, that the trend will be different during different months of the year.For that matter, perhaps the QBO has a stronger effect in some months than others.The next section considers the options of how best to deal with possible seasonal dependence in the fit coefficients.

Dealing with seasonality in the fit coefficients
One approach is to fit a regression model in the form of e.g.Eq. (1) completely independently for each month, i.e. first fit the model to only the January data, then fit the model to only the February data etc.This results in 12 trend regression coefficients, one for each month, that capture the seasonal dependence of the data on each basis function.The disadvantage of this approach is that the number of fit coefficients increases by a factor of 12.This significantly increases the uncertainty on the fit coefficients.The approach also assumes complete independence from one month to the next, e.g. the trend in month M is completely independent of the trend in month M − 1 and in month M + 1.This is unlikely to be the case.Let us assume, as an example, that the fit coefficients are a function of season.So the trend coefficient β in Eq. ( 1) can be expanded as β = f (t).Clearly the value of β on 31 December is going to be essentially identical to the value of β on 1 January i.e. the dependence should be cyclic in season.It should also be smooth, and it should be a linear superposition of functions that, when added together, can describe any seasonal periodic structure.These requirements are met by expanding the coefficient in Fourier series, e.g. the trend coefficient β is expanded in two Fourier pairs as month M − 1 and M + 1.
A synthetic time series consisting of an annual cycle, a seasonally dependent trend, and some Gaussian noise added for realism is shown in the inset in Fig. 2. The trends extracted from this synthetic time series, first by applying a truncated version of Eq. ( 1) to include only the α and β terms, month by month (black dots in Fig. 2), and then by applying the truncated Eq. ( 1) with the α coefficient expanded in four Fourier pairs and the β coefficient expanded in two Fourier pairs (black line in Fig. 2), are shown in the main plot in Fig. 2.
How do we select how many Fourier pairs to use in each expansion?This is, in part, a judgment call.For terms where a robust seasonality is expected e.g. the mean annual cycle as represented by the seasonality in the offset coefficient (α), 3 or 4 Fourier pairs could confidently be fitted without the danger of fitting spurious seasonal structure.For the seasonality in the trend (β coefficient), this depends on our expectation -if we expect the trend to be the same through all seasons, then the β coefficient need not be expanded in Fourier pairs at all.If the trend shows clear seasonality but that seasonality is sinusoidal in nature, then 1 Fourier pair would be sufficient.The less sinusoidal that seasonality is in structure, the more Fourier pairs are needed to capture that structure.For the QBO (γ coefficient in Eq. 1), where some seasonality in its effect is expected, but where that seasonality may not be clearly present in the data and is not purely sinusoidal, 2 Fourier pairs would likely be the appropriate balance between avoiding over-fitting but capturing the seasonality in the QBO's effect on the data.For ENSO and SOLAR, because their effects are seldom seasonally dependent (or too weak to detect in the data), they are usually not expanded in Fourier pairs.
Expanding the trend coefficient in Fourier pairs results in a more coherent picture of the annual structure of the trend than if linear fits are used to derive trends individually for each month as shown in Fig. 2. The uncertainty on the trends is also smaller when the seasonality is accounted for using Fourier series (grey shaded area in Fig. 2) compared to fitting individually to each month (uncertainties shown as vertical error bars in Fig. 2).The calculation of the uncertainties on the derived trend coefficients is detailed in Sect. 4.

Pre-treating basis functions
Consider some geophysical quantity measured over the period 2002-2009 that has no trend and consists of nothing but an annual cycle and some unforced variability.Four synthetic time series of this nature have been created, differing only by the randomly generated unforced variability added to the sinusoidal annual cycle.The results from the fits of Eq. (1) to the four synthetic time series are shown in Fig. 3.In this case the ENSO basis function was excluded for clarity.
Even though the time series was constructed to have no trend, one of the four trend results (listed in panel (b) of Fig. 3) is found to be statistically significantly different from zero at the 1σ level (blue line in Fig. 3).This occurs because the combination of the linear trend term, and the solar cycle term, over the 2002-2009 period, may, purely fortuitously, track some of the unforced variability (generated randomly in our synthetic time series).As can be seen in Fig. 3, when the regression model assigns variance between the solar cycle and linear trend, solar cycle fits that induce a negative tendency in the time series are always matched with a positive linear trend, i.e. the negative tendency contribution from the solar cycle shown in Fig. 3d (blue line) is mirrored by the positive linear trend (blue line) in Fig. 3b.Similarly positive solar cycle tendencies are matched by negative linear trends (red lines in Fig. 3).This results, in large part, from the fact that the linear trend and the solar cycle over this 2002-2009 period are close to degenerate, i.e. they are very much like linear scalings of each other.This makes the combination of the linear trend and solar cycle basis function very susceptible to fitting the unforced variability on the signal.A nonzero QBO fit coefficient occurs for the same reason.One way to avoid this from happening is to take the view that any trend in the signal must be assigned exclusively to the trend basis function.The only way to ensure that this happens is to detrend each basis function time series.To show the effects of detrending the solar cycle and QBO basis functions before fitting Eq. ( 1), 20 000 trend results equivalent to those listed in Fig. 3b have been generated from 20 000 time series including random noise as in Fig. 3a.First the original solar cycle and QBO basis functions were used and then solar cycle and QBO basis functions were detrended before their use in the regression.Histograms of the 20 000 trend results from these two tests are shown in Fig. 4.  1) to synthetic time series (as shown in Fig. 3a) first with the original solar cycle and QBO basis functions (dark grey) and then with detrended QBO and solar cycle basis functions (light grey).The 1σ standard deviations of Gaussian fits to the histograms are listed together with the mean trend uncertainties derived from the regression model.The mean uncertainties were obtained by averaging the uncertainties from the 20 000 trend values.
It is clear from Fig. 4 that when the QBO and solar cycle basis functions are detrended, the likelihood of obtaining a trend that is statistically significantly different from zero is very much reduced i.e. the uncertainty on the trend halves.
But why stop there?By detrending all of the basis functions we ensure that the trend basis function is orthogonal to all other basis functions.But those remaining basis functions may not be orthogonal to each other.The basis functions can all be made orthogonal to each other through a Gram-Schmidt orthogonalization procedure (Nering, 1963).Consider two basis functions, BF 1 and BF 2 , where BF ⊥1 = BF 1 and the ⊥ subscript denotes the orthogonalized version of the basis function.To find a second vector BF ⊥2 that is orthogonal to BF ⊥1 , we need to find a number X such that where BF ⊥1 is orthogonal to BF ⊥2 .Two vectors are orthogonal to each other if their inner product (denoted here as < BF ⊥1 , BF ⊥2 >) is equal to zero.Taking the inner product of Eq. ( 2) with BF ⊥1 and solving for X leads to Substituting into Eq.( 2) results in If we have a third basis function, we derive BF ⊥1 and BF ⊥2 as described above and we find BF ⊥3 in the form of where BF ⊥1 is orthogonal to BF ⊥2 and BF ⊥2 is orthogonal to BF ⊥3 .We take the inner product with BF ⊥1 to find X 1 and the inner product with BF ⊥2 to find X 2 .This will then result in The same can be done for the nth basis function to ensure that all basis functions used within the regression are orthogonal to each other.The generic form of the Gram-Schmidt orthogonalization is Orthogonalization of the basis functions ensures that each additional basis function only describes the variance not already explained by the existing basis functions.However, using orthogonal basis functions in a regression model obfuscates the physical interpretation of the contribution of each basis function to the signal, which is represented by the derived fit coefficients.The contribution of each basis function to the signal then depends on the order in which the basis functions were orthogonalized and might lead to different conclusions if the basis functions are orthogonalized in a different order.If orthogonal basis functions have been used within the regression, care needs to be taken when interpreting the derived fit coefficients.

Dealing with lags between cause and effect
It is conceivable that there may be a delay between the change in a basis function and its effect on the geophysical quantity being modelled.For example, the effect of the QBO on ozone can often be shifted in phase; the QBO signal in tropical ozone is out of phase with the QBO signal in extra-tropical ozone (see e.g.Fig. 4 of Bodeker et al., 2013).Consider first the more abstract case of fitting a sinusoidal function, with an arbitrary phase shift, to some geophysical variable: where α is the amplitude of the signal and β is the phase shift.Noting that and substituting A = α × cos(β) and B = α × sin(β), Eq. ( 3) becomes with A and B becoming the new fit coefficients.Once A and B have been determined by fitting the two terms to the data, the original α and β values can be derived from ).

G. E. Bodeker and S. Kremser: Techniques for analyses of trends in GRUAN data
The key to this solution is that the "cosine" signal is orthogonal to the "sine" signal such that by mixing these signals with different amplitudes, a sinusoidal signal with arbitrary phase can be generated.For some studies (e.g.Steinbrecht et al., 2003;Sioris et al., 2014), two different QBO basis functions are selected at two different pressure levels such that the signals are approximately a quarter of a cycle out of phase and therefore close to orthogonal to each other.This allows for an arbitrary phase shift in the QBO to be fitted.Austin et al. (2008) artificially constructed an orthogonal QBO time series to the base QBO time series to fit for a phase shift in the base time series, while other studies (e.g.Randel and Wu, 1996) simply use multiple QBO time series to accommodate phase shifts.
But how can we accommodate lags in basis functions that are not cyclical e.g.basis functions accounting for the effects of volcanic eruptions?Generally this is best done by running the regression model multiple times with different lags applied and determining which lag leads to a minimum in the sum of the squares of the regression model residuals.

Determining the uncertainty on the trend
One approach to determining the uncertainty on regression model fit coefficients is the bootstrapping method (Efron and Tibshirani, 1986).For example, to obtain the uncertainty on the trend coefficient in a regression model: 1. Fit the regression model (e.g.Eq. 1) to the data.
2. Subtract the regression model fit from the data to obtain the residuals.
3. Take the signal produced by the regression model fit and, for each data point in that signal, randomly select one residual value and add it to the data point.Do this for the whole signal to generate a new signal which, while having the same underlying structure as the original signal, now has different random noise.Fit the regression model and record the trend value.
4. Repeat step 3 many times (e.g. 10 000) to generate 10 000 estimates of the trend.Calculate the standard deviation of those 10 000 values to obtain the estimated uncertainty on the trend.
Bootstrapping works because a very large ensemble of signals, each with the same underlying structure as the original signal, but with different noise characteristics, is created.Each of these signals, while statistically indistinguishable, has a slightly different trend resulting from the noise characteristics of the original data set.It is therefore natural that the standard deviation of the trends, derived from the ensemble of signals, is the uncertainty on the trend.There are some complications with the application of bootstrapping that need to be overcome in some circumstances, e.g. when the time series is autocorrelated, as discussed in Sect. 5.In this section we use bootstrapping to test some methodological choices in the calculation of uncertainties on trend coefficients.One would expect that the uncertainty on the regression model fit coefficients, such as the trend, should depend on, inter alia, 1.The noise (unexplained variance) on the data signal -the greater the noise the less certain the regression model fit coefficients should be.
2. The measurement uncertainty on each datum comprising the data signal -the greater the measurement uncertainties the less certain the regression model fit coefficients should be.Press et al. (1989) allude to the diagonal elements of the inverse matrix from the regression model being the variances (squared uncertainties) of the fitted parameters.The inverse matrix is given by (A T A) −1 , where A is the so-called design matrix whose N × M components are constructed from the M basis functions evaluated at the N times for which data are available, and from the N measurement uncertainties (σ i ): where X j (x i ) is the j th basis function evaluated at the ith time for which data are available.While this inverse matrix (A T A) −1 , and hence the uncertainties on the regression model fit parameters, is sensitive to the measurement uncertainties (satisfying expectation 2 above), it is insensitive to the noise on the signal since it is a function of the basis functions only and therefore does not take into account in any way the unexplained variance in the signal being fitted.Press et al. (1989) suggest that to account for the variance, one can use the bootstrapping method to estimate quantitative confidence limits on the fitted parameters.Just using the square root of the diagonal elements of the inverse matrix would not account for all uncertainties and would most likely underestimate the uncertainties on the derived trend.Bodeker et al. (1998), hereafter referred to as B98, implemented a revision of the measurement uncertainties according to Tiao et al. (1990) (hereafter referred to as T90).Their approach is to run the regression model according to Press et al. (1989) as a first step but then to account for the unexplained variance in the signal by substituting the measurement uncertainties in Eq. ( 4) with σ new given by where σ orig is the original measurement uncertainty and σ 2 is the variance of the residuals from the first fit derived individually for each month (T90).The regression model fit is then  Weatherhead et al. (1998), hereafter referred to as W98, derived an approximation for the uncertainty (ignoring autocorrelation -see Sect.5) on the regression model estimate of the trend as where σ N is the standard deviation of the residuals of the time series to which the regression model is being fitted and N is the number of data points in the time series.However, this approximation for the uncertainty on the trend is insensitive to the measurement uncertainties and no indication is given in W98 on how to accommodate that.To include the W98 approach in our methodological testing, Eq. ( 5) is applied to σ N of Eq. ( 6) similar to how it was done in B98.In this way, the W98 approach now accounts for both measurement uncertainties and unexplained variance in determining the uncertainty on the trend.The B98 and W98 approaches to estimate the trend uncertainty are now compared to each other and to the traditional bootstrapping approach.Monthly mean synthetic time series, containing only an annual cycle and no trend and no autocorrelation, extending from 1990 to 2010, with varying levels of measurement uncertainty and unexplained variance, were generated.Different combinations of measurement uncertainties together with different degrees of unexplained variance were created to investigate what effect the measurement uncertainties and the variance have on the derived uncertainty on the trend (which is zero).Clearly, if the measurements have a larger random uncertainty, the derived trend should become less certain than for a set of data where the measurements have a small uncertainty.The same is true for the unexplained variance on the signal.If we have noisy data (high variance) the estimated trend will have a larger un-certainty than for a signal with lower variance since the true trend might be buried in the noise.
To derive uncertainties that are statistically representative of the derived trend in each generated time series with random noise and random measurement uncertainties, and that are not affected by the chosen randomness of the noise, 500 synthetic time series for each combination of unexplained variance and measurement uncertainties (see Table 1) were generated.Three sets of measurement uncertainties, viz.
-uncertainties set to 1.0 for all measurements -hereafter referred to as uncert1.0-uncertainties selected randomly within a Gaussian distribution of mean 1.5 and standard deviation of 0.5hereafter referred to as uncert1.5σ0.5 -uncertainties selected randomly within a Gaussian distribution of mean 4.5 and standard deviation of 1.5hereafter referred to as uncert4.5σ1.5 and three sets of unexplained variance on the signal, viz.
-no additional variance -medium additional variance -random values are chosen from a Gaussian distribution with a mean 0.0 and a standard deviation of 3.0 -high additional variance -random values are chosen from a Gaussian distribution with a mean 0.0 and a standard deviation of 8.0.

G. E. Bodeker and S. Kremser: Techniques for analyses of trends in GRUAN data
It is clear from Table 1 that for all cases both the B98 and W98 approaches come close to matching the results from bootstrapping.Larger uncertainties on the trends are derived for all three methods when the unexplained variance on the signal and/or the measurement uncertainties increase.By construction all signals were generated without a trend, with the result that the derived trends are not statistically significantly different from zero within the derived 1σ uncertainties.The results presented in Table 1 show that both B98 and W98 are valuable approaches to estimate uncertainties on the derived trend.
While the bootstrapping approach accounts for both the unexplained variance on the signal and the measurement uncertainties, bootstrapping is computationally demanding and it does not account for autocorrelation in the signal.Both B98 and W98 can be used if the signal is autocorrelated (see below) and both methods are less computationally demanding than bootstrapping.W98 is only applicable for estimating uncertainties on the trend.To derive uncertainties on other fit coefficients W98 cannot be used.Therefore, as the B98 approach can be used to derive uncertainties on all fit coefficients, this method is used in all further examples and tests below.

Accounting for autocorrelation
Time series of atmospheric climate data records are often autocorrelated e.g. higher than normal temperature on one day is often followed by anomalously high temperature on the next day.This autocorrelation results from natural temporal and spatial scales of many atmospheric phenomena.Such autocorrelations have the effect of reducing the quantity of information that would be available from the same number of independent data points, generally increasing the size of the error estimates (T90).T90 show that the uncertainty on trend estimates depends critically on the magnitude of month to month autocorrelation (serial autocorrelation) in the measurements.For contiguous evenly spaced data, a first order autoregressive model is constructed as where R t is the residual at time t (see Eq. 1), R t−1 is the residual at the previous time step, φ is the autocorrelation coefficient, and the a t are the final residuals which should be free of any autocorrelation.For unequally spaced measurements, or for measurement series with many gaps, a second order autocorrelation model of the form (Reinsel et al., 1987).
Appendix A of T90 provides a convenient way to account for the autocorrelation in the application of Eq. ( 1).The data to be regressed are replaced by and all basis function time series are similarly transformed: The regression model is rerun and the usual standard error formulae are used to calculate the uncertainty on the regression model fit coefficients.
In the presence of autocorrelation, additional changes are required, e.g.Eq. ( 5) becomes or, if second order autocorrelation is accounted for and finally, Eq. ( 6) becomes i.e. the presence of autocorrelation inflates the uncertainty on the regression model fit coefficients by 6 GRUAN data

Application of trend analysis methods to Lindenberg upper air temperature data
At the time of writing this paper, RS92 radiosonde measurements of upper air temperatures currently comprise the only official GRUAN data product.RS92 radiosonde flights are made at a number of sites typically either two or four times per day (see Fig. 5).The site with the most comprehensive data record to date is the Lindenberg site which is used below in the example trend analyses.Figure 5 is not a reflection of the number of radiosonde flights made at each site, but rather the number of radiosonde flights that passed all QA/QC (Quality Assurance/Quality Control) checks imposed in the centralized data processing; the raw data from these flights are processed through a centralized RS92 radiosonde data processing facility located at the GRUAN Lead Centre in Lindenberg, Germany.A detailed description of the GRUAN RS92 radiosonde product, and how this product conforms to the requirements of a reference measurement, is described in detail in Dirksen et al. (2014).While the currently available RS92 radiosonde data set is too short for a robust longterm trend analysis, application of the methodology outlined above is demonstrated here on the data that are available.The challenges with applying the method to such a short time series are outlined below.The results of this very preliminary trend analysis, presented in Sect.6.2, should not be overinterpreted as they are certainly not indicative of longer term trends in upper air temperatures.The time series of GRUAN radiosonde upper air temperatures available from Lindenberg is rather short i.e. a five year period extending from mid-2009 to mid-2014.The brevity of this time series makes the fitting of longer-term geophysical drivers of temperature change such as the QBO, ENSO, and solar cycle difficult, if not impossible.And yet the increase in 10.7 cm solar flux from mid-2009 to mid-2014 may well have driven changes in temperature above Lindenberg over this period.If the question to be answered is "what were the temperature trends over Lindenberg from mid-2009 to mid-2014 as a function of season and altitude over and above the possible trends induced by periodic drivers such as QBO, ENSO and solar cycle?", then basis functions for these drivers should be included in addition to a trend basis function.The difficulty then is that if there was a trend in temperatures over this period, but this trend was not induced by the solar cycle, then the trend may well still be ascribed to the solar cycle since the 10.7 cm solar flux showed a steady increase from mid-2009 to mid-2014.The measurement time series is simply not long enough to determine the correlation between the temperature changes and the solar cycle.The same may be true for ENSO and QBO.If the question to be answered is "what were the temperature trends over Lindenberg from mid-2009 to mid-2014 as a function of season and altitude irrespective of the drivers of those changes?",then a regression model consisting of no more than an offset and trend term in Eq. ( 1) may be sufficient.Alternatively, a regression model including all potential drivers of temperature changes, but orthogonalizing all basis functions so that all basis functions other than the trend basis function are trendfree, would be appropriate.With these caveats in mind, three different constructs of the regression model have been applied to the data: 1.A regression model including offset, trend, QBO, ENSO and solar cycle basis functions -hereafter referred to as the all regression.
2. A regression model including only offset and trend basis functions -hereafter referred to as the reduced regression.
3. A regression model including offset, trend, QBO, ENSO and solar cycle basis functions, but where all basis functions have been sequentially orthogonalizedhereafter referred to as the all-orthog regression.
The measurement uncertainties, as provided in the GRUAN data files, are provided to the regression model along with the measurements.Individual measurements are used i.e. monthly means are not calculated.

Upper air temperature trends at Lindenberg
Examples of regression model fits to 00:00 UT temperature measurements above Lindenberg are shown in Fig. 6.The all-orthog fits are identical to the all fits and are therefore not shown; the partitioning of the variance between the basis functions between the all-orthog and all regression model runs of course differs, as will be shown below.Only sporadic measurements are available before late 2009 and during these periods, the uncertainty on the regression model fit is larger.It is also not immediately obvious that the all fit is superior to the reduced fit though, in some cases (e.g.early 2011 at 20 and 25 km) the all fit does capture the anomalously higher temperatures.Similar fits for 12:00 UT data are not shown.The resultant 00:00 UT trends from the three different applications of the regression model are shown in Fig. 7.
There are clear differences in the trends between the three different regression model runs.These differences can aid the interpretation of the causes of the trends.For example, a positive trend of 2.71 K year −1 is derived from the all re-gression model fit which reduces to 1.40 K year −1 in the allorthog fit.This suggests that in the all fit, other basis functions (in this case primarily the solar cycle basis function) drive a negative trend in temperature which is then partially compensated for by inflating the positive linear temperature trend to 2.71 K year −1 .Together, the negative trends induced by the other basis functions, together with the positive trend from the linear trend basis function, track the net trend of 1.40 K year −1 that results when all other basis functions are detrended i.e. from the all-orthog fit.The orthogonalization process changes all but the first basis function (in our case the offset basis function) such that the trend basis function in allorthog is different from that in reduced (see Eq. 2).This explains the difference in linear trends between the all-orthog and reduced fits.As noted in Sect.2.3, orthogonalizing the basis functions obfuscates the physical interpretation of the contribution of each basis function to the signal.Therefore, comparing now the all and reduced fit results, a trend reduction from 2.71 to 1.96 K year −1 is found.The explanation of this reduction is similar to that outlined above for the change from the all and all-orthog fits.Two interpretations of this result are possible: 1. Changes in solar activity resulted in cooling in May at ∼ 13 km.Other factors, accounted for by the linear trend term (e.g.changes in greenhouse gas concentrations) caused warming, larger in absolute magnitude than the solar cycle induced cooling, resulting in small net positive trends in temperature seen in the reduced panel of Fig. 7.The trend in temperature, over and above that resulting from the QBO, ENSO and solar cycle basis functions, is that derived from the all regression.
2. Changes in solar activity did not cause cooling in May at ∼ 13 km.Rather, shorter-term anti-correlations between temperature and solar activity aliased into a simulated longer-term negative trend in temperature which then needed to be compensated by a more positive linear trend contribution to track the true secular change in temperature as quantified in the reduced regression results.
A regression analysis, on its own, cannot determine which of these two interpretations is correct.It is a fundamental limitation of regression analyses that no physical understanding of the system is incorporated in the analyses.If, based on physically implausible mechanisms, an anti-correlation between temperature and solar activity at 13 km could be excluded, then interpretation (2) above could be excluded -in which case the trends quantified in the reduced analyses would best represent reality.We caution that trends derived from application of regression models need to be interpreted.Similar arguments regarding interpretation of 12:00 UT trends can be made as for the 00:00 UT trends (not shown here).In general, as with the 00:00 UT trends, the 12:00 UT trends are seldom statistically different from zero at the 2σ level.

Conclusions
Regression analysis is seldom a panacea for extracting trends from geophysical time series.This paper has explored some of the methodological perils and pitfalls of trend determination using regression analysis and has reiterated the caveats by demonstrating the application of the regression model to short time series of upper air temperature measurements at the GRUAN site at Lindenberg.The paper demonstrates how the GRUAN measurement uncertainties, the focus of much of the effort within GRUAN, can be propagated through to uncertainties on derived trends in temperature.The Lindenberg measurement series are too short to derive indicative upper air temperature trends from the data but are useful as a pedagogical example of the application of the regression model.The methodologies described in this paper demonstrate the care that needs to be taken in the calculation and interpretation of trends in GRUAN data so that the significant investment in GRUAN operations is fully exploited.

Figure 1 .
Figure 1.Solid grey line: a synthetic time series created from a purely periodic signal (plus noise) but still exhibiting a non-zero trend (upper dashed straight line fit).Line with filled dots: the deseasonalized signal with a straight line fit showing no statistically significant trend.

Figure 2 .
Figure 2. The seasonally resolved trend coefficient derived from fitting Eq. (1) (including only the α and β terms) to the synthetic time series (shown in the inset) together with its 1σ uncertainty (solid black line with grey shaded area).The trends obtained from linear fits to the data from each month separately, together with their 1σ uncertainties, are shown with black dots and vertical error bars respectively.

Figure 3 .
Figure 3. Four synthetic monthly mean time series calculated by adding Gaussian distributed random noise to a repeating mean annual cycle (a) together with the contributions from the trend (b), QBO (c) and solar cycle (d) basis function derived by fitting Eq. (1) to the time series in (a).The quantity plotted is artificial and therefore unitless.The trend values listed in (b) include their 1σ uncertainties.

Figure 4 .
Figure 4. Histograms generated from 20 000 simulations of trend results derived from the application of Eq. (1) to synthetic time series (as shown in Fig.3a) first with the original solar cycle and QBO basis functions (dark grey) and then with detrended QBO and solar cycle basis functions (light grey).The 1σ standard deviations of Gaussian fits to the histograms are listed together with the mean trend uncertainties derived from the regression model.The mean uncertainties were obtained by averaging the uncertainties from the 20 000 trend values.

Figure 5 .
Figure 5.The availability of RS92 radiosonde data at a sub-set of the GRUAN sites (not all GRUAN sites currently provide RS92 radiosonde measurements) at the time of the writing of this paper.Each site shows data in four rows indicative of the four 6 h periods through the day.Vertical columns in the grid indicate individual months with the years are listed across the top.The number of RS92 radiosonde flights available in each month in each 6 h period is shown using the colour scale in the bottom right corner of the figure.

Figure 6 .
Figure 6.Examples of regression model fits to the 00:00 UT measurements at five selected altitude levels spaced 5 km apart.1σ uncertainties are shown as error bars on the regression model fits.Because the measurement random uncertainties are small i.e. ∼ 0.1 K, they are not shown.

Figure 7 .
Figure 7. 00:00 UT Lindenberg temperature trends as a function of altitude and season for the three different constructs of the regression model.The trends, in Kelvin/year, are shown using the colour scale at the bottom left.Trends that are not statistically significantly different from zero at the 1σ level are covered in cross-hatching while trends that are not statistically significantly different from zero at the 2σ level are covered in single-hatching.Trends with no hatching are statistically significantly different from zero at more than the 2σ level.

Table 1 .
The mean 1σ uncertainties of the 500 trend values derived for the B98, W98 and bootstrapping approaches.For more details see text.new as the revised measurement uncertainties.As a result, the derived uncertainties on the regression model fit coefficients (obtained from the diagonal elements of the inverse matrix) become sensitive to both the original measurement uncertainties and the unexplained variance.