Investigating bias in the application of curve fitting programs to atmospheric time series
Abstract. The decomposition of an atmospheric time series into its constituent parts is an essential tool for identifying and isolating variations of interest from a data set, and is widely used to obtain information about sources, sinks and trends in climatically important gases. Such procedures involve fitting appropriate mathematical functions to the data. However, it has been demonstrated that the application of such curve fitting procedures can introduce bias, and thus influence the scientific interpretation of the data sets. We investigate the potential for bias associated with the application of three curve fitting programs, known as HPspline, CCGCRV and STL, using multi-year records of CO2, CH4 and O3 data from three atmospheric monitoring field stations. These three curve fitting programs are widely used within the greenhouse gas measurement community to analyse atmospheric time series, but have not previously been compared extensively.
The programs were rigorously tested for their ability to accurately represent the salient features of atmospheric time series, their ability to cope with outliers and gaps in the data, and their sensitivity to the values used for the input parameters needed for each program. We find that the programs can produce significantly different curve fits, and that these curve fits can depend on the input parameters selected. There are notable differences between the results produced by the three programs for many of the decomposed components of the time series, such as the representation of seasonal cycle characteristics and the long-term (multi-year) growth rate. The programs also vary significantly in their response to gaps and outliers in the time series. Overall, we found that none of the three programs was superior, and that each program had its strengths and weaknesses. Thus, we provide a list of recommendations on the appropriate use of these three curve fitting programs for certain types of data sets, and for certain types of analyses and applications. In addition, we recommend that sensitivity tests be performed in any study using curve fitting programs, to ensure that results are not unduly influenced by the input smoothing parameters chosen.
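As an illustration of the kind of sensitivity test recommended above, the following minimal sketch varies one smoothing parameter and records how a derived quantity (the mean growth rate) responds. It is not an implementation of HPspline, CCGCRV or STL: the smoother is a plain moving average used as a stand-in, and the synthetic series, the function names and the window lengths are all assumptions made for this example.

```python
import numpy as np

def moving_average_trend(y, window):
    """Moving-average trend estimate.

    A stand-in smoother for illustration only; it is NOT HPspline,
    CCGCRV or STL. mode="valid" drops incomplete windows at the edges.
    """
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="valid")

def growth_rate_sensitivity(y, windows, samples_per_year=12):
    """Mean annual growth rate of the trend for each window length."""
    rates = {}
    for w in windows:
        trend = moving_average_trend(y, w)
        # Mean step-to-step change of the trend, scaled to one year.
        rates[w] = float(np.mean(np.diff(trend)) * samples_per_year)
    return rates

# Synthetic monthly record (hypothetical values): 10 years with a
# 2 units/yr trend, a 3-unit seasonal cycle and small random noise.
rng = np.random.default_rng(0)
t = np.arange(120) / 12.0
y = 380.0 + 2.0 * t + 3.0 * np.sin(2 * np.pi * t) + rng.normal(0, 0.3, t.size)

rates = growth_rate_sensitivity(y, windows=[12, 24, 36])
spread = max(rates.values()) - min(rates.values())
for w, r in rates.items():
    print(f"window={w:2d} months  growth rate={r:.3f} per year")
print(f"spread across windows: {spread:.3f} per year")
```

A small spread relative to the signal of interest suggests the result is robust to the parameter choice; a large spread would indicate that the conclusion depends on the smoothing settings. The same loop could wrap any real curve fitting program in place of the stand-in smoother.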
Our findings also have implications for previous studies that have relied on a single curve fitting program to interpret atmospheric time series measurements. We demonstrate this by using two other curve fitting programs to replicate the work of Piao et al. (2008), who used zero-crossing analyses of atmospheric CO2 seasonal cycles to investigate terrestrial biosphere changes. We highlight the importance of using more than one program, to ensure results are consistent, reproducible, and free from bias.
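For readers unfamiliar with zero-crossing analysis, the sketch below shows one simple way to locate the times at which a detrended seasonal cycle changes sign, using linear interpolation between adjacent samples. It is a generic illustration, not the procedure of Piao et al. (2008) or of any of the three programs; the function name and the synthetic cosine cycle are assumptions for this example.

```python
import numpy as np

def zero_crossings(t, s):
    """Locate sign changes of s(t) by linear interpolation.

    Returns two lists of crossing times: (downward, upward).
    Only strict sign changes between adjacent samples are counted,
    so a sample that is exactly zero is skipped in this simple sketch.
    """
    down, up = [], []
    for i in range(len(s) - 1):
        a, b = s[i], s[i + 1]
        if a * b < 0.0:
            # Linear interpolation to the time where the line hits zero.
            tc = t[i] + (0.0 - a) * (t[i + 1] - t[i]) / (b - a)
            (down if a > 0.0 else up).append(float(tc))
    return down, up

# Hypothetical detrended seasonal cycle: daily samples over one year,
# amplitude 5 units, maximum at t = 0.1 yr.
t = np.arange(365) / 365.0
s = 5.0 * np.cos(2 * np.pi * (t - 0.1))

down, up = zero_crossings(t, s)
print("downward crossing (fraction of year):", down)
print("upward crossing (fraction of year):", up)
```

Because the crossing dates are computed from the fitted seasonal cycle, any difference in how a curve fitting program separates trend from seasonality feeds directly into these dates, which is why comparing several programs is informative.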