Assessment of BSRN radiation records for the computation of monthly means



Introduction
In this work we investigate the extent of the differences that can be caused by various potential data filling methodologies for surface radiation quantities. We do not intend or claim to have identified the ultimately best and least-error filling method, but rather demonstrate the impact of a range of potential methods that could be, and undoubtedly have been, used by researchers working with data sets that contain realistic gaps caused by unavoidable and other observational issues.
Anthropogenic interference with climate occurs first through a perturbation of the Earth's radiation balance (e.g., Ramanathan et al., 2001). Despite the central role that the radiation balance plays in the climate system, considerable uncertainties remain with respect to its mean state and temporal variation, as well as its representation in climate models (Wild et al., 1995; Wild, 2008). Attempts are underway to monitor changes in the radiation balance from both the surface and space. More and more studies, particularly those based on surface observations, present evidence that the radiative fluxes are not stable over time but undergo significant decadal variations (e.g., Gilgen et al., 1997; Stanhill and Cohen, 2001; Liepert, 2002; Dutton et al., 2006; Wild, 2009 and references therein), which may have major consequences for the climate system and climate change (Wild, 2009). However, all these analyses rely on data (typically monthly or yearly means) that have been aggregated in some way or another from incomplete raw data with much higher temporal resolution (typically minute to hourly). While the way to do this aggregation is by no means straightforward, neither the effects of different aggregation techniques nor the impacts of missing or flagged raw data have to date been rigorously assessed. Most of the studies based on monthly or yearly mean radiation fluxes ignore potential uncertainties induced by data gaps or differing aggregation methods. The problem has become more obvious with the unsatisfactory situation that substantially differing monthly or yearly means have been published for the same site. The present study attempts to shed more light on this issue, using data from the Baseline Surface Radiation Network (BSRN; Ohmura et al., 1998) as an example. BSRN was established to provide high-quality radiation measurements aimed at monitoring and detecting important changes in the surface radiation balance. In 2005, BSRN provided radiation data at almost 40 sites at high temporal frequency (time intervals of 1, 2, 3 or 5 min depending on site and period) and the highest possible accuracy. The BSRN data have been successfully used in numerous scientific applications (e.g., Wild et al., 1995, 2005; Dutton et al., 2006; Wild, 2008, 2009).
A. Roesch et al.: Assessment of BSRN radiation records for the computation of monthly means. Published by Copernicus Publications on behalf of the European Geosciences Union.
This paper focuses on monthly means, as this aggregation is widely used in climatological analyses. The reasons for this are manifold: given such factors as large differences in scale and sampling frequency between "point measurements" such as BSRN surface radiation and both satellite retrievals and model calculations, one common practice is to use longer averages in any comparisons between the two. One of the common temporal averaging modes is to use monthly averages, which cover enough time that spatial and temporal sampling differences are mitigated to a significant extent, yet are still "short enough" to allow the investigation of phenomena such as seasonal cycles. The same holds true for mitigating the effect of "missing or bad" data. For instance, an hour of missing solar radiation measurements near local solar noon precludes a meaningful daily average for that day. Without a priori knowledge of cloud occurrence and cloud properties for the missing time period, it is very challenging to accurately "manufacture" values corresponding to the missing data. Yet by the method of creating a monthly average diurnal cycle, the climatology of cloud occurrence for a given site helps to mitigate the influence of the missing data.
In the following we investigate the impact of missing BSRN radiation observations (either non-existent or flagged) and estimate the error when applying a number of different methods for the computation of monthly means from 1-min observations.

BSRN
The Baseline Surface Radiation Network (BSRN) is a project of the World Climate Research Program (WCRP) (Ohmura et al., 1998) and aims at providing the climate community with accurate and highly resolved irradiances for climate research purposes. This global network measures surface radiative fluxes at the highest possible accuracy with well-calibrated state-of-the-art instrumentation at selected sites in the major climate zones. Data are available from 1992 onward, currently from 51 stations, covering a latitude range from 80° N to 90° S. The high temporal resolution (minute frequency) makes the database a valuable tool for the validation of radiation schemes as well as the evaluation of estimates of surface radiation based in part on necessarily indirect and imperfectly calibrated satellite observations. For detailed information on the BSRN database and the sites (and the 3-letter acronyms for the stations that are used in this study), the reader is referred to the website at http://www.bsrn.awi.de/. The BSRN database currently contains approximately 5800 station months. The results presented in this study are based on all observations that were available by spring 2008. The study concentrates on the "basic" measurements, including global radiation (GLOB), diffuse shortwave radiation (SWDIFF), direct shortwave radiation (SWDIR), and downwelling longwave radiation (LWDOWN). GLOB can be obtained either by measurement with an unshaded pyranometer or by adding the direct and diffuse shortwave components. If not specified, GLOB refers to the pyranometer measurement. The term GLOB1 will be used for the (measured) sum of the downwelling direct and diffuse shortwave flux. The time interval for the radiation data compilation is mostly 1 min. A few sites provide data every 3 or 5 min.

Data flagging procedures
Detailed quality checks are applied to the BSRN radiation data. The WRMC does not correct the data but flags radiation data that are suspected to be erroneous. Subsequent applications of the data can then determine whether the flagged data should be used or discarded. Note that AWI does not provide quality flags for the archived BSRN data.
Three different procedures have been applied to the data. The procedures and limits are identical for all BSRN sites.
i. The "physically possible" procedure aims at detecting extremely large errors in the radiation data. The radiation data falling in the intervals defined in Table 1 are considered "physically possible".
ii. The limits in the "extremely rare" procedure are narrower than those of the "physically possible" test. Radiation data which violate these limits may occur over very short time periods under very rare conditions. These limits are given in Table 2. Within this study, data of "good quality" are assumed to be inside the "extremely rare" limits.
iii. The "across quantities" procedures capture smaller errors that have not been detected by the previous quality checks. These tests are based on empirical relations of the different quantities measured. The restrictions are defined in Table 3.
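The nesting of the first two procedures, and one "across quantities" relation, can be sketched as follows. This is only an illustration under stated assumptions: the limits of Tables 1 to 3 are not reproduced in this text, so the numerical limits used in the example calls are placeholders, and the function names `classify` and `glob_consistent` are hypothetical. The 8% tolerance between GLOB and GLOB1 is the value quoted later in the flagging results; expressing it relative to GLOB1 is an assumption.

```python
from typing import Optional, Tuple

Limits = Tuple[float, float]  # (lower, upper) bound in W m-2


def classify(value: Optional[float], possible: Limits, rare: Limits) -> str:
    """Classify one minute value against two nested limit intervals.

    `possible` stands in for the Table 1 interval and `rare` for the
    narrower Table 2 interval; both are passed in by the caller because
    the real limits are not given here.
    """
    if value is None:
        return "missing"
    if not (possible[0] <= value <= possible[1]):
        return "not_physically_possible"
    if not (rare[0] <= value <= rare[1]):
        return "extremely_rare"
    return "good"


def glob_consistent(glob: float, glob1: float,
                    max_rel_diff: float = 0.08) -> Optional[bool]:
    """Check that GLOB and GLOB1 differ by less than `max_rel_diff`.

    Returns None when GLOB1 is not positive, i.e. when a relative
    comparison is not meaningful (an assumption of this sketch).
    """
    if glob1 <= 0.0:
        return None
    return abs(glob - glob1) < max_rel_diff * glob1


# Placeholder limits, NOT the values of Tables 1 and 2.
POSSIBLE = (-4.0, 1500.0)
RARE = (-2.0, 1200.0)
print(classify(500.0, POSSIBLE, RARE))
print(classify(1300.0, POSSIBLE, RARE))
print(glob_consistent(500.0, 510.0))
```

In this sketch a value is "good" only if it passes both interval tests, which mirrors the definition of "good quality" data used in this study.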

Methods for monthly mean computations
There are many options for the computation of monthly averages from incomplete data. We will test the performance of several methods that are currently applied to the BSRN data in the climate and radiation community. Most of them include, to some extent, arbitrary thresholds that were set based on expert knowledge, visual inspection of appropriate illustrations and practical reasoning. These types of methods have been used in many practical applications (e.g., Zhang et al., 2004; Dutton et al., 2006; Wild et al., 2006; Hinkelman et al., 2009). Seven such methods are selected here to demonstrate the effect that different methods can have. We applied the following seven algorithms for computing monthly means from n-min values (n = 1, 3, 5) from the BSRN data:
M1: Computation of monthly fluxes from all minute values, including all flagged data, which have been identified in the BSRN database as being questionable (see Sect. 2.2). No filling of missing data is applied.
[...] least 50% of the possible data for that value. For the solar-driven variables (such as SWDIFF and SWDIR), the number of "good" data must be at least half of the daylight period, the daylight period being defined as the number of minutes from sunrise to sunset on a given day for that date and location. Once daily averages have been produced per the above procedure, they are used to calculate monthly averages as a simple arithmetic mean of the available daily averages if certain limits on available data are met. First, for any given day to be considered for inclusion in the monthly average, there must be at least 1300 min of the possible 1440 min of overall data available, regardless of the availability of any particular individual variable. Then, for any particular variable, there must be at least 60% of the possible data available, i.e. for the downwelling LW there must be 864 min of available "good" data, and for the downwelling SW there must be 60% of the possible daylight (sunrise to sunset) data available.
M7: 15-min averages are first computed from the 1-min data for each month. Computation of a single bin requires at least 20% valid data. Minute values that are outside the "physically possible" limits (Sect. 2.2, Table 1) are treated as missing values. For shortwave radiation fluxes, values below 0 W m−2 during night (solar zenith angle > 93°) were set to 0 W m−2. The reason for negative shortwave fluxes ("night-time offset") has been discussed by Haeffelin et al. (2001). The monthly mean is then computed by averaging the 96 bins (96 × 15 min = 24 h) that have been produced for each month. The monthly mean is valid only if all bins contain valid values. Computing the monthly mean via the monthly mean diurnal cycle benefits from the typical diurnal cycle of shortwave fluxes, allowing more accurate estimates for incomplete observations.
The methods M1, M2, M3, and M4 set nighttime (solar zenith angle greater than 93°) SW values to zero. Note that SWDIR is computed on a horizontal plane for the two methods M5 and M6, while the other methods provide SWDIR on a surface perpendicular to the direction of the incoming beam.
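The M7 procedure described above can be sketched as follows. This is a minimal illustration, not the code used in this study, and it rests on several assumptions: the input is one month of 1-min values starting at midnight with `None` marking missing minutes; the 20% requirement is interpreted relative to all minutes of the month that fall into a given 15-min time-of-day bin; the "physically possible" limits are passed in by the caller; and the zeroing of negative nighttime SW values is omitted because it requires the solar zenith angle. The function name is hypothetical.

```python
from typing import List, Optional, Sequence

MINUTES_PER_DAY = 1440


def monthly_mean_from_diurnal_cycle(
    minute_values: Sequence[Optional[float]],
    lower: float,
    upper: float,
    bin_minutes: int = 15,
    min_valid_fraction: float = 0.2,
) -> Optional[float]:
    """Monthly mean as the average of the monthly mean diurnal cycle."""
    if not minute_values:
        return None
    n_bins = MINUTES_PER_DAY // bin_minutes  # 96 bins for 15-min bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, value in enumerate(minute_values):
        # Missing values and values outside the limits are both skipped.
        if value is None or not (lower <= value <= upper):
            continue
        b = (i % MINUTES_PER_DAY) // bin_minutes
        sums[b] += value
        counts[b] += 1
    n_days = len(minute_values) / MINUTES_PER_DAY
    possible_per_bin = bin_minutes * n_days
    bin_means: List[float] = []
    for s, c in zip(sums, counts):
        # The monthly mean is valid only if every bin is valid.
        if c == 0 or c < min_valid_fraction * possible_per_bin:
            return None
        bin_means.append(s / c)
    return sum(bin_means) / n_bins


# Two days of constant flux; the whole second day is missing.
month = [100.0] * MINUTES_PER_DAY + [None] * MINUTES_PER_DAY
print(monthly_mean_from_diurnal_cycle(month, -4.0, 1500.0))
```

Because every time-of-day bin pools data from all days of the month, a gap on one day is effectively filled by the same time of day on the other days, which is why the method still returns a value in the example.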
It is important to note that we do not recommend a method for filling in the gaps, as there is no "best method" to fill them in. In fact, the determination of the "best method" depends on what the resultant data are to be used for. Without a priori knowledge of the variables that affect the surface radiation (i.e. cloud occurrence and cloud properties, atmospheric state, aerosol and ozone loading, etc.) for the missing time period, it is impossible to accurately "manufacture" values corresponding to the missing data. If one depends on climatology, then the gap filling interferes with the ability to analyze the data for subtle long-term trends such as global dimming and brightening or global warming. There simply is no "win-win" methodology to remove the effects of missing data. That being the case, our methods have the advantages of being simple and easily understood; they do not include modeled or external data but rely only on actual measurements, and the methodology helps to mitigate the influence of missing data. The user of the monthly averages thus produced must be aware of the impacts of missing data and make their own judgment as to how much missing data is allowed. The high time resolution data are available for those who prefer some other gap-filling methodology.

Results and discussion
The completeness of the BSRN observations is assessed by presenting (i) an overview of the frequency of gaps in the data (Sect. 3.1), (ii) the amount of flagged data (Sect. 3.2) and (iii) the impact of missing and/or flagged data on monthly mean estimates (Sects. 3.3 and 3.4).

Data gaps
Data gaps in the initial field data occur for different reasons, such as calibration periods, instrument failure or data loss. For this study, the frequency of data gaps was investigated for both shortwave (SW) and longwave (LW) radiation fluxes at all BSRN sites using all currently (spring 2008) available 1-min observations. Figure 1 displays the gap distribution for GLOB and LWDOWN at the two BSRN sites Alice Springs, Australia (ASP), with 131 observed months, and Billings, USA (BIL), with 149 months of observations. The figure clearly shows that both gap length and gap frequency may vary strongly between sites and parameters. While data gaps at ASP are generally very short, the radiation instruments at BIL often fail for more than one day (1440 min). Table 4 gives the percentage of missing data along with the total number of gaps. From this table we learn that ten BSRN sites have more than 5% missing GLOB observations. For SWDIR, 16 (8) sites have more than 5% (15%) missing data. A considerable fraction of LWDOWN data is missing: 11 (4) sites have more than 5% (15%) missing observations for LWDOWN. It can thus be concluded that at many sites, a substantial percentage of the observations are missing. The detailed gap analysis (Table 4) shows that for a specific site, the percentage of missing GLOB is generally lower than that for SWDIR. Only 6 (3, 7) sites out of the 33 stations listed in [...] of missing data hinder deriving reliable monthly means or trends in the radiation fluxes.

Flagged BSRN data
The overall quality of measured time series depends not only on the frequency of gaps but also on the amount of flagged data. BSRN has established a simple quality control of measured radiation fluxes (see Sect. 2.2). In Table 5 we present the fraction of the flagged data according to the "extremely rare" procedure (in units of 0.1%). The high fractions of flagged data in the SW are primarily due to the flagging of small negative SW fluxes during night ("nighttime offset", see Haeffelin et al., 2001), which are related to a small level of thermal noise. We can therefore conclude that, to a first approximation, numbers > 100 (10%) in Table 5 represent the fraction of data below −2 W m−2. Approximately half of all BSRN sites (17) belong to this category. The fraction of flagged SWDIFF and SWDIR data is generally lower than that of flagged GLOB data. Note that ignoring negative nighttime offsets might have a serious impact on the monthly mean. Note also that there is still no consensus within the BSRN community on how to handle negative nighttime offsets. Flagged fractions ranging from 0 to a few percent may be attributed to "real" data problems (other than the nighttime offset) due to instrument failure or calibration problems. Sites that do not provide SWUP or LWUP are marked with −999 in Table 5, clearly pointing out that less than 25% of the BSRN sites observe SWUP and LWUP.
In the LW, measurements outside the "extremely rare" limits rarely occur. Only at two stations (E13 and FLO) are more than 1% of the LWDOWN observations flagged, while 33 out of the 39 listed sites have less than 0.2% flagged LWDOWN observations. Table 6 provides detailed insight into the flagging results obtained from the "across quantities" procedure (Sect. 2.2). BSRN data mostly meet the rules of the "across quantities" procedure. The constraint between GLOB and GLOB1 (their difference should stay below 8%) is, however, quite often violated. For 14 BSRN sites, this condition is not satisfied for more than 2% of all 1- (2-, 5-) min observations. The mean over all sites (weighted with the length of the measurement period) is 2.8%. This is related to technical problems and [...] or the sensor, influencing the precision of a measurement. They do not have an impact on the mean values.
Table 6. Percentage (×10) of values flagged according to the "across quantities" procedures COMP1, COMP2, COMP3, COMP4, COMP5, and COMP6 as described in Sect. 2.2 and Table 3. Values equal to −999 indicate that the "across quantities" procedure could not be applied (due to missing data).

Differences in monthly means
In this section the seven different methods presented in Sect. 2.3 will be compared. For this intercomparison, we use all available data from the BSRN archive in order to provide the best statistics possible. Figure 2 shows the deviation of monthly GLOB climatologies for each method from the average over all months. This figure clearly reveals that the differences among the seven investigated methods can be quite large. Typical differences are on the order of 1 W m−2 but may increase to a few W m−2 for some sites. It is evident that the differences become more pronounced for individual months (note that the results in Fig. 2 show climatologies). Figure 2 reveals that the handling of flagged data plays an important role. This can be clearly demonstrated by comparing M1 with M2. These two methods differ only in how the flagged data are handled: M1 includes all data outside the "extremely rare" limits while M2 excludes them. This indicates that the treatment of flagged GLOB observations during the day (night values are zeroed) may also have a pronounced effect on the computed monthly mean. Distinct differences are also found between M2 and M3, giving strong evidence that gap filling has a distinct effect on the computed monthly means. The M7 and M4 methods hardly differ for KWA and PAY (Fig. 2, right-hand panels). This suggests that computing monthly means by first computing the monthly mean diurnal cycle (as applied in M7) may help to avoid the use of gap filling (as applied in M4), even for time series with a considerable amount of missing data (as for KWA, see Table 4).
The mean absolute deviation between two methods gives further insight into the differences between individual methods. The mean absolute deviation (MAD) is defined as

$$\mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \left| MX_i - MY_i \right|,$$

with MX_i and MY_i the monthly means computed with methods MX and MY, respectively, and N the number of valid monthly means in both MX and MY. Figure 3 displays MAD between all method combinations, averaged over all BSRN sites listed in Table 4. For GLOB, MAD generally amounts to 1-3 W m−2. The comparison between methods M1-M4 shows again that the treatment of flagged data and the gap filling do have an effect on the computed monthly GLOB values. The mean GLOB bias between M6 and the other six methods is substantially higher than among the other methods. M6 applies more stringent testing for "extremely rare" limits than the official BSRN screening and computes monthly means from the arithmetic mean of daily averages (Sect. 2.3). This pronounced bias indicates that a more sophisticated quality control might also have a distinct effect on the computed monthly mean. For SWDIFF, MAD is generally smaller than for GLOB since SWDIFF is generally smaller than GLOB. The direct beam component of GLOB, however, clearly shows larger differences between the monthly means computed by different methods (Fig. 3). This is probably due to the technically more difficult measurement of SWDIR compared to GLOB and SWDIFF, as sun tracking by the pyrheliometer is quite susceptible to errors. This is also reflected in the high percentage of missing values at many BSRN sites as shown in Table 4. This failure rate is a likely reason for the considerable biases between M1-M4 and M7: the former compute the monthly means by a simple arithmetic average of daily means while the latter computes the monthly mean from the monthly mean diurnal cycle. From Fig. 3, we learn that the monthly LWDOWN estimates obtained with different algorithms are in close agreement. MAD is below 0.1 W m−2 between methods M1, M2, and M3. These low differences are closely related to the low percentage of flagged data (see Table 5) and the rather low percentage of missing data when compared to the SW fluxes. Furthermore, the low temporal variability might further reduce the effect of data gaps. Figures 4 and 5 display histograms of the monthly GLOB biases between two individual methods. The height of the bars gives the fraction of all concurrently valid monthly means within a certain range, as indicated above the bars. Ideally, all biases are within ±0.1 W m−2, which would imply one single grey bar for the range [−0.1, 0.1] W m−2 with a height of 100%. From Fig. 4 we learn that M4 and M7 approach this ideal case most closely. This gives some evidence that both a clever interpolation of missing/flagged data (M4) and the computation of monthly means from the monthly mean diurnal cycle are likely to be useful and robust approaches for the computation of monthly radiation fluxes from observations with high temporal resolution. It is of some interest, however, that the biases between M4/M5 and M5/M7 closely follow a Gaussian distribution while the difference M4-M7 is positively skewed (Fig. 4, middle row, left panel). The distribution of the biases differs strongly when evaluating differences between method M6 and any other method (Fig. 4, right-hand panels). The absolute bias between M6 and MX, X = 1, 2, 3, 4, 5, 7, is greater than 2 W m−2 for approximately one third of all monthly means. This might be related to the quality control and interpolation method applied in M6, which clearly differ from those of the other algorithms under investigation. Further, M6 requires additional input parameters that are dependent on the site. Therefore, M6 was only applied to the data of 9 BSRN sites.
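The MAD between two methods, as defined earlier in this section, can be computed with a few lines of code. The sketch below assumes that the monthly means of each method are stored in equally long, aligned lists in which `None` marks a month without a valid monthly mean; the function name is hypothetical.

```python
from typing import Optional, Sequence


def mean_absolute_deviation(
    mx: Sequence[Optional[float]],
    my: Sequence[Optional[float]],
) -> Optional[float]:
    """MAD over the months for which both methods give a valid mean."""
    pairs = [(a, b) for a, b in zip(mx, my)
             if a is not None and b is not None]
    if not pairs:
        return None  # no concurrently valid monthly means
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Only the first two months are valid in both series, so N = 2.
mx = [100.0, 102.0, None, 98.0]
my = [101.0, 100.0, 99.0, None]
print(mean_absolute_deviation(mx, my))  # (1 + 2) / 2 = 1.5
```

Restricting the sum to concurrently valid months matters here, since the methods differ in how many valid monthly means they produce.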
Figure 5 is similar to Fig. 4 but for M1-M4. This figure reveals that it is relevant to consider the effect of flagged data and/or data gaps on the computed monthly mean. Monthly means obtained with M1 differ by more than 2 W m−2 from M2, M3, and M4 in approximately 15% of all cases. In addition, the distribution is far from being Gaussian but rather negatively skewed. This suggests that the consideration of missing/flagged values is essential. The differences between M1 and M2 reveal that the handling of flagged data also has an impact on the monthly mean estimates. The inter-method biases among the other methods are distinctly less pronounced. The fractions of monthly mean biases above monthly means obtained with M3 and M4 do not differ by more than 0.1 W m−2. From this it is evident that the monthly bias likely depends less on the interpolation method than on whether flagged/missing values are replaced by interpolated values at all.
In the following, we consider all monthly means derived from two methods that differ by less than 0.2 W m−2 (hereinafter called "high agreement" or HIAG). The question is how the fraction of HIAG depends on the percentage of "good" observations. We address this question in Fig. 6 by binning monthly means into classes with different fractions of underlying "good" observations (defined as 1-, 3- or 5-min fluxes that are within the "extremely rare" limits). This means, e.g., that the class 99-100% contains monthly means for which fewer than approximately 430 (30 × 24 × 60/100 = 432) of the underlying 1-min measurements are either missing or flagged (assuming a month with 30 days). Note that observations afflicted with nighttime offsets are not counted as flagged since all methods zero SW fluxes during night.
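The HIAG statistic and its binning by the percentage of "good" observations can be sketched as follows. The code assumes aligned lists of monthly means for two methods (with `None` for invalid months) and, per month, the percentage of underlying "good" observations. The class boundaries in the example and the convention that a class includes its upper but not its lower bound are assumptions, and the function name is hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

# One percent of a 30-day month of 1-min data: 43200 / 100 = 432 values.
ONE_PERCENT_OF_MONTH = 30 * 24 * 60 // 100


def hiag_by_class(
    mx: Sequence[Optional[float]],
    my: Sequence[Optional[float]],
    good_pct: Sequence[float],
    classes: Sequence[Tuple[float, float]],
    threshold: float = 0.2,
    min_count: int = 1,
) -> Dict[Tuple[float, float], Optional[float]]:
    """Percentage of monthly means in 'high agreement' for each class."""
    result: Dict[Tuple[float, float], Optional[float]] = {}
    for lo, hi in classes:
        diffs = [abs(a - b) for a, b, g in zip(mx, my, good_pct)
                 if a is not None and b is not None and lo < g <= hi]
        if len(diffs) < min_count:
            result[(lo, hi)] = None  # too few months for a robust value
            continue
        n_close = sum(1 for d in diffs if d < threshold)
        result[(lo, hi)] = 100.0 * n_close / len(diffs)
    return result


mx = [100.0, 100.0, 100.0, 100.0]
my = [100.1, 100.5, 100.05, 103.0]
good = [99.5, 99.8, 92.0, 93.0]
print(hiag_by_class(mx, my, good, [(99.0, 100.0), (90.0, 95.0)]))
```

Setting `min_count=31` would reproduce the requirement of more than 30 valid monthly means per class that is applied to the cases displayed in Fig. 6.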
Figure 6 displays the inter-method differences for the four algorithms M4, M5, M6, and M7. We again included all data that are currently stored in the BSRN database, allowing for a very large basic set. It is evident that the fraction of HIAG increases for increasingly complete and unflagged observed data. For a month with complete observations and no flagged data we expect all methods to give the same result, i.e. 100% show "high agreement". This holds for all methods (except for M6) and all parameters, as the HIAG fraction is generally above 90% for the bin with 99-100% "good" measurements. The discrepancy for M6 may be due to the fact that M6 is not based on the same version of the underlying measurements, because the monthly means obtained with M6 were derived from an earlier retrieval that may slightly differ from the most recent version currently stored in the BSRN database. In addition, M6 applies a very sophisticated quality control (Long and Shi, 2006) of the radiation fluxes (Sect. 2.3). This means that the percentage of flagged (and corrected) data might substantially deviate from the fraction of flagged data obtained with the quality procedure that is routinely applied to the BSRN observations. Excluding method M6, we learn from Fig. 6 that the percentage of "high agreement" for SW monthly means drops to 60-80% for underlying measurements with only 90-95% of "good" data (data inside the "extremely rare" limits). Monthly LWDOWN values obtained with the seven investigated algorithms generally differ less than the SW fluxes. For LWDOWN, the HIAG percentage remains above 90% even for measurements with a substantial part of missing or flagged data (Fig. 6). This is a clear hint that monthly LWDOWN fluxes are less affected by data gaps and/or flagged data, as the temporal variability of LWDOWN is generally distinctly lower than that of SW fluxes. For months with more than 99% of the observations being within the "extremely rare" limits, more than 99% of the monthly means obtained with different methods (excluding again M6) do not differ by more than 0.2 W m−2. Note that the population is sufficiently large (more than 2000 valid monthly means) to guarantee statistically robust results. For SWDIFF and LWDOWN in Fig. 6, only M7 and M4 provide a sufficient number of valid monthly means for cases with only 80-90% of "good" (inside the "extremely rare" limits) data. This feature is directly related to the setup of the methods: M4 is based on an interpolation of missing and flagged data, while M7 allows the computation of valid monthly fluxes also for high fractions of missing and flagged data by taking advantage of the typical diurnal cycle of the SW fluxes. As complete time series are an important prerequisite for the determination of accurate trends in radiation fluxes (see e.g., Wild et al., 2005), we favor methods that allow the computation of reasonable monthly means, such as M4 and M7. Both methods account for the diurnal and seasonal cycle. We favor method M7 over M4 as the extra task of computing the solar zenith angle is not necessary.
Fig. 6. The percentage of monthly means derived from two methods differing by less than 0.2 W m−2 versus the fraction of the underlying 1-min observations that are within the "extremely rare" limits for the radiation quantities indicated for each plot. Only cases with a sufficiently large basic set (>30 valid monthly means) are displayed. Methods M1-M7 are described in Sect. 2.3. The "extremely rare" limits are listed in Table 2. Note that for SW fluxes, the night-time offsets are not considered.
The analysis shown in Fig. 6 can be repeated for the set M1, M2, M3, and M4 (not shown). This provides valuable insight into the impact of the interpolation of missing and flagged data on the computed monthly mean. The evaluation reveals that the fraction of monthly means with HIAG (difference less than 0.2 W m−2) decreases most rapidly between M1 and M2 with a decreasing percentage of "good" data. This is reasonable as M1 includes all flagged data and no interpolation of gaps while M4 applies an interpolation of flagged and missing data. The relationships for M3 and M4 are similar, pointing to the fact that the computed monthly means depend little on the applied interpolation method. As in Fig. 6, monthly LWDOWN is less sensitive to the fraction of missing and flagged data than are SW fluxes.
Fig. 7. Color-coded correlation matrices for the monthly averages using the seven methods M1-M7. Displayed are the correlation coefficients for GLOB (global radiation), SWDIR (direct SW radiation), SWDIFF (diffuse SW radiation), and LWDOWN (downward longwave radiation). Correlations are computed from deseasonalized data and averaged over all BSRN sites. Monthly SWDIR computed with M5 and M6 cannot be compared to the other methods as they provide SWDIR on a horizontal surface. Note that M6 provides data for only nine BSRN sites. Further, M6 is based on an earlier retrieval from the BSRN database that may slightly differ from the most recent data version.

Correlation of monthly mean time series
The strength of the linear relationship between the monthly fluxes compiled with two differing methods is investigated by means of the correlation coefficients. Figure 7 gives a visual overview of the correlation between the monthly time series for any pair of methods. The correlations are computed using deseasonalized data. The mean correlations shown in this figure are determined in two steps (for each method pair and each parameter): (i) computation of the correlation coefficients for each individual BSRN site, and (ii) calculation of the arithmetic mean of the correlation coefficients computed in (i). Note that M6 only provides data for nine BSRN sites whereas monthly means for all BSRN stations are available for the other six algorithms M1-M5 and M7. Figure 7 shows that the monthly means derived from the various methods mostly correlate quite well, with correlation coefficients > 0.96. M6 generally shows the lowest correlation with the other investigated algorithms for all radiation components. This is partly due to the smaller amount of available monthly mean data for M6. Very high correlations are found for downwelling LW radiation between the methods M1-M5, which is likely related to a rather small percentage of missing and flagged LWDOWN observations (see Table 5). Furthermore, temporal variability in LWDOWN is generally smaller than in the SW fluxes, which minimizes the effect of data gaps on the monthly mean. M5 and M7 both compute monthly means from monthly mean diurnal cycles, but their handling of missing and flagged data differs. Furthermore, the details of the computation of monthly mean diurnal cycles differ. The mean correlations for these two methods are above 0.98 for the SW (excluding SWDIR) and LW fluxes. The lower correlation for SWDIR between M5 and M7 may be caused by frequent data gaps and a considerable amount of flagged data.
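The two-step computation of the mean correlation can be sketched as follows. The text does not specify how the data were deseasonalized, so this sketch assumes that the mean of each calendar month is subtracted and that each series starts in January; missing monthly means are marked with `None`. All function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence

Series = Sequence[Optional[float]]


def deseasonalize(series: Series) -> List[Optional[float]]:
    """Subtract the mean of each calendar month (index 0 = a January)."""
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(series):
        if v is not None:
            sums[i % 12] += v
            counts[i % 12] += 1
    return [None if v is None else v - sums[i % 12] / counts[i % 12]
            for i, v in enumerate(series)]


def pearson(x: Series, y: Series) -> Optional[float]:
    """Pearson correlation over the months valid in both series."""
    pairs = [(a, b) for a, b in zip(x, y)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0.0 or syy == 0.0:
        return None  # correlation undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def mean_site_correlation(sites_x: Sequence[Series],
                          sites_y: Sequence[Series]) -> Optional[float]:
    """Step (i): correlation per site; step (ii): arithmetic mean."""
    rs = [pearson(deseasonalize(x), deseasonalize(y))
          for x, y in zip(sites_x, sites_y)]
    valid = [r for r in rs if r is not None]
    return sum(valid) / len(valid) if valid else None
```

Removing the seasonal cycle first is what makes the coefficient informative: without it, the shared annual cycle of the SW fluxes would dominate and push every correlation close to one regardless of the method.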
In summary, the fraction of missing/flagged data clearly affects the differences between monthly means obtained with two different methods. The inter-method differences are generally smaller for LWDOWN than for SW fluxes, as LWDOWN shows less temporal variability than SW fluxes, which lowers the effect of missing 1- (2-, 5-) min values on the monthly mean estimate.
Further investigation revealed that the methods are generally more sensitive to changes in the gap frequency than to the amount of flagged data. We conclude from this that, in order to decrease the uncertainty in the computed monthly fluxes, the gaps in the data series should be reduced. The quality control that was implemented in BSRN at ETHZ also has the potential to improve the accuracy of the computed monthly means.

Trends in global radiation estimated by different methods
The phenomenon of global brightening has been widely discussed during the last few years (Wild et al., 2005; Gilgen et al., 2009; Wild, 2009). Trend estimation is, however, dependent on the quality and homogeneity of the time series. Furthermore, we show here that trend estimates may also be influenced by the method with which monthly means have been estimated from the minute data. In order to estimate the effect of the selected method on the trend in global radiation, we analyzed stations with measurements starting in 1997 or before and with no continuous long-term gaps. Trends have been computed on the basis of annual means. Annual means were calculated from monthly means if more than eight valid monthly means were available for the respective year. Considering these conditions, and provided that three or more out of the seven investigated methods produce valid annual means during the 10-year period 1997-2006, we select 11 sites for our investigation. Least-squares linear regression was then applied for a trend analysis (Table 7). Ten out of the 11 investigated time series show a positive mean trend during 1997-2006. However, the tabulated standard deviations clearly reveal that the estimated trends strongly depend on the selected method. For some sites, e.g. GVN or SPO, the sign of the estimated trend in global radiation depends on the selected method. A closer investigation reveals that the main reason for the observed differences is that the number of annual means taken into account in the computed trends for 1997-2006 differs considerably among the methods. Note, however, that the differences would be less significant for longer time series: given the limited length of the time series involved, the computed trends are highly sensitive to the number of annual means included.
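The trend computation described above can be sketched as follows. The code assumes that an annual mean is the unweighted arithmetic mean of the valid monthly means of a year (the weighting is not specified in the text), that "more than eight" valid months translates into a minimum of nine, and that invalid values are marked with `None`. The function names are hypothetical.

```python
from typing import Optional, Sequence


def annual_mean(monthly: Sequence[Optional[float]],
                min_valid: int = 9) -> Optional[float]:
    """Annual mean if more than eight monthly means are valid."""
    valid = [m for m in monthly if m is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def linear_trend(years: Sequence[int],
                 annual: Sequence[Optional[float]]) -> Optional[float]:
    """Least-squares slope (W m-2 per year) over the valid annual means."""
    pairs = [(t, v) for t, v in zip(years, annual) if v is not None]
    n = len(pairs)
    if n < 2:
        return None
    t_mean = sum(t for t, _ in pairs) / n
    v_mean = sum(v for _, v in pairs) / n
    stt = sum((t - t_mean) ** 2 for t, _ in pairs)
    if stt == 0.0:
        return None
    return sum((t - t_mean) * (v - v_mean) for t, v in pairs) / stt


years = list(range(1997, 2007))
# Synthetic annual means with a known slope of 0.5 W m-2 per year.
annual = [150.0 + 0.5 * (y - 1997) for y in years]
print(linear_trend(years, annual))
```

With real, noisy data, a method that fails to produce a valid annual mean for one or two of the ten years leaves fewer points for the regression, which illustrates why the slopes in such a short series can differ noticeably between methods.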

Summary and conclusion
This work demonstrates the issues that missing observational data cause for the computation of monthly means. The study investigates the completeness of the currently available BSRN data and its impact on monthly means computed with different methods. The range of results indicates the uncertainty that may be attached to any monthly mean computed from data with gaps when the method used is not clearly described.
The simple quality analysis shows that the data quality at most sites is generally good. The percentage of observations that lie outside the "extremely rare" limits is generally below 2%. The "across-quantity" conditions are mostly satisfied at all BSRN sites. The constraint that GLOB and GLOB1 should not differ by more than 8% is often violated. At 14 BSRN sites, this test fails for about 2% of the observations.
Within this study, seven methods for the computation of monthly means from minute values have been intercompared. The results showed that the computed monthly means may differ by several W m⁻². For months with more than 99% high-quality data (less than 1% of the data missing or outside the "extremely rare" limits), M4 and M7 show the best agreement. This gives some confidence that M7 may be well suited for the computation of BSRN monthly means. This algorithm omits flagged data and profits from the typical diurnal cycle of SW radiation fluxes. M6, however, deviates significantly from the monthly means derived from the other methods. This is likely due to the more stringent and sophisticated quality control that has been applied to the data prior to the monthly mean computation. The comparison of the four methods M1, M2, M3, and M4 reveals that it is crucial to take the quality flags into account. For example, M1 differs by more than 2 W m⁻² from M2, M3, and M4 for GLOB in approximately 15% of all monthly means.
This study shows that monthly mean estimates may depend substantially on the selected averaging algorithm. The discrepancy between the methods generally increases with increasing fractions of missing or flagged data. It has been shown that it is essential to account for data quality flags when computing monthly fluxes from 1-min observations. The comparison study indicates that it is advantageous to compute monthly fluxes by first computing the mean monthly diurnal cycle, as this minimizes the impact of missing values.
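The advantage of averaging the mean monthly diurnal cycle can be illustrated with a minimal sketch. It is not an implementation of any of the methods M1-M7. It assumes that each day is given as an equally long list of values on a fixed time grid, that missing or flagged values are marked with None, and that no monthly mean is returned if a time slot has no valid value on any day. The function names and the toy data are hypothetical.

```python
from statistics import mean


def monthly_mean_via_diurnal_cycle(days):
    """Average each time slot over all days, then average the slot means.

    days: list of equal-length lists, one per day; None marks a missing
    or flagged value. Returns None if some time slot has no valid value
    on any day (an assumption of this sketch).
    """
    n_slots = len(days[0])
    slot_means = []
    for slot in range(n_slots):
        valid = [day[slot] for day in days if day[slot] is not None]
        if not valid:
            return None
        slot_means.append(mean(valid))
    return mean(slot_means)


def naive_monthly_mean(days):
    """Average all valid values directly, ignoring the time of day."""
    valid = [v for day in days for v in day if v is not None]
    return mean(valid) if valid else None


if __name__ == "__main__":
    # Toy month of two identical days with four time slots each (W m-2).
    # The night-time values of the second day are missing.
    days = [
        [0, 100, 200, 0],
        [None, 100, 200, None],
    ]
    print(naive_monthly_mean(days))              # biased towards daytime
    print(monthly_mean_via_diurnal_cycle(days))  # recovers the gap-free mean
```

In this toy case the gap-free mean is 75 W m⁻². The direct average of the valid values gives 100 W m⁻², because the missing values all fall at night, whereas averaging the mean diurnal cycle returns 75 W m⁻².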
Of the methods used here, the authors suggest the application of method M7 when computing monthly means from BSRN observations. This method accounts for both the diurnal and the seasonal cycle in the radiation data without computing the solar zenith angle. Adopting a common method helps to avoid different monthly mean estimates being used in the literature for the same site and month. Finally, it is essential to note that with missing data, which are inevitable in real-world observations, there will be no perfect and error-free method: not filling gaps will bias the results, and filling gaps requires estimating values that are not exact, especially for radiation parameters that are potentially highly variable.
In addition to the empirical methods presented here, robust methods could be applied to BSRN data in order to avoid biased estimates for data records with a high percentage of flagged data and/or frequent data gaps. The field of robust statistics (for an introduction, see, e.g., Huber, 1981; Hampel et al., 1986; or Maronna et al., 2006) might be the ideal tool to implement more mathematically founded methods which, in contrast to purely empirical methods, allow for the computation of confidence intervals. Robust estimators have recently been applied to incomplete data (Frahm, 2009) and to non-stationary processes (Horenke, 2010).

Fig. 1. Examples of the distribution of data gap length (GL) for Alice Springs, Australia (ASP, panels a, b) and Billings, USA (BIL, panels c, d) for GLOB and LWDOWN. Gap lengths (GL) are given in minutes on each bar. The length of the observation period is 131 months for ASP and 149 months for BIL.

Fig. 2. Comparison of different algorithms for the computation of monthly GLOB means (see Sect. 2.3). Shown are the differences between each single method and the sum of all methods. The analysis is restricted to the period during which all methods provide valid monthly means. The following four sites are displayed: GVN (Georg von Neumayer, Antarctica), KWA (Kwajalein, Marshall Islands), NYA (Ny Alesund, Spitsbergen), and PAY (Payerne, Switzerland).

Fig. 3. Mean absolute bias between all pairs of filling methods for four radiative quantities: GLOB (global radiation), SWDIR (direct SW radiation), SWDIFF (diffuse SW radiation), and LWDOWN (downward longwave radiation). Filling methods M1-M7 are described in Sect. 2.3. The absolute biases are averaged over all BSRN sites. Note: M5 provides data for nine BSRN sites only, as specified in Sect. 2.3. Monthly SWDIR computed with M5 and M6 cannot be compared to the other methods as they provide SWDIR on a horizontal surface.

Fig. 4. Differences in monthly means of GLOB between pairs of the filling methods M4, M5, M6, and M7, as indicated above each plot. The methods are described in Sect. 2.3. The bars show the percentage of monthly mean differences within the limits given in square brackets above the bars (unit: W m⁻²). The comparison considers all monthly BSRN data for which the individual methods concurrently provide valid monthly GLOB.

Fig. 7. Color-coded correlation matrices for the monthly averages using the seven methods M1-M7. Displayed are the correlation coefficients for GLOB (global radiation), SWDIR (direct SW radiation), SWDIFF (diffuse SW radiation), and LWDOWN (downward longwave radiation). Correlations are computed from deseasonalized data and averaged over all BSRN sites. Monthly SWDIR computed with M5 and M6 cannot be compared to the other methods as they provide SWDIR on a horizontal surface. Note that M6 provides data for only nine BSRN sites. Further, M6 is based on an earlier retrieval from the BSRN database that may differ slightly from the most recent data version.

Table 1. Lower and upper limits of the "Physically possible" intervals used in flagging the radiation quantities. Values were flagged if outside the indicated interval. S_o is the solar constant adjusted for Earth-Sun distance. µ is the cosine of the solar zenith angle. Parameters: GLOB: global radiation, SWDIFF: diffuse shortwave radiation, SWDIR: direct shortwave radiation, SWUP:

Table 2. Same as Table 1 except for the "Extremely rare" intervals for flagging the radiation quantities.

Table 4. Percentage of missing data and number of gaps (in brackets) for all BSRN sites. Note that missing data do not include data flagged "unphysical". For detailed information on the BSRN stations whose 3-letter acronyms are given here, see http://www.bsrn.awi.de/en/home/bsrn/. Parameters: GLOB: global radiation, SWDIR: direct shortwave radiation, LWDOWN: downwelling longwave radiation.

Table 5. Percentage (× 10) of values flagged according to the "extremely rare" procedure as described in Sect. 2.2 and Table 1. A value of −999 indicates that the test cannot be applied (due to missing data). Numbers marked with * are primarily caused by flagged negative SW fluxes.

Table 7. Mean trends in global radiation (1997-2006) for 11 BSRN sites, averaged over the different filling methods applied to each site. The trend analysis is restricted to the methods that produced valid annual means for the whole 10-year period for each site. The filling methods used are listed in the second column. For a description of the methods and the abbreviations used, see Sect. 2.3. The fourth column shows the standard deviation (STDEV).