Ecosystem fluxes of hydrogen : a comparison of flux-gradient methods

Our understanding of biosphere–atmosphere exchange has been considerably enhanced by eddy covariance measurements. However, there remain many trace gases, such as molecular hydrogen (H 2), that lack suitable analytical methods to measure their fluxes by eddy covariance. In such cases, flux-gradient methods can be used to calculate ecosystem-scale fluxes from vertical concentration gradients. The budget of atmospheric H 2 is poorly constrained by the limited available observations, and thus the ability to quantify and characterize the sources and sinks of H 2 by fluxgradient methods in various ecosystems is important. We developed an approach to make nonintrusive, automated measurements of ecosystem-scale H 2 fluxes both above and below the forest canopy at the Harvard Forest in Petersham, Massachusetts, for over a year. We used three flux-gradient methods to calculate the fluxes: two similarity methods that do not rely on a micrometeorological determination of the eddy diffusivity, K, based on (1) trace gases or (2) sensible heat, and one flux-gradient method that (3) parameterizes K. We quantitatively assessed the flux-gradient methods using CO2 and H2O by comparison to their simultaneous independent flux measurements via eddy covariance and soil chambers. All three flux-gradient methods performed well in certain locations, seasons, and times of day, and the best methods were trace gas similarity for above the canopy and K parameterization below it. Sensible heat similarity required several independent measurements, and the results were more variable, in part because those data were only available in the winter, when heat fluxes and temperature gradients were small and difficult to measure. Biases were often observed between flux-gradient methods and the independent flux measurements, and there was at least a 26 % difference in nocturnal eddy-derived net ecosystem exchange (NEE) and chamber measurements. H 2 fluxes calculated in a summer period agreed within their uncertainty and pointed to soil uptake as the main driver of H 2 exchange at Harvard Forest, with H2 deposition velocities ranging from 0.04 to 0.10 cm s−1.


Introduction
Atmospheric H 2 , with a global average mole fraction of 530 ppb (parts per billion; 10 −9 , nmol mol −1 ), exerts a notable influence on atmospheric chemistry and radiation.H 2 is scavenged by the hydroxyl radical (OH radical), thereby attenuating the ability of OH to scavenge potent greenhouse gases, like methane (CH 4 ) from the atmosphere, which classifies H 2 as an indirect greenhouse gas (Novelli et al., 1999).H 2 is also a significant source of water vapor to the stratosphere, and as such may adversely perturb stratospheric ozone chemistry (Solomon, 1999;Tromp et al., 2003;Warwick et al., 2004).The two major atmospheric H 2 sources are photochemical production from methane and non-methane hydrocarbons and combustion of fossil fuels and biomass (Novelli et al., 1999).The major H 2 sinks are L. K. Meredith et al.: A comparison of flux-gradient methods soil consumption, representing about 81 ± 8 % of the total sink, and oxidation by OH being about 17 ± 3 % based on a global inversion of sparse atmospheric H 2 measurements (Xiao et al., 2007).The major sources and sinks are nearly balanced so atmospheric H 2 concentrations are stable.Although the global atmospheric H 2 budget has been derived through a variety of methods, it remains poorly constrained at the regional level and disputed at the global level, and a process-based understanding is lacking (as reviewed by Ehhalt and Rohrer, 2009).Therefore, there are large uncertainties in the estimated impact of changes to the H 2 biogeochemical cycle that might arise from changes in energy use, land use, and climate.Field and laboratory measurements are needed to improve the process-level understanding of atmospheric H 2 sources and sinks, especially regarding its sensitivity to biological activity in the soils.
The paucity of data on key H 2 processes is related to difficulties in measuring sources and sinks in situ, in particular the soil sink.H 2 soil uptake is typically measured using soil flux chambers (e.g., Conrad and Seiler, 1980;Lallo et al., 2008;Smith-Downey et al., 2008).Chamber measurements are labor intensive and typically yield infrequent and discontinuous data that are difficult to scale up to the landscape scale, especially in ecosystems with high spatial heterogeneity (Baldocchi et al., 1988).Although chambers are subject to artifacts if not implemented carefully (Davidson et al., 2002;Bain et al., 2005), they are well suited for process-level studies.Boundary layer methods have been used to calculate H 2 soil uptake rates from H 2 mole fraction measurements and assumptions about atmospheric winds and mixing, boundary layer height, and/or the uptake rates of other trace gases (Simmonds et al., 2000;Steinbacher et al., 2007).The need for assumptions in these methods can introduce large uncertainties into reported H 2 fluxes.Most studies have focused on soil processes, and we have little information about any other processes in the canopy that affect H 2 .
Despite the limitations of these traditional methods, few alternatives are available for the measurement or estimation of atmospheric H 2 fluxes.The gas chromatographic methods used to measure H 2 are slow (> 4 min), which precludes use of eddy covariance techniques that rely on high-frequency measurement of the covariation of the trace gas mole fraction with the vertical wind component.In such cases, where no high-accuracy fast-response instrument (≥ 1 Hz sampling frequency) is available, a variety of micrometeorological methods under the umbrella of flux-gradient theory can be used to non-intrusively measure the biosphere-atmosphere exchange of trace gases from relatively slow ( 1 Hz) measurements of vertical gradients of trace gas mole fractions (Fuentes et al., 1996;Meyers et al., 1996).Flux-gradient methods assume that fluxes are equal to the gradient of the quantity in question scaled by the rate of turbulent exchange.These methods can be automated for near-continuous data collection, and by averaging over time, the spatial heterogeneity within the tower footprint is integrated (Baldocchi et al., 1988).As a result, flux-gradient methods avoid some of the aforementioned problems that arise from the use of flux chambers and box models.These methods are also useful in cases where fluxes are small and fast-response instruments lack the precision to resolve deviations in trace gas mole fraction from background levels (Simpson et al., 1998).The structure of the turbulence below the canopy can make eddy covariance measurements difficult, and flux-gradient methods may be a superior choice (Black et al., 1996).Fluxgradient methods do rely on simplifying assumptions, such as the one-dimensional representation of a three-dimensional process, the existence of steady-state conditions, horizontal homogeneity in the source-sink distributions, and flat topography (Baldocchi et al., 1988).
Recognizing the potential for flux-gradient methods for determining the H 2 flux, we designed, constructed, and evaluated a fully automated, continuous measurement system for determining H 2 fluxes in a forest ecosystem by three different flux-gradient methods: (1) trace gas similarity, (2) sensible heat similarity, and (3) K parameterization.Critical issues in instrument design and performance for making flux-gradient measurements were considered, including instrument precision, sampling error, and measurement accuracy.The validity of each flux-gradient method was demonstrated by application to CO 2 and H 2 O fluxes, for which simultaneous eddy covariance or chamber flux measurements were available for comparison.Finally, H 2 fluxes were calculated using the flux-gradient methods in the above-and below-canopy environment.The approach and findings could be extended to other trace gases that present similar measurement challenges to H 2 .

Measurement site
The study site, Harvard Forest (42 • 32 N, 72 • 11 W; elevation 340 m), is located in Petersham, Massachusetts, approximately 100 km west of Boston, Massachusetts.The largely deciduous 80-to 115-year old forest is dominated by red oak, red maple, red and white pine, and hemlock (Urbanksi et al., 2007).Harvard Forest soils are acidic and originate from sandy loam glacial till (Allen, 1995).Measurements presented in this study were made from November 2010 to March 2012 at the Environmental Measurement Station (EMS) (Wofsy et al., 1993), located in the Prospect Hill tract of Harvard Forest.The station is surrounded for several kilometers by moderately hilly terrain and forest that has been relatively undisturbed since the 1930s.Previous work at the site found no evidence for anomalous flow patterns that would interfere with eddy-flux measurements (Moore et al., 1996), the local energy budget has been balanced to within 20 % (Goulden et al., 1996), and about 80 % of the turbulent fluxes originate within a 0.7-1 km radius of the tower (Sakai et al., 2001;Urbanski et al., 2007).

Instrumentation
An instrument system was designed to measure mole fraction gradients and ancillary variables needed to calculate H 2 fluxes above and below the forest canopy at four heights (Fig. 1).H 2 mole fractions were measured with a gas chromatograph (GC, model 6890, Agilent Technologies) equipped with a pulsed discharge helium ionization detector (HePDD, model D-3 PDD, Valco Instruments Co. Inc. (VICI)) and two columns (HayeSep DB, 1/8 in.OD stainless steel, 2 m pre-column 80/100 and 4.5 m analytical column 100/120, Chromatographic Specialties).A 2-position, 12port injection valve (UW type, 1/16 in.ports, M-type rotor, purged housing, VICI) was used to introduce 2 mL samples and control the chromatographic timing.The GC-HePDD was run with research-grade helium carrier gas (99.9999 % purity, Airgas) and was configured as in Novelli et al. (2009), with the exception of a shorter pre-column to reduce the analysis time to 4 min (Meredith, 2012, Fig. 2-3).Sample loop pressure (transducer model 722B13TFF3FA, MKS Instruments) and temperature (thermistor affixed to sample loop) were measured to quantify the exact number of moles of sample air injected.The GC sample stream was dried using a Nafion drying tube (MD-070-12S-2, Perma Pure).CO 2 and H 2 O mole fractions were measured at four heights using a pair of nondispersive, infrared gas analyzers (model 6262, LI-COR) configured to measure vertical gradients (Dunn et al., 2009).
Gas sampling inlets were installed at 24 and 28 m on the EMS tower and at 0.5 and 3.5 m on a small tower erected 14 m to the north-northwest of the EMS tower in an area of undisturbed vegetation and soil (Fig. 2).The leaf foliage distribution (Fig. 2) during summer at the Harvard Forest EMS site is top heavy and is important to consider for its interactions with the turbulent structures at the site (Parker, unpublished data).In this manuscript, we refer to measurement heights by their relation to the median forest canopy height (18 m; Fig. 2) when relevant to the topic at hand: above canopy for 24 and 28 m and below canopy for 0.5 and 3.5 m.Tubing lines (OD 1/4 in., Synflex ® ) of lengths 45-55 m were installed with inline PFA filter holders (47 mm PFA, Cole Palmer) containing 0.2 µm pore size filters (Zefluor ™ , Pall Corporation) and inverted Teflon funnels to protect the tubing inlet from precipitation.During the normal sampling routine, H 2 GC-HePDD measurements were made at 28, 24, 3.5, and 0.5 m over a 16 min cycle.Meanwhile, the IRGAs measured 1 Hz CO 2 and H 2 O mole fractions in either the 28 and 0.5 m or the 24 and 3.5 m gas sample streams (250 mL min −1 ), switching on 1 min intervals.Each sample stream was mixed in 2 L glass integrating volumes with fans (Meredith, 2012;Fig. A1).Three times per week, we used a nulling procedure to assess measurement accuracy, in which all gas streams sampled a common gas inlet installed at 2 m connected to an unmixed 25 L nulling reservoir (glass carboy).
Custom-designed small-footprint aspirated temperature shields (Dunn et al., 2009) containing thermistors (YSI) were colocated with the gas inlets.Temperature data were corrected for offsets between the sensors, which were  determined on two occasions by temporarily colocating temperature shields.Three-dimensional sonic anemometers were installed on the small tower at 2 m (CSAT3, Campbell Scientific) and on the EMS tower at 29 m (Applied Technologies).Three-dimensional winds are rotated to the plane where the mean vertical wind is zero (Wilczak et al., 2001).Data acquisition/logging and sample valve control was handled by Campbell Scientific CR10X data loggers.GCwerks (version 3.02-2, Peter Salameh, Scripps Institute of Oceanography, http://gcwerks.com)was used for gas chromatograph control and peak integration (example chromatogram in Meredith, 2012, Fig. 2-4).Independent eddy covariance CO 2 and H 2 O flux measurements were made at 29 m (Urbanski et al., 2007).The soil-atmosphere flux of CO 2 was measured using an automated flow-through flux chamber system located approximately 0.6 km south of the EMS tower with similar soils and vegetation.The system consisted of an infrared gas analyzer (IRGA, LI-7000, Li-Cor Inc., Lincoln, NE), six automated soil chambers (20 cm diameter), a data logger, and the gas control system.Only data from the three control chambers not placed on a root-exclusion plot are used here.Every halfhour all six chambers were measured in succession.Chamber air was circulated through a flow meter to the IRGA and back to the chamber, and pressure was equalized in and out of the chamber by venting.CO 2 fluxes were calculated from the increase in CO 2 concentration following chamber lid closure over the 2-3 min measurement period.

Calibration
Trace gas measurements were calibrated every 1.5 and 3 h for H 2 and CO 2 , respectively.GC-HePDD calibrations were based on duplicate sampling from an H 2 calibration standard of compressed air from Niwot Ridge in an electropolished stainless steel tank (34 L, Essex Industries) referenced against the NOAA Earth System Research Laboratory Global Monitoring Division (ESRL/GMD) primary standards on their in-house instrument before (501.5 (±10) ppb) and after (499.0 (±7.5) ppb) the experiment.H 2 mole fractions were stable in our calibration cylinder, as has been reported for other steel cylinders, but not for many aluminum tanks (Jordan and Steinberg, 2011).The GC-HePDD response was stable over the study period (Meredith, 2012, Figs. 2-9 and 2-10).
CO 2 calibrations were performed using three CO 2 span gases (HI ∼ 500 ppm, MID ∼ 420 ppm, and LO ∼ 350 ppm) traceable to the NOAA and World Meteorological Organization (WMO) CO 2 scales.IRGA zeros were determined by periodically passing ambient air through a CO 2 scrubber (soda lime) and desiccant (magnesium perchlorate) trap.Water vapor measurements were calibrated on one occasion with a dew point generator (model 610, LI-COR).Simultaneous, colocated mole fraction measurements of CO 2 and H 2 O (instrument from this study versus the independent EMS system) were used to derive scaling factors for comparison to the EMS eddy covariance fluxes from the slope of the linear regression forced through the origin: CO 2 (slope = 1.0047,R 2 = 0.84) and H 2 O (slope = 1.085,R 2 = 0.98).A detailed description of the instrument design, parts, and calibration is available online (Meredith, 2012).

Instrument precision
Application of the three flux-gradient methods relies on the ability to resolve vertical gradients in mole fraction or temperature.The problem is not trivial as vigorous turbulent mixing can cause the gradients, even those originating from strong source or sink processes, to be quite small.In this study, we aimed to resolve H 2 gradients both above and below the forest canopy at Harvard Forest.In previous work over a grassland in Quebec, H 2 gradients were typically < 5 % between inlets at 3.5 and 0.5 m (Constant et al., 2008), but somewhat larger at night under stable nocturnal conditions.Although there were no previous measurements above a forest canopy in the literature, H 2 mole fraction gradients in the turbulent above-canopy environment were expected to be much smaller than below the canopy.Assuming that H 2 uptake fluxes, F H 2 , are represented by , where v d is the H 2 deposition velocity of 0.07 cm s −1 and [H 2 ] is the hydrogen concentration (Conrad and Seiler, 1980), we anticipated needing relative mole fraction measurement precisions for high levels of turbulence of 0.07 and 0.7 % to resolve meaningful gradients under unstable and stable stratification, respectively (Wesely et al., 1989).The required precisions would be 0.4 and 4 % under unstable and stable stratification under low-turbulence conditions, but under the latter, eddy covariance measurements may not be valid.
Commonly used H 2 detectors were not adequate for the desired measurement precision: reported precisions were 0.5-5 % for mercuric oxide reduced gas detectors (Novelli et al., 1999(Novelli et al., , 2009;;Constant et al., 2008;Simmonds et al., 2000) and 1.1-2 % for N 2 O-doped electron capture detectors (Barnes et al., 2003;Moore et al., 2003).Therefore, we used the GC-HePDD to measure H 2 mole fractions because it had been used to measure H 2 with precisions of 0.06 % (1σ ) under laboratory conditions (Novelli et al., 2009).Our system achieved median H 2 measurement precisions over the field study between 0.06 and 0.11 %, and nearly always better than 0.3 % (95 % level) (Meredith, 2012, Fig. 2-8), which were on par with the laboratory-based configuration (Novelli et al., 2009) and at a 10-fold improvement over methods previously deployed to the field.The IRGA instruments measured mole fractions of CO 2 and H 2 O with high precisions as well: between 0.025 and 0.043 and between 0.04 and 0.05 % (Meredith, 2012, Fig. 2-12).The high precision capability was critical for measuring the small vertical differences in mole fractions (Sect.3.1).

Sampling error
We used well-mixed integrating volumes to smooth out the temporal fluctuations in gas sample streams to retain relevant information from each gradient level and reduce sampling error (Woodruff, 1986).An integrating volume (V ) acts as an exponential filter on a gas flow (Q) with an e-folding timescale (τ V = V /Q) that is set to span the time (T c ) to measure both H 2 mole fractions of a given gradient pair: specifically, τ V ∼ T c = 8 min.The sampling error increases with the ratio of the timescale of the measurement cycle (T c ) to the timescale of the scalar in turbulent flow (τ T ).Specifically, percent sampling error = 6 T c τ T 0.8 , and sampling errors can be in excess of 50 % for a 90 min GC-based measurement cycle (Woodruff, 1986;Meyers et al., 1996).The integrating volumes avoided sampling errors of around 10 % for our GC configuration that would have resulted from intermittent sampling with a single instrument (see Sect. 2.4.3)assuming τ T = 200-300 s (Baldocchi and Meyers, 1991).Sampling intervals interact systematically with the autocorrelation of the time series arising from the eddy structures (Woodruff, 1986).Integrating volumes (known also as buffer volumes) have been used in previous studies to dampen temporal fluctuations in trace gas mole fractions for flux-gradient measurements (Griffith et al., 2002), for contributions of advection (Yi et al., 2008), and for flask sampling (Bowling et al., 2003).A block-averaging effect is accomplished in flux-gradient measurements that trap the compound of interest over periods of minutes or hours (Müller et al., 1993;Goldstein et al., 1995Goldstein et al., , 1996Goldstein et al., , 1998;;Meyers et al., 1996) and eddy accumulation methods that use high-precision differential collection apparatus to trap and then sample air from upand down-drafts to determine the flux (Businger and Onlcey, 1990;Guenther et al., 1996;Bowling et al., 1998).
The effect of the integrating volumes is shown by example with our CO 2 measurements during a period when one gas stream (3.5 m level) passed through the well-mixed integrating volume, while the other (0.5 m level) stream bypassed its integrating volume at a flow rate of 500 mL min −1 .Each point represents the mean of the last 45 s of 1 Hz mole fraction measurements made for 1 min intervals at each level.The variability of CO 2 mole fractions in the bypassed stream was higher than the stream passing through the integrating volume, and that variability carried over to the mole fraction gradients (Fig. 3).The effect of the integrating volume was simulated by the exponential moving average (τ V = 2 min) of the 0.5 m level data.This example provides insight into the natural variability in trace gas mole fractions at the forest.Without using integrating volumes to reduce sampling error, the lower-frequency measurements of H 2 (T c = 8 min) would poorly represent the true vertical distribution of H 2 at the forest, which would increase the error in the flux-gradient calculations.

Measurement accuracy
High measurement accuracy was required to measure small differences in concentration between two sampling inlets.Any inherent nonzero differences in the measurement, here referred to as measurement bias, would cause errors in the gradient measurement and had to be accounted for.To avoid one potential source of measurement bias, we measured mole fraction gradients using a single instrument that alternately sampled from a pair of inlets.The alternative -simultaneously sampling a pair of gas inlets using separate instruments -could produce a measurement bias due to mismatch in the calibrations or drift between the two instruments (Woodruff, 1986).Only with very rigorous application of zeroing and intercomparison procedures can that method be applied with confidence, and even then, sudden changes in the offset between instrument sensitivity may occur without obvious causes (Bocquet et al., 2011).Operating two GC-HePDD instruments or separate columns would have a high potential for bias due to chromatography effects or differential detector sensitivity.
In addition to choosing a single instrument setup, we incorporated a nulling routine into the sampling procedure to diagnose measurement biases under a null condition when no difference should be detected.Differences may arise in the sample line segments dedicated to each inlet level due to leaks or physical interactions.During the nulling routine, run three times weekly at different times of the day, each inlet During this period, the 0.5 m inlet sample stream bypassed the integrating volume (a, blue diamonds), while the 3.5 m inlet sample stream passed through the integrating volume and was physically smoothed (a, pink points).The effect the integrating volume would have had on the 0.5 m inlet measurements was simulated with an exponential moving average (a, dark-blue points).The mole fraction gradients from the physically smoothed 3.5 m inlet data and the 0.5 m inlet measurements bypassing the integrating volume (b, blue diamonds) were more variable than the physically smoothed 3.5 m inlet data and the computationally smoothed (exponential filter) 0.5 m inlet data (b, dark-blue points).
sampled ambient air from a 25 L glass carboy, which can be thought of as a large integrating volume.The volume was pre-flushed at 3 L min −1 (τ V = 8.3 min) and then sampled by all sample streams at 2 L min −1 total flow (τ V = 12.5 min).Similar systems have been engineered to sample air from the same inlet height by temporarily placing inlets at the same height or frequently interchanging the inlet positions (Goldstein et al., 1995;Meyers et al., 1996;Wesely et al., 1989), but that ambient air is still subject to atmospheric variability.Our goal was to have an automated nulling procedure where all inlets would sample from the same reservoir of air that had nearly the same thermal, barometric, and chemical characteristics as the ambient air, but with the high-frequency atmospheric variability filtered out.An example of our nulling procedure on the morning of 2 August 2011 (Fig. 4) shows the transition of the CO 2 sampling system from tower measurements to the nulling volume as the integrating volumes flushed.The null bias between inlet heights for each H 2 , CO 2 , and H 2 O gradient pair was calculated after detrending with a second-order polynomial to account for the drift in concentrations due to lowerfrequency atmospheric variability.In this example, the apparent null bias between the 24 and 28 m (0.5 and 3.5 m) canopy inlets was 1.06 (1.63) ppb and 0.92 (−0.33) ppm for H 2 and CO 2 , respectively.Over the entire study period, the median ) each of the four gas lines that were usually connected to the 28, 24, 3.5, and 0.5 m inlets.The full CO 2 time series (upper plot) shows that it took over 20 min to flush the integrating volumes and sample lines of the memory of the strong nighttime CO 2 mole fraction gradient.Over the procedure, each inlet was sampled twice for H 2 (lower right) and eight times for CO 2 (lower left, shaded portion upper plot) and H 2 O (not shown).Second-order polynomials were used to detrend the data to remove the drift in mole fractions of CO 2 and H 2 over the nulling period.
H 2 null bias was −0.17 and −0.01 ppb for the respective gradient pairs, and was approximately normally distributed (1σ ; −077 to 0.52) (Fig. 5).The observed null bias was smaller than the combined analytical uncertainty (minimum detectable difference given instrument precision √ 2σ ), so it was not possible to distinguish it from zero, and the H 2 bias between the inlet lines could be ignored.
The nulling procedure was a valuable tool to diagnose bias to between sampling lines, though in retrospect mixing the reservoir, increasing its volume, using multiple reservoirs in series, or filling the reservoir from a level with less variability (i.e., farther from the soil) in order to reduce concentration variability and drift over the nulling procedure would yield better data on the null bias.

Gradients and flux-gradient methods
In this section, we present mole fraction gradients measured at Harvard Forest.The theory and applicability of three fluxgradient methods are discussed and the filtering criteria are described.

Gradient measurements
This study was the first application of the GC-HePDD to measure H 2 gradients in the field.We observed statistically bias between sampling lines for the 24 m and 28 m (left plots) and the 0.5 m and 3.5 m 371 (right plots) H2 measurements as determined by the nulling procedure.The median and 372 the 1σ confidence intervals are reported for each distribution and are compared with 373 minimum detectable difference given the median instrument 1σ precision (grey shading).374 375 The nulling procedure was a valuable tool to diagnose bias to between sampling 376 lines, though in retrospect mixing the reservoir, increasing its volume, using multiple 377 reservoirs in series, or filling the reservoir from a level with less variability (i.e., farther 378 Nov10 Apr11 Sep11 Feb12 of the measurement bias between sampling lines for the 24 and 28 m (left plots) and the 0.5 and 3.5 m (right plots) H 2 measurements as determined by the nulling procedure.The median and the 1σ confidence intervals are reported for each distribution and are compared with minimum detectable difference given the median instrument 1σ precision (gray shading).significant H 2 gradients both above and below the canopy at Harvard Forest.Below-canopy H 2 gradients were typically larger than above canopy by a factor of 10 because of the reduced turbulence and proximity to the H 2 sink beneath the forest canopy (Fig. 6).Gradients exhibited a diurnal pattern, with stronger gradients at night during calm atmospheric conditions when H 2 lost to soil uptake was not replenished by H 2 in the overlying air mass.The 26 m H 2 gradients were often close to the precision of the GC-HePDD system, especially during turbulent daytime periods.As a result, raw measurements were averaged to reveal the environmental gradients and fluxes, as has been previously required for these types of measurements above the forest canopy (Simpson et al., 1997).For example, the 26 and 2 m gradients of H 2 and CO 2 averaged into 2 h bins for the month of July clearly showed the underlying environmental signal (Fig. 7).Soil uptake of H 2 led to positive H 2 gradients at both levels.Above-canopy CO 2 gradients oscillated from positive during the day, when photosynthetic uptake of CO 2 by the forest canopy was the dominant process, to negative at night, when ecosystem respiration was the overwhelming process.Respiration was the dominant process below the forest canopy, as indicated by consistently negative 2 m gradients.Our CO 2 mole fraction measurements agree well with simultaneous CO 2 profile measurements at the EMS site (unpublished Harvard Forest EMS data; Urbanski et al., 2007;Wehr et al., 2013).
Higher signal-to-noise ratios could have been achieved for H 2 gradients measured over a larger vertical height ( z) difference.However, the 4 m z for the 24 and 28 m inlets was gradients.Our CO2 mole fraction measurements agree well with simultaneous CO2 404 profile measurements at the EMS site (unpublished Harvard Forest EMS data; Urbanski 405 et al., 2007;Wehr et al., 2013).406  limited by the height of the tower above the forest canopy.
Close to the soil sink, H 2 gradients were greater in magnitude than the instrument precision.For future studies, inlets below the canopy could be installed farther from the soils (> 0.5 m) and placed closer together ( z < 3 m) so as to still measure statistically significant gradients that may be more linear than observed here.For studies with taller towers extending beyond the vegetative canopy, a greater distance between the inlets ( z > 4 m) could increase the mole fraction gradient signal-to-noise ratio, but should not exceed relevant eddy length scales, which can range from the mechanical eddy size forced by obstruction of the wind by the trees (∼ 5 m) to the lower planetary boundary layer buoyant eddy size (∼ 100 m).At Harvard Forest, the dominant fluxcarrying eddy frequency is between 0.01 and 0.2 Hz, which corresponds to eddy scales of 10 to 200 m for mean winds around 2 m s −1 (Goulden et al., 1996)

Flux-gradient methods
Flux-gradient methods were used to calculate the flux of a trace gas from the measured gradient and a number of different parameters.In most presentations of flux-gradient methods, an analogy is drawn to Fick's first law for molecular diffusion, such that it is directly or implicitly stated that conservative fluxes, F C 1 , of gas molecules (Eq. 1) are equal to the product of their mole fraction gradient ( C 1 / z) in the down-gradient direction and the eddy diffusivity, K, which depends on the intensity of turbulent mixing over time intervals appropriate to the scale of the process (Baldocchi et al., 1995;Goldstein et al., 1996Goldstein et al., , 1998;;Dunn et al., 2009).
In this context, denotes the mean difference between 30 min measurements at each level of a vertical gradient pair and ρ n is the molar density of air.The turbulent mixing coefficient K is inferred or parameterized, unlike the molecular diffusion coefficient in Fick's first law that can be derived from first principles using molecular kinetic theory.Fluxgradient methods assume that, at a given time and place, the eddy diffusivity is invariant for mass, heat, and momentum (e.g., Reynold's analogy) (Garratt and Hicks, 1973;Sinclair and Lemon, 1975;Baldocchi et al., 1988).
In general, to calculate trace gas fluxes, flux-gradient methods require that there are no sources or sinks of the trace gas or the reference species between the gradient inlets.This was not a problem in our study because gradient pairs were located either above or below the forest canopy (Fig. 2), and whole-canopy gradients were only used when gas fluxes from the canopy should have been minimal.For the methods to work, trace gas species should not have significantly different vertical distributions of sources and sinks.Furthermore, the trace gas in question should be inert over the timescale of the flux-gradient measurement, meaning that the timescale of turbulence (200-300 s in such ecosystems) should not exceed the timescale of chemical reactions (Baldocchi et al., 1988;Baldocchi and Meyers, 1991).
Flux-gradient theories have been found to overestimate scalar fluxes within the roughness sublayer, which is the region from the ground to 2 or 3 times the canopy height because the turbulent structure is influenced (mechanically and thermally) by the canopy elements (Raupach and Thom, 1981;Baldocchi and Meyers, 1988;Högström et al., 1989;Simpson et al., 1998) and tall vegetation (Garratt, 1978).That said, the theory might be less compromised than previously thought above forests even at just 1.4 times the canopy height (Simpson et al., 1998).In our case, the tower height (30 m) constrained the height of the above-canopy inlets, which were centered on approximately 1.2 times the canopy height within the roughness sublayer.We evaluated the performance of flux-gradient methods against independent flux measurements of CO 2 and H 2 O to validate the use of fluxgradient theories in this region.
Below-canopy environments are characterized by low wind speeds and intermittent turbulent events that can violate flux-gradient theory assumptions.Counter-gradient transport of heat, momentum, and trace gases has been documented beneath plant canopies and may severely compromise fluxgradient methods (Shaw, 1977;Raupach and Thom, 1981;Baldocchi and Meyers, 1988;Amiro 1990;Baldocchi and Meyers, 1991).On the other hand, flux-gradient methods have been preferred over eddy covariance techniques for measuring surface-atmosphere fluxes within a few meters of the surface layer (Fitzjarrald and Lenschow, 1983;Gao et al., 1991;de Arellano and Duynkerke, 1992;Wagner-Riddle et al., 1996;Taylor et al., 1999;Dunn et al., 2009).Intermittent turbulent transport events may become less important near the ground, where the sources or sinks of tracers can be large; therefore, Meyer et al. (1996) argue that the flux-gradient relationships near the forest floor are valid and their application is justified.
The availability of different parameters and the applicability of a given flux-gradient method varied with time and location in our experiment (Table 1).Whole-canopy fluxes could not be calculated during the growing season daytime because of canopy interference.The sensible heat flux method was only applied outside the 2011 growing season because the fans in the aspirated temperature shields were damaged by a lightning strike on 28 May 2011, which was not apparent from the data, and was only discovered 6 months later.In this study, we determined H 2 fluxes using three different flux-gradient methods: trace gas similarity, sensible heat (H ) similarity, and K parameterization.

Trace gas similarity
The first method, trace gas similarity, assumes similarity of H 2 fluxes and gradients to CO 2 or H 2 O flux-gradients that can be measured by an independent method and is often referred to as a modified Bowen ratio (MBR) technique.The flux (F C 1 ) of a given trace gas is calculated from its mole fraction gradient ( C 1 / z) and measurements of the flux (F C 2 ) and gradient ( C 2 / z) of a second reference trace gas using Eq. ( 2) (Meyers et al., 1996;Goldstein et al., 1996Goldstein et al., , 1998;;Lindberg and Meyers, 2001).
The trace gas eddy diffusion coefficients (K) for CO 2 and H 2 O were compared (slope = 1.07,R 2 = 0.68) at Harvard Forest in the past (Goldstein et al., 1996).However, it is important to note that the idea of similarity applied in this method is more general than diffusional theory and calculation of K. Trace gas similarity only assumes linear transport of trace gases considered to be inert over the spatial and temporal scale of the measurement and that have a similar spatial distribution of sources and sinks.The method is therefore more general than is often attributed to flux-gradient methods.K is not calculated explicitly by similarity methods.These points also apply to the sensible heat similarity method.
In previous work, the trace gas similarity method was used to derive H 2 fluxes using CO 2 as the reference gas over a weeklong manual collection experiment in an Alaskan boreal forest with promising but limited results (Rahn et al., 2002).In this study, independent flux measurements of CO 2 and H 2 O via eddy covariance and of CO 2 via automated flux chambers were available above and below the forest canopy, respectively.We applied the trace gas similarity method both above and below the canopy all year round and to the wholecanopy gradient outside the growing season as data availability allowed.

Sensible heat similarity
The second method, sensible heat similarity, assumes similarity of H 2 fluxes and gradients to the sensible heat flux and temperature gradient (Meyers et al., 1996;Liu and Foken, 2001;Dunn et al., 2009).The sensible heat flux (H ) and temperature gradient ( T / z) are related by the turbulent transfer coefficient for heat K H (Businger, 1986), where ρ m is the mass density of air and c p is the specific heat capacity of air.Following Liu and Foken (2001) and Dunn et al. ( 2009), the sensible heat flux was obtained by applying a water vapor correction to the buoyancy flux derived from sonic anemometer temperature measurements, and the crosswind term was neglected because it should be small compared to the other terms, where w T s is the sonic heat flux and q represents the specific humidity (kg H 2 O kg air −1 ).Equation ( 4) gives the fluxgradient form (Eq. 1) for sensible heat, which can then be used to determine the H 2 flux (Eq.5) by inferring K H .
The eddy diffusion coefficients for trace gases (K) and heat (K H ) were measured at Harvard Forest in the past, agreeing within 12 ± 10 % when compared to both H 2 O and CO 2 (Goldstein et al., 1996).The method has been applied to determine hydrocarbon fluxes above a forest canopy (Goldstein et al., 1996) and the specific method adopted in our study was developed for the calculation of CO 2 and H 2 O fluxes close to the ground (Dunn et al., 2009).This is an MBR technique (Liu and Foken, 2001) that can be used to determine sensible (and latent heat) fluxes (errors of less than 10 %) and it circumvents errors (often on the order of 20-30 %) associated with methods that require closure of the measured surface energy budget, such as the modified Bowen ratio energy balance (MREB) method (Sinclair and Lemon, 1975;Baldocchi et al., 1988;Liu and Foken, 2001, and references therein).
In previous work, annual H 2 fluxes were determined using the MREB method over grassland in Quebec (Constant et al., 2008).In this study, the sensible heat similarity method was applied below the canopy during months when the aspirated temperature shields were functioning (November 2010 to May 2011 and December 2012 to March 2012).

K parameterization
The third method, K parameterization, invokes Monin-Obukhov similarity theory to parameterize a turbulent exchange coefficient (K) from sonic anemometer measurements, and is often referred to as an aerodynamic method (Monin and Obukhov, 1954;Simpson et al., 1998).K can be estimated by means of a variety of aerodynamic methods derived from energy or momentum balances (Högström et al., 1989;Celier and Brunet, 1992;Simpson et al., 1998;Foken, 2006).For example, K can be determined from where u * is the friction velocity (a characteristic velocity scale calculated from the square root of covariance between vertical and horizontal wind), k is von Karman's constant (taken as 0.4), z is the height above the ground, d is the zero-plane displacement height, and φ m is the diabatic influence function for momentum (Monin and Obukhov, 1954;Simpson et al., 1998).The Monin-Obukhov length (L ) is used to determine φ m from the empirical descriptions outlined by Eqs. ( 22a) and (22b) in Foken (2006).The method has been applied close to the surface (Fritsche et al., 2008) and above the forest canopy (Simpson et al., 1997(Simpson et al., , 1998)).In this study, the K parameterization method was applied below the canopy.Assuming that z = 2 m (the height of the u * measurement), the displacement height was inferred empirically to be around 1.63 m (z − d = 0.37 m) by comparing parameterized K values to the values for K determined from the chamber flux and concentration gradient using Eq. ( 1).The determination of d is often problematic (Raupach and Thom, 1981).Physically, d represents an adjustment of the basis height to reflect the displacement by the surface features of the profiles of micrometeorological variables fundamental to the K parameterization at hand.The inferred value for d was consistent throughout the study period and may reflect the effect of below-canopy environment on the turbulent fluxes at the EMS site.

Data filtering
Data were filtered to reject unrealistic values and to appropriately apply flux-gradient methodology.By their nature, the trace gas similarity and sensible heat similarity methods are not valid when the gradient of the comparative species (Eqs. 2 and 5, denominator) approaches zero or changes sign over the measurement period.Similarity methods cannot work during such periods, so we limited flux calculations to periods when gradients in the denominator exceeded their measurement precision.In general, the fluxes calculated during dawn and dusk periods are not included in averages or comparative assessments because of the tendency for conditions to change such that the observed fluxes and gradients provide no information about the turbulence.For example, conditions pass through an isothermal point when air and surfaces have the same temperature so that there is no gradient driving a heat flux; when air is saturated, there is no gradient driving a water vapor flux; and when photosynthesis ceases, CO 2 gradients change sign.
Data were rejected during rainy periods with more than 0.2 mm of rain per 30 min (Baldocchi and Meyers, 1991).Periods with u * < 0.07 m s −1 and u * < 0.17 m s −1 were excluded for below-and above-canopy data, respectively, because of poorly developed turbulent conditions (Goulden et al., 1996;Liu and Foken, 2001;Bocquet et al., 2011) and potential for advective fluxes driven by drainage flows on sloping terrain (Yi et al., 2008).We excluded unrealistic values of the implied turbulent transfer coefficients, K, such that 0 ≤ K ≤ 0.5 m 2 s −1 and 0 ≤ K ≤ 5 m 2 s −1 for the belowcanopy and the whole/above-canopy fluxes, respectively.We did not filter based on the wind sector because we found no interference from the tower and instrument shed to the east (45 to 180 • ).We considered quantile-quantile plots of the residual between flux-gradient methods and eddy covariance fluxes, and excluded clear outliers: residual absolute values > 20 and > 10 mmol m −2 s −1 for CO 2 and H 2 O fluxes.Additional filters were applied to the sensible heat flux method to retain only reasonable sonic and sensible heat flux values: w T s < 0.1 K m s −1 and −100 < H < 200 W m −2 .Finally, data were excluded when the temperature gradient was greater than 0 • C at the same time as H was greater than 3 W m −2 , which occurred at sunrise and sunset because of the rapidly changing conditions over the 30 min averaging interval (Meyers et al., 1996;Dunn et al., 2009).
Filtering is known to result in a large amount of rejected data in flux-gradient methods (e.g., net 57 % data loss in detailed analysis by Bocquet et al., 2011).Conditions at the site, external to the measurement system, resulted in a 13, 19, and 6 % loss of data based on the above-canopy u * , belowcanopy u * , and precipitation filter criteria, respectively.Filtering criteria related to the flux methods resulted in a raw data loss over the experimental period of around 40-82 % for trace gas similarity, 87 % for sensible heat similarity, and 18-26 % for K parameterization.

Flux-gradient methods: evaluation and application
Using independent flux measurements of CO 2 and H 2 O derived from eddy covariance and chamber techniques, we produced a quantitative guide (Table 2) for the performance of each flux-gradient method (trace gas similarity, sensible heat similarity, and K parameterization) at each measurement location (above, whole, and below canopy) for different seasons and times of day.We compared measurements during summer (23 June 2011 to 16 October 2011) and winter (15 November 2011 to 28 February 2012) during daytime (10:00-16:00) and nighttime (21:00-05:00) periods.We assigned a performance flag of good, fair, or poor based on the statistical tests described in Table 2, and we indicated statistically significant correlations with * and zero bias with †.In the framework of this study, a negative flux indicates uptake by the underlying biosphere, while a positive flux indicates emission.

Comparing eddy covariance and chamber CO 2 flux measurements
We assessed the consistency between the eddy covariance and chamber CO 2 flux data before using these independent flux measurements to evaluate our flux-gradient methods.Eddy covariance measurements represent the entire ecosystem flux and chamber measurements just the soil flux.Studies comparing nocturnal CO 2 eddy flux with chamber measurements often report significant discrepancies of 20 to over 50 % (e.g., Goulden et al., 1996;Lavigne et al., 1997;Phillips et al., 2010).We did not expect exact agreement between eddy and chamber fluxes because of the mismatch in measurement footprints, but we expected to see fluxes of a comparable magnitude with similar diel patterns.
The relevant eddy-derived quantity to compare to surfacebased chamber measurements is the u * -filtered (Sect.3.3) nocturnal net ecosystem exchange (NEE), which accounts for the time lag between CO 2 emitted by respiration and its release from storage from beneath the canopy (storage in the sense of Wofsy et al., 1993).The mean chamber flux was approximately 2 times higher than NEE at night over the weeklong summer example period (Fig. 8), and the diel chamber fluxes were about 50% higher than a simple ecosystem respiration model (Urbanski et al., 2007).Nocturnal chamber and NEE CO 2 fluxes were correlated over the whole study period with a nonzero bias (r = 0.42 * , bias = −0.2µmol m −2 s −1 ) and had fair agreement in the summer and winter periods but poor daytime winter performance (Table 2).Chamber measurements overestimated CO 2 fluxes relative to NEE in the summer (26 % bias) and underestimated them in the winter (−50 % bias) (Table 2).The summer bias estimate does not include respiration from canopy elements (woody tissue and foliage), which can contribute up to 50 % of the total ecosystem respiration, but usually less than 20 % (Goulden et al., 1996;Lavigne et al., 1997;Davidson et al., 2002).Therefore, summertime chamber plus canopy respiration was likely at least 46 % higher than the NEE estimates in the median.This discrepancy could arise from overestimation of soil respiration by the chamber measurement method or a mismatch in the measurement footprint of tower-based EMS eddy fluxes and for ground-based chamber fluxes located 0.6 km to the south (Goulden et al., 1996).In spite of this offset, chamber fluxes and NEE were significantly correlated, which gave us confidence that the independent data sets both contained information on ecosystem fluxes of CO 2 that could be used to evaluate the flux-gradient methods.

Flux-gradient method evaluation
We evaluated the performance of the flux-gradient methods against independent CO 2 (eddy covariance and chamber) and H 2 O (eddy covariance) flux measurements.The abovecanopy trace gas similarity method systematically underestimated CO 2 fluxes and overestimated H 2 O fluxes, despite the agreement of the mole fraction measurements between the EMS system and this study (Sect.2.3).This difference could be translated to the inferred K values (Eq.1), where ) over the study period.This linear scaling relationship was used to correct trace gas similarity flux results reported in this section unless otherwise indicated (dashed lines in Figs. 8 and A1; Table A1) and generally improved performance of the trace gas similarity method (compare Tables 2 and A1).The results indicate that the turbulent eddy diffusivity was not invariant for CO 2 and H 2 O above the forest canopy as was assumed in the trace gas similarity method.The cause could be the different distribution of sources and sinks of the two trace gases.CO 2 and H 2 O both have significant (sink and source, respectively) fluxes originating in the canopy, but the CO 2 soil source was stronger relative to the above-canopy fluxes than the H 2 O soil source (e.g., 4-fold greater in the case of Fig. 8 versus Fig. A1) and is of opposite sign.
The performance of the flux-gradient methods is illustrated by weeklong periods from the summer (15-22 July 2011 period) and winter (1-14 February 2012) in Figs. 8 and 9, respectively.The trace gas similarity method above the forest canopy (Sect.3.2.2;26 m) reproduced the diel cycle measured via eddy covariance in both the summer (Fig. 8 for CO 2 and Fig. A1 for H 2 O) and winter (Fig. 9), as was reflected in the significant correlations between the measurements (Table 2).The trace gas similarity H 2 O flux on winter days had mostly poor performance, which may have been caused by the low signal-to-noise ratio of concentration gradients in the turbulent above-canopy environment when water vapor fluxes were low.The trace gas similarity method showed good performance for both CO 2 and H 2 O during the day above the forest canopy in   the summer period (Table 2) and for the whole measurement period (CO 2 : r = 0.88 * , bias = −0.1 † µmol m −2 s −1 ; H 2 O, r = 0.71 * , bias = −0.39mmol m −2 s −1 ).
The whole-canopy trace gas similarity method (Sect.3.2.2;centered on 10 m) could only be applied in the absence of interfering canopy sources or sinks between the gradient inlets (24 and 3.5 m), making this method more restricted in its application than the above-canopy method.However, we found that the whole-canopy method was an equal or superior method in some cases when trace gas gradients were small and difficult to detect above the forest canopy, such as during the winter and at night (Table 2).For example, day-and wintertime H 2 O fluxes from whole-canopy trace gas similarity were good, while that method applied above the canopy had poor performance (Table 2).An example of the 10 m trace gas similarity CO 2 fluxes is shown in Fig. 9.
CO 2 fluxes calculated by the sensible heat similarity method (Sect.3.2.2; 2 m) were significantly correlated with chamber measurements all year (daytime: r = 0.67 * , bias = 1.1 µmol m −2 s −1 ), but tended to overestimate daytime fluxes (Table 2).The method was only available during the wintertime, when heat fluxes and temperature gradients were small, which contributed to higher uncertainty in the results than for the other methods, as shown in Fig. 9. Agreement of the wintertime sensible heat similarity and eddy flux data across the forest canopy was poor for CO 2 and poor to fair for H 2 O (Table 2).
The performance of the K parameterization method below the forest canopy (Sect.3.2.4; 2 m) versus chamber measurements was good throughout the year (daytime: r = 0.74 * , bias = −0.12µmol m −2 s −1 ).These fluxes were significantly correlated with chamber data, and bias was low or insignificant (Table 2).K parameterization fluxes were correlated with eddy covariance fluxes in most cases, but typically were biased positively in the summer and negatively in the winter, as can be seen from the comparison with the NEE-derived simple ecosystem respiration model in Figs. 8  and 9.The overestimation of nocturnal summertime fluxes by K parameterization was likely related to the large and nonlinear CO 2 gradients (determined from profile measurements) that arise under calm nocturnal conditions.In contrast, trace gas and sensible heat similarity methods use ratios of vertical mole fraction or temperature gradients, which can compensate for nonlinear vertical concentration gradients.The K parameterization has been shown to agree with trace gas similarity and eddy covariance-derived fluxes in the past (Fritsche et al., 2008).A larger period of overlapping data for the sensible heat similarity method was available with K parameterization results than chamber data, and the two flux-gradient methods were highly correlated but had a relative positive bias of the CO 2 flux in the sensible heat method relative to K parameterization over the whole period (day r = 0.63 * , bias = 0.37 µmol m −2 s −1 ; night r = 0.42 * , bias = 0.10 µmol m −2 s −1 ).

Flux-gradient method application:
H 2 gradient fluxes Summertime H 2 fluxes were calculated for the 15-22 July 2011 period above the canopy by the trace gas similarity methods using the CO 2 and H 2 O eddy fluxes, and below the canopy from trace gas similarity to CO 2 using CO 2 chamber measurements and the K parameterization method (Fig. 10).The H 2 fluxes were characterized by net ecosystem H 2 uptake and were consistent with our expectation that H 2 uptake by soil would be the dominant ecosystem process.The below-canopy fluxes were −8 and −10 nmol m −2 s −1 during midday over this period for the K parameterization and chamber-based trace gas similarity methods, respectively.The above-canopy trace gas similarity average midday H 2 fluxes via CO 2 and H 2 O were −21 and −15 nmol m −2 s −1 , respectively.Larger trace gas fluxes were calculated using CO 2 as the correlative variable than H 2 O, but in the case of H 2 this difference (and the difference with the below-canopy fluxes) fell within the 95 % confidence intervals because of the higher uncertainty in H 2 gradients measurements above the canopy.Potential systematic differences in the trace gas similarity fluxes of H 2 were not corrected for as was done for CO 2 and H 2 O in Sect.4.2 because the true relationship of K H 2 with K CO 2 and K H 2 O was unknown.Storage fluxes of H 2 were calculated, but were typically small (<| 1 nmol m −2 s −1 |), and were therefore not included in the comparison.The midday summertime H 2 uptake rates correspond to H 2 deposition velocities of 0.04 to 0.10 cm s −1 , which were within the range of previously reported soil H 2 deposition velocities, so our results support the previously reported values that typically range between 0.01 and 0.10 cm s −1 (Ehhalt and Rohrer, 2009).
Wintertime H 2 fluxes were calculated for the 1-14 February 2012 period using the whole-canopy trace gas similarity, K parameterization, and sensible heat methods (Fig. 11).The wintertime H 2 fluxes were −4 and −6 nmol m −2 s −1 for the whole canopy using the trace gas similarity via CO 2 and H 2 O, respectively, and −0.5 to −0.8 nmol m −2 s −1 below the canopy using the K parameterization and sensible heat similarity methods, respectively.H 2 soil uptake has been shown in previous work to persist at low rates in the winter (Constant et al., 2008;Lallo et al., 2008).The apparent H 2 flux divergence below and above the canopy was consistent with the diagnosed median daytime biases for each method: compared with the wintertime CO 2 chamber data, K parameterization tended to slightly underestimate CO 2 fluxes, while uncorrected trace gas similarity (10 m) and sensible heat similarity methods overestimated CO 2 fluxes (Tables 2 and A1).However, we cannot exclude the effect of different sourcesink distribution for H 2 versus CO 2 and H 2 O or the measurement of different patches of forest and H 2 exchange rates as a result of the difference in the 2 m versus 10 m footprints.
The uncertainty in the H 2 gradient fluxes depended on the method used and the location applied.The uncertainty was large for H 2 fluxes calculated by trace gas similarity above the canopy due to the low signal-to-noise ratio of those H 2 gradients.For example, the summer daytime median propagated relative error (using the mean and uncertainty for terms in Eq. 2 and a 15 % uncertainty in eddy covariance fluxes following Urbanski et al., 2007) was 200 % for H 2 fluxes during the day and night, while CO 2 flux relative error in the same period was around 40 % for each measurement.Therefore, H 2 flux calculations were aggregated into hourly bins to reduce the uncertainty around each measurement such as in the weeklong summer and winter examples (Figs. 10 and 11).In those cases, the relative error in aggregated H 2 fluxes (calculated from the 10:00 to 16:00 LT mean and 95 % confidence intervals) in the summer was 80 % for trace gas similarity (26 m) both via CO 2 and H 2 O, 10 % for trace gas similarity (2 m) via CO 2 , and 10 % for parameterization (2 m), and in the winter it was 22 and 16 % for trace gas similarity ( 10 via CO 2 and H 2 O, respectively, 92 % for sensible heat similarity (2 m), and 30 % for K parameterization (2 m).
There was uncertainty due to the choice in flux-gradient method, which we calculated as the relative error in the inferred K from the available flux-gradient methods by assuming that each have equal validity (the same H 2 gradient was applied to each K in a given location).Over the whole year, the uncertainty across flux-gradient methods for the above-, whole-, and below-canopy environments was 46, 74, and 76 % in the median, respectively.The trace gas similarity uncertainty was smaller above the canopy than for the whole-canopy measurement, where there was more potential for canopy source-sink interference.Greater uncertainty between the methods existed below the forest canopy where three different flux-gradient methods were compared: trace gas similarity via chamber fluxes, sensible heat similarity, and K parameterization (Table 2).

Summary and conclusions
This paper describes design factors in the experimental setup that were key to the success of the flux-gradient method.Perhaps the most critical factors were the ability to measure H 2 mole fraction gradients with high instrumental precision (0.06 to 0.11 %), with low sampling error (by use of integrating volumes), and without significant measurement bias (determined by a frequent nulling procedure).By addressing these three potential sources of error, we were able to measure statistically significant above-canopy H 2 fluxes, which still had relatively large uncertainty, but were consistent with the below-canopy results.Furthermore, the choice to design a system that could use multiple flux-gradient methods was important, especially given the possibility for one method to fail or to suffer from large data losses (e.g., failure of temperature shields over the growing season for the sensible heat similarity method).Validating the flux-gradient method(s) using trace gases with independent flux measurements such as eddy covariance and/or chamber measurements gave us the confidence to apply the flux-gradient methods to H 2 .Finally, we encourage the use of independent flux measurements to correct for any systematic biases in the flux-gradient methods (i.e., from different source-sink distributions), which should be determined for the particular ecosystem and time period of interest.
This study provides a temporal guide to the suitability of each flux-gradient method at Harvard Forest.We found that all three flux-gradient methods (trace gas similarity, sensible heat similarity, and K parameterization) had good and fair performance in certain locations, seasons, and times of day.In general, the best agreement between flux-gradient methods and the independent eddy covariance and chamber flux measurements was observed for measurements made on the same side of the canopy; that is, the correlation was typically reduced when the eddy or chamber measurements were located on the opposite side of the forest canopy to the fluxgradient method.This is not surprising given the separation of dynamical flows above and below the canopy.Large relative biases were observed for flux-gradient methods tested against H 2 O eddy covariance measurements at night because of the low signal-to-noise ratio in H 2 O gradients and fluxes.The trace gas similarity method performed the best above the forest canopy, and when applied to the whole canopy the performance was also good to fair.The whole-canopy approach provides a useful alternative to the above-canopy method during periods with low signal-to-noise ratio in the mole fraction gradients.Flux-gradient methods performed well below the forest canopy, despite the potential pitfalls in such locations (Sect.3.2).The relatively open and top-heavy canopy at Harvard Forest may foster a turbulent environment conducive to flux-gradient methods at this site.We found the K parameterization method to perform best below the canopy.An instrument failure meant that the sensible heat similarity method could not be used in the summer, but all indications are that the method would have worked if this instrument had been operational (e.g., Dunn et al., 2009).However, problems with the aspirated temperature shields were not obvious in situ, which could be a risk for future studies as well.
This paper shows that a variety of flux-gradient techniques can be used at Harvard forest to study the ecosystem exchange of H 2 .We observed net uptake of H 2 by the biosphere both above and below the canopy during the example periods, which point to the particular sensitivity of H 2 to soil uptake, and uptake was stronger in summer than winter, as is presented over the entire study period in Meredith (2012) and Meredith et al. (2014).The H 2 gradient fluxes were generally consistent across methods and with previous measurements.The flux-gradient approach generated automated, continuous results representing a larger, spatially averaged, undisturbed measurement footprint than possible using chamber techniques.The uncertainty in H 2 fluxes for a given method ranged between 10 and 92 % and across all

Figure 1 .
Figure1.Schematic of the flux-gradient instrument system, which includes a gas chromatograph (GC-HePDD, H 2 measurements), infrared gas analyzers (IRGA, CO 2 , and H 2 O measurements), and the gas stream selection system.
Figure2.Meteorological equipment and gas inlets installed on the Environmental Measurement Site (EMS) tower and a small tower installed 14 m to the north-northwest of the EMS tower over undisturbed soil.Squares represent gas inlets and temperature shield locations.Sonic anemometers, sketched in gray, were installed at 2 and 29 m.The distribution of foliage per meter of height (leaf foliage distribution) at the Harvard Forest EMS site in summer has a median height of 18 m.

Figure 3 .
Figure3.Two-hour period highlighting the importance of using integrating volumes for our mole fraction gradient measurements.During this period, the 0.5 m inlet sample stream bypassed the integrating volume (a, blue diamonds), while the 3.5 m inlet sample stream passed through the integrating volume and was physically smoothed (a, pink points).The effect the integrating volume would have had on the 0.5 m inlet measurements was simulated with an exponential moving average (a, dark-blue points).The mole fraction gradients from the physically smoothed 3.5 m inlet data and the 0.5 m inlet measurements bypassing the integrating volume (b, blue diamonds) were more variable than the physically smoothed 3.5 m inlet data and the computationally smoothed (exponential filter) 0.5 m inlet data (b, dark-blue points).

18 358Figure 4 .Figure 4 .
Figure 4. Nulling procedure example for 2 August 2011.Around 05:50 local time, the nulling valves were activated to draw air through the nulling volume (flushed for 40 min prior) each of the four gas lines that were usually connected to the 28, 24, 3.5, and 0.5 m inlets.The full CO 2 time series (upper plot) shows that it took over 20 min to flush the integrating volumes and sample lines of the memory of the strong nighttime CO 2 mole fraction gradient.Over the procedure, each inlet was sampled twice for H 2 (lower right) and eight times for CO 2 (lower left, shaded portion upper plot) and H 2 O (not shown).Second-order polynomials were used to detrend the data to remove the drift in mole fractions of CO 2 and H 2 over the nulling period.

Figure 5 .
Figure 5.Time series (upper plots) and distributions (lower plots) of the measurement 370

Figure 5 .
Figure5.Time series (upper plots) and distributions (lower plots) of the measurement bias between sampling lines for the 24 and 28 m (left plots) and the 0.5 and 3.5 m (right plots) H 2 measurements as determined by the nulling procedure.The median and the 1σ confidence intervals are reported for each distribution and are compared with minimum detectable difference given the median instrument 1σ precision (gray shading).

407 408Figure 6 .
Figure 6.Example of mole fraction gradients of hydrogen (ppb H2 m -1 ) above (26 m) and 409 below (2 m) the forest canopy: a) both levels compared side-by-side in the same scale 410 and b) above canopy at a magnified scale.Black whiskers represent instrument precision 411(median 1σ bracketed precision scaled over the vertical distance in ppb H2 m -1 ).Grey 412

Figure 7 .
Figure 7. Two-hour mean mole fraction gradients of H 2 and CO 2 versus the hour of day (hod) above (26 m) and below (2 m) the Harvard Forest canopy in July 2011.Bars indicate 95 % confidence interval of the sample mean of mole fraction gradients in 2 h bins.Gray shading designates nighttime hours.

Figure 10 .
Figure 10.Comparison of summertime H 2 fluxes (15-22 July 2011 period) above (left) and below (right) the canopy: trace gas similarity via CO 2 or H 2 O, eddy covariance (eddy), K parameterization, and trace gas similarity via flux chamber data.Data points represent 6 h aggregate mean and 95 % confidence intervals.

Figure 11 .
Figure 11.Comparison of wintertime (1-14 February 2012) H 2 fluxes above (left) and below (right) the canopy determined by trace gas similarity via CO 2 and H 2 O, K parameterization, and sensible heat similarity in the winter.Data points represent 6 h aggregate mean and 95 % confidence intervals.
Example of mole fraction gradients of hydrogen (ppb H 2 m −1 ) above (26 m) and below (2 m) the forest canopy: (a) both levels compared side by side in the same scale and (b) above canopy at a magnified scale.Black whiskers represent instrument precision (median 1σ bracketed precision scaled over the vertical distance in ppb H 2 m −1 ).Gray shading designates nighttime hours.

Table 1 .
The type, location, and availability of the ancillary measurements required for each flux-gradient method.

Table 2 .
Quantitative comparison of CO 2 and H 2 O fluxes determined by flux-gradient methods and by independent eddy covariance and chamber measurements.Trace gas similarity fluxes were corrected in this table (Sect.4.2).First, the correlation between flux method pairs (r, Pearson's linear correlation coefficient) is given, and statistically significant correlations are indicated by * (Student's t test, p value < 0.05, α = 0.05).Second, the median of the bias over the period (column header minus row) is given in µmol CO 2 m −2 s −1 and mmol H 2 O m −2 s −1 , and instances when this mean bias is not significantly different from zero are indicated by † (Student's t test, p value > 0.05, α = 0.05).A text-type flag a is assigned to each flux-gradient method to indicate the level of performance against the inde-