Accurate measurements of atmospheric carbon dioxide and methane mole fractions at the Siberian coastal site Ambarchik

Sparse data coverage in the Arctic hampers our understanding of its carbon cycle dynamics and our predictions of the fate of its vast carbon reservoirs in a changing climate. In this paper, we present accurate measurements of atmospheric CO2 and CH4 dry air mole fractions at the new atmospheric carbon observation station Ambarchik, which closes a large gap in the atmospheric trace gas monitoring network in northeastern Siberia. The site, operational since August 2014, is located near the delta of the Kolyma River at the coast of the Arctic Ocean. Data quality control of CO2 and CH4 measurements includes frequent calibrations traced to WMO scales, employment of a novel water vapor correction, an algorithm to detect influence of local polluters, and meteorological measurements that enable data selection. The available CO2 and CH4 record was characterized in comparison with in situ data from Barrow, Alaska. A footprint analysis reveals that the station is sensitive to signals from the East Siberian Sea, as well as northeast Siberian tundra and taiga regions. This makes data from Ambarchik highly valuable for inverse modeling studies aimed at constraining carbon budgets within the pan-Arctic domain, as well as for regional studies focusing on Siberia and the adjacent shelf areas of the Arctic Ocean.


Introduction
Detailed information on the distribution of sources and sinks of the atmospheric greenhouse gases (GHG) CO2 and CH4 is a prerequisite for analyzing and understanding the role of the carbon cycle within the context of global climate change. The Arctic plays a unique role in the carbon cycle because it hosts large carbon reservoirs preserved by cold climate conditions (Hugelius et al., 2014; James et al., 2016; Schuur et al., 2015). Yet, the net budgets of both terrestrial (Belshe et al., 2013; McGuire et al., 2012) and oceanic (Berchet et al., 2016; Shakhova et al., 2014; Thornton et al., 2016) carbon surface-atmosphere fluxes are still highly uncertain, as are the mechanisms controlling them. Furthermore, the Arctic is subject to faster warming than the global average at present and in the coming decades (IPCC, 2013). Thus, a considerable fraction of terrestrial (Schuur et al., 2013) and subsea (James et al., 2016) permafrost carbon reservoirs is at risk of being degraded and released under future climate change. The fate of further carbon reservoirs in the Arctic seabed is uncertain as well under warmer conditions. A substantial release of the stored carbon in the form of CO2 and CH4 would constitute a significant positive feedback enhancing global warming. Therefore, improved insight into the mechanisms that govern the sustainability of Arctic carbon reservoirs is essential for the assessment of Arctic carbon-climate feedbacks and the simulation of accurate future climate trajectories.
A key limitation for understanding the carbon cycle in the Arctic is limited data coverage in space and time (Oechel et al., 2014; Zona et al., 2016). Besides infrastructure limitations, the establishment of long-term, continuous and high-quality measurement programs at high latitudes is severely challenged by the harsh climatic conditions, especially in the cold season (Goodrich et al., 2016). During the Arctic winter, even rugged instrumentation may fall outside its range of applicability, and measures may be required to prevent ice buildup and instrument failure without compromising data quality (Kittler et al., 2017a). Also, many sites are difficult to access for large parts of the year, complicating regular maintenance and therefore increasing the risk of data gaps because of broken or malfunctioning equipment.
A widely used approach to quantify carbon fluxes on a regional scale builds on measurements of atmospheric CO2 and CH4 mole fractions and inverse modeling of their transport in the atmosphere (Miller et al., 2014; Peters et al., 2010; Rödenbeck et al., 2003; Thompson et al., 2017). The performance of inverse models to constrain surface-atmosphere exchange processes depends on the accuracy of atmospheric trace gas measurements. Because biases in the measurements (e.g. drift in time or bias between stations) translate into biases in the retrieved fluxes (Masarie et al., 2011; Peters et al., 2010; Rödenbeck et al., 2006), the World Meteorological Organization (WMO) has set requirements for the inter-laboratory compatibility of atmospheric measurements: ±0.1 ppm for CO2 in the northern hemisphere and ±0.05 ppm in the southern hemisphere, and ±2 ppb for CH4 (WMO, 2016).
Atmospheric inverse modeling has a high potential for providing insights into regional to pan-Arctic scale patterns of CO2 and CH4 fluxes, as well as their seasonal and interannual variability and long-term trends. The technique could also serve as a link between smaller scale, process-oriented studies based, e.g., on eddy-covariance towers (Kittler et al., 2016; Zona et al., 2016) or flux chambers (e.g. Kwon et al., 2017; Mastepanov et al., 2013) and the coarser scale satellite-based remote sensing retrievals of Arctic ecosystems and carbon fluxes (e.g. Park et al., 2016). However, to date, sparse data coverage limits the spatiotemporal resolution and the accuracy of inverse modeling products at high northern latitudes. To improve inverse model estimates of high latitude GHG surface-atmosphere exchange processes, the existing atmospheric carbon monitoring network (Fig. 1) needs to be expanded (McGuire et al., 2012).
In this paper, we present the new atmospheric carbon observation station Ambarchik, which improves data coverage in the Arctic. The site is located in northeast Siberia at the mouth of the Kolyma River (69.62° N, 162.30° E) and has been operational since August 2014. In Sect. 2, we introduce the station location and instrumentation, and in Sect. 3 the quality control of the data. We characterize which areas the station is sensitive to in Sect. 4, and present a signal characterization of the available record in Sect. 5. Section 6 contains concluding remarks.

Area overview
Ambarchik is located at the mouth of the Kolyma River, which opens to the East Siberian Sea (69.62° N, 162.30° E; Fig. 2). The majority of the landscape in the immediate vicinity of the locality is wet tussock tundra. On the ecoregion scale, Ambarchik is bordered by the Northeast Siberian Coastal Tundra ecoregion in the west, the Chukchi Peninsula Tundra ecoregion in the east, and the Northeast Siberian Taiga ecoregion in the south (ecoregion definitions from Olson et al., 2001). Major components contributing to the net carbon exchange processes in the area are tundra landscapes including wetlands and lakes, as well as the Kolyma River and the East Siberian Arctic Shelf.

Site overview
Ambarchik hosts a weather station operated by the Russian meteorological service (Roshydromet), whose staff is the entire permanent population of the locality. The closest town is Chersky (~100 km to the south, population 2,857 as of 2010), with no other larger permanent settlement closer than 240 km. The site therefore does not have any major sources of anthropogenic greenhouse gas emissions in the near field. The only regular anthropogenic CO2 and potentially CH4 sources that may influence the measurements are from the Roshydromet facility, including the building that hosts the power generator and the inhabited building.
The atmospheric carbon observation station Ambarchik started operation in August 2014. It consists of a 27 m tall tower with two air inlets and meteorological measurements, while the majority of the instrumentation is hosted in a rack inside a building. The rack is equipped for temperature control, but due to the risk of overheating, it is open most of the time and thus in equilibrium with room temperature (room and rack temperature are monitored). Atmospheric mole fractions of CH4, CO2, and H2O are measured by an analyzer based on the cavity ring-down spectroscopy (CRDS) technique (G2301, Picarro Inc.), which is calibrated against WMO-traceable reference gases at regular intervals (Sect. 3.2).
The tower is located 260 m from the shoreline, with a base elevation of 20 m a.s.l. (estimated based on GEBCO_2014 (Weatherall et al., 2015), which in this region is based on GMTED2010 (Danielson and Gesch, 2011)).

Gas handling
The measurement system allows switching between two different air inlets and four different calibration gas tanks (Fig. 3). Component manufacturers and models of the individual components are listed in …
Air inlets with rain guards are mounted on the tower at 27 m ("Top") and 14 m ("Center") a.g.l. and are equipped with 5 µm polyester filters (labels F1 and F2 in Fig. 3). The two air inlets are sampled in turns (15 minutes Top, 5 minutes Center). Signals from the Center inlet are mainly used for quality control purposes (Sect. 3.4). Air is drawn from the inlets (I1, I2) through lines of flexible tubing (6.35 mm outer diameter) by a piston pump located downstream of the measurement line branch …
Calibration gases pass through a line composed exclusively of stainless steel components as well. Air from gas tanks (High, Middle, Low, Target) passes through pressure regulators (RE1-4), which reduce the pressure roughly to ambient pressure. This way, the CRDS analyzer can cope with the pressure difference between sample air and calibration air from the tanks without an open split, which would normally be installed to equilibrate the line with ambient pressure. This setup was chosen in order to conserve calibration air. The lines from the gas tanks are connected to a multiposition valve (MPV1), which is used to select between gas tanks. Downstream of the multiposition valve, the calibration gas line is connected to the sample line by a solenoid valve (V3). The solenoid valves V2 and V3 are used to select between sample air from the tower and calibration air.
During calibrations, the part of the measurement line that is not part of the calibration line is continuously flushed by the high flow pump (PP1) through the purge line, which comprises solenoid valve V4 (which shuts off air flow from the gas tanks through the purge line in case of a power outage during a tank measurement), needle valve NV3 (which is used to match the purge flow to the usual sample flow), and flow meter FM3 (which monitors the purge flow).
The flow meters (FM1-3) and pressure sensor (P1) are used to diagnose problems such as weakening pump performance, clogged filters, leaks or obstructions.
The gas handling system was tested for leaks after installation. This was done by capping the tubing and evacuating it using a hand pump to pressures of 0.3-0.4 bar (normal operating pressure is around 0.7 bar). The leak rate was then computed from the pressure increase over several hours, corrected for temperature fluctuations measured in the lab. To mitigate the effect of inhomogeneous temperature fluctuations throughout the tubing and to increase the sensitivity of the pressure to small leaks, the experiments were limited to the small tubing volume inside the laboratory, ignoring the tubing on the tower. The laboratory tubing is the part that is most susceptible to leaks, due to the number of tubing connections and the potentially higher CO2 mole fractions. The results of several such experiments indicated leak rates of no more than 1.3×10⁻⁶ mbar L s⁻¹. At this rate, CO2 and CH4 contamination is negligible even with extremely high mole fractions in the laboratory. During later maintenance visits, simpler leak tests, which did not require opening tubing connections, were performed by breathing on individual connectors and observing the CO2 mole fraction measured by the gas analyzer. No indications of leaks were observed during these tests.
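As a rough illustration of the pressure-rise test described above, the following sketch computes a leak rate from the pressure increase in a closed volume, with an ideal-gas temperature normalisation. The function name, the tubing volume and the readings are hypothetical and are not taken from the actual experiments; the temperature correction applied by the authors may differ.

```python
def leak_rate_mbar_l_per_s(volume_l, p_start_mbar, p_end_mbar,
                           t_start_k, t_end_k, duration_s):
    """Leak rate of a closed volume from a pressure-rise test.

    The final pressure is rescaled to the starting temperature (ideal gas),
    so that a warming laboratory is not mistaken for a leak.
    """
    p_end_corrected = p_end_mbar * t_start_k / t_end_k
    return volume_l * (p_end_corrected - p_start_mbar) / duration_s


# Hypothetical numbers: 0.05 L of tubing evacuated to 350 mbar,
# 0.3 mbar pressure rise over 4 hours, constant room temperature.
rate = leak_rate_mbar_l_per_s(0.05, 350.0, 350.3, 293.15, 293.15, 4 * 3600)
print(f"{rate:.2e} mbar L s^-1")
```

With these made-up inputs the result is about 1×10⁻⁶ mbar L s⁻¹, i.e. of the same order as the upper bound quoted in the text.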

Meteorological measurements
Meteorological measurements performed by MPI-BGC at Ambarchik include wind speed and direction at 20 m a.g.l., air temperature and humidity at 20 and 2 m a.g.l., and air pressure at 1 m a.g.l. (instruments listed in Table A.2). The measurements mainly serve to monitor atmospheric conditions like wind and stability of atmospheric stratification for quality control of the GHG data (described in Sect. 3.4). The 2D sonic anemometer, which is used to measure wind speed and direction, features built-in heating to prevent freezing. The heating is switched on if the temperature decreases below 4.5 °C and relative humidity is higher than 85 %, and switched off when the temperature increases above 5.5 °C.
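The switching rule for the anemometer heating is a simple hysteresis. A minimal sketch of that rule, assuming it is evaluated sample by sample (the function name is hypothetical, and the behavior of the actual instrument firmware is not documented here):

```python
def update_heater(heater_on, temp_c, rh_percent):
    """Return the new heater state given the previous state and current readings."""
    if not heater_on and temp_c < 4.5 and rh_percent > 85.0:
        return True    # cold and humid: switch heating on
    if heater_on and temp_c > 5.5:
        return False   # warm enough again: switch heating off
    return heater_on   # otherwise keep the previous state


state = False
for temp_c, rh in [(4.0, 90.0), (5.0, 50.0), (6.0, 90.0)]:
    state = update_heater(state, temp_c, rh)
    print(temp_c, rh, state)
```

The 1 °C gap between the two thresholds prevents rapid toggling when the temperature hovers around 5 °C.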

Power supply
Power is supplied by the diesel generator of the Roshydromet meteorological station. Power consumption of the MPI-BGC measurement system is about 350 W, and an additional 125 W is required in case the heating of the sonic anemometer is switched on. In order to avoid loss of power during routine generator maintenance, an uninterruptible power supply (9130 UPS, Eaton) was installed, which is able to buffer power outages of up to about 40 minutes (the heating of the sonic anemometer is not powered by the UPS). In case of a longer power loss, the UPS initiates a controlled shutdown of the CRDS analyzer.

Data logging
Trace gas measurements and related data are logged by the factory-installed software of the CRDS analyzer. All other measurements are logged by an external data logger (CR3000, Campbell Scientific). The logger samples all variables every 10 seconds. Raw samples are stored for wind measurements as well as flow and pressure in the tubing (FM1-FM3, P1). For the remaining meteorological measurements, room and rack temperature, and diagnostic variables, 10-minute averages are stored. The data are transferred from the external data logger to the hard drive of the CRDS analyzer daily. All data are backed up to an external hard drive hourly. The internal clocks of the CRDS analyzer and the data logger are synchronized with a GPS receiver (GPS 16X-HVS, Garmin) once per day.

Water correction
In order to minimize maintenance efforts and reduce the number of components prone to failure, CO2 and CH4 mole fractions are measured in humid air. Hence, the values reported by the analyzer have to be corrected for the effects of water vapor to obtain dry air mole fractions. This is done by applying a water correction function to the raw data:
$$c_\mathrm{dry} = \frac{c_\mathrm{wet}}{f_c(h)}$$
Here, $c_\mathrm{wet}$ is the mole fraction of CO2 or CH4 in humid air reported by the analyzer, $h$ is the water vapor mole fraction (also measured by the CRDS analyzer), $f_c(h)$ is the water correction function, and $c_\mathrm{dry}$ is the desired dry air mole fraction. Picarro Inc. provides a factory water correction based on Chen et al. (2010), but to achieve accuracies within the WMO goals for water vapor mole fractions above 1 % H2O, custom coefficients must be obtained for each analyzer (Rella et al., 2013). Here, we employ the novel water correction method by Reum et al. (2019). In Reum et al. (2019), data from gas washing bottle experiments (explained in Appendix B) with the CRDS analyzer in Ambarchik were analyzed in the context of the new method (labeled "Picarro #5" therein). Here, we use these data together with data from additional experiments to derive water correction coefficients for the application to the complete Ambarchik record. The results of this procedure are briefly summarized here, while more details are given in Appendix B.
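To make the correction step concrete, the sketch below divides a humid-air reading by a water correction function. It assumes the common quadratic form 1 + a·h + b·h² with h in % H2O, which is not the method of Reum et al. (2019) used for the Ambarchik record; the coefficient values and function names are illustrative placeholders, not the Ambarchik coefficients.

```python
def water_correction(h_percent, a, b):
    """Quadratic water correction function (assumed form), h in % H2O."""
    return 1.0 + a * h_percent + b * h_percent ** 2


def dry_mole_fraction(c_wet, h_percent, a, b):
    """Dry air mole fraction from the humid-air reading."""
    return c_wet / water_correction(h_percent, a, b)


# Illustrative placeholder coefficients for CO2 (not station-specific)
A_CO2, B_CO2 = -0.0120, -0.000267

c_dry = dry_mole_fraction(390.0, 2.0, A_CO2, B_CO2)
print(round(c_dry, 2))  # humid-air reading of 390 ppm at 2 % H2O
```

In dry air (h = 0) the function equals 1 and the reading is unchanged; at 2 % H2O the dilution and line-broadening effects lumped into these placeholder coefficients raise the value by roughly 10 ppm.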
Water correction experiments were performed in 2014, 2015 and 2017. Differences between the water corrections based on the different experiments were on the order of magnitude of the WMO goals (Fig. 4). Here, we chose the WMO internal reproducibility goals as reference, which correspond to half of the interlaboratory compatibility goals (WMO, 2016). The motivation for this choice is that keeping biases of observations with respect to the calibration scale within these goals ensures that biases between stations are within the interlaboratory compatibility goals. Given the small number of water correction experiments conducted so far, it is unknown whether these differences represent drifts over long time scales, short-term variations and/or systematic differences between the experimental methods. … similar to that of annual tests over two years. This indicates that the differences of the Ambarchik analyzer could be short-term variations. In the absence of evidence for trends, water correction coefficients were derived based on the averages of the individual water correction function responses for each species (see Appendix B). The maximum deviations of the individual functions from these synthesis functions were 0.018 % CO2 at 3 % H2O, which corresponds to 0.07 ppm at 400 ppm dry air mole fraction, and 0.034 % CH4 at 2.7 % H2O, which corresponds to 0.7 ppb at 2000 ppb dry air mole fraction (Fig. 4).
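The conversion from relative to absolute deviations quoted above can be checked with a few lines. The comparison values are the internal reproducibility goals taken as half of the northern hemisphere compatibility goals given in the Introduction; the variable and function names are hypothetical.

```python
def relative_to_absolute(relative_percent, dry_mole_fraction):
    """Convert a relative deviation in percent to mole fraction units."""
    return relative_percent / 100.0 * dry_mole_fraction


co2_dev_ppm = relative_to_absolute(0.018, 400.0)    # about 0.07 ppm
ch4_dev_ppb = relative_to_absolute(0.034, 2000.0)   # about 0.7 ppb

# Internal reproducibility goals, assumed here as half of the
# compatibility goals (0.1 ppm CO2, 2 ppb CH4)
co2_goal_ppm = 0.1 / 2
ch4_goal_ppb = 2.0 / 2

print(co2_dev_ppm, co2_goal_ppm)
print(ch4_dev_ppb, ch4_goal_ppb)
```

Under these assumptions the maximum CO2 deviation is slightly above, and the CH4 deviation below, the respective goal, which is consistent with the statement that the differences are on the order of magnitude of the WMO goals.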

Calibration
Calibrations are performed with a set of pressurized dry air tanks filled at the Max Planck Institute for Biogeochemistry (Jena, Germany). The levels of GHG mole fractions of these tanks have been traced to the WMO scales X2007 for CO2 and X2004A for CH4 (Table C.1). Three calibration tanks (in order High, Middle, Low) are probed once every 116 hours for 15, 10 and 10 minutes, respectively. The longer probing time of the first (High) tank serves to flush out residual water vapor due to water molecules that adhere to the inner tubing walls. Thus, residual water vapor during tank measurements is well below 0.01 % H2O. From these three tanks, coefficients for linear calibration functions are derived.
Due to the scatter of the coefficients over time, the coefficients are smoothed using a tricubic kernel with a width of 120 days.
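The two calibration steps described in this section (a linear fit through three tank measurements, followed by kernel smoothing of the coefficients over time) could be sketched as follows. This is an illustration under assumptions: the kernel "width" is interpreted as the full window (half-width 60 days), the tank values are made up, and the function names are hypothetical.

```python
def fit_linear(measured, assigned):
    """Least-squares line assigned = slope * measured + intercept."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(assigned) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, assigned))
    slope = sxy / sxx
    return slope, my - slope * mx


def tricube_smooth(times_days, values, t0_days, width_days=120.0):
    """Tricube-weighted average of values around t0 (None if window is empty)."""
    half_width = width_days / 2.0
    weighted_sum = weight_total = 0.0
    for t, v in zip(times_days, values):
        u = abs(t - t0_days) / half_width
        if u < 1.0:
            w = (1.0 - u ** 3) ** 3
            weighted_sum += w * v
            weight_total += w
    return weighted_sum / weight_total if weight_total > 0.0 else None


# Hypothetical tank readings (Low, Middle, High) and assigned values in ppm
measured = [350.0, 400.0, 450.0]
assigned = [1.002 * m - 0.5 for m in measured]
slope, intercept = fit_linear(measured, assigned)

# Smooth a series of slopes from calibrations 116 h (about 4.83 days) apart
times = [i * 116.0 / 24.0 for i in range(30)]
slopes = [1.002] * 30
smoothed = tricube_smooth(times, slopes, t0_days=70.0)
```

A calibrated value is then obtained as `slope * reading + intercept`, using the smoothed coefficients for the time of the measurement.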

Uncertainty in CO 2 and CH 4 measurements
Measurement uncertainties in the CO2 and CH4 data arise from instrument precision, the calibration and the water correction. We estimated time-varying uncertainties of hourly trace gas mole fraction averages based on the method by Andrews et al. (2014), with some modifications. Details of the procedure are given in Appendix E. Average uncertainties at the 1σ level were 0.085 ppm CO2 and 0.77 ppb CH4. Both were dominated by the variability between the water vapor correction experiments. The contribution of analyzer signal precision for averages over one hour to these uncertainties was 0.013 ppm CO2 and 0.25 ppb CH4. These numbers may be used to distinguish analyzer signal precision from atmospheric variability.

Data screening
After water correction and calibration, invalid data are automatically removed before calculating hourly averages using filters for bad analyzer status (Sect. 3.4.1), flushing of lines (Sect. 3.4.2), times of calibration and maintenance, contamination from local polluters (Sect. 3.4.3) and water vapor spikes (Sect. 3.4.4). In the case of contamination from local polluters, CO2 and CH4 averages are also computed with the flagged data to allow assessing the impact of the filter. Additional variables reported in the hourly averages allow for further data screening, e.g. for using the data in inverse models (Table 1). Details on the gradient of virtual potential temperature are given in Sect. 3.4.5.
Table 1: Variables for data screening and an example for a strict filter for background conditions that was used to infer average growth rates in Sect. 5.1.

Analyzer status diagnostics
Picarro Inc. provides the diagnostic flags INST_STATUS and ALARM_STATUS that monitor the operation status of the analyzer. The values in Table 2 indicate normal operation. The flag ALARM_STATUS indicates both exceeding user-defined thresholds for high mole fractions (ignored here), and data flagged as bad by the data acquisition software. The code reported in INST_STATUS contains, among other indicators, thresholds for cavity temperature and pressure deviations from their target values. We created stricter filters for these two values based on their typical variation during normal operation of this particular measurement system. Occasionally, small numbers (< 5) of outliers are recorded after a period of lost data (e.g. due to high CPU load). These are removed manually.

Flushing of lines
Air from the two inlets at the tower and the calibration tanks flows through some common tubing (Fig. 3). Hence, air measured immediately after a switch is influenced by the previous air source. We remove the first 30 seconds from the record after a switch between inlets to avoid sample cross-contamination. Air from calibration tanks exhibits larger differences in humidity and mole fractions to ambient air. Hence, the first five minutes of ambient air measurements after tank measurements are removed from the record.
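The flushing rule can be illustrated with a small filter over a time-ordered list of samples. The data layout (tuples of time in seconds and source label) and the function name are assumptions made for this sketch; the actual processing code is not described in the text.

```python
TANKS = {"High", "Middle", "Low", "Target"}


def keep_mask(samples, inlet_flush_s=30.0, tank_flush_s=300.0):
    """Flag ambient samples to keep after discarding flushing periods.

    samples: list of (time_s, source), ordered by time, where source is
    "Top", "Center" or one of the tank labels. Tank samples are never
    kept here, since calibration periods are handled separately.
    """
    keep = []
    previous = None
    blocked_until = float("-inf")
    for time_s, source in samples:
        switched = previous is not None and source != previous
        if switched and source not in TANKS:
            flush = tank_flush_s if previous in TANKS else inlet_flush_s
            blocked_until = max(blocked_until, time_s + flush)
        keep.append(source not in TANKS and time_s >= blocked_until)
        previous = source
    return keep


samples = [(0, "Top"), (10, "Top"), (20, "Center"), (40, "Center"),
           (50, "Center"), (60, "Low"), (70, "Top"), (360, "Top"), (370, "Top")]
mask = keep_mask(samples)
```

In this example the Center samples at 20 s and 40 s fall within 30 s of the inlet switch and are dropped, and ambient sampling after the Low tank resumes only at 370 s.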

Contamination from local polluters
Possible frequent contamination sources in the immediate vicinity of the tower are the building hosting the power generator of the facility (65 m northwest of the tower), the heating and oven chimneys of the only inhabited building (30 m and 20 m northeast, respectively) and waste disposal. These local polluters can cause sharp and short increases in CO2 and CH4 mole fractions on the timescale of seconds to a few minutes. These features cannot be modeled by a regional or global atmospheric transport model and should therefore be filtered out. We developed a detection algorithm to identify spikes based on their duration, gradients, and amplitude in the raw CO2 data. Spike detection algorithms are often compared to manual flagging by station operators (El Yazidi et al., 2018). Parameters of our algorithm were tuned in this way based on the first year of data. The algorithm is described in Appendix D. The impact of the CO2 spike flagging procedure is shown in Table 3. Impacts on the hourly mole fractions are small, more so when considering only data that pass other quality filters.
We observed that large CH4 spikes were much less frequent than CO2 spikes and often coincided with them. Hence, the spike detection algorithm developed for CO2 was used to flag CH4 as well. This strategy may remove some unpolluted CH4 signals and, in rare cases, leave contaminated CH4 signals undetected. However, given the small impact of filtering flagged CO2 spikes and the smaller frequency of large CH4 spikes, we think that contamination of CH4 independent of CO2 is a negligible source of error in Ambarchik data. Furthermore, due to the large variability of natural CH4 sources, a spike detection algorithm for CH4 may bear the risk of flagging natural signals. In addition, CH4 contamination may also be flagged based on other criteria, in particular its intra-hour variability. For these reasons, we decided that a common filter for both CO2 and CH4 works best at Ambarchik.

Table 3 (column headers): Metric | All data | Data with w_v > 2 m s⁻¹ and ΔT_v,p < 0 K

Water vapor spikes
During winter, the CRDS analyzer occasionally records H2O spikes with durations of a few seconds. The spikes typically exhibit much higher mole fractions than possible given ambient air temperature. This suggests that they are caused by small amounts of liquid water in the sampling lines in the laboratory upon evaporation. Since we observed the phenomenon exclusively during the cold season, we speculate that it is caused by small ice crystals that may form on the air inlet filters (F1, F2), detach, are trapped by one of the filters inside the laboratory, and evaporate.
Since fast water vapor variations deteriorate the accuracy of the water vapor correction, we remove the spikes before creating hourly averages. Spikes are identified using a flagging procedure similar to the one for CO2 contamination described in Appendix D, with parameters adapted to the different shape of the H2O spikes.

Virtual potential temperature
Regional and global scale atmospheric tracer transport models rely on the assumption that the boundary layer is well-mixed (e.g. Lin et al., 2003). This requirement is not satisfied when the air is stably stratified due to a lack of turbulent mixing (Stull, 1988). This may occur when the virtual potential temperature increases with height. To detect these situations, sensors for temperature and relative humidity are installed at 2 m and 20 m above ground level on the measurement tower (Table A.2). Based on these measurements, the virtual potential temperature is calculated for both heights, and the difference can be used as an indicator for stable stratification of the atmospheric boundary layer at the station (e.g. Table 1 and Sect. 5.1).
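A minimal sketch of how such a stability indicator could be computed from temperature, relative humidity and pressure is given below. It uses standard textbook approximations (Magnus formula for saturation vapor pressure over water, hydrostatic pressure adjustment from the 1 m pressure sensor to the sensor heights, and θ_v ≈ θ(1 + 0.61 r)); these specific formulas, constants and function names are assumptions of this sketch and are not stated in the text.

```python
import math


def pressure_at(z_m, p_ref_hpa, z_ref_m, temp_c):
    """Hydrostatic pressure at height z, extrapolated from a reference level."""
    scale = 287.05 * (temp_c + 273.15) / 9.81   # scale height in m
    return p_ref_hpa * math.exp(-(z_m - z_ref_m) / scale)


def virtual_potential_temperature(temp_c, rh_percent, pressure_hpa):
    """Virtual potential temperature in K from T (deg C), RH (%) and p (hPa)."""
    e_sat = 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))  # hPa
    e = rh_percent / 100.0 * e_sat
    mixing_ratio = 0.622 * e / (pressure_hpa - e)                 # kg/kg
    theta = (temp_c + 273.15) * (1000.0 / pressure_hpa) ** 0.286
    return theta * (1.0 + 0.61 * mixing_ratio)


def delta_theta_v(t2_c, rh2, t20_c, rh20, p1_hpa):
    """Difference of virtual potential temperature, 20 m minus 2 m (K)."""
    p2 = pressure_at(2.0, p1_hpa, 1.0, t2_c)
    p20 = pressure_at(20.0, p1_hpa, 1.0, t20_c)
    return (virtual_potential_temperature(t20_c, rh20, p20)
            - virtual_potential_temperature(t2_c, rh2, p2))


stable = delta_theta_v(-10.0, 80.0, -10.0, 80.0, 1010.0)    # isothermal layer
unstable = delta_theta_v(-10.0, 80.0, -11.0, 80.0, 1010.0)  # strong cooling with height
```

A positive difference indicates stable stratification; the isothermal example yields a small positive value, while a 1 K temperature decrease over 18 m yields a negative one.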

Atmospheric tracer transport to Ambarchik
The predominant wind directions at Ambarchik were southwest and northeast (Fig. 6).
We used an atmospheric transport model (Henderson et al., 2015) to determine regions within the Arctic that influence the atmospheric signals captured at Ambarchik. For the case studies shown here, 15-day backtrajectories were calculated for the period August 2014 to December 2015. Atmospheric transport was modeled using STILT (Lin et al., 2003) driven by WRF (Skamarock et al., 2008), for which boundary and initial conditions were taken from MERRA reanalysis fields (Rienecker et al., 2011). The resolution of the transport model in our domain was mostly 10 km horizontally with 41 vertical levels.
Based on these trajectories, the sensor source weight functions ("footprints") were calculated on a square-shaped Lambert azimuthal equal-area grid with a resolution of 32 km and an extent of 3200 km centered on Ambarchik. To better visualize the representativeness of Ambarchik data to different origins of air masses, we aggregated these footprints over seasons. Furthermore, we sorted the aggregated footprints into bins each covering a quartile of the cumulative footprint (Fig. 7). Footprints covered adjacent northeast Siberian tundra and taiga ecoregions as well as the East Siberian Arctic Shelf, with seasonally varying influences. In winter, spring and summer, the top quartile of the footprint concentrated on a few grid cells (order of ~100 km) around Ambarchik, with a slightly larger spread in fall. The two central quartiles had a focus on easterly directions in spring and on the north in summer.
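One way to sort an aggregated footprint into quartile bins of the cumulative footprint is sketched below: grid cells are ranked from most to least influential, and each cell is assigned to the quartile in which the cumulative sum stands before that cell is added. The dictionary layout, the tie-breaking and the function name are assumptions of this illustration and not necessarily the procedure used for Fig. 7.

```python
def quartile_bins(footprint):
    """Assign each grid cell to a quartile (1-4) of the cumulative footprint.

    footprint: dict mapping a cell identifier to its aggregated
    footprint value. Bin 1 holds the most influential cells that
    together make up the first 25 % of the total, and so on.
    """
    total = sum(footprint.values())
    bins = {}
    cumulative = 0.0
    ranked = sorted(footprint.items(), key=lambda item: item[1], reverse=True)
    for cell, value in ranked:
        bins[cell] = min(int(4 * cumulative / total) + 1, 4)
        cumulative += value
    return bins


# Toy footprint with four cells (the real grid has 100 x 100 cells of 32 km)
toy = {"a": 40, "b": 30, "c": 20, "d": 10}
print(quartile_bins(toy))
```

Because the footprint is strongly peaked near the station, the first bin typically contains very few cells, in line with the concentration of the top quartile around Ambarchik described above.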

Ambarchik time series in comparison with Barrow, Alaska
In order to provide a context for the characteristics of greenhouse gas signals measured at Ambarchik, we compared the time series from Ambarchik with in situ CO2 (NOAA, 2015) and CH4 (Dlugokencky et al., 2017) mole fractions observed at Barrow Observatory, Alaska, which is located close to the village of Utqiaġvik (71.32° N, 156.61° W). Data from Barrow were chosen for the comparison because of the station's proximity to Ambarchik (distance ~1500 km, latitudinal difference 1.7°; cf. Fig. 1), and because they have been used in many studies on both global and regional greenhouse gas fluxes (e.g. Berchet et al., 2016; Jeong et al., 2018; Rödenbeck, 2005; Sweeney et al., 2016). The analyzed period was August 2014 to December 2016.
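The quoted distance between the two stations can be reproduced from the coordinates given in the text with the haversine formula. The sketch assumes a spherical Earth with a mean radius of 6371 km; the function name is hypothetical.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(a))


# Ambarchik (69.62 N, 162.30 E) and Barrow (71.32 N, 156.61 W)
distance = great_circle_km(69.62, 162.30, 71.32, -156.61)
latitude_difference = 71.32 - 69.62
print(round(distance), round(latitude_difference, 2))
```

The result is close to 1500 km, and the latitudinal difference is 1.7°, as stated above.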
For the comparison, afternoon data (1-4 pm) for which the wind speed was above 2 m s⁻¹ were used (gaps in the MPI-BGC wind measurements were filled with Roshydromet 10 m wind speed data). In addition, Ambarchik data were filtered out when the virtual potential temperature increased with height. This filter was omitted for Barrow, because it would have removed most of the data from October to April, including data classified as "background" signals (which occurred throughout the year). Barrow data were filtered according to their quality flag. For CO2, data with quality flags "...", ".D.", ".V." and ".S." were included. For CH4, data with quality flags "..." and ".C." were included. Data with other flags than a "." in the first column were removed as invalid. Other quality flags (differing in the second or third column) were excluded because their number was negligible. We inferred average growth rates and seasonal cycles for the analyzed period based on the curve fitting procedure by Thoning et al. (1989): linear trends and four harmonics representing the seasonal cycles were fitted to the data, and a low-pass filter was applied to the residuals. We emphasize that the purpose of this procedure was not to infer baselines, which would not be suitable for CH4. Instead, the fitted curves were smooth representations of the time series, including regional signals. To minimize the influence of interannual variations on the estimated average growth rates at Ambarchik, they were estimated with additional strict filters for background conditions applied to Ambarchik data (Table 1). Given the short duration of the Ambarchik record, we estimated seasonal cycle amplitude and timing based on the harmonic part of the fit function, which was more robust than including smoothed residuals.
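The first step of the curve fit (a linear trend plus four annual harmonics) is an ordinary linear least-squares problem. The sketch below illustrates it with NumPy on synthetic data; it omits the low-pass filtering of the residuals, and the function names and test signal are hypothetical rather than taken from the actual analysis.

```python
import numpy as np


def design_matrix(t_years, n_harmonics=4):
    """Columns: constant, linear trend, then sin/cos pairs per harmonic."""
    columns = [np.ones_like(t_years), t_years]
    for k in range(1, n_harmonics + 1):
        columns.append(np.sin(2.0 * np.pi * k * t_years))
        columns.append(np.cos(2.0 * np.pi * k * t_years))
    return np.column_stack(columns)


def fit_trend_and_harmonics(t_years, values, n_harmonics=4):
    """Least-squares coefficients; index 1 is the growth rate per year."""
    matrix = design_matrix(t_years, n_harmonics)
    coefficients, *_ = np.linalg.lstsq(matrix, values, rcond=None)
    return coefficients


def seasonal_amplitude(coefficients, n_harmonics=4):
    """Peak-to-trough amplitude of the harmonic part of the fit."""
    t = np.linspace(0.0, 1.0, 3651)
    harmonic = design_matrix(t, n_harmonics)[:, 2:] @ coefficients[2:]
    return harmonic.max() - harmonic.min()


# Synthetic record: 2.4 years, growth of 2.8 ppm/yr, 16 ppm seasonal cycle
t = np.linspace(0.0, 2.4, 500)
y = 400.0 + 2.8 * t + 8.0 * np.cos(2.0 * np.pi * t)
coeffs = fit_trend_and_harmonics(t, y)
amplitude = seasonal_amplitude(coeffs)
```

For a record of less than three years, trend and seasonal terms are only weakly separated, which is one reason why the growth rate uncertainties discussed below exceed the purely statistical ones.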

Carbon dioxide
In spring, CO2 mole fractions observed at Ambarchik closely tracked those measured at Barrow (Fig. 8), which was likely due to the absence of local to regional sources and sinks during this period. In summer, Ambarchik recorded a stronger seasonal drawdown of CO2 mole fractions compared to Barrow, leading to a lower minimum value that occurred 12 days earlier. In fall, CO2 rose faster at Ambarchik, reaching the midpoint between minimum and maximum 21 days earlier compared to Barrow. The mole fraction maxima in winter were at similar values. Carbon dioxide mole fractions at Ambarchik were more variable than at Barrow in summer and fall, which indicates stronger local and regional sources and sinks captured by the Ambarchik tower. The annual amplitude of CO2 was slightly larger at Ambarchik (20 ppm vs. 18 ppm) because of the lower summer minimum. The average growth rates were (2.77 ± 0.09) and (2.82 ± 0.05) ppm CO2 yr⁻¹ at Ambarchik and Barrow, respectively. Note that despite the good agreement of these growth rates, their uncertainties are larger than the statistical uncertainties given here, since the estimates depended on data selection and were based on less than three years of data. We note that in November and December 2016, exceptionally high CO2 mole fractions were measured at Ambarchik. However, analysis of individual signals is beyond the scope of this paper.

Methane
Similar to CO2 mole fractions, in spring, CH4 mole fractions at Ambarchik matched those at Barrow and had low variability (Fig. 8). Throughout the rest of the year, CH4 mole fractions at Ambarchik were higher and more variable than at Barrow, which is reflected by the larger annual amplitude of 72 ppb at Ambarchik, compared to 47 ppb at Barrow. The summer minimum of the harmonics occurred 70 days earlier at Ambarchik. By contrast, the minimum of the visual baseline of hourly data occurred much later, and was close in values and timing to the Barrow measurements (Fig. 9). This discrepancy was due to the fact that the harmonics fitted to Ambarchik CH4 data were influenced by large positive CH4 enhancements starting in early summer, which are likely caused by strong regional sources. Such CH4 enhancement events were also recorded throughout most of the winters. Estimated average growth rates of CH4 were (6.4 ± 1.0) ppb yr⁻¹ at Ambarchik and (10.0 ± 0.7) ppb yr⁻¹ at Barrow. Note that, as for CO2, the true uncertainties of these growth rates are larger than the statistical uncertainties given here, since the estimates depended on the data selection.

Angular distribution of regional CO 2 and CH 4 anomalies
Ambarchik is located at a junction of several different ecoregions, and in particular at the coast of the East Siberian Sea. Therefore, the dependence of CO 2 and CH 4 signals on wind direction could provide insights into CO 2 and CH 4 exchange between these different regions and the atmosphere. We examined this dependence based on CO 2 and CH 4 anomalies representative of fluxes inside the domain introduced in Sect. 4 (3200 km × 3200 km, centered on Ambarchik). These anomalies were computed following a standard method in regional inverse modeling of atmospheric tracer transport, i.e. by subtracting the contribution of CO 2 and CH 4 transported into the domain (the background signal) from the observations. The anomalies therefore represent the atmospheric signature of sources and sinks inside the domain. The background signal was computed by sampling global atmospheric CO 2 and CH 4 mole fraction fields at the end points of the backtrajectories introduced in Sect. 4. The global CO 2 fields were based on Rödenbeck (2005, version doi:10.17871/CarboScope-s04_v3.8.), and the CH 4 fields were based on the code by Rödenbeck (2005) modified by T. Nunez-Ramirez (personal communication). Both fields were optimized for station sets that included Ambarchik data. We analyzed the data that passed the filters for low wind speeds and temperature inversions (see Table 1) grouped by season, and focused the interpretation on the signals from the predominant wind directions, since sample sizes from other sectors were small.
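
The anomaly computation and the grouping by wind direction described above can be illustrated with a minimal sketch. The eight 45° sectors, the function names and the numbers are assumptions made for illustration only; they are not the binning or the data used in the analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical 45-degree sectors, named after the direction the wind blows from.
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def wind_sector(direction_deg):
    """Map a wind direction in degrees (0 = north, clockwise) to a sector name."""
    index = int(((direction_deg % 360) + 22.5) // 45) % 8
    return SECTORS[index]


def sector_anomalies(observed, background, wind_dir_deg):
    """Mean regional anomaly (observation minus background) per wind sector."""
    groups = defaultdict(list)
    for obs, bg, wd in zip(observed, background, wind_dir_deg):
        groups[wind_sector(wd)].append(obs - bg)
    return {sector: mean(values) for sector, values in groups.items()}


# Illustrative values only (ppb CH4), not station data.
obs = [1950.0, 1990.0, 1935.0, 1932.0]
bg = [1930.0, 1940.0, 1930.0, 1930.0]
wd = [225.0, 240.0, 45.0, 50.0]
result = sector_anomalies(obs, bg, wd)
```

In this toy example the southwesterly sector carries a much larger mean anomaly than the northeasterly one, which is the kind of contrast examined in the following subsections.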

Carbon dioxide
The most pronounced CO 2 signals from predominant wind directions were positive anomalies during southwesterly winds in fall and winter. During summer, CO 2 anomalies from the predominant wind direction (northeast) were small. During spring, almost no CO 2 anomalies were observed.

Methane
The strongest CH 4 enhancements were observed from westerly winds in summer, and southwesterly winds in fall and winter. The predominant northeasterly winds in summer carried comparatively small CH 4 enhancements. The overall variability of CH 4 was highest in summer and fall, with considerable enhancements especially from the southwest in winter. Like CO 2, CH 4 showed almost no anomalies in spring.

Conclusions

In this paper, we presented the first years (August 2014 to April 2017) of CO 2 and CH 4 measurements from the coastal site Ambarchik in northeast Siberia. The site has been operational without major downtime since its installation. Greenhouse gas measurements are calibrated about every five days using dry air from gas tanks with GHG mole fractions traced to WMO scales. Mole fractions of CO 2 and CH 4 are measured in humid air and corrected for the effects of water vapor using a novel water vapor correction method. An algorithm was developed to remove measurements influenced by local polluters, which affected a small fraction of the measurements. Measurements of the gradient of the virtual potential temperature and the two sampling heights allow for detection of stable stratifications of the atmospheric boundary layer at the station. Uncertainties of the GHG measurements, which were inferred from measurements of dry air from calibrated gas tanks and water correction experiments, were on average 0.085 ppm CO 2 and 0.77 ppb CH 4. We continue to work on improving the accuracy of the calibrations and uncertainty estimates and will adapt them as additional information becomes available (e.g. based on post-deployment calibration of used gas tanks).
A footprint analysis indicates that Ambarchik is sensitive to trace gas emissions from both the East Siberian Sea and terrestrial ecosystems. Both CO 2 and CH 4 anomalies were large during southwesterly and westerly winds and small during northeasterly winds. This suggests that the larger signals originated from terrestrial rather than oceanic fluxes and demonstrates the value of sampling at the location of Ambarchik for distinguishing fluxes from different source regions, and thus for gaining insights into carbon cycle processes in this region. In comparison with Barrow, Alaska, Ambarchik recorded larger CO 2 and CH 4 anomalies, which resulted in larger seasonal cycle amplitudes as well as earlier minima and fall growth. We interpret the stronger CO 2 and CH 4 signals at Ambarchik as reflecting stronger local and regional fluxes compared to those captured at Barrow. Strong CH 4 enhancements were recorded at Ambarchik well into the winter, which is evidence for the relevance of cold season emissions (Kittler et al., 2017b; Mastepanov et al., 2008; Zona et al., 2016). While the average growth rate of CO 2 at Ambarchik matched the one at Barrow, the growth rate of CH 4 at Ambarchik was smaller. We attribute the discrepancy to the short analysis period, which makes the growth rate estimate sensitive to interannual variability and differences in the timing of the annual maximum and minimum.

The accuracy of the CO 2 and CH 4 data obtained at Ambarchik, and their sensitivity to sources and sinks of high-latitude terrestrial and oceanic ecosystems, make the Ambarchik station a highly valuable tool for carbon cycle studies focusing on both terrestrial and oceanic fluxes from Northeast Siberia.

Appendix B Derivation of water correction coefficients
The influence of water vapor on CO 2 and CH 4 measurements was corrected for based on several water correction experiments and a novel water correction model, which we describe in the following paragraphs. For more details, please refer to Reum et al. (2019). As stated in Sect. 3.1, data from gas washing bottle experiments (explanation below) with the CRDS analyzer located in Ambarchik were analyzed in Reum et al. (2019) in the context of the new water correction method (labeled "Picarro #5" therein). Here, we use these data together with data from additional experiments to derive water correction coefficients for the application to the complete Ambarchik record. Experiments were performed with two different humidification methods. For the so-called droplet method, a droplet of de-ionized water (ca. 1 ml) was injected into the dry air stream from a pressurized air tank and measured with the CRDS analyzer. The gradual evaporation of the droplet provided varying water vapor levels. In contrast to the droplet method, the gas washing bottle method was designed to hold water content in the sampled air at stable levels. For this purpose, the air stream from a pressurized tank was humidified by directing it through a gas washing bottle filled with de-ionized water, resulting in an air stream saturated with water vapor. The humid air was mixed with a second, untreated air stream from the same tank. Different water vapor levels were realized by varying the relative flow through the lines using needle valves.
Initial experiments were performed using the droplet method, but systematic biases in the resulting dry air mole fractions at H 2 O < 0.5 % led to further experiments with the gas washing bottle method and the development of an improved water correction model:

$$f_c(h) = \frac{c_{wet}(h)}{c_{dry}} = 1 + a_c h + b_c h^2 + d_c \left( e^{-h/h_p} - 1 \right) \qquad \text{(B.1)}$$

Here, $c_{wet}$ and $c_{dry}$ are the wet and dry air mole fractions of species $c$, $h$ is the water vapor mole fraction, and $f_c^{para}(h) = 1 + a_c h + b_c h^2$ corrects for dilution and pressure broadening (Chen et al., 2010). The parameters $d_c$ and $h_p$ correct for a sensitivity of pressure inside the measurement cavity of Picarro analyzers to water vapor (Reum et al., 2019).

Three droplet experiments were performed in 2014, while one gas washing bottle experiment was performed in each of 2015 and 2017. The droplet results proved unsuitable to derive the pressure-related coefficients $d_c$ and $h_p$ due to fast variations of water vapor, which typically occurred below 0.5 % H 2 O (Reum et al., 2019). Therefore, from the droplet experiments only the data with slowly varying water vapor were used, and $d_c$ and $h_p$ were based only on the gas washing bottle experiments. For each species, a synthesis water correction function was derived by fitting coefficients to the average response of the individual functions (Table B.1).

Table B.1
Species   a_c                     b_c                    d_c                    h_p
CO 2      (-1.2 ± 0.2) × 10⁻²     (-2.7 ± 0.5) × 10⁻⁴    (2.2 ± 1.0) × 10⁻⁴     0.22 ± 0.12
CH 4      (-0.97 ± 0.07) × 10⁻²   (-3.1 ± 1.4) × 10⁻⁴    (1.1 ± 0.7) × 10⁻³     0.22 ± 0.12

Appendix C Calibration scale and coefficients

Appendix D CO 2 spike detection algorithm

The CO 2 spike detection algorithm is a multi-step process. First, candidates for CO 2 spikes are identified. In subsequent steps, false positives are removed. Parts of the algorithm are based on Vickers and Mahrt (1997).

Step 1. Identifying spike candidates based on variation of differences between CO 2 measurements
For this step, data are processed in intervals spanning 1.5 hours. Candidates for CO 2 spikes are identified based on the variability of differences between individual consecutive CO 2 measurements. Measurements with differences that exceed 3.5 standard deviations from non-flagged data are flagged as spike candidates. Since flagging the data changes the standard deviation of the non-flagged data, flagging is repeated until the standard deviation of the non-flagged data changes by less than 10⁻¹⁰ ppm CO 2 between consecutive iterations. In some cases, this procedure flags the complete interval as spikes. This happens when the variations throughout the interval are rather uniform, which can occur both when spikes are present throughout the interval and when there are no spikes at all. To avoid false positives, all flags are then removed, and the interval is considered to have no spikes. Cases with many spikes throughout the interval can be filtered based on the intra-hour variability flag.
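
A minimal sketch of the iterative flagging in step 1 is given below. It assumes that deviations are measured from the mean of the non-flagged differences and that flags are cumulative across iterations; neither detail is specified above, and the function name and test data are hypothetical.

```python
from statistics import mean, stdev


def flag_spike_candidates(co2, n_sigma=3.5, tol=1e-10):
    """Flag differences between consecutive CO2 values (one interval).

    Returns one boolean per difference; entry i refers to the step from
    measurement i to measurement i + 1.
    """
    diffs = [b - a for a, b in zip(co2, co2[1:])]
    flagged = [False] * len(diffs)
    previous_sd = None
    while True:
        kept = [d for d, f in zip(diffs, flagged) if not f]
        if len(kept) < 2:
            # (Almost) the complete interval was flagged: treat as no spikes.
            return [False] * len(diffs)
        sd = stdev(kept)
        if previous_sd is not None and abs(previous_sd - sd) < tol:
            return flagged
        center = mean(kept)  # assumption: deviations are taken from the mean
        flagged = [f or abs(d - center) > n_sigma * sd
                   for d, f in zip(diffs, flagged)]
        previous_sd = sd


# Synthetic interval: small regular variations plus one 2 ppm spike.
co2 = [400.0 + 0.01 * ((i * 7) % 5) for i in range(60)]
co2[30] += 2.0
flags = flag_spike_candidates(co2)
```

Only the two steps into and out of the artificial spike are flagged here; the small regular variations stay below the threshold once the spike no longer inflates the standard deviation.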

Step 2. Blurring
Around the top of a spike, differences between individual CO 2 measurements are often small, and thus these measurements are not captured as part of a spike in step 1. To unite the ascending and descending parts of spikes, the 20 data points before and after a flagged measurement are also flagged. From here on, each group of consecutive flagged measurements is considered a spike candidate.

Step 3. Unflagging individual outliers
Step 1 often identifies individual or very few consecutive data points as spikes, spanning a few seconds. We regard these very small groups of flagged data points as noise misidentified as spikes. After blurring (step 2), these individual outliers form groups of at least 41 data points. In step 3, spike candidates consisting of fewer than 45 data points are unflagged.
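
Steps 2 and 3 can be sketched as two small list operations. The function names are hypothetical; the widths (20 points on either side, minimum group length of 45 points) are those stated above.

```python
def blur_flags(flags, width=20):
    """Step 2: also flag the `width` points before and after each flagged point."""
    n = len(flags)
    blurred = [False] * n
    for i, is_flagged in enumerate(flags):
        if is_flagged:
            for j in range(max(0, i - width), min(n, i + width + 1)):
                blurred[j] = True
    return blurred


def drop_short_groups(flags, min_length=45):
    """Step 3: unflag groups of consecutive flags shorter than `min_length`."""
    result = list(flags)
    start = None
    # A trailing False acts as a sentinel that closes a group at the end.
    for i, is_flagged in enumerate(list(flags) + [False]):
        if is_flagged and start is None:
            start = i
        elif not is_flagged and start is not None:
            if i - start < min_length:
                for j in range(start, i):
                    result[j] = False
            start = None
    return result


# One isolated outlier (index 50) and two nearby flags (indices 100 and 110).
raw = [False] * 200
for index in (50, 100, 110):
    raw[index] = True
candidates = drop_short_groups(blur_flags(raw))
```

The isolated outlier grows to 41 points after blurring and is removed, whereas the two nearby flags merge into one group of 51 points that remains a spike candidate.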

Step 4. Baseline, detrending
For each spike candidate, the baseline is identified as a linear fit to the unflagged measurements within five minutes of any data point of the spike candidate. Using this baseline, the data in this interval are detrended, including the spike candidate.

Step 5. Spike height
From the detrended data from step 4, the maximum deviation from the baseline ("spike height") is calculated. Spike candidates smaller than 8 standard deviations of the baseline measurements are unflagged.
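
Steps 4 and 5 can be sketched as follows for a single spike candidate. This is an illustration under assumptions: times are in seconds, the candidate spans the indices `start` to `stop - 1`, and the function names and synthetic data are hypothetical.

```python
from statistics import mean, stdev


def fit_line(t, y):
    """Ordinary least-squares line through (t, y); returns (slope, intercept)."""
    t_mean, y_mean = mean(t), mean(y)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def spike_height_test(t, co2, flags, start, stop, window_s=300.0, n_sigma=8.0):
    """Return (spike height, keep flag) for the candidate at indices start..stop-1."""
    t_lo, t_hi = t[start] - window_s, t[stop - 1] + window_s
    # Baseline: unflagged points within five minutes of the candidate.
    base = [i for i in range(len(t)) if t_lo <= t[i] <= t_hi and not flags[i]]
    slope, intercept = fit_line([t[i] for i in base], [co2[i] for i in base])
    detrended = [c - (slope * ti + intercept) for ti, c in zip(t, co2)]
    baseline_sd = stdev([detrended[i] for i in base])
    height = max(detrended[start:stop])
    return height, height >= n_sigma * baseline_sd


# Synthetic 1 Hz record: linear trend, small variations, 1 ppm spike.
t = [float(i) for i in range(1000)]
co2 = [400.0 + 0.001 * ti + 0.01 * ((i * 7) % 5 - 2) for i, ti in enumerate(t)]
flags = [480 <= i < 525 for i in range(1000)]
flat_height, flat_keep = spike_height_test(t, co2, flags, 480, 525)
for i in range(480, 525):
    co2[i] += 1.0
height, keep = spike_height_test(t, co2, flags, 480, 525)
```

The candidate without an added spike falls below the threshold of 8 baseline standard deviations and would be unflagged, while the 1 ppm spike is retained.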

Step 6. Unflagging abrupt but persistent changes
Up to the previous step, the algorithm flags abrupt CO 2 changes even if they are persistent. This pattern occurs for example during changes of wind direction and does not constitute an isolated spike. In this case, a trough is present in the detrended spike. The minimum deviation from the baseline is calculated ("trough depth") and compared to the spike height. Since spike height and trough depth can be based on few data points, the influence of noise is strong. To counteract this, spike height and trough depth are diminished by two standard deviations of the baseline. Spike candidates with trough depths greater than one fifth of the spike height are unflagged.

Step 7. Unflagging persistent variability changes
The procedure so far can flag the beginning or end of longer periods of larger CO 2 variability. To unflag these false positives, steps 4-5 are applied again with the following changes: (1) a longer baseline of 30 minutes before and after the spike candidate (instead of five minutes) is used, (2) baseline standard deviations are calculated separately for the period before and after the spike candidate, (3) the spike height from step 5 is used rather than being recalculated, and (4) the spike height must exceed the maximum of the two baseline standard deviations by a factor of 6 instead of 8.

Step 8. Repeat
The result from steps 4-7 depends on unflagged data points surrounding a spike candidate. Therefore, these steps are repeated until a steady state is reached.

An example of flagged spikes is shown in Fig. D.1. In this example, removing flagged data reduced the hourly averages of Center inlet data between 3 and 4 a.m. by 0.5 ppm (CO 2) and 7.0 ppb (CH 4). No Top inlet data were flagged in this period. Since small spikes can be hard to distinguish from natural signals, some smaller features that may be classified as spikes upon visual inspection can pass the algorithm without being flagged, e.g. at 5:33 a.m. in Fig. D.1. However, given that larger spikes alter hourly averages by values on the order of magnitude of the WMO goals, the impact of these features is likely negligible. In this particular example, removing the detected spikes reduced average CO 2 mole fractions between 5 and 6 a.m. from the Center inlet by 0.07 ppm. Removing the unflagged small spike at 5:33 a.m. would further reduce this average by 0.005 ppm, which is inconsequential.

Appendix E Measurement uncertainties
We adopted the uncertainty quantification method of Andrews et al. (2014). Here, we summarize the main ideas of this approach and the modifications we made, and quantify individual uncertainty components. A detailed description of the nomenclature and method is omitted; please refer to Andrews et al. (2014).

E.1 Uncertainty estimation framework by Andrews et al. (2014) and modifications

Andrews et al. (2014) calculated the measurement uncertainty as the largest of four different formulations (Eq. (9a-d) therein). Formulations (a) and (b) were the prediction interval of the linear regression of the calibration tanks, which takes into account the standard error of the fit ($se_{fit}$) and the uncertainty in the analyzer signal. The difference between (a) and (b) was the estimate of the uncertainty in the analyzer signal. In formulation (a), it was estimated from a model ($\sigma_u$) that accounts for analyzer precision ($u_p$) and drift ($u_b$), uncertainty of the water vapor correction ($u_{wv}$), equilibration after switching calibration tanks ($u_{eq}$) and extrapolation beyond the range covered by the calibration tanks ($u_{ex}$). In formulation (b), the uncertainty of the analyzer signal was estimated from the residuals of the linear fits of the calibration tank mole fractions ($\sigma_y$), accounting for the fact that the assigned values of the calibration tanks have non-zero uncertainty ($\sigma_x$):

Here, $m$ is the slope of the calibration function. Formulation (c) was the bias of the Target tank ($u_{TGT}$), and formulation (d) the uncertainty in the assigned values of the calibration tanks ($\sigma_x$). In this approach, uncertainty formulations (b), (c) and (d) only accounted for uncertainties of dry air measurements. Hence, we modified it by adding the uncertainty of the water correction to these formulations. Thus, the analyzer precision model for uncertainty formulation (a) became:

The full uncertainty terms were thus:

Here, $z_{\alpha,f}$ is a factor based on the quantile function of Student's t distribution with confidence level $\alpha$ ($\alpha$ = 0.675 for prediction interval at 1σ-level) and degrees of freedom $f$. Calibration uncertainties were estimated based on the averaging strategy for coefficients, i.e., using linear fits of weighted observations from individual calibration episodes over a window of 120 days (Sect. 3.2), which usually contained about 25 calibration episodes. The standard error of the fit ($se_{fit}$) was computed based on these weighted fits. In the notation of Andrews et al. (2014), the equations for $se_{fit}$ become (cf. Taylor, 1997):

Here, all quantities are as in Andrews et al. (2014), with the addition of weights $w_i$ and degrees of freedom $df$, which change with the number of calibration episodes in an interval. Compared to calibrating based on single calibration episodes, this affected the uncertainty because of the larger number of observations (reduction of $se_{fit}$ and $z_{\alpha,f}$), and because of drift of the analyzer signal over the averaging window (increase of $se_{fit}$ and $\sigma_u'$).

E.2 Uncertainty components and estimates
In the following paragraphs, the individual components of the four uncertainty estimates Eq. (E.3)-(E.6) are described. For numerical values of the components, see

Water vapor ($u_{wv}$)
For the water correction uncertainty $u_{wv}$, we used the maximum of the difference between individual water correction functions and the synthesis water correction function, i.e. 0.018 % CO 2 and 0.034 % CH 4, regardless of actual water content. This approach likely overestimates $u_{wv}$ at low water vapor content, but was chosen because $u_{wv}$ was not well constrained by the small number of water correction experiments conducted so far.
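
To make the role of the synthesis water correction function more concrete, the sketch below evaluates it with the coefficients of Table B.1 and applies it to a wet air reading. The functional form (a quadratic term plus an exponential pressure term, following Reum et al., 2019) and the unit of the water vapor mole fraction (% H 2 O) are assumptions here, as are all names and the example reading.

```python
import math

# Synthesis coefficients from Table B.1 (central values only).
COEFFS = {
    "CO2": {"a": -1.2e-2, "b": -2.7e-4, "d": 2.2e-4, "h_p": 0.22},
    "CH4": {"a": -0.97e-2, "b": -3.1e-4, "d": 1.1e-3, "h_p": 0.22},
}


def water_correction_factor(h, species):
    """Assumed form: f(h) = 1 + a*h + b*h^2 + d*(exp(-h/h_p) - 1), h in % H2O."""
    c = COEFFS[species]
    pressure_term = c["d"] * (math.exp(-h / c["h_p"]) - 1.0)
    return 1.0 + c["a"] * h + c["b"] * h * h + pressure_term


def dry_mole_fraction(wet, h, species):
    """Convert a wet air mole fraction to dry air by dividing by f(h)."""
    return wet / water_correction_factor(h, species)


factor_co2 = water_correction_factor(1.0, "CO2")
dry_co2 = dry_mole_fraction(395.0, 1.0, "CO2")  # illustrative wet reading in ppm
u_wv_co2 = 0.018 / 100.0 * dry_co2  # water correction uncertainty stated above
```

With these assumptions, a relative uncertainty of 0.018 % corresponds to roughly 0.07 ppm at a dry air mole fraction of about 400 ppm CO 2.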

Assigned values of calibration gas tanks ($\sigma_x$)
For the uncertainty of the assigned values of the calibration gas tanks $\sigma_x$, we followed the approach by Andrews et al. (2014), who set them to the reproducibility of the primary scales WMO X2007 (CO 2) and WMO X2004 (CH 4). Estimates based on the MPI-BGC implementations of the primary scales yielded smaller uncertainties that underestimated the mismatch between the CO 2 mole fractions of the calibration tanks.

Target tank ($u_{TGT}$)
The uncertainty based on the Target tank measurements $u_{TGT}$ was the same as in Andrews et al. (2014), but with the weighting and window we used for smoothing the calibration coefficients.

Analyzer signal precision model ($\sigma_u$)
For the analyzer signal precision model $\sigma_u$, analyzer precision ($u_p$) and drift ($u_b$) were estimated jointly ($\sqrt{u_p^2 + u_b^2}$) as the standard deviation of hourly averages of a gas tank measurement over 12 days prior to field deployment. Note that $se_{fit}$ also accounts for drift of the analyzer signal. However, the contribution of drift on timescales significantly shorter than the averaging window of 120 days to $se_{fit}$ tends toward zero. Since the estimate of $u_b$ was based on 12 days of measurements, it represents drift over this shorter time scale in the prediction interval, which is why it was included in the model. The other components ($u_{eq}$, $u_{ex}$) appeared negligible. In particular, we found no conclusive evidence of non-negligible equilibration errors ($u_{eq}$) in our calibrations; however, this remains subject of future research (Appendix E.4). The extrapolation uncertainty ($u_{ex}$) applied only to a small fraction of Ambarchik data, so we ignored this error.

E.3 Random and systematic uncertainty components
The uncertainty components described in Sect. E.1 and E.2 are mostly independent of the averaging period for which atmospheric data are reported (one hour). Rather, they describe systematic uncertainties inherent to the calibration procedure and long-term drift ($\sigma_x$, $se_{fit}$, $u_{TGT}$), and the water correction ($u_{wv}$). Thus, these uncertainty estimates would not be smaller for atmospheric data averaged over longer periods. Exceptions are the analyzer signal precision estimates $\sigma_u$ and $\sigma_u'$, which contain random uncertainties: the precision model $\sigma_u$ was estimated based on hourly averages and reflects both their uncertainty and drift on the timescale of 12 days. Thus, it might change for different averaging periods. The analyzer signal uncertainty estimate $\sigma_u'$ was sensitive to several timescales, i.e., two minutes (averaging period of calibration data), 22 minutes (timespan of data of one calibration episode), 116 hours (time between individual calibration episodes) and 120 days (averaging window for calibration coefficients). To investigate whether uncertainties at these timescales were similar to those of the hourly averages of atmospheric data, we computed the Allan deviations for CO 2 and CH 4. The uncertainties of averages over two minutes, 22 minutes and one hour were close (Fig. E.2). In addition, the analyzer precision deteriorated beyond one hour. These results are similar (qualitatively and quantitatively) to those documented by Yver Kwok et al. (2015) for several Picarro GHG analyzers. The analyzer signal precision estimates accounted for only a small fraction of the total uncertainty (Table E.1). Thus, the random uncertainty components play a minor role in the calibration of Ambarchik data, and averaging atmospheric data over different periods would not change the total estimated uncertainty considerably.
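
For readers unfamiliar with the Allan deviation, the following sketch computes its simple non-overlapping form for a given averaging block length. Whether the analysis used this or an overlapping estimator is not stated above, so the choice is an assumption; the function name and the test series are hypothetical.

```python
from math import sqrt
from statistics import mean


def allan_deviation(values, m):
    """Non-overlapping Allan deviation for averaging blocks of m samples."""
    n_blocks = len(values) // m
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    block_means = [mean(values[k * m:(k + 1) * m]) for k in range(n_blocks)]
    squared_steps = [(b - a) ** 2 for a, b in zip(block_means, block_means[1:])]
    return sqrt(0.5 * mean(squared_steps))


# Alternating series: large deviation for m = 1, none once pairs are averaged.
series = [0.0, 1.0] * 50
adev_1 = allan_deviation(series, 1)
adev_2 = allan_deviation(series, 2)
```

Evaluating such a function for block lengths corresponding to two minutes, 22 minutes and one hour gives the kind of comparison referred to above.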

E.4 Potential improvements of the calibration accuracy
Several aspects of the accuracy of the calibration using regular gas tank measurements are subject to future research. Here, we outline potential calibration errors that could not be conclusively quantified, and how we plan to address them in the future.

To investigate whether the regular probing time of the gas tanks was sufficient for equilibration (e.g. due to flushing of the tubing), we fitted exponential functions to the medians of the regular tank measurements. Deviations between modeled equilibrium mole fractions and the averages used for calibration were negligible (|ΔCO 2| < 0.008 ppm; |ΔCH 4| < 0.09 ppb) and thus ignored. Furthermore, in two experiments, we investigated equilibration error and other drifts (e.g. diffusion in the pressure reducers) by measuring the calibration tanks in reversed order, and in original order for up to two hours. However, the experiments were inconclusive. Based on the available data, we estimated the largest conceivable biases for the ranges 350-450 ppm CO 2 and 1800-2400 ppb CH 4. They were up to 0.06 ppm CO 2 and 0.5 ppb CH 4 at the edges of these ranges and vanished around their centers. An additional source of bias might be inlet pressure sensitivity of the Picarro analyzer as documented by Gomez-Pelaez et al. (2019). Using the sensitivities reported therein, some of the gas tank measurements in Ambarchik could have a bias of up to 0.03 ppm CO 2 and 0.2 ppb CH 4. More experiments are necessary to rule out or confirm and assess these possible biases; hence, no bias correction was implemented.
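
The equilibration check described above (an exponential fit to tank measurements, compared with the average used for calibration) can be sketched as follows. The single-exponential model, the scan over candidate time constants and all names and numbers are assumptions for illustration, not the fitting procedure actually used.

```python
import math


def fit_equilibration(t, c, taus):
    """Fit c(t) = c_eq + amp * exp(-t / tau) by scanning candidate tau values.

    For a fixed tau the model is linear in c_eq and amp, so both follow from
    ordinary least squares; the tau with the smallest squared residuals wins.
    """
    best = None
    n = len(t)
    for tau in taus:
        x = [math.exp(-ti / tau) for ti in t]
        x_mean, c_mean = sum(x) / n, sum(c) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (ci - c_mean) for xi, ci in zip(x, c))
        amp = sxy / sxx
        c_eq = c_mean - amp * x_mean
        sse = sum((ci - (c_eq + amp * xi)) ** 2 for xi, ci in zip(x, c))
        if best is None or sse < best[0]:
            best = (sse, c_eq, amp, tau)
    return best[1], best[2], best[3]


# Synthetic tank measurement (ppm CO2) approaching 400 ppm, sampled every 10 s.
t = [10.0 * k for k in range(61)]
c = [400.0 + 0.5 * math.exp(-ti / 60.0) for ti in t]
c_eq, amp, tau = fit_equilibration(t, c, taus=[20.0, 40.0, 60.0, 80.0, 120.0])

# Deviation between the average of the last two minutes and the fitted equilibrium.
bias = sum(c[-12:]) / 12 - c_eq
```

In this synthetic case the remaining deviation is well below 0.001 ppm, i.e. the averaging window lies in the equilibrated part of the measurement.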
The CO 2 bias of the water-corrected Target tank mole fractions varied from -0.06 to -0.01 ppm (Fig. 5, left). These variations correlated with residual water vapor (which was much smaller than 0.01 %) and temperature in the laboratory during the Target tank measurements, as well as with ambient CO 2 mole fractions sampled before. This suggests that the variations may be due to insufficient flushing during calibration. However, the correlations varied over time without changes to the hardware or probing strategy. Therefore, further investigation of this observation is required, and no correction was implemented.
So far, possible drifts of the gas tanks could not be assessed and have thus not been included in our uncertainty assessment. This will be assessed only when the gas tanks are almost empty, and shipped back to the MPI-BGC for recalibration.

Data availability
Quality-controlled hourly averages of data from Ambarchik are available on request from Mathias Göckede. We plan to publish continuous updates to the data to an open access repository in the future. Data from Barrow are available at https://www.esrl.noaa.gov/gmd/dv/data/.

Author contributions
MH, SZ and MG conceptualized the study. JL, MH, OK, NZ, FR and MG designed and set up the Ambarchik station. NZ and SZ coordinated setup and maintenance of the Ambarchik station. FR and MP performed calibration experiments. FR curated and analyzed the data. FR prepared the manuscript with contributions from all authors. MG supervised the project, and reviewed and edited the manuscript.