Intercomparison of O2/N2 Ratio Scales Among AIST, NIES, TU, and SIO Based on Round-robin Exercise Using Gravimetric Standard Mixtures

A study was conducted to compare the δ(O2/N2) scales used by four laboratories engaged in atmospheric δ(O2/N2) measurements: the Research Institute for Environmental Management Technology, Advanced Industrial Science and Technology (EMRI/AIST), the National Institute for Environmental Studies (NIES), Tohoku University (TU), and Scripps Institution of Oceanography (SIO). For this purpose, five high-precision standard mixtures for O2 molar fraction, gravimetrically prepared by the National Metrology Institute of Japan, AIST (NMIJ/AIST) with a standard uncertainty of less than 5 per meg, were used as round-robin standard mixtures. EMRI/AIST, NIES, TU, and SIO reported the analysed values of the standard mixtures on their own δ(O2/N2) scales, and the values were compared with the δ(O2/N2) values gravimetrically determined by NMIJ/AIST.

They extracted solubility-driven components of the atmospheric potential oxygen (APO = O2 + 1.1 × CO2) (Stephens et al., 1998) by combining their observational results with climate and ocean models. The change in global ocean heat content (OHC) is a fundamental measure of global warming. Indeed, the ocean takes up more than 90% of the Earth's excess energy, as evaluated from ocean temperature measurements using Argo floats (e.g., Levitus et al., 2012). Thus, atmospheric O2 measurements are linked to both the global CO2 budget and OHC.
The approaches described above rely on precise measurements that can detect changes at the micromole-per-mole level in the atmospheric O2 molar fraction (~21%). After Keeling and Shertz (1992) developed a measurement technique based on interferometry, various techniques have been developed to quantify the atmospheric O2 molar fraction, including mass spectrometry (Bender et al., 1994; Ishidoya et al., 2003; Ishidoya and Murayama, 2014), a paramagnetic technique (Manning et al., 1999; Ishidoya et al., 2017; Aoki and Shimosaka, 2018), a vacuum-ultraviolet absorption technique (Stephens et al., 2003), gas chromatography (Tohjima, 2000), a method using fuel cells (Stephens et al., 2007; Goto et al., 2013), and a cavity ring-down spectroscopy analyser (Berhanu et al., 2019). By convention, all programs report changes in O2 in terms of the equivalent changes in the O2/N2 ratio, expressed as the relative deviation from an arbitrary reference (Keeling and Shertz, 1992; Keeling et al., 2004) in per meg (one per meg is equal to 1 × 10⁻⁶):
$$\delta(\mathrm{O_2/N_2}) = \frac{[n(\mathrm{O_2})/n(\mathrm{N_2})]_{\mathrm{sam}}}{[n(\mathrm{O_2})/n(\mathrm{N_2})]_{\mathrm{ref}}} - 1 \qquad (1)$$
In Eq. (1), n denotes the molar amount of each substance, and the subscripts sam and ref represent sample and reference air, respectively. The δ(O2/N2) value multiplied by 10⁶ is expressed in per meg. The O2 molar fraction in air in 2015 was 209339.1 ± 1.1 μmol mol⁻¹ (Aoki et al., 2019). Therefore, adding 1 μmol of O2 to a mole of dry air increases δ(O2/N2) by 4.8 per meg.
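The 4.8 per meg figure follows from the definition of δ(O2/N2): adding O2 leaves the N2 amount unchanged, so the O2/N2 ratio rises by the relative increase in the O2 amount. A minimal Python sketch of this arithmetic, using the O2 molar fraction quoted above (the function and variable names are illustrative, not taken from the study):

```python
# O2 molar fraction of dry air in 2015 (Aoki et al., 2019), in mol/mol
X_O2 = 209339.1e-6


def delta_o2n2_per_meg(added_o2_umol, x_o2=X_O2):
    """Change in delta(O2/N2), in per meg, when `added_o2_umol` micromoles
    of O2 are added to one mole of dry air.

    N2 is unchanged, so the relative change in the O2/N2 ratio equals the
    relative change in the O2 amount.
    """
    added_mol = added_o2_umol * 1e-6
    return (added_mol / x_o2) * 1e6


print(round(delta_o2n2_per_meg(1.0), 1))  # 4.8
```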
Each laboratory has typically employed its own O2/N2 reference based on natural air compressed and stored in high-pressure cylinders. Each laboratory has also assumed responsibility for calibrating the relationship between the measured instrument response and the reported change in per meg units (span sensitivity). Therefore, the reported trends in O2/N2 are potentially biased by any long-term drift in the O2/N2 ratio of the reference cylinders (zero drift) or by errors in the calibrated span sensitivity of the instrument (span error). Note that a span stability below 5 per meg is required for global CO2 budget analyses based on δ(O2/N2) observations (Table 2 in Keeling et al., 1993). Challenges in achieving this stability include fractionation of O2 and N2 induced by pressure, temperature, and water vapour gradients (Keeling et al., 2007), adsorption and desorption of the constituents on the cylinder's inner surface (Leuenberger et al., 2015), and permeation or leakage of the constituents through the valve (Sturm et al., 2004; Keeling et al., 2007). To resolve these problems, Tohjima et al. (2005) developed high-precision O2 standard mixtures by gravimetrically mixing pure N2, O2, Ar, and CO2, achieving an uncertainty of 2.9 μmol mol⁻¹ for the O2 molar fraction in absolute terms (equivalent to 15.5 per meg for δ(O2/N2)). Their study was a significant advance, but the uncertainty remained larger than the level recommended by Keeling et al. (1993), as mentioned above.
Recently, a technique was developed for preparing high-precision primary standard mixtures with standard uncertainties of less than 5 per meg for δ(O2/N2) at the National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology (NMIJ/AIST) (Aoki et al., 2019). These high-precision standard mixtures allow us to evaluate the span offset accurately and precisely. The absolute drift of the scale zero offset can also be evaluated accurately and precisely by periodically comparing laboratory reference air with high-precision standard mixtures prepared anew for each comparison.
In this study, we conducted intercomparison experiments to compare the span sensitivities of the O2/N2 scales of the Research Institute for Environmental Management Technology, Advanced Industrial Science and Technology (EMRI/AIST), the National Institute for Environmental Studies (NIES), Tohoku University (TU), and Scripps Institution of Oceanography (SIO), based on a round-robin exercise in which the laboratories measured the developed high-precision standard mixtures in turn. A regression analysis was then applied to the intercomparison results to investigate the relationship between the individual laboratory O2/N2 scales. The results showed a slight but significant difference in the span sensitivities of the individual scales. Finally, we compare the atmospheric δ(O2/N2) values observed on the EMRI/AIST scale with those on the NIES scale for air samples collected at Hateruma Island (HAT; 24°03'N, 123°49'E), Japan, using the relationship between the individual laboratory scales obtained in this study.

NMIJ/AIST Scale and Round-robin Standard Mixtures
In this study, five high-precision standard mixtures with standard uncertainties of less than 5 per meg for δ(O2/N2) were used as round-robin standard mixtures. NMIJ/AIST had previously prepared them gravimetrically following ISO 6142-1:2015 (Aoki et al., 2019). The details of the gravimetric preparation technique are given in previous papers (Aoki et al., 2019; Matsumoto et al., 2004, 2008). The mixtures were contained in 10 L aluminium-alloy cylinders (Luxfer Gas Cylinders, UK) with a brass diaphragm valve (Hamai Industries Limited, Japan). Table 1 shows the gravimetrically determined molar fractions of N2, O2, Ar, and CO2, as well as δ(O2/N2), in the round-robin standard mixtures. The gravimetric values of the N2, O2, Ar, and CO2 molar fractions were recalculated based on an updated expansion rate of the cylinders, which is used to correct for the buoyancy acting on a cylinder. The previous expansion rate (2.2 ± 0.2 mL MPa⁻¹) had been provided by the cylinder supplier. The updated rate, 1.62 ± 0.06 mL MPa⁻¹ (unpublished data), was determined by measuring the change in water volume as the internal pressure of cylinders submerged in water was released.
The source gases were pure CO2 (>99.998%, Nippon Ekitan Corp., Japan), pure Ar (99.9999%, G1-grade, Japan Fine Products, Japan), pure O2 (99.99995%, G1-grade, Japan Fine Products, Japan), and pure N2 (99.99995%, G1-grade, Japan Fine Products, Japan). Impurities in the source gases were identified and quantified via several techniques. Gas chromatography (GC) with a thermal conductivity detector (GC/TCD) was used to analyse N2, O2, CH4, and H2 in pure CO2. O2 and Ar in pure N2, and N2 in pure O2, were analysed using GC equipped with a mass spectrometer. A Fourier-transform infrared spectrometer was used to detect CO2, CH4, and CO in pure N2, O2, and Ar. A galvanic cell O2 analyser was used to quantify O2 in pure Ar. A capacitance-type moisture sensor measured H2O in pure CO2, and a cavity ring-down moisture analyser measured H2O in pure N2, O2, and Ar.
In this study, the absolute O2/N2 scale determined using the gravimetric values of the round-robin standard mixtures is hereafter called the NMIJ/AIST scale. The NMIJ/AIST scale is presented only for scientific research and is not certified by NMIJ.
The δ(O2/N2)NMIJ/AIST values of the round-robin standard mixtures ranged from −3600 to 2900 per meg. This range is larger than the variation in air, but it allows the differences between the individual span sensitivities to be evaluated accurately. The standard uncertainties of the δ(O2/N2)NMIJ/AIST values ranged from 3.3 to 4.0 per meg.

Procedure of Intercomparison
EMRI/AIST, NIES, TU, and SIO conducted the intercomparison experiment. Each laboratory analysed air delivered from the cylinders after laying them horizontally for more than five days following transport, to avoid changes in the δ(O2/N2) values of the standard mixtures caused by thermal diffusion and gravitational fractionation. The δ(O2/N2)round-robin values determined by the individual laboratories using their own methods were compared with the δ(O2/N2)NMIJ/AIST values. EMRI/AIST and TU used mass spectrometry, NIES used GC, and SIO used the interferometric method, as summarised in Table 2. The stability of the O2/N2 ratios in the round-robin standard mixtures during the intercomparison experiment was evaluated by measuring their δ(O2/N2)round-robin values using a mass spectrometer (Delta-V, Thermo Fisher Scientific Inc., USA) (Ishidoya and Murayama, 2014) at EMRI/AIST.
Ar molar fractions in the round-robin standard mixtures ranged from 9297 to 9351 μmol mol⁻¹, a much larger spread than the variations in tropospheric air (less than 1 μmol mol⁻¹) (Keeling et al., 2004). The isotopic ratios δ(¹⁷O/¹⁶O), δ(¹⁸O/¹⁶O), and δ(¹⁵N/¹⁴N) in the round-robin standard mixtures were determined by the mass spectrometer at EMRI/AIST to be 4.7‰, 9‰, and 2.4‰, respectively, all larger than the atmospheric values. The atmospheric value is used as the primary standard (De Laeter et al., 2003; Wieser and Berglund, 2009). [...]
Values of δ(O2/N2) in sample air have generally been determined on the assumption that the Ar molar fractions and the isotopic ratios of N2 and O2 in reference air and sample air are identical. However, the round-robin standard mixtures differed from reference air in both the Ar molar fraction and the isotopic ratios. We therefore corrected the δ(O2/N2)round-robin values measured by the individual laboratories for the deviations of the Ar molar fraction and the isotopic ratios in the round-robin standard mixtures from the atmospheric level. The δ(O2/N2)round-robin values reported by EMRI/AIST and TU were corrected for the deviation of the isotopic ratios from the atmospheric level using the isotopic ratios of N2 and O2 measured simultaneously at EMRI/AIST. This is because EMRI/AIST and TU measured δ(¹⁶O¹⁶O/¹⁴N¹⁴N) and δ(¹⁶O¹⁶O/¹⁵N¹⁴N), respectively. NIES corrected δ(O2/N2)round-robin using the difference of the Ar molar fraction from its atmospheric level, since the O2 peak obtained in GC includes the Ar peak. SIO also corrected δ(O2/N2)round-robin using the difference of the Ar molar fraction from its atmospheric level, since SIO measured only O2 molar fractions. The measurement techniques and calculation procedures for the δ(O2/N2)round-robin values of the individual laboratories are detailed in the following sections.

EMRI/AIST
The δ(O2/N2)round-robin values for EMRI/AIST were calculated from the δ(¹⁶O¹⁶O/¹⁴N¹⁴N)round-robin values measured using the mass spectrometer. These values were determined against the reference air on the EMRI/AIST scale, which is natural air stored in a 48 L aluminium cylinder with a diaphragm valve (G-55, Hamai Industries Limited, Japan). The long-term stability of the EMRI/AIST scale is described in Section 3.1. Details of the measurement technique are given in Ishidoya and Murayama (2014). [...] can be taken as globally constant because atmospheric mixing is very rapid compared to the processes altering oxygen isotopic composition (Junk and Svec, 1958; Baertschi, 1976; Li et al., 1988; Barkan and Luz, 2005).

NIES
NIES reported the δ(O2/N2)round-robin values based on the δ{(O2+Ar)/N2}round-robin values measured using a GC/TCD (Tohjima, 2000). The δ{(O2+Ar)/N2}round-robin values were determined against the reference air on the NIES scale, which is natural air stored in a 48 L aluminium cylinder. A column separates (O2 + Ar) from N2 in the air sample, and a TCD detects the individual peaks. The reference and sample air were measured repeatedly using the GC/TCD, and the δ{(O2+Ar)/N2}round-robin values were calculated from the ratios of the (O2 + Ar) peak area to the N2 peak area using Eq. (9). The δ(O2/N2)round-robin value is given by Eq. (10).
In Eq. (10), the coefficient a is defined by a = k(Ar/O2)ref, where k is the TCD sensitivity of Ar relative to O2. The value of k was evaluated as 1.13 by comparing gravimetric mixtures of O2 + N2 and Ar + O2 + N2 (Tohjima et al., 2005). Because natural air is used as the reference air, the value of a is 0.050 (Ar = 0.93% and O2 = 20.94%). For NIES, the δ(Ar/N2)round-robin value was calculated using the gravimetric values of N2 and Ar in the round-robin standard mixtures.
The NIES O2/N2 scale is related to a set of 11 primary reference air cylinders. The long-term stability of the NIES O2/N2 scale has been maintained within ±0.45 per meg yr⁻¹ with respect to these cylinders by analysing the relative differences in the O2/N2 ratios of the primary and working reference air (Tohjima et al., 2019). Details of the analytical methods and the NIES O2/N2 scale are given in Tohjima et al. (2005, 2008).
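Eq. (10) is not reproduced in this text. The following Python sketch shows one way the Ar correction could be written, assuming the TCD peak responds to O2 + k·Ar so that δ{(O2+Ar)/N2} = [δ(O2/N2) + a·δ(Ar/N2)]/(1 + a). That relation, the function name, and the example inputs are assumptions for illustration; only k, the reference-air molar fractions, and the definition of a come from the text.

```python
K_TCD = 1.13     # TCD sensitivity of Ar relative to O2 (Tohjima et al., 2005)
AR_REF = 0.93    # Ar molar fraction of natural reference air, in %
O2_REF = 20.94   # O2 molar fraction of natural reference air, in %

# a = k * (Ar/O2)_ref, as defined in the text
a = K_TCD * AR_REF / O2_REF
print(round(a, 3))  # 0.05


def delta_o2n2_from_gc(delta_o2ar_n2, delta_ar_n2, coeff=a):
    """Recover delta(O2/N2) from the measured delta{(O2+Ar)/N2}.

    Assumed model: delta{(O2+Ar)/N2} = (dO2 + coeff * dAr) / (1 + coeff),
    solved here for dO2. All deltas share one unit (e.g. per meg).
    """
    return (1.0 + coeff) * delta_o2ar_n2 - coeff * delta_ar_n2


# Hypothetical inputs, in per meg, chosen only to exercise the function
print(delta_o2n2_from_gc(delta_o2ar_n2=250.0, delta_ar_n2=-4000.0))
```

When δ(Ar/N2) is zero, as assumed for natural air samples, the correction reduces to a simple scaling by (1 + a).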

TU
The δ(O2/N2)round-robin values for TU were calculated from the δ(¹⁶O¹⁶O/¹⁵N¹⁴N)round-robin values measured using a mass spectrometer (Finnigan MAT-252). These values were determined against the reference air on the TU scale, which is natural air filled into a 47 L manganese steel cylinder in 1998. Details of the measurement technique are given by Ishidoya et al. (2003). The stability of the TU scale was evaluated by measuring the δ(O2/N2) values of six working reference air cylinders against the primary reference air from 1999 to 2020. The rate of change of δ(O2/N2) in the working reference air, with its standard deviation, was −0.02 ± 0.37 per meg yr⁻¹ on average. The mass spectrometer was adjusted to measure ion beam currents for masses 29 (¹⁵N¹⁴N) and 32 (¹⁶O¹⁶O). The isotopic ratios in the round-robin standard mixtures were calculated using Eqs. (6), (7), and (12).
¹⁴N¹⁴N/¹⁵N¹⁴N = [δ(¹⁴N¹⁴N/¹⁵ [...]

SIO
SIO reported the δ(O2/N2) values based on measurements using a two-wavelength interferometer. The SIO O2/N2 reference, which defines δ(O2/N2) = 0 on the SIO scale, is based on a suite of 18 primary reference gases stored in high-pressure cylinders (aluminium or steel, volumes ranging from 29 to 47 L) filled with natural air (Keeling et al., 2007).
The long-term stability of the SIO O2/N2 scale has been maintained within ±0.4 per meg yr⁻¹ with respect to these cylinders by analysing the relative differences in the O2/N2 ratios of the primary reference air. Differences between the round-robin cylinders and the SIO reference were determined from Eq. (13). Its inputs are the difference in the refractivity ratio r̃ = r(2537.27 Å)/r(4359.57 Å) between the round-robin cylinder and the SIO reference, determined via interferometric comparisons with secondary reference gases linked to the primary suite; a constant O2 sensitivity factor (0.03397); the O2 molar fraction of the SIO reference; a constant CO2 interference factor (1.0919 per meg/ppm); and ΔCO2, the difference in CO2 molar fraction from the SIO reference (363.29 μmol mol⁻¹). SIO data are routinely corrected for CO2 interference. The O2 sensitivity and the interference factors (e.g., −0.0124 for Ar/N2) in Eq. (13) are based on refractivity data for the pure gases and natural air (Keeling, 1988b). SIO applies additional corrections for Ar/N2, Ne, He, Kr, Xe, CH4, N2O, and CO. These additional corrections are effectively constant (or small) in natural air and can usually be neglected in comparisons of natural air samples. However, they cannot be neglected when relating the SIO scale to an absolute O2/N2 reference based on the round-robin cylinders, which differ from natural air in their Ar/N2 ratios and which lack constituents other than N2, O2, Ar, and CO2. These corrections require estimates of the molar Ar/N2 ratio and of the abundances of the other gases in typical background air. Notably, the primary reference gases enter Eq. (13) only as references for relative refractivity. Therefore, the exact Ar/N2 ratio and the abundances of other gases in the SIO reference are not directly relevant.
For background air, the following values were adopted: Ar/N2 = 0.0119543, Ne/N2 = 2.328 × 10⁻⁵, He/N2 = 6.71 × 10⁻⁶, Kr/N2 = 1.46 × 10⁻⁶, Xe/N2 = 1.11 × 10⁻⁷, CH4 = 1.8 μmol mol⁻¹, N2O = 0.3 μmol mol⁻¹, and CO = 0.1 μmol mol⁻¹. Here, Ar/N2 is from Aoki et al. (2019), and the other noble gas/N2 ratios are from Glueckhauf (1951), using Xe data from Kronjäger (1936) (see also Keeling et al., 2020). The quantity δ(Ar/N2) was computed from the AIST gravimetric data as δ(Ar/N2) = (Ar/N2)grav/0.0119543 − 1. The Ar/N2 interference (the negative of the product of the Ar/N2 interference factor and δ(Ar/N2)) ranges from −55 to +24 per meg, depending on the round-robin cylinder. The sum of the remaining interferences, other than that for CO2, is effectively constant at −14.3 per meg. The largest individual contributions are from Ne (−32.8 per meg) and CH4 (+11.9 per meg).
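The Ar/N2 interference described above can be illustrated with a short Python sketch. The background Ar/N2 ratio and the interference factor are the values quoted in the text. The function names and the example N2 molar fraction are hypothetical, since the gravimetric N2 values of the individual cylinders are not listed here.

```python
AR_N2_BACKGROUND = 0.0119543         # background Ar/N2 ratio (Aoki et al., 2019)
AR_N2_INTERFERENCE_FACTOR = -0.0124  # Ar/N2 interference factor quoted in the text


def delta_ar_n2_per_meg(x_ar, x_n2):
    """delta(Ar/N2) of a mixture relative to background air, in per meg."""
    return ((x_ar / x_n2) / AR_N2_BACKGROUND - 1.0) * 1e6


def ar_n2_interference_per_meg(x_ar, x_n2):
    """Ar/N2 interference: minus the interference factor times delta(Ar/N2)."""
    return -AR_N2_INTERFERENCE_FACTOR * delta_ar_n2_per_meg(x_ar, x_n2)


# Hypothetical mixture: Ar at the low end of the quoted range (9297 umol/mol)
# and an assumed N2 molar fraction of 0.7809 (not a value from the study).
print(ar_n2_interference_per_meg(x_ar=9297e-6, x_n2=0.7809))
```

A mixture whose Ar/N2 ratio equals the background value gives δ(Ar/N2) = 0, so the interference term vanishes, which is why the correction can be neglected for natural air samples.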

Stability of δ(O2/N2) During Intercomparison
EMRI/AIST measured the δ(O2/N2)round-robin values four times using the mass spectrometer to evaluate the stability of the O2/N2 ratios of the standard mixtures during the intercomparison experiment. The first of the four measurements provided the values assigned by EMRI/AIST. The δ(O2/N2)round-robin values were determined against the EMRI/AIST scale. The stability of the EMRI/AIST scale was evaluated by measuring the δ(O2/N2) values of three working reference air cylinders against the primary reference air from 2012 to 2020. The rate of change of δ(O2/N2) in the respective cylinders, with its standard deviation, was 0.08 ± 0.11 per meg yr⁻¹ on average. Therefore, the working standards show no systematic trend in δ(O2/N2) relative to the primary reference air. The temporal drifts analysed in March 2018 (before shipment) ranged from −5.9 to 5.5 per meg. This range was within the expanded uncertainty of the measurement (6.4 per meg), which was estimated from the standard uncertainty of the δ(O2/N2) values measured using the mass spectrometer at EMRI/AIST. Here, the expanded uncertainty (coverage factor of 2) corresponds to a confidence level of approximately 95%. The temporal drifts analysed in March 2019 (after the cylinders' return from SIO) ranged from −16.4 to 2.9 per meg. This range was larger than the expanded uncertainty of the measurement.
We also analysed the round-robin standard mixtures in March 2020 (a year after their return) and found that the temporal drifts ranged from −18.3 to −5.6 per meg. The δ(O2/N2)round-robin values decreased with time in all cylinders, especially in cylinder no. CPB16379. The average rate of change of the δ(O2/N2)round-robin values in the cylinders other than CPB16379 was −3.2 ± 1.1 per meg yr⁻¹, whereas that of cylinder CPB16379 was −6.7 ± 2.1 per meg yr⁻¹. The rates and their standard deviations were calculated by least-square fitting. The decrease in the δ(O2/N2)round-robin values during the intercomparison experiment is thought to be caused by O2 consumption through oxidation of residual organic material, oxidation of the inner surface of the cylinders, and differences in adsorption and desorption between N2 and O2 on the inner surface of the cylinders, rather than by fractionation of N2 and O2, since the escape of gas from a cylinder generally increases the O2/N2 ratio in the cylinder (Langenfelds et al., 1999). We corrected for the temporal drifts during the intercomparison experiment by linearly interpolating the δ(O2/N2)NMIJ/AIST value to the date of analysis at each laboratory, using the temporal drifts measured before and after that analysis. The correction was performed for each cylinder separately.
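The drift correction described above amounts to a linear interpolation in time between two drift measurements. A minimal Python sketch follows; the dates, drift values, and gravimetric value in the example are hypothetical, and the function name is illustrative rather than taken from the study.

```python
from datetime import date


def interpolate_drift(day, day_before, drift_before, day_after, drift_after):
    """Linearly interpolate a cylinder's drift (per meg) to the analysis date.

    `drift_before` and `drift_after` are the drifts measured at EMRI/AIST
    before and after the laboratory's analysis on `day`.
    """
    total_days = (day_after - day_before).days
    fraction = (day - day_before).days / total_days
    return drift_before + fraction * (drift_after - drift_before)


# Hypothetical example for one cylinder (values are not from the study)
gravimetric_value = -1000.0  # delta(O2/N2) on the NMIJ/AIST scale, per meg
drift = interpolate_drift(
    day=date(2018, 9, 1),
    day_before=date(2018, 3, 1), drift_before=-2.0,
    day_after=date(2019, 3, 1), drift_after=-8.0,
)
corrected_value = gravimetric_value + drift
print(corrected_value)
```

Each cylinder would be handled with its own pair of drift measurements, in line with the per-cylinder correction described in the text.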
We evaluated the reproducibility of the NMIJ/AIST scale at EMRI/AIST using nine high-precision standard mixtures prepared in different periods (from April 2017 to February 2020). Figure 2 shows the relationship between the δ(O2/N2)NMIJ/AIST values gravimetrically determined by NMIJ/AIST and the δ(O2/N2) values measured using the mass spectrometer at EMRI/AIST. The line in Figure 2a represents the Deming least-square fit to the data, and Figure 2b shows the residuals of δ(O2/N2)NMIJ/AIST from the line. The error bars represent the expanded uncertainties of the δ(O2/N2)NMIJ/AIST values. All residuals were within the expanded uncertainties of less than 8 per meg, which shows that the NMIJ/AIST scale can be reproduced at any time by preparing high-precision standard mixtures. It also shows that the absolute long-term stability of each laboratory's δ(O2/N2) scale, which is determined against reference natural air in a high-pressure cylinder, can be evaluated by comparing the reference air with high-precision standard mixtures prepared by NMIJ/AIST at intervals.
The values in Table 3 are corrected for the deviations of the Ar/N2 ratios and the isotopic ratios of N2 and O2 in the round-robin standard mixtures from the atmospheric values, and are determined against the laboratories' own scales, as described in Section 2.3. The slopes and intercepts of the lines obtained by the Deming least-square fit to the reported δ(O2/N2)round-robin values are listed in Table 4. The deviations of the slopes from 1 represent the differences from the span sensitivity of the NMIJ/AIST scale. The relative deviations were −0.11 ± 0.10 %, −0.10 ± 0.13 %, 3.39 ± 0.13 %, and 0.93 ± 0.10 % for EMRI/AIST, TU, NIES, and SIO, respectively. The intercepts of the lines represent the differences between the individual laboratory scales and the NMIJ/AIST scale corresponding to δ(O2/N2)NMIJ/AIST = 0: 65.8 ± 2.2, 425.7 ± 3.1, 404.5 ± 3.0, and 596.4 ± 2.4 per meg for EMRI/AIST, TU, NIES, and SIO, respectively. The numbers following the symbol ± represent the standard uncertainties calculated from the Deming least-square fit. The differences in the intercepts between individual scales reflect the differences in the O2 mole fractions of the laboratories' reference air.
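For readers unfamiliar with Deming regression, the sketch below implements its textbook closed-form solution in Python, which allows for errors in both variables through a fixed ratio of error variances. It is an illustration under stated assumptions and not the fitting code used in the study, which may weight each point by its own uncertainty. The x values and the noise-free y values are synthetic, generated from a slope and intercept quoted in the text.

```python
import math


def deming_fit(x, y, var_ratio=1.0):
    """Closed-form Deming regression of y on x.

    `var_ratio` is the ratio of the error variance in y to that in x
    (1.0 gives orthogonal regression). Returns (slope, intercept).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    diff = s_yy - var_ratio * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * var_ratio * s_xy ** 2)) / (2.0 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free data spanning roughly the range of the mixtures
x_values = [-3600.0, -1800.0, 0.0, 1500.0, 2900.0]
y_values = [1.0339 * xi + 404.5 for xi in x_values]
slope, intercept = deming_fit(x_values, y_values)
print(round(slope, 4), round(intercept, 1))
```

With noise-free input the fit returns the generating slope and intercept. With real data the result depends on the assumed variance ratio, so that choice should be reported alongside the fit.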

Intercomparison Between Laboratory Scales and Their Span Sensitivities
The differences in the intercepts between SIO and the other laboratories were −530.6 ± 3.3, −170.8 ± 3.9, and −191.9 ± 3.9 per meg for EMRI/AIST, TU, and NIES, respectively. The differences of NIES and TU from SIO were consistent, within their uncertainties, with those obtained from a past intercomparison experiment, the GOLLUM exercise coordinated by SIO and the University of East Anglia until 2014 (GOLLUM, 2015; WMO, 2003) (Table 4). Figure 3b shows the residuals from the fitted lines. The error bars represent the expanded uncertainties, calculated from the standard uncertainties of the δ(O2/N2) values measured by the individual laboratories. All residuals fall within the expanded uncertainties.
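The intercept differences quoted above can be checked against the individual intercepts with a few lines of Python. The sketch assumes the intercept uncertainties are uncorrelated and combines them in quadrature, which is an assumption about how the quoted uncertainties were obtained. The dictionary and function names are illustrative.

```python
import math

# Intercepts and standard uncertainties from the text, in per meg
INTERCEPTS = {
    "EMRI/AIST": (65.8, 2.2),
    "TU": (425.7, 3.1),
    "NIES": (404.5, 3.0),
    "SIO": (596.4, 2.4),
}


def difference_from_sio(lab):
    """Intercept of `lab` minus that of SIO, with uncertainties combined
    in quadrature (assumes the two uncertainties are uncorrelated)."""
    value, unc = INTERCEPTS[lab]
    sio_value, sio_unc = INTERCEPTS["SIO"]
    return value - sio_value, math.hypot(unc, sio_unc)


for lab in ("EMRI/AIST", "TU", "NIES"):
    diff, unc = difference_from_sio(lab)
    print(f"{lab}: {diff:.1f} ± {unc:.1f} per meg")
```

The results agree with the quoted differences to within about 0.1 per meg; the small discrepancies are consistent with rounding of the published intercepts.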

Budget Analysis
The goal of this study is to make the observational data from different laboratories directly comparable. We compared the [...] and δ{(O2+Ar)/N2}, which were measured using the mass spectrometer and GC/TCD, equal to δ(O2/N2) in Eq. (1). Figure 4a shows the δ(O2/N2) values reported on the NIES and EMRI/AIST scales. The average difference in δ(O2/N2) between the two scales was −329.3 ± 6.9 per meg (the δ(O2/N2) values of EMRI/AIST subtracted from those of NIES). The uncertainty represents the standard deviation of the differences. Both sets of δ(O2/N2) values were converted to the NMIJ/AIST scale using the conversion equation,
where a_n and b_n are the slope and intercept of the line for each laboratory (n) obtained in Section 3.2. Figure 4b shows the converted δ(O2/N2) values. The average difference in the converted δ(O2/N2) between the two scales, with its standard deviation, was −6.6 ± 6.8 per meg, which shows that the scale conversion reduced the bias between the δ(O2/N2) values of EMRI/AIST and NIES. The bias fell within the standard deviation, although it remained larger than the compatibility goal of 5 per meg for O2/N2 ratio measurements. Figures 5a and 5b plot both sets of δ(O2/N2) values against each other before and after the scale conversion, confirming the compatibility of the span sensitivities on the EMRI/AIST and NIES scales. The lines represent Deming least-square fits to the scatter plots. The slope of the line before the scale conversion, with its standard uncertainty, is 0.956 ± 0.015, consistent within uncertainty with the ratio of the span sensitivities of the two scales (0.9989/1.0339 = 0.966). After the scale conversion, the slope is 0.990 ± 0.015, indicating that converting the EMRI/AIST and NIES scales to the NMIJ/AIST scale reduced the difference in span sensitivity between them.
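The conversion equation itself is not reproduced in this text. The Python sketch below assumes a linear form, δ(NMIJ/AIST) = a_n × δ(lab) + b_n. The direction of this relation is an assumption, chosen because it reproduces the expected pre-conversion slope of 0.9989/1.0339 quoted above. The slopes and intercepts are those given in the text; the function names and the sample value are hypothetical.

```python
# (slope a_n, intercept b_n in per meg) for each laboratory scale, from the text
SCALES = {
    "EMRI/AIST": (0.9989, 65.8),
    "NIES": (1.0339, 404.5),
}


def to_nmij(delta_lab, lab):
    """Convert a laboratory-scale value to the NMIJ/AIST scale.

    ASSUMPTION: the conversion is delta_NMIJ = a_n * delta_lab + b_n.
    """
    slope, intercept = SCALES[lab]
    return slope * delta_lab + intercept


def from_nmij(delta_nmij, lab):
    """Inverse of `to_nmij`: express an NMIJ/AIST-scale value on a lab scale."""
    slope, intercept = SCALES[lab]
    return (delta_nmij - intercept) / slope


# A hypothetical air sample, expressed on each laboratory scale
sample_nmij = 100.0
on_emri = from_nmij(sample_nmij, "EMRI/AIST")
on_nies = from_nmij(sample_nmij, "NIES")
print(on_nies - on_emri)  # scale offset for this sample before conversion

# Expected slope of NIES values plotted against EMRI/AIST values
print(round(SCALES["EMRI/AIST"][0] / SCALES["NIES"][0], 3))  # 0.966
```

Under this assumed form, converting both laboratory values back with `to_nmij` recovers the same NMIJ/AIST value, which is the sense in which the conversion removes the bias between the two scales.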
Observing the long-term trend in atmospheric δ(O2/N2) provides critical information on the global CO2 budget (Manning and Keeling, 2006). Recently, Tohjima et al. (2019) estimated the land biospheric and oceanic CO2 uptakes using the average rates of change of the atmospheric O2/N2 ratio and the CO2 molar fraction reported on the NIES scale. We converted the rate of change of δ(O2/N2) on the NIES scale to that on the NMIJ/AIST scale and recalculated the global CO2 budgets from 2000 to 2016 using the converted rate. Table 5 summarises the CO2 budgets reported by Tohjima et al. (2019) and those recalculated in this study. Notably, the fossil fuel-derived CO2 emissions and the global average of the atmospheric CO2 molar fractions used for the CO2 budget calculation are the same as those used by the Global Carbon Project for estimating the global carbon budget in 2020 (Friedlingstein et al., 2020).
We found that the scale conversion decreased the land biospheric CO2 uptake and increased the oceanic CO2 uptake by 0.30 Pg yr⁻¹, as shown in Table 5. [...] scale error from the span calibration of the O2/N2 analyser, which is 2% on the δ(O2/N2) contribution. They also mentioned that the error would be reduced via within-lab and inter-lab comparisons. Therefore, if the scale error is corrected using the span offset and the standard uncertainty of the SIO scale against the NMIJ/AIST absolute scale obtained from the intercomparison experiment, the scale error may decrease from 2% to 0.1%, which should significantly improve the accuracy of the estimated OHC increase.

Conclusions
The intercomparison experiment was used to evaluate the relationship between the measured δ(O2/N2) values and span sensitivities of the individual laboratory scales and those of the NMIJ/AIST scale, using gravimetrically prepared high-precision standard mixtures. The relative deviations in span sensitivity of the EMRI/AIST, TU, NIES, and SIO scales from the NMIJ/AIST scale were −0.11 ± 0.10 %, −0.10 ± 0.13 %, 3.39 ± 0.13 %, and 0.93 ± 0.10 %, respectively; they were quantified here for the first time. The largest offset corresponds to a 0.30 Pg yr⁻¹ decrease and increase in the global estimates of the land biospheric and oceanic CO2 uptakes, respectively, which is not negligible. The deviations of the measured δ(O2/N2) values on the EMRI/AIST, TU, NIES, and SIO scales from the NMIJ/AIST scale corresponding to δ(O2/N2)NMIJ/AIST = 0 were 65.8 ± 2.2, 425.7 ± 3.1, 404.5 ± 3.0, and 596.4 ± 2.4 per meg, respectively. The differences between the individual absolute values were consistent with the results of the GOLLUM round-robin cylinder comparison. However, the δ(O2/N2) values of the five round-robin standard mixtures decreased at rates of −6.7 ± 2.1 per meg yr⁻¹ for one cylinder and −3.2 ± 1.1 per meg yr⁻¹ for the other four cylinders. This decrease suggests that the long-term stability of each laboratory's scale must be evaluated in absolute terms in order to link future δ(O2/N2) values. The O2/N2 ratios of high-precision standard mixtures prepared in different periods by NMIJ/AIST were reproduced within their uncertainties, indicating that the NMIJ/AIST scale can be reproduced at any time by preparing high-precision standard mixtures. Further, the long-term temporal drift of each laboratory's scale can be evaluated by comparing its reference air with high-precision standard mixtures prepared by NMIJ/AIST.
Finally, we demonstrated that the δ(O2/N2) values on the EMRI/AIST and NIES scales for flask samples collected at HAT became consistent within uncertainty after both scales were converted to the NMIJ/AIST scale, although the remaining bias of −6.6 ± 6.8 per meg is not negligible. The results of this study should improve the estimation of carbon budgets and of the OHC increase through more precise estimation of the atmospheric δ(O2/N2) trend. The span sensitivities of the laboratory O2/N2 scales could be evaluated in absolute terms by calibrating cylinders on the NMIJ/AIST scale if a GOLLUM exercise is performed using cylinders with sufficiently different O2/N2 ratios. We expect that the compatibility goal of 5 per meg for O2/N2 measurements can be achieved by comparing the individual laboratory scales with an absolute scale such as the NMIJ/AIST scale.
[...] (Aoki et al., 2019). However, the gravimetric values of the N2, O2, Ar, and CO2 molar fractions were recalculated based on the expansion rate of the cylinders, which was determined by measuring the change in water volume as the internal pressure of cylinders submerged in water was released from 110 bar to 1 bar. The value was determined as 1.62 ± 0.06 mL MPa⁻¹ by our experiment (unpublished data) and used to correct for the buoyancy of the cylinders. b The numbers following the symbol ± denote the standard uncertainty of the gravimetric value, calculated according to the law of propagation of uncertainties.

Figure 1. Temporal drift of the δ(O2/N2)round-robin values from the initial values, measured using a mass spectrometer at EMRI/AIST after preparing the round-robin standard mixtures, before the shipment of the cylinders to SIO, after the return of the cylinders from SIO, and a year after the return.

Table 4. Slopes and intercepts of the lines obtained by the Deming least-square fit to the reported δ(O2/N2)round-robin values for the individual laboratories, and deviations of the individual scales from SIO in this study and in GOLLUM.
[Table 4 body not recovered; column headers include Institutes and Slopes.]
Numbers following the symbol ± denote the standard uncertainty. The uncertainties of the slopes and intercepts were calculated based on the Deming least-square fit. a Slope represents the difference in span sensitivity between the individual laboratory scales and the NMIJ/AIST scale.

[Figure 4 panel annotations. Conversion of NIES and EMRI/AIST scales to NMIJ/AIST scale. Average difference before conversion: −329.3 ± 6.9 per meg. Average difference after conversion: −6.6 ± 6.8 per meg.]