How bias correction goes wrong: Measurement of XCO2 affected by erroneous surface pressure estimates

All measurements of XCO2 from space have systematic errors. To reduce a large fraction of these errors, a bias correction is applied to XCO2 retrieved from GOSAT and OCO-2 spectra using the ACOS retrieval algorithm. The bias correction uses, among other parameters, the surface pressure difference between the retrieval and the meteorological reanalysis. Relative errors in the surface pressure estimates, however, propagate nearly 1:1 into relative errors in bias-corrected XCO2. For OCO-2, small errors in the knowledge of the pointing of the observatory (up to ∼130 arcsec) introduce a bias in XCO2 in regions with rough topography. Erroneous surface pressure estimates are also caused by a coding error in ACOS version 8 that samples the meteorological analyses at wrong times (up to three hours after the overpass time). Here, we derive new geolocations for OCO-2’s eight footprints and show how using improved knowledge of surface pressure estimates in the bias correction reduces errors in OCO-2’s v9 XCO2 data.

near Lauder, New Zealand showed strong sensitivity to (different) estimates of the pointing of OCO-2, introducing an apparent topography-related bias in the data. Finally, due to atmospheric tides, the estimate of the surface pressure is sensitive to when the meteorological reanalysis is sampled. Given the precision we need to achieve in XCO2 measurements, seemingly insignificant issues cannot necessarily be ignored. For example, the mean canopy height of the Amazon rain forest is ∼25 m (Benson et al., 2016) and might vary temporally due to fires or deforestation. Furthermore, the usual tidal range in the open ocean is ∼0.5 m, but coastal tidal ranges can reach up to 12 m (NOAA, last access: Aug. 2018). At sea level, altitude variations of ∼8 m correspond to changes in surface pressure of ∼1 hPa. This might introduce errors in XCO2 on the order of ∼0.4 ppm.
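The two rules of thumb quoted above (∼8 m per hPa at sea level, and ∼0.4 ppm per hPa) can be checked with a short calculation. The following Python sketch is only an illustration: the scale height of 8 km, the sea-level pressure of 1013.25 hPa, and the background XCO2 of 400 ppm are assumed round values, not numbers taken from this analysis, and the function names are hypothetical.

```python
# Hydrostatic rule of thumb: dP ~ P * dz / H, with H the pressure scale height.
SCALE_HEIGHT_M = 8000.0   # assumed scale height near the surface
SEA_LEVEL_HPA = 1013.25   # assumed standard sea-level pressure
XCO2_PPM = 400.0          # assumed background XCO2


def pressure_change_hpa(dz_m, p_hpa=SEA_LEVEL_HPA, scale_height_m=SCALE_HEIGHT_M):
    """Approximate surface pressure change for a small altitude change dz_m."""
    return p_hpa * dz_m / scale_height_m


def xco2_error_ppm(dp_hpa, p_hpa=SEA_LEVEL_HPA, xco2_ppm=XCO2_PPM):
    """XCO2 error if a relative pressure error maps 1:1 into a relative XCO2 error."""
    return xco2_ppm * dp_hpa / p_hpa


dp = pressure_change_hpa(8.0)   # pressure change for 8 m of altitude
err = xco2_error_ppm(1.0)       # XCO2 error for a 1 hPa pressure error
print(f"8 m -> {dp:.2f} hPa; 1 hPa -> {err:.2f} ppm")
```

Under these assumptions, 8 m of altitude corresponds to about 1 hPa and 1 hPa corresponds to about 0.4 ppm, in line with the values stated in the text.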
In this analysis, we address two issues with the OCO-2 v8 estimate of surface pressure: erroneous surface pressure values from the meteorological reanalysis due to small misspecifications of the geolocations of OCO-2's eight footprints in the instrument-to-spacecraft pointing offsets, and erroneous surface pressure estimates due to sampling the meteorological reanalysis at incorrect times. We illustrate how, using improved knowledge of the surface pressure, we can improve the bias correction and reduce errors in XCO2. The resulting hybrid product, which combines v8 retrieval results with a revised bias correction based on updated surface pressure estimates, is labeled version 9 (v9). This paper is structured as follows: Section 2 describes the impact of erroneous surface pressure estimates in the bias correction on XCO2 estimates. New footprint geolocations for OCO-2 are derived in Section 3. Section 4 introduces the revised parametric bias correction in v9 and discusses changes in the v9 filtration scheme. Section 5 gives a brief evaluation of the OCO-2 v9 data product and illustrates changes and improvements of v9 over v8 XCO2 on regional and global scales.
2 Biases in OCO-2 XCO2 due to erroneous surface pressure estimates

OCO-2 v8 XCO2 estimates are derived using the ACOS retrieval algorithm. The algorithm uses optimal estimation to solve for parameters of the state vector to obtain the best match to spectra recorded in OCO-2's three spectral bands. The state vector includes, among other parameters, the surface pressure, which is primarily derived from information retrieved from the O2 A-band. The prior surface pressure is taken from the GEOS-5 Forward Processing for Instrument Teams Atmospheric Data Assimilation System (GEOS5-FP-IT; Suarez et al., 2008; Lucchesi, 2013) and is sampled at the geolocation of each OCO-2 sounding. Surface pressure and prior surface pressure are used in the bias correction of XCO2. The OCO-2 bias correction addresses three types of biases: footprint-dependent biases, parameter-dependent biases, and a global scaling of XCO2 to the World Meteorological Organization (WMO) trace-gas standard scale using comparisons to the Total Carbon Column Observing Network (TCCON; Wunch et al., 2011a). An overview of the three different bias correction terms is given in Mandrake et al. (2015), Wunch et al. (2017b), and O'Dell et al. (2018).
Biases in OCO-2 XCO2 due to erroneous surface pressure estimates were initially illustrated in OCO-2 observations over Lauder, New Zealand (Fig. 10 in Wunch et al., 2017b). The Lauder TCCON site is situated in a remote area with no urban sources of XCO2 nearby (Pollard et al., 2017). The area is dominated by rolling hills with mountain ridges spanning from southwest to northeast, almost perpendicular to the ground-track of the observatory (southeast to northwest). The terrain changes up to ±200 m in altitude over small distances (see Fig. 2, upper panel). The middle panel of Fig. 2 shows XCO2 enhancements retrieved by the ACOS algorithm (version 8) over Lauder for a target observation on February 17, 2015. No bias correction is applied here. XCO2 estimates are uniformly distributed over the observed scene with a mean value of 393.58 ppm and a standard deviation of 0.92 ppm. The lower panel of Fig. 2 shows OCO-2 XCO2 estimates after the v8 bias correction is applied.
The bias correction changes the mean value to 395.95 ppm and increases the standard deviation to 1.35 ppm. Bias-corrected XCO2 enhancements vary up to ±3 ppm over the observed scene. The bias is spatially correlated with the underlying topography, more precisely, with the topographic slopes. The observed bias is introduced by erroneous values of the prior surface pressure in the dP term (the difference between the retrieved surface pressure and the prior surface pressure) in the parametric bias correction. The parametric bias correction accounts for spurious variability in XCO2 which correlates with retrieval parameters like albedo, retrieved aerosol quantities, or surface pressure. A multivariate regression is performed between spurious XCO2 variability and the parameters that account for the largest variance in the data to correct for these errors (Wunch et al., 2011b; Mandrake et al., 2015; O'Dell et al., 2018). The erroneous values of the prior surface pressure are caused by small misspecifications in the geolocations of OCO-2's eight footprints in the specified instrument-to-spacecraft pointing. As stated previously, at sea level, a surface pressure difference of 1 hPa corresponds to an altitude difference of ∼8 m. Therefore, in areas like Lauder with steep topography, misspecifications in the pointing of the observatory of a few arcsec can cause the prior surface pressure to be substantially different from the retrieved surface pressure. This introduces errors in bias-corrected XCO2, typically observed on local scales in areas with highly varying topography.
Another source of erroneous surface pressure estimates in v8 is a temporal sampling error of the surface pressure estimate from the meteorological reanalysis. The prior surface pressure is taken from the GEOS5-FP-IT three-hourly output. A coding error in the meteorological sampling algorithm caused the surface pressure estimate for some soundings to be sampled as much as three hours after the overpass time. This mostly affected orbits whose first and last soundings both lie between two consecutive GEOS5-FP-IT three-hourly synoptic outputs (0z, 3z, etc.); the soundings in such an orbit were erroneously sampled at the upper bounding synoptic time for that orbit. For example, for an orbit whose soundings lie fully between 6:00 UTC and 9:00 UTC, the OCO-2 meteorological sampling algorithm erroneously samples the GEOS5-FP-IT surface pressure field at 9:00 UTC for each sounding in that orbit. On average, this introduced a mean prior surface pressure error of about +0.5 hPa for affected soundings. In some cases, however, the prior surface pressure error reached up to ±20 hPa for individual soundings. The sampling error also affects temperature and water vapor. Soundings over land are affected more than those over ocean, since diurnal surface heating tends to be stronger over land and because the surface pressure bias correction term over land is nearly 50% larger than over water. While the sampling error of the prior surface pressure is easy to correct in the bias correction by fixing the coding error and re-running the meteorological sampling algorithm, erroneous surface pressure estimates caused by misspecifications in the instrument pointing offsets need greater attention.

The core of the OCO-2 instrument is a three-channel grating spectrometer that records spectra of reflected sunlight in the O2 A-band (0.76 µm), the weak CO2 band (1.61 µm), and the strong CO2 band (2.06 µm).
The incoming light is guided through a common optics assembly, but the light is sampled and focused sequentially and independently onto three spectrometer slits, each 3 mm long and 28 µm wide (Haring et al., 2004;. These long, narrow slits are aligned to produce nominally co-boresighted fields of view. After passing the slit and being spectrally dispersed, the light is focused on a two-dimensional focal plane array (FPA) with eight independent readouts along the slits, the so-called footprints. Spectra for the three spectral bands and each footprint are recorded simultaneously.

To obtain the best estimate for the geolocation of the eight footprints, the following must be known: 1) the location of the spacecraft along the orbit track, 2) the pointing of the instrument boresight relative to a local coordinate system, and 3) the relative pointing of the fields of view (FOV) of the eight footprints in the three spectrometers. A Global Positioning System (GPS) sensor provides the location of the observatory along its orbit track. The on-board star tracker determines the orientation of the observatory relative to fixed stars. The relative alignment of the eight footprints is characterized with respect to the spacecraft body axes. The spatial FOV, defined along the long axis of the slit by the eight footprints, is aligned parallel with the spacecraft y-axis. The boresight of the spectrometer points down the x-axis. The spacecraft z-axis points across the narrow axis of the spectrometer slit, perpendicular to the y-axis (see Fig. 1). For nadir and glint measurements, the z-axis is rotated around the x-axis so that it is oriented 30° (clockwise from above) from the principal plane (i.e. the plane that includes the sun, the surface target, and the instrument aperture). To maintain this viewing geometry, the spacecraft slowly rotates counterclockwise (from above) around the x-axis as it travels from the southern terminator, across the sub-solar latitude, to the northern terminator.
South of a latitude that is ∼30° north of the sub-solar latitude, footprint 1 (FP 1) is to the west of footprint 8 (FP 8). North of this latitude, FP 1 is east of FP 8. For target mode observations, the z-axis is always pointed along the spacecraft orbit track, so that FP 1 is always to the west of FP 8. Pre-launch instrument ground tests were performed to characterize the spatial FOV of each footprint, and correction factors, the so-called pointing offsets, were derived and integrated into the geometric calibration algorithm (v0001 configuration, see Fig. 3). The pointing offsets are on the order of hundreds of arcsec. A change in the pointing offsets of, for example, 25 arcsec corresponds to a shift of the instrument FOV of ∼80 m at nadir. During the OCO-2 in-orbit checkout (IOC) period in 2014, lunar measurements were performed and, in combination with data from coastal crossings, the alignment of the three spectrometer slits was tested. The alignment of the instrument angular footprints in the coordinate system defined by the star tracker was within mission requirements (< 720 arcsec). Updated pointing offsets were integrated into the geometric calibration algorithm in November 2014 (v0006 configuration, see Fig. 3). The findings in the previous section, however, indicate that a reevaluation of the pointing vector correction factors is desirable.
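The relation between an angular pointing offset and the resulting ground displacement at nadir is simple geometry. The following Python sketch illustrates it under the assumption of a nominal orbit altitude of 705 km and a flat surface directly below the spacecraft; the altitude value and the function name are assumptions for illustration and are not taken from the text above.

```python
import math

ORBIT_ALTITUDE_M = 705e3  # assumed nominal orbit altitude of the observatory


def nadir_shift_m(offset_arcsec, altitude_m=ORBIT_ALTITUDE_M):
    """Ground displacement at nadir for a small change in pointing angle."""
    angle_rad = math.radians(offset_arcsec / 3600.0)  # arcsec -> degrees -> radians
    return altitude_m * math.tan(angle_rad)


shift_25 = nadir_shift_m(25.0)    # shift for a 25 arcsec change
shift_720 = nadir_shift_m(720.0)  # shift for the 720 arcsec requirement
print(f"25 arcsec -> {shift_25:.0f} m, 720 arcsec -> {shift_720 / 1000:.1f} km")
```

With these assumptions a 25 arcsec change moves the FOV by somewhat more than 80 m, consistent with the ∼80 m quoted above. In terrain where elevation changes by tens of metres over such distances, this is enough to alter the sampled prior surface pressure by several hPa.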

Methodology
The analysis of the IOC lunar data exposed some deficiencies in its use for deriving footprint geolocations. Lunar data is typically taken in so-called single pixel mode, in which each pixel of the array is read out individually. This is in contrast to normal operations, where 20 spatial pixel samples are co-added to form each footprint. In addition, the moon only illuminates a fraction of the FPA. Furthermore, defocus compromises the analysis of the strong CO2 band results, and the moon only provides positive constraints for the z-axis.
To overcome the aforementioned limitations for the v0006 configuration, the IOC lunar data results were used to constrain the pointing vector for FP 6 and 7, whereas for the other FPs the ground-test results were used. Here, we follow a different approach to derive new pointing offsets. We shift from estimating geolocations with lunar images, which are strictly geometric measurements, to optimizing footprint geolocations with retrieval variables. We utilize the ACOS Level 2 Full Physics (L2FP) algorithm and its associated pre-screeners, the A-band Preprocessor (ABP) and the IMAP-DOAS Preprocessor (IDP), to estimate footprint geolocations. The ABP performs a fast retrieval of surface pressure using the O2 A-band and assumes that no clouds or aerosols are present. The IDP performs clear-sky fits to the weak and strong CO2 bands to derive CO2 columns (Taylor et al., 2016). Using the preprocessors instead of the L2FP algorithm saves computational effort and allows us to study pointing offsets for each spectral band individually. The footprint geolocations for the O2 A-band are derived by minimizing the variation in the difference between the surface pressure retrieved from the ABP and the meteorological analysis (dP_ABP). The location of the CO2 band footprints is determined by minimizing the variation in the CO2 columns divided by the dry air column determined from the meteorological analysis (XCO2,met). These two metrics are systematically explored for a set of different pointing offsets. The geolocations that provide the smallest standard deviation of dP_ABP over a given scene are good estimates for the location of the O2 A-band. The same holds for the standard deviation of XCO2,met regarding the weak and strong CO2 bands. The assumption here is that there are no significant variations in XCO2 over the field of analysis. This may not be true in regions with large heterogeneous sources (e.g. urban areas) or sinks (vegetated areas) of CO2.
It is only true for areas with a clean XCO2 background. Therefore, in our analysis we focus on remote desert-like mountainous areas to study pointing offsets.
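The idea of selecting the pointing offset that minimizes the scene-wide scatter of the pressure difference can be illustrated with a small synthetic experiment. The Python sketch below is not the ABP or the operational geolocation code: the one-dimensional terrain, the isothermal barometric formula, the noise level, and all names are hypothetical choices made only to show the principle.

```python
import numpy as np


def terrain_m(x_m):
    """Synthetic rolling terrain (hypothetical), elevation in metres."""
    return (300.0 * np.sin(2 * np.pi * x_m / 5000.0)
            + 150.0 * np.sin(2 * np.pi * x_m / 1700.0))


def pressure_hpa(z_m, p0_hpa=1013.25, scale_height_m=8000.0):
    """Surface pressure for elevation z_m (isothermal approximation)."""
    return p0_hpa * np.exp(-z_m / scale_height_m)


rng = np.random.default_rng(0)
x_nominal = np.linspace(0.0, 20000.0, 400)  # nominal footprint positions (m)
true_shift_m = -400.0                       # pointing error we try to recover

# "Retrieved" pressure: what is seen at the true location, plus retrieval noise.
p_retrieved = (pressure_hpa(terrain_m(x_nominal + true_shift_m))
               + rng.normal(0.0, 0.3, x_nominal.size))

# For each candidate shift, sample the "prior" pressure at the shifted location
# and compute the scatter of retrieved-minus-prior over the scene.
candidate_shifts = np.arange(-800.0, 801.0, 100.0)
scatter = np.array([
    np.std(p_retrieved - pressure_hpa(terrain_m(x_nominal + s)))
    for s in candidate_shifts
])
best_shift = candidate_shifts[np.argmin(scatter)]
print(f"best shift: {best_shift:.0f} m")
```

In this toy setup the scatter collapses to the noise level only at the correct shift, because at any other shift the topography leaks into the pressure difference. The same reasoning applies to XCO2,met for the CO2 bands, provided the true XCO2 field is uniform over the scene.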

Training Dataset
We identify two desert areas in the northern and southern hemisphere with topographic relief and frequent clear sky conditions values for different overpasses for each orbit, we normalize all XCO2,met soundings by the orbital mean. The orbital mean is calculated by taking into account all soundings of a particular orbit that are within the latitude and longitude limits of the analyzed scene. The standard deviation of dP_ABP and XCO2,met is calculated by taking into account all grid squares in the analyzed latitude and longitude limits. Analyzing data from both hemispheres allows us to check for possible errors introduced by the reversed orientation of the z- and y-axis in the northern and southern hemisphere in our pointing offset derivation (e.g. errors introduced by a timing error).
We run the ABP and IDP for a set of different pointing offsets for which the relative footprint positions of the v0006 configuration are preserved. Unless otherwise stated, in the following we refer to the pointing offset of FP 4 of the O2 A-band when we refer to pointing offset values. For example, if the pointing offset of FP 4 of the O2 A-band is shifted by +25 arcsec along the y-axis, then all other footprint geolocations are also shifted in the same direction by +25 arcsec along the y-axis (even though their absolute positions differ from the FP 4 O2 A-band position). The same holds for the z-axis. For the y-axis, we run both algorithms for four different pointing offsets ranging from 175 to 250 arcsec in 25 arcsec steps. For each of these shifts, we also run a set of different offsets for the z-axis, ranging from -250 to +100 arcsec, also in 25 arcsec steps. This leads to a total of 60 different geolocation configurations.

Figure 5 shows the standard deviation of dP_ABP and XCO2,met for FP 4 for all 60 geolocation configurations for the Death Valley National Park. The observed metrics are less sensitive to changes along the footprint axis than along the z-axis. Differences in the standard deviation between neighboring pointing offsets are small, typically < 0.5 hPa for the O2 A-band and < 0.2 ppm for the two CO2 bands. This holds for all footprints in the three spectral bands. For example, for FP 2 to 7, the standard deviation of dP_ABP is minimized for a pointing offset of 225 arcsec along the footprint axis. A pointing offset of 200 arcsec minimizes the standard deviation of FP 1 and 8. Similar results are derived for the Atacama Desert (not shown here). In general, a pointing offset of 225 arcsec along the footprint axis minimizes the standard deviation of dP_ABP and XCO2,met for the majority of the footprints. This offset value is nearly identical to the v0006 configuration (222.4 arcsec).
Therefore, we adopt a pointing offset of 225 arcsec along the y-axis for all footprints in the three spectral bands. The absolute pointing offsets along the footprint axis are summarized in Table 1.

Figure 6 shows the standard deviation of dP_ABP and XCO2,met as a function of the z-axis pointing offsets for FP 4 for the Death Valley National Park (for a pointing offset of 225 arcsec along the footprint axis). The analyzed metrics are strongly sensitive to changes of the pointing offset along this axis. We perform a quadratic regression to determine the best estimate of the location of the minimum. We only take data points into account that are distributed symmetrically around the minimum.
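Locating the minimum with a quadratic regression over a symmetric window of grid points can be sketched as follows. This is a minimal Python illustration with synthetic scatter values, not the fit used for Fig. 6: the curve is constructed to have its minimum at -124 arcsec purely as a test case, and the window half-width of three grid points and the function name are assumptions.

```python
import numpy as np


def parabola_minimum(offsets_arcsec, scatter):
    """Location of the minimum of a least-squares parabola through the points."""
    a, b, _c = np.polyfit(offsets_arcsec, scatter, deg=2)
    if a <= 0:
        raise ValueError("fitted parabola is not convex; no minimum")
    return -b / (2.0 * a)


# z-axis offset grid as described in the text: -250 to +100 arcsec, 25 arcsec steps.
offsets = np.arange(-250.0, 101.0, 25.0)

# Synthetic scatter curve (hypothetical values) with its minimum at -124 arcsec.
scatter = 1.0 + 2.0e-4 * (offsets + 124.0) ** 2

# Keep only points distributed symmetrically around the coarse grid minimum.
i_min = int(np.argmin(scatter))
half_width = min(i_min, offsets.size - 1 - i_min, 3)
window = slice(i_min - half_width, i_min + half_width + 1)

z_min = parabola_minimum(offsets[window], scatter[window])
print(f"fitted minimum at {z_min:.1f} arcsec")
```

Restricting the fit to a symmetric window keeps points far from the minimum, where the curve may no longer be parabolic, from pulling the fitted vertex to one side.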

Results
For FP 4, our analysis indicates a minimum at -124 arcsec for the O2 A-band, -71 arcsec for the weak CO2 band, and -44 arcsec for the strong CO2 band. We derive pointing offsets for all other footprints for all three bands in the same way. Figure 7 (upper panel) summarizes the z-axis pointing offsets for all footprints for all three bands for the Death Valley National Park and Atacama Desert. On average, the derived pointing offsets for the two areas differ by 13 arcsec for the weak CO2 band and by 25 arcsec for the strong CO2 band. For the O2 A-band, the pointing offsets for the two areas differ, on average, by 46 arcsec. Footprints 3 to 5 have the largest pointing offset values. This is in agreement with the relative footprint geolocations in the v0006 configuration. We average the derived pointing offsets for the CO2 bands from both hemispheres. This provides the best estimate for the footprint geolocations globally and takes into account that the z-axis is rotated by nearly 180° (in glint and nadir mode) when the observatory crosses the equator. However, for the O2 A-band, the difference between the pointing offsets for both areas reaches up to 60 arcsec for FP 2. In addition, the Atacama Desert analysis indicates larger relative pointing variations for neighboring footprints. Therefore, for the O2 A-band, we only take the derived pointing offsets from the Death Valley National Park analysis into account. Final pointing offsets for all three bands are derived by applying a quadratic regression to the pointing offsets as a function of footprint. This preserves the parabolic shape of the relative footprint positions, which is supported by findings from the pre-launch and IOC lunar analysis. The updated pointing offsets for the z-axis for each spectral band are summarized in Table 1.
To evaluate the impact of the updated footprint geolocations, we sample the surface pressure from GEOS5-FP-IT with the updated meteorological sampling algorithm (which was corrected for the time sampling error) at the footprint geolocations of the O2 A-band. The surface pressure is mainly retrieved from the O2 A-band; therefore, sampling the meteorological reanalysis at the O2 footprint geolocation should yield the best surface pressure estimates. Figure 8 shows the difference between the v8 prior surface pressure and the prior surface pressure sampled at the updated footprint geolocations. The striping pattern is mainly introduced by the updated sampling algorithm and follows orbital paths. As stated previously, the updated sampling method also introduces a mean bias of +0.5 hPa between the v8 and newly derived surface pressure estimates. Figure 9 shows the change between the standard deviation of the prior surface pressure in each grid box for both sampling methods. The observed structures are mainly driven by changes in the footprint geolocations. The largest changes are over mountainous regions, e.g. the Tibetan Plateau, the Andes, or the U.S. West Coast. This will mostly manifest as local-scale changes in XCO2. As expected, there are no significant changes over ocean due to the updated footprint geolocations.

4 The OCO-2 v9 data product

Our improved knowledge of OCO-2's footprint geolocations and the update of the meteorological sampling algorithm reduce errors in bias-corrected XCO2 that were introduced through erroneous surface pressure estimates in the v8 bias correction. The OCO-2 v9 data product combines the v8 ACOS L2FP retrieval results with a revised bias correction using updated surface pressure estimates from GEOS5-FP-IT. Moreover, filter limits that define the XCO2 quality flag and warn levels are adjusted, leading to a larger number of soundings that pass the filtration.
Finally, the global scaling factor that is derived from direct observations over TCCON stations is updated. This section highlights the major changes in OCO-2's v9 X

The parametric bias correction accounts for spurious variability in XCO2 that is correlated with parameters in the retrieval state vector (Wunch et al., 2017b; O'Dell et al., 2018). A multivariate regression is performed between spurious XCO2 variations and the parameters that account for the largest fraction of the spurious variability. For all ACOS versions for GOSAT and OCO-2 observations, the mode-dependent parametric bias (XCO2,para) has the following form:

$$X_{\mathrm{CO_2,para}} = \sum_i c_i \, (p_i - p_{i,\mathrm{ref}}) \tag{1}$$

Here, c_i are regression coefficients which express the sensitivity of XCO2 from the L2FP retrieval (XCO2,raw) to the selected parameter p_i, and p_i,ref are the corresponding reference values. In order to obtain bias-corrected XCO2 (XCO2,bc), Eq. (1) is subtracted from the raw XCO2 retrieved by the L2FP algorithm:

$$X_{\mathrm{CO_2,bc}} = X_{\mathrm{CO_2,raw}} - X_{\mathrm{CO_2,para}} \tag{2}$$

Note that we only focus on the parametric bias correction here and neglect the footprint-dependent bias correction and global scaling factor for now. To select the parameters and derive the regression coefficients in Eq. (1), different truth proxy training data sets were used for v8: TCCON, Small Area Approximation (SAA), and Multi-Model Median. These truth proxies represent an independent estimate of XCO2 to which we compare OCO-2 XCO2. A detailed description of the truth proxies is given in Sect. 4.1 in O'Dell et al. (2018). For v8 land observations, three different parameters were identified that account for the largest fraction of variability: co2_grad_del, DWS, and dP. Over ocean, only co2_grad_del and dP contribute to the parametric bias correction. co2_grad_del represents the tropospheric lapse rate of the retrieved CO2 profile and is defined as the difference in the retrieved CO2 between the surface and the retrieval pressure level at 0.6 times the surface pressure, minus the same quantity for the prior profile.
DWS represents the combined retrieved optical depth of large particles in the lower-to-middle troposphere in the retrieval, namely dust, water cloud, and sea salt aerosol. In v8, dP is defined as the difference between the retrieved surface pressure and the prior surface pressure from GEOS5-FP-IT.

For v9, we define two different dP parameters for observations over land (dP_frac) and ocean (dP_sCO2) that are used in the parametric bias correction. The revised dP parameters take into account two problems: 1) the misspecifications in the geolocation calibration algorithm for the overall pointing of the observatory and 2) the pointing offsets between the three spectral bands. The first is characterized by the difference between the retrieved surface pressure of the v8 L2FP algorithm (P_ret,v8) and the prior surface pressure at the new geolocation where the O2 A-band is pointing (P_ap,O2). The second is characterized by the difference between the prior surface pressure where the O2 A-band is pointing and the prior surface pressure where the strong CO2 band is pointing (P_ap,sCO2). For ocean, the revised dP parameter has the following form (given in hPa):

$$dP_{\mathrm{sCO_2}} = (P_{\mathrm{ret,v8}} - P_{\mathrm{ap,O_2}}) + (P_{\mathrm{ap,O_2}} - P_{\mathrm{ap,sCO_2}}) \tag{3}$$

This approach allows us to reduce variations in XCO2 due to differences between the retrieved and estimated surface pressure without re-running the L2FP algorithm. Only the prior surface pressure sampled at the geolocation where the CO2 bands are pointing is needed. Tests have shown that the best results are achieved when the prior surface pressure is sampled at the geolocation of the strong CO2 band. Over land, the revised dP parameter accounts for the fractional change in XCO2 when error is present in surface pressure estimates (given in ppm):

$$dP_{\mathrm{frac}} = X_{\mathrm{CO_2,raw}} \left(1 - \frac{P_{\mathrm{ap,sCO_2}}}{P_{\mathrm{ret,v8}}}\right) \tag{4}$$

Here, XCO2,raw represents the v8 XCO2 from the L2FP run when no bias correction is applied.
A theoretical motivation for our choice of the dP parameters over land and ocean is given in Appendix A. The definitions of co2_grad_del and DWS remain the same in v9.
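The two revised dP parameters can be written as a pair of small functions. The Python sketch below follows the definitions given in the text (ocean parameter in hPa, land parameter in ppm); the function names, argument names, and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def dp_sco2_hpa(p_ret_v8, p_ap_o2, p_ap_sco2):
    """Ocean dP parameter (hPa): pointing term plus band-to-band offset term."""
    pointing_term = p_ret_v8 - p_ap_o2      # retrieved minus prior at O2 A-band location
    band_offset_term = p_ap_o2 - p_ap_sco2  # prior at O2 A-band minus prior at strong CO2 band
    return pointing_term + band_offset_term


def dp_frac_ppm(xco2_raw, p_ret_v8, p_ap_sco2):
    """Land dP parameter (ppm): fractional pressure mismatch scaled by raw XCO2."""
    return xco2_raw * (1.0 - p_ap_sco2 / p_ret_v8)


# Hypothetical sounding: pressures in hPa, raw XCO2 in ppm.
ocean_dp = dp_sco2_hpa(p_ret_v8=900.0, p_ap_o2=902.0, p_ap_sco2=903.0)
land_dp = dp_frac_ppm(xco2_raw=400.0, p_ret_v8=900.0, p_ap_sco2=903.0)
print(f"dP_sCO2 = {ocean_dp:.1f} hPa, dP_frac = {land_dp:.2f} ppm")
```

Note that the two terms of the ocean parameter sum to the retrieved pressure minus the prior at the strong CO2 band location; keeping them separate makes explicit which part stems from the overall pointing and which from the offset between bands.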
Similar to v8, we use three truth proxies to derive the parametric bias correction coefficients for co2_grad_del, DWS, and the revised dP parameters (see Table 2). Compared to v8, the truth proxy data sets are extended in time to cover the longer OCO-2 data record. For the Multi-Model Median, nine models from the OCO-2 model-intercomparison project (MIP) are used (see Table 3). For all datasets a correction was applied using the OCO-2 averaging kernels based on Connor et al. (2008). We convolve the CO2 profiles from the truth proxies with the OCO-2 column averaging kernel before we compare them to OCO-2 XCO2. The parametric bias correction coefficients for v9 are derived from the average of all coefficients derived from the different truth proxies. The adopted coefficients and reference values for land and ocean glint data are summarized in Table 4. The dP_frac coefficient over land is -0.9. This is in agreement with the theoretical value, since a ∼1% change in surface pressure also changes XCO2 by ∼1%, and seems to indicate that the retrieved surface pressure is still not sufficiently accurate to yield the best estimate of XCO2; indeed, as shown in Appendix A, the coefficient implies that the optimal surface pressure is a weighted average of the retrieved and prior surface pressure, with the prior surface pressure weight being about 0.9. Figure 10 shows the different contributions of the v9 parametric bias correction to the raw XCO2.
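Applying a linear parametric bias correction of the kind described in this section amounts to subtracting a weighted sum of parameter deviations from the raw XCO2. The following Python sketch shows this for a single term; only the dP_frac coefficient of -0.9 is taken from the text, whereas the reference value of 0 ppm, the parameter value, and the function names are assumptions for illustration (the actual v9 coefficients and reference values are those of Table 4).

```python
def parametric_bias(params, coeffs, refs):
    """Sum of c_i * (p_i - p_i_ref) over all parameters listed in coeffs."""
    return sum(coeffs[name] * (params[name] - refs[name]) for name in coeffs)


def bias_corrected_xco2(xco2_raw, params, coeffs, refs):
    """Raw XCO2 minus the parametric bias (footprint and scaling terms neglected)."""
    return xco2_raw - parametric_bias(params, coeffs, refs)


coeffs = {"dP_frac": -0.9}  # land coefficient quoted in the text
refs = {"dP_frac": 0.0}     # assumed reference value, for illustration only
params = {"dP_frac": -1.0}  # hypothetical sounding: prior pressure above retrieved

xco2_bc = bias_corrected_xco2(400.0, params, coeffs, refs)
print(f"bias-corrected XCO2: {xco2_bc:.1f} ppm")
```

In this example the prior surface pressure is higher than the retrieved one, so the correction lowers XCO2 by 0.9 ppm, which is the behaviour expected if the true surface pressure lies close to the prior.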

Quality Filters
Bad soundings (e.g. those affected by clouds, low continuum level signal-to-noise ratio, etc.) are mostly screened out by the ABP and IDP before the ACOS L2FP algorithm performs retrievals. Some soundings that pass the pre-screening criteria, however, show errors in raw XCO2 when compared to the truth proxy training data sets that are too large to provide reliable constraints on CO2 fluxes. Therefore, threshold limits are defined for several variables to filter out these soundings. A detailed We introduce the new filter variables dP_O2 and dP_sCO2: dP_O2 is the difference between the retrieved surface pressure and the estimated surface pressure at the geolocation of the O2 A-band, and dP_sCO2 is as given in Eq. (3). These variables replace the dP filter variable in v8, which was defined as the difference between the retrieved surface pressure and a mean surface pressure estimate at the geolocation of all three spectral bands. The improved knowledge of the estimated surface pressure values allows us to relax the filter limits for the standard deviation of the surface elevation in the FOV. Figure 9 shows the bias and scatter in XCO2 over land relative to the Multi-Model Median truth proxy data set as a function of the standard deviation of the surface elevation. In v9, the scatter in the XCO2 difference starts to increase for standard deviations of the surface elevation larger than 110 m, whereas in v8 the scatter already increases for standard deviations larger than 60 m. Therefore, we relax the rather strict upper filter limit of 60 m in v8 to 110 m. This leads to a larger throughput of soundings in mountainous areas in v9. The parameters Max_Declocking_wco2 and Max_Declocking_sco2 are removed from the v9 filtration scheme over land.
Moreover, filter limits for several other variables changed, e.g. rms_rel_wco2, τ_oc, Band 3 albedo, and dP_ABP. The revised filter limits for rms_rel_wco2, τ_oc, and Band 3 albedo cause a larger throughput for regions with boreal forests at high northern latitudes. The updated limits for τ_oc and Band 3 albedo also increase the number of soundings over rain forests. The updated filter limits for dP_ABP cause a larger throughput in regions with bright surfaces, e.g. the Sahara Desert (see Fig. 12). Overall, 10-15% additional soundings pass the new filtration scheme compared to v8. All v9 filter variables and limits for land and ocean observations are summarized in Table 5. For soundings that pass filtration in both v8 and v9, the quality flag did not change.

Global Scaling factor
The global scaling factor corrects for an overall bias in XCO2 which still remains after filtration and application of the parametric bias correction. The global scaling factor is derived by comparing the OCO-2 data to TCCON measurements, which are tied to the WMO scale (e.g. Wunch et al., 2010; Messerschmidt et al., 2010; Geibel et al., 2012). Due to changes in the data filtration and the revised parametric bias correction in v9, the global scaling factor C_0 needs to be updated as well. TCCON stations that are used to derive the global scaling factor are listed in Table 6.
We use the same geographic and temporal co-location criteria for OCO-2 data from direct overpasses of TCCON stations as in O'Dell et al. (2018). We apply the OCO-2 averaging kernels to TCCON data as discussed in the derivation of the coefficients in the parametric bias correction. The slope of the best fit line (forced through a zero intercept) is calculated using the method described in York et al. (2004). The global scaling factor is roughly the same for the different observational modes over land and ocean. Ultimately, we adopt a value of 0.9954 over land and 0.9953 over ocean in v9 (compared with 0.9958 over land and 0.9955 over ocean in v8).
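The notion of a best-fit slope forced through a zero intercept can be illustrated with a simplified estimator. The Python sketch below uses ordinary (optionally weighted) least squares through the origin, which is a deliberately simpler substitute for the York et al. (2004) method and ignores uncertainties in the x values. The data are synthetic, and treating TCCON as the x variable and OCO-2 as the y variable is an assumption made for this illustration.

```python
import numpy as np


def slope_through_origin(x, y, sigma_y=None):
    """Least-squares slope of y = m * x with zero intercept (optional y weights)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if sigma_y is None:
        w = np.ones_like(y)
    else:
        w = 1.0 / np.asarray(sigma_y, dtype=float) ** 2
    return float(np.sum(w * x * y) / np.sum(w * x * x))


# Synthetic co-located pairs (ppm): OCO-2 reads slightly low relative to TCCON.
tccon = np.array([395.0, 398.0, 401.0, 404.0])
oco2 = 0.9954 * tccon

c0 = slope_through_origin(tccon, oco2)
print(f"fitted scaling factor: {c0:.4f}")
```

Because real co-located pairs carry uncertainties in both the satellite and the ground-based values, a fit that accounts for errors in both variables, as referenced in the text, is the more appropriate choice in practice.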

Brief evaluation of OCO-2 X CO2 data
Here, we evaluate the impact of the changes made in v9 on bias-corrected XCO2. To explore changes on local scales, we revisit the target observation over Lauder, New Zealand on February 17, 2015. Figure 13 shows both v8 and v9 bias-corrected XCO2. The improved knowledge of the prior surface pressure with the revised parametric bias correction clearly reduces the correlation between XCO2 and the underlying topography in v9. XCO2 values are distributed more uniformly over the observed scene. The standard deviation is reduced from 1.35 ppm in v8 to 0.74 ppm in v9. A small topography-related bias is still apparent. However, compared to v8, this is nearly a factor of two improvement in reducing biases caused by erroneous surface pressure estimates.
Figure 14 shows the absolute change in bias-corrected XCO2 between v8 and v9 globally. The observed changes are mainly driven by three factors: the updated meteorological sampling algorithm, improved knowledge of the footprint geolocations, and the revised parametric bias correction. In analogy to Fig. 8, the striping patterns follow orbital paths and are caused by the updated meteorological sampling algorithm. Differences over mountainous regions like the Tibetan Plateau or the Andes are driven by the improved knowledge of the prior surface pressure due to the updated footprint geolocations. The revised dPfrac parameter in the parametric bias correction over land also introduces changes in regions at high altitudes but not necessarily with highly variable topography (e.g. South Africa). In addition, the v9 global scaling factor introduces a systematic difference of approximately +0.15 ppm between v8 and v9.
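The size of the systematic offset can be checked with a short calculation. The sketch below assumes that the bias-corrected value is obtained by dividing by the global scaling factor and that 400 ppm is a representative XCO2 value. Both are assumptions made for illustration, and the function name is hypothetical.

```python
# Global scaling factors over land quoted in the text.
C0_V8_LAND = 0.9958
C0_V9_LAND = 0.9954


def scaling_shift(x_ppm, c0_old, c0_new):
    """Change in XCO2 (ppm) when the divisor changes from c0_old to c0_new.

    Assumes the scaled value is x_ppm / C0 (division form)."""
    return x_ppm / c0_new - x_ppm / c0_old


# 400 ppm is an assumed representative value, not taken from the data.
shift = scaling_shift(400.0, C0_V8_LAND, C0_V9_LAND)
print(f"v9 - v8 shift over land: {shift:+.2f} ppm")  # prints +0.16 ppm
```

Under these assumptions the land scaling factors alone shift XCO2 by about +0.16 ppm, which is consistent with the approximately +0.15 ppm difference stated above.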

Conclusions
The update of the pointing vector that is used to derive the geolocation for OCO-2's eight footprints, together with an update of the meteorological sampling algorithm that corrects for a temporal sampling coding error, provides a better estimate for the surface pressure in OCO-2's v9 data product. Biases in XCO2 due to erroneous surface pressure estimates are clearly reduced in regions with rough topography. For example, over Lauder, New Zealand, the standard deviation of bias-corrected XCO2 is reduced by almost a factor of two when the updated surface pressure estimates are used in the revised parametric bias correction that accounts for misspecifications in the instrument pointing offsets.
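Accurate knowledge of the surface pressure and its estimate is crucial to retrieve XCO2 accurately and many challenges

Appendix A: Theoretical motivation of dP parameters in the v9 parametric bias correction

Column-averaged dry-air mole fractions of CO2 are defined as the total column of CO2 ($C_{\mathrm{CO_2}}$) divided by the dry air column ($C_{\mathrm{dryair}}$):

$$X_{\mathrm{CO_2}} = \frac{C_{\mathrm{CO_2}}}{C_{\mathrm{dryair}}} \tag{A1}$$

$C_{\mathrm{dryair}}$ is defined as:

$$C_{\mathrm{dryair}} = \frac{P}{g_0 \, m_{\mathrm{dryair}}} - C_{\mathrm{H_2O}} \, \frac{m_{\mathrm{H_2O}}}{m_{\mathrm{dryair}}} \tag{A2}$$

Here, $P$ is the surface pressure, $g_0$ the gravitational acceleration, $C_{\mathrm{H_2O}}$ the total column of water vapor, $m_{\mathrm{dryair}}$ the mean molecular weight of dry air, and $m_{\mathrm{H_2O}}$ the molecular weight of water vapor. The surface pressure $P_{\mathrm{true}}$ can be written as:

$$P_{\mathrm{true}} = a \cdot P_{\mathrm{ap}} + (1 - a) \cdot P_{\mathrm{ret}} \tag{A3}$$

$P_{\mathrm{ap}}$ and $P_{\mathrm{ret}}$ represent the prior and retrieved surface pressure, respectively. The parameter $a$ is the fractional weight given to the prior in the assumed surface pressure. A value of $a = 0$ means that we completely trust the retrieval; $a = 1$ means that we completely trust the prior. Because of retrieval biases, the true surface pressure is generally close to the prior surface pressure, such that $a \approx 0.9$. As a first step, we neglect the contribution of the total column of water vapor. Then the dry air column is directly proportional to the surface pressure and we can write:

$$X_{\mathrm{CO_2,raw}} \propto \frac{C_{\mathrm{CO_2}}}{P_{\mathrm{ret}}} \tag{A4}$$

For bias-corrected XCO2 we can write:

$$
\begin{aligned}
X_{\mathrm{CO_2,bc}} &\propto \frac{C_{\mathrm{CO_2}}}{a \cdot P_{\mathrm{ap}} + (1 - a) \cdot P_{\mathrm{ret}}}
= X_{\mathrm{CO_2,raw}} \cdot \frac{P_{\mathrm{ret}}}{a \cdot P_{\mathrm{ap}} + (1 - a) \cdot P_{\mathrm{ret}}} \\
&= \frac{X_{\mathrm{CO_2,raw}}}{a \cdot (P_{\mathrm{ap}} / P_{\mathrm{ret}}) + (1 - a)}
= \frac{X_{\mathrm{CO_2,raw}}}{1 - a \, (1 - P_{\mathrm{ap}} / P_{\mathrm{ret}})}
\end{aligned}
\tag{A5}
$$

A first-order Taylor expansion in $x = a \, (1 - P_{\mathrm{ap}} / P_{\mathrm{ret}})$ around $x = 0$, using $1/(1 - x) \approx 1 + x$, leads to:

$$X_{\mathrm{CO_2,bc}} \propto X_{\mathrm{CO_2,raw}} + a \cdot \underbrace{X_{\mathrm{CO_2,raw}} \cdot \left(1 - \frac{P_{\mathrm{ap}}}{P_{\mathrm{ret}}}\right)}_{dP_{\mathrm{frac}}} \tag{A6}$$

The second term in Eq. (A6) is the $dP_{\mathrm{frac}}$ parameter that is used in the v9 parametric bias correction over land (see Sect. 4.1), multiplied by $a$. Here, $a$ represents the coefficient for the $dP_{\mathrm{frac}}$ parameter in the parametric bias correction over land. Comparing Eq. (A6) to Eq. (2), if $p_1 = dP_{\mathrm{frac}}$, then $c_1 = -a$.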
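Further, if we assume that relative variations in $X_{\mathrm{CO_2,raw}} / P_{\mathrm{ret}}$ are small compared to relative variations in $(P_{\mathrm{ret}} - P_{\mathrm{ap}})$, then the nearly constant factor $X_{\mathrm{CO_2,raw}} / P_{\mathrm{ret}}$ can be absorbed into the coefficient $a$ and we can simplify to:

$$X_{\mathrm{CO_2,bc}} = X_{\mathrm{CO_2,raw}} + a \cdot (P_{\mathrm{ret}} - P_{\mathrm{ap}}) \tag{A7}$$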
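The quality of the linearization can be checked numerically. The sketch below compares the exact form in Eq. (A5) with the first-order form in Eq. (A6), treating the proportionality as an equality. The function names and the input values (400 ppm, 1000 hPa prior, 1002 hPa retrieved) are assumptions chosen for illustration.

```python
def xco2_bc_exact(x_raw, p_ap, p_ret, a):
    """Eq. (A5): x_raw / (1 - a * (1 - p_ap / p_ret))."""
    return x_raw / (1.0 - a * (1.0 - p_ap / p_ret))


def xco2_bc_linear(x_raw, p_ap, p_ret, a):
    """Eq. (A6): x_raw + a * dP_frac, with dP_frac = x_raw * (1 - p_ap / p_ret)."""
    dp_frac = x_raw * (1.0 - p_ap / p_ret)
    return x_raw + a * dp_frac


# Assumed illustrative inputs: XCO2 in ppm, pressures in hPa, a = 0.9 as in the text.
x_raw, p_ap, p_ret, a = 400.0, 1000.0, 1002.0, 0.9

exact = xco2_bc_exact(x_raw, p_ap, p_ret, a)
linear = xco2_bc_linear(x_raw, p_ap, p_ret, a)
print(f"exact:  {exact:.4f} ppm")
print(f"linear: {linear:.4f} ppm")
print(f"linearization error: {exact - linear:.4f} ppm")
```

For these inputs the correction itself is about 0.7 ppm for a 2 hPa pressure difference, while the two forms differ by roughly 0.001 ppm, so the first-order expansion is a very good approximation at realistic pressure differences.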