<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">AMT</journal-id><journal-title-group>
    <journal-title>Atmospheric Measurement Techniques</journal-title>
    <abbrev-journal-title abbrev-type="publisher">AMT</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Atmos. Meas. Tech.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1867-8548</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/amt-18-673-2025</article-id><title-group><article-title>Forward model emulator for atmospheric radiative transfer using Gaussian processes and cross validation</article-title><alt-title>Radiative transfer emulator</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Lamminpää</surname><given-names>Otto</given-names></name>
          <email>otto.m.lamminpaa@jpl.nasa.gov</email>
        <ext-link>https://orcid.org/0000-0002-7784-3925</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Susiluoto</surname><given-names>Jouni</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Hobbs</surname><given-names>Jonathan</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-1679-0898</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>McDuffie</surname><given-names>James</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-9408-5695</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Braverman</surname><given-names>Amy</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Owhadi</surname><given-names>Houman</given-names></name>
          
        </contrib>
        <aff id="aff1"><label>1</label><institution>Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Computing and Mathematical Sciences Department, California Institute of Technology, Pasadena, CA, USA</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Otto Lamminpää (otto.m.lamminpaa@jpl.nasa.gov)</corresp></author-notes><pub-date><day>6</day><month>February</month><year>2025</year></pub-date>
      
      <volume>18</volume>
      <issue>3</issue>
      <fpage>673</fpage><lpage>694</lpage>
      <history>
        <date date-type="received"><day>4</day><month>April</month><year>2024</year></date>
           <date date-type="rev-request"><day>3</day><month>May</month><year>2024</year></date>
           <date date-type="rev-recd"><day>19</day><month>November</month><year>2024</year></date>
           <date date-type="accepted"><day>28</day><month>November</month><year>2024</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2025 Otto Lamminpää et al.</copyright-statement>
        <copyright-year>2025</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025.html">This article is available from https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025.html</self-uri><self-uri xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025.pdf">The full text article is available as a PDF file from https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e133">Remote sensing of atmospheric carbon dioxide (CO<sub>2</sub>) carried out by NASA's Orbiting Carbon Observatory-2 (OCO-2) satellite mission and the related uncertainty quantification effort involve repeated evaluations of a state-of-the-art atmospheric physics model. The retrieval, which amounts to solving an inverse problem, requires substantial computational resources. In this work, we propose and implement a statistical emulator to speed up the computations in the OCO-2 physics model. Our approach is based on Gaussian process (GP) regression, leveraging recent research on kernel flows and cross validation to efficiently learn the kernel function in the GP. We demonstrate our method by replicating the behavior of the OCO-2 forward model within measurement error precision and further show that in simulated cases, our method reproduces the CO<sub>2</sub> retrieval performance of the OCO-2 setup in a computational time that is orders of magnitude shorter. The underlying emulation problem is challenging because it is high-dimensional. It is related to operator learning in the sense that the function to be approximated maps high-dimensional vectors to high-dimensional vectors. Our proposed approach is not only fast but also highly accurate (its relative error is less than 1 %). In contrast with artificial neural network (ANN)-based methods, it is interpretable, and its efficiency is based on learning a kernel in an engineered and expressive family of kernels.</p>
  </abstract>
    </article-meta>
  <notes notes-type="copyrightstatement">
  
      <p id="d2e161">© 2025 California Institute of Technology. Government sponsorship acknowledged.</p>
</notes></front>
<body>
      


<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e172">Climate change, one of the most significant global environmental challenges, is primarily attributed to anthropogenic carbon emissions, which have accelerated the increase of carbon dioxide (CO<sub>2</sub>) in the atmosphere, posing a threat to Earth's future. The industrial revolution marked the onset of increased CO<sub>2</sub> emissions due to the extensive use of fossil fuels in various industries, such as transportation, manufacturing, and agriculture. The Intergovernmental Panel on Climate Change underscores CO<sub>2</sub>'s potent effect on planetary warming due to significant radiative forcing <xref ref-type="bibr" rid="bib1.bibx26" id="paren.1"/>. The atmospheric concentration of this trace gas is increasing at an ever faster rate, and as of May 2023, the measured CO<sub>2</sub> at Mauna Loa station was 424.0 ppm, a 3 ppm increase from a year earlier (421.0 ppm in May 2022). Although the global terrestrial biosphere and oceans each take up about 25 % of these emissions <xref ref-type="bibr" rid="bib1.bibx20" id="paren.2"/>, this balance may not be sustainable, which might lead to unpredictable feedbacks in the carbon cycle and the global climate system. These couplings between the Earth's climate system and the carbon cycle can introduce significant uncertainty in future climate change projections <xref ref-type="bibr" rid="bib1.bibx19" id="paren.3"/>, which further renders mitigation efforts increasingly challenging.</p>
      <p id="d2e221">For reliable climate modeling and future scenario prediction, it is crucial to estimate carbon flux accurately (e.g., CarbonTracker, <xref ref-type="bibr" rid="bib1.bibx51" id="altparen.4"/>), which involves quantifying both the sources and natural sinks of carbon. However, current in situ measurement networks are primarily deployed in the northern midlatitudes, leaving areas like the tropics underrepresented <xref ref-type="bibr" rid="bib1.bibx57" id="paren.5"/>. This lack of extensive coverage results in large uncertainties in flux estimates, underscoring the need for a more comprehensive global measurement network.</p>
      <p id="d2e230">To provide a significant increase in coverage and resolution to the ground-based data set, global estimates of total column-averaged mole-fraction CO<sub>2</sub> (average amount of CO<sub>2</sub> over a vertical column of air at a specific ground pixel/location), denoted XCO<sub>2</sub>, are collected using satellite-borne spectrometers. These instruments include the Japanese Greenhouse gases Observing SATellite (GOSAT, <xref ref-type="bibr" rid="bib1.bibx33" id="altparen.6"/>), operational since January 2009; the follow-on GOSAT-2 <xref ref-type="bibr" rid="bib1.bibx24" id="paren.7"/> launched in October 2018; the Orbiting Carbon Observatory-2 from NASA (OCO-2, <xref ref-type="bibr" rid="bib1.bibx12" id="altparen.8"/>), launched in July 2014; the OCO-3 instrument <xref ref-type="bibr" rid="bib1.bibx17" id="paren.9"/> taken to the International Space Station in May 2019; and the Chinese TanSat <xref ref-type="bibr" rid="bib1.bibx53" id="paren.10"/> and TanSat 2 <xref ref-type="bibr" rid="bib1.bibx65" id="paren.11"/>. Planned future missions include the Geostationary Carbon Cycle Observatory (GeoCarb, <xref ref-type="bibr" rid="bib1.bibx42" id="altparen.12"/>), the European CO<sub>2</sub> Monitoring Mission (CO2M, <xref ref-type="bibr" rid="bib1.bibx58" id="altparen.13"/>), and the Global Observing Satellite for Greenhouse gases and Water cycle (GOSAT-GW, <xref ref-type="bibr" rid="bib1.bibx30" id="altparen.14"/>). In this work, we focus exclusively on OCO-2, which, like all the abovementioned missions, measures solar radiance at the top of the atmosphere, reflected by Earth's surface and attenuated by atmospheric scattering and absorption by trace gases and aerosols. 
From these observed radiances, the OCO-2 mission uses a framework called <italic>optimal estimation</italic> (OE, <xref ref-type="bibr" rid="bib1.bibx55" id="altparen.15"/>) to solve the related Bayesian inverse problem (see, e.g., <xref ref-type="bibr" rid="bib1.bibx28" id="altparen.16"/>), referred to as a retrieval. OE is an iterative algorithm that returns an estimate of the posterior mean and covariance as a Gaussian approximation to the nonlinear retrieval problem. Operationally, the retrieval problem is solved using the Atmospheric Carbon Observations from Space (ACOS) software <xref ref-type="bibr" rid="bib1.bibx45" id="paren.17"/>, which implements OE using a state-of-the-art atmospheric full-physics (FP) model. Processing OCO-2 measurements with the ACOS algorithm is a computationally intensive task, and currently about one-third of prescreened clear soundings are used in a low-latency data processing stream <xref ref-type="bibr" rid="bib1.bibx45" id="paren.18"/>. As the data record grows, computational speed is also a major hindrance for retrospective processing of the full collection of cloud-free soundings for the current and any future improved algorithms. Thus, computational efficiency is a limiting factor in releasing the improved data to the user community. These issues are certain to get even worse with upcoming wider-swath missions like CO2M and GOSAT-GW, as evidenced by the experience of another greenhouse gas imaging mission, the Tropospheric Ozone Monitoring Instrument (TROPOMI, <xref ref-type="bibr" rid="bib1.bibx64" id="altparen.19"/>), in regularly reprocessing its data record, which is more than 20 times larger than that of OCO-2.</p>
      <p id="d2e317">As with all inverse problems, some approximations and assumptions have to be made in the ACOS algorithm. The resulting XCO<sub>2</sub> estimates have to be validated and bias-corrected using ground-based measurements from the Total Carbon Column Observing Network (TCCON, <xref ref-type="bibr" rid="bib1.bibx66" id="altparen.20"/>) and the COllaborative Carbon Column Observing Network (COCCON, <xref ref-type="bibr" rid="bib1.bibx18" id="altparen.21"/>) as a reference. These sites are concentrated in the northern midlatitudes, and as a result of this coverage issue and the imperfections in the FP model, significant systematic errors persist in the data set. (See, e.g., <xref ref-type="bibr" rid="bib1.bibx31" id="altparen.22"/>, for the effect of systematic errors and <xref ref-type="bibr" rid="bib1.bibx10" id="altparen.23"/>, for an overview of statistical treatment of and issues in the retrieval.) Considerable effort has been devoted to meeting the high accuracy (less than 0.3 parts per million (ppm) in scenes with background levels of around 410 ppm) and high precision (standard errors less than 0.5 ppm) requirements of ingesting OCO-2 data into flux inversion, which is the primary application of the data product <xref ref-type="bibr" rid="bib1.bibx21 bib1.bibx49 bib1.bibx37 bib1.bibx47 bib1.bibx14 bib1.bibx50 bib1.bibx6" id="paren.24"/>. Recent advancements in applying Markov chain Monte Carlo (MCMC, <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx35" id="altparen.25"/>) for non-Gaussian posterior characterization and simulation-based uncertainty quantification <xref ref-type="bibr" rid="bib1.bibx3" id="paren.26"/> for capturing the overall uncertainty in the retrieval pipeline have been successfully deployed for addressing persisting retrieval errors. These methods, although comprehensive, suffer equally from computational speed issues, as they require a large number of FP evaluations.</p>
      <p id="d2e352">Computational speed issues in OE retrievals have been addressed in several ways. Neural network (NN)-based machine learning approaches <xref ref-type="bibr" rid="bib1.bibx16 bib1.bibx41 bib1.bibx4" id="paren.27"/> have been applied to a combination of real-world radiance data and model atmospheres (outputs of computational atmospheric models, like the Copernicus Atmospheric Monitoring Services (CAMS) model; <xref ref-type="bibr" rid="bib1.bibx7" id="altparen.28"/>). The OCO-2 forward model itself was sped up by using a surrogate model <xref ref-type="bibr" rid="bib1.bibx22" id="paren.29"/> that only partially considered the physical processes present in the FP model and more recently by using a Gaussian process (GP) emulator <xref ref-type="bibr" rid="bib1.bibx39" id="paren.30"/> for replicating the output of the FP model. In this paper, we take a similar approach using GPs but with several improvements and an application to solving the retrieval problem with the help of closed-form Jacobians required in the gradient-based algorithm. GPs are a well-suited technique for forward model emulation, since they can be used with less data (130 K, <xref ref-type="bibr" rid="bib1.bibx16" id="altparen.31"/>, for NN vs. 20 K for GP; see Sect. <xref ref-type="sec" rid="Ch1.S4"/>) and trained more quickly than NN-based approaches. Additionally, a GP provides uncertainty estimates and closed-form Jacobians trivially, which are not straightforward to extract from an NN. For training efficiency, our approach leverages a recent technique for GP parameter learning called <italic>kernel flows</italic> <xref ref-type="bibr" rid="bib1.bibx46" id="paren.32"/> and training data generation via evaluating the FP model using the Reusable Framework for Atmospheric Composition (ReFRACtor) <xref ref-type="bibr" rid="bib1.bibx40" id="paren.33"/>. 
We will demonstrate the accuracy of forward model emulation against a holdout test set of FP evaluations and further demonstrate the ability of our emulator to replicate the OE retrieval performance of the ReFRACtor FP model in a fraction of the computational time. Our approach achieves a remarkably low prediction error, less than 1 % (“within measurement error limits”), which is an excellent result in the more general field of <italic>operator learning</italic>. Strategies for learning more complicated operators, like the FP model in our case, often involve an NN-based architecture <xref ref-type="bibr" rid="bib1.bibx38 bib1.bibx36" id="paren.34"/>. Our approach follows the example of <xref ref-type="bibr" rid="bib1.bibx1" id="text.35"/> in demonstrating that kernel methods are competitive in operator learning.</p>
      <p id="d2e392">The rest of the paper is organized as follows. Section <xref ref-type="sec" rid="Ch1.S2"/> will describe in detail the GP regression, kernel learning, and the resulting forward model emulator. Section <xref ref-type="sec" rid="Ch1.S3"/> will further elaborate on the details of the OCO-2 retrieval algorithm, the state vector, and the FP model describing atmospheric radiative transfer. Section <xref ref-type="sec" rid="Ch1.S4"/> will detail the emulator implementation of the ReFRACtor FP model and assess its performance. Section <xref ref-type="sec" rid="Ch1.S5"/> will show results of our emulator used in a simulated XCO<sub>2</sub> retrieval context, and finally Sect. <xref ref-type="sec" rid="Ch1.S6"/> will provide concluding remarks and ideas on future work and applications.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Gaussian process emulator</title>
      <p id="d2e423"><italic>Gaussian process (GP) regression</italic> <xref ref-type="bibr" rid="bib1.bibx54" id="paren.36"/> (also called <italic>kriging</italic> in a spatial context: <xref ref-type="bibr" rid="bib1.bibx9 bib1.bibx60" id="altparen.37"/>) is a well-studied methodology for approximating any continuous function to an arbitrary accuracy, leveraging training data and a <italic>kernel function</italic> prescribed a priori. In addition, once trained, the GP model can be used to obtain fast and accurate predictions of a computationally demanding physics model, to estimate prediction uncertainty, and to compute closed-form derivatives and Jacobians for the prediction. Physical constraints like positive parameter values can be accounted for in training data design (which we address in later sections) so that predictions happen within the support of the training data set. That is, our training data set will cover the expected minimum and maximum values of each parameter in the state vector. For example, surface albedo is physically restricted between 0 (no reflected light) and 1 (full reflection). If our training data are sufficiently well spread over this range, we can make good predictions essentially by interpolation within the physically feasible interval. Potential departure from this support can be detected by large prediction uncertainty values, as the prediction uncertainty of a GP gets large by design if the input location does not have training points near it. In this section, we outline the basic theory of GP regression and describe our approach to modeling the continuous function between atmospheric state vectors <inline-formula><mml:math id="M13" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> and radiances <inline-formula><mml:math id="M14" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> observed by the OCO-2 instrument. 
We also provide background on maximum likelihood estimation for fitting GP models and present a novel root mean square error (RMSE) cross-validation extension for the kernel flow <xref ref-type="bibr" rid="bib1.bibx46" id="paren.38"/> approach, which we employ for the rest of this work.</p>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Gaussian process regression</title>
      <p id="d2e465">To construct an emulator for the forward model <inline-formula><mml:math id="M15" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, we employ Gaussian process (GP) regression to predict a <italic>label</italic> <inline-formula><mml:math id="M16" display="inline"><mml:mrow><mml:msup><mml:mi>z</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>∈</mml:mo><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow></mml:math></inline-formula> at a new <italic>state</italic> <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. A GP is defined by a <italic>kernel function</italic> (defined explicitly later) <inline-formula><mml:math id="M18" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>: <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:mi mathvariant="script">X</mml:mi><mml:mo>×</mml:mo><mml:mi mathvariant="script">X</mml:mi><mml:mo>→</mml:mo><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow></mml:math></inline-formula>, where in the cases studied in this work <inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:mi mathvariant="script">X</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. 
We denote by <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> the matrix of all kernel function evaluations over the training data <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>×</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> of <inline-formula><mml:math id="M23" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> points with the entries <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:msub><mml:mo>]</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the <inline-formula><mml:math id="M27" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th and <inline-formula><mml:math id="M28" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th training 
data points, respectively. Furthermore, <inline-formula><mml:math id="M29" display="inline"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> denotes the vector of kernel evaluations of the state <inline-formula><mml:math id="M30" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> against all training points <inline-formula><mml:math id="M31" display="inline"><mml:mi mathvariant="bold">X</mml:mi></mml:math></inline-formula>. Using the training data together with the vector of corresponding labels <inline-formula><mml:math id="M32" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>N</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, the GP prediction of the label (or function value) at a new state <inline-formula><mml:math id="M33" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> is given by
            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M34" display="block"><mml:mrow><mml:msup><mml:mi>z</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>≡</mml:mo><mml:mtext>GP</mml:mtext><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where we have assumed without loss of generality that the training data are centered, and thus the GP has a zero mean. We add that the term <inline-formula><mml:math id="M35" display="inline"><mml:mrow><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi></mml:mrow></mml:math></inline-formula> does not depend on the new input state, so it can be precomputed. This keeps the prediction time minimal, because the potentially large matrix <inline-formula><mml:math id="M36" display="inline"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:math></inline-formula> does not have to be inverted for each new input.</p>
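      <p>The split between a one-time precomputation and a cheap per-state prediction in Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the squared-exponential kernel, its length scale, the toy one-dimensional training set, and the helper names <monospace>sq_exp_kernel</monospace>, <monospace>solve</monospace>, <monospace>fit_weights</monospace>, and <monospace>predict</monospace> are all hypothetical choices made only for the example.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def sq_exp_kernel(x, y, length=1.0):
    """Squared-exponential kernel (illustrative assumption, not the paper's kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    w = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * w[j] for j in range(r + 1, n))
        w[r] = (M[r][n] - s) / M[r][r]
    return w


def fit_weights(X, z, sigma):
    """Precompute w = (Gamma[X, X] + sigma * I)^{-1} z once, at training time."""
    n = len(X)
    G = [[sq_exp_kernel(X[i], X[j]) + (sigma if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(G, z)


def predict(x_star, X, w):
    """Eq. (1): z* = Gamma[x*, X] w, i.e. N kernel evaluations per new state."""
    return sum(sq_exp_kernel(x_star, xi) * wi for xi, wi in zip(X, w))


# Toy training set: N = 5 one-dimensional states with (already centered) labels.
X = [[-2.0], [-1.0], [0.0], [1.0], [2.0]]
z = [math.sin(x[0]) for x in X]
w = fit_weights(X, z, sigma=1e-8)   # expensive step, done once
z_star = predict([0.5], X, w)       # cheap step, repeated for every new state
```
]]></preformat>
      <p>In this sketch the linear solve happens only inside <monospace>fit_weights</monospace>; each call to <monospace>predict</monospace> reuses the stored weights, which mirrors the precomputation argument made above.</p>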
      <p id="d2e911">In the GP literature, the variance term <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:math></inline-formula> is usually taken to be the measurement error or local-scale unexplained variability in the training labels <inline-formula><mml:math id="M38" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula>. However, since we are interested in reproducing the outputs of a computer code, the “measurements” are exact, and hence there is no measurement error. It was shown in <xref ref-type="bibr" rid="bib1.bibx46" id="text.39"/> that learning the parameters of GP models from noiseless data can lead to unstable predictive models and numerical singularities. For this reason, we treat <inline-formula><mml:math id="M39" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> as a regularization parameter, which captures the empirical mismatch between the model and the actual data, and optimize it together with the other kernel parameters.</p>
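      <p>One way to see why a small positive <italic>σ</italic> helps numerically is to look at the conditioning of the kernel matrix for two nearly coincident training states. The following sketch is only an illustration of that single mechanism, assuming a squared-exponential kernel with unit variance; the kernel, the two states, and the value of <italic>σ</italic> are hypothetical and are not taken from this work.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def sq_exp_kernel(x, y, length=1.0):
    """Squared-exponential kernel with k(x, x) = 1 (illustrative assumption)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * d2 / length ** 2)


def condition_number_2x2(k, sigma):
    """Condition number of [[1 + sigma, k], [k, 1 + sigma]] for 0 < k < 1.

    The eigenvalues of this symmetric matrix are (1 + sigma) + k and
    (1 + sigma) - k, so their ratio is the 2-norm condition number.
    """
    return (1.0 + sigma + k) / (1.0 + sigma - k)


# Two almost identical training states make the kernel matrix nearly singular.
x1, x2 = [0.5], [0.5001]
k = sq_exp_kernel(x1, x2)

cond_plain = condition_number_2x2(k, 0.0)   # no regularization
cond_reg = condition_number_2x2(k, 1e-4)    # with a small nugget sigma
```
]]></preformat>
      <p>In this toy setting the regularized matrix is better conditioned by several orders of magnitude, at the price of no longer interpolating the training labels exactly.</p>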
      <p id="d2e941">In addition to point predictions, GP prediction can be associated with prediction uncertainty (the posterior variance of the GP), given by
            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M40" display="block"><mml:mrow><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:msup><mml:mo>]</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
          The ability to include prediction uncertainties sets GP regression apart from many modern NN-based machine learning methods, which only provide a point estimate as a prediction. Large prediction variance can be an indication of departure from the support of the training data set, indicating that the GP is likely to lose its prediction skill. Additionally, uncertainty from the predictions can be propagated forward and accounted for in further applications of GP-based emulators.</p>
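      <p>The behavior of the posterior variance in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>) inside and outside the training support can be sketched with a toy example. The squared-exponential kernel, the five one-dimensional training states, the value of <italic>σ</italic>, and the helper names below are assumptions made for illustration only.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def sq_exp_kernel(x, y, length=1.0):
    """Squared-exponential kernel (illustrative assumption, not the paper's kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    v = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * v[j] for j in range(r + 1, n))
        v[r] = (M[r][n] - s) / M[r][r]
    return v


def posterior_variance(x_star, X, sigma):
    """Eq. (2): k(x*, x*) - Gamma[x*, X] (Gamma[X, X] + sigma I)^{-1} Gamma[x*, X]^T."""
    n = len(X)
    G = [[sq_exp_kernel(X[i], X[j]) + (sigma if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp_kernel(x_star, xi) for xi in X]
    v = solve(G, k_star)
    return sq_exp_kernel(x_star, x_star) - sum(a * b for a, b in zip(k_star, v))


X = [[-2.0], [-1.0], [0.0], [1.0], [2.0]]          # toy training states
var_train = posterior_variance([1.0], X, 1e-6)     # at a training point
var_mid = posterior_variance([0.5], X, 1e-6)       # between training points
var_far = posterior_variance([10.0], X, 1e-6)      # far outside the support
```
]]></preformat>
      <p>In this sketch the variance is close to zero at a training state, small between training states, and approaches the prior variance <italic>k</italic>(<italic>x</italic>*, <italic>x</italic>*) far from the data, which is the signal that can be used to flag departures from the training support.</p>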
      <p id="d2e1040">The GP formulas presented here rely on conditional Gaussian distributions and thus have a similar structure to that of optimal interpolation (OI; e.g., p. 157 of <xref ref-type="bibr" rid="bib1.bibx29" id="altparen.40"/>) and related iterative optimal estimation (OE) algorithms (e.g., Eq. <xref ref-type="disp-formula" rid="Ch1.E15"/>), both of which are widely used methods in atmospheric remote sensing. OI and OE use Gaussian assumptions to derive a mean and covariance for the posterior distribution as a solution to an inverse problem, i.e., data assimilation or a retrieval. In Gaussian process regression, the target function (here, the forward model) is represented similarly by a Gaussian distribution that has a mean (prediction) and covariance (error estimate of the prediction).</p>
      <p id="d2e1049">Our interest will be in replicating the results of a gradient-based optimization problem. Hence, in addition to fast evaluations of <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, we would also benefit from fast derivatives obtained from closed-form expressions. Combining Eqs. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) and (<xref ref-type="disp-formula" rid="Ch1.E4"/>), we get
            <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M42" display="block"><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:msup><mml:mi>z</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold">Γ</mml:mi><mml:mfenced close="]" open="["><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi></mml:mrow></mml:mfenced><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mfenced open="[" close="]"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="double-struck">I</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          which describes taking the derivative of Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) with respect to <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e1162">While other machine learning methods, such as artificial neural networks and multilayer perceptrons, can in principle be differentiated, computing the derivatives of a large architecture is computationally more demanding than evaluating Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>), which motivates our use of GP regression. Other similar approaches (e.g., radial basis function networks) can be shown to be universal approximators as well and could be used in place of GPs. As will be shown, our approach yields a fast and accurate predictor that is intuitive and relatively easy to implement, so comparison against other machine learning methods will not be pursued further in the scope of this work.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Kernel function</title>
      <p id="d2e1175">A crucial modeling choice in GP regression is the specification of a kernel function. This task requires either expert knowledge of the domain structure or an iterative trial-and-error search. In our application, we have empirically observed that a kernel function consisting of the sum of a Matérn kernel and a linear kernel yields excellent predictive performance. This is likely because the linear kernel captures the locally near-linear behavior commonly assumed for the OCO-2 forward model, while the flexible Matérn term captures a wide variety of nonlinear effects. The Matérn kernel is more expressive than the Gaussian (radial basis function) kernel used by default in Gaussian process regression, which tends to be “too smooth” to capture more abrupt changes in the function being approximated. Furthermore, the combined kernel can be differentiated in closed form. The kernel function used throughout this work is given by
            <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M44" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="script">W</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M46" display="inline"><mml:mrow><mml:mi mathvariant="script">W</mml:mi><mml:mo>=</mml:mo><mml:mtext>diag</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is a diagonal matrix, <inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is a vector of weights, <inline-formula><mml:math id="M48" display="inline"><mml:mrow><mml:mi>l</mml:mi><mml:mo>∈</mml:mo><mml:mi mathvariant="double-struck">R</mml:mi></mml:mrow></mml:math></inline-formula> is a length scale parameter, and <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M50" 
display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>∈</mml:mo><mml:msub><mml:mi mathvariant="double-struck">R</mml:mi><mml:mo>+</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> are positive weights that are restricted to sum to 1.</p>
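      <p>A minimal NumPy sketch of Eq. (<xref ref-type="disp-formula" rid="Ch1.E4"/>) is given below. The function name <monospace>matern_linear_kernel</monospace>, the argument names, and all numerical values are hypothetical choices made for this example and are not taken from the code used in this work.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def matern_linear_kernel(A, B, w, length, alpha1, alpha2):
    """Sum of a Matern-3/2 and a linear kernel between rows of A (n, m) and B (p, m)."""
    Aw = A * w                                   # rows W x, with W = diag(w)
    Bw = B * w
    diff = Aw[:, None, :] - Bw[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # weighted norm ||x - x'||_W
    s = np.sqrt(3.0) * dist / length
    matern = (1.0 + s) * np.exp(-s)
    linear = Aw @ Bw.T                           # (W x)^T (W x')
    return alpha1 * matern + alpha2 * linear


# Hypothetical inputs and hyperparameters; alpha1 + alpha2 = 1 as required.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))
w = np.array([1.0, 0.5, 2.0])
K = matern_linear_kernel(X, X, w, length=1.5, alpha1=0.7, alpha2=0.3)
```
]]></preformat>
      <p>Because both terms are valid kernels with positive weights, the resulting matrix is symmetric and positive semidefinite, and its diagonal equals the Matérn weight plus the linear weight times the squared weighted norm of each input.</p>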
      <p id="d2e1454">To compute Jacobians, we need an expression for the derivative of the kernel function <inline-formula><mml:math id="M51" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> in Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>). This can be computed in closed form from Eq. (<xref ref-type="disp-formula" rid="Ch1.E4"/>) using known matrix identities. The derivation of a closed-form expression can be found in Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>.</p>
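      <p>The following sketch illustrates how Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) can be evaluated for the kernel of Eq. (<xref ref-type="disp-formula" rid="Ch1.E4"/>). The gradient expression in <monospace>kernel_grad</monospace> was obtained here by applying the chain rule directly to Eq. (<xref ref-type="disp-formula" rid="Ch1.E4"/>) and is not copied from the appendix; it is therefore compared against central finite differences. All function names, data, and hyperparameter values are assumptions made for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def kernel(A, B, w, length, a1, a2):
    """Matern-3/2 plus linear kernel between rows of A and rows of B."""
    Aw, Bw = A * w, B * w
    dist = np.sqrt(((Aw[:, None, :] - Bw[None, :, :]) ** 2).sum(axis=-1))
    s = np.sqrt(3.0) * dist / length
    return a1 * (1.0 + s) * np.exp(-s) + a2 * (Aw @ Bw.T)


def kernel_grad(x_star, X, w, length, a1, a2):
    """Rows hold d k(x*, x_j) / d x*, shape (n, m)."""
    diff = x_star[None, :] - X
    dist = np.sqrt(((diff * w) ** 2).sum(axis=-1))
    s = np.sqrt(3.0) * dist / length
    # d/dx* of the Matern term: -a1 * (3 / l^2) * exp(-s) * W^2 (x* - x_j)
    matern = -a1 * (3.0 / length**2) * np.exp(-s)[:, None] * diff * w**2
    # d/dx* of the linear term: a2 * W^2 x_j
    linear = a2 * X * w**2
    return matern + linear


def gp_mean_and_jacobian(x_star, X, z, noise, **hyper):
    """Predictive mean and its derivative with respect to x_star."""
    K = kernel(X, X, **hyper) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, z)                # (Gamma[X, X] + sigma I)^{-1} z
    mean = (kernel(x_star[None, :], X, **hyper) @ alpha).item()
    jac = kernel_grad(x_star, X, **hyper).T @ alpha
    return mean, jac


# Hypothetical data and hyperparameters.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(25, 3))
z = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] - X[:, 2] ** 2
hyper = dict(w=np.array([1.0, 0.8, 1.2]), length=2.0, a1=0.6, a2=0.4)
x0 = np.array([0.1, -0.2, 0.3])

mean, jac = gp_mean_and_jacobian(x0, X, z, noise=1e-4, **hyper)

# Central finite differences of the predictive mean, one coordinate at a time.
step = 1e-5
fd = np.array([
    (gp_mean_and_jacobian(x0 + step * e, X, z, noise=1e-4, **hyper)[0]
     - gp_mean_and_jacobian(x0 - step * e, X, z, noise=1e-4, **hyper)[0]) / (2 * step)
    for e in np.eye(3)
])
```
]]></preformat>
      <p>Note that the solve for <monospace>alpha</monospace> does not depend on the prediction input, so in an emulator it could be computed once after training and reused for every mean and Jacobian evaluation.</p>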
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Parameter learning</title>
      <p id="d2e1504">The prediction quality of GP regression depends on identifying the hyperparameters <inline-formula><mml:math id="M52" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> that best fit the training data. In our case, following the form of our kernel function, we have <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>=</mml:mo><mml:mfenced close="]" open="["><mml:mrow><mml:mi mathvariant="bold-italic">w</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula>. Hyperparameters are commonly learned via optimization, using maximum likelihood estimation (MLE, <xref ref-type="bibr" rid="bib1.bibx54" id="altparen.41"/>). This amounts to maximizing the log-likelihood, which up to an additive constant is
            <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M54" display="block"><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mi>log⁡</mml:mi><mml:mfenced close="]" open="["><mml:mrow><mml:mtext>det</mml:mtext><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi mathvariant="bold">Γ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mfenced><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is evaluated at parameter values <inline-formula><mml:math id="M56" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>. Although this method is usually robust and performs well, GP applications with high-dimensional inputs and a large amount of training data are known to be challenging because of the matrix inversion and log-determinant calculations involved. Numerous approaches have been suggested to tackle this problem (e.g., local approximations, <xref ref-type="bibr" rid="bib1.bibx63 bib1.bibx15" id="altparen.42"/>). Inspired by the kernel flow approach <xref ref-type="bibr" rid="bib1.bibx46" id="paren.43"/>, where kernel parameters are learned by minimizing a relative reproducing kernel Hilbert space (RKHS) norm, we propose a cross-validation RMSE-based method for this work. Intuitively, the RKHS norm measures the smoothness of a function approximation obtained with a kernel method. While smooth methods generally yield discretization-invariant predictors, we propose to directly minimize the prediction error instead. The idea is to iteratively select small mini-batches of the training data set and leave points out one at a time while using the rest of the mini-batch to predict the left-out values (via Eq. <xref ref-type="disp-formula" rid="Ch1.E1"/>). Learning the kernel parameters that minimize this prediction error is intended to give good predictions for new inputs across the training domain.
This approach leverages the known screening effect associated with Matérn kernels, whereby the influence of faraway points on a prediction diminishes and only nearby points are needed for accurate prediction. The same intuition underlies numerous nearest-neighbor GP methods (e.g., <xref ref-type="bibr" rid="bib1.bibx63" id="altparen.44"/>). The advantage of our approach is that only a small mini-batch is used on each training iteration, which allows faster computations and avoids the expensive log-determinant calculations and large covariance matrix inversions required in MLE. We will later show that our proposed method converges reliably and yields excellent predictions.</p>
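      <p>For comparison with the cross-validation approach, the sketch below evaluates the objective of Eq. (<xref ref-type="disp-formula" rid="Ch1.E5"/>) with a single Cholesky factorization, which is the step whose cost becomes prohibitive for large training sets. It is only an illustration under several assumptions: a plain Matérn-3/2 placeholder kernel with one length scale, the noise variance added to the diagonal as in Eqs. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) and (<xref ref-type="disp-formula" rid="Ch1.E2"/>), synthetic data, and a coarse grid search in place of a gradient-based optimizer.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def matern32(X, length):
    """Placeholder Matern-3/2 kernel matrix over the rows of X."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    s = np.sqrt(3.0) * d / length
    return (1.0 + s) * np.exp(-s)


def log_likelihood(X, z, length, noise):
    """Gaussian log-likelihood up to an additive constant, via one Cholesky factor."""
    K = matern32(X, length) + noise * np.eye(len(X))
    chol = np.linalg.cholesky(K)                  # cubic cost in the number of points
    half_logdet = np.log(np.diag(chol)).sum()     # equals 0.5 * log det(K)
    u = np.linalg.solve(chol, z)                  # u @ u equals z^T K^{-1} z
    return -half_logdet - 0.5 * (u @ u)


# Hypothetical data and a coarse grid of candidate length scales.
rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(40, 2))
z = np.sin(3.0 * X[:, 0]) * np.cos(2.0 * X[:, 1])
lengths = np.array([0.05, 0.2, 0.5, 1.0, 2.0, 5.0])
values = np.array([log_likelihood(X, z, length, noise=1e-4) for length in lengths])
best_length = lengths[values.argmax()]            # largest log-likelihood on the grid
```
]]></preformat>
      <p>Every evaluation of this objective factorizes the full covariance matrix of the training set, whereas the mini-batch approach described above only ever works with matrices of the mini-batch size.</p>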
      <p id="d2e1670">We start by selecting a mini-batch <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>batch</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M58" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mtext>batch</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula> of size <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> by randomly sampling from the training data. We define a leave-one-out (LOO) cross-validation loss function with respect to <inline-formula><mml:math id="M60" display="inline"><mml:mrow><mml:msup><mml:mi>L</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> error (also known as RMSE) by first removing one data point from the mini-batch and using the remaining points to predict it. To do so, we modify the GP prediction formula from Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) so that the <inline-formula><mml:math id="M61" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th data point is left out. 
This is achieved via a  rank-one downdate <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>-</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:math></inline-formula> to remove the effect of the <inline-formula><mml:math id="M63" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th data point from the inverse covariance matrix <inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi 
mathvariant="bold">Γ</mml:mi><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. (See <xref ref-type="bibr" rid="bib1.bibx61" id="altparen.45"/>, and <xref ref-type="bibr" rid="bib1.bibx67" id="altparen.46"/>, for details.) The modified LOO prediction formula is then given by
            <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M65" display="block"><mml:mrow><mml:mover accent="true"><mml:mtext>GP</mml:mtext><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mi>T</mml:mi></mml:msubsup><mml:mfenced close=")" open="("><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo mathvariant="normal" 
stretchy="true">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msubsup><mml:mo>)</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mtext>batch</mml:mtext></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="bold">Γ</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>batch</mml:mtext></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>batch</mml:mtext></mml:msup></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> is the <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> covariance matrix over the mini-batch evaluated at parameter values <inline-formula><mml:math id="M68" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>. Here, the notation <inline-formula><mml:math id="M69" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msub><mml:mo>)</mml:mo><mml:mrow><mml:mo>:</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> denotes all rows of the <inline-formula><mml:math id="M70" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th column. We then define the final loss function by using Eq. 
(<xref ref-type="disp-formula" rid="Ch1.E6"/>) to predict <inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:msub><mml:mi>z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (the <inline-formula><mml:math id="M72" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th training label removed from the mini-batch) as
            <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M73" display="block"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:munderover><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:mover accent="true"><mml:mtext>GP</mml:mtext><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mi>z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mo>‖</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:msup><mml:mo>‖</mml:mo><mml:mi>k</mml:mi></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi mathvariant="normal">…</mml:mi><mml:msub><mml:mi>k</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>⊂</mml:mo><mml:mfenced close="]" open="["><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mi mathvariant="normal">…</mml:mi><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> is a subset of <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>≤</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> indices denoting elements of the mini-batch selected for prediction, which can be chosen as, for example, the entire mini-batch or the <inline-formula><mml:math id="M76" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> nearest neighbors of the center point of the mini-batch. The regularization term, with error norm <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:mo>‖</mml:mo><mml:mo>⋅</mml:mo><mml:msup><mml:mo>‖</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, some penalty magnitude <inline-formula><mml:math id="M78" display="inline"><mml:mi mathvariant="italic">ϵ</mml:mi></mml:math></inline-formula>, and mean <inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, is included to ensure that kernel amplitude parameter values do not grow uncontrollably. 
This is done since we have observed empirically that letting non-identifiable parameters grow during optimization can lead to the optimizer getting “stuck”, whereas this problem is not observed when regularizing the loss function. One may, for example, set <inline-formula><mml:math id="M80" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> to be a vector of 1's.</p>
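      <p>The LOO prediction of Eq. (<xref ref-type="disp-formula" rid="Ch1.E6"/>) and the loss of Eq. (<xref ref-type="disp-formula" rid="Ch1.E7"/>) can be sketched as follows. This is an illustrative NumPy version, not the Julia code used in this work, and it makes several assumptions: a plain Matérn-3/2 placeholder kernel, a two-component parameter vector holding only a length scale and a noise variance, a squared Euclidean norm in the penalty, and synthetic data. The function names are hypothetical. The rank-one downdate is checked against a brute-force computation that explicitly removes each point.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def matern32(X, length):
    """Placeholder Matern-3/2 kernel matrix over the rows of X."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    s = np.sqrt(3.0) * d / length
    return (1.0 + s) * np.exp(-s)


def loo_predictions(K, z):
    """LOO prediction of every point using a rank-one downdate of K^{-1}."""
    P = np.linalg.inv(K)                          # inverse mini-batch covariance
    preds = np.empty(len(z))
    for i in range(len(z)):
        # Downdated inverse: row and column i become zero, and the remaining
        # block equals the inverse of K with point i removed.
        P_i = P - np.outer(P[:, i], P[:, i]) / P[i, i]
        preds[i] = K[:, i] @ P_i @ z
    return preds


def loo_brute_force(K, z, i):
    """Reference LOO prediction obtained by deleting row and column i."""
    keep = np.arange(len(z)) != i
    return K[i, keep] @ np.linalg.solve(K[np.ix_(keep, keep)], z[keep])


def loo_loss(theta, X_batch, z_batch, idx, theta0, eps):
    """Sum of squared LOO errors over idx plus a squared-norm penalty (assumed form)."""
    length, noise = theta
    K = matern32(X_batch, length) + noise * np.eye(len(X_batch))
    resid = loo_predictions(K, z_batch)[idx] - z_batch[idx]
    return (resid ** 2).sum() + eps * ((theta0 - theta) ** 2).sum()


# Hypothetical mini-batch.
rng = np.random.default_rng(4)
X_batch = rng.uniform(-1.0, 1.0, size=(15, 2))
z_batch = np.sin(3.0 * X_batch[:, 0]) + X_batch[:, 1]

K = matern32(X_batch, 1.0) + 1e-4 * np.eye(15)
fast = loo_predictions(K, z_batch)
slow = np.array([loo_brute_force(K, z_batch, i) for i in range(15)])

theta = np.array([1.0, 1e-4])
loss = loo_loss(theta, X_batch, z_batch, np.arange(15), np.ones(2), eps=1e-3)
```
]]></preformat>
      <p>The downdate reuses one inverse of the mini-batch covariance for all left-out points, so the cost per mini-batch is dominated by a single small matrix inversion rather than one solve per left-out point.</p>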
      <p id="d2e2320">We can now optimize the kernel parameters iteratively by repeatedly selecting mini-batches and updating <inline-formula><mml:math id="M81" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> using the gradient of <inline-formula><mml:math id="M82" display="inline"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which is obtained by automatic differentiation using Julia's <italic>Zygote</italic> package <xref ref-type="bibr" rid="bib1.bibx25" id="paren.47"/>. Closed-form kernel derivatives could be used here as well, but since automatic differentiation takes negligible computational time at the mini-batch sizes we use, we do not pursue this idea further in this work. Because the mini-batch is selected at random, this method can be viewed as stochastic gradient descent. For this reason, we use the adaptive moment estimation (ADAM, <xref ref-type="bibr" rid="bib1.bibx32" id="altparen.48"/>) optimizer to find the optimal value. Use of a momentum-based optimizer is further recommended in this application, as we have observed that the cost function often has several local minima. The optimization procedure is summarized in Algorithm <xref ref-type="other" rid="Ch1.Prog1"/>. The final parameter value can be selected to be the one corresponding to the smallest loss function value achieved during training.</p><boxed-text content-type="algorithm" position="float" id="Ch1.Prog1"><label>Algorithm 1</label><caption><p id="d2e2357">Kernel parameter learning</p></caption><disp-quote content-type="algorithmic" specific-use="numbering{1}"><list>

    <list-item><label><bold>Input:</bold></label>

      <p id="d2e2367" specific-use="REQUIRE">kernel function <inline-formula><mml:math id="M83" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, training data (<inline-formula><mml:math id="M84" display="inline"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">z</mml:mi></mml:mrow></mml:math></inline-formula>), batch size <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, number of prediction points <inline-formula><mml:math id="M86" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula>, number of iterations <inline-formula><mml:math id="M87" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>Iter</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
            </list-item>

    <list-item><label><bold>Output:</bold></label>

      <p id="d2e2424" specific-use="ENSURE">matrix of kernel parameters <inline-formula><mml:math id="M88" display="inline"><mml:mi mathvariant="bold">Θ</mml:mi></mml:math></inline-formula> and vector of loss values <inline-formula><mml:math id="M89" display="inline"><mml:mi mathvariant="bold">R</mml:mi></mml:math></inline-formula></p>
            </list-item>

    <list-item>

      <p id="d2e2443" specific-use="STATE">initialize <inline-formula><mml:math id="M90" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>←</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Θ</mml:mi><mml:mo>←</mml:mo><mml:mn mathvariant="bold">0</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="bold">R</mml:mi><mml:mo>←</mml:mo><mml:mn mathvariant="bold">0</mml:mn></mml:mrow></mml:math></inline-formula></p>
            </list-item>

    <list-item>

      <p id="d2e2479" specific-use="FORALL"><bold>for all</bold> <inline-formula><mml:math id="M91" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> in <inline-formula><mml:math id="M92" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mi mathvariant="normal">…</mml:mi><mml:msub><mml:mi>N</mml:mi><mml:mtext>Iter</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <bold>do</bold> <list>
    <list-item>
      <p id="d2e2512" specific-use="STATE"><inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>batch</mml:mtext></mml:msup><mml:mo>←</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>[</mml:mo><mml:mtext>rand</mml:mtext><mml:mo>(</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mtext>batch</mml:mtext></mml:msup><mml:mo>←</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>[</mml:mo><mml:mtext>rand</mml:mtext><mml:mo>(</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mtext>batch</mml:mtext></mml:msub><mml:mo>)</mml:mo><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>   // <italic>Randomly select a mini-batch</italic>  <inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>batch</mml:mtext></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mtext>batch</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula></p></list-item>
    <list-item>
      <p id="d2e2594" specific-use="STATE"><inline-formula><mml:math id="M95" display="inline"><mml:mrow><mml:mi mathvariant="bold">R</mml:mi><mml:mo>[</mml:mo><mml:mi>i</mml:mi><mml:mo>]</mml:mo><mml:mo>←</mml:mo><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>  // <italic>Compute loss</italic> <inline-formula><mml:math id="M96" display="inline"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <italic>from Eq.</italic> (<xref ref-type="disp-formula" rid="Ch1.E7"/>)</p></list-item>
    <list-item>
      <p id="d2e2650" specific-use="STATE"><inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:mi mathvariant="bold">Θ</mml:mi><mml:mo>[</mml:mo><mml:mi>i</mml:mi><mml:mo>]</mml:mo><mml:mo>←</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>←</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mtext>ADAM</mml:mtext><mml:mo>(</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="italic">θ</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>  // <italic>Compute gradient</italic> <inline-formula><mml:math id="M98" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="italic">θ</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <italic>and update parameters</italic> <inline-formula><mml:math id="M99" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> <italic>using ADAM</italic></p></list-item></list></p>
            </list-item>

    <list-item>

      <p id="d2e2775" specific-use="ENDFOR"><bold>end</bold> <bold>for</bold></p>
            </list-item>

    <list-item>

      <p id="d2e2786" specific-use="RETURN"><bold>return</bold>  <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:mi mathvariant="bold">Θ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold">R</mml:mi></mml:mrow></mml:math></inline-formula></p>
            </list-item>
          </list></disp-quote></boxed-text>
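<p>The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the loss of Eq. (7) is replaced by a toy least-squares loss for a two-parameter linear model, the ADAM hyperparameters are the commonly used defaults, and all function and variable names are hypothetical. The ADAM step is subtracted from the parameters here, which corresponds to absorbing the sign into the ADAM term of the algorithm.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math
import random


def toy_loss_and_grad(theta, x_batch, z_batch):
    """Toy stand-in for Eq. (7): mean squared error of z = theta[0] + theta[1] * x."""
    n = len(x_batch)
    loss, g0, g1 = 0.0, 0.0, 0.0
    for x, z in zip(x_batch, z_batch):
        r = theta[0] + theta[1] * x - z
        loss += r * r / n
        g0 += 2.0 * r / n
        g1 += 2.0 * r * x / n
    return loss, [g0, g1]


def train(xs, zs, theta0, n_iter=2000, n_batch=16, lr=0.05,
          beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Mini-batch ADAM loop returning the parameter and loss histories."""
    rng = random.Random(seed)
    theta = list(theta0)
    m = [0.0] * len(theta)  # first-moment estimate
    v = [0.0] * len(theta)  # second-moment estimate
    thetas, losses = [], []
    for i in range(1, n_iter + 1):
        # Randomly select a mini-batch (indices drawn with replacement)
        idx = [rng.randrange(len(xs)) for _ in range(n_batch)]
        x_batch = [xs[k] for k in idx]
        z_batch = [zs[k] for k in idx]
        # Compute the loss and its gradient at the current parameters
        loss, grad = toy_loss_and_grad(theta, x_batch, z_batch)
        losses.append(loss)
        # ADAM update with bias-corrected moment estimates
        for j, g in enumerate(grad):
            m[j] = beta1 * m[j] + (1 - beta1) * g
            v[j] = beta2 * v[j] + (1 - beta2) * g * g
            m_hat = m[j] / (1 - beta1 ** i)
            v_hat = v[j] / (1 - beta2 ** i)
            theta[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        thetas.append(list(theta))
    return thetas, losses


# Noise-free synthetic data generated from z = 1 + 2 x
xs = [k / 50.0 for k in range(100)]
zs = [1.0 + 2.0 * x for x in xs]
thetas, losses = train(xs, zs, [0.0, 0.0])
```
]]></preformat>
<p>Storing the full histories, rather than only the final parameters, mirrors the return values of the algorithm and makes it possible to inspect convergence after training.</p>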
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Training data generation</title>
      <p id="d2e2813">As we aim to reproduce the performance of a function represented as computer code, we take advantage of the freedom to use a space-filling design for <inline-formula><mml:math id="M101" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> in <inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> for training data creation. We first span the unit cube <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:msup><mml:mo>]</mml:mo><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> with a <italic>Sobol' sequence</italic> <xref ref-type="bibr" rid="bib1.bibx59 bib1.bibx52" id="paren.49"/> of <inline-formula><mml:math id="M104" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> points. In practice we employ Julia's <italic>Sobol.jl</italic> <xref ref-type="bibr" rid="bib1.bibx27" id="paren.50"/> package for this step. Then, using information about the minimum and maximum physically feasible value of each input dimension, we scale the unit cube to span the whole state space. During development, we also tested random sampling and Latin-hypercube-based designs, which turned out to leave “holes” in the training data set and thus led to uneven predictive performance across the input space. Sobol' sequences, a space-filling design akin to Latin hypercubes, span the input space more evenly and gave the best training data in our experiments.
We further evaluate the computational model <inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> at each training point, obtaining states <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and model outputs <inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:mi mathvariant="bold">Y</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>×</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>.</p>
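<p>The two design steps described above, filling the unit cube with a low-discrepancy sequence and scaling it to physical bounds, can be sketched as follows. To stay dependency-free, this sketch swaps in a Halton sequence for the Sobol' sequence used in this work; both are low-discrepancy designs, but they are not the same sequence. The bounds and all names are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_unit_cube(n_points, primes):
    """First n_points of a Halton sequence in [0, 1)^m, one prime base per dimension."""
    return [[radical_inverse(i, p) for p in primes]
            for i in range(1, n_points + 1)]


def scale_to_bounds(points, lower, upper):
    """Map unit-cube points to the box defined by per-dimension lower and upper bounds."""
    return [[lo + u * (hi - lo) for u, lo, hi in zip(point, lower, upper)]
            for point in points]


# Hypothetical bounds for a three-dimensional state (illustrative values only)
lower = [380.0, 90000.0, 0.0]
upper = [440.0, 105000.0, 1.0]
unit_points = halton_unit_cube(8, primes=[2, 3, 5])
design = scale_to_bounds(unit_points, lower, upper)  # N x m list of training states
```
]]></preformat>
<p>Each row of <monospace>design</monospace> is one training state at which the computational model would then be evaluated.</p>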
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>The Orbiting Carbon Observatory-2</title>
      <p id="d2e2936">In this section, we describe OCO-2 and the related measurements, physics model, state vector, and retrieval algorithm. Further information on these topics can be found in, for example, <xref ref-type="bibr" rid="bib1.bibx8" id="text.51"/>, <xref ref-type="bibr" rid="bib1.bibx44" id="text.52"/>, <xref ref-type="bibr" rid="bib1.bibx12" id="text.53"/>, <xref ref-type="bibr" rid="bib1.bibx45" id="text.54"/>, and in the Algorithm Theoretical Basis Document (ATBD) <xref ref-type="bibr" rid="bib1.bibx2" id="paren.55"/>.</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>The OCO-2 instrument</title>
      <p id="d2e2961">OCO-2 is a NASA-operated satellite mission dedicated to providing data products of global atmospheric carbon dioxide concentrations <xref ref-type="bibr" rid="bib1.bibx11" id="paren.56"/>. The satellite is pointed towards Earth as it measures solar light reflected by Earth's surface and atmosphere, recorded as radiances. The OCO-2 instrument itself is composed of three spectrometers that measure light reflected from Earth's surface in the infrared part of the spectrum in three separate wavelength bands. These bands are centered around 0.765, 1.61, and 2.06 µm and are called the O<sub>2</sub> A-band (O2), the weak CO<sub>2</sub> band (WCO2), and the strong CO<sub>2</sub> band (SCO2), respectively. Each observation consists of 1016 radiances on separate wavelengths from each band (for more information, see, e.g., <xref ref-type="bibr" rid="bib1.bibx13 bib1.bibx56" id="altparen.57"/>). These measurements are then used to infer a state vector containing information on atmospheric properties like CO<sub>2</sub> concentration on 20 pressure levels, surface pressure, temperature, and aerosol optical depth (AOD). The state vector also includes surface properties like albedo and solar-induced chlorophyll fluorescence (SIF). The primary scalar quantity of interest is the column-averaged CO<sub>2</sub> concentration (XCO<sub>2</sub>).</p>
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Atmospheric radiative transfer</title>
      <p id="d2e3033">A key part of inferring XCO<sub>2</sub> from observed radiances is the construction of a computational atmospheric radiative transfer model that describes how solar radiation is propagated, reflected, and scattered by Earth's surface and atmosphere. Together with an instrument model, this computer code is known as the full-physics (FP) model, referred to in this work as
            <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M115" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M116" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> is the output of the FP model (a wavelength-by-wavelength radiance), <inline-formula><mml:math id="M117" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> is a state vector containing atmospheric and surface information, and <inline-formula><mml:math id="M118" display="inline"><mml:mi mathvariant="bold-italic">b</mml:mi></mml:math></inline-formula> denotes model parameters held fixed during data processing. A thorough description of the FP model is given in the ATBD <xref ref-type="bibr" rid="bib1.bibx2" id="paren.58"/>. To motivate our emulation approach, we describe here selected parts of the forward model physics; this is not intended as a full description of the physics involved. Rather, we use this information to better design our emulator.</p>
      <p id="d2e3095">Part of the radiance comes from absorption of radiation by atmospheric molecules, given by
            <disp-formula id="Ch1.E9" content-type="numbered"><label>9</label><mml:math id="M119" display="block"><mml:mrow><mml:mi>I</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo><mml:mi>cos⁡</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">τ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mo>⋅</mml:mo><mml:mi>R</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">φ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M120" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> is wavelength, the <inline-formula><mml:math id="M121" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th wavelength corresponds to the <inline-formula><mml:math id="M122" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>th entry of radiance <inline-formula><mml:math id="M123" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M124" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the solar flux at the top of the atmosphere, <inline-formula><mml:math id="M125" display="inline"><mml:mrow><mml:mi>R</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">φ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the reflectance of the surface, <inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is an integral over the radiation path length that sums over all modeled absorbers, <inline-formula><mml:math id="M127" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M128" display="inline"><mml:mi mathvariant="italic">φ</mml:mi></mml:math></inline-formula> are the observation zenith and azimuth angles, and <inline-formula><mml:math 
id="M129" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M130" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">φ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are the corresponding solar zenith and azimuth angles. Observation and solar angles have a significant effect on the observed and modeled radiances, which will be important later in this work.</p>
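<p>As a rough numerical illustration of Eq. (9), the following Python sketch evaluates the expression at a single wavelength. It makes two assumptions that are not part of the FP model description: the path integral <italic>g</italic> is represented simply as a sum of one precomputed, non-negative term per absorber, and the argument of the cosine is passed through exactly as it is written in Eq. (9). All names and numbers are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def absorption_radiance(solar_flux, tau0, reflectance, absorber_terms):
    """Evaluate Eq. (9) at one wavelength.

    absorber_terms holds one non-negative contribution per modeled absorber;
    their sum plays the role of g(lambda) in this simplified sketch.
    """
    g = sum(absorber_terms)
    return solar_flux * math.cos(tau0) * reflectance * math.exp(-g)


# Hypothetical inputs: unit solar flux, zero cosine argument, 30 % reflectance
weak = absorption_radiance(1.0, 0.0, 0.3, [0.1, 0.4])
strong = absorption_radiance(1.0, 0.0, 0.3, [0.1, 0.4, 1.0])  # one extra absorber
```
]]></preformat>
<p>Adding an absorber increases <italic>g</italic> and therefore lowers the radiance exponentially, which is the nonlinearity an emulator of this part of the model has to capture.</p>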
      <p id="d2e3321">After calculating the absorption with Eq. (<xref ref-type="disp-formula" rid="Ch1.E9"/>), equations further describing atmospheric scattering are employed to solve for <italic>atmospheric radiative transfer</italic> (RT), which describes the total effect of atmosphere and surface on the scattered photons. The FP framework further includes an instrument model, which describes the effects of the observing system on the <italic>top-of-the-atmosphere radiances</italic>. These effects include instrument Doppler shift, spectral dispersion, and convolution with the instrument line shape (ILS) function, reducing the resolution from the finer RT grid to the coarser observational grid. On an abstracted level, this corresponds mathematically to
            <disp-formula id="Ch1.E10" content-type="numbered"><label>10</label><mml:math id="M131" display="block"><mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mtext>OBS</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow><mml:mrow><mml:mo>+</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:munderover><mml:mtext>RT</mml:mtext><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mtext>ILS</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>+</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M133" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denote the instrument effects other than convolution that can be expressed as multiplication and addition. Generally speaking, the instrument effects depend on different physical properties that can vary between detector arrays, while the RT portion of the forward model is constant within the instrument. This observation motivates us to focus on emulating the outputs of the RT, referred to as <italic>monochromatic radiances</italic>, to which the instrument functions can then be applied as a separate step. Looking forward to operational integration of our emulator, this reduces the complexity of the emulated system and arguably makes our task easier.</p>
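<p>A discretized version of Eq. (10) shows how instrument functions can be applied to emulated monochromatic radiances as a separate step. This is a sketch under stated assumptions: the ILS is taken to be a Gaussian of fixed width, which is a placeholder rather than the actual OCO-2 line shape, the integral is approximated by a Riemann sum on a uniform fine grid, and the grid, width, and coefficient values are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def gaussian_ils(lam, lam_prime, width):
    """Hypothetical Gaussian line shape, normalized to unit area in lam_prime."""
    z = (lam - lam_prime) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))


def apply_instrument(rt_grid, rt_radiance, obs_grid, width, c1, c2):
    """Riemann-sum version of Eq. (10) on a uniform fine wavelength grid."""
    d_lam = rt_grid[1] - rt_grid[0]
    observed = []
    for lam, gain, offset in zip(obs_grid, c1, c2):
        # Convolve the fine-grid radiance with the line shape centered at lam
        integral = sum(r * gaussian_ils(lam, lp, width) * d_lam
                       for lp, r in zip(rt_grid, rt_radiance))
        # Apply the multiplicative and additive instrument terms
        observed.append(gain * integral + offset)
    return observed


# Fine RT grid (micrometers) with a flat spectrum, and a coarse observation grid
rt_grid = [1.60 + 1e-5 * k for k in range(2001)]
rt_radiance = [2.0] * len(rt_grid)
obs_grid = [1.605, 1.610, 1.615]
observed = apply_instrument(rt_grid, rt_radiance, obs_grid, width=5e-4,
                            c1=[1.0, 1.0, 1.0], c2=[0.0, 0.0, 0.0])
```
]]></preformat>
<p>Because the line shape is normalized and lies well inside the fine grid, a flat input spectrum is returned essentially unchanged, which provides a simple sanity check for any implementation of this step.</p>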
</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>OCO-2 state vector</title>
      <p id="d2e3472">The state vector elements comprising <inline-formula><mml:math id="M134" display="inline"><mml:mi mathvariant="bold">x</mml:mi></mml:math></inline-formula> for the FP model are summarized in Table <xref ref-type="table" rid="Ch1.T1"/>. Notably, we have divided the table into two parts. The upper half lists the previously mentioned atmospheric and surface state vector elements that affect only the RT part, and the lower half lists the remaining elements, which relate to instrument effects. The latter group includes scaling factors for empirical orthogonal functions (EOFs) that capture unmodeled offsets in the observed radiances <xref ref-type="bibr" rid="bib1.bibx45" id="paren.59"/>.</p>

<table-wrap id="Ch1.T1" specific-use="star"><label>Table 1</label><caption><p id="d2e3490">Elements of the OCO-2 state vector by functional group. The second column indicates the total elements per group. The check marks in the remaining columns indicate which wavelength bands are sensitive to changes in each variable.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:colspec colnum="5" colname="col5" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">State vector element</oasis:entry>
         <oasis:entry colname="col2">No. elements</oasis:entry>
         <oasis:entry colname="col3">O<sub>2</sub></oasis:entry>
         <oasis:entry colname="col4">WCO<sub>2</sub></oasis:entry>
         <oasis:entry colname="col5">SCO<sub>2</sub></oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">CO<sub>2</sub> concentration profile</oasis:entry>
         <oasis:entry colname="col2">20</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">H<sub>2</sub>O scaling factor</oasis:entry>
         <oasis:entry colname="col2">1</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Surface pressure (pascals)</oasis:entry>
         <oasis:entry colname="col2">1</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Temperature offset (kelvin)</oasis:entry>
         <oasis:entry colname="col2">1</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Aerosol height, width, and AOD</oasis:entry>
         <oasis:entry colname="col2">12</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">O<sub>2</sub> band albedo</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">WCO<sub>2</sub> band albedo</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">SCO<sub>2</sub> band albedo</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">O<sub>2</sub> band dispersion</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">WCO<sub>2</sub> band dispersion</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">SCO<sub>2</sub> band dispersion</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">O<sub>2</sub> band EOF scaling</oasis:entry>
         <oasis:entry colname="col2">3</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">WCO<sub>2</sub> band EOF scaling</oasis:entry>
         <oasis:entry colname="col2">3</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">✓</oasis:entry>
         <oasis:entry colname="col5"/>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">SCO<sub>2</sub> band EOF scaling</oasis:entry>
         <oasis:entry colname="col2">3</oasis:entry>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">✓</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">SIF parameters</oasis:entry>
         <oasis:entry colname="col2">2</oasis:entry>
         <oasis:entry colname="col3">✓</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5"/>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
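<p>For bookkeeping, the grouping in Table 1 can be turned into index ranges of a flat state vector. The sketch below is illustrative only: the group names are hypothetical, and the ordering of the elements within the vector is assumed to follow the row order of the table, which need not match the ordering used in the FP code.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# (group name, number of elements), in the row order of Table 1
UPPER_GROUPS = [
    ("co2_profile", 20), ("h2o_scaling", 1), ("surface_pressure", 1),
    ("temperature_offset", 1), ("aerosol", 12),
    ("albedo_o2", 2), ("albedo_wco2", 2), ("albedo_sco2", 2),
]
LOWER_GROUPS = [
    ("dispersion_o2", 2), ("dispersion_wco2", 2), ("dispersion_sco2", 2),
    ("eof_o2", 3), ("eof_wco2", 3), ("eof_sco2", 3), ("sif", 2),
]


def build_slices(groups, start=0):
    """Assign consecutive index ranges to each group; return them with the end index."""
    slices = {}
    for name, count in groups:
        slices[name] = slice(start, start + count)
        start += count
    return slices, start


upper_slices, n_upper = build_slices(UPPER_GROUPS)
lower_slices, n_total = build_slices(LOWER_GROUPS, start=n_upper)

# Example: extract the CO2 profile from a placeholder state vector
state = [0.0] * n_total
co2_profile = state[upper_slices["co2_profile"]]
```
]]></preformat>
<p>With the counts in Table 1, the upper half contributes 41 elements and the lower half 17, for a total of 58; an emulator of the RT part alone would take only the first group as input.</p>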

      <p id="d2e3907">In addition to state vector elements, the FP model is parameterized by a set of parameters that are held fixed based on auxiliary information, such as laboratory measurements or meteorological data sets. These parameters include instrument calibration details, spectroscopy properties for absorbing gases, land elevation, and aerosol microphysical parameters. These aerosol parameters arise from the selection of two dominant aerosol types as a function of space and time. All aerosol types have different optical properties. This choice is determined a priori by global maps based on meteorological knowledge and measurements (see Fig. <xref ref-type="fig" rid="Ch1.F1"/>). The possible dominant aerosol types are dust (DU), sulfate (SO), sea salt (SS), organic carbon (OC), and black carbon (BC). While constructing the emulator, we will consider data sets with a fixed pair of dominant aerosol species in order to decouple their physical effects from the rest of the state vector. Separate emulators can then be constructed for each pair of aerosol species, and the emulator to use can be selected by matching the measurement location with the appropriate types.</p>

      <fig id="Ch1.F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e3915">Example global map of <bold>(a)</bold> primary and <bold>(b)</bold> secondary aerosol types used in the OCO-2 FP model. Different aerosol types imply different physics, which needs to be taken into account when building a forward model emulator. Image taken from <xref ref-type="bibr" rid="bib1.bibx2" id="text.60"/>.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f01.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS4">
  <label>3.4</label><title>ReFRACtor</title>
      <p id="d2e3942">This work develops a proof-of-concept version of an OCO-2 forward model emulator for a simulated case. For this reason and for ease of access, we implement our simulations using the Reusable Framework for Atmospheric Composition (ReFRACtor, <xref ref-type="bibr" rid="bib1.bibx40" id="altparen.61"/>). ReFRACtor is an extensible multi-instrument atmospheric composition retrieval framework that supports and facilitates combined use of radiance measurements from different instruments in the ultraviolet, visible, near-infrared, and thermal-infrared. It has been open-source since 2014, when it was first developed as the Level-2 processing code for OCO-2. Since 2017 the development team has worked to create a more general framework that supports more instruments and spectral regions. This framework has been developed to provide the broader Earth science community with a freely licensed software package that uses robust software engineering practices with well-tested, community-accepted algorithms and techniques. ReFRACtor is geared not only towards the creation of end-to-end production science data systems, but also towards scientists who need a software package to help investigate specific Earth science atmospheric composition questions. Although ReFRACtor includes an implementation of a version of the OCO-2 production algorithm, the two have drifted apart since the initial intercomparison work was done. At that time it was validated against the B9.2.00 version of the software. Since then, mainly bug fixes have been kept in sync between the two versions. Additionally, the core radiative transfer algorithms are the same, which justifies the use of ReFRACtor for constructing our emulator at this stage. Some minor additional algorithmic features made their way into the ReFRACtor version of OCO-2 from the production version. The major discrepancies are expected to stem from changes in configuration values not implemented in ReFRACtor. 
These include values such as a priori and covariance versions, EOF data sets, ABSCO versions, and the solar model.</p>
</sec>
<sec id="Ch1.S3.SS5">
  <label>3.5</label><title>Retrieval algorithm</title>
      <p id="d2e3956">Inferring XCO<sub>2</sub> from measured radiances, referred to as performing a retrieval, is an ill-posed inverse problem. The relationship between measurement and state is first modeled as
            <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M150" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where the data <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> are a radiance vector; the unknown <inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the state vector; <inline-formula><mml:math id="M153" display="inline"><mml:mi>F</mml:mi></mml:math></inline-formula>: <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>m</mml:mi></mml:msup><mml:mo>→</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the OCO-2 FP model; and <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the measurement error. For completeness, we summarize the operational retrieval algorithm used in OCO-2 processing. The retrieval solves the inverse problem using a Bayesian formulation, in which the additive error <inline-formula><mml:math id="M156" display="inline"><mml:mi mathvariant="bold-italic">ε</mml:mi></mml:math></inline-formula> and the prior for <inline-formula><mml:math id="M157" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> are assumed to be Gaussian such that
            <disp-formula id="Ch1.E12" content-type="numbered"><label>12</label><mml:math id="M158" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mo>∼</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>∼</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
          The measurement error covariance matrix <inline-formula><mml:math id="M159" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is assumed to be diagonal, with elements for each wavelength <inline-formula><mml:math id="M160" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> given by
            <disp-formula id="Ch1.E13" content-type="numbered"><label>13</label><mml:math id="M161" display="block"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>j</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msub><mml:mi>y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M162" display="inline"><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are calibration parameters adjusted by the instrument calibration team. The a priori covariance is taken to be diagonal for non-CO<sub>2</sub> parameters, and the CO<sub>2</sub> profile is assumed to have the correlation structure shown in Fig. <xref ref-type="fig" rid="Ch1.F2"/>, which promotes continuous concentration profiles and limits the variability higher up in the atmosphere.</p>
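<p>The measurement model of Eqs. (11)–(13) can be illustrated with a short simulation. The sketch below is hedged in two ways: the calibration parameters and radiances are made-up values, and the noise variance is computed from the noise-free model radiances, whereas Eq. (13) is written in terms of the radiance entries themselves. Function names are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import random


def noise_variances(radiances, k1, k2):
    """Diagonal of the error covariance from Eq. (13): sigma_j^2 = k1 * y_j + k2."""
    return [k1 * y + k2 for y in radiances]


def simulate_measurement(model_radiances, k1, k2, seed=0):
    """Draw y = F(x) + eps with independent Gaussian errors, following Eqs. (11)-(13)."""
    rng = random.Random(seed)
    variances = noise_variances(model_radiances, k1, k2)
    return [f + rng.gauss(0.0, var ** 0.5)
            for f, var in zip(model_radiances, variances)]


# Hypothetical forward model output and calibration parameters
model_radiances = [1.0, 4.0, 9.0]
k1, k2 = 0.01, 0.0001
variances = noise_variances(model_radiances, k1, k2)
noisy = simulate_measurement(model_radiances, k1, k2)
```
]]></preformat>
<p>Because the error covariance is diagonal, each wavelength can be perturbed independently, and brighter spectral samples receive larger absolute noise.</p>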

      <fig id="Ch1.F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e4225">The a priori correlation matrix and standard deviation used for the CO<sub>2</sub> vertical profile in the OCO-2 retrieval. Vertical levels are ordered from the top of the atmosphere (Level 1) to the surface (Level 20).</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f02.png"/>

        </fig>

      <p id="d2e4243">The retrieval is operationally carried out using iterative gradient-based methods to solve for the maximum a posteriori estimate, which is equivalent to minimizing the cost function:
            <disp-formula id="Ch1.E14" content-type="numbered"><label>14</label><mml:math id="M167" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>=</mml:mo><mml:munder><mml:mtext>argmin</mml:mtext><mml:mi mathvariant="bold-italic">x</mml:mi></mml:munder><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>-</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>-</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>+</mml:mo><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
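      <p>As a minimal sketch of how the cost function in Eq. (<xref ref-type="disp-formula" rid="Ch1.E14"/>) can be evaluated with the diagonal noise model of Eq. (<xref ref-type="disp-formula" rid="Ch1.E13"/>), consider the following Python fragment. The function names, the toy linear forward model, and the numerical values of the calibration parameters are illustrative assumptions and are not taken from the operational code.</p>
<preformat preformat-type="code">
```python
import numpy as np

def noise_variance(y, k1, k2):
    """Diagonal of S_eps following Eq. (13): sigma_j^2 = k1 * y_j + k2."""
    return k1 * np.asarray(y, dtype=float) + k2

def map_cost(x, y, forward, x_a, S_a, k1, k2):
    """Cost function of Eq. (14) for a diagonal S_eps."""
    r = y - forward(x)                  # measurement residual
    d = x - x_a                         # departure from the prior mean
    var = noise_variance(y, k1, k2)     # diagonal of S_eps
    return float(r @ (r / var) + d @ np.linalg.solve(S_a, d))

# Toy usage with a hypothetical linear forward model F(x) = K x.
K = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.3]])

def forward(x):
    return K @ x

x_a = np.zeros(2)
S_a = np.eye(2)
x_true = np.array([1.0, 2.0])
y = forward(x_true)                     # noise-free synthetic measurement
cost_at_truth = map_cost(x_true, y, forward, x_a, S_a, k1=0.01, k2=0.1)
cost_at_prior = map_cost(x_a, y, forward, x_a, S_a, k1=0.01, k2=0.1)
```
</preformat>
      <p>Because the synthetic measurement is noise-free, only the prior term contributes at the true state, whereas the measurement term dominates at the prior mean.</p>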
      <p id="d2e4353">This optimization problem is solved using the Levenberg–Marquardt algorithm, in which at iteration <inline-formula><mml:math id="M168" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> the state is updated according to
            <disp-formula id="Ch1.E15" content-type="numbered"><label>15</label><mml:math id="M169" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mfenced close=")" open="("><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>)</mml:mo><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>i</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold">K</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mtext>d</mml:mtext><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>=</mml:mo><mml:mfenced close="]" open="["><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>i</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mo>-</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi 
mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi mathvariant="bold-italic">a</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mfenced><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where <inline-formula><mml:math id="M170" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> is a damping parameter and <inline-formula><mml:math id="M171" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">K</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the Jacobian of <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> at iteration <inline-formula><mml:math id="M173" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>. After each iteration, before updating the state, the effect of forward model nonlinearity is assessed by computing the quantity
            <disp-formula id="Ch1.E16" content-type="numbered"><label>16</label><mml:math id="M174" display="block"><mml:mrow><mml:mi>R</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mtext>FC</mml:mtext></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M175" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the value of the cost function (<xref ref-type="disp-formula" rid="Ch1.E14"/>) at iteration <inline-formula><mml:math id="M176" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M177" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the corresponding value at iteration <inline-formula><mml:math id="M178" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mtext>FC</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is the cost function value obtained by assuming that <inline-formula><mml:math id="M180" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mtext>d</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="bold">K</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>d</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> (i.e., a linear update). 
Based on the value of <inline-formula><mml:math id="M181" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, one of the following is executed: <list list-type="bullet"><list-item>
      <p id="d2e4727"><inline-formula><mml:math id="M182" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M183" display="inline"><mml:mo>≤</mml:mo></mml:math></inline-formula> 0.0001: <inline-formula><mml:math id="M184" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> is increased by a factor of 10. State is not updated.</p></list-item><list-item>
      <p id="d2e4751">0.0001 <inline-formula><mml:math id="M185" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M186" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M187" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 0.25: <inline-formula><mml:math id="M188" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> is increased by a factor of 10; <inline-formula><mml:math id="M189" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mtext>d</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>.</p></list-item><list-item>
      <p id="d2e4820">0.25 <inline-formula><mml:math id="M190" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M191" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M192" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 0.75: <inline-formula><mml:math id="M193" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mtext>d</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>.</p></list-item><list-item>
      <p id="d2e4882"><inline-formula><mml:math id="M194" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M195" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 0.75: <inline-formula><mml:math id="M196" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> is decreased by a factor of 2, <inline-formula><mml:math id="M197" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mtext>d</mml:mtext><mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>.</p></list-item></list> After each nondivergent step, convergence is assessed by computing the error variance derivative (see <xref ref-type="bibr" rid="bib1.bibx2" id="altparen.62"/>, for details). The operational retrieval further provides an estimate for the posterior covariance as a Laplace approximation:
            <disp-formula id="Ch1.E17" content-type="numbered"><label>17</label><mml:math id="M198" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">S</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:msup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
          This is done together with the so-called <italic>averaging kernel</italic>:
            <disp-formula id="Ch1.E18" content-type="numbered"><label>18</label><mml:math id="M199" display="block"><mml:mrow><mml:mi mathvariant="bold">A</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msup><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="bold">a</mml:mi></mml:msub><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:msup><mml:mi mathvariant="bold">K</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msubsup><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:mi mathvariant="bold">K</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          which can be interpreted as the sensitivity of the retrieved state <inline-formula><mml:math id="M200" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> to the true atmospheric state <inline-formula><mml:math id="M201" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula>. These quantities are important for downstream users of OCO data products, which highlights the value of producing closed-form Jacobians during data processing.</p>
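      <p>The update of Eq. (<xref ref-type="disp-formula" rid="Ch1.E15"/>), the damping rule based on <inline-formula><mml:math display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, and the summaries of Eqs. (<xref ref-type="disp-formula" rid="Ch1.E17"/>) and (<xref ref-type="disp-formula" rid="Ch1.E18"/>) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the operational implementation: the function names are hypothetical, the inverse covariances are passed as dense matrices, and the boundary values 0.25 and 0.75, which the list above leaves unspecified, are assigned to the lower of the two neighboring branches.</p>
<preformat preformat-type="code">
```python
import numpy as np

def lm_step(x_i, y, F_i, K_i, x_a, S_a_inv, S_eps_inv, gamma):
    """Solve Eq. (15) for the proposed step dx_{i+1}."""
    lhs = (1.0 + gamma) * S_a_inv + K_i.T @ S_eps_inv @ K_i
    rhs = K_i.T @ S_eps_inv @ (y - F_i) + S_a_inv @ (x_a - x_i)
    return np.linalg.solve(lhs, rhs)

def update_rule(R, gamma):
    """Map the ratio R of Eq. (16) to (new gamma, whether the step is accepted)."""
    if R > 0.75:
        return gamma / 2.0, True       # nearly linear: relax the damping
    if R > 0.25:
        return gamma, True             # acceptable: keep the damping
    if R > 0.0001:
        return gamma * 10.0, True      # poor agreement: accept, damp more
    return gamma * 10.0, False         # divergent: reject the step

def posterior_summary(K, S_a_inv, S_eps_inv):
    """Posterior covariance (Eq. 17) and averaging kernel (Eq. 18)."""
    fisher = K.T @ S_eps_inv @ K
    S_hat = np.linalg.inv(fisher + S_a_inv)
    return S_hat, S_hat @ fisher

# Toy usage with a hypothetical linear forward model F(x) = K x.
K = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.3]])
S_a_inv = np.eye(2)
S_eps_inv = np.eye(3) / 0.1
x_a = np.zeros(2)
y = K @ np.array([1.0, 2.0])
dx = lm_step(x_a, y, K @ x_a, K, x_a, S_a_inv, S_eps_inv, gamma=0.0)
S_hat, A = posterior_summary(K, S_a_inv, S_eps_inv)
```
</preformat>
      <p>For a linear forward model and zero damping, a single step from the prior mean reaches the maximum a posteriori estimate, which provides a simple consistency check for such a sketch.</p>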
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Forward model emulation</title>
      <p id="d2e5101">In this section, we describe the practical implementation of the method laid out in Sect. <xref ref-type="sec" rid="Ch1.S2"/>, as applied to the OCO-2 retrieval problem of Sect. <xref ref-type="sec" rid="Ch1.S3"/>. This includes data transformations and dimension reduction, training data generation, convergence of the optimizer in kernel parameter learning, and assessment of forward model output quality. We stress that an emulator must be highly accurate to be usable in an operational retrieval algorithm. We therefore require that the error in predicted radiances, relative to FP outputs, be smaller than the radiance measurement error standard deviation. This way, any systematic errors in emulation are masked by measurement noise, and retrieval performance with the emulator closely resembles that with the FP model.</p>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Data transformations</title>
      <p id="d2e5115">Because GPs tend to perform worse with increasing input dimension, and because the standard GP formulation is developed for one-dimensional outputs, we need to reduce the dimension of both the atmospheric state and the radiance. For the atmospheric state <inline-formula><mml:math id="M202" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula>, we leverage the fact that OCO-2 measurements are made in three separate wavelength bands. As a result, the state vector contains band-specific elements that can be ignored when dealing with the other bands. This partition is summarized in Table <xref ref-type="table" rid="Ch1.T1"/>. Earlier work by <xref ref-type="bibr" rid="bib1.bibx39" id="text.63"/> considered cross-band correlations while emulating OCO-2 radiances, but the authors ultimately showed that the bands are distant enough from one another in wavelength space that they can be treated independently. With this insight, we proceed by constructing separate GPs for each band and using only the sensitive dimensions of <inline-formula><mml:math id="M203" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> as inputs. We further note that the 20-element CO<sub>2</sub> profile is continuous and can be expressed as loadings obtained using principal component analysis (PCA). The most straightforward way to do this is by truncated singular value decomposition (SVD) of the empirical covariance matrix of state vectors <xref ref-type="bibr" rid="bib1.bibx62" id="paren.64"/>. To accomplish this, we use a simulation distribution derived by <xref ref-type="bibr" rid="bib1.bibx3" id="text.65"/> for one selected set of realistic geophysical conditions as a basis for our experiments and perform SVD on the covariance matrix of this distribution. 
Analysis of the singular value decay suggests that the CO<sub>2</sub> profile can be represented with just four principal components, and we collect the four leading singular vectors into a matrix <inline-formula><mml:math id="M206" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">P</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. We then project the CO<sub>2</sub> profile onto this principal component space and standardize the states using the mean and standard deviation of the simulation distribution, leading to
            <disp-formula id="Ch1.E19" content-type="numbered"><label>19</label><mml:math id="M208" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">σ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced open="(" close=")"><mml:mrow><mml:mtext>diag</mml:mtext><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">P</mml:mi><mml:mi>x</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="double-struck">I</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:mfenced open="[" close="]"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mn mathvariant="bold">0</mml:mn><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mfenced><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M209" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the CO<sub>2</sub> profile mean, <inline-formula><mml:math id="M211" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M212" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">σ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the state mean and state standard deviation, <inline-formula><mml:math id="M213" display="inline"><mml:mrow><mml:mtext>diag</mml:mtext><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes a block diagonal matrix with blocks <inline-formula><mml:math id="M214" display="inline"><mml:mi>A</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M215" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="double-struck">I</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is a <inline-formula><mml:math id="M217" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">16</mml:mn></mml:mrow></mml:math></inline-formula>-dimensional identity matrix (as the profile is represented by 4 dimensions instead of the original 20), and <inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mn mathvariant="bold">0</mml:mn><mml:mi>c</mml:mi></mml:msub><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is the stacked vector of the CO<sub>2</sub> profile mean and a <inline-formula><mml:math id="M220" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula>-dimensional zero vector.</p>
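      <p>The profile compression and standardization of Eq. (<xref ref-type="disp-formula" rid="Ch1.E19"/>) can be illustrated with the following Python sketch. The synthetic profile covariance, the three stand-in non-profile elements, and all function names are assumptions made for illustration only; they do not reproduce the simulation distribution used in this work.</p>
<preformat preformat-type="code">
```python
import numpy as np

def profile_basis(cov_profile, n_pc=4):
    """Leading singular vectors of the profile covariance (columns of P_x)."""
    U, s, _ = np.linalg.svd(cov_profile)
    return U[:, :n_pc], s

def transform_state(x, P_x, mu_p, mu_x, sigma_x):
    """Eq. (19): project the profile onto P_x, keep the rest, standardize."""
    n_prof = P_x.shape[0]
    reduced = np.concatenate([P_x.T @ (x[:n_prof] - mu_p), x[n_prof:]])
    return (reduced - mu_x) / sigma_x

rng = np.random.default_rng(0)
levels = np.arange(20)
# Hypothetical smooth profile covariance (squared exponential in level index).
C_true = 4.0 * np.exp(-0.5 * ((levels[:, None] - levels[None, :]) / 6.0) ** 2)
C_true += 1e-6 * np.eye(20)                       # small jitter for stability
profiles = 400.0 + rng.multivariate_normal(np.zeros(20), C_true, size=2000)
others = rng.normal(size=(2000, 3))               # stand-in non-profile elements
X = np.hstack([profiles, others])

mu_p = profiles.mean(axis=0)
P_x, sing_vals = profile_basis(np.cov(profiles, rowvar=False))
reduced = np.hstack([(profiles - mu_p) @ P_x, others])
mu_x, sigma_x = reduced.mean(axis=0), reduced.std(axis=0)
X_tilde = np.array([transform_state(x, P_x, mu_p, mu_x, sigma_x) for x in X])
```
</preformat>
      <p>In this toy setting the 23-element state is reduced to 7 standardized inputs. Inspecting the decay of the singular values returned by the basis computation is one way to choose the number of retained components.</p>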

      <fig id="Ch1.F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e5408"><bold>(a)</bold> The CO<sub>2</sub> profile covariance matrix of the simulation distribution used in this work. <bold>(b)</bold> Four leading singular vectors from the SVD of the covariance matrix.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f03.png"/>

        </fig>

      <p id="d2e5431">Next, we generate training data using a Sobol' sequence (see Sect. <xref ref-type="sec" rid="Ch1.S2.SS4"/>). For this study, we can omit the dispersion, EOF, and SIF parts of the state vector (see Table <xref ref-type="table" rid="Ch1.T1"/>) and fix them to the prior mean. This follows from the discussion in Sect. <xref ref-type="sec" rid="Ch1.S3"/> focusing on monochromatic radiances. Omitting dispersion simplifies computations, as the wavelength grid would otherwise shift, which would complicate the SVD-based dimension reduction of the radiances. <xref ref-type="bibr" rid="bib1.bibx39" id="text.66"/> solved this problem by employing functional principal component analysis, while we can proceed with ordinary SVD. The empirical orthogonal functions (EOFs) are included in the operational retrieval to reduce fit residuals and therefore make convergence analysis easier. These have no direct impact on our study and can be safely omitted. Furthermore, the SIF parameters are fit on the O2 band only as part of the instrument effects, and we do not include them in the emulation for this reason. As is evident from Eq. (<xref ref-type="disp-formula" rid="Ch1.E9"/>), the measurement geometry has a significant impact on the output of the FP model. For this reason, we include three extra parameters, <inline-formula><mml:math id="M222" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M223" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:mi mathvariant="italic">φ</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">φ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, in our training data vector. 
Sufficient and realistic limits for these parameters are obtained from the simulation distribution of <xref ref-type="bibr" rid="bib1.bibx3" id="text.67"/> by considering a 4<inline-formula><mml:math id="M225" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> interval around the mean values. In all, the input space has dimension <inline-formula><mml:math id="M226" display="inline"><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">4</mml:mn><mml:mo>+</mml:mo><mml:mn mathvariant="normal">21</mml:mn><mml:mo>+</mml:mo><mml:mn mathvariant="normal">3</mml:mn><mml:mo>=</mml:mo><mml:mn mathvariant="normal">28</mml:mn></mml:mrow></mml:math></inline-formula>, comprising the profile PCs, the other included state vector elements, and the geometry parameters. We create a Sobol' sequence of 20 000 points for training and scale all dimensions of the hypercube to <inline-formula><mml:math id="M227" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, corresponding to 4 standard deviations in the normalized <inline-formula><mml:math id="M228" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover></mml:math></inline-formula> basis. We then obtain the training data set in the original space by inverting the transformation in Eq. (<xref ref-type="disp-formula" rid="Ch1.E19"/>).</p>
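      <p>A scaled Sobol' design of this kind can be generated, for example, with the quasi-Monte Carlo module of SciPy, as in the sketch below. The choice of SciPy, the sample size of 2048 points (a power of 2, chosen here for brevity instead of the 20 000 points used in this work), the scrambling seed, and the placeholder basis and statistics are all assumptions made for illustration.</p>
<preformat preformat-type="code">
```python
import numpy as np
from scipy.stats import qmc

dim = 28                                   # 4 profile PCs + 21 state elements + 3 angles
sampler = qmc.Sobol(d=dim, scramble=True, seed=0)
unit = sampler.random_base2(m=11)          # 2**11 = 2048 points in the unit hypercube
X_tilde = qmc.scale(unit, [-4.0] * dim, [4.0] * dim)

def to_original(x_tilde, P_x, mu_p, mu_x, sigma_x):
    """Invert Eq. (19): undo the standardization, then expand the profile PCs."""
    reduced = x_tilde * sigma_x + mu_x
    n_pc = P_x.shape[1]
    profile = mu_p + P_x @ reduced[:n_pc]
    return np.concatenate([profile, reduced[n_pc:]])

# Placeholder basis and statistics, purely for illustration.
rng = np.random.default_rng(0)
P_x, _ = np.linalg.qr(rng.normal(size=(20, 4)))   # orthonormal stand-in basis
mu_p = np.full(20, 400.0)
mu_x, sigma_x = np.zeros(dim), np.ones(dim)
x_first = to_original(X_tilde[0], P_x, mu_p, mu_x, sigma_x)
```
</preformat>
      <p>Each row of the scaled design is a point in the normalized basis, and mapping it back yields a 20-level profile followed by the remaining 24 inputs.</p>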
      <p id="d2e5545">Training data <inline-formula><mml:math id="M229" display="inline"><mml:mi mathvariant="bold">Y</mml:mi></mml:math></inline-formula> (radiances) are obtained by evaluating the FP model on each <inline-formula><mml:math id="M230" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> from the scaled Sobol' sequence. For this work, we choose a single realistic land nadir measurement to represent physical parameters not included in the state vector <inline-formula><mml:math id="M231" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula>. We perturb the sampling geometry to reflect the relevant solar and instrument angles. For a real-world application, this approach can be extended to include different scenes and other location-dependent parameters. To obtain the labels <inline-formula><mml:math id="M232" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula>, we similarly perform truncated SVD on the radiances <inline-formula><mml:math id="M233" display="inline"><mml:mi mathvariant="bold">Y</mml:mi></mml:math></inline-formula> separately for each wavelength band <inline-formula><mml:math id="M234" display="inline"><mml:mrow><mml:mi>B</mml:mi><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:mtext>O2</mml:mtext><mml:mo>,</mml:mo><mml:mtext>WCO2</mml:mtext><mml:mo>,</mml:mo><mml:mtext>SCO2</mml:mtext><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> and collect the leading <inline-formula><mml:math id="M235" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> singular vectors in matrices <inline-formula><mml:math id="M236" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">P</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. The four leading singular vectors for each band are shown in Fig. 
<xref ref-type="fig" rid="Ch1.F4"/> to illustrate the kinds of features that the most significant principal components, or basis vectors, encode. While this decomposition could hold additional information on physical processes behind the radiative transfer model, we do not pursue such an analysis further in this work. With additional standardization of the variables, we obtain the following transformation for each wavelength band <inline-formula><mml:math id="M237" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>:
            <disp-formula id="Ch1.E20" content-type="numbered"><label>20</label><mml:math id="M238" display="block"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow><mml:mo mathvariant="normal" stretchy="true">̃</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">σ</mml:mi><mml:mi>z</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced open="(" close=")"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">P</mml:mi><mml:mi>B</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>B</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>z</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M239" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the radiance mean, and <inline-formula><mml:math id="M240" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>z</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M241" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">σ</mml:mi><mml:mi>z</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the principal component mean and standard deviation for band <inline-formula><mml:math id="M242" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>.</p>
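      <p>The label transformation of Eq. (<xref ref-type="disp-formula" rid="Ch1.E20"/>) and its inverse can be sketched in Python as follows. Here the basis is computed from the mean-centered radiance matrix, which is one common convention and an assumption on our part; the synthetic rank-3 radiances and the function names are likewise illustrative.</p>
<preformat preformat-type="code">
```python
import numpy as np

def fit_radiance_basis(Y, n_pc):
    """Mean, leading singular vectors, and score statistics for one band."""
    mu_B = Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y - mu_B, full_matrices=False)
    P_B = Vt[:n_pc].T                    # columns are the basis vectors
    scores = (Y - mu_B) @ P_B
    return mu_B, P_B, scores.mean(axis=0), scores.std(axis=0)

def to_labels(y, mu_B, P_B, mu_z, sigma_z):
    """Eq. (20): standardized principal component loadings."""
    return (P_B.T @ (y - mu_B) - mu_z) / sigma_z

def to_radiance(z_tilde, mu_B, P_B, mu_z, sigma_z):
    """Inverse of Eq. (20) within the span of the retained basis."""
    return P_B @ (z_tilde * sigma_z + mu_z) + mu_B

# Synthetic rank-3 "radiances" on 50 wavelengths, for illustration only.
rng = np.random.default_rng(1)
wl = np.linspace(0.0, 1.0, 50)
patterns = np.stack([np.sin(2 * np.pi * wl), np.cos(2 * np.pi * wl), wl])
Y = 5.0 + rng.normal(size=(500, 3)) @ patterns
mu_B, P_B, mu_z, sigma_z = fit_radiance_basis(Y, n_pc=3)
z = to_labels(Y[0], mu_B, P_B, mu_z, sigma_z)
y_back = to_radiance(z, mu_B, P_B, mu_z, sigma_z)
```
</preformat>
      <p>Because the synthetic data have exactly three degrees of freedom, three basis vectors recover each spectrum to numerical precision; for real radiances the truncation introduces a reconstruction error that has to be assessed separately.</p>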

      <fig id="Ch1.F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e5742">Leading four basis vectors obtained from the SVD (principal components, PCs) for radiances <inline-formula><mml:math id="M243" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> for each band: O2, weak CO<sub>2</sub> (WCO2), and strong CO<sub>2</sub> (SCO2). Basis vectors 2–4 are offset for illustration purposes.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f04.png"/>

        </fig>

      <p id="d2e5776">The quality of this approximation is assessed by evaluating the reconstruction <inline-formula><mml:math id="M246" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">P</mml:mi><mml:mi>B</mml:mi></mml:msub><mml:msubsup><mml:mi mathvariant="bold">P</mml:mi><mml:mi>B</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mi>B</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>B</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> over a holdout data set not used in computing the SVD. The upper panels of Fig. <xref ref-type="fig" rid="Ch1.F5"/> show the distribution of the relative reconstruction error over this data set. For the lower panels, we have additionally applied the instrument function to each residual and divided the result by the measurement error standard deviation given by Eq. (<xref ref-type="disp-formula" rid="Ch1.E13"/>). The rationale for this metric is that if the reconstruction error is less than or comparable to the measurement error on the radiances, no significant amount of information is lost.</p>
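      <p>As a minimal illustration of the projection in Eq. (<xref ref-type="disp-formula" rid="Ch1.E20"/>) and of the reconstruction check described above, the following NumPy sketch fits a truncated SVD basis for a single band and maps radiances to normalized loadings and back. It is a sketch under stated assumptions and not the implementation used in this work: the function names, the array layout (one radiance spectrum per row), and the use of the sample mean and standard deviation of the loadings are all illustrative choices.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def fit_band_basis(Y, n_pc):
    """Fit mean, truncated SVD basis, and loading statistics for one band.

    Y has shape (n_samples, n_wavelengths). All names are illustrative.
    """
    mu_B = Y.mean(axis=0)
    # Rows of Vt are orthonormal directions in radiance space.
    _, _, Vt = np.linalg.svd(Y - mu_B, full_matrices=False)
    P_B = Vt[:n_pc].T                      # shape (n_wavelengths, n_pc)
    Z = (Y - mu_B) @ P_B                   # unscaled loadings of the training set
    return mu_B, P_B, Z.mean(axis=0), Z.std(axis=0)

def to_labels(y, mu_B, P_B, mu_z, sigma_z):
    """Normalized principal component loadings, following Eq. (20)."""
    return (P_B.T @ (y - mu_B) - mu_z) / sigma_z

def reconstruct(y, mu_B, P_B):
    """Rank-limited reconstruction P_B P_B^T (y - mu_B) + mu_B."""
    return P_B @ (P_B.T @ (y - mu_B)) + mu_B

def relative_error(y, y_rec):
    """Element-wise relative reconstruction error."""
    return (y_rec - y) / y
```
]]></preformat>
      <p>On a holdout spectrum, <monospace>relative_error(y, reconstruct(y, mu_B, P_B))</monospace> gives the quantity whose distribution is summarized in the upper panels of Fig. <xref ref-type="fig" rid="Ch1.F5"/>; the instrument function and the division by the measurement error standard deviation are not included in this sketch.</p>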

      <fig id="Ch1.F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e5826"><bold>(a, b, c)</bold> Distribution of relative reconstruction error for monochromatic radiances on the O2, WCO2, and SCO2 bands. <bold>(d, e, f)</bold> Distribution of reconstruction error for all bands after applying the instrument function and dividing by measurement error standard deviation. Shading represents 50 % (red), 90 % (blue), 95 % (green), and 99 % (gray) confidence intervals.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f05.png"/>

        </fig>

      <p id="d2e5840">The final emulator <inline-formula><mml:math id="M247" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is summarized in Fig. <xref ref-type="fig" rid="Ch1.F6"/>, where <inline-formula><mml:math id="M248" display="inline"><mml:mrow><mml:msub><mml:mtext>GP</mml:mtext><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mi>B</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the GP prediction given by Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>), and the indices <inline-formula><mml:math id="M249" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M250" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>, and <inline-formula><mml:math id="M251" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> run over the principal components included in a given band. The effect of the number of components is examined in Sect. <xref ref-type="sec" rid="Ch1.S5.SS1"/>. The state is first normalized according to Eq. (<xref ref-type="disp-formula" rid="Ch1.E19"/>) for each band <inline-formula><mml:math id="M252" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>, after which the GP-predicted principal component loadings are assembled back to radiances using the relation in Eq. (<xref ref-type="disp-formula" rid="Ch1.E20"/>). When evaluating the emulator, each index is independent and can therefore be computed in parallel.</p>

      <fig id="Ch1.F6" specific-use="star"><label>Figure 6</label><caption><p id="d2e5920">Diagram showing the step-by-step process of emulator evaluation. Transformation <inline-formula><mml:math id="M253" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover></mml:math></inline-formula> refers to normalization in Eq. (<xref ref-type="disp-formula" rid="Ch1.E19"/>); similarly <inline-formula><mml:math id="M254" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover></mml:math></inline-formula> denotes the scaling given in Eq. (<xref ref-type="disp-formula" rid="Ch1.E20"/>). <inline-formula><mml:math id="M255" display="inline"><mml:mrow><mml:mtext>GP</mml:mtext><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the GP prediction given by Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) of labels <inline-formula><mml:math id="M256" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula>, which are assembled back to radiances based on Eq. (<xref ref-type="disp-formula" rid="Ch1.E20"/>) for each band and each principal component <inline-formula><mml:math id="M257" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M258" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>, and <inline-formula><mml:math id="M259" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> therein.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f06.png"/>

        </fig>

</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Training</title>
      <p id="d2e6011">Having obtained training data <inline-formula><mml:math id="M260" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold">X</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M261" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula>, we can now proceed to optimize the kernel parameters as described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS3"/>. We prescribe an individual GP per output parameter <inline-formula><mml:math id="M262" display="inline"><mml:mrow><mml:msub><mml:mi>z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. We use <inline-formula><mml:math id="M263" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M264" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 20 000 for the training data size, <inline-formula><mml:math id="M265" display="inline"><mml:mi>M</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M266" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 100 for the mini-batch size, and <inline-formula><mml:math id="M267" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M268" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 5 for the number of prediction points per mini-batch, and we run the ADAM optimizer for 5000 iterations with a small learning rate (0.02 in our case). 
We initialize all other parameters at 1, except for the linear component weight, which is initialized at 0, and the nugget, which is initialized at <inline-formula><mml:math id="M269" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M270" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover></mml:math></inline-formula>. As outlined in Sect. 3.3, we further reduce the dimension of the input space by selecting only the indices that a given wavelength band is sensitive to, as listed in Table <xref ref-type="table" rid="Ch1.T1"/>.</p>
      <p id="d2e6118">To test the performance of the algorithm, we draw a random sample <inline-formula><mml:math id="M271" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>test</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula> from the same simulation distribution from <xref ref-type="bibr" rid="bib1.bibx3" id="text.68"/> as independent test data and evaluate the FP model on it to create radiances <inline-formula><mml:math id="M272" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">Y</mml:mi><mml:mtext>test</mml:mtext></mml:msup></mml:mrow></mml:math></inline-formula>. For test data, we fix dispersion, EOFs, and SIF at prior values as before. Example behavior of the loss function, together with the evolution of the kernel parameter values and true vs. predicted <inline-formula><mml:math id="M273" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> values, is shown in Fig. <xref ref-type="fig" rid="Ch1.F7"/>. We see that during training, the loss function values converge to a small value close to 0. The fluctuations of the loss function values are due to the ADAM optimizer effectively being a stochastic gradient descent method: each mini-batch is sampled randomly, so an individual step can result in a higher loss value, while the running average still decreases. We can also see the evolution of different kernel parameters: the weight of the linear component, for example, was initially set very close to 0. During training, the algorithm correctly identifies the significance of the linear term, and the relative importance of this term overtakes the nugget (corresponding to random noise). The resulting predictions of principal component loadings can be seen to land very tightly on the one-to-one line, indicating good performance over the whole test data set. The distribution of true vs. predicted <inline-formula><mml:math id="M274" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> values for each component on each wavelength band is further illustrated in Fig. <xref ref-type="fig" rid="Ch1.F8"/>. We conclude that the predicted values correspond to the true values most of the time. Some principal components show a larger spread in prediction errors (e.g., O2 PC 9 or WCO2 PC 9). These principal components are likely redundant and carry little meaningful information about the radiance, since the total predictive performance remains very accurate.</p>

      <fig id="Ch1.F7" specific-use="star"><label>Figure 7</label><caption><p id="d2e6167">Example training performance for the first principal component of the O<sub>2</sub> band. <bold>(a)</bold> Loss function values for the first 400 iterations of parameter learning. The blue line depicts the cost function value per iteration. The red line depicts a cumulative running average of the cost function. <bold>(b)</bold> Evolution of the kernel parameters as a function of iteration. <bold>(c)</bold> True vs. predicted <inline-formula><mml:math id="M276" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> values over a withheld test set for the first component of the O<sub>2</sub> band.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f07.png"/>

        </fig>

      <fig id="Ch1.F8" specific-use="star"><label>Figure 8</label><caption><p id="d2e6214">True versus predicted values for 10 radiance principal components on each wavelength band.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f08.png"/>

        </fig>

</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Predictive performance</title>
      <p id="d2e6231">Finally, we assemble the predicted <inline-formula><mml:math id="M278" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> values back to radiances and compute the relative differences with the test data, shown in the upper panels of Fig. <xref ref-type="fig" rid="Ch1.F9"/>. In the lower panels, as before, we apply the instrument function to these residuals and divide by the measurement error standard deviation to emphasize that the desired performance is a prediction error smaller than the measurement error.</p>

      <fig id="Ch1.F9" specific-use="star"><label>Figure 9</label><caption><p id="d2e6245"><bold>(a, b, c)</bold> Distribution of relative prediction error for monochromatic radiances on the O2, WCO2, and SCO2 bands. <bold>(d, e, f)</bold> Distribution of the prediction error for all bands after applying the instrument function and dividing by the measurement error standard deviation. Shading represents 50 % (red), 90 % (blue), 95 % (green), and 99 % (gray) confidence intervals.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f09.png"/>

        </fig>

      <p id="d2e6259">After constructing the emulator with radiances as outputs, we can further apply Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) to compute the Jacobians <inline-formula><mml:math id="M279" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">z</mml:mi></mml:mrow></mml:math></inline-formula>. We can then reverse the normalizing transformations on both <inline-formula><mml:math id="M280" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M281" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> and further apply the instrument functions to our Jacobians to get back to the operational observation units. The Jacobians obtained by evaluating both the FP model and the emulator on an example state vector, together with the resulting profile averaging kernels, are shown in Fig. <xref ref-type="fig" rid="Ch1.F10"/>. Accurate averaging kernels are important for downstream usage of the retrieved XCO<sub>2</sub>, as they are used, for example, in flux inversion models to obtain the vertical sensitivity of XCO<sub>2</sub> to modeled atmospheric CO<sub>2</sub> profiles. We note that we have normalized the Jacobians and averaging kernels by the maximum value of each row in the matrix for visual clarity. Although the two sets of outputs are not identical, we conclude that they are closely similar. The main difference in the averaging kernels results from the choice of modeling concentration profiles by principal component loadings.</p>

      <fig id="Ch1.F10" specific-use="star"><label>Figure 10</label><caption><p id="d2e6334">Normalized Jacobians <inline-formula><mml:math id="M285" display="inline"><mml:mi mathvariant="bold">K</mml:mi></mml:math></inline-formula> and profile averaging kernels <inline-formula><mml:math id="M286" display="inline"><mml:mi mathvariant="bold">A</mml:mi></mml:math></inline-formula> from both the FP model and the emulator, together with the corresponding differences.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f10.png"/>

        </fig>

      <p id="d2e6357">As noted in previous work by <xref ref-type="bibr" rid="bib1.bibx39" id="text.69"/>, an emulator provides substantial appeal in terms of computational efficiency. For the current work, the average computational times for model evaluation and Jacobians, measured on a 2023 MacBook Pro, are summarized in Table <xref ref-type="table" rid="Ch1.T2"/>. It is worth mentioning that ReFRACtor computes Jacobians via automatic differentiation, while our emulator does this analytically. Three cases are contrasted: the standard ReFRACtor FP evaluation, the emulator for monochromatic radiances plus ILS, and the emulator alone.</p>

<table-wrap id="Ch1.T2"><label>Table 2</label><caption><p id="d2e6368">Evaluation times of the radiative transfer (RT) model and related Jacobian, comparing the ReFRACtor implementation, monochromatic emulator with instrument line shape (ILS) and other spectral corrections, and monochromatic emulator only.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="3">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">RT [s]</oasis:entry>
         <oasis:entry colname="col3">RT <inline-formula><mml:math id="M287" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> Jacobian [s]</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">ReFRACtor</oasis:entry>
         <oasis:entry colname="col2">33.45</oasis:entry>
         <oasis:entry colname="col3">55.26</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Emulator <inline-formula><mml:math id="M288" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula> ILS</oasis:entry>
         <oasis:entry colname="col2">2.06</oasis:entry>
         <oasis:entry colname="col3">2.17</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Emulator</oasis:entry>
         <oasis:entry colname="col2">0.05</oasis:entry>
         <oasis:entry colname="col3">0.19</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S4.SS4">
  <label>4.4</label><title>Faster research version</title>
      <p id="d2e6458">In recent years, the uncertainty quantification and statistics community has benefited enormously from utilizing the surrogate model by <xref ref-type="bibr" rid="bib1.bibx22" id="text.70"/> to explore the OCO-2 retrieval in numerous applications <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx35 bib1.bibx43 bib1.bibx23 bib1.bibx48" id="paren.71"/>. We remark that for similar purposes, our emulator can be used as an even faster surrogate. As we see from Table <xref ref-type="table" rid="Ch1.T2"/>, the majority of the computational cost for the emulator comes from the instrument effects, which are part of the ReFRACtor software. If one is not interested in including the effects of dispersion, SIF, and EOFs during the retrieval, we notice from Eq. (<xref ref-type="disp-formula" rid="Ch1.E10"/>) that instrument corrections to RT amount to multiplication, addition, and convolution, which is associative with respect to multiplication. We can then write the emulator as
            <disp-formula id="Ch1.E21" content-type="numbered"><label>21</label><mml:math id="M289" display="block"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>ILS</mml:mtext><mml:mfenced close=")" open="("><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">P</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mi mathvariant="italic">η</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mtext>ILS</mml:mtext><mml:mfenced open="(" close=")"><mml:mover accent="true"><mml:mi mathvariant="bold">P</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover></mml:mfenced><mml:mi mathvariant="italic">η</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M290" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the overall emulator, <inline-formula><mml:math id="M291" display="inline"><mml:mrow><mml:mtext>ILS</mml:mtext><mml:mo>(</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is a function applying the instrument corrections from Eq. (<xref ref-type="disp-formula" rid="Ch1.E10"/>), <inline-formula><mml:math id="M292" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold">P</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> is a projection matrix consisting of radiance basis functions that correspond to transforming predicted labels <inline-formula><mml:math id="M293" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula> back to radiances <inline-formula><mml:math id="M294" display="inline"><mml:mi mathvariant="bold-italic">y</mml:mi></mml:math></inline-formula> following the last step in Fig. <xref ref-type="fig" rid="Ch1.F6"/>, and <inline-formula><mml:math id="M295" display="inline"><mml:mrow><mml:mi mathvariant="italic">η</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the emulator predicting labels <inline-formula><mml:math id="M296" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula> from inputs <inline-formula><mml:math id="M297" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula>. Done this way, we can evaluate the instrument corrections on the basis vectors once, after which OE or MCMC can proceed an order of magnitude faster (according to Table <xref ref-type="table" rid="Ch1.T2"/>).</p>
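      <p>The identity in Eq. (<xref ref-type="disp-formula" rid="Ch1.E21"/>) can be checked numerically with a toy example. The sketch below is an illustration under simplifying assumptions, not the ReFRACtor instrument model: the instrument correction is represented by a hypothetical function <monospace>apply_ils</monospace> that only convolves each spectrum with a normalized Gaussian kernel, and the basis and labels are random placeholders. The only property it relies on is that the correction is linear.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def apply_ils(columns, kernel):
    """Convolve each column (one spectrum per column) with a line-shape kernel.

    Hypothetical stand-in for the instrument corrections; the only property
    used below is that the operation is linear.
    """
    return np.stack(
        [np.convolve(col, kernel, mode="same") for col in columns.T], axis=1
    )

rng = np.random.default_rng(1)
n_wl, n_pc = 500, 10
P_hat = rng.normal(size=(n_wl, n_pc))        # placeholder radiance basis
kernel = np.exp(-0.5 * np.linspace(-3, 3, 21) ** 2)
kernel /= kernel.sum()

# One-time cost: push every basis vector through the instrument model.
P_ils = apply_ils(P_hat, kernel)             # ILS(P_hat)

eta = rng.normal(size=n_pc)                  # placeholder for predicted labels eta(x)

# Left-hand form of Eq. (21): assemble radiances, then apply the correction.
slow = apply_ils((P_hat @ eta)[:, None], kernel)[:, 0]
# Right-hand form of Eq. (21): reuse the precomputed corrected basis.
fast = P_ils @ eta
```
]]></preformat>
      <p>In this toy setting <monospace>slow</monospace> and <monospace>fast</monospace> agree to floating-point precision, while the second form replaces a convolution per evaluation with a single matrix-vector product.</p>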
</sec>
</sec>
<sec id="Ch1.S5">
  <label>5</label><title>Retrievals using the emulator</title>
      <p id="d2e6621">We are now ready to compare the performance of the emulator against the FP model when performing simulated retrievals. After obtaining the minimizer <inline-formula><mml:math id="M298" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> and a Laplace approximation of the posterior covariance, <inline-formula><mml:math id="M299" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold">S</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover></mml:math></inline-formula>, the quantity of interest is obtained by multiplying the CO<sub>2</sub> profile by the pressure weighting function <inline-formula><mml:math id="M301" display="inline"><mml:mi mathvariant="bold-italic">h</mml:mi></mml:math></inline-formula>, which assigns an appropriate weight to each pressure level, resulting in
          <disp-formula id="Ch1.E22" content-type="numbered"><label>22</label><mml:math id="M302" display="block"><mml:mrow><mml:msub><mml:mtext>XCO</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">h</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e6692">The reported uncertainty of the quantity of interest (QoI) is given by
          <disp-formula id="Ch1.E23" content-type="numbered"><label>23</label><mml:math id="M303" display="block"><mml:mrow><mml:msub><mml:mtext>XCO</mml:mtext><mml:mrow><mml:msub><mml:mn mathvariant="normal">2</mml:mn><mml:mtext>uncert</mml:mtext></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">h</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold">S</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">20</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:msub><mml:mi mathvariant="bold-italic">h</mml:mi></mml:mrow></mml:msqrt><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
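      <p>Equations (<xref ref-type="disp-formula" rid="Ch1.E22"/>) and (<xref ref-type="disp-formula" rid="Ch1.E23"/>) amount to two short linear-algebra operations on the first 20 state vector elements. The following sketch illustrates them with placeholder inputs: the uniform pressure weights, the total state dimension of 30, and the identity posterior covariance are assumptions made only so that the example runs, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def xco2_and_uncertainty(x_hat, S_hat, h):
    """Return XCO2 (Eq. 22) and its reported uncertainty (Eq. 23).

    The CO2 profile is assumed to occupy the first len(h) state elements.
    """
    n = h.size
    xco2 = h @ x_hat[:n]
    uncert = np.sqrt(h @ S_hat[:n, :n] @ h)
    return xco2, uncert

# Placeholder inputs (assumptions for illustration only).
n_levels, n_state = 20, 30
h = np.full(n_levels, 1.0 / n_levels)            # uniform weights, not the real h
x_hat = np.concatenate(
    [np.full(n_levels, 400.0), np.zeros(n_state - n_levels)]
)
S_hat = np.eye(n_state)                          # unit-variance, uncorrelated

xco2, uncert = xco2_and_uncertainty(x_hat, S_hat, h)
```
]]></preformat>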

      <fig id="Ch1.F11" specific-use="star"><label>Figure 11</label><caption><p id="d2e6746"><bold>(a)</bold> Example O<sub>2</sub> band radiance. <bold>(b)</bold> Realization from the noise distribution <inline-formula><mml:math id="M305" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. <bold>(c)</bold> Realization from the model discrepancy distribution <inline-formula><mml:math id="M306" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">δ</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">δ</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Units for all panels are W m<sup>−2</sup> sr<sup>−1</sup> µm<sup>−1</sup>, the units of radiance for OCO-2.</p></caption>
        <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f11.png"/>

      </fig>

      <p id="d2e6855">We present two test cases for assessing retrieval performance of our emulator. First, we create synthetic observations by evaluating the FP model on our test set of states <inline-formula><mml:math id="M310" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> and by adding a realization from the Gaussian noise distribution:
          <disp-formula id="Ch1.E24" content-type="numbered"><label>24</label><mml:math id="M311" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mtext>test</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
        where <inline-formula><mml:math id="M312" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mo>∼</mml:mo><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">ε</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Second, we follow the methods outlined in <xref ref-type="bibr" rid="bib1.bibx3" id="text.72"/> to further corrupt the simulated measurement with a realistic <italic>model discrepancy</italic> (MD) adjustment, given by
          <disp-formula id="Ch1.E25" content-type="numbered"><label>25</label><mml:math id="M313" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">y</mml:mi><mml:mtext>test</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">ε</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">δ</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
        where <inline-formula><mml:math id="M314" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">δ</mml:mi><mml:mo>∼</mml:mo><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi mathvariant="italic">δ</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="bold">S</mml:mi><mml:mi mathvariant="italic">δ</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The shape of this adjustment is illustrated in Fig. <xref ref-type="fig" rid="Ch1.F11"/>. As noted by the authors, model discrepancy as presented here is a statistical representation of forward modeling mismatches so that our simulated measurements would better correspond to real data.</p>
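      <p>The two test cases in Eqs. (<xref ref-type="disp-formula" rid="Ch1.E24"/>) and (<xref ref-type="disp-formula" rid="Ch1.E25"/>) differ only in whether a model discrepancy draw is added to the noisy radiance. A minimal sketch of this sampling step is given below; the function name and its arguments are hypothetical, and the forward model output, noise covariance, and discrepancy statistics are assumed to be supplied by the caller.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def simulate_observation(f_x, S_eps, rng, mu_delta=None, S_delta=None):
    """Synthetic observation following Eq. (24), or Eq. (25) if MD terms are given.

    f_x      : forward model output F(x, b), shape (n,)
    S_eps    : measurement noise covariance, shape (n, n)
    mu_delta : model discrepancy mean, shape (n,) (optional)
    S_delta  : model discrepancy covariance, shape (n, n) (optional)
    """
    y = f_x + rng.multivariate_normal(np.zeros(f_x.size), S_eps)
    if mu_delta is not None and S_delta is not None:
        y = y + rng.multivariate_normal(mu_delta, S_delta)
    return y
```
]]></preformat>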
      <p id="d2e6995">We then perform XCO<sub>2</sub> retrievals both using the full-physics model <inline-formula><mml:math id="M316" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and the emulator <inline-formula><mml:math id="M317" display="inline"><mml:mrow><mml:mi>g</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> following the algorithm laid out in Sect. <xref ref-type="sec" rid="Ch1.S3"/>. Results for retrieved XCO<sub>2</sub> for both cases, with and without MD, are illustrated in Fig. <xref ref-type="fig" rid="Ch1.F12"/>. The corresponding XCO<sub>2</sub> uncertainty values are compared in Fig. <xref ref-type="fig" rid="Ch1.F13"/>. We conclude that using the emulator in place of the FP model in the retrieval preserves the accuracy and replicates the same biases as the FP model, and that the two sets of retrievals correlate well with each other. On the other hand, the output uncertainty estimates do not seem to correspond to each other, and further analysis of this output will be required in future research.</p>

      <fig id="Ch1.F12" specific-use="star"><label>Figure 12</label><caption><p id="d2e7062">Retrieved XCO<sub>2</sub> over the holdout data set using the FP model and the emulator. Full-physics model <bold>(a)</bold>, the emulator <bold>(b)</bold>, and comparison of the two <bold>(c)</bold>. <bold>(d, e, f)</bold> Similarly but with added model discrepancy in observed data. RMSE (root mean square error) describes the bias of the retrievals, while the <inline-formula><mml:math id="M321" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> value is included to assess the correlation between quantities of interest.</p></caption>
        <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f12.png"/>

      </fig>

      <fig id="Ch1.F13" specific-use="star"><label>Figure 13</label><caption><p id="d2e7106">Scatterplots of retrieval XCO<sub>2</sub> uncertainty over the holdout data set using the FP model and the emulator. <bold>(a)</bold> No MD in observations. <bold>(b)</bold> MD included in observations.</p></caption>
        <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f13.png"/>

      </fig>

<sec id="Ch1.S5.SS1">
  <label>5.1</label><title>Effect of PCA dimensionality</title>
      <p id="d2e7137">Previously in this work we did not prescribe a certain number of principal components to use in radiance dimension reduction. Figure <xref ref-type="fig" rid="Ch1.F14"/> shows the retrieved XCO<sub>2</sub> root mean square error (RMSE) and mean absolute error (MAE) against the true known value, and Fig. <xref ref-type="fig" rid="Ch1.F15"/> shows the radiance reconstruction and prediction RMSE and MAE, similarly to Figs. <xref ref-type="fig" rid="Ch1.F5"/> and <xref ref-type="fig" rid="Ch1.F9"/>, all as a function of the number of PCs used. Taken together, these results indicate that using more than 25 principal components per band does not yield any additional performance benefits. We remark that compared to the earlier work by <xref ref-type="bibr" rid="bib1.bibx39" id="text.73"/>, who argued for one to three principal components per band, our results show that many more components are needed for accurate retrievals. This highlights the importance of empirically checking the effect of dimensionality reduction and not relying on rules of thumb such as conserving 95 % of the variability.</p>

      <fig id="Ch1.F14"><label>Figure 14</label><caption><p id="d2e7163">RMSE and MAE as a function of the number of principal components used per band in radiance dimension reduction for XCO<sub>2</sub> retrievals, both with and without MD, over a holdout test data set.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f14.png"/>

        </fig>

      <fig id="Ch1.F15"><label>Figure 15</label><caption><p id="d2e7183">RMSE and MAE as a function of the number of principal components used per band in radiance dimension reduction reconstructions and predictions, all over a holdout test data set.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f15.png"/>

        </fig>

</sec>
<sec id="Ch1.S5.SS2">
  <label>5.2</label><title>Effect of aerosol types</title>
      <p id="d2e7201">To assess the effect of changing the dominant aerosol types on the performance of the retrievals, we repeat the training and retrieval procedure described in this section with two separate pairs of dominant aerosol types. First, we consider dust (DU) and sea salt (SS) and, second, DU and sulfate (SO). These are among the most common aerosol combinations encountered in OCO-2 operations. We repeat the retrievals for both cases with the additional MD adjustment, as before. The results of this experiment are summarized in Fig. <xref ref-type="fig" rid="Ch1.F16"/>. We conclude that the proposed method is robust to changing physical conditions, which indicates fitness for further operational integration.</p>

      <fig id="Ch1.F16"><label>Figure 16</label><caption><p id="d2e7208">Difference (in ppm) between true and retrieved XCO<sub>2</sub> from simulated measurements with different dominant aerosol species. Symbols on the <inline-formula><mml:math id="M326" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> axis denote the specifics of a given experiment: F – full-physics model, E – emulator, 1 – DU and SS aerosols, and 2 – DU and SO aerosols. D denotes model discrepancy, followed by a number denoting mean error (in ppm).</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/18/673/2025/amt-18-673-2025-f16.png"/>

        </fig>

</sec>
</sec>
<sec id="Ch1.S6" sec-type="conclusions">
  <label>6</label><title>Conclusions</title>
      <p id="d2e7242">In this work, we have constructed and implemented a fast and accurate forward model emulator for the ReFRACtor implementation of the OCO-2 full-physics forward model. The emulator produces closed-form Jacobians and, as such, provides a convenient way of performing XCO<sub>2</sub> retrievals. We have demonstrated the accuracy of these retrievals and analyzed the effect of PCA dimension, aerosol types, and model discrepancy on the retrieval. All these tests indicate robustness and excellent reliability of our method and offer an encouraging proof of concept for future operational implementation with the latest ACOS algorithm and real-world OCO-2 data.</p>
      <p id="d2e7254">This work has significantly advanced kernel flow methodology <xref ref-type="bibr" rid="bib1.bibx46" id="paren.74"/> by including a cross-validation-based training strategy using an RMSE cost function and a new strategy for mini-batching. With this method, we have achieved a relative error of less than 1 %, which on its own is a significant improvement from the point of view of operator learning (e.g., see <xref ref-type="bibr" rid="bib1.bibx1" id="altparen.75"/>, for comparisons of GP and NN methods on various nonlinear problems). Our approach is computationally fast and, when the training data set is properly engineered, performs consistently within the span of the training data. Combined with our ability to compute Jacobians in closed form, our approach has the potential to solve current and future data processing issues in atmospheric remote sensing stemming from computationally intensive forward models.</p>
      <p id="d2e7263">While Gaussian process methods offer an attractive means of including uncertainty propagation in an emulation pipeline, our tests have shown that the posterior standard deviation predicted by the GPs did not provide reliable coverage of the true labels. This is likely due to the kernel flow method's focus on optimizing the posterior mean prediction without assessing the prediction uncertainty. This could potentially be remedied by including an uncertainty tuning penalty in the kernel flow loss function. Another caveat comes from evaluations of retrieval uncertainty in XCO<sub>2</sub>: the uncertainty estimates from our method did not agree with those from operational OE. This does not mean our estimates were better or worse, and further research is needed in calibrating retrieval uncertainties.</p>
      <p id="d2e7275">A logical next step would be to implement the GP emulator for an operational ACOS forward model instead of ReFRACtor, which requires closer collaboration with the OCO algorithm team. After demonstration on OCO-2, our approach is directly applicable to a myriad of other satellite missions. We note that future work will have to address training data design, which was simplified in this work. Assessing the temporal and spatial variability in forward model parameters, together with feasible distributions of state vectors, will be key in this design effort. These efforts might benefit from a cost–benefit analysis of training a global model usable everywhere versus, for example, retraining the emulator for sufficiently specific spatiotemporal data sets.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Closed-form Jacobians</title>
      <p id="d2e7289">To obtain a closed-form equation for the Jacobians used in the XCO<sub>2</sub> retrievals, we must explicitly compute the term <inline-formula><mml:math id="M330" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> in Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>). To accomplish this, we compute the partial derivative of the kernel function (Eq. <xref ref-type="disp-formula" rid="Ch1.E4"/>) with respect to the first input:</p>
      <p id="d2e7346"><disp-formula specific-use="gather" content-type="numbered"><mml:math id="M331" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E26"><mml:mtd><mml:mtext>A1</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn 
mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E27"><mml:mtd><mml:mtext>A2</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtable rowspacing="0.2ex" class="split" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo 
mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced close=")" open="("><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi 
mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E28"><mml:mtd><mml:mtext>A3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced open="[" close="]"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E29"><mml:mtd><mml:mtext>A4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtable rowspacing="0.2ex" class="split" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi 
mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac 
style="display"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced close="]" open="["><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E30"><mml:mtd><mml:mtext>A5</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo mathsize="2.0em">[</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced close=")" open="("><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo 
mathsize="2.0em">]</mml:mo><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mfenced close=")" open="("><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi mathvariant="bold-italic">d</mml:mi><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E31"><mml:mtd><mml:mtext>A6</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle 
class="stylechange" displaystyle="true"/><mml:mo>+</mml:mo><mml:mfenced open="[" close="]"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E32"><mml:mtd><mml:mtext>A7</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced close="]" open="["><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi mathvariant="bold-italic">d</mml:mi><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:mfenced open="[" close="]"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi 
mathvariant="script">W</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E33"><mml:mtd><mml:mtext>A8</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="script">W</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo mathsize="2.0em">[</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi mathvariant="bold-italic">d</mml:mi><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold-italic">d</mml:mi></mml:mrow></mml:mfenced><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>+</mml:mo><mml:mo 
mathsize="2.0em">[</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="script">W</mml:mi><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo mathsize="2.0em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E34"><mml:mtd><mml:mtext>A9</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="script">W</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mfenced close="]" open="["><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mfenced close=")" open="("><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:msqrt><mml:mn mathvariant="normal">3</mml:mn></mml:msqrt><mml:mi>l</mml:mi></mml:mfrac></mml:mstyle><mml:mo>‖</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:msub><mml:mo>‖</mml:mo><mml:mi mathvariant="script">W</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi 
mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e8374">After computing <inline-formula><mml:math id="M332" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mo>∂</mml:mo><mml:mrow><mml:mo>∂</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>k</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, we get <inline-formula><mml:math id="M333" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi mathvariant="bold">Γ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> element by element, with <inline-formula><mml:math id="M334" display="inline"><mml:mi mathvariant="bold-italic">x</mml:mi></mml:math></inline-formula> being the new input and <inline-formula><mml:math id="M335" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> a training data point. The final Jacobian is then obtained by computing <inline-formula><mml:math id="M336" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mtext>d</mml:mtext><mml:mrow><mml:mtext>d</mml:mtext><mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:msup><mml:mi>z</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> via Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) and reversing transformations in Eqs. 
(<xref ref-type="disp-formula" rid="Ch1.E19"/>) and (<xref ref-type="disp-formula" rid="Ch1.E20"/>).</p>
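      <p>The closed-form result in Eq. (A9) can be sanity-checked numerically. The following Python sketch compares it against a central finite-difference approximation for a single pair of inputs. It assumes that the weight matrix is diagonal (stored as the vector <monospace>w</monospace>) and that the weighted norm is the Euclidean norm of the weighted difference; the function names and parameter values are illustrative and are not taken from the published code.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def kernel(x, xp, w, alpha1, alpha2, length):
    """Matern-3/2 plus linear kernel with diagonal input weights w (assumed)."""
    d = np.linalg.norm(w * (x - xp))          # weighted distance
    s = np.sqrt(3.0) / length
    matern = alpha1 * (1.0 + s * d) * np.exp(-s * d)
    linear = alpha2 * np.dot(w * x, w * xp)
    return matern + linear


def kernel_grad(x, xp, w, alpha1, alpha2, length):
    """Gradient of the kernel with respect to x, following Eq. (A9)."""
    diff = x - xp
    d = np.linalg.norm(w * diff)
    s = np.sqrt(3.0) / length
    return w**2 * (-alpha1 * s**2 * np.exp(-s * d) * diff + alpha2 * xp)


def finite_difference_grad(x, xp, w, alpha1, alpha2, length, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        up = kernel(x + step, xp, w, alpha1, alpha2, length)
        down = kernel(x - step, xp, w, alpha1, alpha2, length)
        grad[i] = (up - down) / (2.0 * eps)
    return grad


# Arbitrary test point and hyperparameters (assumptions for illustration).
rng = np.random.default_rng(1)
x, xp = rng.normal(size=4), rng.normal(size=4)
w = rng.uniform(0.5, 2.0, size=4)
analytic = kernel_grad(x, xp, w, 1.3, 0.7, 2.0)
numeric = finite_difference_grad(x, xp, w, 1.3, 0.7, 2.0)
print(np.max(np.abs(analytic - numeric)))
```
      </preformat>
      <p>Applying <monospace>kernel_grad</monospace> to every training point yields the derivative of the cross-covariance vector element by element, as described above. A check of this kind is inexpensive and helps catch sign or scaling mistakes before the Jacobian is used in a retrieval.</p>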
</app>
  </app-group><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d2e8492">Code and data are available in an OSF repository at <ext-link xlink:href="https://doi.org/10.17605/OSF.IO/U2T8A" ext-link-type="DOI">10.17605/OSF.IO/U2T8A</ext-link> <xref ref-type="bibr" rid="bib1.bibx34" id="paren.76"/>. The software requires the ReFRACtor and ReFRACtorUQ GitHub repositories, which are freely available as well.</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e8505">OL: project administration, conceptualization, methodology, software, formal analysis, data curation, writing (original draft and review and editing), visualization. JS: conceptualization, methodology, software, writing (original draft and review and editing). JH: project administration, conceptualization, methodology, supervision, software, data curation, writing (review and editing). JM: methodology, software, data curation, writing (original draft). AB: conceptualization, methodology, writing (review and editing). HO: conceptualization, methodology, writing (review and editing).</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e8511">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e8518">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e8524">The research described in this paper was performed at the Jet Propulsion Laboratory, California Institute of Technology, under contract with NASA. The authors thank Pulong Ma and Chris O'Dell for helpful guidance.</p></ack><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e8529">This paper was edited by Peer Nowack and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Batlle et al.(2023)</label><mixed-citation>Batlle, P., Darcy, M., Hosseini, B., and Owhadi, H.: Kernel Methods are Competitive for Operator Learning, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2304.13202" ext-link-type="DOI">10.48550/arXiv.2304.13202</ext-link>, 8 October 2023.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Boesch et al.(2015)</label><mixed-citation>Boesch, H., Brown, L., Castano, R., Christi, M., Connor, B., Crisp, D., Eldering, A., Fisher, B., Frankenberg, C., Gunson, M., Granat, R., McDuffie, J., Miller, C., Natraj, V., O'Brien, D., O'Dell, C., Osterman, G., Oyafuso, F., Payne, V., Polonski, I., Smyth, M., Spurr, R., Thompson, D., and Toon, G.: Orbiting Carbon Observatory-2 (OCO-2) Level 2 Full Physics Retrieval Algorithm Theoretical Basis, Version 2.0, Rev 2, NASA Earth Data, <uri>https://doi.org/10.5067/8E4VLCK16O6Q</uri>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Braverman et al.(2021)</label><mixed-citation>Braverman, A., Hobbs, J., Teixeira, J., and Gunson, M.: Post hoc Uncertainty Quantification for Remote Sensing Observing Systems, SIAM/ASA Journal on Uncertainty Quantification, 9, 1064–1093, <ext-link xlink:href="https://doi.org/10.1137/19M1304283" ext-link-type="DOI">10.1137/19M1304283</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Bréon et al.(2022)</label><mixed-citation>Bréon, F.-M., David, L., Chatelanaz, P., and Chevallier, F.: On the potential of a neural-network-based approach for estimating XCO<sub>2</sub> from OCO-2 measurements, Atmos. Meas. Tech., 15, 5219–5234, <ext-link xlink:href="https://doi.org/10.5194/amt-15-5219-2022" ext-link-type="DOI">10.5194/amt-15-5219-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Brynjarsdóttir et al.(2018)</label><mixed-citation>Brynjarsdóttir, J., Hobbs, J., Braverman, A., and Mandrake, L.: Optimal  Estimation Versus MCMC for CO<sub>2</sub> Retrievals, J. Agr. Biol. Envir. St., 23, 297–316, <ext-link xlink:href="https://doi.org/10.1007/s13253-018-0319-8" ext-link-type="DOI">10.1007/s13253-018-0319-8</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Byrne et al.(2023)</label><mixed-citation>Byrne, B., Baker, D. F., Basu, S., Bertolacci, M., Bowman, K. W., Carroll, D., Chatterjee, A., Chevallier, F., Ciais, P., Cressie, N., Crisp, D., Crowell, S., Deng, F., Deng, Z., Deutscher, N. M., Dubey, M. K., Feng, S., García, O. E., Griffith, D. W. T., Herkommer, B., Hu, L., Jacobson, A. R., Janardanan, R., Jeong, S., Johnson, M. S., Jones, D. B. A., Kivi, R., Liu, J., Liu, Z., Maksyutov, S., Miller, J. B., Miller, S. M., Morino, I., Notholt, J., Oda, T., O'Dell, C. W., Oh, Y.-S., Ohyama, H., Patra, P. K., Peiro, H., Petri, C., Philip, S., Pollard, D. F., Poulter, B., Remaud, M., Schuh, A., Sha, M. K., Shiomi, K., Strong, K., Sweeney, C., Té, Y., Tian, H., Velazco, V. A., Vrekoussis, M., Warneke, T., Worden, J. R., Wunch, D., Yao, Y., Yun, J., Zammit-Mangion, A., and Zeng, N.: National CO<sub>2</sub> budgets (2015–2020) inferred from atmospheric CO<sub>2</sub> observations in support of the global stocktake, Earth Syst. Sci. Data, 15, 963–1004, <ext-link xlink:href="https://doi.org/10.5194/essd-15-963-2023" ext-link-type="DOI">10.5194/essd-15-963-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Chevallier et al.(2010)</label><mixed-citation>Chevallier, F., Ciais, P., Conway, T. J., Aalto, T., Anderson, B. E., Bousquet, P., Brunke, E. G., Ciattaglia, L., Esaki, Y., Frohlich, M., Gomez, A., Gomez-Pelaez, A. J., Haszpra, L., Krummel, P., Langenfelds, R. L.,  Leuenberger, M., Machida, T., Maignan, F., Matsueda, H., Morgu, J. A., Mukai,  H., Nakazawa, T., Peylin, P., Ramonet, M., Rivier, L., Sawa, Y., Schmidt, M.,  Steele, L. P., Vay, S. A., Vermeulen, A. T., Wofsy, S., and Worthy, D.: CO<sub>2</sub> surface fluxes at grid point scale estimated from a global 21 year  re-analysis of atmospheric measurements, J. Geophys. Res.-Atmos., 115, D21307, <ext-link xlink:href="https://doi.org/10.1029/2010JD013887" ext-link-type="DOI">10.1029/2010JD013887</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Connor et al.(2008)</label><mixed-citation>Connor, B. J., Boesch, H., Toon, G., Sen, B., Miller, C., and Crisp, D.:  Orbiting Carbon Observatory: Inverse Method and Prospective Error Analysis,  J. Geophys. Res., 113, D05305, <ext-link xlink:href="https://doi.org/10.1029/2006JD008336" ext-link-type="DOI">10.1029/2006JD008336</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Cressie(1993)</label><mixed-citation>Cressie, N.: Statistics for Spatial Data, John Wiley &amp; Sons, Inc., <ext-link xlink:href="https://doi.org/10.1002/9781119115151" ext-link-type="DOI">10.1002/9781119115151</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Cressie(2018)</label><mixed-citation>Cressie, N.: Mission CO<sub>2</sub>ntrol: A Statistical Scientist's Role in Remote Sensing of Atmospheric Carbon Dioxide, J. Am. Stat. Assoc., 113, 152–168, <ext-link xlink:href="https://doi.org/10.1080/01621459.2017.1419136" ext-link-type="DOI">10.1080/01621459.2017.1419136</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Crisp et al.(2004)</label><mixed-citation>Crisp, D., Atlas, R. M., Breon, F.-M., Brown, L. R., Burrows, J. P., Ciais, P., Connor, B. J., Doney, S. C., Fung, I. Y., Jacob, D. J., Miller, C. E., O'Brien, D., Pawson, S., Randerson, J. T., Rayner, P., Salawitch, R. J., Sander, S. P., Sen, B., Stephens, G. L., Tans, P. P., Toon, G. C., Wennberg, P. O., Wofsy, S. C., Yung, Y. L., Kuang, Z., Chudasama, B., Sprague, G., Weiss, B., Pollock, R., Kenyon, D., and Schroll, S.: The Orbiting Carbon Observatory (OCO) mission, Adv. Space Res., 34, 700–709, <ext-link xlink:href="https://doi.org/10.1016/j.asr.2003.08.062" ext-link-type="DOI">10.1016/j.asr.2003.08.062</ext-link>, 2004.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Crisp et al.(2012)</label><mixed-citation>Crisp, D., Fisher, B. M., O'Dell, C., Frankenberg, C., Basilio, R., Bösch, H., Brown, L. R., Castano, R., Connor, B., Deutscher, N. M., Eldering, A., Griffith, D., Gunson, M., Kuze, A., Mandrake, L., McDuffie, J., Messerschmidt, J., Miller, C. E., Morino, I., Natraj, V., Notholt, J., O'Brien, D. M., Oyafuso, F., Polonsky, I., Robinson, J., Salawitch, R., Sherlock, V., Smyth, M., Suto, H., Taylor, T. E., Thompson, D. R., Wennberg, P. O., Wunch, D., and Yung, Y. L.: The ACOS CO<sub>2</sub> retrieval algorithm – Part II: Global XCO<sub>2</sub> data characterization, Atmos. Meas. Tech., 5, 687–707, <ext-link xlink:href="https://doi.org/10.5194/amt-5-687-2012" ext-link-type="DOI">10.5194/amt-5-687-2012</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Crisp et al.(2017)</label><mixed-citation>Crisp, D., Pollock, H. R., Rosenberg, R., Chapsky, L., Lee, R. A. M., Oyafuso, F. A., Frankenberg, C., O'Dell, C. W., Bruegge, C. J., Doran, G. B., Eldering, A., Fisher, B. M., Fu, D., Gunson, M. R., Mandrake, L., Osterman, G. B., Schwandner, F. M., Sun, K., Taylor, T. E., Wennberg, P. O., and Wunch, D.: The on-orbit performance of the Orbiting Carbon Observatory-2 (OCO-2) instrument and its radiometrically calibrated products, Atmos. Meas. Tech., 10, 59–81, <ext-link xlink:href="https://doi.org/10.5194/amt-10-59-2017" ext-link-type="DOI">10.5194/amt-10-59-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Crowell et al.(2019)</label><mixed-citation>Crowell, S., Baker, D., Schuh, A., Basu, S., Jacobson, A. R., Chevallier, F., Liu, J., Deng, F., Feng, L., McKain, K., Chatterjee, A., Miller, J. B., Stephens, B. B., Eldering, A., Crisp, D., Schimel, D., Nassar, R., O'Dell, C. W., Oda, T., Sweeney, C., Palmer, P. I., and Jones, D. B. A.: The 2015–2016 carbon cycle as seen from OCO-2 and the global in situ network, Atmos. Chem. Phys., 19, 9797–9831, <ext-link xlink:href="https://doi.org/10.5194/acp-19-9797-2019" ext-link-type="DOI">10.5194/acp-19-9797-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Datta et al.(2016)</label><mixed-citation>Datta, A., Banerjee, S., Finley, A., and Gelfand, A.: Hierarchical  Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets, J. Am. Stat. Assoc., 111, 800–812, <ext-link xlink:href="https://doi.org/10.1080/01621459.2015.1044091" ext-link-type="DOI">10.1080/01621459.2015.1044091</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>David et al.(2021)</label><mixed-citation>David, L., Bréon, F.-M., and Chevallier, F.: XCO<sub>2</sub> estimates from the OCO-2 measurements using a neural network approach, Atmos. Meas. Tech., 14, 117–132, <ext-link xlink:href="https://doi.org/10.5194/amt-14-117-2021" ext-link-type="DOI">10.5194/amt-14-117-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Eldering et al.(2019)</label><mixed-citation>Eldering, A., Taylor, T. E., O'Dell, C. W., and Pavlick, R.: The OCO-3 mission: measurement objectives and expected performance based on 1 year of simulated data, Atmos. Meas. Tech., 12, 2341–2370, <ext-link xlink:href="https://doi.org/10.5194/amt-12-2341-2019" ext-link-type="DOI">10.5194/amt-12-2341-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Frey et al.(2019)</label><mixed-citation>Frey, M., Sha, M. K., Hase, F., Kiel, M., Blumenstock, T., Harig, R., Surawicz, G., Deutscher, N. M., Shiomi, K., Franklin, J. E., Bösch, H., Chen, J., Grutter, M., Ohyama, H., Sun, Y., Butz, A., Mengistu Tsidu, G., Ene, D., Wunch, D., Cao, Z., Garcia, O., Ramonet, M., Vogel, F., and Orphal, J.: Building the COllaborative Carbon Column Observing Network (COCCON): long-term stability and ensemble performance of the EM27/SUN Fourier transform spectrometer, Atmos. Meas. Tech., 12, 1513–1530, <ext-link xlink:href="https://doi.org/10.5194/amt-12-1513-2019" ext-link-type="DOI">10.5194/amt-12-1513-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Friedlingstein et al.(2014)</label><mixed-citation>Friedlingstein, P., Meinshausen, M., Arora, V., Jones, C., Anav, A., Liddicoat, S., and Knutti, R.: Uncertainties in CMIP5 Climate Projections due to Carbon Cycle Feedbacks, J. Climate, 27, 511–526, <ext-link xlink:href="https://doi.org/10.1175/JCLI-D-12-00579.1" ext-link-type="DOI">10.1175/JCLI-D-12-00579.1</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Friedlingstein et al.(2022)</label><mixed-citation>Friedlingstein, P., O'Sullivan, M., Jones, M. W., Andrew, R. M., Gregor, L., Hauck, J., Le Quéré, C., Luijkx, I. T., Olsen, A., Peters, G. P., Peters, W., Pongratz, J., Schwingshackl, C., Sitch, S., Canadell, J. G., Ciais, P., Jackson, R. B., Alin, S. R., Alkama, R., Arneth, A., Arora, V. K., Bates, N. R., Becker, M., Bellouin, N., Bittig, H. C., Bopp, L., Chevallier, F., Chini, L. P., Cronin, M., Evans, W., Falk, S., Feely, R. A., Gasser, T., Gehlen, M., Gkritzalis, T., Gloege, L., Grassi, G., Gruber, N., Gürses, Ö., Harris, I., Hefner, M., Houghton, R. A., Hurtt, G. C., Iida, Y., Ilyina, T., Jain, A. K., Jersild, A., Kadono, K., Kato, E., Kennedy, D., Klein Goldewijk, K., Knauer, J., Korsbakken, J. I., Landschützer, P., Lefèvre, N., Lindsay, K., Liu, J., Liu, Z., Marland, G., Mayot, N., McGrath, M. J., Metzl, N., Monacci, N. M., Munro, D. R., Nakaoka, S.-I., Niwa, Y., O'Brien, K., Ono, T., Palmer, P. I., Pan, N., Pierrot, D., Pocock, K., Poulter, B., Resplandy, L., Robertson, E., Rödenbeck, C., Rodriguez, C., Rosan, T. M., Schwinger, J., Séférian, R., Shutler, J. D., Skjelvan, I., Steinhoff, T., Sun, Q., Sutton, A. J., Sweeney, C., Takao, S., Tanhua, T., Tans, P. P., Tian, X., Tian, H., Tilbrook, B., Tsujino, H., Tubiello, F., van der Werf, G. R., Walker, A. P., Wanninkhof, R., Whitehead, C., Willstrand Wranne, A., Wright, R., Yuan, W., Yue, C., Yue, X., Zaehle, S., Zeng, J., and Zheng, B.: Global Carbon Budget 2022, Earth Syst. Sci. Data, 14, 4811–4900, <ext-link xlink:href="https://doi.org/10.5194/essd-14-4811-2022" ext-link-type="DOI">10.5194/essd-14-4811-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Gurney et al.(2002)</label><mixed-citation>Gurney, K., Law, R., Denning, A., Rayner, P., Baker, D., Bousquet, P.,  Bruhwiler, L., Chen, Y., Ciais, P., Fan, S., Fung, I., Gloor, M., Heimann,  M., Higuchi, K., John, J., Maki, T., Maksyutov, S., Masarie, K., Peylin, P.,  Prather, M., Pak, B., Randerson, J., Sarmiento, J., Taguchi, S., Takahashi,  T., and Yuen, C. A.: Towards robust regional estimates of CO<sub>2</sub> sources and  sinks using atmospheric transport models, Nature, 415, 626–630, <ext-link xlink:href="https://doi.org/10.1038/415626a" ext-link-type="DOI">10.1038/415626a</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Hobbs et al.(2017)</label><mixed-citation>Hobbs, J., Braverman, A., Cressie, N., Granat, R., and Gunson, M.: Simulation-Based Uncertainty Quantification for Estimating Atmospheric CO<sub>2</sub> from Satellite Data, SIAM/ASA Journal on Uncertainty Quantification, 5, 956–985, <ext-link xlink:href="https://doi.org/10.1137/16M1060765" ext-link-type="DOI">10.1137/16M1060765</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Hobbs et al.(2021)</label><mixed-citation>Hobbs, J., Katzfuss, M., Zilber, D., Brynjarsdóttir, J., Mondal, A., and Berrocal, V.: Spatial Retrievals of Atmospheric Carbon Dioxide from Satellite Observations, Remote Sensing, 13, 571, <ext-link xlink:href="https://doi.org/10.3390/rs13040571" ext-link-type="DOI">10.3390/rs13040571</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Imasu et al.(2023)</label><mixed-citation>Imasu, R., Matsunaga, T., Nakajima, M., Yoshida, Y., Shiomi, K., Morino, I.,  Saitoh, N., Niwa, Y., Someya, Y., Oishi, Y., Hashimoto, M., Noda, H.,  Hikosaka, K., Uchino, O., Maksyutov, S., Takagi, H., Ishida, H., Nakajima,  T. Y., Nakajima, T., and Shi, C.: Greenhouse gases Observing SATellite 2  (GOSAT-2): mission overview, Progress in Earth and Planetary Science, 10, 33, <ext-link xlink:href="https://doi.org/10.1186/s40645-023-00562-2" ext-link-type="DOI">10.1186/s40645-023-00562-2</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Innes(2019)</label><mixed-citation>Innes, M.: Don't Unroll Adjoint: Differentiating SSA-Form Programs, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1810.07951" ext-link-type="DOI">10.48550/arXiv.1810.07951</ext-link>, 9 March 2019.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>IPCC(2023)</label><mixed-citation>IPCC: Summary for Policymakers, IPCC, 1–34, <ext-link xlink:href="https://doi.org/10.1017/CBO9781107415324.004" ext-link-type="DOI">10.1017/CBO9781107415324.004</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Johnson(2020)</label><mixed-citation>Johnson, S. G.: The Sobol module for Julia, GitHub [code], <uri>https://github.com/JuliaMath/Sobol.jl</uri> (last access: 31 January 2025), 2020.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Kaipio and Somersalo(2005)</label><mixed-citation>Kaipio, J. and Somersalo, E.: Statistical and Computational Inverse Problems, Springer, <ext-link xlink:href="https://doi.org/10.1007/b138659" ext-link-type="DOI">10.1007/b138659</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Kalnay(2002)</label><mixed-citation>Kalnay, E.: Atmospheric Modeling, Data Assimilation and Predictability, Cambridge University Press, <ext-link xlink:href="https://doi.org/10.1017/CBO9780511802270" ext-link-type="DOI">10.1017/CBO9780511802270</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Kasahara et al.(2020)</label><mixed-citation>Kasahara, M., Kachi, M., Inaoka, K., Fujii, H., Kubota, T., Shimada, R., and  Kojima, Y.: Overview and current status of GOSAT-GW mission and AMSR3  instrument, in: Sensors, Systems, and Next-Generation Satellites XXIV,  21–25 September 2020, edited by: Neeck, S. P., Hélière, A., and Kimura, T., International Society for Optics and Photonics, SPIE, 11530, p. 1153007, <ext-link xlink:href="https://doi.org/10.1117/12.2573914" ext-link-type="DOI">10.1117/12.2573914</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>Kiel et al.(2019)</label><mixed-citation>Kiel, M., O'Dell, C. W., Fisher, B., Eldering, A., Nassar, R., MacDonald, C. G., and Wennberg, P. O.: How bias correction goes wrong: measurement of XCO<sub>2</sub> affected by erroneous surface pressure estimates, Atmos. Meas. Tech., 12, 2241–2259, <ext-link xlink:href="https://doi.org/10.5194/amt-12-2241-2019" ext-link-type="DOI">10.5194/amt-12-2241-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Kingma and Ba(2017)</label><mixed-citation>Kingma, D. P. and Ba, J.: Adam: A Method for Stochastic Optimization, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.6980" ext-link-type="DOI">10.48550/arXiv.1412.6980</ext-link>, 30 January 2017.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Kuze et al.(2009)</label><mixed-citation>Kuze, A., Suto, H., Nakajima, M., and Hamazaki, T.: Thermal and near infrared  sensor for carbon observation Fourier-transform spectrometer on the  Greenhouse Gases Observing Satellite for greenhouse gases monitoring, Appl.  Optics, 48, 6716–6733, <ext-link xlink:href="https://doi.org/10.1364/AO.48.006716" ext-link-type="DOI">10.1364/AO.48.006716</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Lamminpää(2024)</label><mixed-citation>Lamminpää, O.: Forward Model Emulator for Atmospheric Radiative Transfer Using Gaussian Processes And Cross Validation, OSF [code/data set], <ext-link xlink:href="https://doi.org/10.17605/OSF.IO/U2T8A" ext-link-type="DOI">10.17605/OSF.IO/U2T8A</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Lamminpää et al.(2019)</label><mixed-citation>Lamminpää, O., Hobbs, J., Brynjarsdóttir, J., Laine, M., Braverman, A., Lindqvist, H., and Tamminen, J.: Accelerated MCMC for Satellite-Based Measurements of Atmospheric CO<sub>2</sub>, Remote Sensing, 11, 2061, <ext-link xlink:href="https://doi.org/10.3390/rs11172061" ext-link-type="DOI">10.3390/rs11172061</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Li et al.(2024)</label><mixed-citation>Li, Z., Huang, D. Z., Liu, B., and Anandkumar, A.: Fourier Neural Operator with Learned Deformations for PDEs on General Geometries, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2207.05209" ext-link-type="DOI">10.48550/arXiv.2207.05209</ext-link>, 2 May 2024.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Liu et al.(2017)</label><mixed-citation>Liu, J., Bowman, K. W., Schimel, D. S., Parazoo, N. C., Jiang, Z., Lee, M.,  Bloom, A. A., Wunch, D., Frankenberg, C., Sun, Y., O'Dell, C. W., Gurney, K. R., Menemenlis, D., Gierach, M., Crisp, D., and Eldering, A.: Contrasting carbon cycle responses of the tropical continents to the  2015–2016 El Niño, Science, 358, eaam5690, <ext-link xlink:href="https://doi.org/10.1126/science.aam5690" ext-link-type="DOI">10.1126/science.aam5690</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Lu et al.(2021)</label><mixed-citation>Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E.: Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators, Nature Machine Intelligence, 3, 218–229, <ext-link xlink:href="https://doi.org/10.1038/s42256-021-00302-5" ext-link-type="DOI">10.1038/s42256-021-00302-5</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Ma et al.(2019)</label><mixed-citation>Ma, P., Mondal, A., Konomi, B. A., Hobbs, J., Song, J. J., and Kang, E. L.:  Computer Model Emulation with High-Dimensional Functional Output in  Large-Scale Observing System Uncertainty Experiments, Technometrics, 64,  65–79, <ext-link xlink:href="https://doi.org/10.1080/00401706.2021.1895890" ext-link-type="DOI">10.1080/00401706.2021.1895890</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>McDuffie et al.(2020)</label><mixed-citation>McDuffie, J., Bowman, K., Hobbs, J., Natraj, V., Sarkissian, E., Mike, M. T., and Val, S.: Reusable Framework for Retrieval of Atmospheric Composition (ReFRACtor), Version 1.09, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.4019567" ext-link-type="DOI">10.5281/zenodo.4019567</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Mishra and Molinaro(2021)</label><mixed-citation>Mishra, S. and Molinaro, R.: Physics informed neural networks for simulating  radiative transfer, J. Quant. Spectrosc. Ra., 270, 107705, <ext-link xlink:href="https://doi.org/10.1016/J.JQSRT.2021.107705" ext-link-type="DOI">10.1016/J.JQSRT.2021.107705</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Moore et al.(2018)</label><mixed-citation>Moore III, B., Crowell, S. M. R., Rayner, P. J., Kumer, J., O'Dell, C. W.,  O'Brien, D., Utembe, S., Polonsky, I., Schimel, D., and Lemen, J.: The  Potential of the Geostationary Carbon Cycle Observatory (GeoCarb) to Provide Multi-scale Constraints on the Carbon Cycle in the Americas, Front. Environ. Sci., 6, 109, <ext-link xlink:href="https://doi.org/10.3389/fenvs.2018.00109" ext-link-type="DOI">10.3389/fenvs.2018.00109</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Nguyen and Hobbs(2020)</label><mixed-citation>Nguyen, H. and Hobbs, J.: Intercomparison of Remote Sensing Retrievals: An Examination of Prior-Induced Biases in Averaging Kernel Corrections, Remote Sensing, 12, 3239, <ext-link xlink:href="https://doi.org/10.3390/rs12193239" ext-link-type="DOI">10.3390/rs12193239</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>O'Dell et al.(2012)</label><mixed-citation>O'Dell, C. W., Connor, B., Bösch, H., O'Brien, D., Frankenberg, C., Castano, R., Christi, M., Eldering, D., Fisher, B., Gunson, M., McDuffie, J., Miller, C. E., Natraj, V., Oyafuso, F., Polonsky, I., Smyth, M., Taylor, T., Toon, G. C., Wennberg, P. O., and Wunch, D.: The ACOS CO<sub>2</sub> retrieval algorithm – Part 1: Description and validation against synthetic observations, Atmos. Meas. Tech., 5, 99–121, <ext-link xlink:href="https://doi.org/10.5194/amt-5-99-2012" ext-link-type="DOI">10.5194/amt-5-99-2012</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>O'Dell et al.(2018)</label><mixed-citation>O'Dell, C. W., Eldering, A., Wennberg, P. O., Crisp, D., Gunson, M. R., Fisher, B., Frankenberg, C., Kiel, M., Lindqvist, H., Mandrake, L., Merrelli, A., Natraj, V., Nelson, R. R., Osterman, G. B., Payne, V. H., Taylor, T. E., Wunch, D., Drouin, B. J., Oyafuso, F., Chang, A., McDuffie, J., Smyth, M., Baker, D. F., Basu, S., Chevallier, F., Crowell, S. M. R., Feng, L., Palmer, P. I., Dubey, M., García, O. E., Griffith, D. W. T., Hase, F., Iraci, L. T., Kivi, R., Morino, I., Notholt, J., Ohyama, H., Petri, C., Roehl, C. M., Sha, M. K., Strong, K., Sussmann, R., Te, Y., Uchino, O., and Velazco, V. A.: Improved retrievals of carbon dioxide from Orbiting Carbon Observatory-2 with the version 8 ACOS algorithm, Atmos. Meas. Tech., 11, 6539–6576, <ext-link xlink:href="https://doi.org/10.5194/amt-11-6539-2018" ext-link-type="DOI">10.5194/amt-11-6539-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Owhadi and Yoo(2019)</label><mixed-citation>Owhadi, H. and Yoo, G. R.: Kernel Flows: From learning kernels from data into the abyss, J. Comput. Phys., 389, 22–47, <ext-link xlink:href="https://doi.org/10.1016/j.jcp.2019.03.040" ext-link-type="DOI">10.1016/j.jcp.2019.03.040</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Palmer et al.(2019)</label><mixed-citation>Palmer, P. I., Feng, L., Baker, D., Chevallier, F., Bösch, H., and Somkuti, P.: Net carbon emissions from African biosphere dominate pan-tropical atmospheric CO<sub>2</sub> signal, Nat. Commun., 10, 3344,  <ext-link xlink:href="https://doi.org/10.1038/s41467-019-11097-w" ext-link-type="DOI">10.1038/s41467-019-11097-w</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Patil et al.(2022)</label><mixed-citation>Patil, P., Kuusela, M., and Hobbs, J.: Objective Frequentist Uncertainty  Quantification for Atmospheric CO<sub>2</sub> Retrievals, SIAM/ASA Journal on  Uncertainty Quantification, 10, 827–859, <ext-link xlink:href="https://doi.org/10.1137/20M1356403" ext-link-type="DOI">10.1137/20M1356403</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Patra et al.(2017)</label><mixed-citation>Patra, P., Crisp, D., Kaiser, J., Wunch, D., Saeki, T., Ichii, K., Sekiya, T., Wennberg, P., Feist, D., Pollard, D., Griffith, D., Velazco, V., Maziere, M., Sha, M., Roehl, C., Chatterjee, A., and Ishijima, K.: The Orbiting Carbon Observatory (OCO-2) tracks 2–3 peta-gram increase in carbon release to the atmosphere during the 2014–2016 El Niño, Sci. Rep., 7, 13567, <ext-link xlink:href="https://doi.org/10.1038/s41598-017-13459-0" ext-link-type="DOI">10.1038/s41598-017-13459-0</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Peiro et al.(2022)</label><mixed-citation>Peiro, H., Crowell, S., Schuh, A., Baker, D. F., O'Dell, C., Jacobson, A. R., Chevallier, F., Liu, J., Eldering, A., Crisp, D., Deng, F., Weir, B., Basu, S., Johnson, M. S., Philip, S., and Baker, I.: Four years of global carbon cycle observed from the Orbiting Carbon Observatory 2 (OCO-2) version 9 and in situ data and comparison to OCO-2 version 7, Atmos. Chem. Phys., 22, 1097–1130, <ext-link xlink:href="https://doi.org/10.5194/acp-22-1097-2022" ext-link-type="DOI">10.5194/acp-22-1097-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx51"><label>Peters et al.(2007)</label><mixed-citation>Peters, W., Jacobson, A. R., Sweeney, C., Andrews, A. E., Conway, T. J.,  Masarie, K., Miller, J. B., Bruhwiler, L. M. P., Pétron, G., Hirsch, A. I., Worthy, D. E. J., van der Werf, G. R., Randerson, J. T., Wennberg, P. O., Krol, M. C., and Tans, P. P.: An Atmospheric Perspective on North American Carbon Dioxide Exchange: CarbonTracker, P. Natl. Acad. Sci. USA, 104, 18925–18930, <ext-link xlink:href="https://doi.org/10.1073/pnas.0708986104" ext-link-type="DOI">10.1073/pnas.0708986104</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Press et al.(1992)</label><mixed-citation>Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T.: Numerical Recipes in FORTRAN 77: The Art of Scientific Computing, Cambridge University Press, 2nd edn., ISBN 052143064X, 1992.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Ran and Li(2019)</label><mixed-citation>Ran, Y. and Li, X.: TanSat: a new star in global carbon monitoring from China, Sci. Bull., 64, 284–285, <ext-link xlink:href="https://doi.org/10.1016/j.scib.2019.01.019" ext-link-type="DOI">10.1016/j.scib.2019.01.019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx54"><label>Rasmussen and Williams(2006)</label><mixed-citation>Rasmussen, C. and Williams, C.: Gaussian Processes for Machine Learning,  Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA,  <uri>http://gaussianprocess.org/gpml/</uri> (last access: 31 January 2025), 2006.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Rodgers(2004)</label><mixed-citation>Rodgers, C. D.: Inverse Methods for Atmospheric Sounding: Theory and Practice, World Scientific Publishing Co. Pte. Ltd., Singapore 596224, Reprint edn., ISBN 981022740X, 2004.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Rosenberg et al.(2017)</label><mixed-citation>Rosenberg, R., Maxwell, S., Johnson, B. C., Chapsky, L., Lee, R. A. M., and  Pollock, R.: Preflight Radiometric Calibration of Orbiting Carbon Observatory  2, IEEE T. Geosci. Remote, 55, 1994–2006, <ext-link xlink:href="https://doi.org/10.1109/TGRS.2016.2634023" ext-link-type="DOI">10.1109/TGRS.2016.2634023</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Schimel et al.(2015)</label><mixed-citation>Schimel, D., Pavlick, R., Fisher, J. B., Asner, G. P., Saatchi, S., Townsend,  P., Miller, C., Frankenberg, C., Hibbard, K., and Cox, P.: Observing  terrestrial ecosystems and the carbon cycle from space, Glob. Change Biol., 21, 1762–1776, <ext-link xlink:href="https://doi.org/10.1111/gcb.12822" ext-link-type="DOI">10.1111/gcb.12822</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx58"><label>Sierk et al.(2019)</label><mixed-citation>Sierk, B., Bézy, J.-L., Löscher, A., and Meijer, Y.: The European CO<sub>2</sub> Monitoring Mission: observing anthropogenic greenhouse gas emissions from  space, in: International Conference on Space Optics – ICSO 2018, Chania, Greece, 9–12 October 2018, edited by: Sodnik, Z., Karafolas, N., and Cugny, B., International Society for Optics and Photonics, SPIE, 11180, p. 111800M, <ext-link xlink:href="https://doi.org/10.1117/12.2535941" ext-link-type="DOI">10.1117/12.2535941</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx59"><label>Sobol(1967)</label><mixed-citation>Sobol, I.: Distribution of Points in a Cube and Approximate Evaluation of Integrals, Zh. Vych. Mat. Mat. Fiz., 7, 784–802, 1967.</mixed-citation></ref>
      <ref id="bib1.bibx60"><label>Stein(1999)</label><mixed-citation>Stein, M. L.: Interpolation of spatial data: some theory for kriging, Springer Science &amp; Business Media, <ext-link xlink:href="https://doi.org/10.1007/978-1-4612-1494-6" ext-link-type="DOI">10.1007/978-1-4612-1494-6</ext-link>, 1999.</mixed-citation></ref>
      <ref id="bib1.bibx61"><label>Stewart(1998)</label><mixed-citation>Stewart, G. W.: Matrix algorithms, Volume I: Basic Decompositions, SIAM (Society for Industrial and Applied Mathematics), i–xix, <ext-link xlink:href="https://doi.org/10.1137/1.9781611971408.fm" ext-link-type="DOI">10.1137/1.9781611971408.fm</ext-link>, 1998.</mixed-citation></ref>
      <ref id="bib1.bibx62"><label>Tukiainen et al.(2016)</label><mixed-citation>Tukiainen, S., Railo, J., Laine, M., Hakkarainen, J., Kivi, R., Heikkinen, P., Chen, H., and Tamminen, J.: Retrieval of atmospheric CH<sub>4</sub> profiles from  Fourier transform infrared data using dimension reduction and MCMC, J. Geophys. Res.-Atmos., 121, 10312–10327, <ext-link xlink:href="https://doi.org/10.1002/2015JD024657" ext-link-type="DOI">10.1002/2015JD024657</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx63"><label>Vecchia(1988)</label><mixed-citation>Vecchia, A. V.: Estimation and Model Identification for Continuous Spatial  Processes, J. Roy. Stat. Soc. B Met., 50, 297–312, <uri>http://www.jstor.org/stable/2345768</uri> (last access: 31 January 2025), 1988.</mixed-citation></ref>
      <ref id="bib1.bibx64"><label>Veefkind et al.(2012)</label><mixed-citation>Veefkind, J., Aben, I., McMullan, K., Förster, H., de Vries, J., Otter, G.,  Claas, J., Eskes, H., de Haan, J., Kleipool, Q., van Weele, M., Hasekamp,  O., Hoogeveen, R., Landgraf, J., Snel, R., Tol, P., Ingmann, P., Voors, R.,  Kruizinga, B., Vink, R., Visser, H., and Levelt, P.: TROPOMI on the ESA  Sentinel-5 Precursor: A GMES mission for global observations of the  atmospheric composition for climate, air quality and ozone layer  applications, Remote Sens. Environ., 120, 70–83,  <ext-link xlink:href="https://doi.org/10.1016/j.rse.2011.09.027" ext-link-type="DOI">10.1016/j.rse.2011.09.027</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx65"><label>Wu et al.(2023)</label><mixed-citation>Wu, K., Yang, D., Liu, Y., Cai, Z., Zhou, M., Feng, L., and Palmer, P. I.:  Evaluating the Ability of the Pre-Launch TanSat-2 Satellite to Quantify Urban CO<sub>2</sub> Emissions, Remote Sensing, 15, 4904, <ext-link xlink:href="https://doi.org/10.3390/rs15204904" ext-link-type="DOI">10.3390/rs15204904</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx66"><label>Wunch et al.(2017)</label><mixed-citation>Wunch, D., Wennberg, P. O., Osterman, G., Fisher, B., Naylor, B., Roehl, C. M., O'Dell, C., Mandrake, L., Viatte, C., Kiel, M., Griffith, D. W. T., Deutscher, N. M., Velazco, V. A., Notholt, J., Warneke, T., Petri, C., De Maziere, M., Sha, M. K., Sussmann, R., Rettinger, M., Pollard, D., Robinson, J., Morino, I., Uchino, O., Hase, F., Blumenstock, T., Feist, D. G., Arnold, S. G., Strong, K., Mendonca, J., Kivi, R., Heikkinen, P., Iraci, L., Podolske, J., Hillyard, P. W., Kawakami, S., Dubey, M. K., Parker, H. A., Sepulveda, E., García, O. E., Te, Y., Jeseck, P., Gunson, M. R., Crisp, D., and Eldering, A.: Comparisons of the Orbiting Carbon Observatory-2 (OCO-2) XCO<sub>2</sub> measurements with TCCON, Atmos. Meas. Tech., 10, 2209–2238, <ext-link xlink:href="https://doi.org/10.5194/amt-10-2209-2017" ext-link-type="DOI">10.5194/amt-10-2209-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx67"><label>Zhu et al.(2022)</label><mixed-citation>Zhu, X., Huang, L., Ibrahim, C., Lee, E. H., and Bindel, D.: Scalable Bayesian Transformed Gaussian Processes, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2210.10973" ext-link-type="DOI">10.48550/arXiv.2210.10973</ext-link>, 20 October 2022.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Forward model emulator for atmospheric radiative transfer using Gaussian processes and cross validation</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Batlle et al.(2023)</label><mixed-citation>
      
Batlle, P., Darcy, M., Hosseini, B., and Owhadi, H.: Kernel Methods are  Competitive for Operator Learning, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2304.13202" target="_blank">https://doi.org/10.48550/arXiv.2304.13202</a>, 8 October 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Boesch et al.(2015)</label><mixed-citation>
      
Boesch, H., Brown, L., Castano, R., Christi, M., Connor, B., Crisp, D.,  Eldering, A., Fisher, B., Frankenberg, C., Gunson, M., Granat, R., McDuffie,  J., Miller, C., Natraj, V., O'Brien, D., O'Dell, C., Osterman, G., Oyafuso,  F., Payne, V., Polonsky, I., Smyth, M., Spurr, R., Thompson, D., and Toon,  G.: Orbiting Carbon Observatory-2 (OCO-2) Level 2 Full Physics Retrieval  Algorithm Theoretical Basis, Version 2.0, Rev 2, NASA Earth Data, <a href="https://doi.org/10.5067/8E4VLCK16O6Q" target="_blank">https://doi.org/10.5067/8E4VLCK16O6Q</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Braverman et al.(2021)</label><mixed-citation>
      
Braverman, A., Hobbs, J., Teixeira, J., and Gunson, M.: Post hoc Uncertainty  Quantification for Remote Sensing Observing Systems, SIAM/ASA Journal on Uncertainty Quantification, 9, 1064–1093, <a href="https://doi.org/10.1137/19M1304283" target="_blank">https://doi.org/10.1137/19M1304283</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Bréon et al.(2022)</label><mixed-citation>
      
Bréon, F.-M., David, L., Chatelanaz, P., and Chevallier, F.: On the potential of a neural-network-based approach for estimating XCO<sub>2</sub> from OCO-2 measurements, Atmos. Meas. Tech., 15, 5219–5234, <a href="https://doi.org/10.5194/amt-15-5219-2022" target="_blank">https://doi.org/10.5194/amt-15-5219-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Brynjarsdóttir et al.(2018)</label><mixed-citation>
      
Brynjarsdóttir, J., Hobbs, J., Braverman, A., and Mandrake, L.: Optimal  Estimation Versus MCMC for CO<sub>2</sub> Retrievals, J. Agr. Biol. Envir. St., 23, 297–316, <a href="https://doi.org/10.1007/s13253-018-0319-8" target="_blank">https://doi.org/10.1007/s13253-018-0319-8</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Byrne et al.(2023)</label><mixed-citation>
      
Byrne, B., Baker, D. F., Basu, S., Bertolacci, M., Bowman, K. W., Carroll, D., Chatterjee, A., Chevallier, F., Ciais, P., Cressie, N., Crisp, D., Crowell, S., Deng, F., Deng, Z., Deutscher, N. M., Dubey, M. K., Feng, S., García, O. E., Griffith, D. W. T., Herkommer, B., Hu, L., Jacobson, A. R., Janardanan, R., Jeong, S., Johnson, M. S., Jones, D. B. A., Kivi, R., Liu, J., Liu, Z., Maksyutov, S., Miller, J. B., Miller, S. M., Morino, I., Notholt, J., Oda, T., O'Dell, C. W., Oh, Y.-S., Ohyama, H., Patra, P. K., Peiro, H., Petri, C., Philip, S., Pollard, D. F., Poulter, B., Remaud, M., Schuh, A., Sha, M. K., Shiomi, K., Strong, K., Sweeney, C., Té, Y., Tian, H., Velazco, V. A., Vrekoussis, M., Warneke, T., Worden, J. R., Wunch, D., Yao, Y., Yun, J., Zammit-Mangion, A., and Zeng, N.: National CO<sub>2</sub> budgets (2015–2020) inferred from atmospheric CO<sub>2</sub> observations in support of the global stocktake, Earth Syst. Sci. Data, 15, 963–1004, <a href="https://doi.org/10.5194/essd-15-963-2023" target="_blank">https://doi.org/10.5194/essd-15-963-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Chevallier et al.(2010)</label><mixed-citation>
      
Chevallier, F., Ciais, P., Conway, T. J., Aalto, T., Anderson, B. E., Bousquet, P., Brunke, E. G., Ciattaglia, L., Esaki, Y., Frohlich, M., Gomez, A., Gomez-Pelaez, A. J., Haszpra, L., Krummel, P., Langenfelds, R. L.,  Leuenberger, M., Machida, T., Maignan, F., Matsueda, H., Morgu, J. A., Mukai,  H., Nakazawa, T., Peylin, P., Ramonet, M., Rivier, L., Sawa, Y., Schmidt, M.,  Steele, L. P., Vay, S. A., Vermeulen, A. T., Wofsy, S., and Worthy, D.: CO<sub>2</sub> surface fluxes at grid point scale estimated from a global 21 year  re-analysis of atmospheric measurements, J. Geophys. Res.-Atmos., 115, D21307, <a href="https://doi.org/10.1029/2010JD013887" target="_blank">https://doi.org/10.1029/2010JD013887</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Connor et al.(2008)</label><mixed-citation>
      
Connor, B. J., Boesch, H., Toon, G., Sen, B., Miller, C., and Crisp, D.:  Orbiting Carbon Observatory: Inverse Method and Prospective Error Analysis,  J. Geophys. Res., 113, D05305, <a href="https://doi.org/10.1029/2006JD008336" target="_blank">https://doi.org/10.1029/2006JD008336</a>, 2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Cressie(1993)</label><mixed-citation>
      
Cressie, N.: Statistics for Spatial Data, John Wiley &amp; Sons, Inc., <a href="https://doi.org/10.1002/9781119115151" target="_blank">https://doi.org/10.1002/9781119115151</a>,  1993.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Cressie(2018)</label><mixed-citation>
      
Cressie, N.: Mission CO<sub>2</sub>ntrol: A Statistical Scientist's Role in Remote  Sensing of Atmospheric Carbon Dioxide, J. Am. Stat. Assoc., 113, 152–168, <a href="https://doi.org/10.1080/01621459.2017.1419136" target="_blank">https://doi.org/10.1080/01621459.2017.1419136</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Crisp et al.(2004)</label><mixed-citation>
      
Crisp, D., Atlas, R. M., Breon, F.-M., Brown, L. R., Burrows, J. P., Ciais, P., Connor, B. J., Doney, S. C., Fung, I. Y., Jacob, D. J., Miller, C. E.,  O'Brien, D., Pawson, S., Randerson, J. T., Rayner, P., Salawitch, R. J., Sander, S. P., Sen, B., Stephens, G. L., Tans, P. P., Toon, G. C., Wennberg,  P. O., Wofsy, S. C., Yung, Y. L., Kuang, Z., Chudasama, B., Sprague, G.,  Weiss, B., Pollock, R., Kenyon, D., and Schroll, S.: The Orbiting Carbon Observatory (OCO) mission, Adv. Space Res., 34, 700–709, <a href="https://doi.org/10.1016/j.asr.2003.08.062" target="_blank">https://doi.org/10.1016/j.asr.2003.08.062</a>, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Crisp et al.(2012)</label><mixed-citation>
      
Crisp, D., Fisher, B. M., O'Dell, C., Frankenberg, C., Basilio, R., Bösch, H., Brown, L. R., Castano, R., Connor, B., Deutscher, N. M., Eldering, A., Griffith, D., Gunson, M., Kuze, A., Mandrake, L., McDuffie, J., Messerschmidt, J., Miller, C. E., Morino, I., Natraj, V., Notholt, J., O'Brien, D. M., Oyafuso, F., Polonsky, I., Robinson, J., Salawitch, R., Sherlock, V., Smyth, M., Suto, H., Taylor, T. E., Thompson, D. R., Wennberg, P. O., Wunch, D., and Yung, Y. L.: The ACOS CO<sub>2</sub> retrieval algorithm – Part II: Global XCO<sub>2</sub> data characterization, Atmos. Meas. Tech., 5, 687–707, <a href="https://doi.org/10.5194/amt-5-687-2012" target="_blank">https://doi.org/10.5194/amt-5-687-2012</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Crisp et al.(2017)</label><mixed-citation>
      
Crisp, D., Pollock, H. R., Rosenberg, R., Chapsky, L., Lee, R. A. M., Oyafuso, F. A., Frankenberg, C., O'Dell, C. W., Bruegge, C. J., Doran, G. B., Eldering, A., Fisher, B. M., Fu, D., Gunson, M. R., Mandrake, L., Osterman, G. B., Schwandner, F. M., Sun, K., Taylor, T. E., Wennberg, P. O., and Wunch, D.: The on-orbit performance of the Orbiting Carbon Observatory-2 (OCO-2) instrument and its radiometrically calibrated products, Atmos. Meas. Tech., 10, 59–81, <a href="https://doi.org/10.5194/amt-10-59-2017" target="_blank">https://doi.org/10.5194/amt-10-59-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Crowell et al.(2019)</label><mixed-citation>
      
Crowell, S., Baker, D., Schuh, A., Basu, S., Jacobson, A. R., Chevallier, F., Liu, J., Deng, F., Feng, L., McKain, K., Chatterjee, A., Miller, J. B., Stephens, B. B., Eldering, A., Crisp, D., Schimel, D., Nassar, R., O'Dell, C. W., Oda, T., Sweeney, C., Palmer, P. I., and Jones, D. B. A.: The 2015–2016 carbon cycle as seen from OCO-2 and the global in situ network, Atmos. Chem. Phys., 19, 9797–9831, <a href="https://doi.org/10.5194/acp-19-9797-2019" target="_blank">https://doi.org/10.5194/acp-19-9797-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Datta et al.(2016)</label><mixed-citation>
      
Datta, A., Banerjee, S., Finley, A., and Gelfand, A.: Hierarchical  Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets, J. Am. Stat. Assoc., 111, 800–812, <a href="https://doi.org/10.1080/01621459.2015.1044091" target="_blank">https://doi.org/10.1080/01621459.2015.1044091</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>David et al.(2021)</label><mixed-citation>
      
David, L., Bréon, F.-M., and Chevallier, F.: XCO<sub>2</sub> estimates from the OCO-2 measurements using a neural network approach, Atmos. Meas. Tech., 14, 117–132, <a href="https://doi.org/10.5194/amt-14-117-2021" target="_blank">https://doi.org/10.5194/amt-14-117-2021</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Eldering et al.(2019)</label><mixed-citation>
      
Eldering, A., Taylor, T. E., O'Dell, C. W., and Pavlick, R.: The OCO-3 mission: measurement objectives and expected performance based on 1 year of simulated data, Atmos. Meas. Tech., 12, 2341–2370, <a href="https://doi.org/10.5194/amt-12-2341-2019" target="_blank">https://doi.org/10.5194/amt-12-2341-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Frey et al.(2019)</label><mixed-citation>
      
Frey, M., Sha, M. K., Hase, F., Kiel, M., Blumenstock, T., Harig, R., Surawicz, G., Deutscher, N. M., Shiomi, K., Franklin, J. E., Bösch, H., Chen, J., Grutter, M., Ohyama, H., Sun, Y., Butz, A., Mengistu Tsidu, G., Ene, D., Wunch, D., Cao, Z., Garcia, O., Ramonet, M., Vogel, F., and Orphal, J.: Building the COllaborative Carbon Column Observing Network (COCCON): long-term stability and ensemble performance of the EM27/SUN Fourier transform spectrometer, Atmos. Meas. Tech., 12, 1513–1530, <a href="https://doi.org/10.5194/amt-12-1513-2019" target="_blank">https://doi.org/10.5194/amt-12-1513-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Friedlingstein et al.(2014)</label><mixed-citation>
      
Friedlingstein, P., Meinshausen, M., Arora, V., Jones, C., Anav, A., Liddicoat, S., and Knutti, R.: Uncertainties in CMIP5 Climate Projections due to Carbon Cycle Feedbacks, J. Climate, 27, 511–526, <a href="https://doi.org/10.1175/JCLI-D-12-00579.1" target="_blank">https://doi.org/10.1175/JCLI-D-12-00579.1</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Friedlingstein et al.(2022)</label><mixed-citation>
      
Friedlingstein, P., O'Sullivan, M., Jones, M. W., Andrew, R. M., Gregor, L., Hauck, J., Le Quéré, C., Luijkx, I. T., Olsen, A., Peters, G. P., Peters, W., Pongratz, J., Schwingshackl, C., Sitch, S., Canadell, J. G., Ciais, P., Jackson, R. B., Alin, S. R., Alkama, R., Arneth, A., Arora, V. K., Bates, N. R., Becker, M., Bellouin, N., Bittig, H. C., Bopp, L., Chevallier, F., Chini, L. P., Cronin, M., Evans, W., Falk, S., Feely, R. A., Gasser, T., Gehlen, M., Gkritzalis, T., Gloege, L., Grassi, G., Gruber, N., Gürses, Ö., Harris, I., Hefner, M., Houghton, R. A., Hurtt, G. C., Iida, Y., Ilyina, T., Jain, A. K., Jersild, A., Kadono, K., Kato, E., Kennedy, D., Klein Goldewijk, K., Knauer, J., Korsbakken, J. I., Landschützer, P., Lefèvre, N., Lindsay, K., Liu, J., Liu, Z., Marland, G., Mayot, N., McGrath, M. J., Metzl, N., Monacci, N. M., Munro, D. R., Nakaoka, S.-I., Niwa, Y., O'Brien, K., Ono, T., Palmer, P. I., Pan, N., Pierrot, D., Pocock, K., Poulter, B., Resplandy, L., Robertson, E., Rödenbeck, C., Rodriguez, C., Rosan, T. M., Schwinger, J., Séférian, R., Shutler, J. D., Skjelvan, I., Steinhoff, T., Sun, Q., Sutton, A. J., Sweeney, C., Takao, S., Tanhua, T., Tans, P. P., Tian, X., Tian, H., Tilbrook, B., Tsujino, H., Tubiello, F., van der Werf, G. R., Walker, A. P., Wanninkhof, R., Whitehead, C., Willstrand Wranne, A., Wright, R., Yuan, W., Yue, C., Yue, X., Zaehle, S., Zeng, J., and Zheng, B.: Global Carbon Budget 2022, Earth Syst. Sci. Data, 14, 4811–4900, <a href="https://doi.org/10.5194/essd-14-4811-2022" target="_blank">https://doi.org/10.5194/essd-14-4811-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Gurney et al.(2002)</label><mixed-citation>
      
Gurney, K., Law, R., Denning, A., Rayner, P., Baker, D., Bousquet, P.,  Bruhwiler, L., Chen, Y., Ciais, P., Fan, S., Fung, I., Gloor, M., Heimann,  M., Higuchi, K., John, J., Maki, T., Maksyutov, S., Masarie, K., Peylin, P.,  Prather, M., Pak, B., Randerson, J., Sarmiento, J., Taguchi, S., Takahashi,  T., and Yuen, C. A.: Towards robust regional estimates of CO<sub>2</sub> sources and  sinks using atmospheric transport models, Nature, 415, 626–630, <a href="https://doi.org/10.1038/415626a" target="_blank">https://doi.org/10.1038/415626a</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Hobbs et al.(2017)</label><mixed-citation>
      
Hobbs, J., Braverman, A., Cressie, N., Granat, R., and Gunson, M.:  Simulation-Based Uncertainty Quantification for Estimating Atmospheric CO<sub>2</sub>  from Satellite Data, SIAM/ASA Journal on Uncertainty Quantification, 5,  956–985, <a href="https://doi.org/10.1137/16M1060765" target="_blank">https://doi.org/10.1137/16M1060765</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Hobbs et al.(2021)</label><mixed-citation>
      
Hobbs, J., Katzfuss, M., Zilber, D., Brynjarsdóttir, J., Mondal, A., and  Berrocal, V.: Spatial Retrievals of Atmospheric Carbon Dioxide from Satellite  Observations, Remote Sensing, 13, 571, <a href="https://doi.org/10.3390/rs13040571" target="_blank">https://doi.org/10.3390/rs13040571</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Imasu et al.(2023)</label><mixed-citation>
      
Imasu, R., Matsunaga, T., Nakajima, M., Yoshida, Y., Shiomi, K., Morino, I.,  Saitoh, N., Niwa, Y., Someya, Y., Oishi, Y., Hashimoto, M., Noda, H.,  Hikosaka, K., Uchino, O., Maksyutov, S., Takagi, H., Ishida, H., Nakajima,  T. Y., Nakajima, T., and Shi, C.: Greenhouse gases Observing SATellite 2  (GOSAT-2): mission overview, Progress in Earth and Planetary Science, 10, 33, <a href="https://doi.org/10.1186/s40645-023-00562-2" target="_blank">https://doi.org/10.1186/s40645-023-00562-2</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Innes(2019)</label><mixed-citation>
      
Innes, M.: Don't Unroll Adjoint: Differentiating SSA-Form Programs, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1810.07951" target="_blank">https://doi.org/10.48550/arXiv.1810.07951</a>, 9 March 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>IPCC(2023)</label><mixed-citation>
      
IPCC: Summary for Policymakers, IPCC, 1–34,  <a href="https://doi.org/10.1017/CBO9781107415324.004" target="_blank">https://doi.org/10.1017/CBO9781107415324.004</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Johnson(2020)</label><mixed-citation>
      
Johnson, S. G.: The Sobol module for Julia, GitHub [code], <a href="https://github.com/JuliaMath/Sobol.jl" target="_blank">https://github.com/JuliaMath/Sobol.jl</a> (last access: 31 January 2025), 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Kaipio and Somersalo(2005)</label><mixed-citation>
      
Kaipio, J. and Somersalo, E.: Statistical and Computational Inverse Problems,  Springer, <a href="https://doi.org/10.1007/b138659" target="_blank">https://doi.org/10.1007/b138659</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Kalnay(2002)</label><mixed-citation>
      
Kalnay, E.: Atmospheric Modeling, Data Assimilation and Predictability,  Cambridge University Press, <a href="https://doi.org/10.1017/CBO9780511802270" target="_blank">https://doi.org/10.1017/CBO9780511802270</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Kasahara et al.(2020)</label><mixed-citation>
      
Kasahara, M., Kachi, M., Inaoka, K., Fujii, H., Kubota, T., Shimada, R., and  Kojima, Y.: Overview and current status of GOSAT-GW mission and AMSR3  instrument, in: Sensors, Systems, and Next-Generation Satellites XXIV,  21–25 September 2020, edited by: Neeck, S. P., Hélière, A., and Kimura, T., International Society for Optics and Photonics, SPIE, 11530, p. 1153007, <a href="https://doi.org/10.1117/12.2573914" target="_blank">https://doi.org/10.1117/12.2573914</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>Kiel et al.(2019)</label><mixed-citation>
      
Kiel, M., O'Dell, C. W., Fisher, B., Eldering, A., Nassar, R., MacDonald, C. G., and Wennberg, P. O.: How bias correction goes wrong: measurement of XCO<sub>2</sub> affected by erroneous surface pressure estimates, Atmos. Meas. Tech., 12, 2241–2259, <a href="https://doi.org/10.5194/amt-12-2241-2019" target="_blank">https://doi.org/10.5194/amt-12-2241-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Kingma and Ba(2017)</label><mixed-citation>
      
Kingma, D. P. and Ba, J.: Adam: A Method for Stochastic Optimization, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.1412.6980" target="_blank">https://doi.org/10.48550/arXiv.1412.6980</a>, 30 January 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Kuze et al.(2009)</label><mixed-citation>
      
Kuze, A., Suto, H., Nakajima, M., and Hamazaki, T.: Thermal and near infrared  sensor for carbon observation Fourier-transform spectrometer on the  Greenhouse Gases Observing Satellite for greenhouse gases monitoring, Appl.  Optics, 48, 6716–6733, <a href="https://doi.org/10.1364/AO.48.006716" target="_blank">https://doi.org/10.1364/AO.48.006716</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Lamminpää(2024)</label><mixed-citation>
      
Lamminpää, O.: Forward Model Emulator for Atmospheric Radiative Transfer  Using Gaussian Processes And Cross Validation, OSF [code/data set], <a href="https://doi.org/10.17605/OSF.IO/U2T8A" target="_blank">https://doi.org/10.17605/OSF.IO/U2T8A</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Lamminpää et al.(2019)</label><mixed-citation>
      
Lamminpää, O., Hobbs, J., Brynjarsdóttir, J., Laine, M., Braverman, A.,  Lindqvist, H., and Tamminen, J.: Accelerated MCMC for Satellite-Based  Measurements of Atmospheric CO<sub>2</sub>, Remote Sensing, 11, 2061,  <a href="https://doi.org/10.3390/rs11172061" target="_blank">https://doi.org/10.3390/rs11172061</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Li et al.(2024)</label><mixed-citation>
      
Li, Z., Huang, D. Z., Liu, B., and Anandkumar, A.: Fourier Neural Operator with Learned Deformations for PDEs on General Geometries, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2207.05209" target="_blank">https://doi.org/10.48550/arXiv.2207.05209</a>, 2 May 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Liu et al.(2017)</label><mixed-citation>
      
Liu, J., Bowman, K. W., Schimel, D. S., Parazoo, N. C., Jiang, Z., Lee, M.,  Bloom, A. A., Wunch, D., Frankenberg, C., Sun, Y., O'Dell, C. W., Gurney, K. R., Menemenlis, D., Gierach, M., Crisp, D., and Eldering, A.: Contrasting carbon cycle responses of the tropical continents to the  2015–2016 El Niño, Science, 358, eaam5690, <a href="https://doi.org/10.1126/science.aam5690" target="_blank">https://doi.org/10.1126/science.aam5690</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Lu et al.(2021)</label><mixed-citation>
      
Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E.: Learning  nonlinear operators via DeepONet based on the universal approximation theorem  of operators, Nature Machine Intelligence, 3, 218–229,  <a href="https://doi.org/10.1038/s42256-021-00302-5" target="_blank">https://doi.org/10.1038/s42256-021-00302-5</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Ma et al.(2019)</label><mixed-citation>
      
Ma, P., Mondal, A., Konomi, B. A., Hobbs, J., Song, J. J., and Kang, E. L.:  Computer Model Emulation with High-Dimensional Functional Output in  Large-Scale Observing System Uncertainty Experiments, Technometrics, 64,  65–79, <a href="https://doi.org/10.1080/00401706.2021.1895890" target="_blank">https://doi.org/10.1080/00401706.2021.1895890</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>McDuffie et al.(2020)</label><mixed-citation>
      
McDuffie, J., Bowman, K., Hobbs, J., Natraj, V., Sarkissian, E., Mike, M. T., and Val, S.: Reusable Framework for Retrieval of Atmospheric Composition (ReFRACtor), Version 1.09, Zenodo [code], <a href="https://doi.org/10.5281/zenodo.4019567" target="_blank">https://doi.org/10.5281/zenodo.4019567</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Mishra and Molinaro(2021)</label><mixed-citation>
      
Mishra, S. and Molinaro, R.: Physics informed neural networks for simulating  radiative transfer, J. Quant. Spectrosc. Ra., 270, 107705, <a href="https://doi.org/10.1016/J.JQSRT.2021.107705" target="_blank">https://doi.org/10.1016/J.JQSRT.2021.107705</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Moore et al.(2018)</label><mixed-citation>
      
Moore III, B., Crowell, S. M. R., Rayner, P. J., Kumer, J., O'Dell, C. W.,  O'Brien, D., Utembe, S., Polonsky, I., Schimel, D., and Lemen, J.: The  Potential of the Geostationary Carbon Cycle Observatory (GeoCarb) to Provide Multi-scale Constraints on the Carbon Cycle in the Americas, Front. Environ. Sci., 6, 109, <a href="https://doi.org/10.3389/fenvs.2018.00109" target="_blank">https://doi.org/10.3389/fenvs.2018.00109</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Nguyen and Hobbs(2020)</label><mixed-citation>
      
Nguyen, H. and Hobbs, J.: Intercomparison of Remote Sensing Retrievals: An  Examination of Prior-Induced Biases in Averaging Kernel Corrections, Remote  Sensing, 12, 3239, <a href="https://doi.org/10.3390/rs12193239" target="_blank">https://doi.org/10.3390/rs12193239</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>O'Dell et al.(2012)</label><mixed-citation>
      
O'Dell, C. W., Connor, B., Bösch, H., O'Brien, D., Frankenberg, C., Castano, R., Christi, M., Eldering, D., Fisher, B., Gunson, M., McDuffie, J., Miller, C. E., Natraj, V., Oyafuso, F., Polonsky, I., Smyth, M., Taylor, T., Toon, G. C., Wennberg, P. O., and Wunch, D.: The ACOS CO<sub>2</sub> retrieval algorithm – Part 1: Description and validation against synthetic observations, Atmos. Meas. Tech., 5, 99–121, <a href="https://doi.org/10.5194/amt-5-99-2012" target="_blank">https://doi.org/10.5194/amt-5-99-2012</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>O'Dell et al.(2018)</label><mixed-citation>
      
O'Dell, C. W., Eldering, A., Wennberg, P. O., Crisp, D., Gunson, M. R., Fisher, B., Frankenberg, C., Kiel, M., Lindqvist, H., Mandrake, L., Merrelli, A., Natraj, V., Nelson, R. R., Osterman, G. B., Payne, V. H., Taylor, T. E., Wunch, D., Drouin, B. J., Oyafuso, F., Chang, A., McDuffie, J., Smyth, M., Baker, D. F., Basu, S., Chevallier, F., Crowell, S. M. R., Feng, L., Palmer, P. I., Dubey, M., García, O. E., Griffith, D. W. T., Hase, F., Iraci, L. T., Kivi, R., Morino, I., Notholt, J., Ohyama, H., Petri, C., Roehl, C. M., Sha, M. K., Strong, K., Sussmann, R., Te, Y., Uchino, O., and Velazco, V. A.: Improved retrievals of carbon dioxide from Orbiting Carbon Observatory-2 with the version 8 ACOS algorithm, Atmos. Meas. Tech., 11, 6539–6576, <a href="https://doi.org/10.5194/amt-11-6539-2018" target="_blank">https://doi.org/10.5194/amt-11-6539-2018</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Owhadi and Yoo(2019)</label><mixed-citation>
      
Owhadi, H. and Yoo, G. R.: Kernel Flows: From learning kernels from data into  the abyss, J. Comput. Phys., 389, 22–47, <a href="https://doi.org/10.1016/j.jcp.2019.03.040" target="_blank">https://doi.org/10.1016/j.jcp.2019.03.040</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Palmer et al.(2019)</label><mixed-citation>
      
Palmer, P. I., Feng, L., Baker, D., Chevallier, F., Bösch, H., and Somkuti, P.: Net carbon emissions from African biosphere dominate pan-tropical atmospheric CO<sub>2</sub> signal, Nat. Commun., 10, 3344,  <a href="https://doi.org/10.1038/s41467-019-11097-w" target="_blank">https://doi.org/10.1038/s41467-019-11097-w</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Patil et al.(2022)</label><mixed-citation>
      
Patil, P., Kuusela, M., and Hobbs, J.: Objective Frequentist Uncertainty  Quantification for Atmospheric CO<sub>2</sub> Retrievals, SIAM/ASA Journal on  Uncertainty Quantification, 10, 827–859, <a href="https://doi.org/10.1137/20M1356403" target="_blank">https://doi.org/10.1137/20M1356403</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Patra et al.(2007)</label><mixed-citation>
      
Patra, P., Crisp, D., Kaiser, J., Wunch, D., Saeki, T., Ichii, K., Sekiya,  T., Wennberg, P., Feist, D., Pollard, D., Griffith, D., Velazco, V.,  Maziere, M., Sha, M., Roehl, C., Chatterjee, A., and Ishijima, K.: The  Orbiting Carbon Observatory (OCO-2) tracks 2–3 peta-gram increase in carbon  release to the atmosphere during the 2014–2016 El Niño, Sci. Rep., 7, 13567,  <a href="https://doi.org/10.1038/s41598-017-13459-0" target="_blank">https://doi.org/10.1038/s41598-017-13459-0</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Peiro et al.(2022)</label><mixed-citation>
      
Peiro, H., Crowell, S., Schuh, A., Baker, D. F., O'Dell, C., Jacobson, A. R., Chevallier, F., Liu, J., Eldering, A., Crisp, D., Deng, F., Weir, B., Basu, S., Johnson, M. S., Philip, S., and Baker, I.: Four years of global carbon cycle observed from the Orbiting Carbon Observatory 2 (OCO-2) version 9 and in situ data and comparison to OCO-2 version 7, Atmos. Chem. Phys., 22, 1097–1130, <a href="https://doi.org/10.5194/acp-22-1097-2022" target="_blank">https://doi.org/10.5194/acp-22-1097-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Peters et al.(2007)</label><mixed-citation>
      
Peters, W., Jacobson, A. R., Sweeney, C., Andrews, A. E., Conway, T. J.,  Masarie, K., Miller, J. B., Bruhwiler, L. M. P., Pétron, G., Hirsch, A. I., Worthy, D. E. J., van der Werf, G. R., Randerson, J. T., Wennberg, P. O., Krol, M. C., and Tans, P. P.: An Atmospheric Perspective on North American Carbon Dioxide Exchange: CarbonTracker, P. Natl. Acad. Sci. USA, 104, 18925–18930, <a href="https://doi.org/10.1073/pnas.0708986104" target="_blank">https://doi.org/10.1073/pnas.0708986104</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Press et al.(1992)</label><mixed-citation>
      
Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T.:  Numerical Recipes in FORTRAN 77: The Art of Scientific Computing, Cambridge  University Press, 2nd edn., ISBN&thinsp;052143064X, 1992.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Ran and Li(2019)</label><mixed-citation>
      
Ran, Y. and Li, X.: TanSat: a new star in global carbon monitoring from China, Sci. Bull., 64, 284–285, <a href="https://doi.org/10.1016/j.scib.2019.01.019" target="_blank">https://doi.org/10.1016/j.scib.2019.01.019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Rasmussen and Williams(2006)</label><mixed-citation>
      
Rasmussen, C. and Williams, C.: Gaussian Processes for Machine Learning,  Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA,  <a href="http://gaussianprocess.org/gpml/" target="_blank">http://gaussianprocess.org/gpml/</a> (last access: 31 January 2025), 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Rodgers(2004)</label><mixed-citation>
      
Rodgers, C. D.: Inverse Methods for Atmospheric Sounding: Theory and Practice, World Scientific Publishing Co. Pte. Ltd., Singapore 596224, Reprint edn., ISBN&thinsp;981022740X, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Rosenberg et al.(2017)</label><mixed-citation>
      
Rosenberg, R., Maxwell, S., Johnson, B. C., Chapsky, L., Lee, R. A. M., and  Pollock, R.: Preflight Radiometric Calibration of Orbiting Carbon Observatory  2, IEEE T. Geosci. Remote, 55, 1994–2006, <a href="https://doi.org/10.1109/TGRS.2016.2634023" target="_blank">https://doi.org/10.1109/TGRS.2016.2634023</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Schimel et al.(2015)</label><mixed-citation>
      
Schimel, D., Pavlick, R., Fisher, J. B., Asner, G. P., Saatchi, S., Townsend,  P., Miller, C., Frankenberg, C., Hibbard, K., and Cox, P.: Observing  terrestrial ecosystems and the carbon cycle from space, Glob. Change Biol., 21, 1762–1776, <a href="https://doi.org/10.1111/gcb.12822" target="_blank">https://doi.org/10.1111/gcb.12822</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>Sierk et al.(2019)</label><mixed-citation>
      
Sierk, B., Bézy, J.-L., Löscher, A., and Meijer, Y.: The European CO<sub>2</sub> Monitoring Mission: observing anthropogenic greenhouse gas emissions from  space, in: International Conference on Space Optics – ICSO 2018, Chania, Greece, 9–12 October 2018, edited by: Sodnik, Z., Karafolas, N., and Cugny, B., International Society for Optics and Photonics, SPIE, 11180, p. 111800M, <a href="https://doi.org/10.1117/12.2535941" target="_blank">https://doi.org/10.1117/12.2535941</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>Sobol(1967)</label><mixed-citation>
      
Sobol, I.: Distribution of Points in a Cube and Approximate Evaluation of   Integrals, Zh. Vych. Mat. Mat. Fiz., 7, 784–802, 1967.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>Stein(1999)</label><mixed-citation>
      
Stein, M. L.: Interpolation of spatial data: some theory for kriging, Springer Science &amp; Business Media, <a href="https://doi.org/10.1007/978-1-4612-1494-6" target="_blank">https://doi.org/10.1007/978-1-4612-1494-6</a>, 1999.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>Stewart(1998)</label><mixed-citation>
      
Stewart, G. W.: Matrix algorithms, Volume I: Basic Decompositions, SIAM (Society for Industrial and Applied Mathematics), i–xix, <a href="https://doi.org/10.1137/1.9781611971408.fm" target="_blank">https://doi.org/10.1137/1.9781611971408.fm</a>, 1998.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>Tukiainen et al.(2016)</label><mixed-citation>
      
Tukiainen, S., Railo, J., Laine, M., Hakkarainen, J., Kivi, R., Heikkinen, P., Chen, H., and Tamminen, J.: Retrieval of atmospheric CH<sub>4</sub> profiles from  Fourier transform infrared data using dimension reduction and MCMC, J. Geophys. Res.-Atmos., 121, 10312–10327, <a href="https://doi.org/10.1002/2015JD024657" target="_blank">https://doi.org/10.1002/2015JD024657</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>Vecchia(1988)</label><mixed-citation>
      
Vecchia, A. V.: Estimation and Model Identification for Continuous Spatial  Processes, J. Roy. Stat. Soc. B Met., 50, 297–312, <a href="http://www.jstor.org/stable/2345768" target="_blank">http://www.jstor.org/stable/2345768</a> (last access: 31 January 2025), 1988.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>Veefkind et al.(2012)</label><mixed-citation>
      
Veefkind, J., Aben, I., McMullan, K., Förster, H., de Vries, J., Otter, G., Claas, J., Eskes, H., de Haan, J., Kleipool, Q., van Weele, M., Hasekamp, O., Hoogeveen, R., Landgraf, J., Snel, R., Tol, P., Ingmann, P., Voors, R., Kruizinga, B., Vink, R., Visser, H., and Levelt, P.: TROPOMI on the ESA Sentinel-5 Precursor: A GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications, Remote Sens. Environ., 120, 70–83, <a href="https://doi.org/10.1016/j.rse.2011.09.027" target="_blank">https://doi.org/10.1016/j.rse.2011.09.027</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>Wu et al.(2023)</label><mixed-citation>
      
Wu, K., Yang, D., Liu, Y., Cai, Z., Zhou, M., Feng, L., and Palmer, P. I.: Evaluating the Ability of the Pre-Launch TanSat-2 Satellite to Quantify Urban CO<sub>2</sub> Emissions, Remote Sensing, 15, 4904, <a href="https://doi.org/10.3390/rs15204904" target="_blank">https://doi.org/10.3390/rs15204904</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>Wunch et al.(2017)</label><mixed-citation>
      
Wunch, D., Wennberg, P. O., Osterman, G., Fisher, B., Naylor, B., Roehl, C. M., O'Dell, C., Mandrake, L., Viatte, C., Kiel, M., Griffith, D. W. T., Deutscher, N. M., Velazco, V. A., Notholt, J., Warneke, T., Petri, C., De Maziere, M., Sha, M. K., Sussmann, R., Rettinger, M., Pollard, D., Robinson, J., Morino, I., Uchino, O., Hase, F., Blumenstock, T., Feist, D. G., Arnold, S. G., Strong, K., Mendonca, J., Kivi, R., Heikkinen, P., Iraci, L., Podolske, J., Hillyard, P. W., Kawakami, S., Dubey, M. K., Parker, H. A., Sepulveda, E., García, O. E., Te, Y., Jeseck, P., Gunson, M. R., Crisp, D., and Eldering, A.: Comparisons of the Orbiting Carbon Observatory-2 (OCO-2) XCO<sub>2</sub> measurements with TCCON, Atmos. Meas. Tech., 10, 2209–2238, <a href="https://doi.org/10.5194/amt-10-2209-2017" target="_blank">https://doi.org/10.5194/amt-10-2209-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>Zhu et al.(2022)</label><mixed-citation>
      
Zhu, X., Huang, L., Ibrahim, C., Lee, E. H., and Bindel, D.: Scalable Bayesian Transformed Gaussian Processes, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2210.10973" target="_blank">https://doi.org/10.48550/arXiv.2210.10973</a>, 20 October 2022.

    </mixed-citation></ref-html>--></article>
