<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">AMT</journal-id><journal-title-group>
    <journal-title>Atmospheric Measurement Techniques</journal-title>
    <abbrev-journal-title abbrev-type="publisher">AMT</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Atmos. Meas. Tech.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1867-8548</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/amt-19-3511-2026</article-id><title-group><article-title>A guide to optimised spatiotemporal data co-location by mutual information maximisation</article-title><alt-title>Optimised co-location with mutual information</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1 aff2">
          <name><surname>Martin</surname><given-names>Andrew Steven</given-names></name>
          <email>eeasm@leeds.ac.uk</email>
        <ext-link>https://orcid.org/0000-0003-2464-8808</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1 aff2">
          <name><surname>Guy</surname><given-names>Heather</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-3525-0766</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff3 aff4">
          <name><surname>Gallagher</surname><given-names>Michael Ray</given-names></name>
          
        </contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1 aff2">
          <name><surname>Neely III</surname><given-names>Ryan Reynolds</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-4560-4812</ext-link></contrib>
        <aff id="aff1"><label>1</label><institution>School of Earth and Environment, University of Leeds, Leeds, UK</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>National Centre for Atmospheric Science, Leeds, UK</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, Colorado, USA</institution>
        </aff>
        <aff id="aff4"><label>4</label><institution>NOAA Physical Sciences Laboratory, Boulder, Colorado, USA</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Andrew Steven Martin (eeasm@leeds.ac.uk)</corresp></author-notes><pub-date><day>27</day><month>May</month><year>2026</year></pub-date>
      
      <volume>19</volume>
      <issue>10</issue>
      <fpage>3511</fpage><lpage>3537</lpage>
      <history>
        <date date-type="received"><day>5</day><month>December</month><year>2025</year></date>
           <date date-type="rev-request"><day>17</day><month>December</month><year>2025</year></date>
           <date date-type="rev-recd"><day>12</day><month>May</month><year>2026</year></date>
           <date date-type="accepted"><day>13</day><month>May</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 Andrew Steven Martin et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026.html">This article is available from https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026.html</self-uri><self-uri xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026.pdf">The full text article is available as a PDF file from https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e129">The matching of data described on different coordinate systems between multiple data sources – spatiotemporal co-location – is a necessary and crucial step in geospatial data synthesis and validation. The particular choice of co-location scheme, and the choice of parameters applied to it, decide what subsets of the original datasets are included in downstream analyses, affecting the quantitative outputs of comparison studies and multi-retrieval synthesised datasets. Previously, no generalised framework for deciding how best to co-locate data has existed. We outline a domain- and data-agnostic framework that generalises the process of selecting an optimised co-location parametrisation for a given co-location scheme, by maximising the mutual information encoded between the data included in the subsequent analyses. We demonstrate the framework by applying it to a comparison of vertical cloud fraction profiles retrieved from the polar-orbiting ICESat-2 satellite's ATL09 data product, and surface-based observations at four Cloudnet observatories. We evaluate per-site optimised co-location parametrisations and find that using the optimised co-location parametrisations quantitatively improves the comparison between the datasets over naive choices of co-location parameters. This work has implications across almost all remote sensing data products – especially for satellite validations – and will facilitate deep learning methodologies by producing paired datasets with the maximal information about the structure between datasets available to be learned.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Natural Environment Research Council</funding-source>
<award-id>NE/T00939X/1</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

      
<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e143">Remote sensing data, obtained from Earth observation satellites and surface-based observatories, are invaluable for furthering our understanding of Earth-system processes, for the validation and constraining of models, and for making observations in remote locations or locations with extreme conditions.</p>
      <p id="d2e146">Particularly for satellite data, rigorous validation and a formal uncertainty characterisation are essential for subsequent use of the data. In order to validate satellite data, we need to compare it against reference measurements <xref ref-type="bibr" rid="bib1.bibx24" id="paren.1"/>. Rarely are the reference measurements described on the same set of coordinates as the data to be validated. Inter-comparison of remote sensing retrievals and multi-sensor data synthesis are subject to similar challenges.</p>
      <p id="d2e152">Ideally when comparing data from two different sources, the observations are made simultaneously, and are sensitive to the same spatial volume. However, observations from different platforms will have different viewing geometries, such that the sensitivities across the same observed spatial volume differ between the data sources. This induces a difference between the measurements often referred to as the smoothing error <xref ref-type="bibr" rid="bib1.bibx48" id="paren.2"><named-content content-type="pre">e.g.</named-content></xref>. Furthermore, the observations from different sources can measure distinct physical volumes that are spatiotemporally displaced from each other. This induces a bias commonly referred to as the co-location mismatch <xref ref-type="bibr" rid="bib1.bibx60 bib1.bibx62" id="paren.3"><named-content content-type="pre">e.g.</named-content></xref>. <xref ref-type="bibr" rid="bib1.bibx60" id="text.4"><named-content content-type="post">Fig. 1</named-content></xref> and <xref ref-type="bibr" rid="bib1.bibx24" id="text.5"><named-content content-type="post">Fig. 2</named-content></xref> both provide good representations of the issues of co-locating measurements.</p>
      <p id="d2e175">In order to compare data recorded on different sets of coordinates, we need to perform spatiotemporal co-location. We define spatiotemporal co-location as the process of matching data between two or more data sources, described on different sets of coordinates, such that discrete co-location events can be defined. For a given co-location event, the data associated with it from the different data sources is considered sufficiently close in time and space to be directly comparable once the data have been homogenised <xref ref-type="bibr" rid="bib1.bibx24" id="paren.6"/>. Often, for a given implementation of spatiotemporal co-location – which we will refer to as a <italic>co-location scheme</italic> (defined in Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>) – there will be parameters for the scheme that change the amount of data permitted by the subsetting operations of the co-location process. Once data have been co-located and co-location events have been identified, formal uncertainty characterisation can be performed, or other comparison metrics such as the bias, RMSE and correlation coefficients can be calculated <xref ref-type="bibr" rid="bib1.bibx61" id="paren.7"><named-content content-type="pre">e.g.</named-content></xref>.</p>
      <p id="d2e192">A good spatiotemporal co-location requires that there are sufficient co-location events for the subsequent analysis to be viable. Conversely, the co-location cannot permit so much data that the subsequent analysis is contaminated by comparisons between physically independent sets of observations. Finding a parametrisation for a co-location scheme that balances the need for sufficient data whilst minimising the co-location mismatch between the compared data within a co-location event is the crux of the problem, and is as yet unsolved <xref ref-type="bibr" rid="bib1.bibx19" id="paren.8"/>.</p>
      <p id="d2e198">As an example, when comparing data between a satellite and a surface-based observatory, a simple and often-used co-location scheme is to generate co-location events when the satellite measurement footprint falls within some along-ground distance <inline-formula><mml:math id="M1" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> of the surface-based observatory, and to subset the surface-based data with a temporal window of duration <inline-formula><mml:math id="M2" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, centred on the time of closest approach of the satellite to the observatory. By doing this, individual co-location events (often described as <italic>overpasses</italic>) are small segments of a single orbit from the satellite, and the data are often averaged along-track to obtain a single vertical profile or scalar value that can be compared against temporally averaged data from the surface-based observatory <xref ref-type="bibr" rid="bib1.bibx1 bib1.bibx2 bib1.bibx22 bib1.bibx25 bib1.bibx26 bib1.bibx28 bib1.bibx34 bib1.bibx40 bib1.bibx41 bib1.bibx43 bib1.bibx44 bib1.bibx47 bib1.bibx51" id="paren.9"><named-content content-type="pre">e.g.</named-content></xref>. Each study using this co-location scheme to match data between a satellite and a surface-based observatory must choose values for <inline-formula><mml:math id="M3" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M4" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, the parameters affecting the spatiotemporal volumes within which data must have been recorded in order to be permitted in subsequent analyses.</p>
      <p id="d2e238">When deciding how data are co-located, care must be taken to ensure the co-location parametrisation is selected independently of the results of the subsequent analysis <xref ref-type="bibr" rid="bib1.bibx63" id="paren.10"/>. Some studies justify the choice of their co-location parametrisations qualitatively <xref ref-type="bibr" rid="bib1.bibx2 bib1.bibx4 bib1.bibx13 bib1.bibx25 bib1.bibx32 bib1.bibx43 bib1.bibx47 bib1.bibx50" id="paren.11"><named-content content-type="pre">e.g.</named-content></xref>. Others test the effects of changing the co-location parameters empirically <xref ref-type="bibr" rid="bib1.bibx1 bib1.bibx12 bib1.bibx40 bib1.bibx44" id="paren.12"><named-content content-type="pre">e.g.</named-content></xref>; however, these studies use the comparison metric from the subsequent analysis to justify or inform the choice of co-location parametrisation. The comparison metric is an unknown quantity (hence the need for the analysis in the first place), so by using the value of the comparison metric to inform the co-location parametrisation – which in turn affects the estimate of the comparison metric itself – a prior expectation is effectively applied to the comparison metric in the analysis.</p>
      <p id="d2e254">For example, the linear correlation coefficient between retrieved values is often used as a comparison and validation metric. Often, the co-location parametrisation that maximises the correlation coefficient is selected <xref ref-type="bibr" rid="bib1.bibx12 bib1.bibx40 bib1.bibx51" id="paren.13"><named-content content-type="pre">e.g.</named-content></xref>. The computed correlation coefficient is an estimate, with bias and variance depending on the data permitted by the co-location. By selecting a co-location parametrisation that maximises the correlation coefficient, we are preferentially selecting results with larger correlation coefficients, which could arise from the variance in the estimation rather than from a genuinely better comparison between the measurements. This is akin to over-fitting models to our data. The outcome is that results can look better than they really are. For example, the uncertainty budgets for retrievals may be underestimated and inferred biases could be too small in magnitude.</p>
      <p id="d2e262">As such, an independent metric for assessing the quality of data co-location is necessary. By treating measurements as samples drawn from probability distributions of underlying geophysical fields, and paired measurements within a co-location event as being drawn from a joint probability distribution, we propose the mutual information between data within co-location events as an appropriate metric to assess the quality of spatiotemporal co-locations. Mutual information balances the requirements for sufficiently sampling the available system states such that we can infer relationships between the data, whilst being sensitive to the inclusion of comparisons between physically independent samples.</p>
      <p id="d2e265">From this, we outline a generalised framework for evaluating optimised co-location parametrisations by maximising the mutual information between data to be compared. For any co-location scheme that can be parametrised with a finite number of parameters (Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>), the mutual information (Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>) is estimated between the data permitted by the co-location (Sect. <xref ref-type="sec" rid="Ch1.S2.SS3"/>), and the optimised co-location parametrisation is selected as the parametrisation that maximises the mutual information (Sect. <xref ref-type="sec" rid="Ch1.S2.SS4"/>). We demonstrate the framework by applying it to a comparison between the ICESat-2 ATL09 cloud layer product and macrophysical cloud products derived from four Cloudnet observatories (Sect. <xref ref-type="sec" rid="Ch1.S3"/>).</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Framework</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Framework definitions</title>
      <p id="d2e293">The outcome of this framework is the co-location of data where the information shared between the two retrievals is maximised. As described in <xref ref-type="bibr" rid="bib1.bibx24" id="text.14"/>, before validation metrics between datasets can be calculated, three key steps must be performed: quality checks, to ensure the data being compared are realistic, self-consistent, and reasonably well characterised; spatiotemporal co-location, ensuring that the data being compared represent sufficiently similar measurements in time and space; and homogenisation, whereby any further transformations to the data (unit conversions, temporal or spatial aggregations, etc.) are applied, allowing like-to-like comparisons to be made between the homogenised data. In this framing, both the quality checks and co-location processes act to subset the data, permitting only observations that meet both the quality requirements and co-location criteria to be used in the homogenisation process. Throughout this work, we will refer to co-location schemes, criteria, parametrisations and events. <list list-type="order"><list-item>
      <p id="d2e301"><italic>Co-location scheme.</italic> The method by which data from two or more data sources are matched with each other. Some example co-location schemes are outlined in Fig. <xref ref-type="fig" rid="F1"/>.</p></list-item><list-item>
      <p id="d2e309"><italic>Co-location criteria.</italic> The logical statements that implement a given co-location scheme. The co-location criteria often take the form of inequalities, with data satisfying the inequalities being included in the analysis.</p></list-item><list-item>
      <p id="d2e315"><italic>Co-location parametrisation.</italic> A vector specifying the values of variable parameters used in the co-location criteria. These can be described by a general parametrisation vector, <inline-formula><mml:math id="M5" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M6" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M7" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>M</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M8" display="inline"><mml:mi>M</mml:mi></mml:math></inline-formula> is the number of values required to fully describe the co-location scheme.</p></list-item><list-item>
      <p id="d2e368"><italic>Co-location event.</italic> A discrete unit of matched homogenised data between the co-located data sources that simultaneously satisfy all of the co-location criteria.</p></list-item></list></p>

      <fig id="F1"><label>Figure 1</label><caption><p id="d2e375">Example realisations of spatial co-location schemes between data of different spatial dimensionalities. <bold>(a)</bold> A point-line co-location, where the data falling within a distance <inline-formula><mml:math id="M9" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> of the point data source is utilised from the line source. <bold>(b)</bold> A point-area co-location, where pixels whose centres fall within a distance <inline-formula><mml:math id="M10" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> of the point data source are used. <bold>(c)</bold> A line-line co-location where data falling within a distance <inline-formula><mml:math id="M11" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> of the crossing point between the lines is used. <bold>(d)</bold> A line-area co-location, where a minimum path length, <inline-formula><mml:math id="M12" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula>, must be traced within each pixel in order for the pixel to be used in the analysis. In panels <bold>(b)</bold> and <bold>(d)</bold>, the pixels highlighted in bold (red) are those selected to remain in the homogenisation process. Each spatial co-location scheme will also be paired with a temporal co-location scheme. References for each co-location scheme are provided in Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f01.png"/>

        </fig>

      <p id="d2e433">The framework requires that the co-location criteria for a given co-location scheme can be described by a finite number of parameters, a condition met by all realistic co-location schemes. The co-location scheme <italic>can</italic> be arbitrary, but it should be physically motivated to achieve better results. Figure <xref ref-type="fig" rid="F1"/> shows four possible co-location schemes between data of different dimensionalities. Panel (a) shows a scheme for co-locating satellite swath data and point-like surface-based observations, as was described in the introduction <xref ref-type="bibr" rid="bib1.bibx1 bib1.bibx2 bib1.bibx4 bib1.bibx22 bib1.bibx23 bib1.bibx25 bib1.bibx32 bib1.bibx40 bib1.bibx43 bib1.bibx44 bib1.bibx47 bib1.bibx51" id="paren.15"><named-content content-type="pre">e.g.</named-content></xref>. Panel (b) instead shows a possible scheme for matching data between a 2-dimensional source (e.g. a grid of pixels) and a point source of data <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx10 bib1.bibx12 bib1.bibx13 bib1.bibx39 bib1.bibx46 bib1.bibx49" id="paren.16"><named-content content-type="pre">e.g.</named-content></xref>. Panel (c) shows a possible co-location scheme between two 1-dimensional data sources – possibly two satellite ground tracks <xref ref-type="bibr" rid="bib1.bibx44 bib1.bibx64" id="paren.17"><named-content content-type="pre">e.g.</named-content></xref>. Data are subset according to whether they fall within a circle of radius <inline-formula><mml:math id="M13" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, common to both data sources and centred on the location where the paths cross.
A second criterion requires that the time difference between the two data sources being at the crossing point is no greater than some upper bound <inline-formula><mml:math id="M14" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>. Let <inline-formula><mml:math id="M15" display="inline"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> be the distances of data from source <inline-formula><mml:math id="M16" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> to the crossing point, and <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> be the times associated with data from source <inline-formula><mml:math id="M18" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> being at the crossing point. The co-location scheme then leads to the co-location criteria of <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>≤</mml:mo><mml:mi>R</mml:mi></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>|</mml:mo><mml:mo>≤</mml:mo><mml:mi mathvariant="italic">τ</mml:mi></mml:mrow></mml:math></inline-formula>.
The co-location criteria can be described by the parametrisation <inline-formula><mml:math id="M21" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M22" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Co-location events consist of paired homogenised data between the two satellites where their orbital paths intersected within a time <inline-formula><mml:math id="M24" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, and the data are spatially subset within the circle of radius <inline-formula><mml:math id="M25" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>.</p>
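      <p>As an illustration only, the two criteria of the line-line scheme above can be sketched in a few lines of Python. The function name <monospace>colocate_line_line</monospace>, its argument names and the numerical values are hypothetical and are not taken from any published implementation; distances and times are assumed to share the units of R and τ respectively.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np


def colocate_line_line(r1, r2, t1, t2, R, tau):
    """Apply r_i <= R and |t1 - t2| <= tau to one candidate crossing.

    r1, r2 : distances of each sample of source 1 and 2 from the crossing point
    t1, t2 : times at which source 1 and 2 are at the crossing point
    R, tau : the co-location parametrisation p = (R, tau)
    Returns one boolean mask per source marking the permitted samples.
    """
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    # Temporal criterion: a single test per crossing.
    if abs(t1 - t2) > tau:
        return np.zeros(r1.shape, dtype=bool), np.zeros(r2.shape, dtype=bool)
    # Spatial criterion: applied sample by sample to each source.
    return r1 <= R, r2 <= R


# Hypothetical values: distances in km, times in hours.
mask1, mask2 = colocate_line_line(
    r1=[5.0, 20.0, 60.0], r2=[10.0, 45.0], t1=12.0, t2=12.5, R=50.0, tau=1.0
)
```
]]></preformat>
      <p>In this sketch a crossing that fails the temporal criterion contributes no data at all, whereas the spatial criterion subsets each source separately. Changing the values of R and τ changes which samples survive, which is the degree of freedom the framework sets out to optimise.</p>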
      <p id="d2e595">Although the above co-location schemes are basic and naive to the underlying physical processes that govern the spatiotemporal gradients of the measurands, our knowledge of the underlying processes can be encoded into the co-location scheme through the inclusion of additional co-location criteria and higher dimensional co-location parametrisations. For example, if co-locating atmospheric data between a satellite and ground-based station, the co-location scheme in Fig. <xref ref-type="fig" rid="F1"/>a could be augmented with logical criteria on ancillary wind data, encoding our expectations of how local advection would affect the spatial distribution of (in)dependent samples between data sources.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Mutual information</title>
      <p id="d2e608">In this framework, we treat the underlying physical state as being independent between co-location events. Thus, we treat the underlying physical state of the system being measured as a random variable drawn from the distribution of all plausible system states. Measurements are affected by this randomness, as well as by other confounding variability arising from, for example, co-location mismatch and detector noise. As such, pairs of measurements within a co-location event should be related by a joint probability distribution that accounts for the distribution of system states and the additional variability. If the measurements being made are not independent, the joint probability distribution will have some non-independent structure that can be used to inform us about the relationship between the measurements. Mutual information is a concept derived from information theory <xref ref-type="bibr" rid="bib1.bibx54" id="paren.18"><named-content content-type="pre">e.g.</named-content></xref> that we use as a quantitative metric to assess the quality of the relationship between sets of retrievals when co-locating the data with different co-location parametrisations. <xref ref-type="bibr" rid="bib1.bibx58" id="text.19"/> provides a good introduction to the concepts of information theory.</p>
      <p id="d2e619"><xref ref-type="bibr" rid="bib1.bibx35" id="text.20"/> describes the entropy of a variable <inline-formula><mml:math id="M26" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:mtext>H</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, as a measure of our ignorance about the variable <inline-formula><mml:math id="M28" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> prior to observation, and as the average quantity of information we stand to learn by measuring <inline-formula><mml:math id="M29" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>.

            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M30" display="block"><mml:mrow><mml:mtext>H</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="double-struck">E</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mfenced close="]" open="["><mml:mrow><mml:mi>log⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M31" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the probability of the variable having a given value, and <inline-formula><mml:math id="M32" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="double-struck">E</mml:mi><mml:mi>X</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is an expectation value taken over all possible values of <inline-formula><mml:math id="M33" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>. <xref ref-type="bibr" rid="bib1.bibx35" id="text.21"/> describes mutual information between two random variables <inline-formula><mml:math id="M34" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M35" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> as the expected reduction in our ignorance of the possible values of <inline-formula><mml:math id="M36" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>, given our knowledge of the value of <inline-formula><mml:math id="M37" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>. Mutual information is expressed as

            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M38" display="block"><mml:mrow><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="double-struck">E</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi></mml:mrow></mml:msub><mml:mfenced open="[" close="]"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mfenced close=")" open="("><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the joint probability of the outcome of <inline-formula><mml:math id="M40" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M41" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> occurring simultaneously, <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are the marginal distributions of <inline-formula><mml:math id="M44" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M45" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> respectively, and <inline-formula><mml:math id="M46" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="double-struck">E</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> represents an expectation over all possible pairs of <inline-formula><mml:math id="M47" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M48" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>. 
Mutual information has units depending on the base <inline-formula><mml:math id="M49" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> of the logarithm used in the information theoretic equations, with <inline-formula><mml:math id="M50" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M51" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M52" display="inline"><mml:mn mathvariant="normal">2</mml:mn></mml:math></inline-formula> giving units of bits, and <inline-formula><mml:math id="M53" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M54" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M55" display="inline"><mml:mi>e</mml:mi></mml:math></inline-formula> giving units of nats. Converting between information theoretic units consists of linearly scaling the values; for example, a value in nats is divided by ln 2 to give a value in bits.</p>
      <p id="d2e990">In our framework, <inline-formula><mml:math id="M56" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M57" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are the retrievals or measurements of underlying physical quantities that we wish to compare or relate. In the limiting case that <inline-formula><mml:math id="M58" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M59" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are independent, we obtain <inline-formula><mml:math id="M60" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M61" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>) yields a value of I <inline-formula><mml:math id="M63" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0. Otherwise, the mutual information encoded between the measurements will be positive, with larger values indicating greater reductions in our ignorance of the pair of measured values given access to one of the values.</p>
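      <p>As a minimal numerical illustration of Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>), the following Python sketch evaluates the mutual information of a discrete joint probability table. The function name <monospace>mutual_information</monospace> and the two example tables are illustrative assumptions and are not part of the analysis presented here. The independent table should give a value of zero, and the one-to-one table should give ln 2 nats, or equivalently 1 bit.</p>
      <preformat preformat-type="code">
```python
import math

def mutual_information(joint, base=math.e):
    """Mutual information of a discrete joint probability table joint[i][j]."""
    px = [sum(row) for row in joint]          # marginal p(X)
    py = [sum(col) for col in zip(*joint)]    # marginal p(Y)
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                     # terms with p(x, y) = 0 contribute nothing
                total += pxy * math.log(pxy / (px[i] * py[j]), base)
    return total

# Independent variables: the joint table is the product of the marginals.
independent = [[0.25, 0.25], [0.25, 0.25]]
# One-to-one mapping: knowing X fixes Y.
one_to_one = [[0.5, 0.0], [0.0, 0.5]]

print(mutual_information(independent))         # zero
print(mutual_information(one_to_one))          # ln 2 nats
print(mutual_information(one_to_one, base=2))  # 1 bit
```
</preformat>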
      <p id="d2e1078">When co-locating data from different sources for comparison, we want to include the data that best characterise the relationship between the sources. The best possible relationship between our data is a one-to-one mapping between values <inline-formula><mml:math id="M64" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M65" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>. In this case, knowledge of one variable fully determines the value of the other. This case minimises our ignorance of the joint value of the two retrievals, which is equivalent to maximising the mutual information between the retrievals.</p>
      <p id="d2e1096">This is shown in Fig. <xref ref-type="fig" rid="F2"/>. Two variables, <inline-formula><mml:math id="M66" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M67" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, each with a fixed marginal distribution (Fig. <xref ref-type="fig" rid="F2"/>a–c), take on two distinct joint probability distributions (Fig. <xref ref-type="fig" rid="F2"/>d–e). Figure <xref ref-type="fig" rid="F2"/>d shows <inline-formula><mml:math id="M68" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M69" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> as being independent variables. The probability density is distributed throughout the space, and the probability of <inline-formula><mml:math id="M70" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> given any value of <inline-formula><mml:math id="M71" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> simply follows the marginal distribution <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. In this case, learning the value of <inline-formula><mml:math id="M73" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> yields no new information about the possible value of <inline-formula><mml:math id="M74" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>, and the mutual information is estimated to be near zero. 
The estimate <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M76" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M77" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>0.002 nats is slightly negative, which is possible because the value is empirically estimated rather than computed from the true distributions (see Sect. <xref ref-type="sec" rid="Ch1.S2.SS3"/>). In Fig. <xref ref-type="fig" rid="F2"/>e, <inline-formula><mml:math id="M78" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M79" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> have a strong non-linear dependency. If we know the value of <inline-formula><mml:math id="M80" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, our ignorance about the possible values of <inline-formula><mml:math id="M81" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> decreases substantially, as <inline-formula><mml:math id="M82" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> almost certainly falls on the manifold mapping <inline-formula><mml:math id="M83" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> to <inline-formula><mml:math id="M84" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>. 
There is obvious structure in the joint probability distribution that can be learned, and as a result, the mutual information estimate <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M86" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1.965 nats is higher when <inline-formula><mml:math id="M87" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M88" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are dependent than when they are independent.</p>

      <fig id="F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e1299">Two synthetically generated variables <inline-formula><mml:math id="M89" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M90" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> with fixed marginal distributions <bold>(a, b, c)</bold> form two different joint probability distributions <bold>(d, e)</bold>. <inline-formula><mml:math id="M91" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M92" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are independent in panel <bold>(d)</bold>, but have a strong non-linear dependence in panel <bold>(e)</bold>. When there is dependence between the variables, the probability density is spread across fewer possible states. Mutual information, I<sub>KSG</sub> (given below <bold>d</bold> and <bold>e</bold>), captures this structure, increasing as the individual variables encode more information about the joint distribution.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f02.png"/>

        </fig>

      <p id="d2e1364">It is of note that the mutual information between the data <inline-formula><mml:math id="M94" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M95" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> is invariant under reversible transformations <inline-formula><mml:math id="M96" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M97" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M98" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M99" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M100" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M101" display="inline"><mml:mrow><mml:msup><mml:mi>Y</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>. 
More generally, it can be shown through the data processing inequality <xref ref-type="bibr" rid="bib1.bibx6 bib1.bibx42" id="paren.22"><named-content content-type="pre">e.g.</named-content></xref> that if <inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> is computed independently of <inline-formula><mml:math id="M103" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, or <inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:msup><mml:mi>Y</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> is computed independently of <inline-formula><mml:math id="M105" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>, the post-processing steps cannot increase the mutual information between <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:msup><mml:mi>Y</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>. That is,

            <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M108" display="block"><mml:mrow><mml:mtext>I</mml:mtext><mml:mfenced open="(" close=")"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>;</mml:mo><mml:msup><mml:mi>Y</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:mfenced><mml:mo>≤</mml:mo><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          with equality holding only if the transformations <inline-formula><mml:math id="M109" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M110" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M111" display="inline"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M112" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M113" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:msup><mml:mi>Y</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> are both reversible. Thus, when computing mutual information between homogenised geophysical variables, the number of irreversible post-processing corrections that a typical analysis would apply to the data should be minimised. For example, range corrections to lidar backscatter data are reversible, as they consist of multiplication by a fixed factor for each data point. These types of range corrections can therefore be applied without affecting the mutual information. A noise reduction process such as a rolling-window average, however, is irreversible, and so would act to reduce the mutual information between the two data sources.</p>
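      <p>The behaviour described by Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>) can be checked on a small discrete example. The sketch below is an illustration under simple assumptions: <inline-formula><mml:math display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> is a noiseless copy of a four-state variable <inline-formula><mml:math display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula>, the reversible transformation is an arbitrary linear rescaling, and the irreversible transformation merges pairs of states (a stand-in for any lossy step, not the rolling-window average itself). The function and variable names are hypothetical.</p>
      <preformat preformat-type="code">
```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (nats) of paired discrete samples."""
    n = len(xs)
    cxy = Counter(zip(xs, ys))
    cx, cy = Counter(xs), Counter(ys)
    return sum((c / n) * math.log(c * n / (cx[x] * cy[y]))
               for (x, y), c in cxy.items())

# Y is a noiseless copy of X, which takes four equally likely values.
x = [0, 1, 2, 3] * 25
y = list(x)

x_reversible = [3 * v + 7 for v in x]   # bijective, so it can be undone
x_lossy = [v // 2 for v in x]           # merges states, so it cannot be undone

mi_original = mutual_information(x, y)
mi_reversible = mutual_information(x_reversible, y)
mi_lossy = mutual_information(x_lossy, y)

print(mi_original)     # ln 4 nats
print(mi_reversible)   # unchanged by the reversible transformation
print(mi_lossy)        # reduced to ln 2 nats by the lossy transformation
```
</preformat>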
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Mutual information estimation</title>
      <p id="d2e1594">The definition for mutual information given in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>) requires full knowledge of the joint probability distribution between the variables in question. We have incomplete knowledge of the marginal and joint probability distributions from which our measurements are sampled. Thus, we need to estimate mutual information, and the estimator must be able to handle a finite number of continuous valued samples as input. Problems in the Earth sciences may require comparison of multiple variables simultaneously, or of vector quantities. Thus, mutual information estimators that also handle higher dimensional samples from probability distributions are preferable.</p>
      <p id="d2e1599">One method to estimate the mutual information is to discretise the measurements and produce a histogram approximating the underlying probability distributions <xref ref-type="bibr" rid="bib1.bibx3" id="paren.23"><named-content content-type="pre">e.g.</named-content></xref>, as is demonstrated in Fig. <xref ref-type="fig" rid="F2"/>. This method requires that a sufficient number of joint measurements are made in order to well characterise the joint probability distribution. As mutual information is a measure of the structure of the joint probability distribution, the choice of histogram bins into which the data are discretised is very important. The bins need not be uniform in size, and adapting the bins to the data can decrease the bias and variance of the estimator <xref ref-type="bibr" rid="bib1.bibx9" id="paren.24"/>.</p>
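      <p>A minimal sketch of such a histogram estimator is given below, assuming equal-width bins and a plug-in estimate of the probabilities; the function name <monospace>histogram_mi</monospace> and the synthetic data are illustrative assumptions. Plug-in estimates of this kind are biased upwards for finite samples, so the value obtained for independent data is expected to grow as the bins become finer, which illustrates the sensitivity to the choice of bins.</p>
      <preformat preformat-type="code">
```python
import math
import random
from collections import Counter

def histogram_mi(xs, ys, n_bins):
    """Plug-in mutual information (nats) from an equal-width 2-D histogram."""
    def bin_index(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins
        # Clamp so that the maximum value falls in the last bin.
        return [min(int((v - lo) / width), n_bins - 1) for v in values]

    bx, by = bin_index(xs), bin_index(ys)
    n = len(xs)
    cxy, cx, cy = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y_dependent = [v ** 2 + rng.gauss(0.0, 0.1) for v in x]    # non-linear dependence
y_independent = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Columns: number of bins, estimate for dependent data, estimate for independent data.
for n_bins in (4, 16, 64):
    print(n_bins,
          round(histogram_mi(x, y_dependent, n_bins), 3),
          round(histogram_mi(x, y_independent, n_bins), 3))
```
</preformat>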
      <p id="d2e1612">Another commonly employed method is estimating the mutual information from nearest neighbour distances between samples in the sample space <xref ref-type="bibr" rid="bib1.bibx18 bib1.bibx15" id="paren.25"><named-content content-type="pre">e.g.</named-content></xref>. Regions of the sample space with high probability density will likely be sampled more frequently than regions with lower probability density, resulting in samples drawn with low separations between them, indicating structure in the distribution. <xref ref-type="bibr" rid="bib1.bibx15" id="text.26"/> describes a method that extends the mutual information estimators described in <xref ref-type="bibr" rid="bib1.bibx18" id="text.27"/>, <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, from 1-dimensional to multidimensional samples from both <inline-formula><mml:math id="M116" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M117" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, and a way to characterise the bias and variance of the estimator. The extension of a nearest neighbours method to multidimensional samples allows fast and (relatively) computationally efficient calculation of mutual information compared to discretisation methods.</p>
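      <p>To make the nearest neighbour approach concrete, the sketch below implements the first estimator of Kraskov, Stögbauer and Grassberger (commonly abbreviated KSG) for one-dimensional samples, using a brute-force neighbour search. It is a simplified illustration and not the multidimensional implementation used in this study; the function names, the choice of <monospace>k = 3</monospace> neighbours, and the synthetic Gaussian data are assumptions. For correlated Gaussian variables the exact mutual information is known, which gives a simple check on the estimate.</p>
      <preformat preformat-type="code">
```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma function at a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def ksg_mi(xs, ys, k=3):
    """Nearest-neighbour estimate of I(X;Y) in nats for 1-D samples."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distance from sample i to every other sample in the joint space.
        dists = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                       for j in range(n) if j != i)
        eps = dists[k - 1]   # distance to the k-th nearest neighbour
        # Count neighbours strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and eps > abs(xs[i] - xs[j]))
        ny = sum(1 for j in range(n) if j != i and eps > abs(ys[i] - ys[j]))
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n

rng = random.Random(0)
n_samples, rho = 400, 0.9
x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
y_dep = [rho * v + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for v in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

exact = -0.5 * math.log(1 - rho ** 2)   # exact value for correlated Gaussians
mi_dep = ksg_mi(x, y_dep)
mi_ind = ksg_mi(x, y_ind)
print(mi_dep, exact)   # estimate and exact value for the dependent pair
print(mi_ind)          # close to zero, and may be slightly negative
```
</preformat>
      <p>Because the estimate is built from neighbour counts rather than from an explicit probability table, it is not constrained to be non-negative, which is why small negative values can occur for independent data.</p>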
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Parametrising co-location criteria by maximising mutual information</title>
      <p id="d2e1663">A good comparison between data requires that comparisons are made between retrievals from co-location events that well sample the physically possible underlying system states. This amounts to obtaining enough co-location events such that the marginal distributions of all retrievals are well sampled.</p>
      <p id="d2e1666">The reliable estimation of the mutual information requires a certain number of co-location events to be permitted by a co-location parametrisation <inline-formula><mml:math id="M118" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>. Some <inline-formula><mml:math id="M119" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> will permit too little data for the mutual information estimators to learn structure in the joint probability distribution, resulting in an underestimation of the mutual information. At some point, parametrisations <inline-formula><mml:math id="M120" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> will permit sufficient data for the mutual information estimator to accurately estimate <inline-formula><mml:math id="M121" display="inline"><mml:mrow><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (assuming no contamination by independent data), when the individual measurement marginal distributions are well sampled. Thus, having parametrisations permit more data leads to increases in <inline-formula><mml:math id="M122" display="inline"><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, until the estimated <inline-formula><mml:math id="M123" display="inline"><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> <inline-formula><mml:math id="M124" display="inline"><mml:mo>≈</mml:mo></mml:math></inline-formula> I, and additional co-location events provide no new information about the joint distribution of the retrievals being compared.</p>
      <p id="d2e1736">At some point, parametrisations will produce co-location events matching data within a sufficiently large spatiotemporal volume, such that the data contributing to the co-location event from different sources originate from physically independent observations. This will contaminate the joint probability distribution being assessed with independent samples. Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/> demonstrates, using a toy model, that contaminating the comparison with independent data necessarily reduces the upper bound of the mutual information encoded between retrievals. Thus, for <inline-formula><mml:math id="M125" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> permitting data co-location within large enough spatiotemporal volumes, increasing the spatiotemporal volume should act to decrease the mutual information.</p>
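      <p>The effect of contamination can also be seen in a simple discrete calculation, sketched below. This is an illustrative construction and not the toy model of the Appendix: a one-to-one joint table over eight states is mixed with an increasing fraction of independent pairs, and the exact mutual information of the mixture is evaluated. All names and the number of states are assumptions.</p>
      <preformat preformat-type="code">
```python
import math

def mutual_information(joint):
    """Mutual information (nats) of a discrete joint probability table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(pxy * math.log(pxy / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, pxy in enumerate(row) if pxy > 0.0)

n_states = 8
# Perfectly dependent pairs: all probability lies on the diagonal.
dependent = [[1.0 / n_states if i == j else 0.0 for j in range(n_states)]
             for i in range(n_states)]
# Independent pairs with the same uniform marginals.
independent = [[1.0 / n_states ** 2] * n_states for _ in range(n_states)]

def contaminated(fraction):
    """Joint table in which the given fraction of pairs is independent."""
    return [[(1.0 - fraction) * d + fraction * u for d, u in zip(drow, urow)]
            for drow, urow in zip(dependent, independent)]

fractions = (0.0, 0.25, 0.5, 0.75, 1.0)
values = [mutual_information(contaminated(f)) for f in fractions]
for f, value in zip(fractions, values):
    print(f, round(value, 3))   # falls from ln 8 nats towards zero
```
</preformat>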
      <p id="d2e1748">Figure <xref ref-type="fig" rid="F3"/> implements the toy model described in Appendix <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/> to demonstrate the effects of a co-location parametrisation permitting too little and too much data. Figure <xref ref-type="fig" rid="F3"/>a shows a data-limited regime, in which there are too few samples to reliably learn the structure of the relationship between the variables. The estimate of the mutual information between the variables can be improved by further sampling, as is shown in Fig. <xref ref-type="fig" rid="F3"/>b. In this case, there is no contamination and the significantly denser sampling results in a value of <inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M127" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1.984 <inline-formula><mml:math id="M128" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.015 nats. Figure <xref ref-type="fig" rid="F3"/>c–d show the effects of contamination with independent data, which rapidly reduces the estimated <inline-formula><mml:math id="M129" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> values to near zero when the signal-to-noise ratio is lowest, in panel (d).</p>

      <fig id="F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e1807">A demonstration of how a data-limited <bold>(a)</bold> and independent data contaminated <bold>(c–d)</bold> regime reduces the estimated mutual information encoded between two variables <inline-formula><mml:math id="M130" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M131" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, when compared to a case when <inline-formula><mml:math id="M132" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M133" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are well sampled without contamination from independent data <bold>(b)</bold>. Panel <bold>(b)</bold> represents the best scenario for a co-location. In all cases, <inline-formula><mml:math id="M134" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M135" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are synthetic and generated from the same marginal distributions as in Fig. <xref ref-type="fig" rid="F2"/>.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f03.png"/>

        </fig>

      <p id="d2e1873">Thus, two regimes influence <inline-formula><mml:math id="M136" display="inline"><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>: a data-limited regime, in which the inclusion of more data acts to increase <inline-formula><mml:math id="M137" display="inline"><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>; and a contaminated-data regime, in which the inclusion of more data increases the proportion of independent samples being compared, which acts to reduce <inline-formula><mml:math id="M138" display="inline"><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>. We postulate that a parametrisation <inline-formula><mml:math id="M139" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> exists for which the mutual information is maximised, where the competing effects of including more comparable samples and more independent samples are balanced. At <inline-formula><mml:math id="M140" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, the relationship between the retrievals, encoded in their joint distribution, is best characterised, and this should be the co-location parametrisation used in any subsequent analysis.</p>
      <p id="d2e1926">Thus, in order to optimise the co-location parametrisation <inline-formula><mml:math id="M141" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, the steps are: <list list-type="order"><list-item>
      <p id="d2e1941">Identify a set <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:mi mathvariant="script">P</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, describing a range of plausible parametrisations.</p></list-item><list-item>
      <p id="d2e1961">For every <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>∈</mml:mo><mml:mi mathvariant="script">P</mml:mi></mml:mrow></mml:math></inline-formula>, perform quality checks and spatiotemporal co-location subsetting according to <inline-formula><mml:math id="M144" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> on the data to be compared.</p></list-item><list-item>
      <p id="d2e1984">Apply the chosen mutual information estimator to the co-located homogenised data, to obtain <inline-formula><mml:math id="M145" display="inline"><mml:mrow><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p></list-item><list-item>
      <p id="d2e2005">Identify the optimised co-location parametrisation <inline-formula><mml:math id="M146" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> as the parametrisation maximising <inline-formula><mml:math id="M147" display="inline"><mml:mrow><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. That is<disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M148" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:munder><mml:mtext>arg max</mml:mtext><mml:mrow><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>∈</mml:mo><mml:mi mathvariant="script">P</mml:mi></mml:mrow></mml:munder><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p></list-item></list></p>
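      <p>The four steps above can be sketched end to end on synthetic data. In the following illustration, every detail is an assumption made for demonstration: the parametrisation is reduced to a single maximum separation <monospace>R</monospace>, pairs separated by no more than 3 (arbitrary units) observe the same underlying state and more distant pairs are independent, quality checks are omitted, and a brute-force one-dimensional KSG estimator stands in for the chosen mutual information estimator. The estimate is expected to be low for very small <monospace>R</monospace> (too few samples) and for large <monospace>R</monospace> (contamination), with the maximum at an intermediate value.</p>
      <preformat preformat-type="code">
```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma function at a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def ksg_mi(xs, ys, k=3):
    """Nearest-neighbour (KSG) estimate of I(X;Y) in nats for 1-D samples."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        dists = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                       for j in range(n) if j != i)
        eps = dists[k - 1]
        nx = sum(1 for j in range(n) if j != i and eps > abs(xs[i] - xs[j]))
        ny = sum(1 for j in range(n) if j != i and eps > abs(ys[i] - ys[j]))
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n

# Hypothetical candidate pairs, each with a separation d between 0 and 10.
rng = random.Random(1)
n_pairs, coherence_scale = 400, 3.0
pairs = []
for i in range(n_pairs):
    d = 10.0 * (i + 0.5) / n_pairs
    x = rng.gauss(0.0, 1.0)
    if coherence_scale >= d:
        y = x + rng.gauss(0.0, 0.1)   # both sources observe the same state
    else:
        y = rng.gauss(0.0, 1.0)       # physically independent observations
    pairs.append((d, x, y))

# Step 1: a set of plausible parametrisations (maximum separations R).
candidates = [0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.5, 10.0]

# Steps 2 and 3: subset the pairs permitted by each R and estimate the mutual information.
estimates = {}
for r in candidates:
    subset = [(x, y) for d, x, y in pairs if r >= d]
    xs, ys = zip(*subset)
    estimates[r] = ksg_mi(xs, ys)
    print(r, len(subset), round(estimates[r], 3))

# Step 4: the optimised parametrisation maximises the estimate.
r_hat = max(estimates, key=estimates.get)
print("optimised R:", r_hat)
```
</preformat>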
      <p id="d2e2071">In the following section, we will demonstrate the application and usefulness of this framework by co-locating satellite and surface based retrievals of vertical cloud fraction, and showing that the comparison is optimised by choosing the co-location parametrisation <inline-formula><mml:math id="M149" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> over other values.</p>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Application example: validating the ICESat-2 ATL09 cloud layer product using Cloudnet observations</title>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>ICESat-2 ATL09 cloud layer product</title>
      <p id="d2e2100">ICESat-2 is a polar orbiting satellite, launched by NASA in 2018, and is the only satellite currently in orbit with the capability to make vertically resolved observations polewards of 83° north and south <xref ref-type="bibr" rid="bib1.bibx27" id="paren.28"/>. The satellite has a single instrument payload, the Advanced Topographic Laser Altimeter System (ATLAS), a photon counting lidar predominantly designed for altimetry <xref ref-type="bibr" rid="bib1.bibx36" id="paren.29"/>. Atmospheric backscatter products are also produced, to facilitate quality checks on the altimetry data products. The ATL04 normalised relative backscatter profiles product is a level 2 product derived from the photon point-cloud data provided by ATL02 <xref ref-type="bibr" rid="bib1.bibx33" id="paren.30"/>. ATL04 consists of photon returns that are vertically aggregated and summed over 400 consecutive laser pulses, to produce a data product with 30 m vertical resolution and 280 m along-track resolution. Photon counts are reported in a vertical range from 250 m below an on-board digital elevation model (DEM) to 13.75 km above the DEM. The ATLAS lidar transmits six laser beams, split into three pairs of a strong and weak beam, with the strong beams transmitting four times more power than the weak beams. The ATL04 product, and subsequently the ATL09 product, use the measurements from the three strong beams, producing three sets of vertically resolved observations.</p>
      <p id="d2e2112">The ATL09 calibrated backscatter and atmospheric layers data product <xref ref-type="bibr" rid="bib1.bibx38 bib1.bibx37" id="paren.31"/> derives from the ATL04 data product, with the aim of characterising the state of the atmosphere through which ICESat-2 performs its altimetry measurements. Due to the challenges associated with absolute calibration of the backscatter profiles, the high noise rate, and the folding of signals into a 15 km window <xref ref-type="bibr" rid="bib1.bibx38" id="paren.32"/>, a bespoke cloud detection algorithm was developed, the density dimension algorithm <xref ref-type="bibr" rid="bib1.bibx14" id="paren.33"><named-content content-type="pre">DDA,</named-content></xref>.</p>
      <p id="d2e2126">Although ATL09 calibrated backscatter profiles have been compared against profiles from a cloud physics lidar, and CALIPSO <xref ref-type="bibr" rid="bib1.bibx38" id="paren.34"/>, and the DDA has been demonstrated for cloud and blowing snow detection over Antarctica <xref ref-type="bibr" rid="bib1.bibx14" id="paren.35"/>, no validation of the produced cloud layer product has been made against surface based cloud observations. This study will demonstrate the framework outlined in Sect. <xref ref-type="sec" rid="Ch1.S2"/> whilst providing an initial comparison of the ICESat-2 ATL09 cloud layer retrieval against surface based retrievals.</p>
      <p id="d2e2137">Quality checks for the ATL09 data are described in Appendix <xref ref-type="sec" rid="App1.Ch1.S2.SS1"/>. The homogenisation process transforms the atmospheric layer boundaries reported – pairs of cloud top and cloud base heights – into a categorised feature mask distinguishing between clear sky, cloud, and attenuated regions where the presence of atmospheric scatterers cannot be determined. Vertical profiles of the feature mask are subset according to the co-location criteria (outlined in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>). The feature mask is then horizontally averaged across all the remaining profiles to produce vertical cloud fraction (VCF) profiles. The VCF profiles are then homogenised by being vertically interpolated onto a set of <inline-formula><mml:math id="M150" display="inline"><mml:mn mathvariant="normal">50</mml:mn></mml:math></inline-formula> height coordinates with a vertical spacing of 240 m.</p>
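      <p>The homogenisation sequence described above (layer boundaries to feature mask, horizontal averaging, then vertical interpolation) can be sketched as follows. This is a simplified illustration and not the processing code used in this study: the attenuated category is omitted for brevity, the native grid is assumed to have 30 m bins starting at 0 m, the target grid is assumed to start at 0 m, and all function names and example layers are hypothetical.</p>
      <preformat preformat-type="code">
```python
from bisect import bisect_right

def feature_mask(layers, heights):
    """Return 1 (cloud) or 0 (clear) at each height for (base, top) pairs in metres."""
    return [1 if any(top >= h >= base for base, top in layers) else 0
            for h in heights]

def cloud_fraction(profiles, heights):
    """Horizontally average the feature masks of all profiles at each height."""
    masks = [feature_mask(layers, heights) for layers in profiles]
    return [sum(column) / len(masks) for column in zip(*masks)]

def interpolate(targets, heights, values):
    """Piecewise-linear interpolation of values(heights) onto the target heights."""
    result = []
    for z in targets:
        j = min(max(bisect_right(heights, z), 1), len(heights) - 1)
        z0, z1 = heights[j - 1], heights[j]
        result.append(values[j - 1] + (values[j] - values[j - 1]) * (z - z0) / (z1 - z0))
    return result

native_heights = list(range(0, 13750, 30))      # assumed 30 m native bins
target_heights = [240 * i for i in range(50)]   # 50 levels with 240 m spacing

# Hypothetical co-located subset: two profiles with one cloud layer each, one clear profile.
profiles = [[(1000, 2000)], [(1500, 2500)], []]
vcf = interpolate(target_heights, native_heights,
                  cloud_fraction(profiles, native_heights))
print([round(v, 2) for v in vcf[:12]])
```
</preformat>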
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Cloudnet</title>
      <p id="d2e2159">The Cloudnet retrieval <xref ref-type="bibr" rid="bib1.bibx16" id="paren.36"/> produces products that categorise the atmospheric profile above a given observatory by optimally combining available retrievals from multiple data sources. Cloudnet synthesises data from ground-based radar, lidar, microwave radiometer, and weather forecast models to produce fields of macrophysical and microphysical quantities such as temperature, cloud occurrence and ice water content. There are 28 main Cloudnet sites, with numerous campaigns and ARM sites also contributing data. For this study, we use data from four observatories: Ny-Ålesund, Hyytiala, Jülich and Munich <xref ref-type="bibr" rid="bib1.bibx11" id="paren.37"/>. The location of each site is outlined in Table <xref ref-type="table" rid="T1"/>.</p>

<table-wrap id="T1" specific-use="star"><label>Table 1</label><caption><p id="d2e2173">The locations of the Cloudnet sites used in the analysis, and important results of the mutual information calculation between the ATL09 and Cloudnet VCF profiles at each site. <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> represents the normalised across-track density of ICESat-2 orbits at the latitude of the Cloudnet site. <inline-formula><mml:math id="M152" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> <inline-formula><mml:math id="M153" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the optimised parametrisation at which the maximum mutual information, <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, is found.     
<inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the number of co-location events from which data is included with a parametrisation <inline-formula><mml:math id="M157" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>.    <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the number of pairwise profile comparisons made between ATL09 and Cloudnet VCF profiles across all co-location events for a given parametrisation <inline-formula><mml:math id="M159" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="9">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:colspec colnum="7" colname="col7" align="center"/>
     <oasis:colspec colnum="8" colname="col8" align="center"/>
     <oasis:colspec colnum="9" colname="col9" align="center"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">site</oasis:entry>
         <oasis:entry colname="col2">latitude</oasis:entry>
         <oasis:entry colname="col3">longitude</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M160" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M161" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M162" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7"><inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col8"><inline-formula><mml:math id="M164" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col9"><inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">(° N)</oasis:entry>
         <oasis:entry colname="col3">(° E)</oasis:entry>
         <oasis:entry colname="col4"/>
         <oasis:entry colname="col5">(km)</oasis:entry>
         <oasis:entry colname="col6">(h)</oasis:entry>
         <oasis:entry colname="col7"/>
         <oasis:entry colname="col8"/>
         <oasis:entry colname="col9">(nats)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Ny-Ålesund</oasis:entry>
         <oasis:entry colname="col2">78.9</oasis:entry>
         <oasis:entry colname="col3">11.9</oasis:entry>
         <oasis:entry colname="col4">5.29</oasis:entry>
         <oasis:entry colname="col5">60.0</oasis:entry>
         <oasis:entry colname="col6">6.0</oasis:entry>
         <oasis:entry colname="col7">932</oasis:entry>
         <oasis:entry colname="col8">6.06 <inline-formula><mml:math id="M166" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 10<sup>8</sup></oasis:entry>
         <oasis:entry colname="col9">0.607 <inline-formula><mml:math id="M168" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.020</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Hyytiala</oasis:entry>
         <oasis:entry colname="col2">61.8</oasis:entry>
         <oasis:entry colname="col3">24.3</oasis:entry>
         <oasis:entry colname="col4">2.12</oasis:entry>
         <oasis:entry colname="col5">140.3</oasis:entry>
         <oasis:entry colname="col6">5.0</oasis:entry>
         <oasis:entry colname="col7">881</oasis:entry>
         <oasis:entry colname="col8">1.01 <inline-formula><mml:math id="M169" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 10<sup>9</sup></oasis:entry>
         <oasis:entry colname="col9">0.511 <inline-formula><mml:math id="M171" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.016</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Jülich</oasis:entry>
         <oasis:entry colname="col2">50.9</oasis:entry>
         <oasis:entry colname="col3">6.4</oasis:entry>
         <oasis:entry colname="col4">1.59</oasis:entry>
         <oasis:entry colname="col5">196.9</oasis:entry>
         <oasis:entry colname="col6">10.0</oasis:entry>
         <oasis:entry colname="col7">956</oasis:entry>
         <oasis:entry colname="col8">3.27 <inline-formula><mml:math id="M172" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 10<sup>9</sup></oasis:entry>
         <oasis:entry colname="col9">0.533 <inline-formula><mml:math id="M174" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.020</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Munich</oasis:entry>
         <oasis:entry colname="col2">48.1</oasis:entry>
         <oasis:entry colname="col3">11.6</oasis:entry>
         <oasis:entry colname="col4">1.50</oasis:entry>
         <oasis:entry colname="col5">140.3</oasis:entry>
         <oasis:entry colname="col6">4.0</oasis:entry>
         <oasis:entry colname="col7">634</oasis:entry>
         <oasis:entry colname="col8">6.60 <inline-formula><mml:math id="M175" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 10<sup>8</sup></oasis:entry>
         <oasis:entry colname="col9">0.484 <inline-formula><mml:math id="M177" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.018</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d2e2681">To produce the homogenised VCF profiles used in our analyses, we start with the Cloudnet <italic>categorize</italic> product, which holds the calibrated synthesised data <xref ref-type="bibr" rid="bib1.bibx11" id="paren.38"><named-content content-type="pre">accessed through the Cloudnet FMI website,</named-content></xref>. Following the definition of cloud mask in the code presented in <xref ref-type="bibr" rid="bib1.bibx59" id="text.39"/>, we extract the cloud mask as the feature mask used in the analysis. The full quality check process is described in Appendix <xref ref-type="sec" rid="App1.Ch1.S2.SS2"/>. The vertical profiles of the feature mask are then subset according to the temporal co-location criteria described in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>. Like the ATL09 homogenisation process, the Cloudnet-derived feature masks are horizontally averaged to produce profiles of vertical cloud fraction, and vertically interpolated onto a set of 50 height coordinates with a vertical spacing of 240 m.</p>
</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Co-location scheme</title>
      <p id="d2e2707">Data from Cloudnet and ATL09 are co-located using the scheme shown in Fig. <xref ref-type="fig" rid="F1"/>a. Projected onto the Earth's surface, the spatial co-location scheme treats each Cloudnet site as a 0-dimensional point-like source. The ATL09 data then constitute three distinct 1-dimensional line-like sources – one for each ATLAS strong beam. For each vertical profile in each of the three beams from the ATL09 data, here indexed with subscript <inline-formula><mml:math id="M178" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>, the great-circle distance between the profile's footprint on the ground and the location of the Cloudnet site, <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, is calculated. The criterion for accepting an ATL09 profile is

            <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M180" display="block"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>≤</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          such that all profiles falling within a distance <inline-formula><mml:math id="M181" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> across the Earth's surface of the Cloudnet site are kept in the analysis.</p>
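      <p>The spatial criterion of Eq. (5) can be sketched as below. This is a minimal illustration, assuming a spherical Earth with a mean radius of 6371 km and the haversine formula for the great-circle distance; the function names and the profile coordinates are hypothetical, and only the site coordinates are taken from Table 1.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical-Earth assumption

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in degrees (scalars or arrays)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def spatial_subset(profile_lats, profile_lons, site_lat, site_lon, R_km):
    """Return a boolean mask implementing r_j <= R, together with the distances r_j."""
    r = great_circle_km(profile_lats, profile_lons, site_lat, site_lon)
    return r <= R_km, r

# Made-up footprints along one beam, compared with the Ny-Alesund site of Table 1.
lats = np.array([77.0, 78.0, 79.0, 80.5])
lons = np.full(4, 12.5)
keep, r = spatial_subset(lats, lons, 78.9, 11.9, R_km=125.0)
```
]]></preformat>
      <p>With these made-up footprints, only the two profiles closest in latitude to the site satisfy the criterion for <italic>R</italic> = 125 km.</p>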
      <p id="d2e2754">The temporal co-location scheme first requires finding the time of closest approach between ICESat-2 and the Cloudnet site, <inline-formula><mml:math id="M182" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. This is simply the time associated with an ATL09 profile <inline-formula><mml:math id="M183" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> that minimises <inline-formula><mml:math id="M184" display="inline"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. That is,

            <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M185" display="block"><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:munder><mml:mtext>arg min</mml:mtext><mml:mi>j</mml:mi></mml:munder><mml:mo>(</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e2834">To subset the Cloudnet data, for each Cloudnet profile with index <inline-formula><mml:math id="M186" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula>, the criterion applied is

            <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M187" display="block"><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>|</mml:mo><mml:mo>≤</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi mathvariant="italic">τ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M188" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the time associated with the given Cloudnet profile <inline-formula><mml:math id="M189" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula>. This subsets the Cloudnet data based on the profile being recorded within a temporal window of duration <inline-formula><mml:math id="M190" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, centred on the time of closest approach. Thus, the co-location of the ATL09 and Cloudnet data can be parametrised as <inline-formula><mml:math id="M191" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M192" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M193" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
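      <p>Equations (6) and (7) translate directly into array operations, as in the following sketch. The function names, the timestamps and the distances are invented for illustration (only the 18.4 km minimum separation echoes the example discussed in this section), and times are assumed to be stored as NumPy <monospace>datetime64</monospace> values.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

def closest_approach_time(atl09_times, distances):
    """Eq. (6): the time of the ATL09 profile that minimises r_j."""
    return atl09_times[np.argmin(distances)]

def temporal_subset(cloudnet_times, t0, tau):
    """Eq. (7): keep Cloudnet profiles with |t_l - t0| <= tau / 2."""
    return np.abs(cloudnet_times - t0) <= tau / 2

# Illustrative ATL09 profile times (1 s apart) and distances to the site (km).
atl09_times = np.datetime64("2000-01-01T06:00:00") + np.arange(5) * np.timedelta64(1, "s")
distances = np.array([60.0, 30.0, 18.4, 35.0, 70.0])
t0 = closest_approach_time(atl09_times, distances)

# Illustrative hourly Cloudnet profile times, subset with tau = 6 h.
cloudnet_times = np.datetime64("2000-01-01T00:00") + np.arange(13) * np.timedelta64(1, "h")
keep = temporal_subset(cloudnet_times, t0, tau=np.timedelta64(6, "h"))
```
]]></preformat>
      <p>Because the window is centred on the time of closest approach, the number of retained Cloudnet profiles grows in proportion to the window duration for regularly sampled data.</p>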
      <p id="d2e2932">Figure <xref ref-type="fig" rid="F4"/> shows a demonstration of the co-location of ATL09 and Cloudnet data at Ny-Ålesund. In the ATL09 granule, ICESat-2 ground tracks passed within 18 km of the Cloudnet observatory at Ny-Ålesund. Figure <xref ref-type="fig" rid="F4"/>a shows the locations of the ATLAS strong-beam footprints in relation to the Cloudnet site as ICESat-2 travelled from north to south.</p>

      <fig id="F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e2942">The co-location of ATL09 and Cloudnet data at Ny-Ålesund, with ATL09 data from the granule with reference ground track 115 on cycle 12 (dated 1 July 2021), using the co-location parametrisation <inline-formula><mml:math id="M194" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M195" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (125 km, 6 h). <bold>(a)</bold> The Cloudnet observatory (star), and the three ATLAS strong beams (lines). A circle of radius <inline-formula><mml:math id="M196" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M197" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 125 km is drawn around the Cloudnet site. <bold>(b)</bold> The feature mask generated from the ATL09 data associated with strong beam <inline-formula><mml:math id="M198" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula>, showing where ATL09 retrieves clouds and is attenuated. <bold>(c)</bold> The distance between the ATLAS ground track and Cloudnet observatory, showing the co-location criteria subsetting of the cloudmask. The unhatched region contains the vertical profiles contributing to the co-location. Panels <bold>(d)</bold> and <bold>(e)</bold> are the same as panels <bold>(b)</bold> and <bold>(c)</bold>, but for the Cloudnet data contributing to the co-location event. <bold>(f)</bold> Vertical cloud fraction profiles for the ATL09 and Cloudnet feature masks, as well as the vertical cloud and attenuation profile for the ATL09 feature mask, subset by the co-location parametrisation.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f04.png"/>

        </fig>

      <p id="d2e3012">Figure <xref ref-type="fig" rid="F4"/>b shows the ATL09 cloud mask from the granule. The observed clouds are optically thick enough to attenuate the ATLAS lidar beam throughout the co-location event, so lower-level cloud layers will be missed. Layers are detected across a range of heights, with cloud tops varying from 2 up to 8 km. Figure <xref ref-type="fig" rid="F4"/>c plots the spatial co-location criterion from Eq. (<xref ref-type="disp-formula" rid="Ch1.E5"/>) as the satellite travels near Ny-Ålesund. The distance from the lidar beam footprint to the Cloudnet site forms a hyperbolic curve with a minimum separation of 18.4 km. This results in the volume of ATL09 data per overpass being asymptotically linear in <inline-formula><mml:math id="M199" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. The horizontal dashed line represents the spatial co-location parametrisation of <inline-formula><mml:math id="M200" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M201" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 125 km, and the vertical dashed lines seen in Fig. <xref ref-type="fig" rid="F4"/>b–c are the boundaries of the subset data based on the spatial co-location criterion, with hatching showing the data rejected by the co-location scheme.</p>
      <p id="d2e3045">Similarly, Fig. <xref ref-type="fig" rid="F4"/>d–e show the feature mask and temporal co-location criterion applied to the Cloudnet data during the overpass. Early profiles in the mask show low cloud layers between 1 and 2 km, growing into a thicker cloud layer ranging from near the surface up to 8 km high, with the top height varying throughout the co-location event. The plotted temporal criterion forms a piecewise linear function of time, so the volume of Cloudnet data used in the analysis is linear in <inline-formula><mml:math id="M202" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>.</p>
      <p id="d2e3057">Figure <xref ref-type="fig" rid="F4"/>f shows the output of the homogenisation process on the ATL09 and Cloudnet data. The feature masks, subset by the co-location criteria, are horizontally averaged over all included vertical profiles, producing VCF profiles. A profile of the cloud and attenuation fraction is also given for the ATL09 data. Above 5 km in height, both VCF profiles visually correlate with each other, indicating that the co-location between the ATL09 and Cloudnet data may be viable. However, below 5 km, the ATL09 VCF values are significantly lower than the Cloudnet VCF values, due to attenuation of the ATLAS lidar beam in the higher cloud layers, resulting in lower cloud layers being unobserved by ICESat-2. The ATL09 vertical cloud and attenuation fraction profile correlates with the Cloudnet VCF down to an altitude of roughly 1.5 km.</p>
</sec>
<sec id="Ch1.S3.SS4">
  <label>3.4</label><title>Mutual information estimation</title>
      <p id="d2e3070">Because the homogenised ATL09 and Cloudnet data are both VCF profiles defined on 50 height levels, the joint probability distribution for pairs of VCF profiles is 100-dimensional. As such, we require a mutual information estimator that accepts multi-dimensional inputs.</p>
      <p id="d2e3073">The KSG mutual information estimator <xref ref-type="bibr" rid="bib1.bibx18" id="paren.40"/> and its adaptations to multidimensional data <xref ref-type="bibr" rid="bib1.bibx15" id="paren.41"/>, <inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, will be used in this work. As inputs, we provide the matched sets of ATL09 and Cloudnet VCF profiles. The KSG estimator provides mutual information estimates in units of nats. This is a result of the estimator being derived using natural logarithms instead of logarithms with base 2. The conversion from nats to the more widely used unit bits is a scaling factor of <inline-formula><mml:math id="M204" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>ln⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> bits nat<sup>−1</sup>.</p>
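      <p>For orientation, a textbook form of the KSG estimator for multi-dimensional inputs can be sketched as follows. This is not the implementation used in this study: it follows the first KSG algorithm with max-norm distances, the function name is hypothetical, it relies on SciPy's <monospace>cKDTree</monospace>, and it makes no special provision for tied values (which are common in VCF data at 0 and 1). The example data are synthetic correlated Gaussians, for which the mutual information is known analytically.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=10):
    """KSG (algorithm 1) estimate of I(X;Y) in nats for samples x (N, dx) and y (N, dy)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack([x, y])
    # Max-norm distance to the k-th neighbour in the joint space;
    # column 0 of the query result is the point itself.
    eps = cKDTree(joint).query(joint, k=k + 1, p=np.inf)[0][:, -1]
    # Count marginal neighbours strictly closer than eps. Each count includes the
    # point itself, so it equals n_x + 1 (or n_y + 1) in the KSG formula.
    radius = np.nextafter(eps, 0)
    nx = cKDTree(x).query_ball_point(x, radius, p=np.inf, return_length=True)
    ny = cKDTree(y).query_ball_point(y, radius, p=np.inf, return_length=True)
    return digamma(k) + digamma(n) - np.mean(digamma(nx) + digamma(ny))

# Synthetic check: bivariate Gaussian with correlation rho has I = -0.5 ln(1 - rho^2).
rng = np.random.default_rng(0)
rho = 0.9
x = rng.normal(size=2000)
y = rho * x + np.sqrt(1 - rho ** 2) * rng.normal(size=2000)

mi_nats = ksg_mutual_information(x, y, k=10)
mi_bits = mi_nats / np.log(2)          # conversion factor (ln 2)^-1 bits per nat
mi_exact = -0.5 * np.log(1 - rho ** 2)
```
]]></preformat>
      <p>For the VCF application, each row of <monospace>x</monospace> and <monospace>y</monospace> would hold one 50-level profile, so the joint space searched by the estimator is 100-dimensional.</p>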
      <p id="d2e3128">We use the estimator parameter <inline-formula><mml:math id="M206" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M207" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 10, as testing on our data showed that it balances the decrease in estimator variance as <inline-formula><mml:math id="M208" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> increases against the increase in bias and computational cost as <inline-formula><mml:math id="M209" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> increases. The variance of the KSG estimator, <inline-formula><mml:math id="M210" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mtext>KSG</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:math></inline-formula>, is taken to follow the form <xref ref-type="bibr" rid="bib1.bibx15" id="paren.42"><named-content content-type="post">Appendix 1</named-content></xref>

            <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M211" display="block"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mtext>KSG</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>B</mml:mi><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where <inline-formula><mml:math id="M212" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of samples used in estimating <inline-formula><mml:math id="M213" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M214" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula> is a constant parameter to be evaluated. In the maximum likelihood estimation of <inline-formula><mml:math id="M215" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>, we produce <inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M217" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 10 non-overlapping partitions of the original data to compute <inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mrow><mml:mtext>KSG</mml:mtext><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:math></inline-formula>, and perform this <inline-formula><mml:math id="M219" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mtext>repeats</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M220" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 20 times to evaluate <inline-formula><mml:math id="M221" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>.</p>
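      <p>One way to picture the partitioning procedure is the sketch below. It is a simplification under stated assumptions: the exact maximum likelihood fit is not reproduced here, and <italic>B</italic> is instead approximated by averaging, over repeated random partitions, the partition size multiplied by the sample variance of the per-partition estimates. The function names are hypothetical, and a simple Gaussian mutual information formula stands in for the KSG estimator so that the example is self-contained.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

def gaussian_mi(x, y):
    """Stand-in estimator (not KSG): MI in nats of a bivariate Gaussian
    with the sample correlation of x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - r ** 2)

def estimate_B(x, y, estimator, n_partitions=10, n_repeats=20, seed=0):
    """Approximate B in sigma^2(N) = B / N from the scatter of estimates
    over non-overlapping random partitions of the data."""
    rng = np.random.default_rng(seed)
    n = len(x)
    size = n // n_partitions              # samples per partition
    b_values = []
    for _ in range(n_repeats):
        order = rng.permutation(n)
        estimates = []
        for i in range(n_partitions):
            idx = order[i * size:(i + 1) * size]
            estimates.append(estimator(x[idx], y[idx]))
        # Each partition estimate has variance B / size, so B ~ size * variance.
        b_values.append(size * np.var(estimates, ddof=1))
    return float(np.mean(b_values))

# Synthetic correlated data, used only to exercise the function.
rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y = 0.9 * x + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=5000)

B = estimate_B(x, y, gaussian_mi)
sigma_full = np.sqrt(B / len(x))          # standard deviation at the full sample size
```
]]></preformat>
      <p>Any estimator with the same call signature, including a KSG implementation, could be passed in place of the stand-in.</p>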
      <p id="d2e3304">As well as identifying the optimised parametrisation <inline-formula><mml:math id="M222" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, we identify regions of the parameter space where <inline-formula><mml:math id="M223" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is consistent with the value of <inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (the maximum estimated mutual information), giving a region with finite extent from which an optimised parametrisation could feasibly be selected.
To do this, we perform an unequal variances (Welch's) <inline-formula><mml:math id="M225" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> test for each <inline-formula><mml:math id="M226" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M227" display="inline"><mml:mo>≠</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M228" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> to test the null hypothesis that the mean estimated mutual information at <inline-formula><mml:math id="M229" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> is equal to the mean estimated mutual information at <inline-formula><mml:math id="M230" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> – that is that <inline-formula><mml:math id="M231" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M232" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M233" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
Parametrisations <inline-formula><mml:math id="M234" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> for which the null hypothesis cannot be rejected at a significance level of 0.05 are considered candidate optimised parametrisations. Conversely, parametrisations for which the null hypothesis is rejected are not considered as candidates for the optimised parametrisation.</p>
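      <p>The candidate test can be illustrated with SciPy's implementation of Welch's <italic>t</italic> test, as sketched below. The sketch assumes that a set of repeated mutual information estimates is available at each parametrisation; how those repeats are generated is not specified here, and the function name and all numerical values are invented for illustration.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np
from scipy import stats

def is_candidate(mi_at_p, mi_at_p_hat, alpha=0.05):
    """True if Welch's t test cannot reject equal mean mutual information."""
    result = stats.ttest_ind(mi_at_p, mi_at_p_hat, equal_var=False)
    return bool(result.pvalue >= alpha)

# Made-up repeated estimates (nats) at the optimum and at two other parametrisations.
mi_best = np.array([0.60, 0.62, 0.59, 0.61, 0.63, 0.60])
mi_near = np.array([0.59, 0.61, 0.60, 0.62, 0.58, 0.61])
mi_far = np.array([0.40, 0.42, 0.41, 0.39, 0.43, 0.40])

near_is_candidate = is_candidate(mi_near, mi_best)
far_is_candidate = is_candidate(mi_far, mi_best)
```
]]></preformat>
      <p>Applying such a test at every point of the parameter grid yields a boolean map whose true region corresponds to the candidate optimised parametrisations.</p>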
      <p id="d2e3468">In our analysis, we will use the parametrisation <inline-formula><mml:math id="M235" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> that maximises the mutual information between the ATL09 and Cloudnet VCF profiles, but other strategies could be employed to select which candidate optimised co-location parametrisation will be used (e.g. maximising the data volume permitted by the co-location). If two parametrisations yield negligibly different mutual information values, it is up to the researcher to decide which parametrisation they should use, considering any trade-offs between the use of additional data and any increased computational costs the additional data may incur.</p>
</sec>
<sec id="Ch1.S3.SS5">
  <label>3.5</label><title>Validation metrics and methodology</title>
      <p id="d2e3490">Once we have evaluated the optimised co-location parametrisations for each Cloudnet observatory, we perform a basic comparison of the co-located ATL09 and Cloudnet VCF profiles to demonstrate the impact of using a co-location parametrisation with maximised mutual information instead of other choices of parametrisation.</p>
      <p id="d2e3493">We compute confusion matrices classifying VCF values into three categories: containing no cloud (nc), when <inline-formula><mml:math id="M236" display="inline"><mml:mrow><mml:mtext>VCF</mml:mtext><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>; being partially cloudy (pc), when 0 <inline-formula><mml:math id="M237" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> VCF <inline-formula><mml:math id="M238" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1; and being totally cloudy (tc), when VCF <inline-formula><mml:math id="M239" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1. We make the distinction between nc, pc and tc cases because, although VCF values are defined on the closed interval <inline-formula><mml:math id="M240" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, the probability distribution has degeneracies at 0 and 1, since scenes with no or total cloud cover occur with finite probability. This results in the probability distribution of VCF values having Dirac-delta-like contributions at 0 and 1, but being otherwise continuous on the open interval <inline-formula><mml:math id="M241" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
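      <p>The three-way classification and the resulting confusion matrix can be sketched as follows. The function names and the VCF values are illustrative only, and exact comparison with 0 and 1 is assumed to be appropriate; interpolated profiles may in practice call for a small tolerance.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

def classify_vcf(vcf):
    """0 = no cloud (VCF = 0), 1 = partially cloudy (0 < VCF < 1), 2 = totally cloudy (VCF = 1)."""
    vcf = np.asarray(vcf, dtype=float)
    return np.where(vcf == 0, 0, np.where(vcf == 1, 2, 1))

def confusion_matrix(vcf_a, vcf_b):
    """3 x 3 counts; rows index the class of vcf_a, columns the class of vcf_b."""
    counts = np.zeros((3, 3), dtype=int)
    np.add.at(counts, (classify_vcf(vcf_a), classify_vcf(vcf_b)), 1)
    return counts

# Made-up paired VCF values (order of categories: nc, pc, tc).
atl09_vcf = [0.0, 0.0, 0.4, 1.0, 1.0, 0.7]
cloudnet_vcf = [0.0, 0.2, 0.5, 1.0, 0.9, 0.0]
counts = confusion_matrix(atl09_vcf, cloudnet_vcf)
```
]]></preformat>
      <p>In this toy example the diagonal holds one agreeing pair per category, and the off-diagonal entries record the three pairs in which the two data sets disagree.</p>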
      <p id="d2e3562">Having computed confusion matrices, we then compute copula densities between pairs of VCF values across all co-location events and heights within VCF profiles. A copula is the joint cumulative distribution function of multiple random variables after each has been transformed to have a uniform marginal distribution. Each random variable is transformed by the probability integral transform – that is, values <inline-formula><mml:math id="M242" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> for a random variable <inline-formula><mml:math id="M243" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> with cumulative distribution function <inline-formula><mml:math id="M244" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are transformed into the variable <inline-formula><mml:math id="M245" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> with values <inline-formula><mml:math id="M246" display="inline"><mml:mi>u</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M247" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M248" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, such that <inline-formula><mml:math id="M249" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> is uniformly distributed on the interval <inline-formula><mml:math id="M250" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>.
In the bivariate case, with variables <inline-formula><mml:math id="M251" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M252" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, transformed into the variables <inline-formula><mml:math id="M253" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M254" display="inline"><mml:mi>V</mml:mi></mml:math></inline-formula> respectively, the copula is computed as

            <disp-formula id="Ch1.E9" content-type="numbered"><label>9</label><mml:math id="M255" display="block"><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="double-struck">P</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mi>U</mml:mi><mml:mo>≤</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>V</mml:mi><mml:mo>≤</mml:mo><mml:mi>v</mml:mi></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          which represents the probability that both uniformly distributed variables <inline-formula><mml:math id="M256" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M257" display="inline"><mml:mi>V</mml:mi></mml:math></inline-formula> are less than their respective coordinates at the same time. Because the marginal distributions of all variables contributing to a copula are uniform, the structure of the copula captures the dependency structure between the variables, independent of the marginal distributions of the original random variables.</p>
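      <p>The probability integral transform and the copula of Eq. (9) can be illustrated with an empirical, rank-based sketch. It assumes that the cumulative distribution functions are estimated from the sample itself, that ranks are scaled by <italic>n</italic> + 1 so that values stay strictly inside (0, 1), and that there are no tied values; the function names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def pseudo_observations(values):
    """Map a sample to (0, 1) via its empirical CDF, using rank / (n + 1).

    Assumes no ties; tied values would be ranked in input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    u = [0.0] * n
    for rank, k in enumerate(order, start=1):
        u[k] = rank / (n + 1)
    return u


def empirical_copula(u_sample, v_sample, u, v):
    """Estimate C(u, v) as the fraction of pairs with U <= u and V <= v."""
    hits = sum(1 for a, b in zip(u_sample, v_sample) if a <= u and b <= v)
    return hits / len(u_sample)


# Illustrative values only, not data from the study.
x = [0.12, 0.55, 0.31, 0.90]
y = [0.20, 0.70, 0.25, 0.95]
u_s = pseudo_observations(x)
v_s = pseudo_observations(y)
c_half = empirical_copula(u_s, v_s, 0.5, 0.5)
```
]]></preformat>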
      <p id="d2e3742">In the same way that a probability density function can be obtained by differentiating a cumulative distribution function, so too can a copula density function be obtained by repeated differentiation of the copula. In the bivariate case,

            <disp-formula id="Ch1.E10" content-type="numbered"><label>10</label><mml:math id="M258" display="block"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mo>∂</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:mi>u</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>∂</mml:mo><mml:mi>v</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e3797">For independent variables, the copula is given as <inline-formula><mml:math id="M259" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mtext>independent</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M260" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M261" display="inline"><mml:mrow><mml:mi>u</mml:mi><mml:mi>v</mml:mi></mml:mrow></mml:math></inline-formula>. From this, we derive the independent copula density as <inline-formula><mml:math id="M262" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mtext>independent</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M263" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1 uniformly. Thus, we can interpret copula densities greater than <inline-formula><mml:math id="M264" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> as giving pairs <inline-formula><mml:math id="M265" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (and by extension <inline-formula><mml:math id="M266" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>) that are sampled more frequently than if the variables <inline-formula><mml:math id="M267" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M268" display="inline"><mml:mi>V</mml:mi></mml:math></inline-formula> were independent. 
Conversely, copula densities less than <inline-formula><mml:math id="M269" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> indicate regions of <inline-formula><mml:math id="M270" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> that are sampled less frequently than if the variables were independent. <xref ref-type="bibr" rid="bib1.bibx53" id="text.43"/> provide a good introduction to methods and interpretation concerning copulae.</p>
      <p id="d2e3947">Copula densities with values further from <inline-formula><mml:math id="M271" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> indicate that the underlying distribution is dissimilar to the independent joint distribution, which is the desired quality of the co-location. We define the root mean squared difference (RMSD) as a metric of the difference between a copula density and the independent copula:

            <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M272" display="block"><mml:mrow><mml:mtext>RMSD</mml:mtext><mml:mo>=</mml:mo><mml:mo mathsize="2.0em">(</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:msup><mml:mo>]</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:munder><mml:mi mathvariant="normal">d</mml:mi><mml:mi>u</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi>v</mml:mi><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msup><mml:mo mathsize="2.0em">)</mml:mo><mml:mstyle scriptlevel="+1"><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e4028">Larger RMSD values indicate that a given copula density differs more from the independent copula. If the ATL09 and Cloudnet VCF measurements are entirely independent, <inline-formula><mml:math id="M273" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M274" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1 uniformly and the RMSD is zero. If there is dependency between the VCF distributions, then certain pairs of <inline-formula><mml:math id="M275" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> values will be sampled more frequently than if the VCF distributions were independent (<inline-formula><mml:math id="M276" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M277" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 1) and, to conserve probability, some pairs <inline-formula><mml:math id="M278" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> will be sampled less frequently than if the distributions were independent (<inline-formula><mml:math id="M279" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M280" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1). 
The more dependent the VCF distributions are, the larger the area of <inline-formula><mml:math id="M281" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> pairs for which <inline-formula><mml:math id="M282" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo><mml:mo>≠</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> will be, and the larger the magnitudes of <inline-formula><mml:math id="M283" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> will be in these areas. Thus, stronger dependency between the distributions will result in larger RMSD values. The RMSD attains a maximum value of <inline-formula><mml:math id="M284" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> when <inline-formula><mml:math id="M285" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> describes a one-to-one mapping (e.g. <inline-formula><mml:math id="M286" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M287" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M288" display="inline"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>-</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>).</p>
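      <p>A discretised version of Eq. (11) can be sketched as follows, assuming the copula density is estimated with a simple two-dimensional histogram on a regular grid. The histogram estimator, the bin count and the function names are assumptions made for illustration; the resulting value depends on the chosen binning.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def binned_copula_density(u_sample, v_sample, n_bins):
    """Histogram estimate of c(u, v) on an n_bins x n_bins grid over [0, 1]^2."""
    n = len(u_sample)
    counts = [[0] * n_bins for _ in range(n_bins)]
    for u, v in zip(u_sample, v_sample):
        i = min(int(u * n_bins), n_bins - 1)  # clamp u = 1 into the last bin
        j = min(int(v * n_bins), n_bins - 1)  # clamp v = 1 into the last bin
        counts[i][j] += 1
    cell_area = 1.0 / n_bins**2
    # Dividing by n * cell_area makes the density integrate to 1.
    return [[c / (n * cell_area) for c in row] for row in counts]


def copula_rmsd(density):
    """Discretised Eq. (11): root of the cell-area-weighted sum of (c - 1)^2."""
    n_bins = len(density)
    total = sum((c - 1.0) ** 2 for row in density for c in row)
    return math.sqrt(total / n_bins**2)


# One sample per cell mimics independence on a 2 x 2 grid.
independent = binned_copula_density(
    [0.25, 0.25, 0.75, 0.75], [0.25, 0.75, 0.25, 0.75], 2
)
# Samples confined to the diagonal cells mimic strong dependence.
dependent = binned_copula_density([0.25, 0.75], [0.25, 0.75], 2)
```
]]></preformat>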
      <p id="d2e4257">For this analysis, with a given parametrisation <inline-formula><mml:math id="M289" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, we identify all co-location events <inline-formula><mml:math id="M290" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, and compute <inline-formula><mml:math id="M291" display="inline"><mml:mrow><mml:msub><mml:mtext>VCF</mml:mtext><mml:mrow><mml:mtext>ATL09</mml:mtext><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and VCF<inline-formula><mml:math id="M292" display="inline"><mml:mrow><mml:msub><mml:mi/><mml:mrow><mml:mtext>Cloudnet</mml:mtext><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. We keep pairs of VCF values for each height <inline-formula><mml:math id="M293" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula>, and each co-location event if both VCF values are categorised as pc. We do this so that a valid copula density function can be defined without the degeneracies induced by considering cases with no or total cloud cover.</p>
<sec id="Ch1.S3.SS5.SSS1">
  <label>3.5.1</label><title>Vertical bias distributions</title>
      <p id="d2e4331">We compute the vertical distribution of bias between the ATL09 and Cloudnet VCF profiles as a function of height, <inline-formula><mml:math id="M294" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>bias</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M295" display="inline"><mml:mi mathvariant="italic">ν</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M296" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> VCF<sub>ATL09</sub> <inline-formula><mml:math id="M298" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula> VCF<sub>Cloudnet</sub> is the bias, bounded between <inline-formula><mml:math id="M300" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula>1 and 1. The expected bias and variance of the bias as a function of height are calculated as

                  <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M301" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="Ch1.E12"><mml:mtd><mml:mtext>12</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>|</mml:mo><mml:mi>z</mml:mi><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mn mathvariant="normal">1</mml:mn></mml:munderover><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="italic">ν</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>bias</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="Ch1.E13"><mml:mtd><mml:mtext>13</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>Var</mml:mtext><mml:mo>[</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>|</mml:mo><mml:mi>z</mml:mi><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mn mathvariant="normal">1</mml:mn></mml:munderover><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="italic">ν</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>bias</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>)</mml:mo><mml:msup><mml:mi mathvariant="italic">ν</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn></mml:msup><mml:mo>-</mml:mo><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>|</mml:mo><mml:mi>z</mml:mi><mml:msup><mml:mo>]</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
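      <p>Eqs. (12) and (13) can be approximated numerically at a single height by replacing the integrals with sums over bias bins. The sketch below assumes that the bias density has already been estimated on equal-width bins spanning [−1, 1] and is normalised; the function name and the sample density are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def bias_moments(nu_centres, rho, d_nu):
    """Discretise Eqs. (12)-(13) for one height: return (E[nu|z], Var[nu|z]).

    nu_centres: bin centres of the bias nu on [-1, 1]
    rho:        bias density rho_bias(nu, z) evaluated at those centres
    d_nu:       common bin width
    """
    norm = sum(r * d_nu for r in rho)
    if abs(norm - 1.0) > 1e-9:
        raise ValueError("rho must integrate to 1 over [-1, 1]")
    mean = sum(r * nu * d_nu for r, nu in zip(rho, nu_centres))
    second = sum(r * nu**2 * d_nu for r, nu in zip(rho, nu_centres))
    return mean, second - mean**2


# Illustrative density: uniform on [-0.5, 0.5], sampled on four bins.
centres = [-0.75, -0.25, 0.25, 0.75]
rho = [0.0, 1.0, 1.0, 0.0]
mean, variance = bias_moments(centres, rho, 0.5)
```
]]></preformat>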
</sec>
</sec>
<sec id="Ch1.S3.SS6">
  <label>3.6</label><title>Results</title>
<sec id="Ch1.S3.SS6.SSS1">
  <label>3.6.1</label><title>Case study: Jülich</title>
      <p id="d2e4557">We start by demonstrating the results of the mutual information computation at a single site, Jülich, before showing the results across all four example sites.</p>
      <p id="d2e4560">Figure <xref ref-type="fig" rid="F5"/>a shows the number of co-location events used in the study as a function of the co-location parametrisation. The Cloudnet data at Jülich form a near-complete record, meaning co-location events depend solely on the availability of sufficiently close ICESat-2 orbital tracks. Thus, we expect no gradient in <inline-formula><mml:math id="M302" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> as a function of <inline-formula><mml:math id="M303" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, which is what we observe.</p>

      <fig id="F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e4585">The number of co-location events <bold>(a)</bold>, pairwise vertical profile comparisons <bold>(b)</bold>, and the mutual information <bold>(c)</bold> computed between ATL09 data and Cloudnet data from the observatory at Jülich, as a function of the co-location parametrisation <inline-formula><mml:math id="M304" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M305" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M306" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The maximum mutual information (indicated by crossing dashed lines) occurs at <inline-formula><mml:math id="M307" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> <inline-formula><mml:math id="M308" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (196.9 km, 10 h). 
Hatching denotes regions of parameter space where <inline-formula><mml:math id="M309" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is not significantly different from <inline-formula><mml:math id="M310" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p></caption>
            <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f05.png"/>

          </fig>

      <p id="d2e4695">We expect the number of co-location events to be approximately a linear function of <inline-formula><mml:math id="M311" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. At a given latitude, orbital ground tracks can be split into two sets, ascending and descending. Within each set, all orbital tracks are approximately parallel at a given latitude, so the number of events included in the analysis is the number of orbital tracks intersecting a line centred at Jülich, of length <inline-formula><mml:math id="M312" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi>R</mml:mi></mml:mrow></mml:math></inline-formula>, perpendicular to the orbital tracks. Given the rotational symmetry of repeating orbital tracks, the across-track density of orbits, <inline-formula><mml:math id="M313" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, can be approximated as constant at a given latitude, and an equation approximating <inline-formula><mml:math id="M314" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is given in Appendix <xref ref-type="sec" rid="App1.Ch1.S3"/>. The outcome is that <inline-formula><mml:math id="M315" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M316" display="inline"><mml:mo>∝</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M317" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. Given the logarithmic scaling of the colour map, and the plot coordinates, the smooth colour gradient seen in Fig. 
<xref ref-type="fig" rid="F5"/>a indicates that <inline-formula><mml:math id="M318" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is a polynomial function of <inline-formula><mml:math id="M319" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, consistent with the above arguments.</p>
      <p id="d2e4795">Figure <xref ref-type="fig" rid="F5"/>b shows the number of pairwise vertical profile comparisons, <inline-formula><mml:math id="M320" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, computed as

              <disp-formula id="Ch1.E14" content-type="numbered"><label>14</label><mml:math id="M321" display="block"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>ATL09</mml:mtext><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mtext>Cloudnet</mml:mtext><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M322" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> represents a co-location event, and <inline-formula><mml:math id="M323" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the number of vertical profiles from data source <inline-formula><mml:math id="M324" display="inline"><mml:mi>s</mml:mi></mml:math></inline-formula>, in co-location event <inline-formula><mml:math id="M325" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>, included in the analysis after the application of the co-location criteria. This can be thought of as the volume of data being compared – if the homogenisation process did not aggregate the data but instead compared individual observations in a pairwise fashion, this is the number of paired VCF profiles that would be available in the analysis.</p>
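      <p>Eq. (14) amounts to a sum of products over co-location events, which can be sketched in a few lines. The per-event profile counts below are invented for illustration and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def n_profiles(events):
    """Eq. (14): sum over events i of n_ATL09,i * n_Cloudnet,i.

    events: iterable of (n_atl09, n_cloudnet) profile counts, one per event.
    """
    return sum(n_atl09 * n_cloudnet for n_atl09, n_cloudnet in events)


# Illustrative counts only, not data from the study.
events = [(12, 40), (8, 40), (15, 38)]
n_events = len(events)
total = n_profiles(events)  # 480 + 320 + 570 = 1370
```
]]></preformat>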
      <p id="d2e4890">As with <inline-formula><mml:math id="M326" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, the smooth colour gradient in Fig. <xref ref-type="fig" rid="F5"/>b indicates <inline-formula><mml:math id="M327" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is approximately polynomial as a function of <inline-formula><mml:math id="M328" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M329" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>. We expect <inline-formula><mml:math id="M330" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> to be proportional to <inline-formula><mml:math id="M331" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, as for each co-location event, the number of included vertical profiles linearly scales with the duration of the time window of length <inline-formula><mml:math id="M332" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>. 
We expect <inline-formula><mml:math id="M333" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> to be proportional to <inline-formula><mml:math id="M334" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, one power coming from a proportionality to <inline-formula><mml:math id="M335" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, and the other deriving from the fact that for each given co-location event, the number of included vertical profiles scales as a function of <inline-formula><mml:math id="M336" display="inline"><mml:msqrt><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>-</mml:mo><mml:msubsup><mml:mi>r</mml:mi><mml:mtext>min</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:msqrt></mml:math></inline-formula>, which is approximately linear in <inline-formula><mml:math id="M337" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> in the limit of <inline-formula><mml:math id="M338" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M339" display="inline"><mml:mo>≫</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M340" display="inline"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mtext>min</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>. 
The results are consistent with <inline-formula><mml:math id="M341" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>profiles</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M342" display="inline"><mml:mo>∝</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M343" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi mathvariant="italic">τ</mml:mi></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e5086">Figure <xref ref-type="fig" rid="F5"/>c shows the mutual information calculated according to Sect. <xref ref-type="sec" rid="Ch1.S3.SS4"/> across all tested parametrisations. For each parametrisation <inline-formula><mml:math id="M344" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M345" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M346" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, a joint distribution of VCFs between the ATL09 and Cloudnet data could be plotted, akin to Fig. <xref ref-type="fig" rid="F3"/> (albeit in 100 dimensions), from which a value of the mutual information between the ATL09 and Cloudnet VCFs is computed. Mutual information values range from a minimum of 0.177 <inline-formula><mml:math id="M347" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.010 nats at <inline-formula><mml:math id="M348" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M349" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (500 km, 300 s), to a maximum of 0.533 <inline-formula><mml:math id="M350" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.020 nats with <inline-formula><mml:math id="M351" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> <inline-formula><mml:math id="M352" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (196.9 km, 10 h). The mutual information surface has a ridge of higher values where the global maximum is found. 
As <inline-formula><mml:math id="M353" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> moves away from <inline-formula><mml:math id="M354" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, the mutual information appears to decrease roughly monotonically. In the region of lower <inline-formula><mml:math id="M355" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M356" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> values, this can be explained as the mutual information estimator being data limited. The KSG estimator acts on the pairs of VCF profiles as though they are drawn from a 100-dimensional joint probability distribution. In order to learn structure in this joint distribution and compute larger mutual information values, a sufficient number of co-location events must contribute to the analysis, so that the 100-dimensional distribution can be sampled densely enough to infer the structure.</p>
      <p id="d2e5203">For larger values of <inline-formula><mml:math id="M357" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M358" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>, we expect the rate of errors as a result of co-location mismatch to increase. This will contaminate the VCF comparisons with uncorrelated and independent profile comparisons. As is shown in Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>, the inclusion of independent data in the comparison necessarily decreases the possible upper bound of the mutual information. The <inline-formula><mml:math id="M359" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> surface seen in Fig. <xref ref-type="fig" rid="F5"/>c is consistent with these expectations.</p>
      <p id="d2e5244">There is a region of <inline-formula><mml:math id="M360" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> near <inline-formula><mml:math id="M361" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> for which <inline-formula><mml:math id="M362" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is not significantly different from <inline-formula><mml:math id="M363" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> at a significance level of <inline-formula><mml:math id="M364" display="inline"><mml:mn mathvariant="normal">0.05</mml:mn></mml:math></inline-formula>, indicated by the hatching in Fig. <xref ref-type="fig" rid="F5"/>c. This region represents other possible choices of an optimal parametrisation that provide similarly informative comparisons between the ATL09 and Cloudnet VCF retrievals at Jülich. 
The region predominantly exists for values of <inline-formula><mml:math id="M365" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M366" display="inline"><mml:mo>≥</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M367" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, extending as far as <inline-formula><mml:math id="M368" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 240 km. The possible values of <inline-formula><mml:math id="M369" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> range from <inline-formula><mml:math id="M370" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 8 h (less than <inline-formula><mml:math id="M371" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>) to <inline-formula><mml:math id="M372" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 18 h (greater than <inline-formula><mml:math id="M373" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>). As seen in Fig. 
<xref ref-type="fig" rid="F5"/>a–b, the hatched regions correspond to input data volumes similar to or larger than that at <inline-formula><mml:math id="M374" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, indicating that the KSG estimator may cease to be data-limited before enough independent data are incorporated to contaminate the results, such that <inline-formula><mml:math id="M375" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> has attained its upper bound given the data distributions. There is also a smaller region of plausibly optimised parametrisations near <inline-formula><mml:math id="M376" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M377" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (150 km, 6 h), with lower associated data volumes.</p>
</sec>
<sec id="Ch1.S3.SS6.SSS2">
  <label>3.6.2</label><title>Mutual information at four Cloudnet observatories</title>
      <p id="d2e5439">We start by considering the <inline-formula><mml:math id="M378" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> surfaces for Munich, Jülich and Hyytiala, as seen in Fig. <xref ref-type="fig" rid="F6"/>a–c. Qualitatively, all three sites show a similar structure: a surface with a ridge of higher mutual information values that contains a single global maximum. These similarities in structure can be explained using the same arguments as in Sect. <xref ref-type="sec" rid="Ch1.S3.SS6.SSS1"/>, with regions in the parameter space where the KSG estimator is data-limited, and regions where the input data to the estimator are contaminated with independent samples. Despite these structural similarities, the optimised parametrisations, <inline-formula><mml:math id="M379" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, at each site differ, as do the magnitudes of <inline-formula><mml:math id="M380" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
Values for <inline-formula><mml:math id="M381" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, <inline-formula><mml:math id="M382" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, and other quantities at each Cloudnet observatory are given in Table <xref ref-type="table" rid="T1"/>.</p>

      <fig id="F6" specific-use="star"><label>Figure 6</label><caption><p id="d2e5537">The mutual information computed between ATL09 and Cloudnet VCF profiles as a function of co-location parametrisation <inline-formula><mml:math id="M383" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M384" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M385" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for the Cloudnet observatories at Munich <bold>(a)</bold>, Jülich <bold>(b)</bold>, Hyytiala <bold>(c)</bold> and Ny-Ålesund <bold>(d)</bold>. The location of the maximum mutual information value, <inline-formula><mml:math id="M386" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, is indicated by the crossing dashed lines, and hatching indicates regions in parameter space where <inline-formula><mml:math id="M387" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> does not differ significantly from <inline-formula><mml:math id="M388" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. 
Panel <bold>(b)</bold> is the same as Fig. <xref ref-type="fig" rid="F5"/>c.</p></caption>
            <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f06.png"/>

          </fig>

      <p id="d2e5648">Table <xref ref-type="table" rid="T1"/> shows that the optimised radius <inline-formula><mml:math id="M389" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> can vary on a per-site basis. For example, <inline-formula><mml:math id="M390" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>Jülich</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is larger than both <inline-formula><mml:math id="M391" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>Hyytiala</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M392" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>Munich</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>. One possible explanation for this is the relatively flat orography around the Jülich Cloudnet observatory, when compared (for example) to the proximity of the Munich observatory to the Alps. The mountainous orography of the Alps could result in smaller spatial scales over which local cloud formation is correlated, giving rise to smaller spatial informativity scales <inline-formula><mml:math id="M393" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> than at other locations like Jülich. 
The values of <inline-formula><mml:math id="M394" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> at Munich, Jülich and Hyytiala are of a similar order of magnitude, ranging from 4 to 10 h. These values are consistent with the temporal scale of cloud evolution found in other studies <xref ref-type="bibr" rid="bib1.bibx55 bib1.bibx56" id="paren.44"/>.</p>
      <p id="d2e5730">At Ny-Ålesund, the <inline-formula><mml:math id="M395" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> surface (see Fig. <xref ref-type="fig" rid="F6"/>d) shares some qualities with the mutual information surfaces seen at the other Cloudnet observatories – the surface has a single ridge of larger values, with <inline-formula><mml:math id="M396" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> decreasing as <inline-formula><mml:math id="M397" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> moves away from <inline-formula><mml:math id="M398" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, and has a value of <inline-formula><mml:math id="M399" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>Ny-Ålesund</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M400" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 6 h, which is of the same order of magnitude as the values of <inline-formula><mml:math id="M401" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo mathvariant="normal" 
stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> at the other Cloudnet sites. The mutual information values increase sharply as <inline-formula><mml:math id="M402" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> increases above <inline-formula><mml:math id="M403" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M404" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 50 km and as <inline-formula><mml:math id="M405" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> increases above <inline-formula><mml:math id="M406" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M407" display="inline"><mml:mo>∼</mml:mo></mml:math></inline-formula> 5 h. This sharp increase in the mutual information values results in the maximum <inline-formula><mml:math id="M408" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M409" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.607 <inline-formula><mml:math id="M410" display="inline"><mml:mo>±</mml:mo></mml:math></inline-formula> 0.020 nats occurring at <inline-formula><mml:math id="M411" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> <inline-formula><mml:math id="M412" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (60 km, 6 h). 
The value of <inline-formula><mml:math id="M413" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>Ny-Ålesund</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is lower than those found at the other three Cloudnet sites. This could partially be explained by the orography around Ny-Ålesund. The island of Spitsbergen, on which Ny-Ålesund is situated, is mountainous, with peaks as high as 1700 m. The proximity of the Cloudnet observatory at Ny-Ålesund to the mountainous orography could lead to a reduced spatial autocorrelation scale between the clouds observed at the Cloudnet site and those to the east that may be observed by ICESat-2. Thus, physically uncorrelated VCF comparisons would contaminate the mutual information calculation at smaller values of <inline-formula><mml:math id="M414" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> than at other sites, reducing the value of <inline-formula><mml:math id="M415" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>.</p>
      <p id="d2e5953">Another more subtle effect impacts the value of <inline-formula><mml:math id="M416" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> at each site. For a given parametrisation <inline-formula><mml:math id="M417" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, the KSG mutual information estimator accepts <inline-formula><mml:math id="M418" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> pairs of VCF profiles in order to estimate <inline-formula><mml:math id="M419" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. <inline-formula><mml:math id="M420" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is of a similar order of magnitude at Ny-Ålesund when compared to Hyytiala and Jülich, even with a substantially smaller value of <inline-formula><mml:math id="M421" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>. This is due to the local across-track density of orbits being an increasing function of latitude. 
As well as being linearly proportional to <inline-formula><mml:math id="M422" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M423" display="inline"><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mtext>events</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> has a functional dependency on latitude, which is derived in Appendix <xref ref-type="sec" rid="App1.Ch1.S3"/>. The normalised across-track orbital density, <inline-formula><mml:math id="M424" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, is given for each site in Table <xref ref-type="table" rid="T1"/>. The higher local density of orbits at Ny-Ålesund compared to the other sites allows for more data to be used in the estimation of <inline-formula><mml:math id="M425" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> at Ny-Ålesund than at Cloudnet observatories at lower latitudes. This could result in denser sampling of the <inline-formula><mml:math id="M426" display="inline"><mml:mn mathvariant="normal">100</mml:mn></mml:math></inline-formula>-dimensional joint probability distribution at lower values of <inline-formula><mml:math id="M427" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> for more poleward locations. Thus, the mutual information estimator, being able to infer structure in the joint probability distribution at smaller values of <inline-formula><mml:math id="M428" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, switches earlier from a data-limited regime to one that is sensitive to the inclusion of independent VCF samples. 
This could result in the estimated value <inline-formula><mml:math id="M429" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> being reduced from the maximum attainable value at lower values of <inline-formula><mml:math id="M430" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> at Ny-Ålesund than at other sites, as the structure has already been inferred by the KSG estimator, and the effect of contamination by independent data outweighs the inclusion of more VCF comparisons that may be related.</p>
      <p id="d2e6143">The hatched regions in Fig. <xref ref-type="fig" rid="F6"/> are unique to each of the four Cloudnet observatories, but the sites can be split into two sets: Jülich and Hyytiala, which are generally surrounded by flatter orography and have larger plausible extents in parameter space from which optimised co-location parametrisations can be selected; and Munich and Ny-Ålesund, which are much closer to mountainous orography and have substantially smaller regions of parameter space from which a plausibly optimised parametrisation can be selected. In the case of Ny-Ålesund, only a single tested parametrisation, specifically <inline-formula><mml:math id="M431" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>Ny-Ålesund</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, can be considered optimised at the chosen significance level. This split between the two sets suggests not only that the optimised parametrisations <inline-formula><mml:math id="M432" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> differ between locations, but also that the co-locations at each site are uniquely sensitive to the choice of <inline-formula><mml:math id="M433" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, with the sites located closest to mountainous orography being the most sensitive to the choice of co-location parametrisation.</p>
      <p id="d2e6179">By quantifying the mutual information encoded between our data, we learn where and when we should be selecting data around each Cloudnet observatory, and find that the spatial and temporal scales for data subsetting are different at each location. In identifying <inline-formula><mml:math id="M434" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, we are able to analyse the maximum volume of data while minimising the contamination of the results through the inclusion of independent data. We have demonstrated that the value of <inline-formula><mml:math id="M435" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> is influenced by local factors, such as mountainous orography near the surface-based observatories, and non-local factors such as the satellite sampling strategy <xref ref-type="bibr" rid="bib1.bibx52" id="paren.45"/>. The non-trivial shape of the <inline-formula><mml:math id="M436" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> surfaces computed for each Cloudnet observatory shows that optimising the parametrisation requires a full exploration of the parametrisation space, and that optimising each individual parameter independently will not adequately identify the true maximum in the estimated mutual information. Moreover, we have shown that a one-size-fits-all approach to selecting the co-location parametrisation is unsuitable. 
Using <inline-formula><mml:math id="M437" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>Ny-Ålesund</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> at the other Cloudnet observatories would reduce the number of permitted co-location events, reducing the data volume available for the subsequent analyses. Instead, if we chose <inline-formula><mml:math id="M438" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>Munich</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> for all Cloudnet observatories, the co-location at Ny-Ålesund would be degraded by the inclusion of independent samples, but the co-location at Jülich would conversely be degraded by a reduction in the available co-location events.</p>
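      <p>The difference between a full exploration of the parameter space and optimising each parameter independently can be sketched on a synthetic surface. The function <monospace>toy_surface</monospace>, the grid spacing and the baseline of (50 km, 1 h) below are all hypothetical choices made for illustration: the surface is a tilted ridge with a single maximum, not the estimated mutual information from this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

# Hypothetical parameter grid (not the grid used in the study).
radii = np.arange(10.0, 510.0, 10.0)   # R in km
taus = np.arange(1.0, 49.0, 1.0)       # tau in h

def toy_surface(R, tau):
    """Synthetic stand-in for an MI surface: a tilted ridge along R = 20 * tau
    whose crest peaks at (R, tau) = (160 km, 8 h)."""
    ridge = np.exp(-((R - 20.0 * tau) ** 2) / (2.0 * 30.0 ** 2))
    crest = np.exp(-((R - 160.0) ** 2) / (2.0 * 150.0 ** 2))
    return ridge * crest

R_grid, tau_grid = np.meshgrid(radii, taus, indexing="ij")
surface = toy_surface(R_grid, tau_grid)

# Joint optimisation: exhaustive search over the full (R, tau) grid.
i, j = np.unravel_index(np.argmax(surface), surface.shape)
p_joint = (radii[i], taus[j])
value_joint = surface[i, j]

# Independent optimisation: vary one parameter at a time while the other is
# held at a hypothetical baseline of (50 km, 1 h).
R0, tau0 = 50.0, 1.0
R_indep = radii[np.argmax(toy_surface(radii, tau0))]
tau_indep = taus[np.argmax(toy_surface(R0, taus))]
p_indep = (R_indep, tau_indep)
value_indep = toy_surface(R_indep, tau_indep)

print("joint search: ", p_joint, round(float(value_joint), 3))
print("one at a time:", p_indep, round(float(value_indep), 3))
```
]]></preformat>
      <p>On this toy ridge the exhaustive search recovers the maximum at (160 km, 8 h), whereas the one-at-a-time sweeps remain close to the baseline with a lower surface value, because each sweep only sees a slice across the ridge.</p>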
</sec>
<sec id="Ch1.S3.SS6.SSS3">
  <label>3.6.3</label><title>Dependency between ATL09 and Cloudnet VCFs for different co-location parametrisations</title>
      <p id="d2e6262">To demonstrate the importance of the choice of co-location parametrisation <inline-formula><mml:math id="M439" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> for the validation of satellite data, we compute confusion matrices and copulae between all pairs of VCF values in the ATL09-Cloudnet VCF profile pairs for the parametrisations given in Table <xref ref-type="table" rid="T2"/>. Two canonical choices of co-location parametrisation exist. Firstly, one could choose to only accept co-locations that minimise the spatiotemporal displacement between the data sources, at the expense of reducing the available data volume for subsequent analysis. This is represented by <inline-formula><mml:math id="M440" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M441" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (50 km, 30 min). The second canonical choice is to use all of the available data, in the hope that there are enough good data comparisons for the inclusion of independent data to have little impact. This is represented by <inline-formula><mml:math id="M442" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M443" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (500 km, 2 d).</p>

<table-wrap id="T2" specific-use="star"><label>Table 2</label><caption><p id="d2e6314">Results from the computation of copulae comparing VCF values between ATL09 and Cloudnet data for all the tested parametrisations. The accuracy of the agreement between the VCF retrievals for the categories nc, pc and tc (see Fig. <xref ref-type="fig" rid="F7"/>) is given as ACC. <inline-formula><mml:math id="M444" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>min⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> is the minimum copula density for the given parametrisation, <inline-formula><mml:math id="M445" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> is the maximum achieved copula density, and <inline-formula><mml:math id="M446" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the discretised tail dependence of the copula. RMSD is the root mean squared difference of the copula density from the independence copula density. Values in bold indicate the best parametrisation for the given metric (the notion of best being defined in the text).</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="8">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:colspec colnum="5" colname="col5" align="center"/>
     <oasis:colspec colnum="6" colname="col6" align="center"/>
     <oasis:colspec colnum="7" colname="col7" align="center"/>
     <oasis:colspec colnum="8" colname="col8" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">parametrisation</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M447" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> (km)</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M448" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> (h)</oasis:entry>
         <oasis:entry colname="col4">ACC</oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M449" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>min⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M450" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col7"><inline-formula><mml:math id="M451" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col8">RMSD</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M452" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">50.0</oasis:entry>
         <oasis:entry colname="col3">0.5</oasis:entry>
         <oasis:entry colname="col4">0.751</oasis:entry>
         <oasis:entry colname="col5">0.44</oasis:entry>
         <oasis:entry colname="col6">1.65</oasis:entry>
         <oasis:entry colname="col7">1.05</oasis:entry>
         <oasis:entry colname="col8">0.21</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M453" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">500.0</oasis:entry>
         <oasis:entry colname="col3">48</oasis:entry>
         <oasis:entry colname="col4">0.734</oasis:entry>
         <oasis:entry colname="col5">0.49</oasis:entry>
         <oasis:entry colname="col6">2.09</oasis:entry>
         <oasis:entry colname="col7">0.91</oasis:entry>
         <oasis:entry colname="col8">0.22</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M454" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">100.0</oasis:entry>
         <oasis:entry colname="col3">3</oasis:entry>
         <oasis:entry colname="col4"><bold>0.762</bold></oasis:entry>
         <oasis:entry colname="col5">0.50</oasis:entry>
         <oasis:entry colname="col6">1.86</oasis:entry>
         <oasis:entry colname="col7">1.21</oasis:entry>
         <oasis:entry colname="col8">0.24</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M455" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula></oasis:entry>
         <oasis:entry namest="col2" nameend="col3" align="center">(see Table <xref ref-type="table" rid="T1"/>) </oasis:entry>
         <oasis:entry colname="col4">0.759</oasis:entry>
         <oasis:entry colname="col5"><bold>0.41</bold></oasis:entry>
         <oasis:entry colname="col6"><bold>2.27</bold></oasis:entry>
         <oasis:entry colname="col7"><bold>1.70</bold></oasis:entry>
         <oasis:entry colname="col8"><bold>0.27</bold></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
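      <p>As a rough guide to how copula metrics of this kind can be computed, the sketch below builds a discretised empirical copula density from paired samples and derives the minimum, the maximum, the upper-corner cell and the root mean squared difference from the independence copula (which has density 1 everywhere). It rests on several assumptions that are not taken from this study: the number of bins, the rank-based pseudo-observations, the tie-breaking rule, the reading of <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> as the density of the upper-corner cell, and the synthetic Gaussian data used in place of VCF values.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def copula_density(x, y, n_bins=10):
    """Discretised empirical copula density of paired samples (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Pseudo-observations: ranks rescaled into (0, 1). Ties are broken by
    # order of appearance, which is an assumption of this sketch.
    u = (np.argsort(np.argsort(x, kind="stable"), kind="stable") + 0.5) / n
    v = (np.argsort(np.argsort(y, kind="stable"), kind="stable") + 0.5) / n
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    counts, _, _ = np.histogram2d(u, v, bins=[edges, edges])
    # Normalise so that the independence copula has density 1 in every cell.
    return counts / n * n_bins ** 2

def copula_metrics(density):
    """Summary metrics of a discretised copula density."""
    return {
        "c_min": float(density.min()),
        "c_max": float(density.max()),
        "c(1,1)": float(density[-1, -1]),   # upper-corner cell
        "RMSD": float(np.sqrt(np.mean((density - 1.0) ** 2))),
    }

# Synthetic data (hypothetical): one dependent pair and one independent pair.
rng = np.random.default_rng(0)
n_samples = 20000
x = rng.normal(size=n_samples)
noise = rng.normal(size=n_samples)
dependent = copula_metrics(copula_density(x, 0.8 * x + 0.6 * noise))
independent = copula_metrics(copula_density(x, noise))
print("dependent:  ", dependent)
print("independent:", independent)
```
]]></preformat>
      <p>Under these assumptions, a larger RMSD and a larger upper-corner density indicate a stronger departure from independence, while the independent pair yields a density close to 1 in every cell.</p>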

      <p id="d2e6612">Many previous studies use the co-location scheme outlined in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>. Many of them use values of <inline-formula><mml:math id="M456" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> that are low integer multiples of 50 km and, similarly, values of <inline-formula><mml:math id="M457" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> that are low integer multiples of 30 min <xref ref-type="bibr" rid="bib1.bibx47 bib1.bibx21 bib1.bibx44 bib1.bibx45 bib1.bibx51 bib1.bibx2 bib1.bibx64 bib1.bibx40 bib1.bibx34 bib1.bibx43 bib1.bibx39 bib1.bibx22 bib1.bibx25 bib1.bibx41 bib1.bibx26" id="paren.46"><named-content content-type="pre">e.g.</named-content></xref>. We therefore use <inline-formula><mml:math id="M458" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M459" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (100 km, 3 h) to represent a typical choice of co-location parametrisation from the literature.</p>
      <p id="d2e6656">The parametrisation <inline-formula><mml:math id="M460" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> represents the collection of all the co-location events across the four Cloudnet observatories, using each site-specific optimised co-location parametrisation (see Table <xref ref-type="table" rid="T1"/>).</p>
      <p id="d2e6671">Confusion matrices for the retrieval of no cloud (nc), partial cloud (pc) and total cloud (tc) VCF values between the ATL09 and Cloudnet data are given in Fig. <xref ref-type="fig" rid="F7"/>a–d. In all tested parametrisations, the cells corresponding to (nc, nc) and (pc, pc) are the two most probable states. The accuracy, being the sum of the confusion matrix diagonal elements where both retrievals agree, is given as ACC in Table <xref ref-type="table" rid="T2"/>. Across the tested parametrisations, the accuracy ranges between 0.73 and 0.76, with <inline-formula><mml:math id="M461" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> having the highest accuracy of 0.762. The contaminated-data regime given by <inline-formula><mml:math id="M462" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> has the lowest proportion of data falling in the (tc, tc) classification, and the highest proportion of data falling into the (pc, pc) classification. This is a result of the large integration scales for the larger <inline-formula><mml:math id="M463" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M464" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> values, decreasing the probability that all or none of the vertical profiles contain any cloud at a given height. 
<inline-formula><mml:math id="M465" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M466" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> produce similar confusion matrices, although <inline-formula><mml:math id="M467" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, typically having larger <inline-formula><mml:math id="M468" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M469" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> values than <inline-formula><mml:math id="M470" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, has less degeneracy in the VCF values and as a result has a higher proportion of (pc, pc) samples than <inline-formula><mml:math id="M471" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
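      <p>As an illustration of how the accuracy relates to the confusion matrix, the following minimal Python sketch builds a three-class joint probability table from two sets of VCF values and sums its diagonal. It assumes that nc, pc and tc correspond to VCF values equal to 0, strictly between 0 and 1, and equal to 1, respectively; the function names and the example values are hypothetical and are not taken from the analysis code used in this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np

CLASSES = ("nc", "pc", "tc")  # no cloud, partial cloud, total cloud


def classify_vcf(vcf):
    """Map VCF values in [0, 1] to class indices: 0 = nc, 1 = pc, 2 = tc.

    Assumption: nc means VCF == 0, tc means VCF == 1, pc is everything else.
    """
    vcf = np.asarray(vcf, dtype=float)
    labels = np.ones(vcf.shape, dtype=int)  # default to partial cloud
    labels[vcf == 0.0] = 0
    labels[vcf == 1.0] = 2
    return labels


def confusion_matrix(vcf_a, vcf_b):
    """Joint probability table: rows index retrieval A, columns retrieval B."""
    a = classify_vcf(vcf_a)
    b = classify_vcf(vcf_b)
    counts = np.zeros((3, 3))
    np.add.at(counts, (a, b), 1)  # accumulate one count per co-located pair
    return counts / counts.sum()


def accuracy(matrix):
    """Sum of the diagonal elements, where both retrievals agree."""
    return float(np.trace(matrix))


# Hypothetical VCF values, for illustration only
vcf_satellite = [0.0, 0.0, 0.5, 1.0, 0.3]
vcf_ground = [0.0, 0.2, 0.7, 1.0, 1.0]
matrix = confusion_matrix(vcf_satellite, vcf_ground)
print(dict(zip(CLASSES, np.diag(matrix))), accuracy(matrix))
```
      </preformat>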

      <fig id="F7" specific-use="star"><label>Figure 7</label><caption><p id="d2e6785">Confusion matrices for the detection of no cloud (nc), partial cloud (pc) and total cloud (tc) across all VCF values at all sites for the co-locations representing: a data-limited co-location, <inline-formula><mml:math id="M472" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(a)</bold>; a co-location contaminated by independent data, <inline-formula><mml:math id="M473" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(b)</bold>; a co-location typical of those in the literature, <inline-formula><mml:math id="M474" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(c)</bold>; and the per-site optimised co-location, <inline-formula><mml:math id="M475" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> <bold>(d)</bold>. The values of the co-location parametrisation vectors are given in Table <xref ref-type="table" rid="T2"/>. <bold>(e–h)</bold> Cumulative distribution functions for ATL09 and Cloudnet VCF values across all sites for the associated co-location parametrisations, conditional on the VCF values being strictly between 0 and 1 to remove degeneracies in the copula densities. <bold>(i–l)</bold> Copula density plots for the associated co-location parametrisations. 
The contour indicates copula densities of <inline-formula><mml:math id="M476" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula>, with higher densities indicating regions in <inline-formula><mml:math id="M477" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>,</mml:mo><mml:mi>v</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> space that are sampled more frequently than they would be if the variables <inline-formula><mml:math id="M478" display="inline"><mml:mi>U</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M479" display="inline"><mml:mi>V</mml:mi></mml:math></inline-formula> were independent.</p></caption>
            <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f07.png"/>

          </fig>

      <p id="d2e6896">Figure <xref ref-type="fig" rid="F7"/>e–h show the cumulative distribution functions transforming VCF values to their uniformly distributed copula coordinates. These are defined only for the data falling in the (pc, pc) classification so as to avoid degeneracies at <inline-formula><mml:math id="M480" display="inline"><mml:mn mathvariant="normal">0</mml:mn></mml:math></inline-formula> and <inline-formula><mml:math id="M481" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula>. The shapes of the curves indicate that in all cases, the density of ATL09 VCF samples decreases as a function of the ATL09 VCF value, concentrating the majority of samples at lower values. For <inline-formula><mml:math id="M482" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M483" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, the Cloudnet distribution functions are slightly inflected, as a result of having more VCF samples close in value to <inline-formula><mml:math id="M484" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> than the parametrisations <inline-formula><mml:math id="M485" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M486" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>. 
This is likely due to the smaller <inline-formula><mml:math id="M487" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> values for <inline-formula><mml:math id="M488" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M489" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> making the Cloudnet VCF values behave more like a binary variable than at larger <inline-formula><mml:math id="M490" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula> values.</p>
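      <p>The transformation from VCF values to copula coordinates can be sketched as follows, using an empirical cumulative distribution function evaluated on the pairs in which both VCF values lie strictly between 0 and 1. This is an illustrative sketch only: the function names are hypothetical, and the division by the number of samples plus one (which keeps the coordinates inside the open unit interval) is a common convention that we assume here rather than one specified in this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def empirical_cdf(values):
    """Evaluate the empirical CDF of `values` at each of its own samples.

    Assumption: the count is divided by n + 1 so results stay inside (0, 1).
    """
    values = np.asarray(values, dtype=float)
    ordered = np.sort(values)
    # Number of samples less than or equal to each value (ties share a rank)
    counts = np.searchsorted(ordered, values, side="right")
    return counts / (values.size + 1)


def copula_coordinates(vcf_a, vcf_b):
    """Return (u, v) for pairs where both VCF values are strictly in (0, 1)."""
    a = np.asarray(vcf_a, dtype=float)
    b = np.asarray(vcf_b, dtype=float)
    # Keep only the (pc, pc) pairs to avoid the degeneracies at 0 and 1
    keep = np.logical_and.reduce([a > 0, 1 > a, b > 0, 1 > b])
    return empirical_cdf(a[keep]), empirical_cdf(b[keep])


# Hypothetical VCF values, for illustration only
u, v = copula_coordinates(
    [0.0, 0.2, 0.5, 0.9, 1.0, 0.4],
    [0.1, 0.3, 0.6, 0.8, 0.5, 0.0],
)
print(u, v)
```
      </preformat>
      <p>A two-dimensional histogram of the resulting coordinates, normalised to integrate to one over the unit square, would then give a simple estimate of the copula density.</p>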
      <p id="d2e7003">Figure <xref ref-type="fig" rid="F7"/>i–l show the copula densities for the tested parametrisations. <inline-formula><mml:math id="M491" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, being data-limited, produces a noisy copula density surface, indicating that the relationship between the ATL09 and Cloudnet VCF distributions is not close to being one-to-one. The other parametrisations all have comparably smooth surfaces, with a well-defined ridge of higher densities around the line <inline-formula><mml:math id="M492" display="inline"><mml:mi>u</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M493" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M494" display="inline"><mml:mi>v</mml:mi></mml:math></inline-formula>. This shows that for <inline-formula><mml:math id="M495" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M496" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M497" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, ATL09 and Cloudnet VCF values typically trend together, albeit in a non-linear fashion due to the unequal cumulative distribution functions shown in Fig. <xref ref-type="fig" rid="F7"/>e–h.</p>
      <p id="d2e7076">The copula density associated with <inline-formula><mml:math id="M498" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is the smoothest, due to being generated from the highest number of contributing samples. Despite this, the upper-right corner shows that <inline-formula><mml:math id="M499" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M500" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.91. The value of <inline-formula><mml:math id="M501" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> being less than <inline-formula><mml:math id="M502" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> indicates that the two retrievals of VCF values disagree on when the most extreme VCF values occur, sampling this case less frequently than if the Cloudnet and ATL09 retrievals were independent. This only happens for <inline-formula><mml:math id="M503" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. This is an undesirable characteristic in the comparison of the retrievals, and hints at the contamination of the comparison by VCF profiles that are independent due to the large spatio-temporal domain within which the co-locations happen. 
Considering the value of <inline-formula><mml:math id="M504" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for different <inline-formula><mml:math id="M505" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M506" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> produces the copula with the highest density in the upper-right tail.</p>
      <p id="d2e7187">We can also show that the copula density associated with <inline-formula><mml:math id="M507" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> yields the smallest minimum copula density value, <inline-formula><mml:math id="M508" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>min⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M509" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.41, and the highest maximum copula density value, <inline-formula><mml:math id="M510" display="inline"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M511" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 2.27, across the tested parametrisations. The RMSD values indicate that the <inline-formula><mml:math id="M512" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M513" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> co-location parametrisations are better than utilising all available data, or limiting the co-locations in order to naively reduce the rate of co-location mismatch. 
We see that <inline-formula><mml:math id="M514" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> yields the copula with the highest RMSD value of the tested parametrisations, with RMSD <inline-formula><mml:math id="M515" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.27. This indicates that the dependency structure of the relationship between the VCF distributions from Cloudnet and ATL09 data is stronger with <inline-formula><mml:math id="M516" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> than with <inline-formula><mml:math id="M517" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> or any of the other tested parametrisations.</p>
</sec>
</sec>
<sec id="Ch1.S3.SS7">
  <label>3.7</label><title>Vertical bias profiles</title>
      <p id="d2e7305">Figure <xref ref-type="fig" rid="F8"/>a–d show the bias distributions between the ATL09 and Cloudnet VCF profiles as a function of height. One common feature across all parametrisations is that the expected bias is negative for heights <inline-formula><mml:math id="M518" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 8 km, and is positive for higher altitudes (<inline-formula><mml:math id="M519" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M520" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 10 km). This indicates that the ATLAS lidar is observing more cloud presence than Cloudnet at higher altitudes (and vice versa at lower altitudes). This could be explained by the viewing geometries (i.e. ICESat-2 viewing clouds from above and Cloudnet viewing clouds from below) and the effects of signal attenuation on the retrievals, and is consistent with comparisons of other vertically resolved satellite retrievals of cloud presence against surface observations <xref ref-type="bibr" rid="bib1.bibx32" id="paren.47"><named-content content-type="pre">e.g.</named-content></xref>.</p>

      <fig id="F8" specific-use="star"><label>Figure 8</label><caption><p id="d2e7338">Bias distributions between ATL09 and Cloudnet VCF profiles as a function of height for the parametrisations <inline-formula><mml:math id="M521" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(a)</bold>, <inline-formula><mml:math id="M522" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(b)</bold>, <inline-formula><mml:math id="M523" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <bold>(c)</bold> and <inline-formula><mml:math id="M524" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> <bold>(d)</bold>. Expected bias profiles are given as coloured dashed lines. <bold>(e)</bold> The expected bias profiles for the different parametrisations plotted together, using the same line styles and markers as in their individual panels. <bold>(f)</bold> The variance in the bias distributions as a function of height.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f08.png"/>

        </fig>

      <p id="d2e7409">Figure <xref ref-type="fig" rid="F8"/>e shows the expected bias profiles for the tested parametrisations as a function of height. In all cases, the bias is negative for <inline-formula><mml:math id="M525" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M526" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 8 km, indicating that ICESat-2 is less sensitive to clouds at these altitudes than Cloudnet is. Similarly, in all cases the biases are positive for <inline-formula><mml:math id="M527" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M528" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 10 km, showing that ATL09 is reporting more cloud at higher altitudes than Cloudnet is. This is consistent with results from other comparisons of vertically resolved satellite retrievals of cloud presence against surface-based observations <xref ref-type="bibr" rid="bib1.bibx32" id="paren.48"/>. Although the expected profiles are all qualitatively similar, the height at which the bias changes sign differs between the parametrisations. <inline-formula><mml:math id="M529" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> has the lowest bias transition height, changing from negative to positive bias at a height of 7.9 km. 
Above this, both <inline-formula><mml:math id="M530" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M531" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> have bias transition heights around 8.5 km (their transitions occur within one histogram height bin of each other). <inline-formula><mml:math id="M532" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> has the highest bias transition height of 9.4 km.</p>
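      <p>One way to locate a bias transition height from an expected bias profile is sketched below. It assumes the profile is sampled at increasing heights and uses linear interpolation between the two samples that bracket the first change from negative to non-negative bias; the interpolation and the function name are illustrative assumptions, since the transition heights quoted here are resolved at the level of the histogram height bins.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def bias_transition_height(z, bias):
    """Height of the first negative-to-positive sign change in `bias`.

    `z` holds heights in increasing order and `bias` the expected bias at
    each height. Returns None when no such sign change exists.
    Assumption: linear interpolation between the two bracketing samples.
    """
    z = np.asarray(z, dtype=float)
    bias = np.asarray(bias, dtype=float)
    # Indices i where bias[i] is negative and bias[i + 1] is non-negative
    crossings = np.flatnonzero(np.logical_and(0 > bias[:-1], bias[1:] >= 0))
    if crossings.size == 0:
        return None
    i = crossings[0]
    fraction = -bias[i] / (bias[i + 1] - bias[i])
    return float(z[i] + fraction * (z[i + 1] - z[i]))


# Hypothetical profile (heights in km), for illustration only
heights = [0, 2, 4, 6, 8, 10]
expected_bias = [-0.3, -0.2, -0.1, -0.05, -0.02, 0.02]
print(bias_transition_height(heights, expected_bias))
```
      </preformat>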
      <p id="d2e7490">Figure <xref ref-type="fig" rid="F8"/>f shows the variance of the bias distributions for the different parametrisations as a function of height. <inline-formula><mml:math id="M533" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> has a consistently lower variance across all heights when compared to <inline-formula><mml:math id="M534" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">00</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M535" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mtext>lit.</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>. <inline-formula><mml:math id="M536" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> around <inline-formula><mml:math id="M537" display="inline"><mml:mi>z</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M538" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0 has a similar variance to the other parametrisations, but above this height, the variance reduces to a nearly constant value around 0.2 for heights above 2 km.</p>
      <p id="d2e7553">Simply from observing the expected bias profiles and the variance profiles, one may deduce that the selection of <inline-formula><mml:math id="M539" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> gives the best comparison between the data, as the magnitude of its expected bias and its variance are the lowest across all parametrisations. However, in using the expected bias and the variance to determine the choice of <inline-formula><mml:math id="M540" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, we would necessarily bias our results to be closer to <italic>ideal</italic> values. As was shown in Sect. <xref ref-type="sec" rid="Ch1.S3.SS6.SSS3"/>, choosing <inline-formula><mml:math id="M541" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> over other parametrisations includes more comparisons of independent data in the analysis than choosing <inline-formula><mml:math id="M542" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>. If we were to use <inline-formula><mml:math id="M543" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mn mathvariant="normal">11</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, corrective factors for the bias could be too small in magnitude, and the uncertainty budget of the VCF profiles might be underestimated. 
Thus, we conclude that the metrics computed between the data to be compared should not be used to assess the quality of the co-location, and that the parametrisation should instead be evaluated by maximising the mutual information between the data to be compared.</p>
      <p id="d2e7612">By incorrectly choosing <inline-formula><mml:math id="M544" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, we risk one of two outcomes: a degradation in the quality of the comparisons between the data, with quantitatively different results arising from the difference in the input data; or quantitatively similar results to those found when using <inline-formula><mml:math id="M545" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, which may arise from competing erroneous effects due to the inclusion of independent data or the rejection of dependent data.</p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Discussion</title>
      <p id="d2e7643">This paper presents a unified framework for determining an optimised parametrisation for the spatiotemporal co-location of geospatial data, to be applied before comparative analyses or data synthesis are performed. We utilise mutual information as a domain- and data-agnostic metric quantifying the quality of a data co-location, independent of the metrics typically used in subsequent analyses. Selecting the co-location parametrisation by optimising the comparison metrics of the analysis risks biasing the validation results towards the best attainable values given the data. As such, the comparison metric cannot be used to assess the quality of the co-location of the data used to compute it. By definition, the value of the comparison metric is not known in advance; parametrising the data co-location to maximise or optimise that metric amounts to imposing a prior on the validation metric that encodes the value we would like the result to have. Our framework allows the data to be assessed independently of the co-location, reducing the effects of sample bias induced by a poor co-location.</p>
      <p id="d2e7646">We have demonstrated how the framework can be used for a novel data comparison, defining a co-location scheme to apply to ICESat-2 ATL09-derived VCFs and Cloudnet-derived VCFs. Using a grid search, we identified an optimised co-location parametrisation <inline-formula><mml:math id="M546" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> for each Cloudnet observatory and, with a basic comparison between the VCF values, showed that this parametrisation produces better comparisons of the data than a typically used parametrisation or naive choices that maximise or minimise the volume of data used. Still, some important aspects of this framework need addressing.</p>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>The choice of mutual information estimator</title>
      <p id="d2e7666">In this study, we utilised the adaptation of the KSG estimator <xref ref-type="bibr" rid="bib1.bibx18" id="paren.49"/> proposed by <xref ref-type="bibr" rid="bib1.bibx15" id="text.50"/>. We chose this implementation of a mutual information estimator as it allows for the mutual information to be computed between distributions of arbitrary dimension, and the development of variance estimation for the mutual information estimator allows us to determine regions in parameter space within which <inline-formula><mml:math id="M547" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> may lie, as opposed to identifying a single value for <inline-formula><mml:math id="M548" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>.</p>
      <p id="d2e7695">By accepting data of arbitrary dimension, the KSG estimator is widely applicable within the Earth sciences community. Care is needed to properly implement and interpret the outputs of the estimator, but this is the case for all mutual information estimators.</p>
      <p id="d2e7698">We believe the KSG estimator is at present a suitable choice of estimator for many problems, but other estimators may be developed, or be shown to be more robust for certain data. In these cases, it is up to the researcher's judgement to decide which estimator is most appropriate for their analysis.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Physical interpretation of <inline-formula><mml:math id="M549" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover></mml:math></inline-formula></title>
      <p id="d2e7719">Ascribing physical meaning to the values of the parameters in <inline-formula><mml:math id="M550" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> may be tempting, as they define a spatio-temporal region where data falling within the region maximises the mutual information between the Cloudnet and ATL09 VCF retrievals. The values of <inline-formula><mml:math id="M551" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M552" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> will be intimately linked to the spatial and temporal scales of cloud evolution at each given Cloudnet site. However, due to the high degree of non-linearity in the mutual information estimation, relating <inline-formula><mml:math id="M553" display="inline"><mml:mover accent="true"><mml:mi>R</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="M554" display="inline"><mml:mover accent="true"><mml:mi mathvariant="italic">τ</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> to well-defined concepts such as autocorrelation scales is theoretically  challenging. 
This work does not concern itself with elucidating the physical meaning of the values associated with <inline-formula><mml:math id="M555" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, but further work could allow empirical relationships between the components of <inline-formula><mml:math id="M556" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> and other well-defined quantities to be identified, opening up new methods for the evaluation of these different quantities.</p>
      <p id="d2e7793">As was shown in Sect. <xref ref-type="sec" rid="Ch1.S3.SS6.SSS2"/>, the extent of the regions from which optimised co-location parametrisations can be selected is site dependent, and depends on local factors such as orography, as well as factors relating to the sampling strategy at each site. As well as inferring physical meaning from the parameters in <inline-formula><mml:math id="M557" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula>, work could be done to model how the plausible region of optimised co-location parametrisations depends on the local environment, which would allow for planning of sampling strategies in advance of satellite missions to capitalise on maximising mutual information at different locations where reference data is recorded.</p>
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Choice of co-location scheme</title>
      <p id="d2e7817">We demonstrated the use of a simple co-location scheme, only considering (spatially) the separation between the ATL09 data and the Cloudnet observatory. Even with this simplified treatment of the spatial distribution of clouds, we were able to show an improvement of the comparison metrics calculated between the ATL09 and Cloudnet data, simply by choosing to use the co-location parametrisation <inline-formula><mml:math id="M558" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> over other co-location parametrisations.</p>
      <p id="d2e7830">However, being simplified, the co-location scheme described in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/> still allows comparisons between independent VCF profiles. The scheme could be augmented with additional co-location criteria and parameters, in order to encode more a priori knowledge that constrains the data comparisons being permitted. As an example, <xref ref-type="bibr" rid="bib1.bibx25" id="text.51"/> compare CALIPSO cloud layer boundaries against those identified by a ceilometer at the Eastern North Atlantic (ENA) ARM observatory, located on the Azores. As well as subsetting data according to the co-location scheme described in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>, using <inline-formula><mml:math id="M559" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M560" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> (150 km, 1 h), co-location events are further subset based on the prevailing wind direction at the time of closest approach, in order to reduce the contamination of the analysis by orographically disturbed cloud layers. The approach introduces two angular windows, one used if CALIPSO passes to the east of the ENA observatory, and one used if CALIPSO passes to the west. Each angular window is defined by two bounding angles; if the wind is blowing from a direction within the window, the co-location event is excluded from the analysis. In <xref ref-type="bibr" rid="bib1.bibx25" id="text.52"/>, the angles are chosen as cardinal directions, and the windows as a result subtend 90° each.
In this framework, the angles defining the edges of these windows could each become a free parameter, resulting in the 6-dimensional parameter space given by <inline-formula><mml:math id="M561" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M562" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M563" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mtext>east,min</mml:mtext></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mtext>east,max</mml:mtext></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mtext>west,min</mml:mtext></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mtext>west,max</mml:mtext></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The more complicated parameter space and co-location criteria may allow higher mutual information between the datasets to be achieved, if the simpler co-location scheme has a systematic shortcoming that admits independent samples into the analysis regardless of the choice of parametrisation.</p>
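      <p>As an illustration only, the six-parameter criteria described above could be sketched in Python as follows. The function names, the argument layout and the example window angles are hypothetical choices made for this sketch and are not taken from the cited study; angles are assumed to be compass bearings in degrees, with each window running clockwise from its minimum to its maximum angle.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def in_angular_window(wind_from_deg, theta_min_deg, theta_max_deg):
    """Return True if a wind direction lies inside a clockwise angular window.

    The window runs clockwise from theta_min_deg to theta_max_deg and may
    wrap through north (e.g. 315 to 45 degrees). A window whose two angles
    coincide is treated as having zero width.
    """
    span = (theta_max_deg - theta_min_deg) % 360.0
    offset = (wind_from_deg - theta_min_deg) % 360.0
    return offset <= span


def keep_event(separation_km, lag_h, wind_from_deg, pass_side, params):
    """Apply the hypothetical six-parameter co-location criteria.

    params = (R, tau, theta_east_min, theta_east_max,
              theta_west_min, theta_west_max)
    pass_side is "east" or "west" (assumed labels for the side on which
    the satellite passes the observatory).
    """
    R, tau, e_min, e_max, w_min, w_max = params
    # Spatial and temporal criteria of the simple (R, tau) scheme.
    if separation_km > R or abs(lag_h) > tau:
        return False
    window = (e_min, e_max) if pass_side == "east" else (w_min, w_max)
    # An event is excluded when the wind blows from inside the window.
    return not in_angular_window(wind_from_deg, *window)


# Illustrative parametrisation: R = 150 km, tau = 1 h, two 90 degree windows.
example_params = (150.0, 1.0, 0.0, 90.0, 180.0, 270.0)
kept = keep_event(100.0, 0.5, 45.0, "west", example_params)
```
]]></preformat>
      <p>Under this sketch, each of the four window angles is simply another component of the parametrisation, so the same mutual information maximisation can in principle be applied to all six components at once.</p>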
      <p id="d2e7917">In our demonstration, the 2-dimensional parameter space was explored by a grid search method to compute mutual information values across a range of parametrisations. Higher dimensional parameter spaces come at the cost of increasing computational overhead, and grid search methods scale exponentially with the number of dimensions. Thus, identifying optimised parametrisations in higher dimensional parameter spaces may require the use of methods such as stochastic gradient descent (in this case, minimising negative mutual information) to efficiently explore the possible parametrisations and identify suitable choices of <inline-formula><mml:math id="M564" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>.</p>
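      <p>The scaling argument can be made concrete with a minimal grid search sketch. The objective below is a hypothetical smooth stand-in for an estimated mutual information surface, with an arbitrarily placed peak; in a real application it would be replaced by a mutual information estimate computed from the data co-located with each parametrisation. The axis values are likewise assumptions chosen for illustration.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import itertools
import math


def toy_mutual_information(R, tau):
    """Hypothetical stand-in for an estimated mutual information surface.

    Peaks at R = 100 km and tau = 2 h; these values are arbitrary.
    """
    return math.exp(-((R - 100.0) / 80.0) ** 2 - ((tau - 2.0) / 1.5) ** 2)


def grid_search(objective, axes):
    """Evaluate the objective at every grid node.

    Returns (best_value, best_params, n_evaluations).
    """
    best_value, best_params, n_eval = -math.inf, None, 0
    for params in itertools.product(*axes):
        value = objective(*params)
        n_eval += 1
        if value > best_value:
            best_value, best_params = value, params
    return best_value, best_params, n_eval


R_axis = [25.0 * k for k in range(1, 13)]    # 25, 50, ..., 300 km
tau_axis = [0.5 * k for k in range(1, 13)]   # 0.5, 1.0, ..., 6.0 h
best_value, p_hat, n_eval = grid_search(toy_mutual_information,
                                        [R_axis, tau_axis])

# Number of evaluations needed with 12 nodes per axis in d dimensions.
cost_by_dimension = {d: 12 ** d for d in (2, 4, 6)}
```
]]></preformat>
      <p>With 12 nodes per axis, the 2-dimensional search requires 144 evaluations of the objective, whereas a 6-dimensional search at the same resolution would require 2 985 984, each of which involves a full co-location and mutual information estimate.</p>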
</sec>
<sec id="Ch1.S4.SS4">
  <label>4.4</label><title>Applicability of the framework</title>
      <p id="d2e7938">This framework is widely applicable to problems that require the use of multiple data products on non-homogenised coordinates. In this study, the framework is demonstrated through a comparison of cloud presence data products. Mutual information can (with the Holmes estimator) be computed between data of arbitrary dimension. Thus, validations of scalar quantity retrievals (e.g. aerosol optical depth) and vector quantity retrievals (e.g. VCFs) are both possible.</p>
      <p id="d2e7941">Vector quantity retrievals are equivalent to the joint retrieval of two geophysical fields between distinct data sources. For example, the joint retrieval of temperature and pressure could be considered a vector quantity and can thus be compared between data sources.</p>
      <p id="d2e7944">By adapting the co-location scheme, data can be matched between (for example) two different satellite platforms (see Fig. <xref ref-type="fig" rid="F1"/>c). This is important for facilitating analyses that characterise the differences between the retrievals of the same quantities by different satellites, better characterising uncertainties induced in long-term records of geophysical quantities.</p>
      <p id="d2e7949">The focus of the study need not be comparisons of satellite retrievals. The framework could be used to compare data from other mobile platforms, such as planes and ships. Outputs from general circulation models could be compared against surface-based or satellite observations, with the co-location schemes matching data from the model grid to the real data.</p>
      <p id="d2e7953">The framework of maximising mutual information would be useful in the synthesis of multi-sensor retrievals over large spatial extents. With given time and length scales over which mutual information between data is high, data from multiple sources could be optimally combined to increase spatial or temporal coverage of satellite data, or to better characterise the uncertainties of satellite data via comparison with sufficiently nearby surface-based observations.</p>
      <p id="d2e7956">The co-location of data can also be extended to include more than two sources of data. Triple co-location <xref ref-type="bibr" rid="bib1.bibx31" id="paren.53"><named-content content-type="pre">e.g.</named-content></xref> is a method that already utilises three unique data sources to characterise the uncertainty and bias of retrievals with respect to the unknown true value. As long as a co-location scheme and homogenisation process can be defined that incorporates criteria combining more than two data sources, the framework can be utilised to identify the optimised choice of parametrisation for maximising the mutual information contained between the used data.</p>
      <p id="d2e7964">Maximising the mutual information between data is also essential for producing high quality labelled pairs of samples that can be used as the supervised training data for deep learning models. In order to produce accurate mappings from one data product to another, a deep machine learning approach requires not only high quality data but also a sufficient volume of samples from which to learn the mapping. Generating paired data by maximising the mutual information produces data of high quality with limited contamination from independent data, whilst also ensuring that enough data is present for the structure in the joint distribution to be identified. The joint distribution structure is the probabilistic mapping from one data source to the other that is to be learned.</p>
</sec>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Summary</title>
      <p id="d2e7976">To summarise this work, we have proposed a data- and domain-agnostic framework that allows for the parameters determining the co-location of arbitrary data to be objectively optimised using the mutual information between the co-located data as an independent metric to assess the quality of the co-location. This is in contrast to using the validation or comparison metrics of subsequent analyses on the co-located data to assess the quality of the co-location, as is often done.</p>
      <p id="d2e7979">Correctly identifying the co-location parametrisation is crucial, as it determines the data available for all subsequent analysis and comparisons between the data. Parametrisations are multi-dimensional, and the effects of changing the parametrisation along individual dimensions are often non-separable. Thus, sub-optimally selecting individual components of the parametrisation will degrade the subsequent analyses, either through the comparison of independent data or by reducing the number of permitted dependent samples. Random or naive choices of the co-location parametrisation will almost certainly be sub-optimal, and the estimated mutual information surfaces are non-trivially dependent on the choice of parametrisation. We have shown that a one-size-fits-all approach to choosing the co-location parametrisation will likely be inappropriate when comparing data from different locations, due to myriad local effects impacting the spatiotemporal variability of the geophysical fields being measured. We have also shown that using the optimised co-location parametrisations we define yields better relationships between the data to be compared than naive choices of co-location parametrisations.</p>
      <p id="d2e7982">We demonstrated the application of this framework by comparing ICESat-2 ATL09 vertically resolved cloud retrievals against Cloudnet retrievals from four observatories. We computed mutual information surfaces as a function of the co-location parametrisation <inline-formula><mml:math id="M565" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>, and were able to identify site-specific optimised parametrisations <inline-formula><mml:math id="M566" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula> at each observatory. A basic comparison of ATL09 and Cloudnet VCF profiles for different parametrisations showed that, using our definition of optimised data co-location, comparisons between the data were improved over naive choices of co-location parametrisations, as well as over a parametrisation typical of those used in the literature.</p>
      <p id="d2e8002">All that the framework requires in order to be implemented is: a co-location scheme with well-defined criteria, described by variable parametrisations <inline-formula><mml:math id="M567" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula>; a mutual information estimator that is appropriate for the data being compared; and a method for sampling different choices of <inline-formula><mml:math id="M568" display="inline"><mml:mi mathvariant="bold-italic">p</mml:mi></mml:math></inline-formula> such that the parametrisation that maximises the mutual information, <inline-formula><mml:math id="M569" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold-italic">p</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover></mml:math></inline-formula>, will be identified (grid search or an implementation of gradient ascent). The framework is adaptable and widely applicable, with applications in satellite validations, satellite inter-comparisons, model validation, multi-sensor data synthesis and the production of labelled training data for deep learning methods.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Mutual information bounds for spatially inhomogeneous joint distributions</title>
      <p id="d2e8040">One of the concepts underpinning the framework described in Sect. <xref ref-type="sec" rid="Ch1.S2"/> is that the relationship between two retrievals depends on the spatial and temporal displacement between where the retrievals are made, and that a region within which an optimum comparison can be made exists. Geophysical fields are spatially inhomogeneous, so we can assume that the joint probability distribution relating two retrievals of geophysical fields is also spatially inhomogeneous.</p>
      <p id="d2e8045">Let us assume that, for all displacements between the retrievals of two variables <inline-formula><mml:math id="M570" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M571" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, denoted <inline-formula><mml:math id="M572" display="inline"><mml:mi mathvariant="bold-italic">r</mml:mi></mml:math></inline-formula>, the joint probability distribution relating <inline-formula><mml:math id="M573" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M574" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> can be described as <inline-formula><mml:math id="M575" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. This distribution will have some mutual information associated with it.</p>
      <p id="d2e8108">We will use the notation I<inline-formula><mml:math id="M576" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> to refer to the mutual information encoded between <inline-formula><mml:math id="M577" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M578" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, and I<inline-formula><mml:math id="M579" display="inline"><mml:mrow><mml:mfenced open="[" close="]"><mml:mrow><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> to refer to the mutual information encoded by the probability distribution <inline-formula><mml:math id="M580" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Thus, assuming that the joint probability <inline-formula><mml:math id="M581" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is fixed, we can write

          <disp-formula id="App1.Ch1.S1.E15" content-type="numbered"><label>A1</label><mml:math id="M582" display="block"><mml:mrow><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mfenced close="]" open="["><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mtext>i</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e8262">If we co-locate <inline-formula><mml:math id="M583" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M584" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> by sampling with a density <inline-formula><mml:math id="M585" display="inline"><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> across a domain <inline-formula><mml:math id="M586" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula>, the joint probability distribution relating <inline-formula><mml:math id="M587" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M588" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> will be

          <disp-formula id="App1.Ch1.S1.E16" content-type="numbered"><label>A2</label><mml:math id="M589" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mo>∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mo>∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

        which represents a sample density weighted volume average of the probability distributions associated with all locations within the co-location domain <inline-formula><mml:math id="M590" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula>.</p>
      <p id="d2e8411">Letting the integral of <inline-formula><mml:math id="M591" display="inline"><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> over <inline-formula><mml:math id="M592" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula> be <inline-formula><mml:math id="M593" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula>, the mutual information associated with the co-location within the domain <inline-formula><mml:math id="M594" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula> can be expressed as

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M595" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E17"><mml:mtd><mml:mtext>A3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mfenced open="[" close="]"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E18"><mml:mtd><mml:mtext>A4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mfenced close="]" open="["><mml:mrow><mml:mspace linebreak="nobreak" width="0.125em"/><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace 
linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e8581">Mutual information can be expressed as the Kullback–Leibler (KL) divergence between a joint probability distribution and the product of its marginal distributions. It can also be shown that the KL divergence is jointly convex in its pair of arguments <xref ref-type="bibr" rid="bib1.bibx57" id="paren.54"><named-content content-type="pre">e.g.</named-content><named-content content-type="post">proof 148</named-content></xref>. By extension, mutual information is convex in its arguments. This can be expressed through Jensen's inequality <xref ref-type="bibr" rid="bib1.bibx17" id="paren.55"/>:

          <disp-formula id="App1.Ch1.S1.E19" content-type="numbered"><label>A5</label><mml:math id="M596" display="block"><mml:mrow><mml:mtext>I</mml:mtext><mml:mfenced close="]" open="["><mml:mrow><mml:mi mathvariant="italic">κ</mml:mi><mml:mi>p</mml:mi><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mi>q</mml:mi></mml:mrow></mml:mfenced><mml:mo>≤</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:mi>p</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:mi>q</mml:mi><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M597" display="inline"><mml:mi mathvariant="italic">κ</mml:mi></mml:math></inline-formula> is a constant between 0 and 1 describing a mixture between probability distributions <inline-formula><mml:math id="M598" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M599" display="inline"><mml:mi>q</mml:mi></mml:math></inline-formula>. The mutual information for the linear combination of probability distributions is less than or equal to the same linear combination of the mutual informations of the individual distributions. The inequality can be extended to a normalised weighted sum of multiple distributions. Thus, using Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E18"/>), we can express an inequality for I<inline-formula><mml:math id="M600" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> as

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M601" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E20"><mml:mtd><mml:mtext>A6</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mfenced open="[" close="]"><mml:mrow><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E21"><mml:mtd><mml:mtext>A7</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>≤</mml:mo><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi 
mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E22"><mml:mtd><mml:mtext>A8</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>≤</mml:mo><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mi mathvariant="script">S</mml:mi></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∫</mml:mo><mml:mi 
mathvariant="script">S</mml:mi></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo><mml:mtext>i</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi mathvariant="script">S</mml:mi></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e8963">Thus, if the mutual information between <inline-formula><mml:math id="M602" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M603" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> can be described by a function <inline-formula><mml:math id="M604" display="inline"><mml:mrow><mml:mtext>i</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for samples recorded with a given displacement <inline-formula><mml:math id="M605" display="inline"><mml:mi mathvariant="bold-italic">r</mml:mi></mml:math></inline-formula>, the total mutual information when co-locating data within a domain <inline-formula><mml:math id="M606" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula> is bounded above by the volume- and sample-density-weighted sum of all contributions.</p>
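      <p>The bound can be checked numerically for a simple discrete case. The sketch below mixes a dependent population with an independent one; the probability tables are hypothetical values chosen for illustration, and both populations are assumed to share the same marginal distributions, which is the setting in which the inequality is illustrated here.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution.

    joint[i][j] is the probability that X = i and Y = j.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:
                total += pxy * math.log(pxy / (px[i] * py[j]))
    return total


def mix(p, q, kappa):
    """Element-wise mixture kappa * p + (1 - kappa) * q of two joint tables."""
    return [[kappa * a + (1.0 - kappa) * b for a, b in zip(row_p, row_q)]
            for row_p, row_q in zip(p, q)]


# Hypothetical populations with identical marginals (0.5, 0.5) for X and Y.
p_dependent = [[0.4, 0.1], [0.1, 0.4]]
p_independent = [[0.25, 0.25], [0.25, 0.25]]


def check_bound(kappa):
    """Return (I[mixture], weighted sum of the individual I values)."""
    lhs = mutual_information(mix(p_dependent, p_independent, kappa))
    rhs = (kappa * mutual_information(p_dependent)
           + (1.0 - kappa) * mutual_information(p_independent))
    return lhs, rhs


for kappa in (0.25, 0.5, 0.75):
    lhs, rhs = check_bound(kappa)
    print(f"kappa={kappa:.2f}  I[mixture]={lhs:.4f}  weighted sum={rhs:.4f}")
```
]]></preformat>
      <p>In this example the independent population contributes no mutual information, so adding more of it (a smaller mixture weight on the dependent population) lowers both the bound and the mutual information of the mixture.</p>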
      <p id="d2e9008">For geophysical fields, we expect their spatiotemporal autocorrelation (and, by extension, the mutual information) to be a decreasing function of the spatiotemporal displacements being considered. Thus, we can model <inline-formula><mml:math id="M607" display="inline"><mml:mrow><mml:mtext>i</mml:mtext><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> as a decreasing function of <inline-formula><mml:math id="M608" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula>. As such, it can be shown that the upper bound on the mutual information when considering all samples taken within an <inline-formula><mml:math id="M609" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>-spherical domain <inline-formula><mml:math id="M610" display="inline"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> of radius <inline-formula><mml:math id="M611" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> around a fixed location is also a decreasing function of <inline-formula><mml:math id="M612" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. As stated in Sect. <xref ref-type="sec" rid="Ch1.S2.SS4"/>, this is the effect of data contamination by independent samples acting to decrease the mutual information encoded between the measurements <inline-formula><mml:math id="M613" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M614" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>. The above analysis assumes that the computation of the mutual information is perfectly informed and not in fact data limited.
This remains consistent with our expectation that, for data co-located within a finite number of co-location events, the maximum estimated mutual information will not be found at <inline-formula><mml:math id="M615" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M616" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0: the mutual information estimation is initially data limited, so an increase in <inline-formula><mml:math id="M617" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> first drives the value of <inline-formula><mml:math id="M618" display="inline"><mml:mrow><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="script">S</mml:mi><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> upwards towards the limiting value, until sufficient samples are available and the effects of contamination take over.</p>
<sec id="App1.Ch1.S1.SS1">
  <label>A1</label><title>A radially isotropic two-population example</title>
      <p id="d2e9152">Imagine a plane, with an observatory located at the origin, measuring variable <inline-formula><mml:math id="M619" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> with marginal distribution <inline-formula><mml:math id="M620" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. There is also a mobile platform making point-like measurements of variable <inline-formula><mml:math id="M621" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, with marginal distribution <inline-formula><mml:math id="M622" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The measurements of variable <inline-formula><mml:math id="M623" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are made at distances <inline-formula><mml:math id="M624" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> from the origin, such that the sampling density is spatially uniform. This is represented as <inline-formula><mml:math id="M625" display="inline"><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M626" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1 uniformly, and as such <inline-formula><mml:math id="M627" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> will be ignored in the following derivation.</p>
      <p id="d2e9240"><inline-formula><mml:math id="M628" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> is a spatially inhomogeneous variable, but isotropic (with respect to the origin), such that <inline-formula><mml:math id="M629" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M630" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are related by a spatially varying joint probability distribution. For <inline-formula><mml:math id="M631" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M632" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M633" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M634" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M635" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are dependent and have

                <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M636" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E23"><mml:mtd><mml:mtext>A9</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E24"><mml:mtd><mml:mtext>A10</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext mathvariant="normal">I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mtext>I</mml:mtext><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

          being related by the non-zero mutual information I<sup>*</sup>. Considering samples radially further away than <inline-formula><mml:math id="M638" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M639" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M640" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are independent, such that

                <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M641" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E25"><mml:mtd><mml:mtext>A11</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>≥</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E26"><mml:mtd><mml:mtext>A12</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext mathvariant="normal">I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>≥</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e9587">The data co-location scheme for matching samples between <inline-formula><mml:math id="M642" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M643" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> is to consider all matches for which <inline-formula><mml:math id="M644" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M645" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M646" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>. That is, the domain <inline-formula><mml:math id="M647" display="inline"><mml:mi mathvariant="script">S</mml:mi></mml:math></inline-formula> is a disk of radius <inline-formula><mml:math id="M648" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> centred on the origin. As such, the joint probability distribution relating <inline-formula><mml:math id="M649" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M650" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> as a function of <inline-formula><mml:math id="M651" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> is

            <disp-formula id="App1.Ch1.S1.E27" content-type="numbered"><label>A13</label><mml:math id="M652" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mfenced open="{" close=""><mml:mtable class="array" columnalign="left left"><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mfenced close=")" open="("><mml:mstyle displaystyle="false"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow><mml:mi>R</mml:mi></mml:mfrac></mml:mstyle></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mfenced open="(" close=")"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="false"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow><mml:mi>R</mml:mi></mml:mfrac></mml:mstyle></mml:mstyle></mml:mfenced><mml:mn 
mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfenced><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>≥</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mfenced></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e9820">We can rewrite Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E27"/>) in terms of a mixing fraction <inline-formula><mml:math id="M653" display="inline"><mml:mi mathvariant="italic">κ</mml:mi></mml:math></inline-formula>,

                <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M654" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E28"><mml:mtd><mml:mtext>A14</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E29"><mml:mtd><mml:mtext>A15</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mfenced close="" open="{"><mml:mtable class="array" columnalign="left left"><mml:mtr><mml:mtd><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="false"><mml:mfrac 
style="text"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow><mml:mi>R</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>≥</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E30"><mml:mtd><mml:mtext>A16</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mn mathvariant="normal">0</mml:mn><mml:mo>&lt;</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mo>≤</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e10017">The mutual information is a convex function of the joint distribution when the marginal distribution of one variable is held fixed. Assuming that both components of the mixture share the same marginal distributions, we can therefore derive an upper bound on <inline-formula><mml:math id="M655" display="inline"><mml:mrow><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>:

                <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M656" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E31"><mml:mtd><mml:mtext>A17</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mtext mathvariant="normal">I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mtext>I</mml:mtext><mml:mfenced close="]" open="["><mml:mrow><mml:mi mathvariant="italic">κ</mml:mi><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E32"><mml:mtd><mml:mtext>A18</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>≤</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mtext>I</mml:mtext><mml:mfenced close="]" open="["><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>,</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi 
mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mtext>I</mml:mtext><mml:mo>[</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E33"><mml:mtd><mml:mtext>A19</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>≤</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E34"><mml:mtd><mml:mtext>A20</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>≤</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:msup><mml:mtext>I</mml:mtext><mml:mo>*</mml:mo></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E35"><mml:mtd><mml:mtext>A21</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="normal">∴</mml:mi><mml:mtext>I</mml:mtext><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo><mml:mfenced close="" open="{"><mml:mtable class="array" columnalign="left 
left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mtext>I</mml:mtext><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>&lt;</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mo>≤</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow><mml:mi>R</mml:mi></mml:mfrac></mml:mstyle></mml:mfenced><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msup><mml:mtext>I</mml:mtext><mml:mo>*</mml:mo></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>R</mml:mi><mml:mo>≥</mml:mo><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mfenced></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
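      <p>As a minimal illustrative sketch (not the implementation used in this study), the mixing fraction of Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E29"/>) and the resulting bound of Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E35"/>) can be evaluated in Python as follows. The function names <monospace>kappa</monospace> and <monospace>mi_upper_bound</monospace>, and the example radii, are assumptions introduced purely for illustration.</p>
      <preformat xml:space="preserve"><![CDATA[
```python
def kappa(R, R_star):
    """Mixing fraction of Eq. (A15): share of the co-location disk
    of radius R that lies inside the dependent region of radius R_star."""
    if R <= 0 or R_star <= 0:
        raise ValueError("radii must be positive")
    if R < R_star:
        return 1.0
    return (R_star / R) ** 2


def mi_upper_bound(R, R_star, I_star):
    """Bound of Eq. (A21) on I(X;Y | r < R), in the same units as I_star.
    For R < R_star the bound is attained exactly."""
    return kappa(R, R_star) * I_star


# Illustrative values only: R_star = 1 (arbitrary units), I_star = 1 nat.
for R in (0.5, 1.0, 2.0, 4.0):
    print(R, kappa(R, 1.0), mi_upper_bound(R, 1.0, 1.0))
```
]]></preformat>
      <p>In this sketch the bound falls with the inverse square of the co-location radius once the disk extends beyond the dependent region, reflecting the growth of the contaminating area in two dimensions.</p>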
      <p id="d2e10368">Thus, for sufficiently large <inline-formula><mml:math id="M657" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M658" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M659" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, we necessarily expect a reduction in the mutual information between the co-located <inline-formula><mml:math id="M660" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M661" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> as the probability distribution becomes a mixture distribution, being contaminated with independent samples. Although this toy model is very simplified, it demonstrates explicitly how the inclusion of independent data reduces the upper bound on the mutual information encoded between <inline-formula><mml:math id="M662" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M663" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, and thought experiments can easily develop the model by including additional radial intervals within which the encoded mutual information within the interval is constant but between I<sup>*</sup> and <inline-formula><mml:math id="M665" display="inline"><mml:mn mathvariant="normal">0</mml:mn></mml:math></inline-formula>.</p>
      <p id="d2e10441">To demonstrate the mutual information bounds, we implemented the previously described sampling and co-location scheme. <inline-formula><mml:math id="M666" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M667" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are two Gaussian distributed variables with unit variance. For <inline-formula><mml:math id="M668" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M669" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M670" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M671" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M672" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> have a bivariate Gaussian joint probability distribution, with correlation <inline-formula><mml:math id="M673" display="inline"><mml:mi mathvariant="italic">ρ</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M674" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M675" display="inline"><mml:msqrt><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:msup><mml:mi>e</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:msqrt></mml:math></inline-formula>, chosen such that I<sup>*</sup> <inline-formula><mml:math id="M677" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1 nat. 
For <inline-formula><mml:math id="M678" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M679" display="inline"><mml:mo>≥</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M680" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, samples of <inline-formula><mml:math id="M681" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M682" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> are drawn independently from univariate Gaussian distributions, so that the joint probability distribution of <inline-formula><mml:math id="M683" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M684" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> factorises into the product of its marginals, consistent with Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E25"/>).</p>
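      <p>The choice of correlation can be checked against the standard closed-form mutual information of a bivariate Gaussian, I = -(1/2) ln(1 - <italic>ρ</italic><sup>2</sup>). The Python sketch below performs this check and draws co-located samples for the two-population model. It is an illustration under stated assumptions rather than the code used for the figures: the function names, the random seed, and the use of a uniform areal density to place the mobile platform are all hypothetical choices.</p>
      <preformat xml:space="preserve"><![CDATA[
```python
import math
import random

# Correlation quoted in the text, chosen so that I* = 1 nat.
RHO = math.sqrt(1.0 - math.exp(-2.0))


def gaussian_mi(rho):
    """Closed-form mutual information (nats) of a bivariate Gaussian."""
    return -0.5 * math.log(1.0 - rho ** 2)


def sample_colocated(n, R, R_star, rng, rho=RHO):
    """Draw n (r, x, y) triples for points placed uniformly in a disk of
    radius R. X and Y are correlated for r < R_star, independent otherwise."""
    samples = []
    for _ in range(n):
        r = R * math.sqrt(rng.random())  # uniform density per unit area
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        if r < R_star:
            # Unit-variance y with correlation rho to x.
            y = rho * x + math.sqrt(1.0 - rho ** 2) * z
        else:
            y = z  # independent of x
        samples.append((r, x, y))
    return samples


print(gaussian_mi(RHO))
demo = sample_colocated(20000, 2.0, 1.0, random.Random(1))
fraction_dependent = sum(1 for r, _, _ in demo if r < 1.0) / len(demo)
print(fraction_dependent)  # should be close to kappa(R) = (R*/R)^2
```
]]></preformat>
      <p>Placing samples with radius proportional to the square root of a uniform variate gives a constant density per unit area, so the fraction of dependent samples in the sketch approximates the mixing fraction of Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E29"/>).</p>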
      <p id="d2e10602">Figure <xref ref-type="fig" rid="FA1"/> shows the estimated <inline-formula><mml:math id="M685" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> as a function of <inline-formula><mml:math id="M686" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M687" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the number of samples of <inline-formula><mml:math id="M688" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M689" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> available to the KSG estimator. As <inline-formula><mml:math id="M690" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M691" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> 0, we expect <inline-formula><mml:math id="M692" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M693" display="inline"><mml:mo>→</mml:mo></mml:math></inline-formula> I, that is, the estimate should approach the true mutual information value.
Figure <xref ref-type="fig" rid="FA1"/> shows that for <inline-formula><mml:math id="M694" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M695" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M696" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, the mutual information estimates for both the dependent and independent data have converged to the correct values of 1 and 0 nats respectively.</p>
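      <p>For readers who wish to reproduce this behaviour at small scale, the following is a brute-force Python sketch of the Kraskov–Stögbauer–Grassberger (KSG) nearest-neighbour estimator in its first formulation, using the maximum norm. It is an assumption-laden illustration and not the implementation used to produce Fig. <xref ref-type="fig" rid="FA1"/>: the neighbour number <monospace>k = 4</monospace>, the sample size of 500, the random seed and all function names are hypothetical choices.</p>
      <preformat xml:space="preserve"><![CDATA[
```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_table(n_max):
    """psi(m) for integers m = 1..n_max via psi(m + 1) = psi(m) + 1/m."""
    psi = [0.0, -EULER_GAMMA]  # index 0 is unused padding
    for m in range(1, n_max):
        psi.append(psi[m] + 1.0 / m)
    return psi


def ksg_mutual_information(xs, ys, k=4):
    """Brute-force KSG (first formulation) estimate of I(X;Y) in nats."""
    n = len(xs)
    if n != len(ys) or n <= k:
        raise ValueError("need equally long samples with more than k points")
    psi = digamma_table(n)
    mean_term = 0.0
    for i in range(n):
        dx = [abs(xs[i] - xs[j]) for j in range(n) if j != i]
        dy = [abs(ys[i] - ys[j]) for j in range(n) if j != i]
        # Distance to the k-th nearest neighbour in the joint space (max norm).
        eps = sorted(max(a, b) for a, b in zip(dx, dy))[k - 1]
        # Neighbours strictly closer than eps in each marginal space.
        n_x = sum(1 for a in dx if a < eps)
        n_y = sum(1 for b in dy if b < eps)
        mean_term += psi[n_x + 1] + psi[n_y + 1]
    return psi[k] + psi[n] - mean_term / n


rng = random.Random(0)
rho = math.sqrt(1.0 - math.exp(-2.0))
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
y_dep = [rho * a + math.sqrt(1.0 - rho ** 2) * b for a, b in zip(x, noise)]

mi_dep = ksg_mutual_information(x, y_dep)  # dependent pair, true value 1 nat
mi_ind = ksg_mutual_information(x, noise)  # independent pair, true value 0
print(mi_dep, mi_ind)
```
]]></preformat>
      <p>The neighbour search in this sketch scales quadratically with the number of samples, so it is practical only for small <italic>N</italic>; sample sizes of the order shown in Fig. <xref ref-type="fig" rid="FA1"/> would call for a tree-based neighbour search.</p>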

      <fig id="FA1"><label>Figure A1</label><caption><p id="d2e10735">Mutual information estimates <inline-formula><mml:math id="M697" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> between two Gaussian distributed variables as a function of the number of samples provided to the KSG estimator.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f09.png"/>

        </fig>

      <p id="d2e10759">Figure <xref ref-type="fig" rid="FA2"/> demonstrates a mixture distribution between the dependent and independent Gaussian joint distributions, as outlined in Eqs. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E28"/>)–(<xref ref-type="disp-formula" rid="App1.Ch1.S1.E30"/>). <inline-formula><mml:math id="M698" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M699" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> consist of <inline-formula><mml:math id="M700" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M701" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M702" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">5</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> joint samples, with <inline-formula><mml:math id="M703" display="inline"><mml:mrow><mml:mi mathvariant="italic">κ</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula> samples being drawn from the dependent joint probability distribution, and <inline-formula><mml:math id="M704" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>)</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula> samples being drawn independently. At the extremes where <inline-formula><mml:math id="M705" display="inline"><mml:mi mathvariant="italic">κ</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M706" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0 or 1, we recover the results seen in Fig. <xref ref-type="fig" rid="FA1"/>, that the mutual information estimates agree with the actual mutual information values. 
However, for 0 <inline-formula><mml:math id="M707" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M708" display="inline"><mml:mi mathvariant="italic">κ</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M709" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 1, the mutual information estimate is consistently lower than the theoretical upper bound provided by Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E34"/>).</p>

      <fig id="FA2"><label>Figure A2</label><caption><p id="d2e10878">The mutual information between two variables <inline-formula><mml:math id="M710" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M711" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>, which are sampled from a mixture of a dependent and an independent Gaussian joint distribution, as a function of the mixing fraction <inline-formula><mml:math id="M712" display="inline"><mml:mi mathvariant="italic">κ</mml:mi></mml:math></inline-formula>. The theoretical upper bound given in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E34"/>) is also plotted.</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f10.png"/>

        </fig>

      <p id="d2e10910">Figure <xref ref-type="fig" rid="FA3"/> extends the implementation used to create Fig. <xref ref-type="fig" rid="FA2"/> by allowing <inline-formula><mml:math id="M713" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> to depend on <inline-formula><mml:math id="M714" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula>, such that <inline-formula><mml:math id="M715" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M716" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M717" display="inline"><mml:mrow><mml:mi mathvariant="italic">π</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi mathvariant="italic">λ</mml:mi></mml:mrow></mml:math></inline-formula>. It also uses the full definition for <inline-formula><mml:math id="M718" display="inline"><mml:mrow><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>(</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> given in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E29"/>).
The estimated profiles of <inline-formula><mml:math id="M719" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>|</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are plotted for multiple sampling densities, <inline-formula><mml:math id="M720" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula>, defined by values <inline-formula><mml:math id="M721" display="inline"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> such that

            <disp-formula id="App1.Ch1.S1.E36" content-type="numbered"><label>A22</label><mml:math id="M722" display="block"><mml:mrow><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow><mml:mrow><mml:mi mathvariant="italic">π</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mrow><mml:mo>*</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>

      <fig id="FA3"><label>Figure A3</label><caption><p id="d2e11072">Mutual information estimates <inline-formula><mml:math id="M723" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:mi>Y</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>|</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>r</mml:mi><mml:mo>&lt;</mml:mo><mml:mi>R</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> plotted as functions of <inline-formula><mml:math id="M724" display="inline"><mml:mi>R</mml:mi></mml:math></inline-formula> for different sampling densities <inline-formula><mml:math id="M725" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula>, associated with different values <inline-formula><mml:math id="M726" display="inline"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>. The theoretical upper bound of the true mutual information between <inline-formula><mml:math id="M727" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M728" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula> is also plotted, according to Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E35"/>).</p></caption>
          <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f11.png"/>

        </fig>

      <p id="d2e11158">The mutual information profiles all follow a generic pattern. For <inline-formula><mml:math id="M729" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M730" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M731" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, increasing <inline-formula><mml:math id="M732" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> results in an increase in the number of dependent samples <inline-formula><mml:math id="M733" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> available for the KSG estimator to infer the relationship between <inline-formula><mml:math id="M734" display="inline"><mml:mi>X</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M735" display="inline"><mml:mi>Y</mml:mi></mml:math></inline-formula>. This results in <inline-formula><mml:math id="M736" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> rising from 0 nats when <inline-formula><mml:math id="M737" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M738" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0 towards a value of I<sup>*</sup>. 
For <inline-formula><mml:math id="M740" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M741" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M742" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula>, the only additional samples provided to the KSG estimator are entirely independent, driving the estimates <inline-formula><mml:math id="M743" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mtext>I</mml:mtext><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> towards 0 nats. This is consistent with our expectations (see Sect. <xref ref-type="sec" rid="Ch1.S2.SS4"/>).</p>
      <p id="d2e11295">We also see, as in Fig. <xref ref-type="fig" rid="FA2"/>, that the mutual information estimates are almost always lower than the theoretical upper bound. The estimates for <inline-formula><mml:math id="M744" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> <inline-formula><mml:math id="M745" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M746" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M747" display="inline"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M748" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M749" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M750" display="inline"><mml:mrow><mml:msup><mml:mi>N</mml:mi><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M751" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M752" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">3</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> do vary above <inline-formula><mml:math id="M753" display="inline"><mml:mrow><mml:msup><mml:mtext>I</mml:mtext><mml:mo>*</mml:mo></mml:msup></mml:mrow></mml:math></inline-formula> by more than <inline-formula><mml:math id="M754" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mtext>KSG</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, but this can likely be attributed to the variance of the estimator with a low number 
of samples.</p>
</sec>
</app>

<app id="App1.Ch1.S2">
  <label>Appendix B</label><title>Quality checks and homogenisation</title>
<sec id="App1.Ch1.S2.SS1">
  <label>B1</label><title>ATL09</title>
      <p id="d2e11426">The ICESat-2 ATL09 data <xref ref-type="bibr" rid="bib1.bibx37" id="paren.56"/> are downloaded through the NASA Harmony API (<uri>https://harmony.earthdata.nasa.gov/</uri>, last access: 22 May 2026), and are subset spatially based on circular polygons of radius 510 km centred on each Cloudnet observatory. Code to facilitate the downloads through the NASA Harmony API is given in <xref ref-type="bibr" rid="bib1.bibx29" id="text.57"/>.</p>
      <p id="d2e11438">Each individual ATL09 file can be associated with a unique co-location event. A given ATL09 file is opened and the <italic>high_rate</italic> group is loaded for each profile (one per ATLAS strong beam). The <italic>high_rate</italic> groups are each described by a temporal coordinate, with height and layer index coordinates associated with specific variables within the group.</p>
      <p id="d2e11447">For each temporal coordinate within a group, we create a boolean vector with the size of the vertical dimension, initialised to false. Elements are set to true at the heights at which the ATL09 product reports cloud, as given by the <italic>layer_bot</italic> and <italic>layer_top</italic> variables. This forms a vertical cloud presence profile. A layer is excluded if its <italic>layer_attr</italic> value is not 1 (cloud). If any layer within the vertical profile has an associated value <italic>layer_conf_dens</italic> <inline-formula><mml:math id="M755" display="inline"><mml:mo>&lt;</mml:mo></mml:math></inline-formula> 0.4, then the entire vertical profile is rejected from the analysis.</p>
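      <p>The construction of a single vertical cloud presence profile can be sketched as follows. This is a minimal illustration rather than the processing code itself: the function name and arguments are hypothetical, the inputs are assumed to contain only detected layers (fill values already removed), and the confidence test is assumed to apply to every supplied layer.</p>
      <preformat>
```python
import numpy as np


def cloud_presence_profile(heights, layer_bot, layer_top, layer_attr,
                           layer_conf_dens, min_conf=0.4):
    """Return a boolean cloud-presence vector, or None if the profile is rejected."""
    layer_conf_dens = np.asarray(layer_conf_dens, dtype=float)
    # Assumption: one low-confidence layer rejects the entire vertical profile.
    if np.any(min_conf > layer_conf_dens):
        return None
    heights = np.asarray(heights, dtype=float)
    present = np.zeros(heights.shape, dtype=bool)  # starts all False
    for bot, top, attr in zip(layer_bot, layer_top, layer_attr):
        if attr != 1:  # keep only layers flagged as cloud
            continue
        # Mark every height bin lying between the layer bottom and top.
        inside = np.logical_and(heights >= bot, top >= heights)
        present[inside] = True
    return present
```
</preformat>
      <p>Returning None for a rejected profile keeps rejection distinct from a valid, cloud-free profile, which is returned as an all-false vector.</p>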
      <p id="d2e11469">All of the successfully generated vertical cloud presence profiles are concatenated. The <italic>latitude</italic> and <italic>longitude</italic> variables of the profiles are then used to compute the separation between the ATLAS footprint and the Cloudnet observatory, <inline-formula><mml:math id="M756" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula>. The co-location criteria outlined in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/> are then applied.</p>
      <p id="d2e11488">If fewer than 17 vertical cloud presence profiles remain after the co-location subsetting (17 <inline-formula><mml:math id="M757" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> 240 m <inline-formula><mml:math id="M758" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 4.08 km along-track distance), the co-location event is rejected for containing insufficient valid data. Otherwise, the VCF profile for the co-location event is computed as the average vertical cloud presence profile at each height, considering all of the profiles permitted after the quality checks and co-location subsetting.</p>
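      <p>The averaging step can be illustrated with a short sketch. The function name is hypothetical, and the input is assumed to be a two-dimensional boolean array (profile index by height) holding only the profiles that passed the quality checks and co-location subsetting.</p>
      <preformat>
```python
import numpy as np


def vertical_cloud_fraction(profiles, min_profiles=17):
    """Average boolean cloud-presence profiles into a VCF profile.

    Returns None when too few profiles remain for the co-location event.
    """
    profiles = np.asarray(profiles, dtype=bool)
    if min_profiles > profiles.shape[0]:
        return None  # event rejected: insufficient valid data
    # The mean of booleans over the profile axis is the cloudy fraction per height.
    return profiles.mean(axis=0)
```
</preformat>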
</sec>
<sec id="App1.Ch1.S2.SS2">
  <label>B2</label><title>Cloudnet</title>
      <p id="d2e11513">The Cloudnet <italic>categorize</italic> data product is downloaded from the Cloudnet FMI website (<uri>https://cloudnet.fmi.fi</uri>, last access: 22 May 2026), and is subset temporally between 1 October 2018 and 1 January 2025 to match the availability of ICESat-2 data <xref ref-type="bibr" rid="bib1.bibx11" id="paren.58"/>. Because the Cloudnet data are near-continuous compared to the ATL09 data at a given Cloudnet observatory, the ATL09 data are processed first to identify viable co-location events.</p>
      <p id="d2e11525">For a given successfully co-located ATL09 file, the time of closest approach is identified as the time <inline-formula><mml:math id="M759" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> for which the computed separation between the ATLAS footprints and the Cloudnet observatory, <inline-formula><mml:math id="M760" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula>, is minimised:

            <disp-formula id="App1.Ch1.S2.E37" content-type="numbered"><label>B1</label><mml:math id="M761" display="block"><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:munder><mml:mtext>arg min</mml:mtext><mml:mi>j</mml:mi></mml:munder><mml:mo>(</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where the subscript <inline-formula><mml:math id="M762" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula> indexes the ATL09 vertical profiles loaded when co-locating the ATL09 data.</p>
      <p id="d2e11602">The interval

            <disp-formula id="App1.Ch1.S2.E38" content-type="numbered"><label>B2</label><mml:math id="M763" display="block"><mml:mrow><mml:mi mathvariant="script">T</mml:mi><mml:mo>=</mml:mo><mml:mfenced open="[" close="]"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi mathvariant="italic">τ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi>t</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi mathvariant="italic">τ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>

          is defined, and all Cloudnet files with data falling within the temporal interval <inline-formula><mml:math id="M764" display="inline"><mml:mi mathvariant="script">T</mml:mi></mml:math></inline-formula> are loaded and concatenated. The loaded data are then subset to the interval <inline-formula><mml:math id="M765" display="inline"><mml:mi mathvariant="script">T</mml:mi></mml:math></inline-formula>. Co-location and quality subsetting can be considered as set intersection operations, which are commutative, so the data can be temporally subset before any other operation is performed, which saves computational effort.</p>
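      <p>Equations (B1) and (B2) can be sketched as follows. The function names and the argument tau are hypothetical, and times are assumed to be numeric values (for example, seconds since a common epoch) so that simple arithmetic applies.</p>
      <preformat>
```python
import numpy as np


def closest_approach_window(times, separations, tau):
    """Return t0 (Eq. B1) and the interval [t0 - tau/2, t0 + tau/2] (Eq. B2)."""
    j_prime = int(np.argmin(separations))  # index of the minimum separation
    t0 = times[j_prime]
    return t0, (t0 - tau / 2, t0 + tau / 2)


def subset_to_window(cloudnet_times, window):
    """Boolean mask selecting Cloudnet times inside the closed interval."""
    start, end = window
    cloudnet_times = np.asarray(cloudnet_times)
    return np.logical_and(cloudnet_times >= start, end >= cloudnet_times)
```
</preformat>
      <p>The mask returned by the second function can be applied before any quality filtering, reflecting the commutativity argument above.</p>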
      <p id="d2e11660">The <italic>category_bits</italic> variable is unpacked, and the cloudmask from the Cloudnet data is identified <xref ref-type="bibr" rid="bib1.bibx59" id="paren.59"><named-content content-type="pre">according to the code from</named-content><named-content content-type="post">defined in <italic>cloudnetpy.products.classification._find_cloud_mask</italic></named-content></xref> as

            <disp-formula id="App1.Ch1.S2.E39" content-type="numbered"><label>B3</label><mml:math id="M766" display="block"><mml:mrow><mml:mtext>cloud</mml:mtext><mml:mo>=</mml:mo><mml:mtext>droplet</mml:mtext><mml:mo>∪</mml:mo><mml:mfenced open="(" close=")"><mml:mrow><mml:mtext>falling</mml:mtext><mml:mo>∩</mml:mo><mml:mtext>freezing</mml:mtext></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

          where droplet, falling and freezing are three of the unpacked boolean fields. The cloud variable is thus a boolean field with the <italic>time</italic> and <italic>height</italic> dimensions associated with the Cloudnet dataset. VCF profiles are computed as the temporal average of the cloud field across all profiles permitted by the temporal co-location.</p>
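      <p>A minimal sketch of Eq. (B3) and the subsequent temporal averaging is given below. The bit positions assigned to the droplet, falling and freezing fields are assumptions made for illustration and should be checked against the metadata of the <italic>category_bits</italic> variable; the function names are hypothetical.</p>
      <preformat>
```python
import numpy as np

# Assumed bit positions within category_bits (verify against the file metadata).
DROPLET_BIT, FALLING_BIT, FREEZING_BIT = 0, 1, 2


def unpack_bit(category_bits, bit):
    """Return a boolean array that is True where the given bit is set."""
    shifted = np.right_shift(np.asarray(category_bits), bit)
    return np.bitwise_and(shifted, 1) == 1


def cloud_mask(category_bits):
    """cloud = droplet OR (falling AND freezing), as in Eq. (B3)."""
    droplet = unpack_bit(category_bits, DROPLET_BIT)
    falling = unpack_bit(category_bits, FALLING_BIT)
    freezing = unpack_bit(category_bits, FREEZING_BIT)
    return np.logical_or(droplet, np.logical_and(falling, freezing))


def cloudnet_vcf(category_bits):
    """Temporal mean of the cloud mask; axis 0 is time, axis 1 is height."""
    return cloud_mask(category_bits).mean(axis=0)
```
</preformat>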
</sec>
</app>

<app id="App1.Ch1.S3">
  <label>Appendix C</label><title>Across-track orbital density</title>
      <p id="d2e11714">In this appendix, we will derive a formula to determine the across-track density of satellite orbits, with the following approximations: <list list-type="order"><list-item>
      <p id="d2e11719">The Earth is a sphere.</p></list-item><list-item>
      <p id="d2e11723">Subsequent orbits of the same satellite on the tangent plane at a given location on the Earth's surface are parallel.</p></list-item><list-item>
      <p id="d2e11727">Orbits form great circle paths over the surface of the Earth.</p></list-item></list></p>
      <p id="d2e11730">None of these approximations is exact, but in most circumstances they lead to reasonable results. Firstly, the Earth is not a sphere, and is better approximated as an oblate spheroid. However, the semi-major and semi-minor axes of the Earth differ by less than 0.4 %, making the spherical assumption reasonable for back-of-the-envelope calculations.</p>
      <p id="d2e11733">Secondly, the assumption that subsequent orbits of a satellite are parallel on the tangent plane does not hold exactly. All of the orbits at a given latitude will have the same inclination relative to the vector locally pointing north. On the tangent plane, the direction pointing north is assumed to be the same throughout the plane, whereas in reality it depends on longitude. Sufficiently far from the poles, this assumption is accurate to first order for displacements along the tangent plane, and the distance over which it holds increases towards the equator.</p>
      <p id="d2e11736">Finally, treating orbits as great circles is approximate, as the Earth rotates under the satellite as it orbits. This acts to make orbital tracks along the ground have a stronger westwards component than they otherwise would, effectively tilting the orbital track relative to a co-rotating great circle of the same inclination. For polar orbiting satellites with orbital inclinations greater than 90°, this acts to make the orbital track locally appear to have a shallower angle relative to the equator compared to the great circle with the same inclination.</p>
      <p id="d2e11740">We can define three points on the surface of the Earth: let <inline-formula><mml:math id="M767" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> be the ascending node of the orbit, found on the equator; let <inline-formula><mml:math id="M768" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> be the location on the Earth's surface at which we want to compute the local across-track orbital density; and let <inline-formula><mml:math id="M769" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> be the point where the highest latitude is reached. If we let <inline-formula><mml:math id="M770" display="inline"><mml:mi mathvariant="italic">ϕ</mml:mi></mml:math></inline-formula> represent latitude (positive northwards), and <inline-formula><mml:math id="M771" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> represent longitude, the locations can be expressed as

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M772" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S3.E40"><mml:mtd><mml:mtext>C1</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E41"><mml:mtd><mml:mtext>C2</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>:</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E42"><mml:mtd><mml:mtext>C3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>:</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E43"><mml:mtd><mml:mtext>C4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>:</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

        where <inline-formula><mml:math id="M773" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M774" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are known constants which satisfy the inequalities <inline-formula><mml:math id="M775" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M776" display="inline"><mml:mo>≤</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M777" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M778" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M779" display="inline"><mml:mo>&gt;</mml:mo></mml:math></inline-formula> 0. Specifically, <inline-formula><mml:math id="M780" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is equal to the reference angle of the orbital inclination of the satellite (the angle as it would be if limited to being between 0 and <inline-formula><mml:math id="M781" display="inline"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:math></inline-formula>).</p>
      <p id="d2e11997">We will describe the bearing along which the great circle from point <inline-formula><mml:math id="M782" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> to point <inline-formula><mml:math id="M783" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula> lies when measured from point <inline-formula><mml:math id="M784" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> as <inline-formula><mml:math id="M785" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. The angle <inline-formula><mml:math id="M786" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> can generally be expressed as

          <disp-formula id="App1.Ch1.S3.E44" content-type="numbered"><label>C5</label><mml:math id="M787" display="block"><mml:mrow><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>b</mml:mi></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>a</mml:mi></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>b</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>a</mml:mi></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>b</mml:mi></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M788" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M789" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M790" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi>b</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M791" display="inline"><mml:mo>-</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M792" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mi>a</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the difference in longitude between point <inline-formula><mml:math id="M793" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula> and point <inline-formula><mml:math id="M794" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula>.</p>
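      <p>Equation (C5) can be evaluated numerically as in the sketch below. The function name is hypothetical, and angles are assumed to be in radians. The two-argument arctangent is used here as a practical choice that resolves the quadrant of the bearing and handles a vanishing denominator; Eq. (C5) itself only fixes the tangent of the bearing.</p>
      <preformat>
```python
import math


def bearing(phi_a, lam_a, phi_b, lam_b):
    """Initial bearing of the great circle from point a to point b (radians).

    Measured clockwise from north, following the terms of Eq. (C5).
    """
    lam_ab = lam_b - lam_a  # longitude difference between a and b
    numerator = math.cos(phi_b) * math.sin(lam_ab)
    denominator = (math.cos(phi_a) * math.sin(phi_b)
                   - math.sin(phi_a) * math.cos(phi_b) * math.cos(lam_ab))
    return math.atan2(numerator, denominator)
```
</preformat>
      <p>For example, travelling along the equator from longitude 0 to a quarter turn east gives a bearing of a quarter turn (due east), and travelling along a meridian towards the pole gives a bearing of zero (due north).</p>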
      <p id="d2e12197">At <inline-formula><mml:math id="M795" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, the orbit has reached its highest latitude, <inline-formula><mml:math id="M796" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. As such, the bearing of the orbit must necessarily be <inline-formula><mml:math id="M797" display="inline"><mml:mrow><mml:mo>±</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:math></inline-formula> as the satellite transitions from heading northwards to southwards. Thus, <inline-formula><mml:math id="M798" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub><mml:mo>|</mml:mo><mml:mo>=</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub><mml:mo>|</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:math></inline-formula>. 
Since <inline-formula><mml:math id="M799" display="inline"><mml:mrow><mml:msub><mml:mo>lim⁡</mml:mo><mml:mrow><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>|</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>|</mml:mo><mml:mo>→</mml:mo><mml:mfrac><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mrow></mml:mfenced></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi>tan⁡</mml:mi><mml:mi mathvariant="italic">α</mml:mi><mml:mo>|</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:math></inline-formula>, the denominator in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E44"/>) must vanish. For the case of <inline-formula><mml:math id="M800" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>:

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M801" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S3.E45"><mml:mtd><mml:mtext>C6</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E46"><mml:mtd><mml:mtext>C7</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>⟹</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E47"><mml:mtd><mml:mtext>C8</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>⟹</mml:mo><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr 
id="App1.Ch1.S3.E48"><mml:mtd><mml:mtext>C9</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>⟹</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">20</mml:mn></mml:msub><mml:mo>|</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

        This shows that the longitudinal separation between <inline-formula><mml:math id="M802" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M803" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> must be <inline-formula><mml:math id="M804" display="inline"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mi mathvariant="italic">π</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:math></inline-formula>. This is consistent with the symmetry of two great circles drawn on a sphere, here the equator and the orbital track: the separation between them has two crossing points and two maxima, all evenly spaced along the great circles.</p>
      <p id="d2e12481">The same logic can be applied for <inline-formula><mml:math id="M805" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, deriving an equation for <inline-formula><mml:math id="M806" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>:

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M807" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S3.E49"><mml:mtd><mml:mtext>C10</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E50"><mml:mtd><mml:mtext>C11</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr 
id="App1.Ch1.S3.E51"><mml:mtd><mml:mtext>C12</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e12641">The cosine function is symmetric in its argument, meaning that Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E51"/>) also provides an expression for <inline-formula><mml:math id="M808" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M809" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M810" display="inline"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> because <inline-formula><mml:math id="M811" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M812" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> <inline-formula><mml:math id="M813" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">21</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. Applying Eqs. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E44"/>) and (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E51"/>) to find <inline-formula><mml:math id="M814" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, we get

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M815" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S3.E52"><mml:mtd><mml:mtext>C13</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E53"><mml:mtd><mml:mtext>C14</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:msup></mml:mrow><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E54"><mml:mtd><mml:mtext>C15</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mfenced close=")" open="("><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:msup></mml:mrow><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced close=")" open="("><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E55"><mml:mtd><mml:mtext>C16</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" 
close=")"><mml:mrow><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:msup></mml:mrow><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced close=")" open="("><mml:mrow><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E56"><mml:mtd><mml:mtext>C17</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" 
displaystyle="true"/><mml:mi>tan⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:msup></mml:mrow><mml:mrow><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

        which, importantly, is expressed purely as a function of the known latitudes <inline-formula><mml:math id="M816" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M817" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e13269">In order to convert this bearing into an across-track density of orbits, we need to evaluate the perpendicular separation between adjacent orbital tracks. For a satellite that revisits the same along-ground orbital track every <inline-formula><mml:math id="M818" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> orbits, the longitudinal displacement between adjacent tracks, <inline-formula><mml:math id="M819" display="inline"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:math></inline-formula>, is given as

          <disp-formula id="App1.Ch1.S3.E57" content-type="numbered"><label>C18</label><mml:math id="M820" display="block"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>R</mml:mi><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e13317">This can be related to the across-track separation between the adjacent orbits, <inline-formula><mml:math id="M821" display="inline"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:math></inline-formula>, as

          <disp-formula id="App1.Ch1.S3.E58" content-type="numbered"><label>C19</label><mml:math id="M822" display="block"><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>=</mml:mo><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e13358">The across-track density of orbits, <inline-formula><mml:math id="M823" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, is inversely proportional to <inline-formula><mml:math id="M824" display="inline"><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:math></inline-formula>, such that

              <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M825" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S3.E59"><mml:mtd><mml:mtext>C20</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub><mml:mo>∝</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E60"><mml:mtd><mml:mtext>C21</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>∝</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>N</mml:mi><mml:mrow><mml:mi>R</mml:mi><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mn mathvariant="normal">12</mml:mn></mml:msub><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S3.E61"><mml:mtd><mml:mtext>C22</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>∝</mml:mo><mml:mfenced open="(" close=")"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>cos⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mi>arctan⁡</mml:mi><mml:mfenced close=")" open="("><mml:mstyle 
displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mi>cos⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>tan⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:mfenced><mml:mfrac><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:msup></mml:mrow><mml:mrow><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mi>sin⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:msup><mml:mi>cos⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced></mml:mrow></mml:mfenced></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
      <p id="d2e13579">Thus, the across-track density of orbits can be approximated given the reference angle of the satellite's orbital inclination, <inline-formula><mml:math id="M826" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, and the latitude of the location at which the density is to be calculated, <inline-formula><mml:math id="M827" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. Multiplying Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E61"/>) by a factor of <inline-formula><mml:math id="M828" display="inline"><mml:mrow><mml:mi>sin⁡</mml:mi><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> normalises <inline-formula><mml:math id="M829" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> such that <inline-formula><mml:math id="M830" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M831" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 1 when <inline-formula><mml:math id="M832" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <inline-formula><mml:math id="M833" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> 0.</p>
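      <p>As a minimal numerical sketch of the result above, the following Python snippet evaluates the bearing from Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E56"/>) and then the normalised density obtained by multiplying Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E60"/>) by the factor of sin ϕ<sub>2</sub> described in the text, with the constant ratio N/R dropped. The function names <monospace>tan_bearing</monospace> and <monospace>normalised_orbit_density</monospace> are hypothetical and are not taken from the published software, and the value of 51.6° used for the ISS reference angle is an illustrative assumption rather than a value stated in this appendix.</p>
      <preformat preformat-type="code" xml:space="preserve">
```python
import math


def tan_bearing(phi1_deg: float, phi2_deg: float) -> float:
    """Return tan(alpha_12) from Eq. (C17) for latitudes given in degrees."""
    # phi2 is the reference angle of the inclination, assumed to lie in (0, 90).
    if 0.0 >= phi2_deg or phi2_deg >= 90.0:
        raise ValueError("phi2 must lie strictly between 0 and 90 degrees")
    phi1 = math.radians(phi1_deg)
    phi2 = math.radians(phi2_deg)
    # Eq. (C17) is only real-valued equatorward of the turning latitude phi2.
    if abs(phi1) >= phi2:
        raise ValueError("phi1 must lie strictly equatorward of phi2")
    numerator = (
        math.cos(phi1)
        * math.cos(phi2) ** 2
        * math.sqrt(math.tan(phi2) ** 2 - math.tan(phi1) ** 2)
    )
    denominator = (
        math.cos(phi1) ** 2 * math.sin(phi2) ** 2
        - math.sin(phi1) ** 2 * math.cos(phi2) ** 2
    )
    return numerator / denominator


def normalised_orbit_density(phi1_deg: float, phi2_deg: float) -> float:
    """Return sin(phi2) / (cos(alpha_12) * cos(phi1)), equal to 1 at the equator."""
    alpha12 = math.atan(tan_bearing(phi1_deg, phi2_deg))
    phi1 = math.radians(phi1_deg)
    phi2 = math.radians(phi2_deg)
    return math.sin(phi2) / (math.cos(alpha12) * math.cos(phi1))


if __name__ == "__main__":
    ISS_PHI2_DEG = 51.6  # illustrative assumption, not a value from the appendix
    for lat in (0.0, 20.0, 40.0, 50.0):
        rho = normalised_orbit_density(lat, ISS_PHI2_DEG)
        print(f"{lat:5.1f} deg: {rho:.3f}")
```
      </preformat>
      <p>Under the same assumptions, substituting the bearing into the density and simplifying gives the equivalent form sin ϕ<sub>2</sub> / (sin<sup>2</sup> ϕ<sub>2</sub> − sin<sup>2</sup> ϕ<sub>1</sub>)<sup>1/2</sup>, which offers a convenient cross-check on the numerical values and makes the divergence at ϕ<sub>1</sub> = ϕ<sub>2</sub> explicit.</p>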
      <p id="d2e13667">Figure <xref ref-type="fig" rid="FC1"/> shows the across-track density of orbits calculated for the ISS, EarthCARE, the A-Train constellation of satellites and ICESat-2.</p>

      <fig id="FC1"><label>Figure C1</label><caption><p id="d2e13674">Normalised across-track density of orbits, <inline-formula><mml:math id="M834" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mtext>orbits</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, for the ISS, EarthCARE, A-Train satellite constellation, and ICESat-2. Horizontal dashed lines have fixed densities of 1, 2, 5 and 10. Vertical dashed lines correspond to the highest latitudes achieved by each satellite, and represent where Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E61"/>) diverges.</p></caption>
        <graphic xlink:href="https://amt.copernicus.org/articles/19/3511/2026/amt-19-3511-2026-f12.png"/>

      </fig>

</app>
  </app-group><notes notes-type="codedataavailability"><title>Code and data availability</title>

      <p id="d2e13700">Software implementing the framework and producing results is given by <xref ref-type="bibr" rid="bib1.bibx29" id="text.60"><named-content content-type="post"><ext-link xlink:href="https://doi.org/10.5281/zenodo.17830442" ext-link-type="DOI">10.5281/zenodo.17830442</ext-link></named-content></xref>. The Cloudnet data used in this study are generated by the Aerosol, Clouds and Trace Gases Research Infrastructure (ACTRIS) and are available from the ACTRIS Data Centre using the following link: <ext-link xlink:href="https://doi.org/10.60656/726097978e364d06" ext-link-type="DOI">10.60656/726097978e364d06</ext-link>. The specific dataset is given in <xref ref-type="bibr" rid="bib1.bibx11" id="text.61"><named-content content-type="post"><ext-link xlink:href="https://doi.org/10.60656/726097978e364d06" ext-link-type="DOI">10.60656/726097978e364d06</ext-link></named-content></xref>. ICESat-2 ATL09 data are downloaded from the NASA Harmony API (<uri>https://harmony.earthdata.nasa.gov/</uri>, last access: 22 May 2026) and utilise the ATL09 v6 data <xref ref-type="bibr" rid="bib1.bibx37" id="paren.62"><named-content content-type="post"><ext-link xlink:href="https://doi.org/10.5067/ATLAS/ATL09.006" ext-link-type="DOI">10.5067/ATLAS/ATL09.006</ext-link></named-content></xref>. Generated results can be accessed directly through <xref ref-type="bibr" rid="bib1.bibx30" id="text.63"><named-content content-type="post"><ext-link xlink:href="https://doi.org/10.5281/zenodo.17817304" ext-link-type="DOI">10.5281/zenodo.17817304</ext-link></named-content></xref>.</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e13737">ASM conceptualised the study; ASM designed the framework; ASM, HG, and MG designed the analysis methodology; ASM wrote the analysis software; ASM performed the analysis; all co-authors analysed and reviewed the results; ASM prepared the manuscript with feedback and contributions from all co-authors; HG and RN provided supervision.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e13743">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e13749">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e13755">This work was supported by the Natural Environment Research Council (NERC) Centre for Satellite Data in Environmental Science (SENSE) Centre for Doctoral Training (grant no. NE/T00939X/1). We acknowledge ACTRIS and Finnish Meteorological Institute for providing the Cloudnet data set which is available for download from <uri>https://cloudnet.fmi.fi</uri> (last access: 22 May 2026). The cloud radar data for Ny-Ålesund was provided by the University of Cologne, the ceilometer and microwave radiometer data by the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research. We thank the staff of AWIPEV research base in Ny-Ålesund for technical support of the measurements. We gratefully acknowledge the funding by the Deutsche Forschungsgemeinschaft DFG (German Research Foundation) – project number 268020496 – TRR 172, within the “Transregional Collaborative Research Center ‘ArctiC Amplification: Climate Relevant Atmospheric and SurfaCe Processes, and Feedback Mechanisms (AC)<sup>3</sup>’”. We acknowledge ECMWF for providing IFS model data, DWD for providing ICON model data, and NCEP (National Centers for Environmental Prediction) for providing access to GDAS1 data. The authors would like to thank the many teams contributing to maintaining ICESat-2 for their ongoing efforts in creating the atmospheric data products. This work used JASMIN, the UK's collaborative data analysis environment (<uri>https://www.jasmin.ac.uk</uri>, last access: 22 May 2026; <xref ref-type="bibr" rid="bib1.bibx20" id="altparen.64"/>). The Scientific colour maps acton, hawaii, imola, lipari, navia, roma and vik <xref ref-type="bibr" rid="bib1.bibx7" id="paren.65"/> are used in this study to prevent visual distortion of the data and exclusion of readers with colour-vision deficiencies <xref ref-type="bibr" rid="bib1.bibx8" id="paren.66"/>. ASM would like to thank Von P. 
Walden for contributions to the statistical methodology, and for his shared guidance and wisdom. ASM would like to acknowledge the rest of the ICECAPS team and their support over the past 2 years. ASM would like to thank WO, SH, ISSW and TM for making the long 2 weeks enjoyable, and KC for always being there when the going gets tough. The authors would like to thank the two reviewers of the manuscript and the handling editor for their in-depth reading and understanding of the manuscript, and for their insightful comments and suggestions, which have improved the work.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e13775">This research has been supported by the Natural Environment Research Council (grant no. NE/T00939X/1).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e13782">This paper was edited by Luca Lelli and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Alexander and Protat(2018)</label><mixed-citation>Alexander, S. P. and Protat, A.: Cloud Properties Observed From the Surface and by Satellite at the Northern Edge of the Southern Ocean, J. Geophys. Res.-Atmos., 123, 443–456, <ext-link xlink:href="https://doi.org/10.1002/2017JD026552" ext-link-type="DOI">10.1002/2017JD026552</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Baars et al.(2023)</label><mixed-citation>Baars, H., Walchester, J., Basharova, E., Gebauer, H., Radenz, M., Bühl, J., Barja, B., Wandinger, U., and Seifert, P.: Long-term validation of Aeolus L2B wind products at Punta Arenas, Chile, and Leipzig, Germany, Atmos. Meas. Tech., 16, 3809–3834, <ext-link xlink:href="https://doi.org/10.5194/amt-16-3809-2023" ext-link-type="DOI">10.5194/amt-16-3809-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Beirlant et al.(1997)</label><mixed-citation>Beirlant, J., Dudewicz, E. J., Györfi, L., and van der Meulen, E. C.: Nonparametric entropy estimation: An overview, International Journal of Mathematical and Statistical Sciences, 6, 17–39, 1997.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Blanchard et al.(2014)</label><mixed-citation>Blanchard, Y., Pelon, J., Eloranta, E. W., Moran, K. P., Delanoë, J., and  Sèze, G.: A Synergistic Analysis of Cloud Cover and Vertical Distribution from A-Train and Ground-Based Sensors over the High Arctic Station Eureka from 2006 to 2010, J. Appl. Meteorol. Clim.,  53, 2553–2570,  <ext-link xlink:href="https://doi.org/10.1175/JAMC-D-14-0021.1" ext-link-type="DOI">10.1175/JAMC-D-14-0021.1</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Compernolle et al.(2021)</label><mixed-citation>Compernolle, S., Argyrouli, A., Lutz, R., Sneep, M., Lambert, J.-C., Fjæraa, A. M., Hubert, D., Keppens, A., Loyola, D., O'Connor, E., Romahn, F., Stammes, P., Verhoelst, T., and Wang, P.: Validation of the Sentinel-5 Precursor TROPOMI cloud data with Cloudnet, Aura OMI O<sub>2</sub>–O<sub>2</sub>, MODIS, and Suomi-NPP VIIRS, Atmos. Meas. Tech., 14, 2451–2476, <ext-link xlink:href="https://doi.org/10.5194/amt-14-2451-2021" ext-link-type="DOI">10.5194/amt-14-2451-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Cover and Thomas(2006)</label><mixed-citation>Cover, T. M. and Thomas, J. A.: Elements of information theory,  Wiley-Interscience, Hoboken, N.J, 2nd edn., <ext-link xlink:href="https://doi.org/10.1002/047174882X" ext-link-type="DOI">10.1002/047174882X</ext-link>, ISBN 978-1-118-58577-1, ISBN 978-0-471-74881-6, ISBN 978-0-471-74882-3, ISBN 978-0-471-24195-9, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Crameri(2023)</label><mixed-citation>Crameri, F.: Scientific colour maps, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.1243862" ext-link-type="DOI">10.5281/zenodo.1243862</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Crameri et al.(2020)</label><mixed-citation>Crameri, F., Shephard, G. E., and Heron, P. J.: The misuse of colour in science communication, Nat. Commun., 11, 5444,  <ext-link xlink:href="https://doi.org/10.1038/s41467-020-19160-7" ext-link-type="DOI">10.1038/s41467-020-19160-7</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Darbellay and Vajda(1999)</label><mixed-citation>Darbellay, G. and Vajda, I.: Estimation of the information by an adaptive  partitioning of the observation space, IEEE T. Inform. Theory, 45, 1315–1321, <ext-link xlink:href="https://doi.org/10.1109/18.761290" ext-link-type="DOI">10.1109/18.761290</ext-link>, 1999.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Deneke et al.(2009)</label><mixed-citation>Deneke, H. M., Knap, W. H., and Simmer, C.: Multiresolution analysis of the  temporal variance and correlation of transmittance and reflectance of an  atmospheric column, J. Geophys. Res.-Atmos., 114,  <ext-link xlink:href="https://doi.org/10.1029/2008JD011680" ext-link-type="DOI">10.1029/2008JD011680</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Ebell et al.(2025)</label><mixed-citation>Ebell, K., Geiß, A., Kneifel, S., Marke, T., Maturilli, M., Moisseev, D.,  O'Connor, E., Patra, S., Pfitzenmaier, L., Pospichal, B., Ritter, C.,  Schween, J., and Zinner, T.: Custom collection of categorize data from Hyytiälä, Jülich, Munich, and Ny-Ålesund between 1 Oct 2018 and 1 Jan 2025, ACTRIS Cloud remote sensing data centre unit (CLU) [data set], <ext-link xlink:href="https://doi.org/10.60656/726097978E364D06" ext-link-type="DOI">10.60656/726097978E364D06</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Eibedingil et al.(2021)</label><mixed-citation>Eibedingil, I. G., Gill, T. E., Van Pelt, R. S., and Tong, D. Q.: Comparison of Aerosol Optical Depth from MODIS Product Collection 6.1 and AERONET in the Western United States, Remote Sensing, 13, 2316, <ext-link xlink:href="https://doi.org/10.3390/rs13122316" ext-link-type="DOI">10.3390/rs13122316</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Fuchs et al.(2022)</label><mixed-citation>Fuchs, J., Andersen, H., Cermak, J., Pauli, E., and Roebeling, R.: High-resolution satellite-based cloud detection for the analysis of land surface effects on boundary layer clouds, Atmos. Meas. Tech., 15, 4257–4270, <ext-link xlink:href="https://doi.org/10.5194/amt-15-4257-2022" ext-link-type="DOI">10.5194/amt-15-4257-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Herzfeld et al.(2021)</label><mixed-citation>Herzfeld, U., Hayes, A., Palm, S., Hancock, D., Vaughan, M., and Barbieri, K.: Detection and Height Measurement of Tenuous Clouds and Blowing Snow in ICESat-2 ATLAS Data, Geophys. Res. Lett., 48, e2021GL093473, <ext-link xlink:href="https://doi.org/10.1029/2021GL093473" ext-link-type="DOI">10.1029/2021GL093473</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Holmes and Nemenman(2019)</label><mixed-citation>Holmes, C. M. and Nemenman, I.: Estimation of mutual information for  real-valued data with error bars and controlled bias, Phys. Rev. E, 100,  022404, <ext-link xlink:href="https://doi.org/10.1103/PhysRevE.100.022404" ext-link-type="DOI">10.1103/PhysRevE.100.022404</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Illingworth et al.(2007)</label><mixed-citation>Illingworth, A. J., Hogan, R. J., O'Connor, E. J., Bouniol, D., Brooks, M. E., Delanoé, J., Donovan, D. P., Eastment, J. D., Gaussiat, N., Goddard, J. W. F., Haeffelin, M., Baltink, H. K., Krasnov, O. A., Pelon, J., Piriou,  J.-M., Protat, A., Russchenberg, H. W. J., Seifert, A., Tompkins, A. M.,  van Zadelhoff, G.-J., Vinit, F., Willén, U., Wilson, D. R., and Wrench,  C. L.: Cloudnet: Continuous Evaluation of Cloud Profiles in Seven Operational Models Using Ground-Based Observations, B. Am. Meteorol. Soc., 88, 883–898,  <ext-link xlink:href="https://doi.org/10.1175/BAMS-88-6-883" ext-link-type="DOI">10.1175/BAMS-88-6-883</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Jensen(1906)</label><mixed-citation>Jensen, J. L. W. V.: Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Math.-Djursholm, 30, 175–193, <ext-link xlink:href="https://doi.org/10.1007/BF02418571" ext-link-type="DOI">10.1007/BF02418571</ext-link>, 1906.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Kraskov et al.(2004)</label><mixed-citation>Kraskov, A., Stögbauer, H., and Grassberger, P.: Estimating mutual  information, Phys. Rev. E, 69, 066138, <ext-link xlink:href="https://doi.org/10.1103/PhysRevE.69.066138" ext-link-type="DOI">10.1103/PhysRevE.69.066138</ext-link>, 2004.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Langsdale et al.(2025)</label><mixed-citation>Langsdale, M., Verhoelst, T., Povey, A., Schutgens, N., Dowling, T., Lambert,  J.-C., Compernolle, S., and Kern, S.: The Challenges and Limitations of Validating Satellite-Derived Datasets Using Independent Measurements: Lessons Learned from Essential Climate Variables, Surv. Geophys., <ext-link xlink:href="https://doi.org/10.1007/s10712-025-09898-4" ext-link-type="DOI">10.1007/s10712-025-09898-4</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Lawrence et al.(2013)</label><mixed-citation>Lawrence, B. N., Bennett, V. L., Churchill, J., Juckes, M., Kershaw, P., Pascoe, S., Pepler, S., Pritchard, M., and Stephens, A.: Storing and manipulating environmental big data with JASMIN, in: 2013 IEEE International Conference on Big Data, IEEE, Silicon Valley, CA, USA, 68–75, <ext-link xlink:href="https://doi.org/10.1109/BigData.2013.6691556" ext-link-type="DOI">10.1109/BigData.2013.6691556</ext-link>, ISBN 978-1-4799-1293-3, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Lin et al.(2022)</label><mixed-citation>Lin, Y., Tian, P., Tang, C., Pang, S., and Zhang, L.: Combining CALIPSO and AERONET Data to Classify Aerosols Globally, IEEE T. Geosci. Remote, 60, 1–12, <ext-link xlink:href="https://doi.org/10.1109/TGRS.2021.3138085" ext-link-type="DOI">10.1109/TGRS.2021.3138085</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Liu et al.(2017)</label><mixed-citation>Liu, Y., Shupe, M. D., Wang, Z., and Mace, G.: Cloud vertical distribution from combined surface and space radar–lidar observations at two Arctic atmospheric observatories, Atmos. Chem. Phys., 17, 5973–5989, <ext-link xlink:href="https://doi.org/10.5194/acp-17-5973-2017" ext-link-type="DOI">10.5194/acp-17-5973-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Liu et al.(2010)</label><mixed-citation>Liu, Z., Marchand, R., and Ackerman, T.: A comparison of observations in the  tropical western Pacific from ground-based and satellite  millimeter-wavelength cloud radars, J. Geophys. Res.-Atmos., 115, <ext-link xlink:href="https://doi.org/10.1029/2009JD013575" ext-link-type="DOI">10.1029/2009JD013575</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Loew et al.(2017)</label><mixed-citation>Loew, A., Bell, W., Brocca, L., Bulgin, C. E., Burdanowitz, J., Calbet, X.,  Donner, R. V., Ghent, D., Gruber, A., Kaminski, T., Kinzel, J., Klepp, C.,  Lambert, J.-C., Schaepman-Strub, G., Schröder, M., and Verhoelst, T.:  Validation practices for satellite-based Earth observation data across  communities, Rev. Geophys., 55, 779–817, <ext-link xlink:href="https://doi.org/10.1002/2017RG000562" ext-link-type="DOI">10.1002/2017RG000562</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Lu et al.(2021)</label><mixed-citation>Lu, X., Mao, F., Rosenfeld, D., Zhu, Y., Pan, Z., and Gong, W.: Satellite retrieval of cloud base height and geometric thickness of low-level cloud based on CALIPSO, Atmos. Chem. Phys., 21, 11979–12003, <ext-link xlink:href="https://doi.org/10.5194/acp-21-11979-2021" ext-link-type="DOI">10.5194/acp-21-11979-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Mamouri et al.(2009)</label><mixed-citation>Mamouri, R. E., Amiridis, V., Papayannis, A., Giannakaki, E., Tsaknakis, G., and Balis, D. S.: Validation of CALIPSO space-borne-derived attenuated backscatter coefficient profiles using a ground-based lidar in Athens, Greece, Atmos. Meas. Tech., 2, 513–522, <ext-link xlink:href="https://doi.org/10.5194/amt-2-513-2009" ext-link-type="DOI">10.5194/amt-2-513-2009</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Markus et al.(2017)</label><mixed-citation>Markus, T., Neumann, T., Martino, A., Abdalati, W., Brunt, K., Csatho, B.,  Farrell, S., Fricker, H., Gardner, A., Harding, D., Jasinski, M., Kwok, R.,  Magruder, L., Lubin, D., Luthcke, S., Morison, J., Nelson, R.,  Neuenschwander, A., Palm, S., Popescu, S., Shum, C., Schutz, B. E., Smith,  B., Yang, Y., and Zwally, J.: The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): Science requirements, concept, and implementation, Remote Sens. Environ., 190, 260–273, <ext-link xlink:href="https://doi.org/10.1016/j.rse.2016.12.029" ext-link-type="DOI">10.1016/j.rse.2016.12.029</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Martin et al.(2021)</label><mixed-citation>Martin, A., Weissmann, M., Reitebuch, O., Rennie, M., Geiß, A., and Cress, A.: Validation of Aeolus winds using radiosonde observations and numerical weather prediction model equivalents, Atmos. Meas. Tech., 14, 2167–2183, <ext-link xlink:href="https://doi.org/10.5194/amt-14-2167-2021" ext-link-type="DOI">10.5194/amt-14-2167-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Martin(2025a)</label><mixed-citation>Martin, A. S.: DAndrewA/a-guide-to-optimised-spatiotemporal-data-co-location-by-mutual-information-maximisation:  v1.0.1, Version v1.0.1, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.17830442" ext-link-type="DOI">10.5281/zenodo.17830442</ext-link>, 2025a.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Martin(2025b)</label><mixed-citation>Martin, A. S.: Mutual information maximisation for spatiotemporal co-location: ICESat-2 ATL09 and Cloudnet categorize, Version v1, Zenodo [data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.17817304" ext-link-type="DOI">10.5281/zenodo.17817304</ext-link>, 2025b.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>McColl et al.(2014)</label><mixed-citation>McColl, K. A., Vogelzang, J., Konings, A. G., Entekhabi, D., Piles, M., and  Stoffelen, A.: Extended triple collocation: Estimating errors and  correlation coefficients with respect to an unknown target, Geophys. Res. Lett., 41, 6229–6236, <ext-link xlink:href="https://doi.org/10.1002/2014GL061322" ext-link-type="DOI">10.1002/2014GL061322</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>McErlich et al.(2021)</label><mixed-citation>McErlich, C., McDonald, A., Schuddeboom, A., and Silber, I.: Comparing Satellite- and Ground-Based Observations of Cloud Occurrence Over High Southern Latitudes, J. Geophys. Res.-Atmos., 126, e2020JD033607, <ext-link xlink:href="https://doi.org/10.1029/2020JD033607" ext-link-type="DOI">10.1029/2020JD033607</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>McGarry et al.(2021)</label><mixed-citation>McGarry, J. F., Carabajal, C. C., Saba, J. L., Reese, A. R., Holland, S. T.,  Palm, S. P., Swinski, J.-P. A., Golder, J. E., and Liiva, P. M.:  ICESat-2/ATLAS Onboard Flight Science Receiver Algorithms: Purpose, Process, and Performance, Earth and Space Science, 8, e2020EA001235, <ext-link xlink:href="https://doi.org/10.1029/2020EA001235" ext-link-type="DOI">10.1029/2020EA001235</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Mona et al.(2009)</label><mixed-citation>Mona, L., Pappalardo, G., Amodeo, A., D'Amico, G., Madonna, F., Boselli, A., Giunta, A., Russo, F., and Cuomo, V.: One year of CNR-IMAA multi-wavelength Raman lidar measurements in coincidence with CALIPSO overpasses: Level 1 products comparison, Atmos. Chem. Phys., 9, 7213–7228, <ext-link xlink:href="https://doi.org/10.5194/acp-9-7213-2009" ext-link-type="DOI">10.5194/acp-9-7213-2009</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Nearing et al.(2017)</label><mixed-citation>Nearing, G. S., Yatheendradas, S., Crow, W. T., Bosch, D. D., Cosh, M. H.,  Goodrich, D. C., Seyfried, M. S., and Starks, P. J.: Nonparametric triple  collocation, Water Resour. Res., 53, 5516–5530,  <ext-link xlink:href="https://doi.org/10.1002/2017WR020359" ext-link-type="DOI">10.1002/2017WR020359</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Neumann et al.(2019)</label><mixed-citation>Neumann, T. A., Martino, A. J., Markus, T., Bae, S., Bock, M. R., Brenner,  A. C., Brunt, K. M., Cavanaugh, J., Fernandes, S. T., Hancock, D. W.,  Harbeck, K., Lee, J., Kurtz, N. T., Luers, P. J., Luthcke, S. B., Magruder,  L., Pennington, T. A., Ramos-Izquierdo, L., Rebold, T., Skoog, J., and  Thomas, T. C.: The Ice, Cloud, and Land Elevation Satellite – 2  mission: A global geolocated photon product derived from the Advanced Topographic Laser Altimeter System, Remote Sens. Environ., 233, 111325, <ext-link xlink:href="https://doi.org/10.1016/j.rse.2019.111325" ext-link-type="DOI">10.1016/j.rse.2019.111325</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Palm et al.(2023)</label><mixed-citation>Palm, S., Yang, Y., Herzfeld, U., Hancock, D., Barbieri, K., Wimert, J., and the ICESat-2 Science Team: ATLAS/ICESat-2 L3A calibrated backscatter profiles and atmospheric layer characteristics, Version 6, NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC) [data set], Boulder, Colorado USA, <ext-link xlink:href="https://doi.org/10.5067/ATLAS/ATL09.006" ext-link-type="DOI">10.5067/ATLAS/ATL09.006</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Palm et al.(2021)</label><mixed-citation>Palm, S. P., Yang, Y., Herzfeld, U., Hancock, D., Hayes, A., Selmer, P., Hart,  W., and Hlavka, D.: ICESat-2 Atmospheric Channel Description, Data Processing and First Results, Earth and Space Science, 8, e2020EA001470, <ext-link xlink:href="https://doi.org/10.1029/2020EA001470" ext-link-type="DOI">10.1029/2020EA001470</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Papagiannopoulos et al.(2016)</label><mixed-citation>Papagiannopoulos, N., Mona, L., Alados-Arboledas, L., Amiridis, V., Baars, H., Binietoglou, I., Bortoli, D., D'Amico, G., Giunta, A., Guerrero-Rascado, J. L., Schwarz, A., Pereira, S., Spinelli, N., Wandinger, U., Wang, X., and Pappalardo, G.: CALIPSO climatological products: evaluation and suggestions from EARLINET, Atmos. Chem. Phys., 16, 2341–2357, <ext-link xlink:href="https://doi.org/10.5194/acp-16-2341-2016" ext-link-type="DOI">10.5194/acp-16-2341-2016</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Pappalardo et al.(2010)</label><mixed-citation>Pappalardo, G., Wandinger, U., Mona, L., Hiebsch, A., Mattis, I., Amodeo, A.,  Ansmann, A., Seifert, P., Linné, H., Apituley, A., Alados Arboledas, L.,  Balis, D., Chaikovsky, A., D'Amico, G., De Tomasi, F., Freudenthaler, V.,  Giannakaki, E., Giunta, A., Grigorov, I., Iarlori, M., Madonna, F., Mamouri,  R.-E., Nasti, L., Papayannis, A., Pietruczuk, A., Pujadas, M., Rizi, V.,  Rocadenbosch, F., Russo, F., Schnell, F., Spinelli, N., Wang, X., and  Wiegner, M.: EARLINET correlative measurements for CALIPSO: First intercomparison results, J. Geophys. Res.-Atmos., 115,  <ext-link xlink:href="https://doi.org/10.1029/2009JD012147" ext-link-type="DOI">10.1029/2009JD012147</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Pauly et al.(2019)</label><mixed-citation>Pauly, R. M., Yorks, J. E., Hlavka, D. L., McGill, M. J., Amiridis, V., Palm, S. P., Rodier, S. D., Vaughan, M. A., Selmer, P. A., Kupchock, A. W., Baars, H., and Gialitaki, A.: Cloud-Aerosol Transport System (CATS) 1064 nm calibration and validation, Atmos. Meas. Tech., 12, 6241–6258, <ext-link xlink:href="https://doi.org/10.5194/amt-12-6241-2019" ext-link-type="DOI">10.5194/amt-12-6241-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Polyanskiy and Wu(2024)</label><mixed-citation>Polyanskiy, Y. and Wu, Y.: Information Theory: From Coding to Learning,  Cambridge University Press, 1st edn., <ext-link xlink:href="https://doi.org/10.1017/9781108966351" ext-link-type="DOI">10.1017/9781108966351</ext-link>, ISBN 978-1-108-96635-1, ISBN 978-1-108-83290-8, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Proestakis et al.(2019)</label><mixed-citation>Proestakis, E., Amiridis, V., Marinou, E., Binietoglou, I., Ansmann, A., Wandinger, U., Hofer, J., Yorks, J., Nowottnick, E., Makhmudov, A., Papayannis, A., Pietruczuk, A., Gialitaki, A., Apituley, A., Szkop, A., Muñoz Porcar, C., Bortoli, D., Dionisi, D., Althausen, D., Mamali, D., Balis, D., Nicolae, D., Tetoni, E., Liberti, G. L., Baars, H., Mattis, I., Stachlewska, I. S., Voudouri, K. A., Mona, L., Mylonaki, M., Perrone, M. R., Costa, M. J., Sicard, M., Papagiannopoulos, N., Siomos, N., Burlizzi, P., Pauly, R., Engelmann, R., Abdullaev, S., and Pappalardo, G.: EARLINET evaluation of the CATS Level 2 aerosol backscatter coefficient product, Atmos. Chem. Phys., 19, 11743–11764, <ext-link xlink:href="https://doi.org/10.5194/acp-19-11743-2019" ext-link-type="DOI">10.5194/acp-19-11743-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Protat et al.(2009)</label><mixed-citation>Protat, A., Bouniol, D., Delanoë, J., O'Connor, E., May, P. T., Plana-Fattori, A., Hasson, A., Görsdorf, U., and Heymsfield, A. J.: Assessment of Cloudsat Reflectivity Measurements and Ice Cloud Properties Using Ground-Based and Airborne Cloud Radar Observations, J. Atmos. Ocean. Tech., 26, 1717–1741, <ext-link xlink:href="https://doi.org/10.1175/2009JTECHA1246.1" ext-link-type="DOI">10.1175/2009JTECHA1246.1</ext-link>, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Protat et al.(2014a)</label><mixed-citation>Protat, A., Young, S. A., McFarlane, S. A., L'Ecuyer, T., Mace, G. G.,  Comstock, J. M., Long, C. N., Berry, E., and Delanoë, J.: Reconciling Ground-Based and Space-Based Estimates of the Frequency of Occurrence and Radiative Effect of Clouds around Darwin, Australia, J. Appl. Meteorol. Clim., 53, 456–478, <ext-link xlink:href="https://doi.org/10.1175/JAMC-D-13-072.1" ext-link-type="DOI">10.1175/JAMC-D-13-072.1</ext-link>, 2014a.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Protat et al.(2014b)</label><mixed-citation>Protat, A., Young, S. A., Rikus, L., and Whimpey, M.: Evaluation of hydrometeor frequency of occurrence in a limited-area numerical weather prediction system using near real-time CloudSat–CALIPSO observations, Q. J. Roy. Meteor. Soc., 140, 2430–2443, <ext-link xlink:href="https://doi.org/10.1002/qj.2308" ext-link-type="DOI">10.1002/qj.2308</ext-link>, 2014b.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Robinson et al.(2025)</label><mixed-citation>Robinson, J., Jaeglé, L., Palm, S. P., Shupe, M. D., Liston, G. E., and Frey,  M. M.: ICESat-2 Observations of Blowing Snow Over Arctic Sea Ice During the 2019–2020 MOSAiC Expedition, J. Geophys. Res.-Atmos., 130, e2025JD043919, <ext-link xlink:href="https://doi.org/10.1029/2025JD043919" ext-link-type="DOI">10.1029/2025JD043919</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Rodgers(2008)</label><mixed-citation>Rodgers, C. D.: Inverse methods for atmospheric sounding: theory and practice, in: Series on atmospheric, oceanic and planetary physics – Vol. 2, World Scientific, Singapore, reprinted edition, ISBN 978-981-02-2740-1, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Roebeling et al.(2008)</label><mixed-citation>Roebeling, R. A., Deneke, H. M., and Feijt, A. J.: Validation of Cloud Liquid Water Path Retrievals from SEVIRI Using One Year of CloudNET Observations, J. Appl. Meteorol. Clim., 47, 206–222, <ext-link xlink:href="https://doi.org/10.1175/2007JAMC1661.1" ext-link-type="DOI">10.1175/2007JAMC1661.1</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Sayer et al.(2020)</label><mixed-citation>Sayer, A. M., Govaerts, Y., Kolmonen, P., Lipponen, A., Luffarelli, M., Mielonen, T., Patadia, F., Popp, T., Povey, A. C., Stebel, K., and Witek, M. L.: A review and framework for the evaluation of pixel-level uncertainty estimates in satellite aerosol remote sensing, Atmos. Meas. Tech., 13, 373–404, <ext-link xlink:href="https://doi.org/10.5194/amt-13-373-2020" ext-link-type="DOI">10.5194/amt-13-373-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx51"><label>Schuster et al.(2012)</label><mixed-citation>Schuster, G. L., Vaughan, M., MacDonnell, D., Su, W., Winker, D., Dubovik, O., Lapyonok, T., and Trepte, C.: Comparison of CALIPSO aerosol optical depth retrievals to AERONET measurements, and a climatology for the lidar ratio of dust, Atmos. Chem. Phys., 12, 7431–7452, <ext-link xlink:href="https://doi.org/10.5194/acp-12-7431-2012" ext-link-type="DOI">10.5194/acp-12-7431-2012</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Schutgens et al.(2017)</label><mixed-citation>Schutgens, N., Tsyro, S., Gryspeerdt, E., Goto, D., Weigum, N., Schulz, M., and Stier, P.: On the spatio-temporal representativeness of observations, Atmos. Chem. Phys., 17, 9761–9780, <ext-link xlink:href="https://doi.org/10.5194/acp-17-9761-2017" ext-link-type="DOI">10.5194/acp-17-9761-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Schölzel and Friederichs(2008)</label><mixed-citation>Schölzel, C. and Friederichs, P.: Multivariate non-normally distributed random variables in climate research – introduction to the copula approach, Nonlin. Processes Geophys., 15, 761–772, <ext-link xlink:href="https://doi.org/10.5194/npg-15-761-2008" ext-link-type="DOI">10.5194/npg-15-761-2008</ext-link>, 2008.</mixed-citation></ref>
      <ref id="bib1.bibx54"><label>Shannon(1948)</label><mixed-citation>Shannon, C. E.: A mathematical theory of communication, Bell Syst. Tech. J., 27, 379–423, <ext-link xlink:href="https://doi.org/10.1002/j.1538-7305.1948.tb01338.x" ext-link-type="DOI">10.1002/j.1538-7305.1948.tb01338.x</ext-link>, 1948.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Shupe et al.(2011)</label><mixed-citation>Shupe, M. D., Walden, V. P., Eloranta, E., Uttal, T., Campbell, J. R., Starkweather, S. M., and Shiobara, M.: Clouds at Arctic Atmospheric Observatories. Part I: Occurrence and Macrophysical Properties, J. Appl. Meteorol. Clim., 50, 626–644, <ext-link xlink:href="https://doi.org/10.1175/2010JAMC2467.1" ext-link-type="DOI">10.1175/2010JAMC2467.1</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Silber et al.(2018)</label><mixed-citation>Silber, I., Verlinde, J., Eloranta, E. W., and Cadeddu, M.: Antarctic Cloud Macrophysical, Thermodynamic Phase, and Atmospheric Inversion Coupling Properties at McMurdo Station: I. Principal Data Processing and Climatology, J. Geophys. Res.-Atmos., 123, 6099–6121, <ext-link xlink:href="https://doi.org/10.1029/2018JD028279" ext-link-type="DOI">10.1029/2018JD028279</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Soch et al.(2025)</label><mixed-citation>Soch, J., The Book of Statistical Proofs, Sarıtaş, K., Maja, Monticone, P.,  Faulkenberry, T. J., Martin, O. A., Kipnis, A., Balkus, S., lfkdlfdlk,  Allefeld, C., Atze, H., Knapp, A., McInerney, C. D., Lo4ding00, Ohan, V.,  amvosk, and maxgrozo: StatProofBook/StatProofBook.github.io: StatProofBook 2024, Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.4305949" ext-link-type="DOI">10.5281/zenodo.4305949</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx58"><label>Stone(2022)</label><mixed-citation>Stone, J. V.: Information theory: a tutorial introduction, Sebtel Press, Sheffield, United Kingdom, 2nd edn., ISBN 978-1-7396727-0-6, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx59"><label>Tukiainen et al.(2020)</label><mixed-citation>Tukiainen, S., O'Connor, E., and Korpinen, A.: CloudnetPy: A Python package for processing cloud remote sensing data, Journal of Open Source Software, 5, 2123, <ext-link xlink:href="https://doi.org/10.21105/joss.02123" ext-link-type="DOI">10.21105/joss.02123</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx60"><label>Verhoelst et al.(2015)</label><mixed-citation>Verhoelst, T., Granville, J., Hendrick, F., Köhler, U., Lerot, C., Pommereau, J.-P., Redondas, A., Van Roozendael, M., and Lambert, J.-C.: Metrology of ground-based satellite validation: co-location mismatch and smoothing issues of total ozone comparisons, Atmos. Meas. Tech., 8, 5039–5062, <ext-link xlink:href="https://doi.org/10.5194/amt-8-5039-2015" ext-link-type="DOI">10.5194/amt-8-5039-2015</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx61"><label>Verhoelst et al.(2026)</label><mixed-citation>Verhoelst, T., Povey, A. C., Gruber, A., Bulgin, C. E., Keppens, A.,  Compernolle, S., and Lambert, J.-C.: Confidently Uncertain: Validating Satellite ECV Measurement Uncertainty Estimates, Surv. Geophys., <ext-link xlink:href="https://doi.org/10.1007/s10712-026-09939-6" ext-link-type="DOI">10.1007/s10712-026-09939-6</ext-link>, 2026.</mixed-citation></ref>
      <ref id="bib1.bibx62"><label>Virtanen et al.(2018)</label><mixed-citation>Virtanen, T. H., Kolmonen, P., Sogacheva, L., Rodríguez, E., Saponaro, G., and de Leeuw, G.: Collocation mismatch uncertainties in satellite aerosol retrieval validation, Atmos. Meas. Tech., 11, 925–938, <ext-link xlink:href="https://doi.org/10.5194/amt-11-925-2018" ext-link-type="DOI">10.5194/amt-11-925-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx63"><label>von Clarmann(2006)</label><mixed-citation>von Clarmann, T.: Validation of remotely sensed profiles of atmospheric state variables: strategies and terminology, Atmos. Chem. Phys., 6, 4311–4320, <ext-link xlink:href="https://doi.org/10.5194/acp-6-4311-2006" ext-link-type="DOI">10.5194/acp-6-4311-2006</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx64"><label>Wang et al.(2024)</label><mixed-citation>Wang, P., Donovan, D. P., van Zadelhoff, G.-J., de Kloe, J., Huber, D., and Reissig, K.: Evaluation of Aeolus feature mask and particle extinction coefficient profile products using CALIPSO data, Atmos. Meas. Tech., 17, 5935–5955, <ext-link xlink:href="https://doi.org/10.5194/amt-17-5935-2024" ext-link-type="DOI">10.5194/amt-17-5935-2024</ext-link>, 2024.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>A guide to optimised spatiotemporal data co-location by mutual information maximisation</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Alexander and Protat(2018)</label><mixed-citation>
      
Alexander, S. P. and Protat, A.: Cloud Properties Observed From the Surface and by Satellite at the Northern Edge of the Southern Ocean, J. Geophys. Res.-Atmos., 123, 443–456, <a href="https://doi.org/10.1002/2017JD026552" target="_blank">https://doi.org/10.1002/2017JD026552</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Baars et al.(2023)</label><mixed-citation>
      
Baars, H., Walchester, J., Basharova, E., Gebauer, H., Radenz, M., Bühl, J., Barja, B., Wandinger, U., and Seifert, P.: Long-term validation of Aeolus L2B wind products at Punta Arenas, Chile, and Leipzig, Germany, Atmos. Meas. Tech., 16, 3809–3834, <a href="https://doi.org/10.5194/amt-16-3809-2023" target="_blank">https://doi.org/10.5194/amt-16-3809-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Beirlant et al.(1997)</label><mixed-citation>
      
Beirlant, J., Dudewicz, E., Gyor, L., and Meulen, E.: Nonparametric entropy  estimation: An overview, International Journal of Mathematical and Statistical Sciences, 6, 17–39, 1997.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Blanchard et al.(2014)</label><mixed-citation>
      
Blanchard, Y., Pelon, J., Eloranta, E. W., Moran, K. P., Delanoë, J., and  Sèze, G.: A Synergistic Analysis of Cloud Cover and Vertical Distribution from A-Train and Ground-Based Sensors over the High Arctic Station Eureka from 2006 to 2010, J. Appl. Meteorol. Clim.,  53, 2553–2570,  <a href="https://doi.org/10.1175/JAMC-D-14-0021.1" target="_blank">https://doi.org/10.1175/JAMC-D-14-0021.1</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Compernolle et al.(2021)</label><mixed-citation>
      
Compernolle, S., Argyrouli, A., Lutz, R., Sneep, M., Lambert, J.-C., Fjæraa, A. M., Hubert, D., Keppens, A., Loyola, D., O'Connor, E., Romahn, F., Stammes, P., Verhoelst, T., and Wang, P.: Validation of the Sentinel-5 Precursor TROPOMI cloud data with Cloudnet, Aura OMI O<sub>2</sub>–O<sub>2</sub>, MODIS, and Suomi-NPP VIIRS, Atmos. Meas. Tech., 14, 2451–2476, <a href="https://doi.org/10.5194/amt-14-2451-2021" target="_blank">https://doi.org/10.5194/amt-14-2451-2021</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Cover and Thomas(2006)</label><mixed-citation>
      
Cover, T. M. and Thomas, J. A.: Elements of information theory,  Wiley-Interscience, Hoboken, N.J, 2nd edn., <a href="https://doi.org/10.1002/047174882X" target="_blank">https://doi.org/10.1002/047174882X</a>, ISBN&thinsp;978-1-118-58577-1, ISBN&thinsp;978-0-471-74881-6, ISBN&thinsp;978-0-471-74882-3, ISBN&thinsp;978-0-471-24195-9, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Crameri(2023)</label><mixed-citation>
      
Crameri, F.: Scientific colour maps, Zenodo [code], <a href="https://doi.org/10.5281/zenodo.1243862" target="_blank">https://doi.org/10.5281/zenodo.1243862</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Crameri et al.(2020)</label><mixed-citation>
      
Crameri, F., Shephard, G. E., and Heron, P. J.: The misuse of colour in science communication, Nat. Commun., 11, 5444,  <a href="https://doi.org/10.1038/s41467-020-19160-7" target="_blank">https://doi.org/10.1038/s41467-020-19160-7</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Darbellay and Vajda(1999)</label><mixed-citation>
      
Darbellay, G. and Vajda, I.: Estimation of the information by an adaptive  partitioning of the observation space, IEEE T. Inform. Theory, 45, 1315–1321, <a href="https://doi.org/10.1109/18.761290" target="_blank">https://doi.org/10.1109/18.761290</a>, 1999.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Deneke et al.(2009)</label><mixed-citation>
      
Deneke, H. M., Knap, W. H., and Simmer, C.: Multiresolution analysis of the  temporal variance and correlation of transmittance and reflectance of an  atmospheric column, J. Geophys. Res.-Atmos., 114,  <a href="https://doi.org/10.1029/2008JD011680" target="_blank">https://doi.org/10.1029/2008JD011680</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Ebell et al.(2025)</label><mixed-citation>
      
Ebell, K., Geiß, A., Kneifel, S., Marke, T., Maturilli, M., Moisseev, D.,  O'Connor, E., Patra, S., Pfitzenmaier, L., Pospichal, B., Ritter, C.,  Schween, J., and Zinner, T.: Custom collection of categorize data from Hyytiälä, Jülich, Munich, and Ny-Ålesund between 1 Oct 2018 and 1 Jan 2025, ACTRIS Cloud remote sensing data centre unit (CLU) [data set], <a href="https://doi.org/10.60656/726097978E364D06" target="_blank">https://doi.org/10.60656/726097978E364D06</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Eibedingil et al.(2021)</label><mixed-citation>
      
Eibedingil, I. G., Gill, T. E., Van Pelt, R. S., and Tong, D. Q.: Comparison of Aerosol Optical Depth from MODIS Product Collection 6.1 and AERONET in the Western United States, Remote Sensing, 13, 2316, <a href="https://doi.org/10.3390/rs13122316" target="_blank">https://doi.org/10.3390/rs13122316</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Fuchs et al.(2022)</label><mixed-citation>
      
Fuchs, J., Andersen, H., Cermak, J., Pauli, E., and Roebeling, R.: High-resolution satellite-based cloud detection for the analysis of land surface effects on boundary layer clouds, Atmos. Meas. Tech., 15, 4257–4270, <a href="https://doi.org/10.5194/amt-15-4257-2022" target="_blank">https://doi.org/10.5194/amt-15-4257-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Herzfeld et al.(2021)</label><mixed-citation>
      
Herzfeld, U., Hayes, A., Palm, S., Hancock, D., Vaughan, M., and Barbieri, K.: Detection and Height Measurement of Tenuous Clouds and Blowing Snow in ICESat-2 ATLAS Data, Geophys. Res. Lett., 48, e2021GL093473, <a href="https://doi.org/10.1029/2021GL093473" target="_blank">https://doi.org/10.1029/2021GL093473</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Holmes and Nemenman(2019)</label><mixed-citation>
      
Holmes, C. M. and Nemenman, I.: Estimation of mutual information for  real-valued data with error bars and controlled bias, Phys. Rev. E, 100,  022404, <a href="https://doi.org/10.1103/PhysRevE.100.022404" target="_blank">https://doi.org/10.1103/PhysRevE.100.022404</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Illingworth et al.(2007)</label><mixed-citation>
      
Illingworth, A. J., Hogan, R. J., O'Connor, E. J., Bouniol, D., Brooks, M. E., Delanoé, J., Donovan, D. P., Eastment, J. D., Gaussiat, N., Goddard, J. W. F., Haeffelin, M., Baltink, H. K., Krasnov, O. A., Pelon, J., Piriou,  J.-M., Protat, A., Russchenberg, H. W. J., Seifert, A., Tompkins, A. M.,  van Zadelhoff, G.-J., Vinit, F., Willén, U., Wilson, D. R., and Wrench,  C. L.: Cloudnet: Continuous Evaluation of Cloud Profiles in Seven Operational Models Using Ground-Based Observations, B. Am. Meteorol. Soc., 88, 883–898,  <a href="https://doi.org/10.1175/BAMS-88-6-883" target="_blank">https://doi.org/10.1175/BAMS-88-6-883</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Jensen(1906)</label><mixed-citation>
      
Jensen, J. L. W. V.: Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Math.-Djursholm, 30, 175–193, <a href="https://doi.org/10.1007/BF02418571" target="_blank">https://doi.org/10.1007/BF02418571</a>, 1906.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Kraskov et al.(2004)</label><mixed-citation>
      
Kraskov, A., Stögbauer, H., and Grassberger, P.: Estimating mutual  information, Phys. Rev. E, 69, 066138, <a href="https://doi.org/10.1103/PhysRevE.69.066138" target="_blank">https://doi.org/10.1103/PhysRevE.69.066138</a>, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Langsdale et al.(2025)</label><mixed-citation>
      
Langsdale, M., Verhoelst, T., Povey, A., Schutgens, N., Dowling, T., Lambert,  J.-C., Compernolle, S., and Kern, S.: The Challenges and Limitations of Validating Satellite-Derived Datasets Using Independent Measurements: Lessons Learned from Essential Climate Variables, Surv. Geophys., <a href="https://doi.org/10.1007/s10712-025-09898-4" target="_blank">https://doi.org/10.1007/s10712-025-09898-4</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Lawrence et al.(2013)</label><mixed-citation>
      
Lawrence, B. N., Bennett, V. L., Churchill, J., Juckes, M., Kershaw, P.,  Pascoe, S., Pepler, S., Pritchard, M., and Stephens, A.: Storing and  manipulating environmental big data with JASMIN, in: 2013 IEEE International Conference on Big Data, IEEE, Silicon Valley, CA, USA, 68–75, <a href="https://doi.org/10.1109/BigData.2013.6691556" target="_blank">https://doi.org/10.1109/BigData.2013.6691556</a>, ISBN 978-1-4799-1293-3, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Lin et al.(2022)</label><mixed-citation>
      
Lin, Y., Tian, P., Tang, C., Pang, S., and Zhang, L.: Combining CALIPSO and AERONET Data to Classify Aerosols Globally, IEEE T. Geosci. Remote, 60, 1–12, <a href="https://doi.org/10.1109/TGRS.2021.3138085" target="_blank">https://doi.org/10.1109/TGRS.2021.3138085</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Liu et al.(2017)</label><mixed-citation>
      
Liu, Y., Shupe, M. D., Wang, Z., and Mace, G.: Cloud vertical distribution from combined surface and space radar–lidar observations at two Arctic atmospheric observatories, Atmos. Chem. Phys., 17, 5973–5989, <a href="https://doi.org/10.5194/acp-17-5973-2017" target="_blank">https://doi.org/10.5194/acp-17-5973-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Liu et al.(2010)</label><mixed-citation>
      
Liu, Z., Marchand, R., and Ackerman, T.: A comparison of observations in the  tropical western Pacific from ground-based and satellite  millimeter-wavelength cloud radars, J. Geophys. Res.-Atmos., 115, <a href="https://doi.org/10.1029/2009JD013575" target="_blank">https://doi.org/10.1029/2009JD013575</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Loew et al.(2017)</label><mixed-citation>
      
Loew, A., Bell, W., Brocca, L., Bulgin, C. E., Burdanowitz, J., Calbet, X.,  Donner, R. V., Ghent, D., Gruber, A., Kaminski, T., Kinzel, J., Klepp, C.,  Lambert, J.-C., Schaepman-Strub, G., Schröder, M., and Verhoelst, T.:  Validation practices for satellite-based Earth observation data across  communities, Rev. Geophys., 55, 779–817, <a href="https://doi.org/10.1002/2017RG000562" target="_blank">https://doi.org/10.1002/2017RG000562</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Lu et al.(2021)</label><mixed-citation>
      
Lu, X., Mao, F., Rosenfeld, D., Zhu, Y., Pan, Z., and Gong, W.: Satellite retrieval of cloud base height and geometric thickness of low-level cloud based on CALIPSO, Atmos. Chem. Phys., 21, 11979–12003, <a href="https://doi.org/10.5194/acp-21-11979-2021" target="_blank">https://doi.org/10.5194/acp-21-11979-2021</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Mamouri et al.(2009)</label><mixed-citation>
      
Mamouri, R. E., Amiridis, V., Papayannis, A., Giannakaki, E., Tsaknakis, G., and Balis, D. S.: Validation of CALIPSO space-borne-derived attenuated backscatter coefficient profiles using a ground-based lidar in Athens, Greece, Atmos. Meas. Tech., 2, 513–522, <a href="https://doi.org/10.5194/amt-2-513-2009" target="_blank">https://doi.org/10.5194/amt-2-513-2009</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Markus et al.(2017)</label><mixed-citation>
      
Markus, T., Neumann, T., Martino, A., Abdalati, W., Brunt, K., Csatho, B.,  Farrell, S., Fricker, H., Gardner, A., Harding, D., Jasinski, M., Kwok, R.,  Magruder, L., Lubin, D., Luthcke, S., Morison, J., Nelson, R.,  Neuenschwander, A., Palm, S., Popescu, S., Shum, C., Schutz, B. E., Smith,  B., Yang, Y., and Zwally, J.: The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): Science requirements, concept, and implementation, Remote Sens. Environ., 190, 260–273, <a href="https://doi.org/10.1016/j.rse.2016.12.029" target="_blank">https://doi.org/10.1016/j.rse.2016.12.029</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Martin et al.(2021)</label><mixed-citation>
      
Martin, A., Weissmann, M., Reitebuch, O., Rennie, M., Geiß, A., and Cress, A.: Validation of Aeolus winds using radiosonde observations and numerical weather prediction model equivalents, Atmos. Meas. Tech., 14, 2167–2183, <a href="https://doi.org/10.5194/amt-14-2167-2021" target="_blank">https://doi.org/10.5194/amt-14-2167-2021</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Martin(2025a)</label><mixed-citation>
      
Martin, A. S.: DAndrewA/a-guide-to-optimised-spatiotemporal-data-co-location-by-mutual-information-maximisation:  v1.0.1, Version v1.0.1, Zenodo [code], <a href="https://doi.org/10.5281/zenodo.17830442" target="_blank">https://doi.org/10.5281/zenodo.17830442</a>, 2025a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Martin(2025b)</label><mixed-citation>
      
Martin, A. S.: Mutual information maximisation for spatiotemporal co-location: ICESat-2 ATL09 and Cloudnet categorize, Version v1, Zenodo [data set], <a href="https://doi.org/10.5281/zenodo.17817304" target="_blank">https://doi.org/10.5281/zenodo.17817304</a>, 2025b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>McColl et al.(2014)</label><mixed-citation>
      
McColl, K. A., Vogelzang, J., Konings, A. G., Entekhabi, D., Piles, M., and  Stoffelen, A.: Extended triple collocation: Estimating errors and  correlation coefficients with respect to an unknown target, Geophys. Res. Lett., 41, 6229–6236, <a href="https://doi.org/10.1002/2014GL061322" target="_blank">https://doi.org/10.1002/2014GL061322</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>McErlich et al.(2021)</label><mixed-citation>
      
McErlich, C., McDonald, A., Schuddeboom, A., and Silber, I.: Comparing Satellite- and Ground-Based Observations of Cloud Occurrence Over High Southern Latitudes, J. Geophys. Res.-Atmos., 126, e2020JD033607, <a href="https://doi.org/10.1029/2020JD033607" target="_blank">https://doi.org/10.1029/2020JD033607</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>McGarry et al.(2021)</label><mixed-citation>
      
McGarry, J. F., Carabajal, C. C., Saba, J. L., Reese, A. R., Holland, S. T.,  Palm, S. P., Swinski, J.-P. A., Golder, J. E., and Liiva, P. M.:  ICESat-2/ATLAS Onboard Flight Science Receiver Algorithms: Purpose, Process, and Performance, Earth and Space Science, 8, e2020EA001235, <a href="https://doi.org/10.1029/2020EA001235" target="_blank">https://doi.org/10.1029/2020EA001235</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Mona et al.(2009)</label><mixed-citation>
      
Mona, L., Pappalardo, G., Amodeo, A., D'Amico, G., Madonna, F., Boselli, A., Giunta, A., Russo, F., and Cuomo, V.: One year of CNR-IMAA multi-wavelength Raman lidar measurements in coincidence with CALIPSO overpasses: Level 1 products comparison, Atmos. Chem. Phys., 9, 7213–7228, <a href="https://doi.org/10.5194/acp-9-7213-2009" target="_blank">https://doi.org/10.5194/acp-9-7213-2009</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Nearing et al.(2017)</label><mixed-citation>
      
Nearing, G. S., Yatheendradas, S., Crow, W. T., Bosch, D. D., Cosh, M. H.,  Goodrich, D. C., Seyfried, M. S., and Starks, P. J.: Nonparametric triple  collocation, Water Resour. Res., 53, 5516–5530,  <a href="https://doi.org/10.1002/2017WR020359" target="_blank">https://doi.org/10.1002/2017WR020359</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Neumann et al.(2019)</label><mixed-citation>
      
Neumann, T. A., Martino, A. J., Markus, T., Bae, S., Bock, M. R., Brenner,  A. C., Brunt, K. M., Cavanaugh, J., Fernandes, S. T., Hancock, D. W.,  Harbeck, K., Lee, J., Kurtz, N. T., Luers, P. J., Luthcke, S. B., Magruder,  L., Pennington, T. A., Ramos-Izquierdo, L., Rebold, T., Skoog, J., and  Thomas, T. C.: The Ice, Cloud, and Land Elevation Satellite – 2  mission: A global geolocated photon product derived from the Advanced Topographic Laser Altimeter System, Remote Sens. Environ., 233, 111325, <a href="https://doi.org/10.1016/j.rse.2019.111325" target="_blank">https://doi.org/10.1016/j.rse.2019.111325</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Palm et al.(2023)</label><mixed-citation>
      
Palm, S., Yang, Y., Herzfeld, U., Hancock, D., Barbieri, K., Wimert, J., and  the ICESat-2 Science Team: ATLAS/ICESat-2 L3A calibrated backscatter profiles and atmospheric layer characteristics, Version 6, NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC) [data set], Boulder, Colorado USA, <a href="https://doi.org/10.5067/ATLAS/ATL09.006" target="_blank">https://doi.org/10.5067/ATLAS/ATL09.006</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Palm et al.(2021)</label><mixed-citation>
      
Palm, S. P., Yang, Y., Herzfeld, U., Hancock, D., Hayes, A., Selmer, P., Hart,  W., and Hlavka, D.: ICESat-2 Atmospheric Channel Description, Data Processing and First Results, Earth and Space Science, 8, e2020EA001470, <a href="https://doi.org/10.1029/2020EA001470" target="_blank">https://doi.org/10.1029/2020EA001470</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Papagiannopoulos et al.(2016)</label><mixed-citation>
      
Papagiannopoulos, N., Mona, L., Alados-Arboledas, L., Amiridis, V., Baars, H., Binietoglou, I., Bortoli, D., D'Amico, G., Giunta, A., Guerrero-Rascado, J. L., Schwarz, A., Pereira, S., Spinelli, N., Wandinger, U., Wang, X., and Pappalardo, G.: CALIPSO climatological products: evaluation and suggestions from EARLINET, Atmos. Chem. Phys., 16, 2341–2357, <a href="https://doi.org/10.5194/acp-16-2341-2016" target="_blank">https://doi.org/10.5194/acp-16-2341-2016</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Pappalardo et al.(2010)</label><mixed-citation>
      
Pappalardo, G., Wandinger, U., Mona, L., Hiebsch, A., Mattis, I., Amodeo, A.,  Ansmann, A., Seifert, P., Linné, H., Apituley, A., Alados Arboledas, L.,  Balis, D., Chaikovsky, A., D'Amico, G., De Tomasi, F., Freudenthaler, V.,  Giannakaki, E., Giunta, A., Grigorov, I., Iarlori, M., Madonna, F., Mamouri,  R.-E., Nasti, L., Papayannis, A., Pietruczuk, A., Pujadas, M., Rizi, V.,  Rocadenbosch, F., Russo, F., Schnell, F., Spinelli, N., Wang, X., and  Wiegner, M.: EARLINET correlative measurements for CALIPSO: First intercomparison results, J. Geophys. Res.-Atmos., 115,  <a href="https://doi.org/10.1029/2009JD012147" target="_blank">https://doi.org/10.1029/2009JD012147</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Pauly et al.(2019)</label><mixed-citation>
      
Pauly, R. M., Yorks, J. E., Hlavka, D. L., McGill, M. J., Amiridis, V., Palm, S. P., Rodier, S. D., Vaughan, M. A., Selmer, P. A., Kupchock, A. W., Baars, H., and Gialitaki, A.: Cloud-Aerosol Transport System (CATS) 1064 nm calibration and validation, Atmos. Meas. Tech., 12, 6241–6258, <a href="https://doi.org/10.5194/amt-12-6241-2019" target="_blank">https://doi.org/10.5194/amt-12-6241-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Polyanskiy and Wu(2024)</label><mixed-citation>
      
Polyanskiy, Y. and Wu, Y.: Information Theory: From Coding to Learning,  Cambridge University Press, 1st edn., <a href="https://doi.org/10.1017/9781108966351" target="_blank">https://doi.org/10.1017/9781108966351</a>, ISBN&thinsp;978-1-108-96635-1, ISBN&thinsp;978-1-108-83290-8, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Proestakis et al.(2019)</label><mixed-citation>
      
Proestakis, E., Amiridis, V., Marinou, E., Binietoglou, I., Ansmann, A., Wandinger, U., Hofer, J., Yorks, J., Nowottnick, E., Makhmudov, A., Papayannis, A., Pietruczuk, A., Gialitaki, A., Apituley, A., Szkop, A., Muñoz Porcar, C., Bortoli, D., Dionisi, D., Althausen, D., Mamali, D., Balis, D., Nicolae, D., Tetoni, E., Liberti, G. L., Baars, H., Mattis, I., Stachlewska, I. S., Voudouri, K. A., Mona, L., Mylonaki, M., Perrone, M. R., Costa, M. J., Sicard, M., Papagiannopoulos, N., Siomos, N., Burlizzi, P., Pauly, R., Engelmann, R., Abdullaev, S., and Pappalardo, G.: EARLINET evaluation of the CATS Level 2 aerosol backscatter coefficient product, Atmos. Chem. Phys., 19, 11743–11764, <a href="https://doi.org/10.5194/acp-19-11743-2019" target="_blank">https://doi.org/10.5194/acp-19-11743-2019</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Protat et al.(2009)</label><mixed-citation>
      
Protat, A., Bouniol, D., Delanoë, J., O'Connor, E., May, P. T.,  Plana-Fattori, A., Hasson, A., Görsdorf, U., and Heymsfield, A. J.: Assessment of Cloudsat Reflectivity Measurements and Ice Cloud Properties Using Ground-Based and Airborne Cloud Radar Observations, J. Atmos. Ocean. Tech., 26, 1717–1741, <a href="https://doi.org/10.1175/2009JTECHA1246.1" target="_blank">https://doi.org/10.1175/2009JTECHA1246.1</a>, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Protat et al.(2014a)</label><mixed-citation>
      
Protat, A., Young, S. A., McFarlane, S. A., L'Ecuyer, T., Mace, G. G.,  Comstock, J. M., Long, C. N., Berry, E., and Delanoë, J.: Reconciling Ground-Based and Space-Based Estimates of the Frequency of Occurrence and Radiative Effect of Clouds around Darwin, Australia, J. Appl. Meteorol. Clim., 53, 456–478, <a href="https://doi.org/10.1175/JAMC-D-13-072.1" target="_blank">https://doi.org/10.1175/JAMC-D-13-072.1</a>, 2014a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Protat et al.(2014b)</label><mixed-citation>
      
Protat, A., Young, S. A., Rikus, L., and Whimpey, M.: Evaluation of hydrometeor frequency of occurrence in a limited-area numerical weather prediction system using near real-time CloudSat–CALIPSO observations, Q. J. Roy. Meteor. Soc., 140, 2430–2443, <a href="https://doi.org/10.1002/qj.2308" target="_blank">https://doi.org/10.1002/qj.2308</a>, 2014b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Robinson et al.(2025)</label><mixed-citation>
      
Robinson, J., Jaeglé, L., Palm, S. P., Shupe, M. D., Liston, G. E., and Frey,  M. M.: ICESat-2 Observations of Blowing Snow Over Arctic Sea Ice During the 2019–2020 MOSAiC Expedition, J. Geophys. Res.-Atmos., 130, e2025JD043919, <a href="https://doi.org/10.1029/2025JD043919" target="_blank">https://doi.org/10.1029/2025JD043919</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Rodgers(2008)</label><mixed-citation>
      
Rodgers, C. D.: Inverse methods for atmospheric sounding: theory and practice, in: Series on atmospheric, oceanic and planetary physics – Vol. 2, World Scientific, Singapore, reprinted edition, ISBN&thinsp;978-981-02-2740-1, 2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Roebeling et al.(2008)</label><mixed-citation>
      
Roebeling, R. A., Deneke, H. M., and Feijt, A. J.: Validation of Cloud Liquid Water Path Retrievals from SEVIRI Using One Year of CloudNET Observations, J. Appl. Meteorol. Clim., 47, 206–222, <a href="https://doi.org/10.1175/2007JAMC1661.1" target="_blank">https://doi.org/10.1175/2007JAMC1661.1</a>, 2008.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Sayer et al.(2020)</label><mixed-citation>
      
Sayer, A. M., Govaerts, Y., Kolmonen, P., Lipponen, A., Luffarelli, M., Mielonen, T., Patadia, F., Popp, T., Povey, A. C., Stebel, K., and Witek, M. L.: A review and framework for the evaluation of pixel-level uncertainty estimates in satellite aerosol remote sensing, Atmos. Meas. Tech., 13, 373–404, <a href="https://doi.org/10.5194/amt-13-373-2020" target="_blank">https://doi.org/10.5194/amt-13-373-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Schuster et al.(2012)</label><mixed-citation>
      
Schuster, G. L., Vaughan, M., MacDonnell, D., Su, W., Winker, D., Dubovik, O., Lapyonok, T., and Trepte, C.: Comparison of CALIPSO aerosol optical depth retrievals to AERONET measurements, and a climatology for the lidar ratio of dust, Atmos. Chem. Phys., 12, 7431–7452, <a href="https://doi.org/10.5194/acp-12-7431-2012" target="_blank">https://doi.org/10.5194/acp-12-7431-2012</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Schutgens et al.(2017)</label><mixed-citation>
      
Schutgens, N., Tsyro, S., Gryspeerdt, E., Goto, D., Weigum, N., Schulz, M., and Stier, P.: On the spatio-temporal representativeness of observations, Atmos. Chem. Phys., 17, 9761–9780, <a href="https://doi.org/10.5194/acp-17-9761-2017" target="_blank">https://doi.org/10.5194/acp-17-9761-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Schölzel and Friederichs(2008)</label><mixed-citation>
      
Schölzel, C. and Friederichs, P.: Multivariate non-normally distributed random variables in climate research – introduction to the copula approach, Nonlin. Processes Geophys., 15, 761–772, <a href="https://doi.org/10.5194/npg-15-761-2008" target="_blank">https://doi.org/10.5194/npg-15-761-2008</a>, 2008.
    </mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Shannon(1948)</label><mixed-citation>
      
Shannon, C. E.: A mathematical theory of communication, Bell Syst. Tech. J., 27, 379–423, <a href="https://doi.org/10.1002/j.1538-7305.1948.tb01338.x" target="_blank">https://doi.org/10.1002/j.1538-7305.1948.tb01338.x</a>, 1948.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Shupe et al.(2011)</label><mixed-citation>
      
Shupe, M. D., Walden, V. P., Eloranta, E., Uttal, T., Campbell, J. R.,  Starkweather, S. M., and Shiobara, M.: Clouds at Arctic Atmospheric Observatories. Part I: Occurrence and Macrophysical Properties, J. Appl. Meteorol. Clim., 50, 626–644, <a href="https://doi.org/10.1175/2010JAMC2467.1" target="_blank">https://doi.org/10.1175/2010JAMC2467.1</a>, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Silber et al.(2018)</label><mixed-citation>
      
Silber, I., Verlinde, J., Eloranta, E. W., and Cadeddu, M.: Antarctic Cloud Macrophysical, Thermodynamic Phase, and Atmospheric Inversion Coupling Properties at McMurdo Station: I. Principal Data Processing and Climatology, J. Geophys. Res.-Atmos., 123, 6099–6121, <a href="https://doi.org/10.1029/2018JD028279" target="_blank">https://doi.org/10.1029/2018JD028279</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Soch et al.(2025)</label><mixed-citation>
      
Soch, J., The Book of Statistical Proofs, Sarıtaş, K., Maja, Monticone, P.,  Faulkenberry, T. J., Martin, O. A., Kipnis, A., Balkus, S., lfkdlfdlk,  Allefeld, C., Atze, H., Knapp, A., McInerney, C. D., Lo4ding00, Ohan, V.,  amvosk, and maxgrozo: StatProofBook/StatProofBook.github.io: StatProofBook 2024, Zenodo [code], <a href="https://doi.org/10.5281/zenodo.4305949" target="_blank">https://doi.org/10.5281/zenodo.4305949</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>Stone(2022)</label><mixed-citation>
      
Stone, J. V.: Information theory: a tutorial introduction, Sebtel Press,  Sheffield, United Kingdom, 2nd edn., ISBN&thinsp;978-1-7396727-0-6, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>Tukiainen et al.(2020)</label><mixed-citation>
      
Tukiainen, S., O'Connor, E., and Korpinen, A.: CloudnetPy: A Python package for processing cloud remote sensing data, Journal of Open Source Software, 5, 2123, <a href="https://doi.org/10.21105/joss.02123" target="_blank">https://doi.org/10.21105/joss.02123</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>Verhoelst et al.(2015)</label><mixed-citation>
      
Verhoelst, T., Granville, J., Hendrick, F., Köhler, U., Lerot, C., Pommereau, J.-P., Redondas, A., Van Roozendael, M., and Lambert, J.-C.: Metrology of ground-based satellite validation: co-location mismatch and smoothing issues of total ozone comparisons, Atmos. Meas. Tech., 8, 5039–5062, <a href="https://doi.org/10.5194/amt-8-5039-2015" target="_blank">https://doi.org/10.5194/amt-8-5039-2015</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>Verhoelst et al.(2026)</label><mixed-citation>
      
Verhoelst, T., Povey, A. C., Gruber, A., Bulgin, C. E., Keppens, A.,  Compernolle, S., and Lambert, J.-C.: Confidently Uncertain: Validating Satellite ECV Measurement Uncertainty Estimates, Surv. Geophys., <a href="https://doi.org/10.1007/s10712-026-09939-6" target="_blank">https://doi.org/10.1007/s10712-026-09939-6</a>, 2026.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>Virtanen et al.(2018)</label><mixed-citation>
      
Virtanen, T. H., Kolmonen, P., Sogacheva, L., Rodríguez, E., Saponaro, G., and de Leeuw, G.: Collocation mismatch uncertainties in satellite aerosol retrieval validation, Atmos. Meas. Tech., 11, 925–938, <a href="https://doi.org/10.5194/amt-11-925-2018" target="_blank">https://doi.org/10.5194/amt-11-925-2018</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>von Clarmann(2006)</label><mixed-citation>
      
von Clarmann, T.: Validation of remotely sensed profiles of atmospheric state variables: strategies and terminology, Atmos. Chem. Phys., 6, 4311–4320, <a href="https://doi.org/10.5194/acp-6-4311-2006" target="_blank">https://doi.org/10.5194/acp-6-4311-2006</a>, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>Wang et al.(2024)</label><mixed-citation>
      
Wang, P., Donovan, D. P., van Zadelhoff, G.-J., de Kloe, J., Huber, D., and Reissig, K.: Evaluation of Aeolus feature mask and particle extinction coefficient profile products using CALIPSO data, Atmos. Meas. Tech., 17, 5935–5955, <a href="https://doi.org/10.5194/amt-17-5935-2024" target="_blank">https://doi.org/10.5194/amt-17-5935-2024</a>, 2024.

    </mixed-citation></ref-html>--></article>
