Ground-based validation of the Copernicus Sentinel-5p TROPOMI NO2 measurements with the NDACC ZSL-DOAS, MAX-DOAS and Pandonia global networks

This paper reports on consolidated ground-based validation results of the atmospheric NO2 data produced operationally since April 2018 by the TROPOspheric Monitoring Instrument (TROPOMI) on board the ESA/EU Copernicus Sentinel-5 Precursor (S5P) satellite. Tropospheric, stratospheric, and total NO2 column data from S5P are compared to correlative measurements collected from, respectively, 19 Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS), 26 Network for the Detection of Atmospheric Composition Change (NDACC) Zenith-Scattered-Light DOAS (ZSL-DOAS), and 25 Pandonia Global Network (PGN)/Pandora instruments distributed globally. The validation methodology gives special care to minimizing mismatch errors due to imperfect spatio-temporal co-location of the satellite and correlative data, e.g. by using tailored observation operators to account for differences in smoothing and in sampling of atmospheric structures and variability, and photochemical modelling to reduce diurnal cycle effects. Compared to the ground-based measurements, S5P data show, on average, (i) a negative bias for the tropospheric column data, of typically −23 % to −37 % in clean to slightly polluted conditions but reaching values as high as −51 % over highly polluted areas; (ii) a slight negative median difference for the stratospheric column data, of −0.2 Pmolec cm−2, i.e. approx. −2 % in summer to −15 % in winter; (iii) a bias ranging from zero to −50 % for the total column data, found to depend on the amplitude of the total NO2 column, with small to slightly positive bias values for columns below 6 Pmolec cm−2 and negative values above.


Introduction
Nitrogen oxides, and in particular NOx (NO and NO2), are important trace gases both in the troposphere and the stratosphere. In the troposphere they are produced mainly by the combustion of fossil and other organic fuels and by the production and use of nitrogen fertilizers for agriculture. They can also have a natural origin, e.g. lightning, biological processes in soils, and biomass burning. The NO/NO2 ratio varies with solar illumination primarily, from 0.2-0.5 during the day down to zero at night. NOx are converted to nitric acid and nitrates, which are removed by dry deposition and rain, resulting in a tropospheric lifetime of a few hours to days. Tropospheric NOx are pollutants as well as proxies for other pollutants resulting from the (high-temperature) combustion of organic fuels. They are precursors for tropospheric ozone and aerosols and contribute to acid rain and smog. Because of their adverse health effects, local to national regulations limiting boundary layer NOx concentrations are now in place in a long list of countries across the world. In the stratosphere, NOx are formed by the photolysis of tropospheric nitrous oxide (N2O) produced by biogenic and anthropogenic processes and going up through the troposphere and stratosphere. Stratospheric NOx control the abundance of ozone as a catalyst in ozone destruction processes but also by mitigating ozone losses caused by catalytic cycles involving anthropogenic halogens through the lock-up of these halogens in so-called long-lived reservoirs.
The global distribution, cycles, and trends of atmospheric NO2 have been measured from space by a large number of instruments on low Earth orbit (LEO) satellites. Since the late 1970s, its stratospheric and sometimes mesospheric abundance have been measured by limb-viewing and solar-occultation instruments working in the UV-visible and infrared spectral ranges: SME, LIMS, SAGE(-II), HALOE, and POAM-2/POAM-3, etc. and, in the last decade, OSIRIS, GOMOS, MIPAS, SCIAMACHY, Scisat ACE, and SAGE-III. Follow-on missions combining limb and occultation measurements are in development, like ALTIUS planned for the coming years. Pioneered in 1995 with ERS-2 GOME, which for the first time brought NO2 column measurements into space by Differential Optical Absorption Spectroscopy (DOAS; Noxon et al., 1979; Platt and Perner, 1983), the global monitoring of tropospheric NO2 has continued uninterruptedly with a suite of UV-visible DOAS instruments with improving sensitivity and horizontal resolution: Envisat SCIAMACHY (Bovensmann et al., 1999), EOS-Aura OMI (Levelt et al., 2018), and the series of MetOp-A/B/C GOME-2 (Valks et al., 2011; Liu et al., 2019b).

Atmos. Meas. Tech., 14, 481-510, 2021, https://doi.org/10.5194/amt-14-481-2021
Owing to its cardinal role in air quality, tropospheric chemistry, and stratospheric ozone, and as a precursor of essential climate variables (ECVs), the monitoring of atmospheric NO2 on a global scale has been given proper attention in the European Earth Observation programme Copernicus. The Copernicus Space Component (CSC) is developing a constellation of atmospheric composition Sentinel satellites with complementary NO2 measurement capabilities, consisting of Sentinel-4 geostationary missions (with hourly monitoring over Europe) and Sentinel-5 LEO missions (with daily monitoring globally), to be launched from 2023 onwards. A NO2 measurement channel is also planned for the Copernicus Carbon Dioxide Monitoring mission CO2M for better attribution of the atmospheric emissions. The first element in orbit of this LEO+GEO constellation, the TROPOspheric Monitoring Instrument (TROPOMI), was launched on board ESA's Sentinel-5 Precursor (S5P) early-afternoon LEO satellite in October 2017. This hyperspectral imaging spectrometer measures the Earth's radiance, at 0.2-0.4 nm resolution in the visible absorption band of NO2, over ground pixels as small as 7.0 × 3.5 or 5.5 × 3.5 km (before and after the switch to smaller pixel size on 6 August 2019, respectively) and with an almost daily global coverage thanks to a swath width of 2600 km.
Pre-launch mission requirements for the Copernicus Sentinel NO2 data are, for the tropospheric NO2 column, a bias lower than 50 % and an uncertainty lower than 0.7 Pmolec cm−2, and for the stratospheric NO2 column, a bias lower than 10 % and an uncertainty lower than 0.5 Pmolec cm−2 (ESA, 2017a, b). Since the beginning of its nominal operation in April 2018, in-flight compliance of S5P TROPOMI with these mission requirements has been monitored routinely by means of comparisons to ground-based reference measurements in the Validation Data Analysis Facility (VDAF) of the S5P Mission Performance Centre (MPC) and by comparison with similar satellite data from OMI and GOME-2. The Copernicus S5P MPC routine operations validation service is complemented with ground-based validation studies carried out in the framework of ESA's S5P Validation Team (S5PVT) through research projects funded nationally like NIDFORVAL (see details in the Acknowledgements). Ground-based validation of satellite NO2 data (e.g. Petritoli et al., 2003; Brinksma et al., 2008; Celarier et al., 2008; Ionov et al., 2008; Valks et al., 2011; Compernolle et al., 2020b; Pinardi et al., 2020) relies classically on three types of UV-visible DOAS instruments, which, thanks to complementary measurement techniques, provide correlative observations sensitive to the three components of the S5P data product: Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) measures the tropospheric column during the day, Zenith-Scattered-Light DOAS (ZSL-DOAS) the stratospheric column at dawn and dusk, and Pandora direct Sun instruments the total column during the day. Currently, these three types of instruments contribute to global monitoring networks.

Figure 1. Geographical distribution of the UV-visible DOAS spectrometers contributing the ground-based correlative measurements: 26 NDACC ZSL-DOAS instruments in green, 19 MAX-DOAS instruments in blue, and 25 PGN instruments in red.
Figure 1 shows the geographical distribution of instruments contributing data to the reported S5P validation study.
In this paper, we report on the consolidated results of the S5P NO2 ground-based validation activities for the first 2 years of nominal operation. The TROPOMI tropospheric, stratospheric, and total column data products under investigation, together with the corresponding ground-based reference data, are described in Sect. 2. This is followed by a brief assessment of the coherence between the data generated by the near-real-time (NRTI) and offline (OFFL) channels of the operational processors. For clarity, in separate sections we present results for the stratospheric (Sect. 4), tropospheric (Sect. 5), and total (Sect. 6) NO2 columns. These three sections include a description of the preparation of the filtered, co-located, and harmonized data pairs to be compared and the comparison results. Robust, harmonized statistical estimators are derived from the comparisons consistently throughout the paper: the median difference as a proxy for the bias and half of the 68 % interpercentile (IP68/2) as a measure of the comparison spread (equivalent to a standard deviation for a normal distribution but much less sensitive to unavoidable outliers). Thereafter, in Sect. 7, these individual results are assembled and discussed all together, to derive conclusions on their mutual coherence, on the fitness for purpose of the S5P data, and on remaining challenges for the accurate validation of NO2 observations from space.
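The two robust estimators named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `robust_stats` is hypothetical, and it assumes that the 68 % interpercentile is the central interval between the 16th and 84th percentiles of the differences.

```python
import numpy as np

def robust_stats(differences):
    """Return (median difference, IP68/2) for satellite-minus-ground differences.

    Assumption: the 68 % interpercentile is taken as the central interval
    between the 16th and 84th percentiles.
    """
    d = np.asarray(differences, dtype=float)
    d = d[np.isfinite(d)]                    # ignore missing co-locations
    median = np.median(d)                    # proxy for the bias
    p16, p84 = np.percentile(d, [16.0, 84.0])
    return median, 0.5 * (p84 - p16)         # spread, ~1 sigma if Gaussian
```

For normally distributed differences the second value approaches the standard deviation, while a handful of outliers barely moves either estimator.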
2 Data description

2.1 S5P TROPOMI data

The retrieval of NO2 (sub)columns from TROPOMI Earth nadir radiance and solar irradiance spectra is a three-step process relying on DOAS and on a chemical transport model (CTM)-based stratosphere-troposphere separation. The TROPOMI NO2 algorithm is an adaptation of the QA4ECV community retrieval approach and of the DOMINO/TEMIS algorithm (Boersma et al., 2007), already applied successfully to heritage and current satellite data records (GOME, SCIAMACHY, OMI, GOME-2). In the first step, the integrated amount of NO2 along the optical path, or slant column density (SCD), is derived using the classical DOAS approach (Platt and Perner, 1983). In the second step, the retrieved SCD is assimilated by the TM5-MP CTM to allocate a vertical profile of the NO2 concentration, needed for the separation between stratospheric and tropospheric SCDs. This assimilation procedure favours observations over pristine, remote areas where the entire NO2 SCD can be attributed to the stratospheric component. Assuming relatively slow changes in the stratospheric NOx field, the model transports information to areas with a more significant tropospheric component. In the third step, the three slant (sub)column densities are converted into vertical (sub)column densities using appropriate air mass factors (AMFs). The CTM can be run either in forecast mode, using 1 d forecast meteorological data from the European Centre for Medium-Range Weather Forecasts (ECMWF), or in a more delayed processing mode, using 0-12 h forecast meteorological data. The former is used for near-real-time (NRTI) processing of the TROPOMI measurements, the latter for the offline (OFFL) production. For full technical details, the reader is referred to the Product Readme File (PRF), Product User Manual (PUM), and Algorithm Theoretical Basis Document (ATBD), all available at http://www.tropomi.eu/data-products/nitrogen-dioxide (last access: 5 January 2021).
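As a compact reminder of the third step, the conversion from slant to vertical (sub)columns can be written as below. This is a generic sketch of the DOAS relation rather than the exact formulation of the ATBD; the symbols S (slant column), V (vertical column), and M (air mass factor) are notation introduced here for illustration.

```latex
\begin{align}
  V_{\mathrm{strat}} &= \frac{S_{\mathrm{strat}}}{M_{\mathrm{strat}}}, \\
  V_{\mathrm{trop}}  &= \frac{S - S_{\mathrm{strat}}}{M_{\mathrm{trop}}}, \\
  V_{\mathrm{total}} &= \frac{S}{M_{\mathrm{total}}}.
\end{align}
```

Here S is the total slant column from the first step and S_strat is the stratospheric part attributed by the assimilation in the second step.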
A detailed description and quality assessment of the derived slant column data have already been published by van Geffen et al. (2020), and a publication on satellite intercomparison of vertical column data is under preparation. The current paper addresses the independent ground-based validation of vertical subcolumn densities in the troposphere and stratosphere and of the vertical total column. The S5P dataset validated here covers the nominal operational phase (Phase E2) of the S5P mission, starting in April 2018 and up to February 2020. No data obtained during the commissioning phase of the satellite have been used. Table 1 provides an overview of the processor versions to which this corresponds. Besides very detailed quality flags, the S5P NO2 data product includes a combined quality assurance value (qa_value) enabling end users to easily filter data for their own purpose. For tropospheric applications (when not using the averaging kernels), the guideline is to use only NO2 data with a qa_value > 0.75. This removes very cloudy scenes (cloud radiance fraction > 0.5), snow- or ice-covered scenes, and problematic retrievals. For stratospheric applications, where clouds are less of an issue, a more relaxed threshold of qa_value > 0.5 is recommended. These data filtering recommendations have been applied here, where the stricter requirement of qa_value > 0.75 has been used for the total column validation as well. Again, further details on this can be found in the PRF, PUM, and ATBD.
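The filtering rule described above amounts to a simple mask on the qa_value. The sketch below assumes the columns and qa values have already been read into arrays; the function and constant names are hypothetical, and only the two thresholds come from the text.

```python
import numpy as np

# Thresholds quoted in the text; the names are illustrative only.
QA_TROPOSPHERIC_OR_TOTAL = 0.75
QA_STRATOSPHERIC = 0.5

def filter_by_qa(columns, qa_values, threshold):
    """Keep only the column values whose qa_value exceeds the threshold."""
    columns = np.asarray(columns, dtype=float)
    qa_values = np.asarray(qa_values, dtype=float)
    return columns[qa_values > threshold]

# Example: three pixels, only the last passes the tropospheric filter.
tropospheric = filter_by_qa([1.0, 2.0, 3.0], [0.4, 0.6, 0.9],
                            QA_TROPOSPHERIC_OR_TOTAL)
```

The same call with `QA_STRATOSPHERIC` keeps the second pixel as well, reflecting the more relaxed recommendation for stratospheric applications.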

NDACC zenith-sky DOAS data
Since the pioneering era of NO2 column measurements from space with ERS-2 GOME in the mid-1990s, ground-based UV-visible DOAS measurements at twilight have served as a reference for the validation of NO2 total column data over unpolluted stations and of NO2 stratospheric column data from all nadir UV-visible satellites to date (e.g. Lambert et al., 1997a, b; Petritoli et al., 2003; Celarier et al., 2008; Ionov et al., 2008; Gruzdev and Elokhov, 2010; Hendrick et al., 2011; Robles-Gonzalez et al., 2016). Here as well, S5P TROPOMI stratospheric NO2 column data are compared to the correlative measurements acquired by ZSL-DOAS (Zenith-Scattered-Light Differential Optical Absorption Spectroscopy) UV-visible spectrometers (e.g. Solomon et al., 1987; Hendrick et al., 2011, and references therein). A key property of zenith-sky measurements at twilight is the geometrical enhancement of the optical path in the stratosphere (Solomon et al., 1987), which offers high sensitivity to stratospheric absorbers of visible radiation and lower sensitivity to clouds and tropospheric species (except in the case of strong pollution events during thunderstorms or thick haze; see, for example, Pfeilsticker et al., 1999). However, the geometrical enhancement also implies horizontal smoothing of the measured information over hundreds of kilometres, which requires appropriate co-location methods to avoid large discrepancies with the higher resolution measurements of TROPOMI, as discussed in Sect. 4.1 (Pommereau and Goutail, 1988). For the current work, ZSL-DOAS validation data have been obtained (1) through the LATMOS_RT Facility (in near-real-time processing mode), (2) from the NDACC Data Host Facility (DHF), and (3) via private communication with the instrument operator. The geographical distribution of these instruments is shown in Fig. 1, and further details are provided in Sect. A1.
Measurements are made during twilight, at sunrise and at sunset, but only sunset measurements are used here for signal-to-noise reasons (larger NO2 column) and because these happen closer in time to the early-afternoon overpass of S5P. NDACC intercomparison campaigns (Roscoe et al., 1999; Vandaele et al., 2005) indicate an uncertainty of about 4 %-7 % on the slant column density. After conversion of the slant column into a vertical column using a zenith-sky AMF, and for the latest version of the data processing, the uncertainty on the vertical column is estimated to be on the order of 10 %-14 % (Bognar et al., 2019). Estimated uncertainties for all ground-based measurement types are summarized in Table 2. In Sect. 4.1, the photochemical adjustment required to correctly compare twilight with midday measurements is described.

MAX-DOAS data
Satellite tropospheric NO2 column data are compared classically to correlative measurements acquired by Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) instruments (Hönninger and Platt, 2002; Hönninger et al., 2004). From sunrise to sunset, MAX-DOAS instruments measure the UV-visible radiance scattered in several directions and elevation angles, from which the tropospheric vertical column density (VCD) and/or the lowest part of the tropospheric NO2 profile (usually up to 3 km altitude, and up to 10 km at best) can be retrieved through different techniques (see, for example, Clémer et al., 2010; Hendrick et al., 2014; Friedrich et al., 2019; Bösch et al., 2018; Irie et al., 2008, 2011; Vlemmix et al., 2010; Wagner et al., 2011; Beirle et al., 2019), with between 1 and 3 degrees of freedom. Their horizontal spatial representativeness varies with the aerosol load and the spectral region of the retrieval, from a few kilometres to tens of kilometres (Irie et al., 2011; Wagner et al., 2011; Wang et al., 2014). Published total uncertainty estimates on the NO2 tropospheric VCD are of the order of 7 %-17 % in polluted conditions, including both random (around 3 % to 10 %, depending on the instrument) and systematic (11 % to 14 %) contributions (Irie et al., 2011; Wagner et al., 2011; Hendrick et al., 2014; Kanaya et al., 2014). These ranges are more or less confirmed by the uncertainties reported in the data files, as visualized in Fig. A1. Nevertheless, differences in the reported uncertainties and in the actual measurement of the same scene between individual instruments are sometimes larger, and the main potential sources of these inhomogeneities are summarized below:
- Different uncertainty reporting strategy. The reported systematic uncertainty may include only that from the NO2 cross sections (approx. 3 %; UNAM, BIRA-IASB, MPIC, AUTH, IUPB), or it may include also a contribution from the VCD retrieval step (up to 14 % in JAMSTEC data and 20 % in KNMI data) and the aerosol retrieval (Chiba U; Irie et al., 2011).
- Different SCD retrieval. Recommended common DOAS settings are used by all groups in the present study, and when doing so, instrument intercomparison campaigns like CINDI-1 and CINDI-2 (Roscoe et al., 2010; Kreher et al., 2020) revealed relative biases between 3 % and 10 % in the differential slant column density (DSCD).
- Different methods to retrieve VCD from DSCD (see also Table A2). Using either (1) vertical profile inversion using optimal estimation (BIRA-IASB, UNAM); (2) profile inversion using (an optimal estimation of) parameterized profile shapes (JAMSTEC and Chiba U); (3) direct retrieval via the calculation of a tropospheric AMF (QA4ECV datasets); or (4) direct retrieval using a geometrical approximation can lead to systematic differences in the 5 %-15 % range (Vlemmix et al., 2015; Frieß et al., 2019).

Table 2. Estimated uncertainties for the different types of ground-based measurements used in this work. Ex ante refers to uncertainties provided with the data, based on a propagation of raw measurement uncertainties and on sensitivity analyses. Ex post refers to uncertainty estimates derived by comparison with other (independent) measurements, which inevitably also contain some representativeness uncertainties. More detail is provided in the dedicated subsections of Sect. 2.

Ongoing efforts to harmonize MAX-DOAS tropospheric NO2 data processing, e.g. as part of the ESA FRM4DOAS project, should help minimize such network inhomogeneities in the near future. MAX-DOAS data have been used extensively for tropospheric NO2 satellite validation, for instance for Aura OMI and MetOp GOME-2 (e.g. by Celarier et al., 2008; Irie et al., 2012; Lin et al., 2014; Kanaya et al., 2014; Wang et al., 2017; Drosoglou et al., 2018; Liu et al., 2019a; Compernolle et al., 2020b; Pinardi et al., 2020), as well as for the evaluation of modelling results (Vlemmix et al., 2015; Blechschmidt et al., 2020).
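To make the simplest of the four VCD retrieval options listed above concrete, the geometrical approximation can be sketched as follows. This is the commonly used textbook form, in which the off-axis AMF is approximated by 1/sin(elevation) and the zenith AMF by 1; it is not necessarily the exact implementation of any contributing group, and the function name is hypothetical.

```python
import math

def geometric_tropospheric_vcd(dscd, elevation_deg):
    """Tropospheric VCD from an off-axis differential slant column (DSCD).

    Assumes the geometrical approximation: AMF(elevation) ~ 1/sin(elevation)
    and AMF(zenith) ~ 1, so the differential AMF is 1/sin(elevation) - 1.
    """
    if not 0.0 < elevation_deg < 90.0:
        raise ValueError("elevation angle must lie strictly between 0 and 90 degrees")
    differential_amf = 1.0 / math.sin(math.radians(elevation_deg)) - 1.0
    return dscd / differential_amf

# At 30 degrees elevation the differential AMF is 1, so VCD equals DSCD.
vcd = geometric_tropospheric_vcd(10.0, 30.0)
```

The approximation ignores aerosols and the NO2 profile shape, which is one reason why it can differ from profile-inversion results by the percentages quoted above.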

Data are collected either through ESA's Atmospheric Validation Data Centre (EVDC; https://evdc.esa.int/, last access: 5 January 2021) or by direct delivery from the instrument principal investigators (e.g. within the S5PVT NIDFORVAL AO project). Currently, 19 MAX-DOAS stations have contributed correlative data in the TROPOMI measurement period from April 2018 to February 2020. Detailed information about the stations and instruments is provided in Sect. A2. A few contributing sites measure in several geometries (e.g. Xianghe measures in both MAX-DOAS and direct Sun mode; Bremen and Athens both report MAX-DOAS and zenith-sky measurements) or have multiple instruments (e.g. Cabauw and UNAM stations host both MAX-DOAS and Pandora instruments). This allows for detailed (sub)column consistency checks and in-depth analysis of the site peculiarities, which are beyond the scope of the present overview paper.

PGN/Pandora data
The Pandonia Global Network (PGN) delivers direct Sun total column and multi-axis tropospheric column observations of several trace gases, including NO2, from a network of ground-based standardized Pandora Sun photometers in an automated way. In this work, only direct Sun observations are used. These have a random error uncertainty of about 0.27 Pmolec cm−2 and a systematic error uncertainty of 2.7 Pmolec cm−2 (Herman et al., 2009). Studies at US and Korean sites during the DISCOVER-AQ campaign found a good agreement of Pandora instruments with aircraft in situ measurements (within 20 % on average; Choi et al., 2019), although larger differences are observed for individual sites.
For the current work, 25 sites have contributed Pandora data, collected either from the ESA Atmospheric Validation Data Centre (EVDC) (https://evdc.esa.int/, last access: 5 January 2021) or from the PGN data archive (https://pandonia-global-network.org/, last access: 5 January 2021). Only data files from a recent quality upgrade (processor version 1.7, retrieval version nvs1, with file version 004 and 005; see https://www.pandonia-global-network.org/home/documents/release-notes/, last access: 5 January 2021) were used, with 005 files (consolidated data) having precedence over 004 files (rapid delivery data). The most important change with respect to the previous data release is a more stringent quality filtering. A total of 17 sites have provided measurement data newer than 3 months.
Except at low Sun elevation, the footprint of these direct Sun measurements is much smaller than a TROPOMI pixel. Therefore, as is the case with MAX-DOAS, a significant horizontal smoothing difference error can be expected in the TROPOMI-Pandora comparison, especially in the case of tropospheric NO2 gradients and when tropospheric NO2 is the largest contributor to the total column. Three Pandora instruments (Altzomoni, Izaña, Mauna Loa) are located near the summit of a volcanic peak and are therefore not sensitive to the lower-lying tropospheric NO2. In this work, their observations are compared to the TROPOMI stratospheric NO2 data (see Sect. 4).

NO2 cross section data
A potential source of inconsistencies between the different data products lies in the NO2 cross sections that are used. An overview of the different choices made is provided in Table 3. Most products use the cross sections published by Vandaele et al. (1998), but there are differences in the choice of temperature at which to take the cross sections. The ZSL-DOAS measurements are processed with cross sections at a fixed 220 or 227 K, i.e. typical stratospheric temperatures. MAX-DOAS data are processed either with cross sections at room temperature (298 K, representing a typical tropospheric temperature) or using an orthogonalized set of cross sections at 298 and 220 K when both tropospheric and stratospheric slant columns are retrieved. As the scientific focus of the PGN up until processor version 1.7 (used for this study) was on measuring polluted conditions, i.e. in the presence of moderate to large tropospheric columns, the cross sections used in the processor are scaled to a fixed effective temperature of 254.4 K, which corresponds to the situation of approximately equal column amounts in the troposphere and stratosphere. The S5P retrievals use cross sections at 220 K but with an explicit correction for the temperature dependence of the NO2 cross sections in the AMF: space-time co-located daily ECMWF temperature profile forecasts are used to compute a height-dependent AMF correction factor. The temperature sensitivity parameterized in this correction is approximately 0.32 % K−1 (Zara et al., 2017). A posteriori temperature correction of the ground-based data is beyond the scope of this paper, so it must be kept in mind that this may contribute to differences between S5P and ground-based columns. Specifically, we could expect a small seasonal cycle in the stratospheric column comparisons of a few percent due to the seasonal variation in stratospheric temperature not being accounted for in the ZSL-DOAS data processing.
PGN columns may either be overestimated by up to 10 % when the column is mostly stratospheric or underestimated by a similar order of magnitude when large tropospheric amounts are present. The MAX-DOAS data may be biased in either direction by a few percent when tropospheric and/or stratospheric temperatures differ strongly from the 298 and 220 K default temperatures.
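The order of magnitude of these temperature effects can be checked with a back-of-the-envelope calculation. The sketch below assumes that the 0.32 % K−1 sensitivity quoted for the S5P AMF correction can be applied linearly to the difference between the cross-section temperature used and the true effective temperature; this linear transfer to the ground-based data is an illustrative assumption, and the function name is hypothetical.

```python
# Fractional column change per kelvin, from the 0.32 % K-1 quoted in the text.
TEMPERATURE_SENSITIVITY = 0.0032

def relative_column_change(t_used, t_effective,
                           sensitivity=TEMPERATURE_SENSITIVITY):
    """Approximate fractional column error from a cross-section temperature mismatch.

    Assumes a linear dependence on the temperature difference (in kelvin).
    """
    return sensitivity * (t_used - t_effective)

# PGN effective temperature (254.4 K) applied to a purely stratospheric
# column at about 220 K: roughly an 11 % overestimate.
pgn_stratospheric_error = relative_column_change(254.4, 220.0)
```

The result, about 0.11, is consistent in magnitude with the overestimate of up to 10 % stated above for mostly stratospheric columns.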
3 Mutual coherence between TROPOMI NRTI and OFFL

As described in Sect. 2.1, the main difference between the NRTI and OFFL data processors lies in the use of either 1 d or 0-12 h forecast ECMWF meteorological data as input, which impacts the TM5-MP vertical NO2 profiles. The mutual consistency between the NRTI and OFFL data products is monitored routinely using data and tools provided by the S5P MPC Level-2 Quality Control Portal (http://mpc-l2.tropomi.eu, last access: 5 January 2021). Figure 2 shows that, looking at global means of the NO2 total column, the NRTI and OFFL data look very much alike, with NRTI column values on average 0.79 % larger than those obtained in OFFL. Eight NRTI and six OFFL processor versions are used in this comparison (as identified in Table 1). The activation of the successive processor versions and the switch to the smaller ground pixel size (on 6 August 2019) are marked by the yellow vertical lines. As expected, both NRTI and OFFL channels show NO2 maxima in the winter/summer seasons (December, June) and minima near the equinoxes. The scatter also exhibits a seasonal cycle, with the largest values observed in the Northern Hemisphere winter season.
To further assess similarities and differences between the NRTI and OFFL processing channels, NO2 values along individual orbits are also compared directly. An illustration is given in Fig. 3 for S5P orbit no. 07407, a randomly selected orbit crossing western Europe on a relatively cloud-free day (19 March 2019). Data were filtered to include only those pixels with a qa_value larger than 0.5 and were gridded to 1° × 1° before calculating the differences. The three maps of Fig. 3 show the difference between NRTI and OFFL values for the total, stratospheric, and tropospheric NO2 column, respectively, together with the corresponding Pearson correlation coefficient and root-mean-square deviation (RMSD). While the correlation coefficient is high (typically around 0.98), the maps do reveal regions where significant deviations occur, up to ±0.5 Pmolec cm−2 between the NRTI and OFFL stratospheric columns and up to ±2 Pmolec cm−2 for both the tropospheric columns and the total columns. North-east of Iceland, NRTI-OFFL differences in stratospheric and in tropospheric columns are of opposite sign, while total column differences are minimal, indicating a different stratosphere-troposphere separation after the slant column retrieval. West of Norway, total columns differ significantly between NRTI and OFFL, and these differences are allocated mostly to the tropospheric columns. These features are specific to this particular orbit and not systematic. A more detailed investigation targeted solely at regions and times of significant deviations between NRTI and OFFL would be needed to better reveal the full benefit of the OFFL analysis, but that is beyond the scope of the current paper. What needs to be underlined is that the ground-based validation studies on which the present consolidated results are based do not yield significantly different conclusions for the two processing modes.
Therefore, all results reported in this paper may be considered as applicable to the two processing channels.
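The per-orbit comparison statistics mentioned above (mean difference, Pearson correlation coefficient, and RMSD) can be sketched as follows. The example assumes the NRTI and OFFL columns have already been filtered and gridded onto the same 1° × 1° cells, with empty cells set to NaN; the function name is hypothetical.

```python
import numpy as np

def compare_grids(nrti, offl):
    """Return (mean difference, Pearson r, RMSD) for two co-gridded fields.

    Cells where either field is missing (NaN) are ignored.
    """
    nrti = np.asarray(nrti, dtype=float).ravel()
    offl = np.asarray(offl, dtype=float).ravel()
    valid = np.isfinite(nrti) & np.isfinite(offl)
    a, b = nrti[valid], offl[valid]
    diff = a - b
    pearson_r = np.corrcoef(a, b)[0, 1]
    rmsd = np.sqrt(np.mean(diff ** 2))
    return diff.mean(), pearson_r, rmsd
```

A high correlation with a non-negligible RMSD, as reported for orbit 07407, signals fields that agree in their large-scale pattern but differ locally.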

Co-location and harmonization
To reduce mismatch errors due to the significant difference in horizontal sensitivity between S5P and ZSL-DOAS measurements, individual TROPOMI NO2 stratospheric column data (in ground pixels at high horizontal sampling) are averaged over the much larger footprint of the air mass to which the ground-based zenith-sky measurement is sensitive; see Lambert et al. (1997b, 2012), Verhoelst et al. (2015), and Compernolle et al. (2020b) for details. The length of this footprint is of the order of 300-600 km in the direction of the Sun, and the width is typically of the order of 50-100 km at mid-latitudes, depending on the duration of sunrise and sunset. Note that, as the TROPOMI stratospheric column is a TM5 output, its true resolution is actually much lower than the pixel size. To account for effects of the photochemical diurnal cycle of stratospheric NO2, the ZSL-DOAS measurements at sunset are adjusted to the early-afternoon S5P overpass time using a model-based correction factor. The latter is calculated with the PSCBOX 1D stacked-box photochemical model (Errera and Fonteyn, 2001; Hendrick et al., 2004), initiated by daily fields from the SLIMCAT chemical transport model (CTM). The amplitude of the adjustment factor is sensitive to the effective solar zenith angle (SZA) assigned to the ZSL-DOAS measurements. It is assumed here to be 89.5° or, during polar day and close to polar night, the largest or smallest SZA reached, respectively.

Figure 4. (a) Time series of S5P TROPOMI NO2 stratospheric column data co-located with ground-based SAOZ sunset measurements performed by CNRS/LATMOS at the NDACC mid-latitude station of Observatoire de Haute-Provence (France). The latter were adjusted for the photochemical difference between the S5P and twilight solar local times, while S5P data were averaged over the ground-based twilight air mass. Solid lines represent 2-month running medians. Scatter plot (b) and histogram of the differences (c) with several statistical measures of the agreement between data.
This photochemical correction factor is an average based on 10 years of box-model simulations, and the range of values over these 10 years can be considered an uncertainty estimate. It varies between 1 % and 6 % at the sites considered here, the uncertainty being largest at high latitudes in local winter. This estimate does not, however, include any model uncertainty (in the sense of the accuracy of the model in representing the true photochemical variation during the day). Another way to estimate the uncertainty in the adjusted ZSL-DOAS data is by comparing the agreement between sunrise and sunset measurements when both are photochemically adjusted to the S5P overpass time. This estimate also contains co-location mismatch uncertainty due to transport of air occurring during the period between sunrise and sunset and due to the different air masses that are probed (east or west of the instrument, respectively). Moreover, it also contains that part of the measurement uncertainty that is not systematic on a daily (or longer) timescale. We find that sunrise and sunset measurements typically agree within 6 % (standard deviation of the differences). Overall, the 10 %-14 % total uncertainty estimate already presented in Sect. 2.2 thus seems realistic. Figure 4 illustrates the comparison between TROPOMI and ground-based ZSL-DOAS SAOZ NO2 data at the NDACC station at Observatoire de Haute-Provence (OHP) in southern France. The time series reveal a small negative median difference for TROPOMI, which is found to be a common feature across the network, but little seasonal structure. The correlation coefficient is excellent, and the histogram of the differences has an almost Gaussian shape. Comparison results for the entire ZSL-DOAS network are presented in Fig. 5. This figure reveals occasionally larger differences in more difficult co-location conditions (e.g. enhanced variability at the border of the polar vortex) but no impact of the TROPOMI pixel size change on 6 August 2019.
The latter result must be interpreted with care as, for these comparisons, multiple TROPOMI pixels are averaged over the ZSL-DOAS observation operator before comparison (see Sect. 4.1), and as such any change in the noise statistics of individual pixels will be hidden.
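The two harmonization steps described above (footprint averaging of the satellite pixels and photochemical adjustment of the twilight column) can be sketched as follows. This is a minimal illustration and not the operational validation code: the function names, the boolean footprint mask, and the numerical values are hypothetical, and the correction factor is assumed here to be defined as the ratio of the column at overpass time to the column at twilight.

```python
from statistics import mean


def footprint_mean(pixel_columns, in_footprint):
    """Average the satellite pixel columns flagged as lying inside the
    ground-based air-mass footprint (mask supplied by the caller)."""
    selected = [c for c, inside in zip(pixel_columns, in_footprint) if inside]
    if not selected:
        raise ValueError("no satellite pixel falls inside the footprint")
    return mean(selected)


def adjust_to_overpass(twilight_column, factor):
    """Scale a twilight column to the satellite overpass time.

    Assumption: factor = column(overpass) / column(twilight), as would be
    provided by a photochemical box model.
    """
    return twilight_column * factor


# Illustrative values only (Pmolec cm^-2); not taken from the paper.
pixels = [2.9, 3.1, 3.0, 4.2]
inside = [True, True, True, False]
s5p = footprint_mean(pixels, inside)
ground = adjust_to_overpass(4.0, 0.8)
difference = s5p - ground
```

Averaging before differencing is also the reason why a change in single-pixel noise, such as that expected from the pixel size change, would be hard to detect in these comparisons.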

Comparison results
Statistical estimators of the bias (median difference) and scatter per station are presented in box-and-whisker plots in Fig. 6 and in tabular form in Sect. A1. Across the network, S5P NRTI and OFFL stratospheric NO 2 column data are generally lower than the ground-based values by approximately 0.2 Pmolec cm −2 , with a station-to-station scatter of this median difference of similar magnitude (0.3 Pmolec cm −2 ). These numbers are within the mission requirement of a maximum bias of 10 % (equivalent to 0.2-0.4 Pmolec cm −2 , depending on latitude and season) and within the combined systematic uncertainty of the reference data and their model-based photochemical adjustment. The IP68/2 dispersion of the difference between TROPOMI stratospheric column and correlative data around their median value rarely exceeds 0.3 Pmolec cm −2 at sites without tropospheric pollution. When combining random errors in the satellite and reference measurements with irreducible co-location mismatch effects, it can be concluded that the random uncertainty on the S5P stratospheric column measurements falls within the mission requirements of max. 0.5 Pmolec cm −2 uncertainty.
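The two estimators used throughout this section can be illustrated with a short sketch. It assumes that IP68/2 denotes half the interpercentile range between the 16th and 84th percentiles of the differences (the exact percentile bounds are not stated in the text), and the function name is hypothetical.

```python
from statistics import median, quantiles


def bias_and_dispersion(satellite, ground):
    """Return (median difference, IP68/2) for paired column values.

    Assumption: IP68 is taken as the 16th-to-84th interpercentile range,
    with linear interpolation between data points.
    """
    diffs = [s - g for s, g in zip(satellite, ground)]
    cuts = quantiles(diffs, n=100, method="inclusive")
    p16, p84 = cuts[15], cuts[83]
    return median(diffs), (p84 - p16) / 2.0


# Synthetic example: differences run from -50 to +50 in steps of 1.
sat = [float(i) for i in range(101)]
gnd = [50.0] * 101
bias, disp = bias_and_dispersion(sat, gnd)
```

Both estimators are robust against outliers, which makes them better suited than the mean and standard deviation when a few difficult co-locations dominate the tails.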
The potential dependence of the TROPOMI stratospheric column bias and uncertainty on several influence quantities has been evaluated. Figure 7 shows results for the solar zenith angle (SZA), the fractional cloud cover (CF), and the surface albedo of the TROPOMI measurement. This evaluation does not reveal any variation of the bias much larger than 0.4 Pmolec cm −2 over the range of these influence quantities.
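A dependence analysis of this kind amounts to binning the differences by the influence quantity and computing a robust statistic per bin. The sketch below is only an illustration with hypothetical names and synthetic values; the bin width of 10 used in the example is chosen for illustration.

```python
from collections import defaultdict
from statistics import median


def binned_median(x, diffs, width):
    """Median difference per bin of an influence quantity.

    Bins are [k*width, (k+1)*width); the returned dict maps the lower
    bin edge to the median of the differences falling in that bin.
    """
    bins = defaultdict(list)
    for xi, di in zip(x, diffs):
        bins[int(xi // width)].append(di)
    return {k * width: median(v) for k, v in sorted(bins.items())}


# Synthetic solar zenith angles (degrees) and differences (Pmolec cm^-2).
sza = [22.0, 27.5, 34.0, 38.0, 61.0]
diffs = [-0.1, -0.3, -0.2, -0.4, 0.1]
per_bin = binned_median(sza, diffs, 10)
```

A trend in the per-bin medians that exceeds the per-bin spread would indicate a dependence of the bias on the quantity under study.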

PGN measurements at high-altitude stations
Three of the PGN direct Sun instruments (see Sect. 6) are located near the summit of a volcanic peak: Altzomoni (3985 m a.m.s.l.) in the State of Mexico, Izaña (2360 m a.m.s.l.) on Mount Teide on the island of Tenerife, and Mauna Loa (4169 m a.m.s.l.) on the island of Hawaii. At these high-altitude sites, the total column measured by the ground-based direct Sun instrument misses most of the tropospheric (potentially polluted) part and as such becomes representative of the TROPOMI stratospheric column. These sites have therefore been added to Fig. 6, illustrating that these comparisons based on direct Sun data yield similar conclusions as those based on zenith-sky data, that is, a minor negative median difference of the order of −0.2 Pmolec cm −2 . It must be noted that, as discussed in Sect. 2.5, the PGN data are processed using cross sections at a temperature of 254.4 K, representative of a total column made of equal amounts of NO 2 in the stratosphere and troposphere. This leads to columns which are about 10 % larger than if they had been processed with cross sections for 220 K. Future processing of the PGN data will address this, and it is expected that this will mostly remove the apparent negative bias for TROPOMI (but lead to a slight inconsistency with the ZSL-DOAS results).

Figure 6. Box-and-whisker plots summarizing from pole to pole the bias and spread of the difference between S5P TROPOMI NRTI and NDACC ZSL-DOAS NO 2 stratospheric columns (SAOZ data in black, other ZSL-DOAS in blue, and PGN in red). The median difference is represented by a vertical solid line inside the box, which marks the 25 % and 75 % quantiles. The whiskers cover the 9 %-91 % range of the differences. The shaded area represents the mission requirement of 0.5 Pmolec cm −2 for the uncertainty. Values between brackets in the labels denote the latitude of the station.
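The expected effect of the cross-section temperature on the mountaintop comparisons can be illustrated with simple arithmetic. The column values below are hypothetical and chosen only so that the initial difference is −0.2 Pmolec cm −2 ; the 10 % figure is the approximate value quoted above.

```python
def rescale_for_temperature(column_254k, excess_fraction=0.10):
    """Column expected with 220 K cross sections, given a column that was
    processed at 254.4 K and is therefore larger by excess_fraction."""
    return column_254k / (1.0 + excess_fraction)


pgn_254k = 3.0  # hypothetical PGN column (Pmolec cm^-2)
tropomi = 2.8   # hypothetical TROPOMI stratospheric column (Pmolec cm^-2)

pgn_220k = rescale_for_temperature(pgn_254k)
diff_before = tropomi - pgn_254k  # about -0.2
diff_after = tropomi - pgn_220k   # small and slightly positive
```

With these illustrative numbers the negative difference largely disappears, in line with the expectation stated above, while the ZSL-DOAS comparisons would be unaffected by such a reprocessing.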

Co-location and harmonization
TROPOMI data are filtered following the qa_value > 0.75 rule as recommended in the associated PRF (see Sect. 2.1). Then for each day, the pixel over the site is selected. MAX-DOAS data series are temporally interpolated at the TROPOMI overpass time (only if data within ±1 h exist), and daily comparisons are performed. This short temporal window avoids the need for a photochemical cycle adjustment. Details on the comparison approach are described in Pinardi et al. (2020) for the validation of OMI and GOME-2 NO 2 column data and in Compernolle et al. (2020b) for the validation of the OMI QA4ECV NO 2 Climate Data Record.

Figure 7. Dependence of the difference between TROPOMI NRTI and ground-based ZSL-DOAS stratospheric NO 2 column data on the satellite solar zenith angle (SZA), satellite cloud fraction, and satellite surface albedo, including a median and IP68/2 spread per bin (bin widths of 10° in SZA, 0.05 in CF, and 0.1 in surface albedo). Different colours represent different stations, to illustrate the (modest) impact of station-station network inhomogeneity on these analyses.
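The filtering and temporal interpolation of the tropospheric column co-location can be sketched as below. The data layout (dictionaries with a qa_value key, times in decimal hours) and the function names are hypothetical. The sketch further assumes that the ±1 h criterion requires a ground-based measurement on each side of the overpass, which is one possible reading of the rule.

```python
def select_pixels(pixels, qa_threshold=0.75):
    """Keep only the satellite pixels with qa_value above the threshold."""
    return [p for p in pixels if p["qa_value"] > qa_threshold]


def interpolate_at(times, values, t_overpass, max_gap=1.0):
    """Linearly interpolate ground-based values at the overpass time.

    Returns None unless measurements exist within max_gap hours both
    before (or at) and after the overpass (assumed interpretation).
    """
    before = [(t, v) for t, v in zip(times, values)
              if t_overpass - max_gap <= t <= t_overpass]
    after = [(t, v) for t, v in zip(times, values)
             if t_overpass < t <= t_overpass + max_gap]
    if not before or not after:
        return None
    t0, v0 = max(before)  # closest measurement before the overpass
    t1, v1 = min(after)   # closest measurement after the overpass
    return v0 + (v1 - v0) * (t_overpass - t0) / (t1 - t0)


kept = select_pixels([{"qa_value": 0.9}, {"qa_value": 0.75}, {"qa_value": 0.5}])
value = interpolate_at([12.5, 13.0, 14.0], [6.0, 8.0, 10.0], 13.5)
missing = interpolate_at([10.0, 16.0], [6.0, 8.0], 13.5)
```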

Comparison results
An illustration of the daily comparisons between TROPOMI and ground-based MAX-DOAS measurements between May 2018 and the end of January 2020 is presented in Fig. 8 for the Uccle station (Brussels, Belgium, with moderate pollution levels). The two datasets have a correlation coefficient of 0.75 and a regression slope and intercept of 0.47 and 1.0 Pmolec cm −2 respectively. The (median and mean) difference of about −2.3 to −3.1 Pmolec cm −2 corresponds to a median relative difference of about −30 %.
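For reference, the correlation coefficient, slope, and intercept quoted for such scatter plots can be computed as in the following sketch. It assumes an ordinary least-squares fit of the satellite values against the ground-based values and the Pearson correlation coefficient; the regression method actually used for Fig. 8 is not specified in this section, and the data below are synthetic.

```python
from statistics import mean


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Synthetic data obeying satellite = 0.5 * ground + 1 exactly.
ground = [2.0, 4.0, 6.0, 8.0]
satellite = [2.0, 3.0, 4.0, 5.0]
slope, intercept, r = linear_fit(ground, satellite)
```

A slope well below 1 combined with a positive intercept, as reported for Uccle, means that the underestimation grows with the column amount.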
Results for the entire MAX-DOAS network are presented in Fig. 9. This figure reveals mostly (but not only) negative differences, with a fairly significant variability but no clear seasonal features. No impact of the TROPOMI ground pixel size change on 6 August 2019 is observed.
Box-and-whisker plots for the whole network are shown in Fig. 10, with corresponding numeric values listed in Sect. A2. Based on measurements from these 19 MAX-DOAS stations, three different regimes can be identified:
i. Small tropospheric NO 2 column values (median values below 2 Pmolec cm −2 ), e.g. at the Fukue and Phimai stations, lead to small differences. Typically, these stations show a small median bias (< 0.5 Pmolec cm −2 ), but this can still correspond to up to a −27 % relative bias. The dispersion (IP68/2) of the difference is smaller than 1 Pmolec cm −2 .
iii. Extremely polluted sites report larger differences. This is the case, for example, at the Mexican UNAM sites (UNAM and Vallejo in/close to Mexico City and Cuautitlan in a more remote part of the State of Mexico), with median tropospheric columns larger than 15 Pmolec cm −2 . These stations experience larger differences (> 10 Pmolec cm −2 , i.e. from −37 % to −74 %). The dispersion (IP68/2) of the difference is also quite large, between 4 and ∼ 12 Pmolec cm −2 . Results at these sites need deeper analysis.
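The regime classification can be expressed as a simple threshold rule on the median ground-based tropospheric column of a station. The thresholds of 2 and 15 Pmolec cm −2 are those quoted for the first and third regimes; treating every station in between as belonging to the intermediate regime is an assumption of this sketch, as are the function name and labels.

```python
def regime(median_ground_column):
    """Classify a station by its median tropospheric column (Pmolec cm^-2).

    Assumption: the intermediate regime covers everything between the
    clean (< 2) and extreme (> 15) thresholds.
    """
    if median_ground_column < 2.0:
        return "clean"
    if median_ground_column > 15.0:
        return "extreme"
    return "polluted"


labels = [regime(v) for v in (1.2, 8.0, 20.0)]
```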
The overall bias (median of all station median differences) is −2.4 Pmolec cm −2 , i.e. −37 %. The median dispersion is 3.5 Pmolec cm −2 , while the site-site dispersion (IP68/2 over all site medians) is 2.8 Pmolec cm −2 . Note that these network-averaged numbers are close to the numbers found for the polluted (Athens to Gucheng) sites. These results are within the mission requirement of a maximum bias of 50 %, but they exceed the uncertainty requirement of at most 0.7 Pmolec cm −2 , which is only satisfied for the ensemble of clean sites. A discussion on the causes of these biases and of the sometimes large spread of the comparisons is provided in Sect. 7.
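The three network-level numbers defined above (median of station medians, median of station dispersions, and IP68/2 over the station medians) can be computed as in this sketch. The function names and the synthetic station data are hypothetical, and IP68 is again assumed to span the 16th to 84th percentiles.

```python
from statistics import median, quantiles


def half_ip68(values):
    """Half of the 16th-to-84th interpercentile range (assumed IP68/2)."""
    cuts = quantiles(values, n=100, method="inclusive")
    return (cuts[83] - cuts[15]) / 2.0


def network_summary(station_diffs):
    """station_diffs maps a station name to its list of
    satellite-minus-ground differences."""
    medians = [median(d) for d in station_diffs.values()]
    spreads = [half_ip68(d) for d in station_diffs.values()]
    return {
        "overall_bias": median(medians),
        "median_dispersion": median(spreads),
        "site_site_dispersion": half_ip68(medians),
    }


summary = network_summary({
    "A": [-3.0, -2.0, -1.0],
    "B": [-1.0, 0.0, 1.0],
    "C": [-5.0, -4.0, -3.0],
})
```

Taking medians over stations rather than pooling all co-locations prevents stations with long data records from dominating the network statistics.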
Two key influence quantities for observations of tropospheric NO 2 are aerosol optical depth (AOD) and cloud (radiance) fraction (CRF). The dependence of the differences between MAX-DOAS and TROPOMI tropospheric columns on these two influence quantities is visualized in Fig. 11. AOD is only retrieved in the processing of a handful of MAX-DOAS instruments, the others using climatological information, hence the limited subset of stations in panel (a) of this figure. No clear dependence of the bias on either property is seen, though in view of the relatively large scatter in these tropospheric column comparisons, this does not preclude more subtle dependencies. The impact of aerosol peak height would also be interesting to assess, but this is impossible to judge within the scope of the current paper as no such information is readily available.
6 Total column validation

6.1 Filtering, co-location, and harmonization
As was done for the tropospheric column validation in Sect. 5, only S5P pixels with a qa_value of at least 0.75 are retained. The so-called summed product is used, i.e. the total column computed as the stratospheric plus the tropospheric column values. This summed column differs from the total column product. Only Pandonia measurements with the highest quality label (0 and 10) are used. The average column value within a 1 h time interval, centred on the S5P overpass time, is used. As the NO/NO 2 ratio varies only slowly around the afternoon solar local time of the TROPOMI overpass, this small temporal window ensures no model-based adjustment is required. A 30 min time interval was tested as well, but this did not change the results significantly. Moreover, only TROPOMI pixels containing the station were considered.

Figure 10. Same as Fig. 6 but now for the difference between S5P TROPOMI OFFL and MAX-DOAS NO 2 tropospheric columns and ordered as a function of the median ground-based tropospheric column (largest median VCD values on top). The line represents the median difference. Box bounds represent the 25 and 75 percentiles, while whiskers indicate the 9 and 91 percentiles. The shaded area corresponds to the mission requirement of a maximum bias of 50 %.
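The construction of the two quantities compared for the total column can be sketched as follows. The record layout (time in decimal hours, column, quality flag) and the function names are hypothetical; the quality flags 0 and 10 and the 1 h window are those given above.

```python
from statistics import mean


def summed_column(stratospheric, tropospheric):
    """Summed product: stratospheric plus tropospheric column."""
    return stratospheric + tropospheric


def pandora_overpass_mean(records, t_overpass, window_hours=1.0,
                          good_flags=(0, 10)):
    """Mean of the ground-based columns within a window centred on the
    overpass time, keeping only records with an accepted quality flag.

    records: iterable of (time_hours, column, quality_flag) tuples.
    Returns None when no record qualifies.
    """
    half = window_hours / 2.0
    selected = [c for t, c, q in records
                if q in good_flags and abs(t - t_overpass) <= half]
    return mean(selected) if selected else None


records = [(13.0, 8.0, 0), (13.4, 9.0, 10), (13.6, 10.0, 1), (15.0, 20.0, 0)]
ground = pandora_overpass_mean(records, 13.3)
satellite = summed_column(3.0, 5.0)
```

Setting window_hours to 0.5 reproduces the narrower 30 min interval mentioned above as a sensitivity test.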

Comparison results
An example of a time series of co-located TROPOMI and PGN total column measurements, and their difference, is shown in Fig. 12.
Results for the entire PGN network are presented in Fig. 13. This figure reveals that the difference, even in relative units, depends strongly on the total NO 2 column, with low (or slightly positive) biases at low columns and markedly negative biases at high columns. No impact is observed for the TROPOMI ground pixel size switch of 6 August 2019.
Statistical estimators of the comparison results across the network are visualized in Fig. 14 and presented in tabular form in Table A3. One can distinguish roughly two different regimes.
The median relative difference is mostly within (or bordering) the ±10 % range for the sites with lower NO 2 median total column values (Alice Springs to New Brunswick; Canberra is an exception), while it is negative and mostly outside this range, but still within ±50 %, for the sites with higher NO 2 median total column value (Buenos Aires to UNAM).
It is however more useful to make the distinction between sites with low NO 2 (Alice Springs to New Brunswick) and high NO 2 (Buenos Aires to UNAM). For the low NO 2 sites, the overall bias is 0.1 Pmolec cm −2 (2 %), the overall dispersion is 1.1 Pmolec cm −2 , and the site-site dispersion is 0.2 Pmolec cm −2 . For the high NO 2 sites, the overall bias is −3.6 Pmolec cm −2 (−32 %), the overall dispersion is 3.3 Pmolec cm −2 , and the site-site dispersion is 1.4 Pmolec cm −2 .

Discussion and conclusions
A cross-network summary of the median difference and dispersion for the three S5P NO 2 (sub)column data is attempted in Table 4. While the difference between the NRTI and OFFL NO 2 values can reach up to a few Pmolec cm −2 for individual TROPOMI pixels, the two processing channels do not lead to significantly different validation results, and Table 4 therefore makes no distinction between the two.
Figure 11. Dependence of the difference between TROPOMI OFFL and ground-based MAX-DOAS tropospheric NO 2 column data on (a) the MAX-DOAS-retrieved aerosol optical depth (AOD; only available for a subset of the instruments) and (b) the satellite cloud radiance fraction (CRF).

Table 4. Cross-network summary of the validation results: bias (median) and dispersion (IP68/2) of the difference with respect to the ground-based correlative measurements (median value over the stations).

                     Bias                          Dispersion
Stratosphere         −0.2 Pmolec cm −2 ; −9 %      0.3 Pmolec cm −2
Troposphere
  - low NO 2         −0.3 Pmolec cm −2 ; −23 %     0.7 Pmolec cm −2
  - high NO 2        −2 Pmolec cm −2 ; −37 %       3.4 Pmolec cm −2
  - extreme NO 2     −12 Pmolec cm −2 ; −51 %      7 Pmolec cm −2
Total column
  - low NO 2         0.1 Pmolec cm −2 ; 2 %        1 Pmolec cm −2
  - high NO 2        −3.6 Pmolec cm −2 ; −30 %     3 Pmolec cm −2

Atmos. Meas. Tech., 14, 481-510, 2021 https://doi.org/10.5194/amt-14-481-2021

Figure 12. Same as Figs. 4 and 8 but now for the S5P OFFL total NO 2 column data co-located with ground-based Pandora measurements obtained at the PGN mid-latitude station of Boulder, Colorado.

For the stratospheric column, the general picture is a slight negative median difference of TROPOMI with respect to the NDACC ZSL-DOAS network, of the order of −0.2 Pmolec cm −2 on average, with some station-station inhomogeneities and with larger differences in the highly variable conditions of the denoxified polar stratosphere in local winter. This median difference remains within the S5P mission requirements and is similar to the conclusions derived for similar satellite data from other sounders (e.g. Compernolle et al., 2020b). In view of the sources of systematic uncertainties in the different components of the comparison (satellite data, reference data, photochemical cycle adjustment, irreducible mismatch errors), this result is entirely within expectations. While comparisons to mountaintop PGN instruments confirm these values, using cross sections at a more appropriate (lower) temperature in the PGN data processing would lead to somewhat smaller columns and therefore a less significant negative median difference than that observed with respect to the ZSL-DOAS instruments. This probably reflects the true accuracy of the ground-based data, which should thus be taken to be of the order of ±10 % at best.

For the tropospheric and total columns, averaging results over the networks with the hope of obtaining a meaningful global estimate is of limited use as the results depend strongly on the amount of tropospheric NO 2 . Overall, mission requirements in terms of bias are mostly met, the only exception being the tropospheric columns at extremely polluted sites, which have a bias on the threshold of 50 %. Nevertheless, it is clear that large negative median differences are observed across all sites experiencing significant tropospheric pollution. The dispersion of the difference is well outside of the mission requirements formulated for the tropospheric column data. Nevertheless, these results are consistent with those obtained with completely different validation techniques, such as those explored by Lorente et al. (2019) over Paris (using ground-based and Eiffel Tower NO 2 concentrations and a climatology of observed column-surface ratios). Many factors play a role in this apparent disagreement between TROPOMI and the ground-based networks, which can be attributed neither solely to the S5P data nor to pure area-averaging differences.
First, local horizontal and vertical variations of the NO 2 field can explain (part of) such discrepancies, as illustrated in Chen et al. (2009), Compernolle et al. (2020b), and Dimitropoulou et al. (2020). While the MAX-DOAS picks up small local enhancements, the much larger satellite pixel provides a smoothed perception of the field. In particular for sounders with footprints (much) larger than the emission sources, this generally leads to underestimation in urban conditions while having better agreement in remote locations (Kanaya et al., 2014; Pinardi et al., 2020). Dimitropoulou et al. (2020) showed specific improvements of the S5P NO 2 comparison results in the case of the Uccle MAX-DOAS when making use of the multiple azimuthal scan mode and when improving the S5P selection criteria to pixels along the MAX-DOAS field-of-view direction and within the effective sensitivity length. Large inhomogeneities around MAX-DOAS sites were also shown by Wang et al. (2014). When taking some of these inhomogeneities into account in validation of other sounders, results have been improved. Judd et al. (2019) also showed the smoothing of the NO 2 field when resampling GeoTASO high-resolution airborne measurements to different simulated satellite pixel sizes.

Second, vertical sensitivity (and thus averaging kernels) and a priori vertical profiles are known to be different for MAX-DOAS and nadir UV-visible satellite retrievals (Liu et al., 2019b; Compernolle et al., 2020b), with MAX-DOAS measurements sensitive to layers close to the surface and satellite retrievals sensitive mostly to the free troposphere. The effect of the a priori vertical profile on the comparison was estimated for TROPOMI by Dimitropoulou et al. (2020) for Uccle, showing an increase by about 55 % when recalculating the TROPOMI column with the MAX-DOAS daily mean tropospheric profile. Similarly, Ialongo et al. (2020) and Zhao et al. (2020) show improvement of the agreement between TROPOMI and Pandora total column data for episodes of NO 2 enhancement, when replacing the coarse a priori NO 2 profiles with high-resolution profiles from a high-resolution regional air quality forecast model. Somewhat related to the vertical sensitivity is the treatment of aerosol optical depth and its vertical profile. Poor representation of the aerosol opacity has been shown (from simulations) to cause both underestimated NO 2 in satellite retrievals and overestimated NO 2 in MAX-DOAS measurements (Leitão et al., 2010; Ma et al., 2013; Jin et al., 2016). Satellite-ground discrepancies in previous validation studies have already been attributed to such aerosol issues (Compernolle et al., 2020b). Moreover, explicit aerosol corrections in the S5P retrievals have already been shown to improve the agreement.
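The recalculation of a satellite column with an alternative a priori profile, as discussed above, is commonly done through the tropospheric averaging kernel. The sketch below assumes the usual relation in which the column scales with the ratio of the unweighted to the kernel-weighted sum of the new partial columns; this section does not spell out the formula, so the relation, the function name, and all numerical values are assumptions for illustration only.

```python
def recompute_column(column, trop_kernel, new_partial_columns):
    """Rescale a tropospheric column for an alternative a priori profile.

    Assumed relation: N' = N * sum(x_l) / sum(A_l * x_l), where A_l is
    the tropospheric averaging kernel and x_l are the partial columns
    of the new profile on the same layers.
    """
    total = sum(new_partial_columns)
    weighted = sum(a * x for a, x in zip(trop_kernel, new_partial_columns))
    return column * total / weighted


# Hypothetical three-layer case, surface layer first: the kernel is low
# near the surface, and the new profile peaks near the surface.
kernel = [0.5, 1.0, 1.5]
profile = [6.0, 3.0, 1.0]
new_column = recompute_column(6.0, kernel, profile)
```

Because the kernel is smallest where the surface-peaked profile is largest, the recalculated column increases, which is the direction of the change reported for Uccle.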
Third, the treatment of cloud properties can have a significant effect on the retrieval of the TROPOMI NO 2 tropospheric VCD.  discuss the comparison with OMI NO 2 tropospheric column retrievals and show that on average TROPOMI is lower than OMI by −10 % to −12 % over Europe, North America, and India and up to −22 % over China. This difference is mainly attributed to the different cloud data product used in the NO 2 retrieval: FRESCO-S derives the cloud top pressure from TROPOMI radiances in the near-infrared O 2 −A band, while for OMI the cloud top pressure is retrieved from the O 2 −O 2 band in the UV-visible. Preliminary validation results (Compernolle et al., 2020a, and Henk Eskes, private communication, 2020) indicate that FRESCO-S is biased high in pressure, especially at altitudes close to the surface. A new version of FRESCO-S with an adapted wavelength window has been implemented and seems to remove most of the 10 %-22 % bias with OMI in polluted regions.
Fourth, although this work, Compernolle et al. (2020b), and Pinardi et al. (2020) all show a generally good coherence of the validation results among the MAX-DOAS instruments across the network and also among MAX-DOAS and Pandora instruments, network homogenization remains an important challenge to focus on to improve the accuracy of future satellite validations (see Sect. 5 for a description of contributors to network inhomogeneity). Intercomparison campaigns, such as the CINDI-1 and CINDI-2 (Piters et al., 2012; Kreher et al., 2020), in-depth intercomparison studies of the retrieval methods (Frieß et al., 2019; Tirpitz et al., 2020; Peters et al., 2019), and dedicated projects aiming at the harmonization of the processing and of the associated metadata (such as the FRM4DOAS project of ESA's Fiducial Reference Measurements programme) are an important way to achieve this.

Figure 14. Same as Figs. 6 and 10 but now for the difference between S5P TROPOMI (RPRO+OFFL) and PGN NO 2 total columns. Stations are ordered by ground-based total NO 2 median value, like in Fig. 10. The median difference is represented by a vertical solid line inside the box, which marks the 25 % and 75 % quantiles. The whiskers cover the 9 %-91 % range of the differences. The three mountaintop PGN instruments used for the validation of the stratospheric columns are not included here but in Fig. 6.
Regarding the mutual consistency of MAX-DOAS- and PGN-based validation results, while it may appear that, at low column values, PGN-based comparisons indicate a smaller bias than the MAX-DOAS comparisons, one must not forget that PGN measures the total column: at stations with a lower total column value, the stratospheric contribution is relatively more important. The better agreement here is therefore consistent with the good agreement found for the TROPOMI stratospheric NO 2 column vs. ZSL-DOAS and also vs. PGN at pristine mountain sites (Sect. 4). For sites characterized by a higher total NO 2 column, the tropospheric contribution becomes more important, and some of the same effects that make satellite-MAX-DOAS comparisons difficult, such as the smoothing difference error, the lower sensitivity of the satellite close to the surface, and the approximate S5P a priori profile, come into play as well.
In conclusion, the first 2 years of Copernicus S5P TROPOMI NO 2 column data produced both with the NRTI and OFFL versions 01.0x.xx of the operational processors do meet mission requirements for the bias and, to some extent, with precaution for the uncertainty (dispersion). The different data products available publicly through the Copernicus system are mutually consistent, are in good geophysical and quantitative agreement with ground-based correlative data of documented quality, and can be used for a variety of applications, on the condition that the features and limitations exposed here are taken into proper consideration and that the S5P data are filtered and used according to the recommendations provided in the official Product Readme File (PRF) and associated documentation, also available publicly. Ground-based validation activities relying on the correlative measurements contributed by the NDACC ZSL-DOAS, MAX-DOAS, and PGN global monitoring networks have progressed significantly in recent years and have demonstrated their capacity but also their current limitations in an operational context such as the Copernicus programme. Room does exist for further improvement of both the satellite and ground-based datasets, as well as the intercomparison methodology and its associated error budget. Beyond the methodology advances published here and in aforementioned papers, special effort is needed to understand fully and ever reduce comparison mismatch errors, which so far make the accurate validation of S5P data uncertainty bars difficult. Several updates of the calibration of TROPOMI spectra and of the TROPOMI NO 2 data retrieval processors are already in development and in implementation. Upcoming data versions should be validated with the same system as used in the current paper, allowing the necessary independent assessment of the S5P data product evolution.
Appendix A: Ground networks

A1 The NDACC ZSL-DOAS network
Table A1. ZSL-DOAS hosting stations, ordered by latitude, that contribute to the stratospheric NO 2 column validation. Several measures of the agreement between TROPOMI and the ground-based data are also provided. The bias over all stations (median over all station median differences) is −0.23 Pmolec cm −2 , while the overall dispersion (median over all 1/2IP68) is 0.31 Pmolec cm −2 , and the inter-station dispersion (1/2IP68 over all station medians) is 0.30 Pmolec cm −2 .

Station  Lat

Drosoglou et al. (2017). Several measures of the agreement between TROPOMI and the ground-based data are also provided. Biases and comparison spreads vary strongly between stations, mainly as a function of the nature of the site (clean or polluted). When calculating these numbers for the three regimes (clean, polluted, extreme), the median biases are −0.3, −2, and −12 Pmolec cm −2 (−23 %, −37 %, and −51 %) respectively, with median dispersions of 0.7, 3.4, and 7 Pmolec cm −2 . Note that the median values for the high tropospheric columns (Athens to Xianghe) are almost the same as the statistics found for the whole network. The site-site bias dispersion is 0.2, 1.2, and 3.3 Pmolec cm −2 for each regime.

Considering the low NO 2 stations (Alice Springs to New Brunswick) only, the bias is 0.1 Pmolec cm −2 (2 %), the overall dispersion is 1.1 Pmolec cm −2 , and the inter-station dispersion is 0.2 Pmolec cm −2 . For the high NO 2 stations (Buenos Aires to UNAM), the bias is −3.6 Pmolec cm −2 (−30 %), the overall dispersion is 3.3 Pmolec cm −2 , and the inter-station dispersion is 1.4 Pmolec cm −2 . Note that the mountaintop stations are not used in the calculation of these overall statistics.

Figure A1. (a) Box-and-whisker plots summarizing the TROPOMI-MAX-DOAS tropospheric VCD difference, per station, ordered as a function of the median ground-based tropospheric column (largest median VCD values on top). Panels (b, c, d) present, respectively, the assumed aerosol optical depth (AOD; either retrieved from the MAX-DOAS measurement or taken from the climatology used in the NO 2 retrieval), the MAX-DOAS absolute uncertainties, and the relative uncertainties (total median uncertainty in grey bars, random part in black and systematic part in red).
Author contributions. TV, SC, and GP carried out the global validation analysis. JCL, KUE, and MVR contributed input and advice at all stages of the analysis. AMF (EVDC), JG (Multi-TASTE), and SN (MPC VDAF-AVS) preprocessed and/or post-processed the ground-based and satellite data. HJE, KFB, PFL, and JPV developed the TROPOMI NO 2 data processor. AR, MVR, and TW contributed expertise on satellite NO 2 data retrieval. AC, FH, KK, MT, APa, JPP, and MVR supervise network operation and contributed ground-based scientific expertise. AD, LSdM, and CZ supervise the Copernicus S5P mission, the S5P MPC, and the S5PVT. All other co-authors contributed ground-based data and expertise at ground-based stations. TV, SC, GP, and JCL wrote and edited the paper. All co-authors revised and commented on the paper.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "TROPOMI on Sentinel-5 Precursor: first year in operation (AMT/ACP inter-journal SI)". It is not associated with a conference.
Acknowledgements. Part of the reported work was carried out in the framework of the Copernicus Sentinel-5 Precursor Mission Performance Centre (S5P MPC), contracted by the European Space Agency and supported by the Belgian Federal Science Policy Office (BELSPO), the Royal Belgian Institute for Space Aeronomy (BIRA-IASB), the Netherlands Space Office (NSO), and the German Aerospace Centre (DLR). Part of this work was carried out also in the framework of the S5P Validation Team (S5PVT) AO projects NIDFORVAL (ID no. 28607, PI Gaia Pinardi, BIRA-IASB) and CESAR (ID no. 28596, PI Arnoud Apituley, KNMI). The authors express special thanks to Ann Mari Fjaeraa, José Granville, Sander Niemeijer, and Olivier Rasson for post-processing of the network and satellite data and for their dedication to the S5P operational validation.
The LATMOS real-time processing facility is acknowledged for fast delivery of ZSL-DOAS SAOZ data. Fast delivery of MAX-DOAS data tailored to the S5P validation was organized through the S5PVT AO project NIDFORVAL. The authors are grateful to ESA/ESRIN for supporting the ESA Validation Data Centre (EVDC) established at NILU and for running the Fiducial Reference Measurements (FRM) programme and in particular the FRM4DOAS and Pandonia projects. The PGN is a bilateral project between NASA and ESA, and the NASA funding for the PGN is provided through the NASA Tropospheric Composition Program and Goddard Space Flight Center Pandora project.