Intercomparison of Sentinel-5P TROPOMI cloud products for tropospheric trace gas retrievals

Clouds have a strong impact on satellite measurements of tropospheric trace gases in the ultraviolet, visible, and near-infrared spectral ranges from space. Therefore, trace gas retrievals rely on information on cloud fraction, cloud albedo, and cloud height from cloud products. In this study, the cloud parameters from different cloud retrieval algorithms for Sentinel-5 Precursor (S5P) TROPOMI are compared: the OCRA a priori cloud fraction, the ROCINN CAL cloud fraction and cloud top and base height, the ROCINN CRB cloud fraction and cloud height, the FRESCO cloud fraction, the interpolated FRESCO cloud height from the TROPOMI NO 2 product, the cloud fraction from the NO 2 fitting window, the O2-O2 cloud fraction and cloud height, the MICRU cloud fraction, and the VIIRS cloud fraction. Two different versions of the TROPOMI cloud products OCRA/ROCINN


Introduction
Monitoring the global distribution of atmospheric constituents over a long period is essential to assess changes in atmospheric composition and the resulting consequences, such as pollution and climate change. One efficient way to perform such measurements is by using absorption spectrometry from satellite platforms. The first nadir-viewing instrument to measure with high spectral resolution in the ultraviolet, visible, and near-infrared (UV-Vis-NIR) spectral range, GOME, aboard ERS-2 (Burrows et al., 1999, and references therein), was launched in April 1995 to detect multiple species including ozone (O 3 ), nitrogen dioxide (NO 2 ), bromine monoxide (BrO), sulfur dioxide (SO 2 ), and formaldehyde (HCHO). During 2002-2012, SCIAMACHY on board ENVISAT (e.g. Burrows et al., 1995;Bovensmann et al., 1999) enabled the observation of additional species such as carbon monoxide (CO), carbon dioxide (CO 2 ), and methane (CH 4 ), through the integrated short-wave infrared (SWIR) channels. The Ozone Monitoring Instrument (OMI) launched in 2004 on the Aura platform (Levelt et al., 2018) and the three GOME-2 instruments (Munro et al., 2016) launched in 2006 aboard Metop-A, in 2012 on Metop-B, and in 2018 on Metop-C, expand the atmospheric composition data set. The ESA Sentinel-5 Precursor (S5P) with the TROPOspheric Monitoring Instrument (TROPOMI) on board (Veefkind et al., 2012) was launched in October 2017 as a preparatory mission to bridge the gap between the existing satellites and the planned Sentinel-4 and Sentinel-5 missions. TROPOMI is a space-borne nadir-viewing hyperspectral imaging spectrometer covering the UV, Vis, NIR, and SWIR spectral regions. TROPOMI monitors atmospheric trace gases daily and globally with a high spatial resolution of 5.5 × 3.5 km 2 (7 × 3.5 km 2 before 6 August 2019) (Eskes et al., 2021).
Satellite measurements of atmospheric composition are affected by clouds as they shield the underlying atmosphere from the satellite's view, reducing the sensitivity to the lower atmosphere. At the same time, clouds increase the sensitivity for absorbers above or inside the cloud through their higher albedo and multiple scattering within the cloud. As a result, it is challenging to retrieve the amounts of trace gases when clouds are in the field of view of the instrument due to increased reflection of solar radiation and enhanced light paths (e.g. Martin et al., 2002;Richter and Burrows, 2002;Boersma et al., 2004). Consequently, cloud retrieval algorithms have been developed and implemented in trace gas processing to account for the cloud impact when observing atmospheric gases with spectrometers from space (e.g. Koelemeijer et al., 2001;Loyola, 2004;Kokhanovsky et al., 2006;Stammes et al., 2008;Lutz et al., 2016). In general, cloud retrieval algorithms use independent pixel approximation (IPA), which defines the scene as a linear combination of a cloudy and a clear sub-scene. In the cloud-free sub-scene, part of the solar light reaches the surface and is reflected back to the satellite. In the cloudy part of the scene, the solar light is scattered and reflected by the cloud, which affects the amount of absorption by atmospheric trace gases.
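The independent pixel approximation described above can be illustrated with a minimal sketch. It assumes that the scene reflectance is a purely linear mix of one cloudy and one clear-sky reflectance and ignores any wavelength or geometry dependence; the function and parameter names are hypothetical and are not taken from any of the cloud processors.

```python
def ipa_reflectance(cloud_fraction, r_cloud, r_clear):
    """Scene reflectance as a linear mix of a cloudy and a clear sub-scene."""
    if not 0.0 <= cloud_fraction <= 1.0:
        raise ValueError("cloud_fraction must lie in [0, 1]")
    return cloud_fraction * r_cloud + (1.0 - cloud_fraction) * r_clear


def effective_cloud_fraction(r_measured, r_cloud, r_clear):
    """Invert the linear mix for the cloud fraction (result is not clipped)."""
    if r_cloud == r_clear:
        raise ValueError("cloudy and clear-sky reflectances must differ")
    return (r_measured - r_clear) / (r_cloud - r_clear)
```

Because the inversion is not clipped, a measured reflectance brighter than the assumed cloudy reflectance yields a cloud fraction above 1, and a scene darker than the assumed clear-sky background yields a negative value.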
For TROPOMI, several cloud products have been developed that use different physical processes as approaches to retrieve cloud parameters such as cloud fraction, cloud height, and cloud optical thickness. The Optical Cloud Recognition Algorithm (OCRA) and the Retrieval Of Cloud Information using Neural Networks (ROCINN) are the operational cloud product for TROPOMI (see Sect. 2.1.1). The colour or whiteness approach is implemented in OCRA to retrieve a radiometric cloud fraction. OCRA applies background maps in a normalized red-green-blue (RGB) colour space, where an optically thick cloud is assumed to be white. In ROCINN the O 2 absorption band is used in the range 756-771 nm to provide cloud height information as well as cloud optical thickness and cloud albedo. The Fast Retrieval Scheme for Clouds from the Oxygen A-band (FRESCO) uses the O 2 A-band and the brightness approach for the NIR region (see Sect. 2.1.2). The brightness approach, where a cloud-free background is defined as dark compared to bright clouds, is also implemented in the TROPOMI NO 2 product for the UV-Vis region (Van Geffen et al., 2021; see Sect. 2.1.2), in the Mainz Iterative Cloud Retrieval Utilities (MICRU) for the UV, Vis, and NIR region (Sihler et al., 2021; see Sect. 2.1.4), and in the Visible Infrared Imaging Radiometer Suite (VIIRS) for the Vis, IR, and SWIR region (Siddans, 2016; see Sect. 2.1.5). O 2 -O 2 absorption in the spectral range between 460 and 490 nm is used for the TROPOMI O 2 -O 2 product (Acarreta et al., 2004; Veefkind et al., 2016; see Sect. 2.1.3). This approach was initially developed for OMI because the spectral range of the O 2 A-band is not covered by this instrument. In general, cloud retrieval algorithms use cloud models where clouds are assumed to be, for example, reflecting surfaces as in the OCRA/ROCINN CRB (Clouds-as-Reflecting-Boundaries) model, FRESCO, MICRU, and O 2 -O 2 or homogeneous layers as in the OCRA/ROCINN CAL (Clouds-As-Layers) model.
The cloud products differ in how the cloud albedo is determined; either it is fitted to the scene as in OCRA/ROCINN or assumed to have a fixed value of 0.8 as in FRESCO, O 2 -O 2 , and MICRU. Sihler et al. (2021) present the MICRU algorithm in more detail, comparing MICRU and different versions of OCRA and FRESCO using GOME-2 data. They show that MICRU is able to accurately determine small cloud fractions over a wide spectral range with less dependence on sun glint. Compernolle et al. (2021) performed a comprehensive validation of the OCRA/ROCINN CAL and CRB models as well as the FRESCO cloud product using cloud data from the VIIRS instrument aboard the Suomi National Polar-orbiting Partnership (NPP) satellite platform, OMI and MODIS satellites, and ground-based data from CloudNet. They present a new method for comparing the OCRA/ROCINN CRB cloud fraction and the FRESCO cloud fraction, converting the former to a scaled cloud fraction with a cloud albedo of 0.8, which is assumed in the FRESCO and O 2 -O 2 products. This procedure is used in the comparisons in this paper (see Sect. 2.2). Compernolle et al. (2021) found a pronounced west-east bias in the version 1 OCRA/ROCINN cloud product and unrealistic cloud heights equal to the surface altitude at low cloud fractions in the version 1 FRESCO product.
In this study, the operational TROPOMI cloud product (consisting of OCRA a priori, ROCINN CRB, and ROCINN CAL), the FRESCO cloud product, the cloud fraction from the NO 2 fitting window, the O 2 -O 2 cloud product, the VIIRS cloud fraction, and the cloud fraction from MICRU are compared with respect to different regions (Europe, Africa, and China) and 4 test days in different seasons, i.e. a summer day (30 June 2018), a winter day (5 January 2019), a spring day (4 April 2019), and an autumn day (20 September 2019). The OCRA/ROCINN CLOUD, FRESCO, NO 2 , and VIIRS products were updated from version 1 to version 2 in summer 2020, and both versions are included in the comparison. This study is limited to the 4 test days, as data from both versions are only available for specific days. The goal of this paper is to summarize and compare the existing cloud retrieval algorithms for TROPOMI, to discuss their differences, and to document the changes between the versions. The focus is on parameters needed for the application of the cloud products for trace gas retrievals, not the retrieval of cloud properties themselves.
The paper is organized in the following way: the different TROPOMI cloud products, their properties, and the data preparation are described in Sect. 2. In Sect. 3, the cloud products are statistically compared first with respect to the version change (Sect. 3.1) and second regarding the version 2 cloud fractions (Sect. 3.2.1) over snow- and ice-covered areas (Sect. 3.2.2) and with sun glint (Sect. 3.2.3), cloud heights (Sect. 3.2.4), and across-track dependencies (Sect. 3.2.5). Finally, conclusions are given in Sect. 4.

Methods
In this study, we compare different cloud products based on TROPOMI data of version 1 and version 2. This section presents first the input data and the different cloud retrieval algorithms (Sect. 2.1). In Sect. 2.2, the data preparation for the comparison of the different TROPOMI cloud products is described.

Cloud retrieval algorithms
For TROPOMI, different cloud retrieval algorithms have been developed to retrieve cloud parameters from the UV, Vis, and NIR spectral regions. As the cloud products use different approaches, the retrieved cloud fractions, cloud albedo, and cloud heights differ. An overview of the cloud products used in this paper is shown in Table 1.

CLOUD OCRA/ROCINN
The operational TROPOMI CLOUD product generated from the Universal Processor for UV/VIS Atmospheric Spectrometers (UPAS) was developed by the German Aerospace Centre (DLR) as a two-step algorithm. First, OCRA for TROPOMI, an algorithm for cloud detection by optical sensors, is applied to TROPOMI measurements in the UV-Vis spectral region to retrieve the a priori cloud fraction. Using the colour-space approach, the UV-Vis reflectances of the observed scene are translated to colours to obtain the radiometric cloud fraction. In UPAS 1.x, which was operational until July 2020, the clear-sky reflectance and the across-track dependency correction are based on OMI data with a spatial resolution of 0.2° × 0.4°. In UPAS 2.1.3, operational between July 2020 and July 2021, the clear-sky background map and the across-track dependency correction are based on 1 year of TROPOMI data with a spatial resolution of 0.2° × 0.2°, and since UPAS 2.2.1, those are based on 3 years of TROPOMI data. In addition, an adapted scaling is included to improve the range of very low and very high cloud fractions.
Second, the OCRA a priori cloud fraction and NIR TROPOMI measurements are taken as input to a machine learning algorithm, ROCINN, to retrieve the cloud-top height, the cloud optical thickness, and the cloud albedo from reflectivity measurements in and around the O 2 A-band between 758 and 771 nm. Two cloud models are implemented in ROCINN: the Clouds-As-Layers (CAL) model and the Clouds-as-Reflecting-Boundaries (CRB) model. ROCINN CAL treats clouds as homogeneous layers of scattering liquid water particles to retrieve cloud fraction, cloud top height, and cloud optical thickness. The cloud base height from ROCINN CAL is not a retrieved quantity; instead, the cloud is assumed to have a constant geometrical thickness of 1 km. In ROCINN CRB, clouds are Lambertian-equivalent reflectors, with cloud fraction, cloud height, and cloud albedo as output. Cloud fractions that are smaller than 0.05 in OCRA a priori are set to zero in the ROCINN CAL and CRB cloud fractions, and the ROCINN retrieval is not triggered under these "clear-sky" conditions. ROCINN used a MEdium Resolution Imaging Spectrometer (MERIS)-based surface albedo climatology in UPAS 1.x. Starting with UPAS 2.1.3, the surface albedo climatology is replaced by an actual surface albedo retrieval of geometry-dependent effective Lambertian-equivalent reflectivity (GE_LER) using the TROPOMI data, and the surface albedo map is dynamically updated every day with the global gapless geometry-dependent LER (G3_LER) if a scene is indicated as clear-sky. In addition, cloud phase flags and effective scene parameters such as effective scene height and effective scene albedo are added, and the co-registration between the UV-Vis band 3 (BD3) and the NIR band 6 (BD6) is improved. This study includes the UPAS 1.1.7 data as version 1 and the UPAS 2.1.3 data as version 2.
It should be noted that a very simple initial approach with limited usability is used for the quality value in UPAS 1.x, while significant improvements in the determination of the quality values have been made in UPAS 2.1.3 compared to version 1.x. In both versions, no quality filtering is applied for the OCRA a priori cloud fraction.

FRESCO and TROPOMI NO 2 product
The FRESCO algorithm, developed by the Royal Netherlands Meteorological Institute (KNMI), models the effective cloud fraction and cloud pressure (height) using the O 2 A-band centred at 760 nm (Koelemeijer et al., 2001). The cloud parameters are retrieved from top-of-atmosphere reflectances in three 1 nm wide wavelength windows at 758-759 nm (no absorption), 760-761 nm (strong absorption), and 765-766 nm (moderate absorption). Measurements from the NIR spectrum, where land is characterized by a high albedo in contrast to dark water, make FRESCO susceptible to uncertainties in the surface reflectance, such as distinct coastlines and a land-water contrast. FRESCO uses a Lambertian cloud model, where the cloud is assumed to be a Lambertian reflector with a fixed albedo of 0.8. The processor version of FRESCO defined as version 1 is 1.3.x, and version 2 refers to FRESCO processor version 2.1.0, which uses new look-up tables for the surface albedo and degradation-corrected irradiances. The FRESCO implementation in processor version 1.4, operational since December 2020, and later processor versions 2.x have adopted a different, wider wavelength window in the oxygen A-band (working title: FRESCO-wide). This generally leads to lower cloud pressures, correcting a high bias observed in versions 1.2 and 1.3, and to significant increases in NO 2 in better agreement with OMI NO 2 retrievals.
The cloud retrieval algorithm of the TROPOMI NO 2 product has been developed due to the misalignment between the TROPOMI ground pixel view of the Vis and NIR bands (Van Geffen et al., 2021). The effective cloud fraction from the TROPOMI NO 2 product is retrieved from the NO 2 fitting window in the UV-Vis spectral region at 440 nm. The cloud height of the TROPOMI NO 2 product is derived from the FRESCO cloud pressure of the TROPOMI FRESCO product, taking into account the difference in the footprint of the UV-Vis and NIR detectors. The FRESCO cloud height of the TROPOMI NO 2 product is very similar to that of the TROPOMI FRESCO product. The only difference is that the FRESCO cloud height is not corrected for the misalignment between the UV-Vis and NIR channels of TROPOMI. Consequently, only the cloud height from the TROPOMI NO 2 product is included in the comparisons in this paper, and the FRESCO cloud height is not dealt with explicitly. For the TROPOMI NO 2 product, the processor versions 1.2.2 and 1.3.x are used as version 1, and the processor version 2.1.0 is used as version 2.

O 2 -O 2
The O 2 -O 2 algorithm has been developed by KNMI and was initially developed for OMI because this instrument does not cover the spectral range of the O 2 A-band at 760 nm (Acarreta et al., 2004; Veefkind et al., 2016). The algorithm uses OMI and TROPOMI measurements from the O 2 -O 2 (O 4 ) absorption window at 477 nm to retrieve the effective cloud fraction and the cloud height using a cloud model similar to the one used in FRESCO. However, it is more sensitive to clouds at lower altitudes and to aerosols because it uses O 2 -O 2 collision-induced absorption. As in FRESCO, a fixed cloud albedo of 0.8 is assumed. The retrieved cloud height is expected to be the mid-level of the cloud rather than the cloud top height. Since TROPOMI processor version 2.2.0, the O 2 -O 2 cloud product is included in the TROPOMI NO 2 retrieval files, but the NO 2 retrieval currently uses only the FRESCO cloud information. For the O 2 -O 2 cloud product, the processor version 2.2.0 is used in this study.

MICRU
The MICRU algorithm was designed by the Max Planck Institute for Chemistry (MPIC) to retrieve the effective cloud fraction at different spectral bands using UV, Vis, and NIR TROPOMI measurements (Sihler et al., 2021). MICRU is optimized for low cloud fractions smaller than 0.2. It uses a viewing-direction-dependent empirical background map of surface reflectivity and differentiates between land and ocean. MICRU only computes effective cloud fractions and no other cloud parameters.

VIIRS
The VIIRS instrument is aboard the Suomi National Polar-orbiting Partnership (NPP) satellite platform launched in 2011. It flies in a so-called loose formation with S5P, with the difference in overpass time being less than 5 min. The S5P-NPP cloud product has been developed by the Rutherford Appleton Laboratory (RAL) to retrieve a four-level cloud mask with cloud probability for VIIRS pixels within an S5P scene (Siddans, 2021). The number of VIIRS cloud mask pixels within a given S5P scene is typically about 50-200, depending on cross-track position for S5P Vis-NIR bands and about double that in the SWIR. VIIRS Vis and infrared (IR) imagery and radiometric measurements are used as input to obtain a geometric cloud fraction, which is based on the cloud mask and is mainly independent of the cloud optical properties. The VIIRS cloud product is used for cloud screening by different TROPOMI products, for example, by the methane (CH 4 ) processor to identify cloud-free scenes for processing, the CLOUD processor for cloud screening in its daily surface albedo retrieval (GE_LER), and the aerosol layer height (ALH) processor as a cloud mask. It should be emphasized that the effective cloud fractions retrieved from the cloud products mentioned above strongly depend on the cloud optical thickness; i.e. for optically thick clouds, the effective cloud fraction may be close to the geometric one, but for optically thin clouds it might be much below. Therefore, the geometric VIIRS cloud fraction is expected to have the largest differences from the other cloud fractions. In this study, the cloud fraction calculated from the VIIRS Cloud Mask (VCM) included in the TROPOMI L2_NP_BD3 product file of processor version 1.0.2 is defined as version 1 (see Sect. 2.2 for details of the calculation). The cloud fraction from the VIIRS Enterprise Cloud Mask (ECM) is defined as version 2 and is directly taken from the TROPOMI CLOUD product file of processor version 2.1.3.

Data preparation
Throughout this study, data with a quality parameter (qa value) larger than or equal to 0.5 are used. The qa value ranges between 0 and 1 and is not a mathematical parameter but an artificial value used to decide whether the pixels are of good quality (1) or whether the measurement is affected by processing errors and warnings, in which case the qa value is reduced from 1. There are no common rules on how the qa value should be calculated and what input information should be included (e.g. sun glint, snow/ice, low cloud fractions, aerosol pollution, extreme viewing geometries, algorithm-specific retrieval diagnostics). Therefore, the qa values in the different products (ROCINN CRB, ROCINN CAL, FRESCO, and TROPOMI NO 2 product) are not directly comparable, and this non-standardized qa value calculation across all products has a large impact on the comparison of the cloud data in this study, as will be seen later.
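The quality screening described above amounts to a simple threshold on the qa value. The following minimal sketch assumes that the retrieved values and their qa values are available as two equally long sequences; the function name and the list-based interface are illustrative only.

```python
def filter_by_qa(values, qa_values, threshold=0.5):
    """Keep only the values whose qa value is larger than or equal to threshold."""
    if len(values) != len(qa_values):
        raise ValueError("values and qa_values must have the same length")
    return [v for v, qa in zip(values, qa_values) if qa >= threshold]
```

Since the qa value is computed differently in each product, applying the same threshold to two products does not necessarily select the same set of pixels.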
Some parameters from the set of TROPOMI cloud products need to be further processed to make the results from the different cloud retrieval algorithms more comparable.
ROCINN cloud albedo scaling. As recommended by Compernolle et al. (2021), the ROCINN CRB cloud fraction (CF) is converted to a scaled cloud fraction (sCF) with a fixed cloud albedo (CA) of 0.8. This provides a better comparison to cloud products that assume a fixed cloud albedo (0.8), such as FRESCO or O 2 -O 2 . The CA is assigned a fill value when CF = 0; thus the scaling is done as follows:
$$\mathrm{sCF} = \begin{cases} \mathrm{CF} \cdot \dfrac{\mathrm{CA}}{0.8}, & \mathrm{CF} > 0, \\ 0, & \mathrm{CF} = 0. \end{cases}$$
Pressure-to-height conversion. The parameter "cloud_pressure_crb" (CP in pascals) from the TROPOMI NO 2 product needs to be converted to cloud height (CH in metres). This parameter is derived from the FRESCO cloud pressure, already considering the difference in the footprint of the NIR and UV-Vis detectors. The following conversion formula is used:
$$\mathrm{CH} = \frac{T_0}{L} \left[ \left( \frac{\mathrm{CP}}{p_\mathrm{surface}} \right)^{-\frac{L R_s}{g}} - 1 \right],$$
where $L = -0.0065\ \mathrm{K\,m^{-1}}$ is the constant tropospheric temperature lapse rate, $p_\mathrm{surface} = 101325\ \mathrm{Pa}$ is the surface pressure, and $g = 9.81\ \mathrm{m\,s^{-2}}$ is the gravitational acceleration constant. The surface temperature $T_0$ is assumed to be 300 K. $R_s = R / \mathrm{MW_{air}}$ is the specific gas constant for air, with the universal gas constant $R = 8.3144621\ \mathrm{J\,mol^{-1}\,K^{-1}}$ and the molar weight of air $\mathrm{MW_{air}} = 0.0289644\ \mathrm{kg\,mol^{-1}}$.
The FRESCO and NO 2 cloud heights are the same quantity, spatially interpolated for different spectral bands. FRESCO is retrieved in the NIR region, and the FRESCO cloud height in the TROPOMI NO 2 product is interpolated to the TROPOMI pixel coordinates of the UV-Vis region. As explained in Sect. 2.1.2, only the NO 2 cloud height is considered in the comparisons in this study.
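The two conversions described above can be sketched in a few lines. The sketch assumes that the albedo scaling is the product of cloud fraction and cloud albedo divided by the reference albedo of 0.8, and that the pressure-to-height conversion is the standard barometric formula for a constant lapse rate; both forms are inferred from the description and the constants listed in the text, and the function names are hypothetical.

```python
REFERENCE_ALBEDO = 0.8      # fixed cloud albedo assumed by FRESCO and O2-O2
LAPSE_RATE = -0.0065        # K m-1, constant tropospheric lapse rate L
P_SURFACE = 101325.0        # Pa, surface pressure
G = 9.81                    # m s-2, gravitational acceleration
T0 = 300.0                  # K, assumed surface temperature
R_UNIVERSAL = 8.3144621     # J mol-1 K-1, universal gas constant
MW_AIR = 0.0289644          # kg mol-1, molar weight of air
R_SPECIFIC = R_UNIVERSAL / MW_AIR  # J kg-1 K-1, specific gas constant of air


def scaled_cloud_fraction(cf, ca):
    """Scale a CRB cloud fraction to the reference cloud albedo of 0.8."""
    if cf == 0:
        # The cloud albedo is a fill value for cloud-free scenes, so skip it.
        return 0.0
    return cf * ca / REFERENCE_ALBEDO


def pressure_to_height(cp):
    """Convert a cloud pressure in Pa to a cloud height in m."""
    exponent = -LAPSE_RATE * R_SPECIFIC / G
    return T0 / LAPSE_RATE * ((cp / P_SURFACE) ** exponent - 1.0)
```

With these constants, a cloud pressure of 50000 Pa corresponds to a height of roughly 5.8 km, and a cloud pressure equal to the surface pressure gives a height of zero.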
Other data operations. FRESCO and O 2 -O 2 cloud fractions reach values up to 1.5, which is physically impossible.

M. Latsch et al.: Intercomparison of Sentinel-5P TROPOMI cloud products
This is due to the assumption of a fixed albedo of 0.8. Thus, values larger than 1.0 are forced to 1.0 to ensure a consistent comparison and are expected to accumulate in the density histograms.
The dimension "ground_pixel" of the FRESCO parameters only has 448 pixels compared to the other products, which have 450 values. This is a consequence of the different spatial coverage (i.e. footprints) of the NIR and UV-Vis channels of TROPOMI. Therefore, the "ground_pixel" dimension of the FRESCO parameters is filled with two additional entries at the beginning of every scanline to ensure comparability. No attempt was made to map the FRESCO cloud fraction to the footprint of the UV-Vis channel. Therefore, some scatter is expected in the comparisons involving FRESCO cloud fraction data.
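The clipping of cloud fractions above 1.0 and the padding of the FRESCO scanlines described above can be sketched as follows. The fill value used for the two additional entries is not stated in the text, so NaN is used here as an assumption; the function names are hypothetical.

```python
NAN = float("nan")  # assumed fill value for the added entries


def clip_cloud_fraction(cf, upper=1.0):
    """Force cloud fractions above `upper` down to `upper`; NaN passes through."""
    return upper if cf > upper else cf


def pad_scanline(scanline, target_length=450, fill=NAN):
    """Prepend fill entries so that the scanline has target_length pixels."""
    missing = target_length - len(scanline)
    if missing < 0:
        raise ValueError("scanline is longer than target_length")
    return [fill] * missing + list(scanline)
```

Padding only aligns the array shapes. It does not remap the NIR footprints to the UV-Vis footprints, which is consistent with the scatter expected in the FRESCO comparisons.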
The MICRU cloud fraction derived at 440 nm is included in the comparisons because this wavelength corresponds to a point within the NO 2 fitting window (Sihler et al., 2021).
The ratio of the sum of pixels in the VIIRS class of interest ("vcm_confidently_cloudy") and the total number of all pixels, in each case for the nominal field of view, is calculated to determine the geometric cloud fraction of the VIIRS measurement (Siddans, 2021).
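As an illustration of this ratio, the following sketch computes the geometric cloud fraction from per-class pixel counts within the nominal field of view. Only the class name "vcm_confidently_cloudy" appears in the text; the dictionary interface and any other class names are assumptions made for the example.

```python
def viirs_geometric_cloud_fraction(class_counts,
                                   cloudy_class="vcm_confidently_cloudy"):
    """Ratio of confidently cloudy VIIRS pixels to all VIIRS pixels in a scene."""
    total = sum(class_counts.values())
    if total == 0:
        return None  # no VIIRS pixels fall within this S5P scene
    return class_counts.get(cloudy_class, 0) / total
```

For example, a scene with 30 confidently cloudy pixels out of 100 VIIRS pixels in total yields a geometric cloud fraction of 0.3, regardless of how the remaining 70 pixels are distributed over the other classes.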
In this study, the results are presented in density histograms, where the colour bar indicates what percentage of the available values is accumulated at the cloud fraction or cloud height values of each dot. In addition, the cloud products are compared in terms of differences on maps. For that purpose, the TROPOMI measurement pixels of all orbits are rasterized to a 0.03° × 0.03° grid. Thus, the differences represent averaged values of the measurements. This is important at higher latitudes where orbits overlap and measurements from multiple orbits are available per grid pixel.
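The rasterization step can be sketched as a simple binning of measurements onto a regular latitude-longitude grid with averaging per grid cell. This is a simplified illustration that assigns each measurement by its pixel centre only and ignores the actual pixel footprint; it is not necessarily the procedure used for the maps in this study, and all names are hypothetical.

```python
import math


def rasterize_mean(lats, lons, values, resolution=0.03):
    """Average values per cell of a regular lat-lon grid of the given resolution."""
    sums = {}
    counts = {}
    for lat, lon, value in zip(lats, lons, values):
        # Cell indices counted from the south pole and the date line.
        cell = (math.floor((lat + 90.0) / resolution),
                math.floor((lon + 180.0) / resolution))
        sums[cell] = sums.get(cell, 0.0) + value
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Where several orbits contribute measurements to the same grid cell, as at high latitudes, the returned value is their arithmetic mean.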

Results and discussion
Due to limited data availability of TROPOMI version 2 data, the comparisons of the different cloud products are restricted to 4 test days from different seasons: 30 June 2018 (summer day), 5 January 2019 (winter day), 4 April 2019 (spring day), and 20 September 2019 (autumn day). In addition, the cloud products are investigated for three defined regions, namely Europe, Africa, and China, whose spatial definitions can be seen in Fig. 1. The different regions were selected to investigate the performance of the cloud retrieval algorithms in different challenging situations: the Europe region includes snow and ice cover, the Africa region is suitable for sun glint due to the water surfaces around the continent, and the China region represents the most polluted region on Earth, which poses a challenge for the cloud products to accurately retrieve cloud heights.
In the first section (Sect. 3.1), the differences between the cloud products based on TROPOMI version 1 and version 2 data are presented, focusing on the changes due to the version updates. Section 3.2 presents the results of the intercomparison between the cloud fractions and the cloud heights of the different cloud products using TROPOMI version 2 data. First, the cloud fractions are compared in respect of their correlations (Sect. 3.2.1) and then are further analysed with respect to snow- and ice-covered scenes over the region of Europe (Sect. 3.2.2) and for scenes with sun glint in the Africa region (Sect. 3.2.3). In Sect. 3.2.4, the results of the comparisons between the cloud heights for the region of China are discussed, and across-track dependencies for both cloud fractions and cloud heights on a global basis are shown in Sect. 3.2.5.

Comparison between version 1 and version 2 cloud products
In July 2020, the operational TROPOMI cloud products were updated from version 1 to version 2. This paper discusses the comparison between version 1 and version 2 cloud fractions from OCRA/ROCINN, FRESCO, the NO 2 fitting window (cf_fit), and VIIRS for the region of Europe and the spring day (4 April 2019), which represents a day with snow- and ice-covered scenes (Figs. 2 and 3). In addition, the results of the cloud height comparison for the Africa region and the summer day (30 June 2018) are presented (Fig. 4). Additional density histograms of the cloud fractions and the cloud heights for the other test days for the regions of Europe and Africa are shown in Appendix A and in Sect. S1 in the Supplement.
The ROCINN CAL and OCRA a priori cloud fractions have a similar distribution (Fig. 2b and c), both being systematically larger in version 2 than in version 1. The differences between the two versions are much smaller for ROCINN CRB (Fig. 2a), arguably because cloud albedo has also changed but in the opposite direction, and the cloud fraction is scaled with the cloud albedo here as described in Sect. 2.2. However, the version 2 cloud fractions have higher values than the version 1 cloud fractions for all three OCRA/ROCINN products, especially for the largest values, due to an instrument degradation correction introduced in version 2. ROCINN CRB is distributed closer to the 1 : 1 line, which might result from a change in the surface albedo in version 2. ROCINN CAL and OCRA a priori version 2 show many values of 1, while version 1 is smaller, which is probably mainly related to the adapted OCRA scaling in version 2; i.e. the cloud fractions slightly below 1 in version 1 are more likely to be fully cloudy in version 2. The differences in values of zero for all three OCRA/ROCINN products may be due to the change from the OMI-based OCRA clear-sky reflectance climatology in version 1 to the TROPOMI-based clear-sky reflectance in version 2.
As mentioned in Sect. 2.1.1, ROCINN is not triggered for cloud fractions smaller than 0.05. This can be seen in Fig. 2b, where ROCINN CAL version 1 exhibits a gap for cloud fractions smaller than 0.05 because the OCRA a priori cloud fraction in ROCINN CAL is set to zero. In contrast, ROCINN CAL version 2 shows cloud fractions smaller than 0.05 due to a change in the co-registration procedure of the satellite pixels in version 2. The co-registration from BD3 (UV-Vis) to BD6 (NIR) for the OCRA a priori cloud fraction and vice versa from BD6 to BD3 for the ROCINN parameters is present in both product versions. However, while version 1 used a simplified integer pixel shift, in version 2, an improved scheme using the TROPOMI static mapping tables provided by KNMI was implemented. These mapping tables contain the actual overlap areas of neighbouring pixels and are therefore much more precise. Due to the co-registration, it may happen, for example, along cloud edges, that pixels with cloud fractions larger than 0.05 in one band may have cloud fractions smaller than 0.05 in the other band. Since a 0.05 cloud fraction is the threshold for the ROCINN retrieval, this might result in slightly different data yields in the two bands.
FRESCO shows virtually no change between the two versions for Europe (Fig. 2d). However, some scatter is observed for desert regions, such as Africa, for FRESCO cloud fractions smaller than 0.2 for the summer and winter days (Figs. A1 and A2), likely resulting from a surface albedo adjustment in version 2 introduced to avoid negative cloud fractions. The cloud fraction from the NO 2 fitting window changed for the largest values (Fig. 2e), with version 2 being smaller than version 1. This is due to adjustments in the cloud albedo to avoid cloud fractions larger than 1 and the use of degradation-corrected irradiances in version 2, resulting in a higher irradiance signal and thus a lower reflectance. In addition, differences between the TROPOMI NO 2 product of version 1 and version 2 occur particularly over snow- and ice-covered scenes, where some additional lines below the 1 : 1 line in the density histogram are found (Fig. 2e), corresponding to positive differences larger than 0.3 on the map (Fig. 3a). The reason for these differences is a change of the snow and ice mask from the Near-real-time Ice and Snow Extent (NISE) product in version 1 (Fig. 3b) to a mask based on European Centre for Medium-Range Weather Forecasts (ECMWF) data in version 2. The ECMWF mask has a higher spatial and temporal resolution (Fig. 3c). The VIIRS cloud fraction changed from VIIRS VCM (version 1) to ECM (version 2), resulting in large differences for the smallest and largest cloud fractions (Fig. 2f).
Figure 3. Mapped differences between the cloud fractions from the TROPOMI NO 2 product of versions 1 and 2 for Europe and the spring day (4 April 2019) (a). Snow and ice mask from the TROPOMI NO 2 product of version 1 (b) and version 2 (c) for Europe and the spring day (4 April 2019). The version 1 map is from the NISE product, and the version 2 map is based on higher spatially and temporally resolved ECMWF data.
Besides the cloud fractions, the cloud height from ROCINN CRB, the cloud top height from ROCINN CAL, and the FRESCO cloud height from the TROPOMI NO 2 product (ch_fresco*), which is remapped from the FRESCO NIR cloud pressure to the UV-Vis spatial footprints, were updated from version 1 to version 2. As the ROCINN CAL cloud base height is not a retrieved parameter but is only calculated with a constant geometric cloud thickness from the ROCINN CAL cloud top height, it is not included in this comparison.
In general, the ROCINN CRB cloud height (Fig. 4a) and the ROCINN CAL cloud top height (Fig. 4b) show smaller differences between the versions than the FRESCO cloud height (Fig. 4c). This is also evident in the correlation coefficient; while ROCINN CRB and CAL show a correlation between the versions of 0.94 or 0.93, respectively, the FRESCO cloud heights correlate less well with a correlation of 0.75. Furthermore, the latter exhibit large scatter at heights lower than 2000 m and an additional vertical line of points for cloud heights where version 1 yields values close to 1000 m. One reason for the differences in the TROPOMI NO 2 products might be that in version 1, the cloud height converged to the surface pressure for low cloud fractions, while in version 2, this is less frequently the case. However, these results seem to indicate that the cloud heights from the FRESCO cloud product converge in some cases with the ROCINN cloud heights in version 2, as the FRESCO cloud height in version 1 was found to be too low overall (Compernolle et al., 2021) and is partly larger in version 2, as shown by the vertical branch in Fig. 4c. This assumption is corroborated in Sect. 3.2.5.
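For readers who want to reproduce such version-to-version correlations, a minimal sketch is given below. It assumes that the correlation coefficient is the Pearson coefficient computed over collocated pixel pairs, which is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Here x and y would hold, for instance, the version 1 and version 2 cloud heights of the same pixels after quality filtering.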

Intercomparison between cloud products
As the different cloud products use different assumptions and algorithms, the results are not expected to always agree. For TROPOMI data users, it is therefore of value to understand the differences in the various cloud products and possible effects on trace gas retrievals.
The comparison of the different cloud fractions is always shown relative to the cloud fraction from the NO 2 fitting window (hereafter referred to as "cf_fit") and placed on the x axis in the density histograms. While this decision to use this cloud fraction as the reference is to some degree arbitrary, it is motivated by the fact that this is the cloud fraction currently used in the TROPOMI NO 2 product. As NO 2 is probably the most commonly used TROPOMI tropospheric trace gas product, it can be considered the baseline. Consequently, the interpolated FRESCO cloud height from the TROPOMI NO 2 product (hereafter referred to as "ch_fresco*") is used as a reference when comparing the different cloud heights to obtain consistent results. It should be understood that these are merely used as reference values and do not imply that they are the true values.

Cloud fraction
In this section, an intercomparison between the cloud fractions derived from the different version 2 cloud products is presented concerning statistics, such as correlations, to give an overview of their general behaviour. In the following sections, density histograms and difference maps of the cloud fraction comparison are evaluated for specific situations, such as over snow and ice cover (Sect. 3.2.2) and with sun glint (Sect. 3.2.3) (see also Sect. S2.3 for more plots of the version 2 cloud fractions).
The tabular intercomparison of the correlations of the cloud fractions for the regions of Europe and Africa is shown in Figs. 5 and 6 (see Sect. S2.1 for the statistics for China, and Sect. S4.1 for the version 1 data). As a first result, it can be said that the correlations between different products have improved in version 2 compared to version 1 (Sect. S4.1) for all regions and test days. The values for Europe (Fig. 5) and China (Sect. S2.1) behave very similarly, and therefore only the comparison for Europe is discussed in detail.
The summer and autumn days for Europe exhibit overall better correlations, larger than 0.95, than the winter and spring days (Fig. 5). This is especially true for the winter day, where the OCRA a priori cloud fraction deviates more from cf_fit, FRESCO, MICRU, and VIIRS than for other days, with correlations around 0.69. One reason for the poorer correlation between the cloud fractions for the winter and spring days is the product-specific treatment of snow and ice cover over Europe, which is discussed in more detail in Sect. 3.2.2. The correlations with VIIRS are the worst, with the lowest values of 0.42 for the winter day and the largest values of 0.92 for the summer day, but they are still better than in version 1. It should be borne in mind that the VIIRS cloud fraction is a geometric cloud fraction, not an effective one like the cloud fractions from the other cloud retrieval algorithms; thus, these differences were expected.
For the region of Africa, high correlations of the cloud fractions of more than 0.9 are found for version 2 (Fig. 6) as well as version 1 (Sect. S4.1) for all days. The exception is VIIRS, which differs more from the other products, with correlations around 0.7, because it is a geometric cloud fraction. The variation of the values between the different days is minimal for Africa, which represents a desert region, in contrast to Europe, where the days show much more variation between the seasons due to various influences such as snow and ice.

Cloud fraction over snow- and ice-covered scenes (Europe)
Snow and ice are a particular challenge for the cloud algorithms using UV-Vis-NIR (UVN) measurements, as the brightness of snow- and ice-covered surfaces is difficult to distinguish from optically thick clouds. As described in Sect. 3.1, the OCRA/ROCINN and FRESCO cloud products use snow and ice masks to determine the snow- and ice-covered scenes, which have a better spatial resolution in version 2 than in version 1 (see Fig. 3b and c). Below, the different cloud fractions of version 2 are compared for the region of Europe and the spring day (4 April 2019), where snow and ice cover is present (Figs. 7 and 8). First, the general behaviour of the different cloud fractions compared to cf_fit is described for the Europe region (see Sect. S2.3 for the other test days), and second, the differences between the cloud products that occur due to snow and ice cover are discussed.
In general, the ROCINN CRB cloud fraction is larger than cf_fit, with a difference of about 0.1 for the largest values (Fig. 7a). ROCINN CAL and OCRA a priori are clearly different from ROCINN CRB due to the scaling of the latter with the cloud albedo (see Sect. 2.2). They show larger cloud fractions than cf_fit by up to 50 % for the largest values and little scatter for cloud fractions smaller than 0.2 (Fig. 7b and c). FRESCO finds mainly larger values than cf_fit with a constant offset of about 0.1, and the points for smaller values are distributed in a hook shape (Fig. 7d). O 2 -O 2 and MICRU agree very well with cf_fit (Fig. 7e and f), except at the largest values, where cf_fit is up to 30 % smaller; for MICRU, values lower than 0.1 also scatter a little more. The O 2 -O 2 cloud fraction and cf_fit are expected to agree relatively well because the approaches are very similar. There are two subtle differences. First, for version 2, cf_fit is corrected for trace gas absorption in the NO 2 fitting window; i.e. cf_fit is derived from the expected reflectivity without trace gas absorption using the NO 2 fit information instead of the full reflectivity. This leads to slightly higher values for cf_fit than for O 2 -O 2 for small cloud fractions, where the correction for the NO 2 absorption is especially important, since the trace gases absorb some of the light. Second, although the wavelengths are quite close, the albedo map is evaluated at different wavelengths to reflect the wavelength difference between the NO 2 and the O 2 -O 2 fitting windows. This leads to some small changes at larger values. Another reason why cf_fit is on average smaller than O 2 -O 2 at the highest values is that the look-up table used by both cloud retrieval algorithms has a weak dependence on the cloud height, which is different in both cases (FRESCO and O 2 -O 2 cloud heights). These heights may differ substantially, depending on the location, which influences the total radiance level. This may lead to subtle changes in cloud fraction: if the cloud is higher, the Rayleigh scattering decreases, the intensity predicted by the cloud look-up table decreases, and the computed cloud fraction increases. VIIRS shows mostly larger values than cf_fit and has many values of 1 (fully cloudy) when cf_fit is smaller (Fig. 7g), resulting from the strict definition of cloudy pixels and the fact that it is a geometric cloud fraction. Larger scatter at the lowest 10 %-20 % of the cloud fraction values is found for all products except O 2 -O 2 . As a cloud fraction threshold of 0.2 is often used for trace gas retrievals, such as NO 2 , this result is particularly relevant to the application of cloud products to trace gas retrievals.
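The dependence of the computed cloud fraction on the assumed cloud height can be made concrete with a small sketch. It assumes the independent-pixel form commonly used for effective cloud fractions with a Lambertian cloud model, in which the measured reflectance is a linear mix of a clear-sky and a fully cloudy reflectance; the operational look-up tables are not reproduced here, and all reflectance values are hypothetical.

```python
def effective_cloud_fraction(r_meas, r_clear, r_cloud):
    """Solve r_meas = c * r_cloud + (1 - c) * r_clear for the cloud fraction c."""
    if r_cloud == r_clear:
        raise ValueError("cloudy and clear-sky reflectances must differ")
    return (r_meas - r_clear) / (r_cloud - r_clear)


# Hypothetical top-of-atmosphere reflectances, for illustration only.
R_MEAS = 0.30   # measured scene reflectance
R_CLEAR = 0.10  # modelled clear-sky reflectance

# A higher cloud has less Rayleigh scattering above it, so the modelled
# cloudy reflectance is assumed to be slightly lower than for a low cloud.
cf_low_cloud = effective_cloud_fraction(R_MEAS, R_CLEAR, r_cloud=0.80)
cf_high_cloud = effective_cloud_fraction(R_MEAS, R_CLEAR, r_cloud=0.75)
```

With the same measured reflectance, the lower predicted cloudy reflectance of the higher cloud yields the larger cloud fraction, which is the direction of the effect described in the text.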
Looking at the effect of snow and ice on the cloud products, specific differences between the cloud fractions are found. OCRA a priori, FRESCO, MICRU, and VIIRS compared to cf_fit show an accumulation of extreme values (Fig. 7c, d, f, g). These clusters correspond to negative differences with magnitudes larger than 0.3 over snow- and ice-covered regions such as Scandinavia and western Russia (Fig. 8c, d, f, g). Some positive values larger than +0.3 are found over Norway for OCRA a priori, FRESCO, and VIIRS, but this is the exception. These differences occur because, unlike the other products, cf_fit detects clouds over Norway, even though the cloudy pixels of the TROPOMI NO 2 product overlap areas identified as covered by snow and ice in the snow and ice mask of the TROPOMI NO 2 product (Fig. 3c). This shows that the product-specific treatment of snow and ice cover can lead to large differences in the cloud fractions.
In contrast to OCRA a priori, FRESCO, MICRU, and VIIRS, the comparisons of CRB and CAL with cf_fit do not show an accumulation of extreme values in the density histograms (Fig. 7a and b). The reason is that in ROCINN CRB and CAL, snow- and ice-covered pixels are filtered out, mainly based on retrieval diagnostics, which significantly reduce the quality value for such challenging retrievals to at least 0.25. Consequently, pixels with snow and ice cover in CAL and CRB are not included in the comparison since only values with a quality value larger than or equal to 0.5 are used. These two products flag only 54 % (CRB) and 57 % (CAL) of all values as valid in the region of Europe, as shown by a large number of grey pixels (NaN (not a number) values) on the maps (Fig. 8a and b). In contrast, the other cloud products include about 93 % of the values because they do not flag snow and ice as strongly as ROCINN does (see Appendix B for all numbers of available values). For example, the MICRU algorithm treats snow- and ice-covered pixels the same as pixels free of snow and ice, resulting in overestimated cloud fractions over areas with variable snow and ice cover. As a result, MICRU compared to cf_fit exhibits a second correlation line for cloud fraction values larger than 0.8 (Fig. 7f). O 2 -O 2 seems to be the only algorithm that treats snow and ice exactly like the TROPOMI NO 2 product, as it mostly shows excellent agreement with the cf_fit values, especially for smaller values (Fig. 7e). The large differences between VIIRS and cf_fit are not mainly due to snow and ice but to the many VIIRS values of 1 (fully cloudy), resulting from the strict definition of cloudy pixels and the fact that the VIIRS cloud fraction is a geometric cloud fraction rather than an effective one like the cloud fractions of the other cloud products (Figs. 7g and 8g).
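The quality filtering and the resulting share of valid values can be sketched as follows. The threshold of 0.5 is taken from the text; the function names and the sample values are hypothetical, and flagged pixels are represented by NaN as on the maps.

```python
import math

QA_THRESHOLD = 0.5  # values with a quality value >= 0.5 are used


def apply_qa_filter(values, qa_values, threshold=QA_THRESHOLD):
    """Replace values whose quality value is below the threshold by NaN."""
    return [v if (q >= threshold and not math.isnan(v)) else math.nan
            for v, q in zip(values, qa_values)]


def valid_fraction(values):
    """Share of pixels that still carry a valid (non-NaN) value."""
    if not values:
        return math.nan
    return sum(not math.isnan(v) for v in values) / len(values)


# Hypothetical cloud fractions and quality values for four pixels.
cloud_fraction = [0.1, 0.9, 0.4, 0.7]
qa_value = [1.0, 0.25, 0.5, 0.0]

filtered = apply_qa_filter(cloud_fraction, qa_value)
share_valid = valid_fraction(filtered)
```

Applied per product, such a share corresponds to the percentages of available values quoted above, for example 54 % for CRB against about 93 % for the products that flag snow and ice less strictly.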

Cloud fraction with sun glint (Africa region)
The results of the test days for the region of Africa do not differ considerably, as mentioned in Sect. 3.2.1, and the distributions of the density histograms between the different cloud fractions compared to cf_fit are virtually the same for all 4 d. The density histograms and difference maps for the winter day (5 January 2019) are shown in Figs. 9 and 10, respectively, to present the differences that generally occur between the cloud products for the Africa region and to provide a closer look at sun glint effects (see also Sect. S2.3 for the other test days).
The distributions of ROCINN CRB, ROCINN CAL, OCRA a priori, and VIIRS for Africa exhibit overall larger values than cf_fit (Fig. 9a, b, c, g), as already seen for Europe (Fig. 7). For FRESCO, the scatter points have a hook-shaped distribution, and the FRESCO cloud fractions are up to 0.5 larger for values where cf_fit is zero (Fig. 9d). It should be mentioned that the surface albedo used by FRESCO in the NIR is not very appropriate for TROPOMI since it is derived from GOME-2, and especially over vegetation, there are large systematic uncertainties with strong viewing angle dependence. The O 2 -O 2 cloud fraction and cf_fit agree well for smaller values but diverge slightly for larger values where cf_fit is smaller (Fig. 9e). The latter is also true for the MICRU cloud fraction and cf_fit, and in addition, there is some scatter at cloud fraction values lower than 0.2 (Fig. 9f), which is due to the different treatment of sun glint over water surfaces in the cloud retrieval algorithms.
Sun glint affects satellite measurements when sunlight is reflected directly from the ocean surface to the sensor. In such cases, the otherwise dark ocean water is perceived as a bright surface, which can be misinterpreted as clouds due to high reflectivity signals. The magnitude of the effect as well as the area affected depends on the smoothness of the surface and thus on wind speed. MICRU explicitly treats sun glint, which leads to differences from all the other products, as they do not have a specific treatment for sun glint. This can be seen in the fact that only the difference map of MICRU and cf_fit exhibits stripes over the oceans, here part of the Atlantic Ocean, where the sun glint geometry is given (Fig. 10g). This corresponds with the cloud fraction map from the TROPOMI NO 2 product detecting apparent cloud veils over these areas (Fig. 10d) and with the scatter at the lowest 0.2 cloud fraction in the density histograms (Fig. 9f). The different treatment of sun glint effects is also found on the difference maps for Africa for the other test days (Sect. S2.3). In this regard, it can be concluded that the cloud fractions over water surfaces might be most accurate in the MICRU algorithm.

Cloud height
The cloud height (CH), together with the cloud fraction, is an important parameter when comparing the different TROPOMI cloud products to investigate the cloud impact on trace gas retrievals. In the following, the results of the comparison between the remapped FRESCO CH from the TROPOMI NO 2 product (ch_fresco*) and the ROCINN CAL cloud top height (CTH), ROCINN CAL cloud base height (CBH), the ROCINN CRB CH, and the O 2 -O 2 CH for the region of China are presented. It should be noted that in the ROCINN CAL model, only the CTH is a retrieved parameter, while the CBH is assumed to have a fixed offset of 1 km from the CTH. ROCINN CAL CTH is expected to be higher than ch_fresco* because it is closer to the geometric cloud edge than the cloud centroid height from the FRESCO algorithm. ROCINN CRB CH should agree well with ch_fresco* as the two algorithms are very similar in their approach. O 2 -O 2 is the only algorithm that uses O 4 absorption in the Vis spectrum; thus, the height is expected to be the most different from the other products.
The tabular intercomparisons of all cloud products regarding the correlations are shown in Fig. 11 (see Sect. S2.2 for the statistics of Europe and Africa, and Sect. S4.2 for the version 1 data). A good agreement of correlations is found for the summer and autumn days between the OCRA/ROCINN products and ch_fresco* with values of about 0.8. However, for the winter and spring days, the correlations between CRB CH and ch_fresco* show smaller values of about 0.7 due to snow and ice cover in the region. The correlations of version 2 are mostly better than the correlations of version 1 (Sect. S4.2). The O 2 -O 2 CH does not correlate well with the cloud heights from the other cloud products, with the worst correlations of around 0.5 for the winter and spring days, as expected due to the different approach.
The cloud heights are evaluated explicitly for China and the autumn day (20 September 2019) to provide an overview of the differences between the cloud products (Figs. 12 and 13; see Sect. S2.4 for density histograms of the other test days). The density histograms of the ROCINN CAL CBH, CAL CTH, and CRB CH compared to ch_fresco* show overall less scatter in version 2 (Fig. 12a, b, c), especially at lower values, than in version 1 (not shown). However, some scatter remains for values smaller than 8 km. Against expectations, the distributions of ROCINN CRB CH and CAL CBH compared to ch_fresco* look very similar. ROCINN CRB CH and CAL CBH are generally lower than ch_fresco* for higher clouds but fit better for lower clouds, while ROCINN CAL CTH is, on average, slightly larger than ch_fresco* for the lowest clouds and fits well for the highest clouds, as expected. This is also reflected in predominantly positive and negative differences, respectively, on the difference maps (Fig. 13a, b, c), with deviations up to ±5 km, while in version 1, the deviations were up to ±8 km (not shown). The O 2 -O 2 CH and ch_fresco* show larger scatter above and below the 1 : 1 line at cloud heights lower than 8 km (Fig. 12d). All days show a stripe at the largest values of O 2 -O 2 CH around 16 km when ch_fresco* has lower values, for example, between 4 and 8 km. The distributions of the differences between O 2 -O 2 CH and ch_fresco* on the map have their own characteristics when all days are considered, and no regularity in their occurrence is found (see Fig. 13d).
Finally, it should be noted that the ROCINN products CAL CBH and CTH, as well as CRB CH, compared to ch_fresco* have much fewer available values, only 43 % and 44 % of all values for the autumn day and the region of China, respectively, than the FRESCO and O 2 -O 2 products, with 94 % and 96 %, respectively (see Appendix B for more details on the number of available values). This limitation of values in ROCINN is not only due to snow and ice flagging because only a small area is covered with snow and ice on that day, but the cloud heights are only available for pixels with a cloud fraction value above a threshold of 0.05. This generally leads to a small number of available cloud height values for ROCINN CAL and CRB.
As mentioned in Sect. 3.2.2, a cloud fraction threshold of 0.2 is often used for tropospheric trace gas retrievals because for larger cloud fractions, the information content on the lower troposphere is small. Therefore, this cloud fraction criterion is applied in Fig. 14 which only includes those scenes from Fig. 12 that have a cloud fraction less than or equal to 0.2 (see Sect. S2.4 for density histograms of the other test days). Much more scatter occurs for the ROCINN products CAL CBH and CTH, as well as CRB CH compared to ch_fresco*, especially for cloud heights lower than 8 km (Fig. 14a, b, c). This is expected, as less information on cloud height is available at low cloud fractions. The distribution of the O 2 -O 2 CH and ch_fresco* scatters mostly at cloud heights smaller than 4 km (Fig. 14d). However, the stripes at the highest (about 16 km) and the lowest O 2 -O 2 CH values seen for cloud heights without a limitation (Fig. 13d) remain.
In addition to limiting cloud fractions to a threshold of 0.2, low cloud heights smaller than 2 km are particularly critical for tropospheric trace gas retrievals. In Fig. 15, these two restrictions are applied to the cloud heights for the autumn day and the China region, and some patterns can be seen (see Sect. S2.4 for the other test days). First, the ROCINN products compared to ch_fresco* scatter extremely at this lowest range of cloud heights with some clusters at different heights, for example, from 0 to 800 m for ROCINN CAL CBH and from 400 m to 1 km for ROCINN CRB CH, for which the regression line fits the perfect line well. While the clusters for these two cloud products are below the 1 : 1 line, the ROCINN CAL CTH shows even two clusters at different heights, one above and one below the 1 : 1 line from about 600 to 1800 m. The O 2 -O 2 CH and ch_fresco* largely agree for values larger than 1 km. However, for values lower than 1 km, ch_fresco* is mainly larger than the O 2 -O 2 CH, and a second branch and much scatter occur.
These results show that the cloud heights of the different cloud retrieval algorithms differ significantly when only scenes with low cloud fraction are considered. When only the lowest cloud height values are considered, which are critical for tropospheric trace gas retrievals, the scatter further increases. Differences between the cloud heights without these limitations appear to be acceptable.
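The two restrictions used for Figs. 14 and 15 can be sketched as a simple scene selection. The thresholds of 0.2 and 2 km are taken from the text; which cloud fraction is used for the screening and whether the height limit applies to the reference height are not specified, so this sketch assumes a single screening cloud fraction and a limit on the reference height (ch_fresco*). All names and values are hypothetical.

```python
import math


def select_scenes(cloud_fraction, ch_ref, ch_other, cf_max=0.2, ch_max_m=None):
    """Return (ch_ref, ch_other) pairs for scenes passing the restrictions.

    cf_max:   keep scenes with cloud fraction <= cf_max.
    ch_max_m: if given, additionally keep only scenes whose reference
              cloud height is below this value (in metres).
    """
    selected = []
    for cf, ref, other in zip(cloud_fraction, ch_ref, ch_other):
        if math.isnan(cf) or math.isnan(ref) or math.isnan(other):
            continue  # skip scenes without a valid value in any input
        if cf > cf_max:
            continue
        if ch_max_m is not None and ref >= ch_max_m:
            continue
        selected.append((ref, other))
    return selected


# Hypothetical scenes: cloud fraction and two cloud heights in metres.
cf = [0.1, 0.3, 0.2, 0.05]
ch_reference = [1500.0, 1000.0, 2500.0, math.nan]
ch_compared = [1200.0, 900.0, 2400.0, 800.0]

low_cf = select_scenes(cf, ch_reference, ch_compared)
low_cf_low_ch = select_scenes(cf, ch_reference, ch_compared, ch_max_m=2000.0)
```

Each added restriction shrinks the sample, so part of the increased scatter at low cloud fractions and low cloud heights may also reflect the smaller number of scenes that remain.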

Across-track dependencies
For many products of nadir-viewing UV-Vis instruments, across-track biases have been reported. Possible reasons are instrumental effects, radiative transfer effects such as the surface bidirectional reflectance distribution function (BRDF) or the angular dependency of the scattering phase functions, or observational effects, for example, from the three-dimensional structure of clouds. For partially cloudy pixels, a systematic effect with higher apparent cloud fractions would be expected for large observation angles, which can be explained by the cloud holes appearing smaller for slant light paths. Moreover, for partially cloudy pixels, a small systematic effect with possibly lower apparent cloud heights could be expected, which may result from the fact that the sides of the clouds contribute more to the measured signal for slant viewing angles. In addition to systematic effects, relatively small data samples may also have across-track variations from the specific sampling of the scene used.
In the following, across-track dependencies of the different version 2 cloud products are shown in line diagrams in which the daily averaged cloud fractions and cloud heights are plotted against the across-track index of the orbits for the globe (see Sect. S3 for the plots of other regions and Sect. S4.3 for the plots of version 1). The values in Figs. 16 and 18 are flagged for snow and ice to ensure a consistent analysis for the different days. In addition, only pixels for which all products have valid values are included in daily means because the OCRA/ROCINN products compared to the TROPOMI NO 2 product have significantly fewer valid values than the other products compared to the TROPOMI NO 2 product (see Appendix B for detailed tables). This is especially true for the 22nd across-track index, where a dip occurs in all cloud fraction and cloud height diagrams (e.g. Figs. 16 to 19); its cause is discussed further below.
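The averaging described above can be sketched in two steps: first keep only pixels for which all products are valid, then average each product per across-track index. This is a minimal pure-Python illustration; the function names and sample values are hypothetical, and the additional snow and ice flagging is assumed to have been applied beforehand by setting flagged pixels to NaN.

```python
import math
from collections import defaultdict


def common_valid(products):
    """Set a pixel to NaN in every product unless all products are valid there."""
    names = list(products)
    n_pixels = len(products[names[0]])
    keep = [all(not math.isnan(products[p][k]) for p in names)
            for k in range(n_pixels)]
    return {p: [v if keep[k] else math.nan
                for k, v in enumerate(products[p])]
            for p in names}


def across_track_means(indices, values):
    """Mean of the valid values for each across-track index."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, v in zip(indices, values):
        if not math.isnan(v):
            sums[i] += v
            counts[i] += 1
    return {i: sums[i] / counts[i] for i in sorted(counts)}


# Hypothetical pixels: across-track index and two cloud fraction products.
index = [0, 0, 1, 1]
cloud_products = {
    "a": [0.2, 0.4, math.nan, 0.6],
    "b": [0.1, math.nan, 0.5, 0.7],
}

filtered = common_valid(cloud_products)
means_filtered = across_track_means(index, filtered["a"])
means_unfiltered = across_track_means(index, cloud_products["a"])
```

Comparing the filtered and unfiltered means for the same product shows how strongly the curves can depend on the pixel selection, even before any difference between the retrievals themselves is considered.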
For the globe, about 60 %-70 % of all pixels for the ROCINN cloud products and around 80 %-90 % of all pixels for the other cloud products are included in the daily mean cloud fractions for the 4 d, which is a more constant number compared to those of the regions of Europe and China (see Appendix B). The line plots are very similar for the different days (Fig. 16). VIIRS generally shows the largest values compared to the other products (up to 20 %-40 % larger). However, all products exhibit slightly larger values for the largest across-track pixel indices, and the curves look U-shaped, as one would expect from geometrical considerations for broken cloud fields. The OCRA a priori and ROCINN CAL are virtually identical. The cf_fit, O 2 -O 2 , MICRU, and ROCINN CRB have a very similar run of the curves, consistent with the good agreement in the density histograms. FRESCO is always smaller than OCRA a priori and ROCINN CAL but larger than the other products, which overlap more in the centre of the orbit and diverge slightly at the edges.
It should be noted that the overall good consistency of the plots is only obtained by filtering the values (snow and ice flag and overlap of available values in all products). When using all valid values for each product independently, quite different curves are obtained for those products having many more valid values. As an example, the global diagrams are shown in Fig. 17. Without the filtering, the cloud fractions of ROCINN CAL and OCRA a priori are no longer identical, as already seen when comparing the cloud products with cf_fit in the density histograms. Overall, the OCRA/ROCINN, FRESCO, cf_fit, O 2 -O 2 , and MICRU curves have similar shapes but large offsets. In contrast, the offset between VIIRS and the other cloud products is not as large as when the data are filtered, at about 10 %-30 %, and the VIIRS curves are less U-shaped. In addition, the non-filtered plots indicate that the above-mentioned dip in the curves at the 22nd across-track pixel index remains in the OCRA/ROCINN cloud products, while the other products show a smooth behaviour. This dip might be related to the changing binning scheme towards the swath edges, as the TROPOMI ground pixel size changes at this detector position. The weights from the co-registration mapping tables also show this dip (not shown); hence it could also be a consequence of the co-registration treatment in the OCRA/ROCINN algorithm.
However, there is no overall indication for a systematic across-track problem in the cloud fractions of any of the products. The behaviour of the OCRA/ROCINN curves in version 2 is in general less different from that of the other products than in version 1 (Sect. S4.3). Consequently, it can be stated that the comparability of the cloud fractions has improved after the version update. However, it should be noted that the differences between the cloud products depend strongly on how they are compared.
The across-track dependencies for the cloud heights in global terms look very similar, with slightly bent curves and minima at the centre of the index range for all days (Fig. 18; see Sect. S3 for the plots of other regions, and Sect. S4.3 for version 1 plots). This result does not correspond to the three-dimensional geometrical consideration for clouds and the above-mentioned expectation that the cloud height might be systematically smaller for slant viewing angles, which would result in a maximum for the centre of the pixel indices. However, the number of included values is very small, about 14 %-26 % of all pixels. This is due to the fact that only pixels are used for which all retrievals result in a valid cloud height. The interpolated FRESCO CH from the TROPOMI NO 2 product (ch_fresco*) is mainly larger than the ROCINN CRB CH, the CAL CBH, and the O 2 -O 2 CH but smaller than the ROCINN CAL CTH, with differences of about 500 m. This is the major difference from version 1, where ch_fresco* is more consistent with or even smaller than ROCINN CRB and CAL CBH (Sect. S4.3). O 2 -O 2 differs more from ROCINN CRB for the spring and autumn days, especially in the ranges of the lowest and largest across-track pixel indices. Both ch_fresco* and the O 2 -O 2 CH show some steps (small oscillations) in the lines that are probably linked to interpolation in look-up tables.
It should be noted that the across-track mean diagrams can vary quite strongly from day to day, especially when only sub-regions are considered rather than a global distribution, and that across-track plots might look much smoother when weekly or monthly means of data are considered. In addition, as shown for the cloud fractions, the cloud products behave differently when the filtering conditions are changed. When all valid values for each product are used independently, the cloud height curves show a different arrangement than with filtering (Fig. 19). While the ROCINN CAL CTH and CBH do not change significantly, the ROCINN CRB CH is slightly larger than the ROCINN CAL CBH and lies partly in between the two ROCINN CAL cloud heights. O 2 -O 2 shows no systematic features and seems to behave randomly: for the summer day, it is mostly smaller than ROCINN CAL CBH; for the winter day, it is more like ROCINN CRB; for the spring day, it is between ROCINN CRB and CAL CTH; and for the autumn day, it behaves more like ROCINN CAL CBH on the left side of the across-track indices, while on the right side, it shows large values like ROCINN CAL CTH. Overall, the curves are skewed, the left side showing lower values than the right side. The most unexpected changes are found for ch_fresco*, which exhibits values up to about 1 km lower than with filtering and therefore has values up to 500 m smaller than ROCINN CAL CBH, especially in the middle of the indices. This indicates that the finding that cloud heights from ROCINN and FRESCO are closer in version 2 only holds if the same filtering of the values is applied to all cloud products.
In summary, some systematic across-track problems in the cloud heights are evident, such as the unexpected minima of the cloud heights in the centre of the pixel indices that cannot be explained by observational effects. In addition, O 2 -O 2 shows larger variability than the other products and tends towards larger values at larger across-track indices in some seasons and scenarios. Overall, the agreement between the cloud products strongly depends on how the cloud fractions and cloud heights are filtered.

Summary and conclusions
Several different cloud products are included in the S5P operational level 2 data, and they differ in their definition and typical application. The OCRA a priori cloud fraction operates in the UV-Vis spectral region and is used as input for the ROCINN CRB and CAL models, which operate in the NIR spectral region. For global statistics, using the cloud fraction from OCRA or from the ROCINN products will not make a big difference. For individual measurements, particularly over snow and ice cover, it is recommended to use the ROCINN CRB and CAL cloud fractions instead of the OCRA a priori cloud fraction. FRESCO provides an effective cloud fraction retrieved from top-of-atmosphere reflectances, assuming an optically thick Lambertian cloud with a fixed albedo of 0.8. This approach is useful for trace gas retrievals, for example, for ozone. FRESCO retrieved in the O 2 A-band becomes sensitive to systematic uncertainties of the surface albedo with strong viewing angle dependence, especially over forests. Due to the large difference between the O 2 A-band and the NO 2 retrieval window and the misalignment between the TROPOMI ground pixel view of the Vis and NIR bands, the cloud fraction from the TROPOMI NO 2 product has been developed. It is suitable for NO 2 trace gas retrievals because the cloud fraction is retrieved from the NO 2 fitting window in the UV-Vis spectral region at 440 nm. The O 2 -O 2 algorithm uses measurements from the O 2 -O 2 (O 4 ) absorption window at 477 nm and assumes a fixed cloud albedo of 0.8. Although a similar model to the one in FRESCO is used, the O 2 -O 2 cloud product is more sensitive to lower clouds and to aerosols due to the application of O 2 -O 2 collision-induced absorption. It also provides continuity with data from the OMI mission. The MICRU algorithm is optimized for low cloud fractions smaller than 0.2 and is preferred for pixels over water with sun glint due to the explicit treatment of sun glint. The VIIRS cloud fraction is a geometric cloud fraction retrieved from a four-level cloud mask with cloud probability. It does not depend on cloud optical thickness as strongly as an effective cloud fraction, and thus, it performs well in selecting completely cloud-free scenes. Therefore, it can be used for cloud screening by TROPOMI products, for example, the methane processor, to identify cloud-free scenes for processing.
Cloud information is essential for quantitative retrievals of trace gas columns from UV-Vis satellite observations. This study reports on a systematic comparison of different cloud products for the TROPOMI instrument on board the Sentinel-5 Precursor satellite. In a first step, versions 1 and 2 of the TROPOMI cloud products ROCINN CRB, ROCINN CAL, OCRA a priori, FRESCO, the cloud fraction from the NO 2 fitting window, and VIIRS are compared. The cloud fractions from the OCRA/ROCINN cloud products show the largest differences between version 1 and version 2, the ROCINN CRB cloud fraction being less affected by the version change due to the scaling with the cloud albedo. The FRESCO product shows virtually no changes, and the cloud fraction from the NO 2 fitting window changed principally over snow- and ice-covered scenes due to an update of the snow and ice mask. The VIIRS cloud fraction shows large differences at the smallest and the largest values due to the version change from VIIRS VCM to VIIRS ECM. Concerning the cloud heights, the largest changes are found in the remapped FRESCO cloud height from the TROPOMI NO 2 product, while the ROCINN CRB and CAL cloud heights show merely small differences. Overall, the cloud heights from the different cloud retrieval algorithms converged, which is a good result as the FRESCO cloud height in version 1 was too low overall (Compernolle et al., 2021).
The second part of this work compares the above-named TROPOMI version 2 cloud products, as well as the O 2 -O 2 product and the MICRU cloud fraction. Compared to version 1, general improvements include smaller scatter for small cloud fractions in version 2 and the fact that the across-track bias in the ROCINN cloud fractions found by Compernolle et al. (2021) in version 1 is no longer present in version 2. In addition, better comparability of the cloud products is found in version 2, resulting from improvements in the various retrievals. For OCRA/ROCINN, the following changes were applied: an instrument degradation correction, a change in the surface albedo treatment (daily updated G3_LER retrieval from TROPOMI instead of a fixed climatology), the adapted OCRA scaling, the change in the UV-Vis co-registration procedure in version 2, and the change from the OMI-based OCRA clear-sky reflectance climatology in version 1 to the TROPOMI-based clear-sky reflectance in version 2. While FRESCO for Europe effectively shows no changes between the two versions, the cloud fractions smaller than 0.2 for Africa differ slightly due to a surface albedo adjustment in version 2. Since the albedo is a major cause of differences between the different cloud products, in the upcoming version 2.4.0 of the FRESCO product, a directionally dependent LER derived from TROPOMI observations will be used, which is expected to reduce systematic biases in the FRESCO cloud fraction. In the TROPOMI NO 2 product of version 2, the cloud albedo is adjusted when the cloud fraction exceeds 1, and a degradation-corrected irradiance is used. In addition, the FRESCO cloud heights converge less frequently to the surface pressure for low cloud fractions in version 2 than in version 1. Furthermore, a spatially and temporally better-resolved snow and ice mask based on ECMWF data is implemented in all version 2 products, as opposed to the mask from the NISE product used in version 1.
The different TROPOMI cloud products of version 2 still show some systematic differences, both in the density histograms of the cloud fractions and in the comparisons of cloud heights. However, the variations between the different days are smaller than in version 1. A large part of the differences can probably be explained by the different assumptions made by the cloud products regarding the cloud model used (e.g. Lambertian cloud or scattering cloud) and the surface albedo used (e.g. daily retrieved from TROPOMI or based on a fixed climatology). Another important source of differences is the behaviour under difficult surface conditions such as snow and ice cover and how the values for these situations are flagged in the cloud retrieval algorithms. Only one day per season is used in this study, but the same pattern of differences is seen when examining more days; these additional results are not presented in this paper.
Summarizing the results for the region of Europe, the ROCINN CAL and the OCRA a priori cloud fractions are predominantly larger than cf_fit, while ROCINN CRB and FRESCO show a small offset of 5 %, and the cloud fractions from the O 2 -O 2 , MICRU, and TROPOMI NO 2 products have the overall best agreement with small differences only for the largest cloud fractions. Additional systematic issues are found for snow- and ice-covered scenes, for example, clusters of extreme values in the OCRA a priori, FRESCO, MICRU, and VIIRS plots, as well as a second correlation line for MICRU and FRESCO. In summary, the different treatment of snow and ice cover leads to large differences between the cloud products, as the cloud retrieval algorithms apply varyingly strict flagging of snow- and ice-covered pixels.
For Africa, the cloud fractions show essentially the same patterns as for Europe. Only the NIR-based FRESCO cloud fraction shows biases over vegetation due to a differently derived surface albedo. The MICRU cloud fraction shows more scatter at the lowest values due to the explicit treatment of sun glint in the MICRU algorithm. Therefore, the MICRU cloud fraction is arguably more accurate over water surfaces affected by sun glint than the other cloud products that do not specifically treat sun glint in their algorithms.
Regarding the cloud heights, overall better agreement is found between the ROCINN CAL and CRB products and the interpolated FRESCO cloud height from the TROPOMI NO 2 product (ch_fresco*) in version 2 than in version 1, with correlations around 0.8. The ROCINN CAL cloud base height and the ROCINN CRB cloud height are systematically lower than ch_fresco*, in particular at large values. For cloud heights lower than 2 km and applying a cloud fraction threshold of 0.2 as often done in trace gas retrievals, the ROCINN products compared to ch_fresco* show overall large scatter and clusters of values at different heights. The number of valid values differs considerably between the ROCINN products and the FRESCO CH, with the ROCINN products providing only half the number of values of the TROPOMI NO 2 product in some cases due to a different flagging procedure for the cloud height values. The O 2 -O 2 cloud height shows mainly smaller values than ch_fresco* and much scatter for low clouds, as well as two clusters of values at the largest and lowest cloud heights when ch_fresco* is between 4 km and 8 km. The larger differences between the O 2 -O 2 cloud height and the other cloud products were expected because the O 2 -O 2 algorithm is the only one in this study that uses the O 4 absorption in the blue part of the spectrum.
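The kind of pairwise cloud height comparison summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the processing used in this study: invalid or flagged retrievals are assumed to be encoded as NaN, any additional screening (for example by cloud fraction or snow/ice flag) is passed in as an optional boolean mask, and all names are hypothetical.

```python
import numpy as np

def compare_cloud_heights(ch_ref, ch_other, keep=None):
    """Compare two co-located cloud height arrays (illustrative sketch).

    ch_ref, ch_other: cloud heights in km, NaN where a product is flagged.
    keep: optional boolean mask for extra screening chosen by the user.
    Returns the valid counts per product, the Pearson correlation, and
    the mean difference (other minus reference) on the common pixels.
    """
    ch_ref = np.asarray(ch_ref, dtype=float)
    ch_other = np.asarray(ch_other, dtype=float)
    if keep is None:
        keep = np.ones(ch_ref.shape, dtype=bool)
    valid_ref = keep & np.isfinite(ch_ref)
    valid_other = keep & np.isfinite(ch_other)
    both = valid_ref & valid_other
    # Pearson correlation on pixels where both products are valid
    r = np.corrcoef(ch_ref[both], ch_other[both])[0, 1]
    mean_diff = np.mean(ch_other[both] - ch_ref[both])
    return {
        "n_ref": int(valid_ref.sum()),
        "n_other": int(valid_other.sum()),
        "r": float(r),
        "mean_diff": float(mean_diff),
    }

# Hypothetical values: the second product is flagged on a different pixel
ref = [1.0, 2.0, 3.0, 4.0, np.nan]
other = [0.5, 1.5, np.nan, 3.5, 2.0]
stats = compare_cloud_heights(ref, other)
print(stats)
```

Reporting the valid counts separately for each product makes differences in flagging visible, since the correlation itself is computed only on pixels that both products regard as valid.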
The across-track dependency plots of the different cloud products show slightly U-shaped curves with minima in the centre of the across-track pixel indices for the cloud fractions and the cloud heights. As for the cloud fraction, these bent curves can be explained by geometrical considerations for partially cloudy pixels because higher apparent cloud fractions are expected for large viewing angles. For the cloud height, the expectation that the cloud height might be smaller for slant viewing angles due to the larger contribution of the cloud sides to the measured signal cannot be confirmed. In addition to these issues, the FRESCO cloud height from the TROPOMI NO 2 product and the O 2 -O 2 cloud height show some steps in the curves, probably related to the interpolation in the look-up tables. O 2 -O 2 exhibits large variability and tends to have larger values at larger across-track indices in some seasons. In addition, the cloud heights show differences between the eastern and western pixels. Overall, the comparability of the cloud fractions has improved after the version update. However, the differences between the cloud products vary depending on how the cloud fractions and cloud heights are compared. For example, the FRESCO cloud height from the TROPOMI NO 2 product is mainly larger than CRB, CAL cloud base height, and O 2 -O 2 using the data filtering (snow/ice flag and data consistency), but it is the smallest cloud height without the data filtering. This should be kept in mind when considering the positive result of this study that the FRESCO and OCRA/ROCINN cloud heights are closer after the version update.
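An across-track dependency curve of the kind discussed above can be approximated by averaging a cloud parameter over all scanlines for each across-track pixel index. The Python sketch below assumes that the data are arranged as a two-dimensional array with dimensions (scanline, ground pixel) and that invalid values are stored as NaN; both the layout and the names are assumptions made for illustration.

```python
import numpy as np

def across_track_mean(field):
    """Mean of a 2-D field (scanline, ground_pixel) per across-track index.

    NaN values are ignored. Columns without any valid value return NaN.
    """
    field = np.asarray(field, dtype=float)
    count = np.sum(np.isfinite(field), axis=0)
    total = np.nansum(field, axis=0)
    # Divide only where data exist; np.maximum avoids division by zero
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

# Hypothetical cloud fractions: two scanlines, three across-track pixels
cf = np.array([[0.3, 0.1, np.nan],
               [0.5, np.nan, np.nan]])
profile = across_track_mean(cf)
print(profile)
```

Plotting such a profile against the across-track index for each product and season is one simple way to look for U-shaped curves, steps, or east-west asymmetries.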
Taken as a whole, the different TROPOMI cloud products in version 2 are highly correlated. The differences between the cloud products for Europe and China in terms of cloud fraction and cloud height are much larger than those for Africa when comparing the different seasons due to the surface conditions such as snow and ice cover. Differences are larger at small cloud fractions and for low clouds, situations relevant for tropospheric trace gas retrievals. When comparing version 2 and version 1 products, the consistency between the cloud products has significantly improved, which is an important message to TROPOMI data users applying the cloud products for trace gas retrievals.