Many applications of atmospheric composition and climate data involve the comparison or combination of vertically resolved atmospheric state variables. Calculating differences and combining data require, at a minimum, the harmonization of data representations in terms of physical quantities and vertical sampling. If one or both datasets result from a retrieval process, knowledge of prior information and averaging kernel matrices in principle allows retrieval differences to be accounted for as well. Spatiotemporal mismatch of the sensed air masses and its contribution to the data discrepancies can be estimated with chemistry transport modeling support. In this work an overview of harmonization or matching operations for atmospheric profile observations is provided. The effect of these manipulations on the information content of the original data and on the uncertainty budget of data comparisons is examined and discussed.

The quality assessment and validation of atmospheric state observations largely rely on making comparisons with (reference) measurements of the same observable. On the other hand, data merging or fusion schemes involve the combination of observations from different sources, weighted by functions that mix uncertainties, information content aspects, and spatiotemporal (4-D) representativeness. Chemical data assimilation, in turn, involves the comparison and/or combination of observations with modeling outputs. However, quantitative comparisons and combinations of atmospheric soundings are only possible when the observables are represented on the same vertical grid, within the same vertical range, and in identical units. Moreover, observations by different instruments also differ in their sensitivity to and representativeness of spatiotemporal features of the atmospheric field (i.e., resolution or smoothing differences).

Carried out in the context of several satellite validation studies (for Sentinel-5P, the European Space Agency's Climate Change Initiative, and the Satellite Application Facility on Atmospheric Composition Monitoring) and considering the exploration of advanced data fusion methods

When taking the difference of two vertically resolved atmospheric state observations, e.g., a measurement under study

If at least one of the observations is the result of a retrieval process, some retrieval contributions to the difference errors can be made explicit as well. For example each retrieved profile

The Committee on Earth Observation Satellites (CEOS) defines validation as (1) “the process of assessing, by independent means, the quality of the data products”

One can perform a so-called

Secondly, one often directly quantitatively or qualitatively verifies whether a sample bias

Irrespective of the method used, a full assessment and quantification of all contributions to the difference error

This section provides an overview of profile matching manipulations. A distinction is made between representation matching (relating to the vertical grid, vertical quantities, and their units), vertical smoothing matching (cf. vertical resolution of the measurement), retrieval matching (cf. impact of prior information), and spatiotemporal co-location matching. Because of the focus on vertically resolved atmospheric state observations, horizontal and vertical sampling and smoothing issues are discussed separately.

The matching of the vertical representation of the study and reference profiles is an unavoidable operation to make difference calculations possible in the first place. The vertical representation includes the vertical sampling and coordinate (altitude, pressure, geopotential height, or other) and the atmospheric state quantity (volume mixing ratio, number density, partial column, or other). A representation conversion may introduce a bias and reduce the precision due to uncertainties in the ancillary data and data manipulations, which actually should be taken into account in the comparison's uncertainty budget (see Sect.

When changing between concentration-type quantities like number density and volume mixing ratio, a diagonal level-by-level unit conversion matrix

When going from a concentration-type representation on levels to one between levels (i.e., on layers, like partial columns), one can choose the integration boundaries either on the given levels or in between them with the exception of the outer edges, resulting in a rectangular or square conversion matrix

The number of levels (for point-like concentration values) or layers (for vertically integrated column values) and their vertical locations or boundaries have to be identical for two profiles to be quantitatively compared. One can opt for an explicit vertical range matching of the two profiles first, e.g., by vertical clipping of the one or by extension by use of a climatology of the other. The latter can be applied when later profile operations require knowledge of the atmospheric state over its full vertical range

Several regridding approaches are in use, although their applicability typically depends on the units and/or the vertical resolution discrepancy between the input and target grids.

Straightforward regridding by (linear or other) interpolation only works appropriately, i.e., with minimum information loss, when going from a coarser-resolution input grid to a finer-resolution target grid. Although the corresponding interpolation matrix

In order not to suggest a vertical resolution that is misleadingly much higher than the effective vertical resolution of (one of) the observations, atmospheric state profile comparisons are often made on the vertical grid of the product with the coarsest sampling. When consequently the input grid has a finer resolution than the target grid, one can easily invert the problem by constructing an interpolation matrix

In practice vertical sampling definitions might change in time, or one might not know beforehand whether the target grid is coarser than the input grid or vice versa, or both grids may be similar.
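The interpolation-based regridding discussed above can be written as a matrix acting on the input profile. The following is a minimal sketch for the coarse-to-fine case only, assuming linear interpolation on strictly increasing levels and no extrapolation; the function name, grids, and profile values are hypothetical and chosen purely for illustration.

```python
import numpy as np

def linear_interpolation_matrix(z_in, z_out):
    """Return W such that W @ x_in linearly interpolates x_in from z_in to z_out.

    Assumptions of this sketch: z_in is strictly increasing and every
    target level lies within [z_in[0], z_in[-1]] (no extrapolation).
    """
    z_in = np.asarray(z_in, dtype=float)
    z_out = np.asarray(z_out, dtype=float)
    if np.any(np.diff(z_in) <= 0):
        raise ValueError("z_in must be strictly increasing")
    if z_out.min() < z_in[0] or z_out.max() > z_in[-1]:
        raise ValueError("z_out must lie within the input range")
    # Index of the input level at or just below each target level.
    lower = np.clip(np.searchsorted(z_in, z_out, side="right") - 1,
                    0, z_in.size - 2)
    # Fractional distance of each target level between its two neighbors.
    frac = (z_out - z_in[lower]) / (z_in[lower + 1] - z_in[lower])
    W = np.zeros((z_out.size, z_in.size))
    rows = np.arange(z_out.size)
    W[rows, lower] = 1.0 - frac
    W[rows, lower + 1] = frac
    return W

# Hypothetical grids (km) and concentration values.
z_coarse = np.array([0.0, 2.0, 4.0, 6.0])
z_fine = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x_coarse = np.array([10.0, 20.0, 40.0, 30.0])

W = linear_interpolation_matrix(z_coarse, z_fine)
x_fine = W @ x_coarse
```

Each row of `W` has at most two non-zero weights that sum to one. Under standard linear error propagation, the same matrix maps an input covariance `S` to `W @ S @ W.T` on the target grid.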

One might instead prefer the total vertical column amount to be conserved during the regridding operation. Such mass-conserved regridding is easily achieved for partial column quantities, whether going from finer to coarser resolution or vice versa. It is sufficient to construct an overlap matrix that contains the fractions of how much each target grid layer is covered by an input grid layer.
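As an illustration of such an overlap matrix, the sketch below adopts one possible normalization: each weight is the fraction of an input layer that falls inside a target layer, which conserves the total column when both grids span the same vertical range. It further assumes that the constituent is uniformly distributed within each input layer. The function name, layer boundaries, and partial column values are hypothetical.

```python
import numpy as np

def overlap_matrix(edges_in, edges_out):
    """Return M such that M @ pc_in redistributes partial columns onto edges_out.

    M[i, j] is the fraction of input layer j lying inside target layer i.
    Assumptions of this sketch: layer boundaries are increasing, and the
    constituent is spread uniformly within each input layer.
    """
    edges_in = np.asarray(edges_in, dtype=float)
    edges_out = np.asarray(edges_out, dtype=float)
    lo_in, hi_in = edges_in[:-1], edges_in[1:]
    lo_out, hi_out = edges_out[:-1], edges_out[1:]
    # Pairwise overlap length: target layers along rows, input layers along columns.
    overlap = (np.minimum(hi_out[:, None], hi_in[None, :])
               - np.maximum(lo_out[:, None], lo_in[None, :]))
    overlap = np.clip(overlap, 0.0, None)  # non-overlapping pairs get zero weight
    # Normalize by the input layer thickness to obtain fractions.
    return overlap / (hi_in - lo_in)[None, :]

# Hypothetical layer boundaries (km) and partial columns (arbitrary units).
edges_in = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
edges_out = np.array([0.0, 1.5, 4.0])
pc_in = np.array([4.0, 2.0, 1.0, 1.0])

M = overlap_matrix(edges_in, edges_out)
pc_out = M @ pc_in
```

Because every column of `M` sums to one when the target grid covers the full input range, the sum of `pc_out` equals the sum of `pc_in`, regardless of which grid is finer.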

Total mass-conserved regridding of concentration-type quantities defined on vertical levels or as vertical averages, as is often the case in model fields, is somewhat less straightforward. Before being able to apply the conversion matrix as defined in the previous expression, the point-like concentration values of the input profile must be converted to vertically integrated values, and after the subsequent mass-conserved regridding operation a conversion to the initial units is needed. A combination of Eq. (

The vertical correlation of atmospheric measurement or retrieval quantities results from the allocation to neighboring levels (layers) of concentrations (columns) that are in fact obtained from vertically overlapping probed air masses. Especially for profile retrievals that have more retrieval levels than independent degrees of freedom in the measurement, the vertical smoothing of the spectral measurement information by the retrieval can be large. As the algebraic inversion of a retrieved profile's vertical smoothing is typically an ill-posed problem, vertical smoothing matching is ideally achieved by imposing an estimator of the coarser, height-dependent smoothing window function on each level (layer) of the atmospheric state profile that has the narrower smoothing window.

The smoothing window estimator can take any custom-defined shape, but in practice typically a box, triangular, or Gaussian-like function is applied. The window function in any case has to be normalized to unity, while the function width determines the extent of the vertical smoothing effect. This extent is chosen in agreement with the estimated vertical resolution of the coarser-smoothed atmospheric observation, usually going from a few to several tens of kilometers.
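A window-based smoothing of this kind can be sketched as a matrix whose rows are unit-normalized window functions. The example below assumes a Gaussian window characterized by its full width at half maximum and a regular vertical grid, so that a plain discrete sum is an adequate normalization; on an irregular grid the weights would additionally need to account for the level spacing. The function name, grid, width, and test profile are hypothetical.

```python
import numpy as np

def gaussian_smoothing_matrix(z, fwhm):
    """Return G whose row i is a unit-sum Gaussian window centered on z[i].

    fwhm may be a scalar or an array of len(z) for a height-dependent width.
    Assumption of this sketch: z is a regular grid.
    """
    z = np.asarray(z, dtype=float)
    # Convert the full width at half maximum to a standard deviation per level.
    sigma = (np.broadcast_to(np.asarray(fwhm, dtype=float), z.shape)
             / (2.0 * np.sqrt(2.0 * np.log(2.0))))
    G = np.exp(-0.5 * ((z[None, :] - z[:, None]) / sigma[:, None]) ** 2)
    # Normalize every window (row) to unity.
    return G / G.sum(axis=1, keepdims=True)

# Hypothetical 1 km grid and a narrow feature at 20 km.
z = np.arange(0.0, 41.0, 1.0)
x_fine = np.exp(-0.5 * ((z - 20.0) / 1.0) ** 2)

G = gaussian_smoothing_matrix(z, fwhm=5.0)  # 5 km window, illustrative value
x_smooth = G @ x_fine
```

The row normalization ensures that a vertically constant profile is left unchanged, while narrow features such as the peak in `x_fine` are broadened and reduced in amplitude.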

For retrieved atmospheric state profiles, the best and already discretized estimators of the vertical smoothing functions are provided by the averaging kernel matrix rows

Attempting to harmonize two atmospheric state products whereby at least one is the result of a retrieval process, one has to consider differences in measurement weights, prior profile shapes, and prior constraints between both products. These differences can be (partially) corrected for in two ways. Either one imposes the retrieval artifacts of one product on the other, or one eliminates the retrieval artifacts and associated uncertainties from the retrieved product(s) at the cost of vertical resolution. Both options are discussed in the following two subsections, respectively.
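The first option, imposing the retrieval characteristics of one product on the other, is commonly written as x_s = x_a + A (x_ref - x_a), with A the averaging kernel matrix and x_a the prior profile of the retrieved product. The sketch below illustrates this widely used relation under the assumption that the reference profile has already been brought onto the retrieval grid and into the same units; the function name and the demonstrative matrix and profiles are hypothetical.

```python
import numpy as np

def apply_averaging_kernel(x_ref, x_prior, A):
    """Smooth a reference profile with a retrieval's averaging kernel matrix.

    Implements x_s = x_prior + A @ (x_ref - x_prior).
    Assumption of this sketch: x_ref, x_prior, and A share the same
    vertical grid and units.
    """
    x_ref = np.asarray(x_ref, dtype=float)
    x_prior = np.asarray(x_prior, dtype=float)
    A = np.asarray(A, dtype=float)
    if A.shape != (x_prior.size, x_ref.size):
        raise ValueError("A must have shape (len(x_prior), len(x_ref))")
    return x_prior + A @ (x_ref - x_prior)

# Demonstrative three-level example (arbitrary units).
x_ref = np.array([1.0, 2.0, 3.0])
x_prior = np.array([2.0, 2.0, 2.0])
A = np.array([[0.6, 0.2, 0.0],
              [0.2, 0.5, 0.2],
              [0.0, 0.2, 0.6]])

x_smoothed = apply_averaging_kernel(x_ref, x_prior, A)
```

In the limiting cases, an identity averaging kernel matrix returns the reference profile unchanged, whereas a zero matrix returns the prior profile.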

Profile harmonization flowchart, indicating the order of the matching operations outlined in the text. Rectangular boxes are optional, while hexagons are mandatory. The maximum likelihood (ML) representation has here been included as a prior matching operation with

As described in the previous sections, vertical sampling and effective resolution differences can be virtually eliminated by applying appropriate regridding and smoothing procedures, respectively. The underlying requirement, however, is that the vertical dimension within the measurement range is nearly continuously sampled or, phrased differently, that neither the study nor the reference profile is vertically highly under-sampled. This ensures that neither instrument is blind to significantly variable parts of the profile, as only then can interpolation errors be kept to a minimum. Alternatively, interpolation difference errors could be small if both instruments have the same under-sampling pattern, but this hardly occurs in practice.

Regridding window functions for the four vertical sampling matching operations discussed in Sect.

In the horizontal and temporal dimensions, the sufficient-sampling requirement is usually far from satisfied for vertically resolved atmospheric state observations, in particular for ground-based measurements. Except for some specific measurement campaigns, station-to-station distances are usually much larger than the horizontal representativeness of the measurements, and the typical sounding frequencies (e.g., weekly) are much coarser than the characteristic measurement duration (minutes to hours) and timescale of atmospheric variability

Matching operations overview table for vertically resolved atmospheric state observations (order of appearance in the text). The averaging kernel (AK) smoothing operation shows the exemplary case for

Impact of matching operations on the information that is contained in the fractional averaging kernel matrix, as expressed by the DFS =

It is beyond the scope of this work to provide a review of all potential co-location methods, which range from simple space and time constraints to more geophysical constraints (e.g., based on potential vorticity), and even Lagrangian trajectory calculations to match as much as possible the measured air masses

Despite these attempts to optimize the co-location criteria, some irreducible co-location mismatch usually still affects the comparisons, adding non-negligible random and systematic errors to the difference statistics, and thereby hampering the interpretation of the differences in terms of the quality of the measurements and their reported uncertainties. Several approaches to quantify these co-location difference errors exist; see

The use of model data, however, also introduces some model uncertainty in the comparison results, meaning that this procedure only makes sense when the model uncertainty is (expected to be) smaller than the (spread on the) co-location mismatch errors. Moreover, a residual co-location difference error is still present, caused by finer structures in the sampling and smoothing of the observations than those accounted for by the model. This residual error can be quantified by use of an additional reference dataset that has a finer resolution than the model.

Impact of matching operations on the comparison uncertainty budget

An overview of the atmospheric state profile matching operations discussed in this work is listed in Table

While intended to merely remove uncertainty contributions from eventual atmospheric state profile difference statistics, the harmonization operations discussed in this work obviously also impact the remaining covariance (matrix) and the information that is contained within a retrieval's averaging kernel matrix. First of all, from the discussion on vertical smoothing matching one can observe that in fact all operations that include a multiplication with a non-diagonal conversion matrix also impose a vertical smoothing on the vertical profile and its covariance and averaging kernel matrices. Especially the vertical sampling matching operation combines information from several input grid levels into a single output grid level by definition. For linear and mass-conserved regridding operations, the associated vertical smoothing windows are approximately triangular and square, respectively, with an extent that is limited to adjacent grid points (see Fig.

Vertical quantity matching by use of a diagonal conversion matrix will not introduce a vertical smoothing effect, but affects the covariance matrix and the averaging kernel matrix nevertheless. This is a result of these matrices being typically provided in absolute and, thus, unit-dependent numbers. One can avoid this unit dependence by switching to fractional representations of the covariance and averaging kernel matrices instead. These are given by

The harmonization operations presented in this work are intended to enable the calculation of profile difference statistics and to eliminate uncertainty contributions from the total uncertainty budget as expressed by Eq. (

It is clear that the vertical representation harmonization operations actually do not remove uncertainty from the full budget but are required for difference calculations of atmospheric state vectors with equal units and lengths. These operations affect the product covariance and moreover introduce auxiliary representation conversion uncertainty

Two atmospheric state products with different vertical smoothing

For the (asymmetrical) averaging kernel smoothing operation, the expression in Eq. (

In terms of the uncertainty contributions that are removed from the full covariance of the difference, the AK smoothing operation is equivalent to the re-optimized prior matching (Eq.

In the context of data comparisons as performed in satellite validation and of data combinations through assimilation or fusion, this work discusses the most frequent methods for the harmonization of vertically resolved atmospheric state observations in a conceptually and terminologically aligned framework. The harmonization of two profiles' representations is mandatory for data comparisons and for proper quantitative

No research data have been used in this theoretical overview work. The plots in Fig. 2 have been created from demonstrative matrices whose construction is explained in the text and caption.

AK wrote the majority of the text. SC initiated Sect. 5 and Fig. 2. TV wrote Sect. 4.4. DH verified the algebra and text consistency. J-CL is coordinator of this research.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Towards Unified Error Reporting (TUNER)”. It is not associated with a conference.

The authors would like to acknowledge Thomas von Clarmann, Simone Ceccherini, Nicola Zoppetti, and Viktoria Sofieva for helpful discussions.

Parts of the reported work were funded by the AURORA project supported by the Horizon 2020 EU Research and Innovation program (call: H2020-EO-2015; topic: EO-2-2015) under grant agreement no. 687428, by ESA via the CCI-ECV Ozone Phase 2 project, and jointly by the Belgian Federal Science Policy Office (BELSPO) and ESA via the ProDEx project TROVA (PEA 4000116692, supporting S5PVT AO ID 28587 CHEOPS-5p). This work builds on the versatile satellite validation system Multi-TASTE that was developed in several heritage projects and refined within the EU FP7 Project Quality Assurance for Essential Climate Variables (QA4ECV; grant no. 60740) and EU H2020 project Gap Analysis for Integrated Atmospheric ECV CLImate Monitoring (GAIA-CLIM; grant no. 640276).

This paper was edited by Doug Degenstein and reviewed by two anonymous referees.