Abstract

AMT

Atmospheric Measurement Techniques

AMT

Atmos. Meas. Tech.

1867-8548

Copernicus Publications

Göttingen, Germany

10.5194/amt-19-3999-2026

Bayesian denoising of satellite images using co-registered NO2 images

Koene

Erik Franciscus Maria

erik.koene@empa.ch

https://orcid.org/0000-0002-2778-4066

Kuhlmann

Gerrit

https://orcid.org/0000-0002-7021-4712

Brunner

Dominik

https://orcid.org/0000-0002-4007-6902

1Empa, Laboratory for Air Pollution / Environmental Technology, Dübendorf, Switzerland

Erik Franciscus Maria Koene (erik.koene@empa.ch)

17June2026

19 12 39994012 11September2025 13October2025 8May2026 11May2026

2026

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

This article is available from https://amt.copernicus.org/articles/19/3999/2026/amt-19-3999-2026.html

The full text article is available as a PDF file from https://amt.copernicus.org/articles/19/3999/2026/amt-19-3999-2026.pdf

Abstract

Accurate emission tracking (e.g., locating and quantifying hot spots) using satellite images requires a good signal-to-noise ratio (SNR) of total column images. Achieving this SNR is challenging for satellite-based trace gas imagers, especially when enhancements are small relative to the background or small relative to retrieval uncertainty. Therefore, some satellites carry additional trace gas imagers with high SNR, such as NO2, which is co-emitted with the trace gas of interest. While NO2 is frequently used qualitatively for plume detection or plume fitting, its potential for quantitative noise reduction remains largely untapped. This paper presents two methods to enhance the SNR of total column images using co-registered NO2 images through minimum mean square error (MMSE) Bayesian denoising, which are a simple form of a Kalman filter or maximum a posteriori estimate. The first “joint MMSE” method relies on the presence of plumes in both the low- and co-registered high-SNR NO2 images. The second “self-similar MMSE” method utilizes image self-similarity and is based on an existing technique called BM3D. The methods are evaluated using a synthetic dataset (SMARTCARB) of atmospheric CO2 and NO2 concentrations, achieving over +40 dB improvement in peak SNR (i.e., an over 10^(40∕10) increase in SNR). Additionally, the methods are applied to TROPOMI SO2 and NO2 data over South Africa and used to compute a divergence image, demonstrating that an estimated 40 %–80 % noise reduction is possible. By enhancing the SNR of total column images, these techniques improve the detectability of subtle emission signals, which could benefit atmospheric monitoring applications.

Horizon 2020

958927

101082194

1Introduction

To quantify emissions and support climate policy, satellite-based monitoring system are developed that will detect and quantify emissions plumes from cities and large point sources (hereafter referred to as “hot spots”). To perform emission quantification for hot spots, a good signal-to-noise ratio (SNR) is essential; first to be able to detect the plumes, and second to be able to quantify the plume enhancements with good accuracy. Achieving this for satellite observations of CO2 is challenging, as enhancements (∼ 0–5 ppm) are minor compared to background levels (∼ 420 ppm) and retrieval uncertainties are high . Therefore, CO₂ monitoring satellites like the Global Observing Satellite for Greenhouse gases and Water cycle (GOSAT-GW), the Copernicus Anthropogenic Carbon Dioxide Monitoring (CO2M) mission and the Twin Anthropogenic Greenhouse Gas Observers (TANGO) mission will carry an additional NO₂ instrument. NO2 is useful because it is co-emitted with CO2 during high-temperature combustion while it can be measured with a much better SNR. NO2 thus helps delineating and thereby quantifying the low SNR CO2 plumes using emission quantification methods. So far, approaches in the literature have used the information contained in the NO2 observations mainly qualitatively. For example, they guided plume detection or constrained a Gaussian curve fitted to plume transects . One prominent emission quantification method, the divergence method , cannot effectively leverage the superior SNR of the co-registered NO2 data, as it does not depend on plume detection. As the divergence method is highly susceptible to noise in the data due to its derivative operations, proposed to apply a mean filter to prepare the noisy CO2M CO₂ images for the divergence method; however, such spatial smoothing risks blurring emission signals at the source.

In this paper, we explore two data-driven methods to enhance the SNR of trace gas images using the co-registered NO2 images (a process also referred to as “denoising”). Compared to alternatives like a model-driven denoising approach relying on meteorological, emission, and satellite noise priors, a data-driven method requires fewer assumptions, as the empirical cross-field covariance between co-registered images is learned directly from the data. An example of the proposed methods is given in Fig. , which illustrates the effectiveness in reducing noise in a SO2 image recorded by the TROPOspheric Monitoring Instrument TROPOMI;. The two methods are minimum mean square estimators (MMSEs). The first is based on the joint information in CO2 and NO2 pixels. The second is based on image self-similarity. Put simply, the MMSEs we present are operations that extract a denoised signal from two (or more) noisy inputs in a Bayesian optimal way, much like a Kalman filter. We define the estimators in the theory section, and suggest to chain them in series to provide the best results. Within the results section of the paper, we verify the method by applying it to synthetic CO2M data to denoise synthetic CO2 images. We then show a “real data” example using combined TROPOMI SO2 and NO2 data and use the denoised SO2 data to denoise the corresponding divergence map.

Figure 1

Example of the denoising procedure for a South African region, recorded by TROPOMI on 20 February 2021. Axis labels are omitted to emphasize the clarity of the denoised image rather than geographical location. Data with a quality factor below 0.75 is masked and appears as plain gray. By optimally combining the (a) “noisy” SO2 and (b) less “noisy” NO2 images, we create a (c) denoised SO2 image. We highlight some features of the denoised image. Square A shows that in areas without signal, noise is effectively removed from the image. Rectangle B indicates that plume signals present in the noisy data are retained, resulting in an enhanced signal-to-noise ratio. Arrow C illustrates an east-west (i.e., right-left) feature with low amplitude but sharp, high contrast edges, identifiable in the original SO2 image but absent in the NO2 image, confirming that significant “positive” and “negative” signals are preserved during denoising. Circle D shows a feature of high amplitude in the NO2 image which is absent in the SO2 image, indicating we do not add signal unduly. The noise level estimate is derived from .

2Methods

The two denoising methods presented in the following will be referred to as the “joint MMSE” approach and the “self-similar MMSE” approaches. The former is a novel innovation, whereas the latter is a pre-existing method from the field of computer vision, which we adapt for denoising co-registered images.

2.1Joint MMSE (jMMSE)

In this section, we explore a method that makes use of the joint information in two co-registered signals at the pixel level. The theory may alternatively be derived from a Bayesian inference point of view, as shown in Sect. S1 in the Supplement.

2.1.1Observation model

Satellite data of two co-registered pixels of, say, CO2 and NO2 follow a general model like 1CÕ2NÕ2=CO2NO2+nCO2nNO2, where the tildes indicate noisy observations; CO2 and NO2 denote the noise-free but unknown true values, and nCO2 and nNO2 indicate the noise on the measurements. We can rewrite this model into a coupled observational model by making it a function of the noise-free CO2 data, 2CÕ2NÕ2=1d(x,y)CO2+nCO2nNO2=M̃=Hc+n where M̃ contains the two noisy observed pixels, c=CO2 is the noise-free column, H=[1d(x,y)]T is the observation operator with a spatially varying function d(x,y) that transforms the CO2 pixel into an equivalent NO2 observation, and n contains the two noise components.

2.1.2The maximum a posteriori solution

Our aim is to estimate c (the unknown noise-free CO2 column) from M̃ (the noisy observations). This can be written as a maximum a posteriori problem with a Gaussian distributed prior with mean E[c], noise mean E[n]=0 and independent errors E[cn]=E[c]E[n]=0, yielding a minimum mean square error (MMSE) optimal estimate of the underlying CO2 field, which we will denote by c^, 3c^=arg mincEσc-2(c-E[c])2+(Hc-M̃)TCnn-1(Hc-M̃), where σc2=E[c2]-E[c]2 is the variance of the expected prior, and Cnn=E[nnT] is the noise covariance matrix. See Appendix A for details how such quantities may be computed in practice. The solution to this problem is well-known e.g.,Eq. 6.8, 4c^=HTCnn-1M̃+σc-2E[c]HTCnn-1H+σc-2. The solution in Eq. () is the maximum a posteriori solution, also known as the generalized least squares solution or the Bayesian linear estimator, which is mathematically also identical to a single prediction step in a Kalman filter framework without recursive time updates e.g.,Eqs. 6.13–6.14.

2.1.3Solution using only the available data

The solution in Eq. () is elegant but impractical, as it requires one to know H (i.e. the true NO2 : CO2 ratio d(x,y) for every pixel). However, in the following, we show that the solution can be rewritten into a form that depends merely on the data itself. For this, we take a closer look at the data covariance matrix, which can be estimated from the data itself (i.e., the sample covariance matrix): 5Cdd=E[M̃M̃T]-E[M̃]E[M̃]T=cov(CÕ2,CÕ2)cov(CÕ2,NÕ2)cov(CÕ2,NÕ2)cov(NÕ2,NÕ2). Given the model of Eq. (), we can also write it as (making judicious use of E[cn]=0), 6Cdd=E(Hc+n)(Hc+n)T-EHc+nEHc+nT,7=E[nnT]-EnEnT+HHTE[c2]-E[c]2,8=Cnn+HHTσc2. As detailed in Sect. S3.1, we can derive a matrix inversion identity from the Sherman–Morrison formula, A(A+BCDT)-1=I-BDTA-1/(DTA-1B+C-1), which yields the following relation, 9CnnCdd-1=I-HHTCnn-1HTCnn-1H+σc-2, for which we note that the right-hand side closely resembles Eq. (). By rearranging terms, pre-multiplying with a vector wT=[10] that satisfies wTH=1, post-multiplying the result with (M̃-E[M̃]) and adding the expected prior column E[c], we obtain 10wTI-CnnCdd-1M̃-E[M̃]+E[c]=HTCnn-1M̃+σc-2E[c]HTCnn-1H+σc-2, (the details of this step are given in Sect. S3.2).

The denoised column estimate c^ of the Bayesian optimal solution of Eq. () may thus be obtained entirely from the data itself, using the left-hand side of Eq. (). It relieves us of the need to know the forward model H that maps the noise-free CO2 field into NO2 columns. Simplifying the left-hand side of Eq. (), the details of which are given in Sect. S3.3, we obtain the optimal joint MMSE estimate, 11c^=CÕ2-wTCnnCdd-1(M̃-E[M̃]). The various covariance matrices and expected values need to be computed using small patches of size T×T for small values of T (e.g., 5) around a given pixel. See Appendix A for an example implementation in the Python programming language, and Sect. S2 for an explicit version of the jMMSE estimate without vector notation.

The ratio CnnCdd-1 in Eq. () is quite literally the inverse of the SNR. Thus, in regions of a high SNR (CnnCdd-1≈0) we simply keep the measurement as it is, c^=CÕ2. In regions without enhanced signals, we have Cnn=Cdd⟺CnnCdd-1=I and thus take the expected value c^=E[c], e.g., the local mean or local median. Conversely, noise will be optimally subtracted in the case of a lower SNR (CnnCdd-1>I) based on the correlations between the CO2 and NO2 measurements. Hence, the derived expression has all the properties that we would expect from a SNR perspective.

2.2Self-similar MMSE (BM3D)

An alternative method for denoising is called block matching and 3D filtering (BM3D). This method was introduced by in the field of computer vision. It is another MMSE method, but this time it makes use of the self-similarity of patches within single color images (with three channels) to denoise them. We adapt it for denoising joint satellite images using two channels by linearly combining min-max-normalized CO2 and NO2 data (i.e., fitting the data range of the two satellite images into the range 0 to 1) into the first channel with a factor 0.5 each, and placing the min-max-normalized NO₂ image into the second channel. After computing the denoised estimates for both channels, we subtract the denoised NO₂ channel (with a factor 0.5) from the first channel to extract the denoised CO2 signal. Note that the factor 0.5 is one of the two hyperparameters that can be modified when applying BM3D (the other being the prior observation error, σc, mentioned in the following).

BM3D is still considered to be a state-of-the-art image denoising algorithm e.g.,, and computes the following MMSE result for the noise-free CO2 field (compare with Eq. ): 12c^(x)=T-1Cnn-1(f)Cnn-1(f)+σc-2TCO2̃(f), which is also known as a “Wiener (deconvolution) filter”. Operators T and T-1 are 3D wavelet transforms and their inverses, which project the image pixel space (x) into frequency domains (f). Compared to Eq. (), BM3D works with a scalar quantity rather than a vector quantity for each frequency, and the observation operator H is simply replaced by 1. The factor Cnn-1(f) is given by Cnn-1(f)=T[CO2](f)2. Thus, Cnn-1 is the energy of the true (noisefree) CO2 signal in the wavelet transformed domain. High spectral energy implies low noise and vice versa. Of course, the noisefree signal is not available, so the Wiener filter of Eq. () is not actually computable. BM3D circumvents this problem by first obtaining an estimate of the noisefree CO2 data through an initial filtering step, which is used instead of the true noisefree signal in Eq. ().

BM3D manages to achieve good performance using the assumption of image self-similarity (i.e., small patches of similar-looking data repeat throughout an image). If one can find several of such similar-looking patches in the image, and takes their mean, then random noise should be attenuated this is called “non-local means”,. The first estimate in BM3D is obtained in a similar manner. More precisely, first, an 8×8 image patch is selected and N similar patches are found in the image. Second, an N×8×8 “3D block” is formed of these patches. Third, the 3D blocks are transformed into the wavelet domain using a 3D wavelet transform T and denoised using a hard thresholding step (i.e., frequency components with low energy are removed). Fourth, after an inverse wavelet transform T-1, the N denoised patches are moved back to their respective spots in the image. This process is repeated for each image patch. The fifth step is to repeat the entire process, except that the denoising now uses Wiener deconvolution of Eq. () with Cnn-1 defined by the first denoised estimate, yielding the MMSE of the final image. The method is sketched in Fig. .

Figure 2

A schematic explanation of BM3D. In stage 1, similar looking patches are collaboratively denoised to produce a first denoised estimated image. In stage 2, similar looking patches are selected from the first estimate, and corresponding patches from the original input, form two blocks. Using a Wiener filter, the original image patches are denoised, leading to the final denoised image. The steps are carried out for all possible patches in the image.

BM3D denoises color images by forming a composite channel that contains the summed red, green, and blue image data. This composite channel is used for patch selection (step 1, above). For the remaining channels, the same patches are used, but each channel is denoised individually. We propose the same to make the process work for CO2 and NO2 images: we normalize the CO2 and NO2 images, and then form one channel of (CO2+NO2)/2 and one channel of just NO2. Patch selection is carried out on the first channel (the mean of the normalized CO2 and NO2 images), but denoising of the patches is carried out on both channels individually. By subtracting the second channel from the first, we end up with a new CO2 image, which was helped by the higher signal-to-noise ratio of the NO2 image during patch selection and denoising. A reference implementation in Python can be used that is called “bm3d” on pypi by .

2.3Sequential denoising using the two presented methods

As the two methods (joint MMSE and BM3D) are sufficiently different in the structural features they use to denoise the data, it stands to reason that an application of both BM3D (to provide an initial cleaned up version of the data) followed by the joint MMSE (to further enhance the signal) will have the potential to further denoise the data. In this paper, we also test this sequential denoising method.

3Results 3.1Performance metrics

We score the performance of the methods where the truth is available using the two most common metrics in computer vision. The first is the peak signal-to-noise ratio (PSNR) in units of decibel, i.e. a higher value means a better performance, 13PSNR=10log⁡10(max⁡(c)-min⁡(c))21nxny∑ix∑jy(c^ix,jy-cix,jy)2, where c≡cix,jy is the true (noise-free) signal and c^≡c^ix,jy is the estimated signal, indexed over all 2D pixels ix and jy. The second metric is the structural similarity index measure SSIM;again, a higher value means a better performance, 14SSIM(x,y)=(2μxμy+c1)(2σxy+c2)(μx2+μy2+c1)(σx2+σy2+c2), where x and y are 7×7 tiles/patches from images c and c^ respectively, μx and μy are their sample averages, σx and σy are the sample standard deviations, σxy is their covariance, and c1=(0.01(max⁡(c)-min⁡(c)))2 and c2=(0.03(max⁡(c)-min⁡(c)))2. The PSNR is very sensitive to random noise, while the SSIM is very sensitive to image artifacts such as blurring. Consequently, we want the PSNR and the SSIM to improve simultaneously.

We can make a noise estimate using the algorithm described in , which compares a grid-aligned Laplacian estimate with a diagonal Laplacian estimate, to estimate the noise standard deviation for Gaussian (i.e., white or random) noise as 15σest=π216(W-2)(h-2)∑pixelsI(x,y)∗N where W and H, respectively, are the width and height of the trace gas image I, and where ∗ denotes a spatial convolution with the 2-D kernel 16N=1-21-24-21-21.

3.2Application to synthetic joint CO2 and NO2 images

In this section, we will apply the algorithms presented above to synthetic CO2M CO₂ and NO₂ satellite images from the SMARTCARB dataset . The SMARTCARB dataset is a synthetic quasi-Level 2 product at the CO2M spatial resolution (roughly 2×2 km²) and swath length (roughly 250 km) over primarily Germany and surrounding regions, for the year 2015. As it is essential that plume signals look “similar” for input to the MMSE methods, we will use column-averaged dry-air mole fractions of CO2 (XCO2) data (in ppmv) and tropospheric NO2 column densities (in molec. cm⁻²). The reason for using XCO2 rather than CO2 column densities is that the latter are strongly susceptible to surface topography variations. One could also use XNO2 images, but the topographic effect on NO2 images is typically negligible. Hence, we will essentially use the “standard” data products. When denoting the results with the joint MMSE, the parameter T refers to the window size used to compute expected values within the joint MMSE method. For example, T=5 means that we select a 5×5 region centered on a pixel, which for CO2M is a region of about 10×10 km.

We will refer to results from the joint MMSE as “jMMSE” and from the BM3D method as “BM3D”. Additionally, we show the results from applying a simple 5×5 pixel mean filter (denoted as “5×5 mean filter” or “5×5 filter”) to purely the CO2 data, which was proposed in as a simple but effective method to prepare the noisy SMARTCARB data for the divergence method. Figures – show an example of the denoising methods applied to synthetic CO2M CO2 and NO2 images. The examples use the “high noise” scenario of the SMARTCARB dataset with random errors of σVEG50=1 ppm for XCO2 (the VEG50 scenario uses vegetation albedos and solar zenith angle of 50°; ) and σ=2×1015 molec. cm⁻² for NO2. We select the simulation day 23 October 2015, and focus on the coal-fired power plants Prunéřov

50.42° N 13.26° E; simulated with emissions of 11.4 Mt CO2 yr⁻¹ and 11.3 kt NO2 yr⁻¹.

and Počerady

50.43° N 13.68° E; simulated with emissions of 9.3 Mt CO2 yr⁻¹ and 9.2 kt NO2 yr⁻¹

. These mid-sized power plants were selected as their emissions produce only weak plume enhancements compared to the CO2 measurement noise level. Figure shows that the simulated high noise on the CO2 signal largely obscures the signal of the power plants, while the high noise on the NO2 signal does not cause considerable changes with respect to the “true” simulated NO₂ field. We can see that the 5×5 pixel mean filter does not manage to recover much of the CO2 signal. Conversely, applying the joint MMSE to the two noisy input fields recovers much of the CO2 images for a window size T=9 – see Sect. S5 for images of other window sizes. The BM3D method (panel c) performs roughly equal to the joint MMSE method with T=9 (panel b). We obtain the highest objective score by sequentially applying the joint MMSE method with T=9 to the BM3D results (panel d), with a visibly good fit to the noisefree CO2 signal, as well as an eightfold improvement of the SSIM and an increase in PSNR by +45.1 dB when going from the noisy image to the denoised image. To put this into context, consider that for Gaussian white noise, averaging X images with noise variance σnoisy yields σdenoised2=σnoisy2/X. Their PSNR improvement in dB may be expressed as 10log⁡10σnoisy2/σdenoised2=10log⁡10X. Using this relation, we we obtain here that X=1045.1/10≈32000, i.e., the image denoised with BM3D and jMMSE for T=9 has the same noise characteristics as if we would have averaged 32 000 images with these Gaussian independent high noise characteristics. Thus, the joint information content in CO2 and NO2 images is very large.

Figure 3

An example of a synthetic CO2M satellite image on 23 October 2015 for a “high noise” scenario zooming on the emission plumes of the coal-fired power stations near Prunéřov and Počerady.

Figure 4

The noisy data from Fig. b denoised using the jMMSE methods. Visually, it is clear that the plumes originally obscured by noise become visible again. The PSNR and SSIM scores have increased (indicating improvement). The best denoising performance is obtained by the combination of BM3D and the jMMSE method for T=9, with a +45.1 dB improvement.

3.3Application to joint SO2 and NO2 TROPOMI images

The method is also evaluated using Level-2 trace gas products from Sentinel-5P/TROPOMI, which provides near-daily global coverage at a spatial resolution of approximately 7×3.5 km² at nadir. We use reprocessed (RPRO) SO2 data (processor version 02.04.01) and NO2 data (processor version 02.04.00) for the year 2021 . We perform the denoising using a qa_value > 0.35 to retain some more data for the denoising methods to work with, but after denoising, we analyse and plot the results using the standard recommended threshold of qa_value > 0.75. The SO2 product used here corresponds to the SO2 Differential Optical Absorption Spectroscopy (DOAS) RPRO product. The more recent operational RPRO product is obtained using the Covariance-Based Retrieval Algorithm (COBRA), which has significantly lower noise and a corrected bias in the retrieval. As a consequence, the denoising performance demonstrated here for SO2 should be interpreted in the context of this earlier processor version; results may differ for COBRA-based products due to their improved signal-to-noise ratio and potentially different error correlation structure. To better represent surface emissions, an air mass factor correction is applied to the SO2 images, dividing the total column by the average of the three lowest averaging kernel weights

Following equation SO2,new=(∑lxl′/∑lAlxl′)SO2,old with x′=[0,…,0,1,1,1] from the top of the atmosphere to the surface and A is the averaging kernel. The reasoning is that we look only at power plant emissions, so we expect all the SO2 mass to be present close to the surface.

. While a more detailed investigation of air mass correction factors would benefit accurate emission estimates, that is beyond the scope of this study.

Figure a–d shows a TROPOMI overpass image over South Africa centered at Johannesburg for 3 March 2021, along with the ratio of the reported column precision to column values. This latter aspect is the inverse SNR (or, equivalently, the noise-to-signal-ratio). The NO2 image has substantially larger regions of low inverse SNR values (indicative of a good SNR) than that found for SO2. Hence, the SNR for NO2 is much better than that for SO2. Figure e shows that the SO2 signal can be improved using the NO2 signal with the multichannel BM3D method. The noise reduction becomes greater when the combination of the BM3D and jMMSE method is applied (panel f). Note that we use the jMMSE using T=5, which effectively implies a window of 25×24.5 km² at nadir, which is slightly larger than what we used for the SMARTCARB test. Figure g–h illustrates the changes compared to the original TROPOMI image, indicating substantial noise reduction. What is notable is that values close to 0 in Fig. g–h are exactly those regions with good SO2 SNR as shown in Fig. d. Hence, denoising has primarily removed “bad” signal while preserving “good” signal. Furthermore, the removed noise as shown in g–h is essentially feature-less and consists of random speckles. Had we seen plume-like features in Fig. g–h we would know that we were subtracting signal and not just noise, but this is clearly not the case. Thus, subtracting the denoised image from the original image provides an easy way to check if signal was added or destroyed.

Figure 5

Single TROPOMI overpass image on 3 March 2021 of (a) NO2 and (c) SO2. All pixels with a qa_value larger than 0.75 are plotted. Panels (b) and (d) plot the inverse of an estimate of the SNR, where values close to zero are assumed to be of high SNR. Panels (e) and (f) show an application of the (multichannel) BM3D and additionally the jMMSE. Panels (g) and (h) show the difference between the estimates and the original SO2 image.

If we apply the noise estimation method from to our image, we find that the original SO2 image has σest,original=4.1×1024 molec. cm⁻² (which is 1.7 times the mean reported column precision for this overpass), while σest,BM3D=1.2×1024 molec. cm⁻² and σest,BM3D+jMMSET=5=5.9×1023 molec. cm⁻²; in other words, a relative improvement of about 70 % or 86 %, respectively.

When considering a full year of observations (only selecting observations with a qa_value larger than 0.75) we find that the original SO2 image has a mean noise estimate following the method of of σ‾est,original=4.2×1024 molec. cm⁻², the BM3D average estimated noise is σ‾est,BM3D=2.3×1024 molec. cm⁻², and σ‾est,BM3D+jMMSET=5=1.4×1024 molec. cm⁻² – that is, a 47 % and 66 % improvement in noise characteristics, respectively.

To further illustrate the advantage of this methodology, we present annual SO2 divergence maps in Fig. , i.e., computations of ∇×(ueffSO2) averaged over a full year, where ueff is the 2-D vector containing the effective horizontal wind. In this case, wind fields were computed by vertically averaging ERA-5 reanalysis fields using the GNFR-A emission profile as weights. (We present an identical analysis for the SMARTCARB dataset in Sect. S4 where we show that the denoising methods, with the exception of the 5×5 pixel mean filter, do not introduce any biases, supporting the validity of the denoising approach for emission quantification.) The divergence was computed on the TROPOMI overpass coordinate system, and then remapped to a common 0.03° grid. We consider the divergence map after applying BM3D to the overpasses in Fig. b, a 5×5 pixel mean filter to each TROPOMI overpass as suggested by in Fig. c, and the BM3D+jMMSE T=5 approach in Fig. d. The 5×5 pixel mean filter reduces noise but considerably smears the signal; in contrast, the other two methods better suppress noise while retaining source sharpness. Using the method of , we observe a 94 % noise reduction with the 5×5 pixel mean filter, while the BM3D approach reduces noise by 49 % and the BM3D+jMMSE approach by 68 %.

Figure 6

Annual divergence maps for a region in South Africa. (a) The original “noisy” TROPOMI SO2 divergence map, (b) the estimate from applying BM3D to the SO2 and NO2 field prior to taking the divergence, (c) the estimate from a 5×5 mean filter to the noisy SO2 prior to taking the divergence, (d) the estimate from the optimal SO2 image using the proposed BM3D+jMMSE T=5 method prior to taking the divergence. A number of known sources are encircled.

To demonstrate that the denoising minimally impacts the signal structure, we compare emission estimates from different divergence fields in Fig. (a more detailed examination of the estimates is provided in Sect. S5, where we also present a “difference” plot between the “noisy” divergence image and the denoised estimates). We focus on the ratio of the original “noisy” TROPOMI data estimates to those obtained with the denoising approaches. Emission estimates were obtained by spatial integration after masking data more than 0.15° from the point source location, corresponding to an integration radius of 16.7 km longitudinally and 15 km latitudinally. It is clear that the 5×5 pixel mean filter estimates significantly deviate from those based on noisy data. In contrast, estimates using the BM3D and BM3D+jMMSE T=5 approaches are closer to the noisy data estimates, although some deviation is present. Sources where these two methods significantly differ from the noisy data are weakly represented in the SO2 and/or NO2 dataset. This indicates that the noisy data may not accurately capture their presence and thus may not serve as reliable ground truth for comparison. In fact, the major deviations in Fig. , the Newcastle Steel Works and Camden Power Station, yield negative emissions when using the original noisy data. Therefore, the divergence method does not provide a reliable estimate of these source emissions in these cases. This may be due to a (combination of a) variety of reasons, e.g., there was not enough data to produce a robust average; there was a temporal bias in the sampling (the divergence method needs satellite images with plumes being blown in all directions to produce a “point” on a map; otherwise “dipoles” with seeming positive and negative sides appear – so if our overpass consistently samples cases where the wind is blowing in one direction, we will get artifacts); the steady state assumption was invalid; our chosen effective wind is not appropriate for the terrain and/or plume.

Figure 7

SO2 emission estimate ratios from integrating along a ∼ 15 km radius around point source locations across the divergence maps shown in Fig. . A “blue” color in the ratio columns corresponds to an underestimation compared to the “noisy” SO2 divergence map, while a “red” color in the ratio columns corresponds to an overestimation. Note, that the “noisy” SO2 divergence map is not necessarily a ground truth, and this plot is shown for illustrative purposes only.

4Discussion

The joint MMSE and BM3D approaches aim to denoise data while preserving the signal. The examples demonstrate the effectiveness of these methods, but they also have limitations.

Firstly, these methods assume independent and identically distributed (i.i.d.) noise with zero mean. Consequently, structural noise patterns, such as stripes, may not be denoised and could be misinterpreted as meaningful signals by BM3D due to the self-similarity of structural noise. While tuning BM3D for such cases is possible with prior knowledge of noise distribution in the 2D wavenumber domain , this knowledge is often absent for satellite data. Another violation of the i.i.d. assumption may show up in the form of correlated errors in both satellite products due to shared dependencies in the radiative transfer (such as albedo, cloud contamination or aerosol loading). These factors enter both retrievals, thereby introducing a common-mode error component. Mitigation strategies could include restricting the analysis to scenes with low sources of such errors, or extending the statistical model presented here to include such correlated errors or biases. Consequently, users should interpret the denoised image with caution in regions where strong scene-dependent retrieval artifacts are expected, as these features, shared between the low and high SNR images, may be preserved or even reinforced in the denoised output.

Secondly, the practical implementation of jMMSE requires spatial stationarity, meaning the ratio of NO2 : CO2 column densities (or similarly chosen trace gases) should be approximately constant within a window of T×T pixels. It is clear that there is no globally fixed NO2 : CO2 ratio, and although this is approximately true when we focus on a small region, NO_x chemistry inside plumes will change ratios inside plumes . Hence, the value for T must be kept as low as possible. The competing interest, of course, is that for robust statistics, T should be chosen as large as possible. This raises the question of how to appropriately choose T. This can be based on subtracting the original image from the denoised image. Whatever shows up as structureless features there is noise, while whatever shows up as structured noise is likely the removal of real signal (e.g., this might happen if the difference shows something that looks like a plume and is indeed coincident with a plume on the original data). If one can grow T, but at some value of T the noise rejection does not improve anymore, then one has found the optimal T for noise rejection. Conversely, if one can grow T but at some point one is starting to reject also signal, then one can say they have found the optimal T to retain the signal. This argument suggests that spatial stationarity is best satisfied over small regions and indicates that denoising will be more applicable in high-resolution rather than coarse-resolution satellite images. The authors believe that rather than prescribing a value of T in this paper, it should remain an open hyperparameter for end users to decide. However, in the Supplement we demonstrate two example workflows for selecting the parameters based on a simple grid search – either by minimizing an objective function that maximizes noise reduction while minimizing signal bias, or by performing a simple grid search optimization for the remaining hyperparameters. Once optimal parameters are found to work for a handful of images, they generally work well for a full dataset.

5Conclusions

We presented two minimum mean square error (MMSE) estimators that enhance the signal-to-noise ratio (SNR) of noisy CO2 or SO2 images using co-registered NO2 images from the upcoming GOSAT-GW and CO2M satellites, as well as from Sentinel-5P. These methods enhance the visibility of plumes that are hard to discern in the noisy images. The first method, joint MMSE, preserves plumes with good SNR (i.e., where the signal is strong or highly correlated with the NO2 field) while subtracting noise elsewhere. The second method, BM3D, leverages image self-similarity by denoising a linear combination of normalized CO2 or SO2 and NO2 images. The best outcomes result from combining both estimators, initially applying BM3D for denoising, followed by joint MMSE for further refinement of the CO2 or SO2 image.

We demonstrate the effectiveness of these techniques in two case studies. In synthetic data tests, the denoising process improves peak SNR by more than 40 dB. When applied to TROPOMI SO2 and NO2 images over South Africa, or their annual divergence maps, we observe that a 30 %–60 % reduction in noise levels is possible, while leaving plume structures intact.

The proposed denoising methods can enhance plume detection for single-overpass images and averaged satellite datasets. These techniques improve plume visibility and may assist in plume emission quantification methods, such as Gaussian plume inversion, cross-sectional flux methods, or the divergence method, by providing cleaner input data. Therefore, by systematically reducing noise in total column images, this approach strengthens satellite capabilities for monitoring atmospheric emissions with greater precision.

Appendix APython implementation

In order to implement Eq. (), we need to compute the quantities Cnn, Cdd and E[M] with sufficient accuracy. In this section, we give some possible ways to obtain these quantities.

The expected column E[M] can be defined as the median value of a T×T patch around any given pixel.

The data covariance matrix Cdd is defined by Eq. (). To estimate it from the data (the “sample covariance matrix”), we form two row vectors of size 1×T2 – for example, CO2̃ as a 1×T2 row vector containing all CO2̃ observations in a T×T patch around the pixel, and NO2̃ as a 1×T2 row-vector containing all NO2̃ observations around the pixel. Defining the sample deviation matrix as a 2×T2 array, M=[CO2̃TNO2̃T]T-E[M], we may compute the 2×2 data covariance matrix, A1Cddsample=1T2-1MMT.

As we want to use a small value for T, it makes sense to use covariance shrinkage operators, which makes the estimation more robust . A simple operation is to get the eigenvector decomposition Cddsample=UΛUT, and reconstruct it with modified eigenvalues as A2Cddest=UΛ′UT, where Λ′=(αΛ+(1-α)Diag(nCO2,nNO2)) contains the modified eigenvalues based on the expected noise characteristics of the data, and Diag(×,×) forms a diagonal matrix with the elements given. For α=0.5 and nCO2 and nNO2 as diagonal entries of Cddsample we, for example, mix the sample covariance matrix and a diagonalized covariance matrix.

The noise covariance matrix Cnn corresponds to the precision of the instrument. If the noise is uncorrelated and known as σCO22 and σNO22 for the two measurements, that simply corresponds to Cnninstrument = Diag(σCO22,σNO22). Alternatively, if this type of data is not available, we may estimate it as the median covariance of a single overpass (or image), the idea being that a typical image contains primarily “noise” and only a limited amount of “signal” (i.e., hot spot enhancements) such that the median covariance matrix of the data is representative for the noise.

We can give a straightforward example of an implementation of this algorithm in about 100 lines of Python code. Special care is taken of missing data through use of the numpy “mask” feature. Along with the robust estimator for Cdd, we also include the capacity to make sure Cdd has a stable inverse by clipping the conditioning number and by adding small values along its diagonal. The code given here is given as an example of how the quantities used in the theory could be computed in practice.

ijl',wCddICnn[...,0].squeeze(),D) est_gather = np.zeros((*CO2.shape, T**2)) * np.nan for i in range(T**2): y, x = np.mod(i,T)-T//2, i//T-T//2 xs, xe = max((T//2)+x,0), min(CO2.shape[0]-(T//2)+x, CO2.shape[0]) ys, ye = max((T//2)+y,0), min(CO2.shape[1]-(T//2)+y, CO2.shape[1]) est_gather[xs:xe, ys:ye, i] = wCddICnnEM[:,:,i] # --- Generate filter grid coordinates y, x = np.meshgrid(np.arange(-T//2+1, T//2+1), np.arange(-T//2+1, T//2+1)) weights_2dgauss = np.exp(-(x**2 + y**2) / (2 * 4**2)) # --- Compute final output pred = CO2 - nanaverage(est_gather,W=weights_2dgauss.flatten()) est = np.where( np.isfinite(CO2), pred, np.nan ) est = np.where( np.isnan(NO2) & np.isnan(est), CO2, est ) return est[T:-T,T:-T] ]]>

Code and data availability

A C++ implementation of BM3D may be obtained from https://github.com/gfacciol/bm3d (last access: 21 August 2025), although in this paper we use a Python implementation from https://pypi.org/project/bm3d/ (last access: 6 September 2024). The code which implements the joint MMSE is added as a supplement to this paper, along with example data (one example from the SMARTCARB dataset and one example from the TROPOMI dataset), which can be used to reproduce Figs. , –.

The supplement related to this article is available online at https://doi.org/10.5194/amt-19-3999-2026-supplement.

Author contributions

EK derived and implemented the jMMSE algorithm and BM3D method modification. EK, GK, and DB all contributed equally to the writing process.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 958927 (CoCO2) and the European Union's Horizon Europe programme under grant agreement no. 101082194 (CORSO), with additional funding by the Swiss State Secretary for Education, Research and Innovation (SERI) no. 22.00422. We gratefully acknowledge CSCS for the computing resources (project s1302).

Financial support

This research has been supported by the EU Horizon 2020 (grant nos. 958927 and 101082194) and the Swiss State Secretary for Education, Research and Innovation (SERI) no. 22.00422.

Review statement

This paper was edited by Ilse Aben and reviewed by two anonymous referees.

References Beirle et al.(2019)

Beirle, S., Borger, C., Dörner, S., Li, A., Hu, Z., Liu, F., Wang, Y., and Wagner, T.: Pinpointing nitrogen oxide emissions from space, Sci. Adv., 5, eaax9800, 10.1126/sciadv.aax9800, 2019.

Beirle et al.(2023)Beirle, Borger, Jost, and Wagner

Beirle, S., Borger, C., Jost, A., and Wagner, T.: Improved catalog of NO_x point source emissions (version 2), Earth Syst. Sci. Data, 15, 3051–3073, 10.5194/essd-15-3051-2023, 2023.

Buades et al.(2005)Buades, Coll, and Morel

Buades, A., Coll, B., and Morel, J.-M.: A review of image denoising algorithms, with a new one, Multiscale Model. Simul., 4, 490–530, 10.1137/040616024, 2005.

Buades et al.(2011)Buades, Coll, and Morel

Buades, A., Coll, B., and Morel, J.-M.: Non-local means denoising, Image Process. Line, 1, 208–212, 10.5201/ipol.2011.bcm_nlm, 2011.

Buchwitz et al.(2013)Buchwitz, Reuter, Bovensmann, Pillai, Heymann, Schneising, Rozanov, Krings, Burrows, Boesch et al.

Buchwitz, M., Reuter, M., Bovensmann, H., Pillai, D., Heymann, J., Schneising, O., Rozanov, V., Krings, T., Burrows, J. P., Boesch, H., Gerbig, C., Meijer, Y., and Löscher, A.: Carbon Monitoring Satellite (CarbonSat): assessment of atmospheric CO₂ and CH₄ retrieval errors by error parameterization, Atmos. Meas. Tech., 6, 3477–3500, 10.5194/amt-6-3477-2013, 2013.

Copernicus Sentinel-5P(2020)

Copernicus Sentinel-5P (processed by ESA): TROPOMI Level 2 Sulfur Dioxide Total Column products, Version 02, European Space Agency [data set], 10.5270/S5P-74eidii, 2020.

Copernicus Sentinel-5P(2021)

Copernicus Sentinel-5P (processed by ESA): TROPOMI Level 2 Nitrogen Dioxide total column products, Version 02, European Space Agency [data set], 10.5270/S5P-9bnp8q8, 2021.

Dabov et al.(2007)

Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K.: Image denoising by sparse 3-D transform-domain collaborative filtering, IEEE Trans. Image Process., 16, 2080–2095, 10.1109/TIP.2007.901238, 2007.

Fichtner(2021)

Fichtner, A.: Lecture Notes on Inverse Theory, Tech. rep., ETH Zurich, Switzerland, 10.33774/coe-2021-qpq2j, 2021.

Hakkarainen et al.(2022)

Hakkarainen, J., Ialongo, I., Koene, E., Szeląg, M. E., Tamminen, J., Kuhlmann, G., and Brunner, D.: Analyzing local carbon dioxide and nitrogen oxide emissions from space using the divergence method: an application to the synthetic SMARTCARB dataset, Front. Remote Sens., 3, 878731, 10.3389/frsen.2022.878731, 2022.

Immerkaer(1996)

Immerkaer, J.: Fast noise variance estimation, Comput. Vis. Image Underst., 64, 300–302, 10.1006/cviu.1996.0060, 1996.

Koene et al.(2024)Koene, Brunner, and Kuhlmann

Koene, E. F. M., Brunner, D., and Kuhlmann, G.: On the theory of the divergence method for quantifying source emissions from satellite observations, J. Geophys. Res.-Atmos., 129, 10.1029/2023JD039904, 2024.

Krol et al.(2024)Krol, van Stratum, Anglou, and Boersma

Krol, M., van Stratum, B., Anglou, I., and Boersma, K. F.: Evaluating NO_x stack plume emissions using a high-resolution atmospheric chemistry model and satellite-derived NO₂ columns, Atmos. Chem. Phys., 24, 8243–8262, 10.5194/acp-24-8243-2024, 2024.

Kuhlmann et al.(2019)Kuhlmann, Broquet, Marshall, Clément, Löscher, Meijer, and Brunner

Kuhlmann, G., Broquet, G., Marshall, J., Clément, V., Löscher, A., Meijer, Y., and Brunner, D.: Detectability of CO₂ emission plumes of cities and power plants with the Copernicus Anthropogenic CO₂ Monitoring (CO2M) mission, Atmos. Meas. Tech., 12, 6695–6719, 10.5194/amt-12-6695-2019, 2019.

Kuhlmann et al.(2020a)Kuhlmann, Clément, Marshall, Fuhrer, Broquet, Schnadt-Poberaj, Löscher, Meijer, and Brunner

Kuhlmann, G., Clément, V., Marshall, J., Fuhrer, O., Broquet, G., Schnadt-Poberaj, C., Löscher, A., Meijer, Y., and Brunner, D.: SMARTCARB – Use of satellite measurements of auxiliary reactive trace gases for fossil fuel carbon dioxide emission estimation, Zenodo, 10.5281/zenodo.4034266, 2020a.

Kuhlmann et al.(2020b)Kuhlmann, Clément, Marshall, Fuhrer, Broquet, Schnadt-Poberaj, Löscher, Meijer, and Brunner

Kuhlmann, G., Clément, V., Marshall, J., Fuhrer, O., Broquet, G., Schnadt-Poberaj, C., Löscher, A., Meijer, Y., and Brunner, D.: Synthetic XCO₂, CO and NO₂ observations for the CO2M and Sentinel-5 satellites, Zenodo [data set], 10.5281/zenodo.4048228, 2020b.

Kuhlmann et al.(2021)Kuhlmann, Henne, Meijer, and Brunner

Kuhlmann, G., Henne, S., Meijer, Y., and Brunner, D.: Quantifying CO₂ emissions of power plants with CO₂ and NO₂ imaging satellites, Front. Remote Sens., 14, 10.3389/frsen.2021.689838, 2021.

Ledoit and Wolf(2022)

Ledoit, O. and Wolf, M.: The power of (non-)linear shrinking: A review and guide to covariance matrix estimation, J. Financ. Econom., 20, 187–218, 10.2139/ssrn.3384500, 2022.

Mäkinen et al.(2020)Mäkinen, Azzari, and Foi

Mäkinen, Y., Azzari, L., and Foi, A.: Collaborative filtering of correlated noise: Exact transform-domain variance for improved shrinkage and patch matching, IEEE Trans. Image Process., 29, 8339–8354, 10.1109/TIP.2020.3014721, 2020.

Meier et al.(2024)Meier, Koene, Krol, Brunner, Damm, and Kuhlmann

Meier, S., Koene, E. F. M., Krol, M., Brunner, D., Damm, A., and Kuhlmann, G.: A lightweight NO₂-to-NO_x conversion model for quantifying NO_x emissions of point sources from NO₂ satellite observations, Atmos. Chem. Phys., 24, 7667–7686, 10.5194/acp-24-7667-2024, 2024.

Miller et al.(2007)

Miller, C., Crisp, D., DeCola, P., Olsen, S., Randerson, J., Michalak, A., Alkhaled, A., Rayner, P., Jacob, D. J., Suntharalingam, P.,Jones, D. B. A., Denning, A. S., Nicholls, M. E., Doney, S. C., Pawson, S., Boesch, H., Connor, B. J., Fung, I. Y., O'Brien, D., Salawitch, R. J., Sander, S. P., Sen, B., Tans, P., Toon, G. C., Wennberg, P. O., Wofsy, S. C., Yung, Y. L., and Law, R. M.: Precision requirements for space-based XCO2 data, J. Geophys. Res.-Atmos., 112, 10.1029/2006JD007659, 2007.

Reuter et al.(2019)Reuter, Buchwitz, Schneising, Krautwurst, O'Dell, Richter, Bovensmann, and Burrows

Reuter, M., Buchwitz, M., Schneising, O., Krautwurst, S., O'Dell, C. W., Richter, A., Bovensmann, H., and Burrows, J. P.: Towards monitoring localized CO₂ emissions from space: co-located regional CO₂ and NO₂ enhancements observed by the OCO-2 and S5P satellites, Atmos. Chem. Phys., 19, 9371–9383, 10.5194/acp-19-9371-2019, 2019.

Veefkind et al.(2012)

Veefkind, J. P., Aben, I., McMullan, K., Förster, H., De Vries, J., Otter, G., Claas, J., Eskes, H., De Haan, J., Kleipool, Q., van Weele, M, Hasekamp, O., Hoogeveen, R., Landgraf, J., Snel, R., Tol, P., Ingmann, P., Voors, R., Kruizinga, B., Vink, R., Visser, H., and Levelt, P. F.: TROPOMI on the ESA Sentinel-5 Precursor: A GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications, Remote Sens. Environ., 120, 70–83, 10.1016/j.rse.2011.09.027, 2012.

Wang et al.(2004)

Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.: Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process., 13, 600–612, 10.1109/TIP.2003.819861, 2004.

Yahya et al.(2020)Yahya, Tan, Su, Hu, Wang, Liu, and Hadi

Yahya, A. A., Tan, J., Su, B., Hu, M., Wang, Y., Liu, K., and Hadi, A. N.: BM3D image denoising algorithm based on an adaptive filtering, Multimed. Tools Appl., 79, 20391–20427, 10.1007/s11042-020-08815-8, 2020.