Introduction
Since the launch of the ESA Global Ozone Monitoring Experiment (GOME) in
1995, satellite observations have tremendously advanced our understanding of
the processes governing the atmospheric composition, greenhouse gas
emissions, and air quality. Global
distributions of atmospheric species that play critical roles in atmospheric
chemistry and air pollution, such as ozone, NO2, SO2, formaldehyde (HCHO),
glyoxal (CHOCHO), and BrO, have been
retrieved from the backscattered solar UV–visible spectra observed by
generations of polar-orbiting satellite sensors, including
GOME, SCIAMACHY, OMI, GOME-2, OMPS, and TROPOMI. A
constellation of geostationary satellites will provide hourly measurements of
these species over North America, Europe, and Asia in the near
future. Observations of the backscattered
shortwave infrared solar spectra also enable the retrieval of CO2, CH4,
and/or CO from SCIAMACHY, GOSAT, OCO-2, and
TROPOMI. Moreover, many atmospheric species have
strong spectroscopic signatures in the mid-infrared and can be retrieved from
the Earth's thermal emission spectra collected by satellite sensors such as
MOPITT, AIRS, TES, IASI, and CrIS. One species of
particular significance to tropospheric chemistry and air quality is
NH3, which has been successfully
retrieved from TES, AIRS, IASI, and CrIS.
The retrieval results from satellite sensors are usually total or partial
(e.g., tropospheric or planetary boundary layer, PBL) column densities at
individual satellite pixels, i.e., the level 2 product. However, the pixel
geometry may vary significantly even for the same sensor (see Fig. 1 for
example), and data quality screening (by cloud coverage, solar zenith angle,
surface albedo, thermal contrast, etc.) often leaves only small and patchy
fractions of useful level 2 pixels for any given orbit. As such, the level 2
data over many orbits are often projected to a regular spatial grid to better
represent the spatiotemporal variations of the target species through a
gridding algorithm. These “level 3” products help to average out the
observational noise that can be significant for individual level 2 retrievals
and make satellite data more accessible for scientific studies and the
general public. These products may also lead to additional discoveries, such
as emission and lifetime estimates, source identification, trend analyses,
assessment of environmental exposure for public health, and satellite data
validation.
The operational level 3 products are typically provided at grid sizes of
0.25∘×0.25∘ or even 1∘×1∘, which
are too coarse for regional heterogeneous emission sources (e.g., urban
areas), especially for species with short lifetimes. These level 3 products
are provided at fixed temporal intervals (e.g., daily, monthly, and
annually). To customize the temporal and spatial sampling intervals, one
often needs to regrid the level 2 data.
Various gridding algorithms have been developed to generate level 3 maps at
a regional scale with much finer grids (0.05–0.01∘) than the
sizes of level 2 pixels, and this process is generally referred to as
“oversampling”. In this work, we present
an agile, physics-based oversampling approach that represents each level 2
satellite pixel as a sensitivity distribution on the Earth's surface (i.e.,
the spatial response function), instead of a point or a polygon as assumed in
previous methods. A generalized 2-D super Gaussian function is used to
characterize the spatial response functions of both imaging grating
spectrometers (e.g., OMI, OMPS, and TROPOMI) and scanning Fourier transform
spectrometers (FTSs; e.g., GOSAT, IASI and CrIS). Applications to multiple
existing satellite datasets are also highlighted.
Satellite observations
OMI
The OMI instrument aboard the Aura satellite launched in 2004 is a push-broom
UV–visible imaging grating spectrometer. It has a daytime equatorial crossing
at ∼13:42 LT (local time). During normal global observation mode, the
backscattered sunlight from the Earth is imaged by a telescope onto a
rectangular entrance slit perpendicular to the flight direction. The light
coming through the slit, which corresponds to an across-track angle of
115∘, or 2600 km on the ground, is dispersed by optical gratings and
mapped on two 2D CCD detectors. Each detector image is aggregated
across-track (along the length of the slit) into 60 spectra, corresponding to
60 across-track spatial pixels for the UV2 (307–383 nm) and visible
(349–504 nm) bands, as shown by Fig. . Although the spatial
response functions of OMI pixels are
nonuniform, the OMI pixels are widely
characterized as quadrilateral polygons defined by 75 % of the energy in the
along-track field of view (FOV) and the halfway points of the across-track
FOV (the 75 FOV pixel edges from the OMPIXCOR product).
These OMI pixel polygons are close to rectangles, ranging from 14 km × 26 km
at nadir (or 13 km × 24 km if assuming nonoverlapping pixels)
to 28 km × 160 km at the swath edges. Alternatively, OMI pixels can be
represented as tiled polygons with no overlap between adjacent pixels. These
tiled pixels produce a seamless swath image but are less accurate,
especially in the along-track direction. OMI is a highly successful mission
with long data records, and most of the successor missions follow a similar
design. The oversampling technique demonstrated here can
be readily adopted for a range of OMI products and OMI's successor missions,
such as OMPS, TROPOMI, and TEMPO.
Across-track (xtrack in figure) ground pixel geometry for IASI, CrIS, and the UV2 and VIS (visible) bands of OMI.
IASI
The IASI instrument is an FTS with an across-track scanning range of 2200 km
(Fig. ). It has a daytime equatorial crossing time of
∼09:30 LT. The first IASI instrument (IASI-A) was launched aboard the
MetOp-A satellite in 2006, with the launch of IASI-B following in 2012 and IASI-C in 2018. IASI scans across the track with
30 mirror positions, or fields of regard (FORs), and each FOR is composed of a
2×2 array of pixels, or FOV. Each FOV projected on ground is a 12 km
diameter circular footprint at nadir and elongates to ellipses towards the
swath edges. To simplify the ground pixel calculation, we
represent each pixel as an ellipse with the major and minor axes and rotation
angle interpolated from a lookup table based on latitude and FOR and FOV number.
We use the most recent IASI NH3 retrieval, which calculates a hyperspectral
range index (HRI) and then converts it to NH3 columns via a neural
network (NN). The IASI NH3 datasets
are publicly available for both IASI-A and IASI-B. Version 2 offers
significant improvements over version 1, including the retention of negative
values, which are crucial for averaging out observational errors near the
detection limit.
CrIS
The CrIS instrument, which is aboard the Suomi NPP satellite and the series
of JPSS satellites, is a step-scan FTS with 2200 km across-track width
(Fig. ). It has a daytime equatorial crossing time of
∼13:30 LT. It has the same number of FORs as IASI, but each
FOR contains 9 FOVs (3×3 array), providing better spatial coverage.
Each CrIS FOV is 14 km in diameter at nadir, slightly larger than that of IASI. Due to the
mounting angle of the scanning mirror, the FOR rotates differently at each
scanning angle. Similar to IASI, each CrIS pixel is represented as a rotated
ellipse.
The CrIS fast physical retrieval (CFPR) NH3 product is based on the
TES optimal estimation approach, which minimizes the differences between
the measured spectral radiances and those simulated by a fast line-by-line
forward model.
Existing gridding methods
This section reviews existing gridding methods that map level 2 pixels to
level 3 grids. Oversampling conventionally refers to cases where the level 3
grid is much finer than the level 2 pixel size.
Spatial interpolation
The spatial interpolation methods generate continuous data fields from
observations made at discrete locations. The main difference between
interpolation and the point- and polygon-based oversampling approaches
discussed in Sect. and is that
the values at grid cells that are not covered by satellite observations can
be estimated. Therefore, the spatial interpolation methods are more commonly
used for satellite datasets with significant spatial gaps or requiring
additional smoothing. Common spatial interpolation methods include nearest
neighbors, piecewise 2-D linear interpolation, spline interpolation, and
various kriging methods. The moving window block kriging method has been
proposed to generate global level 3 products for satellite observations of
long-lived species, such as CH4 and
CO2. A comprehensive review of
available spatial interpolation methods for environmental variables is
provided by . There are relatively few
applications of spatial interpolation methods to regional
fine-grid oversampling, where each target grid cell usually receives a large number of
overlapping satellite observations.
proposed an interpolative gridding algorithm that reconstructs the trace gas
distribution by a continuous parabolic spline surface, defined on the lattice
of tiled satellite pixels. This approach produces smooth regional level 3
maps for the OMI NO2 products with specifically tuned smoothing
parameters but has not been tested in non-tiled observations with
significant numbers of missing values (e.g., IASI and CrIS).
Satellite observations as points
The simple “drop-in-the-box” gridding method can be classified into this
category, as each satellite observation is assumed to be a point on the
surface. The value for each target grid cell is the average of all screened
satellite observations with the center of the FOV falling inside the grid cell
boundaries. A conventional oversampling approach has been developed based on
the drop-in-the-box method; instead of only averaging “in the box”, it
includes satellite observations within a certain radius (much larger than the
grid size) from the center of each grid cell. This averaging radius is chosen
to balance the smoothing and noise but is also somewhat arbitrary. For
example, used a radius of 8 km to oversample the OMI
NO2 tropospheric columns and a larger radius of 24 km to oversample the
OMI SO2 total columns near the Canadian oil sands region;
used 12 km to oversample the OMI SO2 total
columns over the US; and used 24 km to oversample the
HCHO total columns near Houston, TX. This oversampling approach is referred
to as “point oversampling” hereafter, as the pixel geometry is not
considered. The pixel-specific observational errors are also not taken into
account.
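As a minimal sketch of the point oversampling described above (not the code used in any of the studies mentioned), the following Python function averages every screened observation whose pixel center lies within a chosen radius of each grid cell center. The function name `point_oversample`, the use of local Cartesian coordinates in kilometers, and the toy numbers are assumptions made for illustration only.

```python
import numpy as np

def point_oversample(px, py, values, gx, gy, radius):
    """Average pixel values whose centers lie within `radius` of each grid center.

    px, py, values: 1-D arrays of pixel-center coordinates (km) and retrieved values.
    gx, gy: arrays of grid-center coordinates (km), same shape as the output.
    Pixel geometry and observational errors are ignored, as in point oversampling.
    """
    out = np.full(gx.shape, np.nan)
    for idx in np.ndindex(gx.shape):
        dist2 = (px - gx[idx]) ** 2 + (py - gy[idx]) ** 2
        mask = dist2 <= radius ** 2
        if mask.any():
            out[idx] = values[mask].mean()
    return out

# Toy example: three pixel centers and two grid centers (hypothetical numbers)
px = np.array([0.0, 10.0, 30.0])
py = np.array([0.0, 0.0, 0.0])
values = np.array([1.0, 3.0, 5.0])
gx = np.array([[5.0, 100.0]])
gy = np.array([[0.0, 0.0]])
result = point_oversample(px, py, values, gx, gy, radius=12.0)
```

With a 12 km radius, the first grid cell averages the two nearby pixels and the second receives no observation, so it stays undefined. Setting the radius to roughly half the grid size approximates the drop-in-the-box behavior.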
Figure reconstructs a point oversampling process for an
arbitrary target grid point (red star) located near Denver, CO. OMI NO2
data over the year 2005 are used in this demonstration.
Pixels with a cloud fraction ≥30 % or a solar zenith angle ≥75∘ are screened out. Only across-track positions with relatively
small pixel areas (6–55 out of 1–60) are included, a common practice to
oversample OMI data. Adding pixels at the swath edges would induce more
“false negative” cases, as shown below. The screened satellite pixel
centers that fall within a 12 km radius (dashed circle) are plotted as black
points and red triangles. The red triangles are “false positive”
observations because the corresponding pixel quadrilaterals, provided by the
OMPIXCOR product, do not cover the target grid point. The pixel geometry of
an extreme false positive case is illustrated by the pixel quadrilateral,
featuring the largest separation between its boundary and the target grid
point. Likewise, the false negative observations are plotted as purple
squares, whose pixel centers fall outside the averaging circle (and hence not
averaged), but these pixels cover the target grid point. An extreme case of
the false negatives is also illustrated. For this example, there are 243
pixels within the 12 km radius, of which 54 are false positives (22 %). There
are 92 false negatives (38 %) not included in the point oversampling.
Typically, false positives are pixels closer to nadir, whereas false
negatives are pixels away from nadir. In combination, the oversampled value
at this grid location has contributions from a much different set of
satellite observations than what should be represented. A larger averaging
radius will decrease the occurrence of false negative cases but increase
that of false positive cases. Because the OMI pixel dimension is larger in
the across-track direction, these sampling biases differ in direction;
observations in the across-track direction of the target grid point are more
likely to become false negatives, and observations in the along-track
direction are more likely to become false positives.
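The classification used in this example can be sketched as a small decision rule. The function `classify`, its labels, and the toy input are hypothetical; the test of whether a pixel polygon covers the target grid point is assumed to be done elsewhere (e.g., by a point-in-polygon check on the pixel corners). The last two lines only check that the quoted percentages are consistent with both counts being expressed relative to the 243 pixels inside the radius.

```python
from collections import Counter

def classify(center_distance_km, covers_target, radius_km=12.0):
    """Label one screened pixel relative to a target grid point.

    center_distance_km: distance from the pixel center to the target grid point.
    covers_target: True if the pixel polygon contains the target grid point.
    """
    in_radius = center_distance_km <= radius_km
    if in_radius and covers_target:
        return "correct"
    if in_radius:
        return "false positive"   # averaged, but the pixel does not cover the point
    if covers_target:
        return "false negative"   # covers the point, but is not averaged
    return "not used"

# Toy input: (distance to target in km, polygon covers target?)
toy = [(3.0, True), (11.0, False), (40.0, True), (60.0, False)]
counts = Counter(classify(d, c) for d, c in toy)

# Consistency check of the percentages quoted in the text
fp_percent = round(100 * 54 / 243)
fn_percent = round(100 * 92 / 243)
```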
In reality, the OMI ground pixel footprints are not as sharp as quadrilateral
boundaries, so the false positive and negative cases are not
as well defined as in Fig. . This will be discussed in
Sect. .
Centers of screened OMI pixels in 2005 over a target grid point (red
star) near Denver, CO. Pixels that overlap with the target grid point with
the pixel center falling within the averaging radius (dashed circle) are plotted
as black points (correct oversampling, 40 %). Pixels that overlap with the
target grid point with the pixel center falling outside the averaging radius are
plotted as purple squares (false negative, 38 %). Pixels that do not overlap
with the target grid point with the pixel center falling in the averaging radius
are plotted as red triangles (false positive, 22 %). Extreme cases of false
positives or negatives are illustrated by OMI pixel quadrilaterals. The
percentages of correct oversampling, false positive, and false negative
pixels are labeled in the legend.
Satellite observations as polygons (i.e., tessellation)
This approach assumes that each satellite observation footprint is a polygon
on the surface, and calculates the areal proportions of grid cells inside
each polygon. Because calculating these overlapping areas requires filling
irregular satellite footprint polygons with rectangular grid cells, it is
also known as the “tessellation” approach. The contribution of each
satellite observation to a given grid cell is weighted by the overlapping
area and inversely weighted by the total pixel polygon area and the
observational uncertainty, as shown by the following
equations modified from:
C(j)=A(j)/B(j),
where
$$A(j)=\sum_i \frac{\Omega(i)\,S(i,j)}{\sigma(i)^p \sum_j S(i,j)},\quad B(j)=\sum_i \frac{S(i,j)}{\sigma(i)^p \sum_j S(i,j)}.$$
In the equations above, C(j) is the oversampled result for destination grid
cell j; Ω(i) is the variable to be oversampled (e.g., total column)
associated with the satellite pixel i; S(i,j) is the overlapping area
between pixel i and grid cell j, and hence ∑jS(i,j) is the total
area of pixel i, assuming that the grid extends beyond all pixel
boundaries. When the destination grid is regular with constant grid cell
area, it is convenient to normalize S(i,j) by the grid cell area, leading
to overlapping fractions. We will follow this convention hereafter, and hence
S(i,j) is always a dimensionless number. These equations take into account
the extent of a pixel and give more weight to a nadir observation than to an
observation at the edges of the satellite swath, where the information is
more smeared out. The variable σ(i)^p is the uncertainty term, and the power p
has been assumed to be 1 or
2 by different studies. If we assume
each observation Ω(i) is a measurement of a constant true value with
Gaussian random error σ(i), p=2 yields the maximum likelihood
estimate of the true value. However, the true measurement and sampling errors
often show heavier tails than a Gaussian distribution. In this study we adopt
p=1, following . The oversampled results are generally
similar for both cases. Unlike the point oversampling discussed in
Sect. where C(j) is simply the average of Ω(i)
within a circle, the tessellation approach fully utilizes the geometry and
error information for each satellite observation. It has been adopted by many
operational level 3 products and oversampling
studies.
It is sometimes convenient to define
D(j)=∑iS(i,j)
to quantify the total number of overlapping pixel polygons used in averaging
for grid cell j. Unlike the point oversampling, this number does not have
to be an integer due to the consideration of partial overlaps. Because the
location and size of these pixels vary day by day, averaging a large number
of pixels reveals spatial patterns at scales finer than the satellite pixel
scales, if these patterns are consistent through the averaging time period.
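The weighted average C(j) and the pixel count D(j) can be written compactly once the overlap matrix S(i,j) is available. The sketch below is one possible NumPy formulation, not the operational implementation; the function name `tessellation_average` and the toy arrays are assumptions, and every pixel is assumed to overlap the grid so that its row sum of S is nonzero.

```python
import numpy as np

def tessellation_average(omega, sigma, S, p=1):
    """Return C(j) = A(j)/B(j) and D(j) for an overlap matrix S of shape (n_pixels, n_cells)."""
    omega = np.asarray(omega, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    S = np.asarray(S, dtype=float)
    # Weight of pixel i in cell j: S(i,j) / (sigma(i)^p * sum_j S(i,j))
    weights = S / (sigma[:, None] ** p * S.sum(axis=1, keepdims=True))
    A = (omega[:, None] * weights).sum(axis=0)
    B = weights.sum(axis=0)
    D = S.sum(axis=0)
    C = np.full_like(B, np.nan)
    np.divide(A, B, out=C, where=B > 0)   # cells without observations stay NaN
    return C, D

# Toy example: two pixels overlapping three grid cells
S = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
C_equal, D = tessellation_average([2.0, 4.0], [1.0, 1.0], S)
C_weighted, _ = tessellation_average([2.0, 4.0], [1.0, 2.0], S, p=1)
```

In the toy case, the middle cell is shared equally by both pixels, so its value is the plain mean when the uncertainties are equal and shifts toward the less uncertain pixel otherwise.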
Figure illustrates the tessellation process for OMI (a)
and IASI (b) pixels, where the elliptical IASI pixel is represented by a
100-vertex polygon calculated from its minor/major axes and rotational
angle lookup tables. The destination grid size is 5 km × 5 km, and the
overlapping areas are normalized by the grid cell area (25 km²), as
labeled in each grid cell.
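To make the tessellation step concrete, the following sketch builds a 100-vertex polygon from an ellipse and computes its fractional overlap with rectangular grid cells by clipping (Sutherland-Hodgman) and the shoelace area formula. This is an illustrative stand-in, not the authors' implementation; all function names are hypothetical, coordinates are assumed to be local Cartesian kilometers, and the 12 km circular footprint simply mimics the nadir IASI pixel size quoted earlier.

```python
import numpy as np

def ellipse_polygon(xc, yc, major, minor, angle_rad, n_vertices=100):
    """Vertices of an ellipse with full axis lengths `major`/`minor`, rotated by angle_rad."""
    t = np.linspace(0.0, 2.0 * np.pi, n_vertices, endpoint=False)
    ex, ey = 0.5 * major * np.cos(t), 0.5 * minor * np.sin(t)
    x = xc + ex * np.cos(angle_rad) - ey * np.sin(angle_rad)
    y = yc + ex * np.sin(angle_rad) + ey * np.cos(angle_rad)
    return list(zip(x.tolist(), y.tolist()))

def polygon_area(pts):
    """Shoelace formula; returns 0 for degenerate polygons."""
    if len(pts) < 3:
        return 0.0
    x = np.array([p[0] for p in pts])
    y = np.array([p[1] for p in pts])
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def _clip_half_plane(pts, axis, bound, keep_greater):
    """Keep the part of the polygon on one side of an axis-aligned line."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound
    def crossing(p, q):
        t = (bound - p[axis]) / (q[axis] - p[axis])
        return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
    out = []
    for k in range(len(pts)):
        p, q = pts[k - 1], pts[k]
        if inside(q):
            if not inside(p):
                out.append(crossing(p, q))
            out.append(q)
        elif inside(p):
            out.append(crossing(p, q))
    return out

def overlap_fraction(pts, xmin, xmax, ymin, ymax):
    """Overlap area between polygon and grid cell, normalized by the cell area."""
    clipped = _clip_half_plane(pts, 0, xmin, True)
    clipped = _clip_half_plane(clipped, 0, xmax, False)
    clipped = _clip_half_plane(clipped, 1, ymin, True)
    clipped = _clip_half_plane(clipped, 1, ymax, False)
    return polygon_area(clipped) / ((xmax - xmin) * (ymax - ymin))

# Circular 12 km footprint on a 5 km x 5 km grid (hypothetical placement)
pixel = ellipse_polygon(2.5, 2.5, 12.0, 12.0, 0.0)
edges = np.arange(-10.0, 15.1, 5.0)
S = np.array([[overlap_fraction(pixel, x0, x0 + 5.0, y0, y0 + 5.0)
               for x0 in edges[:-1]] for y0 in edges[:-1]])
```

The fractions in `S` sum, after multiplication by the 25 km² cell area, to the polygon area, which is slightly smaller than the area of the true circle because of the finite number of vertices.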
Tessellation process for OMI (a) and IASI (b) pixels. The IASI pixel
is approximated by a 100-vertex polygon. The overlapping area (S(i,j)) between
satellite pixel i and grid cell j is labeled at grid cell center,
normalized by grid cell area (25 km²). Across-track: xtrack.
Proposed method
Satellite observations as sensitivity distributions
The tessellation approach discussed in Sect.
inherently assumes that the satellite observation is uniformly sensitive to
the scene inside the pixel polygon and has no sensitivity outside it.
However, depending on target grid size and the spatial response function of
specific satellite observations, this may be too strong an assumption. For
example, characterized the complex spatial
response function of the AIRS instrument and used it to improve the
comparison of radiances measured by AIRS and MODIS. and
derived an in-flight spatial response function of OMI using
collocated MODIS radiance. The operational Sentinel-5 Precursor, Sentinel-5,
and Sentinel-4 cloud processors also rely on the spatial response functions
of the imaging grating spectrometers to accurately calculate the cloud
coverage within each FOV using collocated high-resolution cloud
imagers.
For imaging grating spectrometers like OMI, the spatial response function
depends on the diffraction of the fore optics, the instantaneous field of
view (i.e., the instantaneous projection of the slit on the ground from the
point of view of a native detector pixel), the numbers of across- and
along-track bins, and the along-track movement of the subsatellite point during
the integration time. The satellite movement only affects the along-track
direction, generally making the spatial response in the along-track direction
smoother than that in the across-track direction. and
fitted the OMI spatial response function using a 2-D super
Gaussian function to parameterize the different smoothness in the along- and
across-track directions. To standardize the representation of spatial
response functions for diverse satellite sensors, we generalize the 2-D super
Gaussian function as
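$$S(x,y)=\exp\left[-\left(\left|\frac{x}{w_x}\right|^{k_1}+\left|\frac{y}{w_y}\right|^{k_2}\right)^{k_3}\right],$$
where
$$w_x=\frac{\mathrm{FWHM}_x}{2\,(\ln 2)^{1/(k_1 k_3)}},\quad w_y=\frac{\mathrm{FWHM}_y}{2\,(\ln 2)^{1/(k_2 k_3)}}.$$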
In these equations, x and y are distances to the center of ground FOV in
orthogonal directions, usually transformed by geometric projections of the
across- and along-track directions. FWHMx and FWHMy are full widths at
half maximum of the spatial response function, S(x,y), in the directions of
x and y. The three exponents, k1, k2, and k3, control
the distribution of spatial response, as illustrated by Fig. .
When k3=1 (Fig. a and c), Eq. () becomes the 2-D
super Gaussian function used by and to
characterize the OMI spatial response:
$$S(x,y)=\exp\left[-\left|\frac{x}{w_x}\right|^{k_1}-\left|\frac{y}{w_y}\right|^{k_2}\right].$$
For OMI, k1∼4 and k2∼2.
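The relation between the width parameters and the FWHM follows from the half-maximum condition. The short derivation below assumes that FWHMx is measured along the x axis through the FOV center (y = 0); the numerical values in the last line are simple evaluations for the Gaussian and OMI-like across-track exponents.

```latex
% Along y = 0 the generalized super Gaussian reduces to a 1-D super Gaussian:
S(x,0) = \exp\left[-\left|\frac{x}{w_x}\right|^{k_1 k_3}\right].
% Half maximum is reached at x = FWHM_x / 2:
\exp\left[-\left(\frac{\mathrm{FWHM}_x}{2 w_x}\right)^{k_1 k_3}\right] = \frac{1}{2}
\;\Longrightarrow\;
\left(\frac{\mathrm{FWHM}_x}{2 w_x}\right)^{k_1 k_3} = \ln 2
\;\Longrightarrow\;
w_x = \frac{\mathrm{FWHM}_x}{2\,(\ln 2)^{1/(k_1 k_3)}}.
% Examples: k_1 k_3 = 2 gives w_x \approx 0.601\,\mathrm{FWHM}_x;
%           k_1 k_3 = 4 gives w_x \approx 0.548\,\mathrm{FWHM}_x.
```

The same argument along x = 0 gives the expression for wy with k2 in place of k1.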
For FTS systems with stop-and-stare sampling, like IASI and CrIS, the spatial
response function (also known as point spread function by the community) is
more simply defined by the circular aperture and some diffraction around the
edge. The nadir FOV is circular with no difference between across- and
along-track directions, and hence the spatial response function can be
characterized by a 1-D super Gaussian function rotating around the nadir
point. This rotating super Gaussian function is another special case of the
generalized 2-D super Gaussian (Eq. ) with k1=k2=2 and
wx=wy:
$$S(x,y)=\exp\left[-\left(\frac{R}{w}\right)^{2k_3}\right],\quad\text{where } R=\sqrt{x^2+y^2} \text{ and } w=w_x=w_y.$$
The smoothness of the rotating super Gaussian is controlled by only one
exponent, which equals 2×k3. The elongated spatial response
functions for off-nadir angles can be readily characterized by different
values for wx and wy (Fig. a–b). The spatial response
function of IASI is rather sharp at the edge with little variation at the
top, close to a super Gaussian with an exponent of ∼18.
The spatial response function of CrIS is relatively smoother at the edge,
best fit by a super Gaussian with an exponent of ∼8.
Details on the spatial response functions of IASI and CrIS can be found in
Appendix A.
In the generalized 2-D super Gaussian function (Eq. ), k1×k3 and k2×k3 are the exponents in the x and y
directions, respectively, and determine the sharpness of the spatial response
in the corresponding direction. An exponent of 2 leads to a standard Gaussian
function; the larger exponents produce a top-hat shape, converging to a
boxcar shape when the exponent approaches
infinity. Redistributing the contributions
from k1/k2 and k3 makes hybrid spatial response functions that may
have sharp edges in sensitivity but rounded corners in space, as in the case
of OMPS (Glen Jaross, personal communication, 2017). The difference between this
hybrid case and conventional 2-D super Gaussian is illustrated by
Fig. c–d.
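The generalized 2-D super Gaussian and its special cases can be evaluated with a few lines of NumPy. This is a sketch under stated assumptions rather than the reference implementation: the function name `super_gaussian_2d` is hypothetical, x and y are taken to be already expressed in the transformed across- and along-track coordinates, and the footprint sizes used in the examples (26 km by 14 km for an OMI-like nadir pixel, 12 km for an IASI-like pixel) are illustrative.

```python
import numpy as np

def super_gaussian_2d(x, y, fwhm_x, fwhm_y, k1, k2, k3):
    """Generalized 2-D super Gaussian with unit peak value at the FOV center."""
    # Width parameters chosen so that the response is 0.5 at half of the FWHM
    w_x = fwhm_x / (2.0 * np.log(2.0) ** (1.0 / (k1 * k3)))
    w_y = fwhm_y / (2.0 * np.log(2.0) ** (1.0 / (k2 * k3)))
    return np.exp(-(np.abs(x / w_x) ** k1 + np.abs(y / w_y) ** k2) ** k3)

# OMI-like case (k3 = 1): half maximum at x = FWHM_x / 2
omi_half = super_gaussian_2d(13.0, 0.0, 26.0, 14.0, k1=4, k2=2, k3=1)

# Rotating super Gaussian (k1 = k2 = 2) with exponent 2 * k3 = 18, IASI-like
d = 6.0 / np.sqrt(2.0)
iasi_axis = super_gaussian_2d(6.0, 0.0, 12.0, 12.0, k1=2, k2=2, k3=9)
iasi_diag = super_gaussian_2d(d, d, 12.0, 12.0, k1=2, k2=2, k3=9)

# Large exponents approach a boxcar: near 1 just inside, near 0 just outside
box_inside = super_gaussian_2d(0.9 * 13.0, 0.0, 26.0, 14.0, k1=64, k2=64, k3=1)
box_outside = super_gaussian_2d(1.1 * 13.0, 0.0, 26.0, 14.0, k1=64, k2=64, k3=1)
```

Because the rotating case depends only on the distance R from the center, the response at the half-FWHM radius is the same along the axis and along the diagonal.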
(a) Standard 2-D Gaussian function. It is both a rotating super
Gaussian with an exponent of 2 and a 2-D super Gaussian function with the x
and y direction exponents equal to 2. (b) Rotating super Gaussian with an
exponent (2×k3) of 18. (c) 2-D super Gaussian function with an
exponent of 18 in the x direction and an exponent of 6 in the y direction.
(d) A hybrid case between a rotating super Gaussian and a 2-D
super Gaussian, featuring rounded corners. In all cases, FWHMx=1.618×FWHMy. The grid size is 5 % of FWHMy.
The projection of a rectangular FOV for imaging grating spectrometers like OMI
on the surface at large viewing angles leads to distorted quadrilateral
footprints, as shown by the polygon ABCD in Fig. a. To
account for this effect, a geometric transformation function is determined by
the OMI pixel corner points (ABCD in Fig. a) and the
corresponding rectangle (A′B′C′D′ in Fig. b) defined by
the distances between the middle points of opposing edges of the OMI pixel
quadrilateral. The spatial response function is first calculated according to
Eq. () with FWHMx =|A′D′| and FWHMy =|A′B′| as shown in
Fig. b and then projected to match the OMI pixel corners
ABCD (Fig. a) using the geometric transformation function.
This algorithm is implemented using both the OpenCV library in Python and the
Image Processing Toolbox in MATLAB.
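The geometric transformation can be illustrated without either library by solving directly for the projective (homography) matrix defined by the four corner pairs. The sketch below is a NumPy substitute for illustration, not the OpenCV or MATLAB code referred to above: it maps coordinates rather than warping an image, and the function names and pixel corner values are hypothetical.

```python
import numpy as np

def homography(src, dst):
    """3x3 projective matrix mapping four src points onto four dst points (h33 fixed to 1)."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Apply the projective transform to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical pixel corners A, B, C, D in local km coordinates
quad = np.array([[0.0, 0.0], [2.0, 14.0], [30.0, 16.0], [27.0, -1.0]])
A, B, C, D = quad

# Rectangle sides from the distances between midpoints of opposing edges
fwhm_x = np.linalg.norm((A + B) / 2 - (D + C) / 2)   # |A'D'|
fwhm_y = np.linalg.norm((A + D) / 2 - (B + C) / 2)   # |A'B'|
rect = 0.5 * np.array([[-fwhm_x, -fwhm_y], [-fwhm_x, fwhm_y],
                       [fwhm_x, fwhm_y], [fwhm_x, -fwhm_y]])

H = homography(quad, rect)                       # ground quadrilateral -> rectangle
rect_from_quad = apply_homography(H, quad)
quad_from_rect = apply_homography(np.linalg.inv(H), rect)
```

With `H` in hand, the spatial response at any ground location can be obtained by mapping that location into the rectangle frame and evaluating the super Gaussian there with FWHMx = |A′D′| and FWHMy = |A′B′|.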
(a) OMI pixel corners (ABCD) for across-track position 60 out of
1–60 and spatial response function with k1=4, k2=2, and k3=1. (b) The
same OMI pixel transformed to a rectangle (A′B′C′D′) and the
corresponding transformed spatial response function. The horizontal and vertical
axes are in different scales to demonstrate that the OMI pixel is not
exactly a parallelogram. As a result, the geometric transformation function
is projective (not exactly affine).
The proposed oversampling approach represents each satellite observation as a
sensitivity distribution, instead of a point or a polygon. If the true
satellite spatial response function is used as the sensitivity distribution,
this approach is the theoretically optimal solution to the oversampling
problem, and is hence referred to as “physical oversampling” hereafter. It
follows the same equations as the tessellation approach as in
Eqs. ()–(), except that the fractional overlapping area
S(i,j) is generalized to the integration of the spatial response function
of satellite observation i, S(x,y|i), over the grid cell j:
$$S(i,j)=\frac{\iint_{\mathrm{grid}\,j} S(x,y|i)\,\mathrm{d}x\,\mathrm{d}y}{\iint_{\mathrm{grid}\,j}\mathrm{d}x\,\mathrm{d}y},$$
where the denominator is the grid cell area. Similar to the tessellation
approach, S(i,j) is always a dimensionless number between 0 and 1. Because
S(i,j) is normalized by the grid cell area, this exact form can be
directly replaced by approximations such as S(x,y|i) evaluated at the
grid center. S(i,j)/∑jS(i,j) is just the normalized spatial response
function for observation i so that its spatial integration is unity. If the
spatial response is uniform inside the pixel polygon and zero outside the
polygon, this integration of the spatial response function within the grid cell
is equivalent to the fractional overlapping area used in the tessellation
approach. As such, the tessellation is just the extreme case where the
spatial response function is a perfect 2-D boxcar. This corresponds to k1×k3→∞ and k2×k3→∞ in
Eq. ().
This physical oversampling approach can also be considered as a spatial
interpolation method as discussed in Sect. because
the spatial response function can be defined beyond the satellite pixel
boundaries and theoretically on the entire 2-D space. Moreover, instead of
the exact form of spatial response function, the satellite observations can
be represented by similar (with the same FWHM) but smoother sensitivity
distributions to enhance the quality of the oversampling results. This
possibility will be demonstrated in Sect. .
Balancing the errors from tessellation and discretization of spatial response
The tessellation approach is perfect if the spatial response of satellite
observation is a boxcar, but otherwise it will introduce some error in the
oversampled results (referred to as “tessellation error” hereafter). When
the satellite spatial response function is smooth (instead of a boxcar), the
exact solution is to calculate S(i,j) as the integration of the spatial
response of satellite observation i over the area covered by the target
grid cell j (Eq. ). It is computationally demanding to
numerically integrate the spatial response of all satellite pixels over each
grid cell. To simplify it, one may discretize the spatial response function
to the target oversampling grid and use the spatial response value at the
grid center to approximate the integration. As such, the spatial response
function only needs to be evaluated once per pixel per grid cell. To improve
this simple discretization scheme, we calculate a weighted average of the
spatial response values at the grid center and grid corners as
proposed for MODIS by. Because the grid corners are shared
by neighboring grid cells, this approach only doubles the spatial response
calculation but significantly reduces the error induced by discretization
(“discretization error” hereafter). Appendix B gives a detailed comparison
of different discretization schemes.
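The benefit of adding the grid corners can be illustrated with a small numerical experiment. The weights used below (2/3 for the center and 1/12 for each corner) are an assumption chosen so that the rule is exact for quadratic functions; the weights actually adopted are given in Appendix B and may differ. The function names and the Gaussian test response are likewise hypothetical.

```python
import numpy as np

def response(x, y, w=1.0, k=2.0):
    """Rotating super Gaussian used as a smooth test response."""
    return np.exp(-(np.hypot(x, y) / w) ** k)

def cell_center_only(f, xc, yc, h):
    """Approximate the cell mean by the value at the cell center."""
    return f(xc, yc)

def cell_center_and_corners(f, xc, yc, h):
    """Weighted mean of the center and the four corners (assumed weights 2/3 and 1/12)."""
    corners = sum(f(xc + sx * h, yc + sy * h) for sx in (-1, 1) for sy in (-1, 1))
    return (2.0 / 3.0) * f(xc, yc) + corners / 12.0

def cell_reference(f, xc, yc, h, n=200):
    """Finely resolved cell mean (midpoint rule on an n x n subgrid)."""
    offsets = (np.arange(n) + 0.5) / n * 2.0 * h - h
    X, Y = np.meshgrid(xc + offsets, yc + offsets)
    return f(X, Y).mean()

# One grid cell with half-width h centered on the flank of the response
h, xc, yc = 0.25, 0.5, 0.0
ref = cell_reference(response, xc, yc, h)
err_center = abs(cell_center_only(response, xc, yc, h) - ref)
err_corners = abs(cell_center_and_corners(response, xc, yc, h) - ref)
```

On an N × M grid the corner values are shared, so the scheme needs N·M center evaluations plus (N+1)·(M+1) corner evaluations per pixel, which is roughly twice the cost of the center-only scheme.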
The satellite sensors have very different spatial responses. The target grid
size for level 3 data ranges from 0.25∘ (∼25 km) for many global
operational products to 0.01∘ (∼1 km) for regional oversampling.
The discretization error decreases as the size of the target grid cells
becomes finer and the spatial response of satellite observations becomes
better resolved. At any fixed target grid size, spatial response functions
with smoother edges are better approximated by the discretization scheme. As
such, it is essential to balance the tessellation and discretization errors based
on the target grid cell size and the smoothness of the satellite spatial
response so that the most accurate and efficient approximating method can be
chosen.
Oversampling a synthetic checkerboard pattern, shown in panel (a), at a
spatial scale smaller than the OMI pixels to a grid size of 1 km. The pattern
in panel (a) is the ground truth of the concentration distribution. The ideal OMI
observation in panel (b) is generated using spatial response function defined in
Fig. at very fine grids and then co-added back to 1 km. The
pattern in panel (b) represents the ideal observation by OMI because no errors are
introduced during the oversampling process. Panel (c) shows the result from the
tessellation method (assuming S(i,j) is equal to the overlapping area between
satellite pixel i and grid cell j). Panel (d) shows the difference between
tessellation and the ideal observation. The values in panel (d) are equal
to the values in panel (c) minus the values in panel (b). Panels (e, f) show the oversampling result by discretizing the spatial
response function and its difference from the ideal observation. The values
in panel (f) are equal to the values in panel (e) minus the values in panel (b).
Figure compares the tessellation and discretization
errors when oversampling synthetic OMI observations to a grid of 1 km
(∼0.01∘). A checkerboard pattern is used as the “true”
concentration distribution (alternating values of zeros and ones with a spatial
period of 20 km × 20 km, as shown in Fig. a; it also
shows OMI pixel polygons at across-track position no. 1 in red and across-track
position no. 30 in cyan). Synthetic OMI observations are generated by sampling
the checkerboard pattern using the OMI spatial response function, simplified
using Eq. () with k1=4, k2=2 and discretized at a very fine
grid (0.05 km, or ∼0.0005∘) so that the spatial response
distribution is always fully resolved. The locations of OMI observations are
from the real OMI NO2 products, filtered by cloud
fraction <25 % and solar zenith angle <75∘ for 2005–2006. Instead
of NO2 columns, the synthetic OMI observations at these locations are
oversampled. The oversampled area is in the north midlatitude
(∼40∘ N). In Fig. b, the oversampling is
conducted at a native grid size (0.05 km), and then the result is
block-averaged to the 1 km target grid size to represent ideal OMI
observations, as in Eq. (). One should note that this
discretization at 0.05 km is used to get the true map of OMI observation
where the discretization error is negligible. It is unnecessary to oversample
at this fine grid in general. Figure c and e show the
results for tessellation and discretization of the spatial response at
1 km
grid, where S(i,j) is approximated by fractional overlapping area and the
discretization scheme, respectively. They both reproduce the checkerboard
pattern in general, but the tessellation method generates errors up to 40 %
(Fig. d) relative to the peak-to-trough value of the
ideal observation because the OMI spatial response is smooth
(Fig. ) instead of boxcar. In contrast, the discretization
error is much smaller (Fig. f) because of the small
size of the target grid cells (1 km).
The analysis for Fig. is repeated for a range of
target grid sizes (1–50 km, or about 0.01–0.5∘) and different
smoothness of the spatial response functions using the same OMI observation
locations. The spatial response function is assumed to be 2-D super Gaussian
(Eq. ). The exponent in the along-track direction (k2) is
tuned from 2 to 64, whereas the exponent in the across-track direction
(k1) is set to be 2×k2. Figure a shows, for
satellite observations with a quadrilateral FOV, the contour of the ratio
between the discretization error and the tessellation error, calculated as
the root mean square of the differences between the ideal observation and
the simplifications using tessellation and spatial response discretization,
respectively. The contour line of unity divides the regimes where
tessellation and discretization errors are dominant: discretization of the
spatial response is more accurate for fine-grid oversampling of satellite
observations with smooth spatial responses (small k1 and k2);
tessellation is more accurate for coarser target grids and sharper spatial
responses. Tessellation is perfect if k1 and k2 both approach infinity.
The case of OMI (k1=4, k2=2) lies at the left edge (red vertical
dashed line in Fig. a), and its intersection with the unity
contour line is located at the target grid size of ∼16 km. In other words,
it is beneficial to explicitly consider the spatial response of OMI
observation for target oversampling grids finer than ∼16 km (about
0.15∘).
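The two regimes can be illustrated with a minimal 1-D sketch. It is not the calculation behind the figure: it uses a single pixel, an assumed super Gaussian form scaled to a given FWHM, and hypothetical names (`super_gaussian_1d`, `error_ratio`). The "ideal" weight of each grid cell is taken as the cell-mean response from fine sub-sampling, tessellation as the fractional overlap with a boxcar of the same FWHM, and discretization as the response evaluated at the cell center.

```python
import numpy as np

def super_gaussian_1d(x, fwhm, k):
    """1-D super Gaussian (assumed form): peak 1, value 0.5 at |x| = fwhm / 2."""
    w = fwhm / (2.0 * np.log(2.0) ** (1.0 / k))
    return np.exp(-np.abs(x / w) ** k)

def error_ratio(fwhm, k, cell, center=3.0, half_domain=60.0, n_sub=200):
    """RMS discretization error divided by RMS tessellation error (one pixel)."""
    edges = np.arange(-half_domain, half_domain + cell / 2.0, cell)
    left, right = edges[:-1], edges[1:]
    mid = 0.5 * (left + right)

    # "Ideal" weight: mean response over each cell from fine sub-sampling.
    frac = (np.arange(n_sub) + 0.5) / n_sub
    sub = left[:, None] + frac[None, :] * cell
    ideal = super_gaussian_1d(sub - center, fwhm, k).mean(axis=1)

    # Tessellation: fractional overlap of each cell with a boxcar pixel.
    lo, hi = center - fwhm / 2.0, center + fwhm / 2.0
    overlap = np.clip(np.minimum(right, hi) - np.maximum(left, lo), 0.0, None)
    tess = overlap / cell

    # Discretization: response evaluated at the cell center only.
    disc = super_gaussian_1d(mid - center, fwhm, k)

    def rms(err):
        return np.sqrt(np.mean(err ** 2))

    return rms(disc - ideal) / rms(tess - ideal)

# Smooth response on a fine grid versus sharp response on a coarse grid.
ratio_smooth_fine = error_ratio(fwhm=20.0, k=2, cell=1.0)
ratio_sharp_coarse = error_ratio(fwhm=20.0, k=64, cell=10.0)
```

In this toy setting the ratio falls below unity for the smooth response on the fine grid and exceeds unity for the sharp response on the coarse grid, mirroring the two regimes separated by the unity contour line.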
Similarly, Fig. b shows the ratios between
discretization and tessellation errors for satellite observations with
circular FOVs. The pixel dimensions and locations of IASI observations for
2015–2016 are used with standard data screening, and the spatial response
function is assumed to be a rotating super Gaussian
(Eq. ). The exponent (equal to 2×k3)
varies from 2 to 64. When the IASI spatial response is characterized as a
rotating super Gaussian function, the exponent is about 18, and it intersects the
unity contour line at a target grid size of ∼2 km. If the IASI
instrument had the same spatial response as CrIS (an exponent of about 8),
the intersection would be at a target grid size of ∼4 km. The results
would be very similar when using the CrIS observation locations instead of
IASI because the exact locations of any observations are averaged out and
the IASI and CrIS pixel sizes are similar.
(a) The ratio between discretization and tessellation errors for
different combinations of spatial response function shapes and target grid
size. The unity contour line delineates the regime where the tessellation error
is larger than the discretization error (blueish contours) and the regime where
the discretization error is larger than tessellation error (reddish
contours). The red vertical dashed line indicates the approximate spatial
response for OMI. The red star marks the threshold target grid size where the
tessellation and discretization errors are equal for OMI. (b) Similar to panel (a) but
the IASI pixel shapes and locations are used instead of OMI. The spatial
response function exponents for CrIS and IASI and their intersections with the
unity contour line are marked.
As shown by Fig. , the balance between tessellation and
discretization errors depends on both the target grid size and the deviation
of satellite spatial response function from an ideal 2-D boxcar shape. The
uncertainty in the knowledge of the spatial response functions is not
considered here, but the spatial response function can be characterized
prelaunch and validated on orbit. For all three
cases, the tessellation error significantly outweighs the discretization
error at the 1 km oversampling grid size, by a factor of 4 for IASI and over 200
for OMI. Therefore, we recommend discretization of the spatial response
function at a 1 km (or 0.01°) grid for regional-scale oversampling of
OMI, IASI, and CrIS data and then co-adding to coarser grids if necessary. The
threshold grid size where tessellation and discretization errors balance also
depends on the ground size of satellite FOV. For the OMI successor missions
with significantly smaller pixels (e.g., TROPOMI, TEMPO), the threshold grid
size is expected to be finer.
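A minimal sketch of the co-adding step is given below. It assumes that the fine-grid numerator and denominator fields, A(j) and B(j), are stored as 2-D arrays, and that the coarse grid is an integer multiple of the fine grid. The function names `coadd` and `coarse_average` are hypothetical. The sums are co-added first and the ratio is taken afterwards, so the weighting of each level 2 observation is carried to the coarse grid.

```python
import numpy as np

def coadd(field, factor):
    """Sum a 2-D fine-grid field over non-overlapping factor x factor blocks."""
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.sum(axis=(1, 3))

def coarse_average(A, B, factor):
    """Coarse-grid average as the ratio of co-added A and B (NaN where B is 0)."""
    A_c, B_c = coadd(A, factor), coadd(B, factor)
    return np.divide(A_c, B_c, out=np.full_like(A_c, np.nan), where=B_c > 0)

# Example: co-add a 4 x 4 fine grid to a 2 x 2 coarse grid.
A = np.arange(16, dtype=float).reshape(4, 4)
B = np.ones((4, 4))
C_coarse = coarse_average(A, B, 2)
```

Coarse cells that received no weight are returned as NaN rather than zero, so that data gaps remain distinguishable from low column amounts.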
Spatial resolution and spatial sampling
The difference between resolution and sampling density for 1-D spectral data
has been thoroughly discussed in the
literature. However, for 2-D,
spatially resolved data, it is common to refer to both the sizes of the level 2
pixels and the size of the level 3 grid as the spatial “resolution” of
the data. To avoid confusion, it is emphasized here that the true spatial
resolution is limited by the sizes of level 2 pixels. The size of level 3
grid only determines the density of spatial sampling, which does little to
enhance the true resolving power of the data after reaching a certain point.
For example, the oversampling results using synthetic OMI data at 1
vs. 0.05 km grids are very similar (Fig. ). Nonetheless, it
is still beneficial to oversample, i.e., make level 3 grid size significantly
smaller than level 2 pixel sizes, as demonstrated by Fig. . As
the ground truth, an array of 2-D Gaussian functions is generated with FWHM
ranging from 1 to 16 km (the second column of Fig. ) and
peak height of unity, and this true field of concentration is measured by an
imaginary sensor whose spatial response function is a 2-D super Gaussian
(Eq. ) with FWHM = 10 km and k1=k2=8 (the first column
and the white boxes inserted in the third column). The third column shows the
oversampling results using 10 000 randomly located observations. The fine
structures in the ground truth are clearly smoothed, limited by the spatial
resolution that is inherent to the level 2 pixel sizes (10 km). However, by
oversampling at a fine grid (0.2 km for the first row vs. 5 km for the second
row), the spatial gradients are better recovered, and spatial features finer
than individual level 2 pixels can be identified. Additionally, the details
of the spatial response function are better resolved with a finer target grid,
which is particularly beneficial when collocating with higher-resolution
measurements (e.g., a cloud imager). As such, although the spatial resolving
power is ultimately determined by the spatial extent of satellite pixels, the
physical oversampling approach helps enhance the visualization of spatial
gradients and the identification of emission sources.
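The distinction between resolution and sampling can be reproduced in a 1-D sketch under simplifying assumptions: a single Gaussian feature with a 2 km FWHM as the ground truth, a sensor with a super Gaussian response (FWHM = 10 km, exponent 8), 2000 randomly located noise-free observations, and unweighted averaging. The function names and the 1-D super Gaussian form are assumptions made for this example.

```python
import numpy as np

def super_gaussian(x, fwhm, k):
    """1-D super Gaussian (assumed form): peak 1, value 0.5 at |x| = fwhm / 2."""
    w = fwhm / (2.0 * np.log(2.0) ** (1.0 / k))
    return np.exp(-np.abs(x / w) ** k)

def oversample_1d(grid, centers, values, fwhm, k):
    """Average of observations at each grid point, weighted by the response."""
    S = super_gaussian(grid[:, None] - centers[None, :], fwhm, k)
    return (S @ values) / S.sum(axis=1)

rng = np.random.default_rng(0)
x = np.arange(-30.0, 30.0 + 1e-9, 0.1)         # fine axis for the truth (km)
truth = super_gaussian(x, fwhm=2.0, k=2)       # narrow Gaussian, peak of 1
centers = rng.uniform(-30.0, 30.0, size=2000)  # random pixel centers (km)

# Each observation is the truth averaged over the sensor spatial response.
R = super_gaussian(x[None, :] - centers[:, None], 10.0, 8)
obs = (R * truth).sum(axis=1) / R.sum(axis=1)

grid_fine = np.arange(-20.0, 20.0 + 1e-9, 0.2)
grid_coarse = np.arange(-20.0, 20.0 + 1e-9, 5.0)
fine = oversample_1d(grid_fine, centers, obs, 10.0, 8)
coarse = oversample_1d(grid_coarse, centers, obs, 10.0, 8)
```

Both results remain well below the true peak of unity, because the 10 km pixels limit the resolution regardless of the grid. The fine grid nevertheless traces the shape and position of the smoothed feature with many more points than the coarse grid.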
First column: spatial response function of an imaginary sensor
discretized at 0.2 km (a–c) and 5 km (d–f) grids. Second column: ground
truth spatial distribution generated as an array of 2-D Gaussian functions of
the same height (the top and bottom panels are the same). The FWHM of each
Gaussian is labeled. Third column: physical oversampling results using 10 000
randomly generated observations and discretized at 0.2 km (a–c) and 5 km
(d–f) grids. The pixel size, which determines the spatial resolution, is
indicated by the inset white boxes.
Applications to satellite datasets
Physical oversampling using OMI data
Figure compares the drop-in-the-box method, point
oversampling, tessellation, and physical oversampling using OMI NO2
tropospheric vertical column density (TVCD) within a 200 km × 200 km square
centered around a power plant in Arizona. The first column shows the simple
drop-in-the-box method on a 10 km grid. The second column averages OMI
observations within a 12 km radius of each grid center. These two approaches
treat OMI observations as points without consideration of pixel geometry and
retrieval uncertainties. The third column shows results using the
tessellation approach, and the fourth column shows the physical oversampling
with the OMI spatial response function represented as a 2-D super Gaussian function with
k1=4 and k2=2. The target grid size is 1 km for the last three
approaches. The first and third rows show the oversampled results (C(j) in
Eq. ) using 5 days (1–5 July 2005) and 5 months (May–September 2005)
of data, respectively. The second and fourth rows show the
corresponding numbers of pixels included in the averaging for each grid cell
(D(j) in Eq. ). For the drop-in-the-box approach, the total
number of satellite observations included for each grid cell is much smaller
and shown with a different color scale for the 5-month averaging.
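The accumulation of C(j) and D(j) can be sketched as follows. This is a simplified illustration rather than the implementation used for the figure: pixel rotation is ignored (the across-track and along-track axes are aligned with x and y), each observation is weighted by its spatial response divided by the squared uncertainty (an assumed weighting), and the 2-D super Gaussian form, the function names, the dictionary keys, and the example pixel dimensions are hypothetical.

```python
import numpy as np

def super_gaussian_2d(dx, dy, fwhm_x, fwhm_y, k1=4, k2=2):
    """2-D super Gaussian (assumed form) with 0.5 at half the FWHM per axis."""
    wx = fwhm_x / (2.0 * np.log(2.0) ** (1.0 / k1))
    wy = fwhm_y / (2.0 * np.log(2.0) ** (1.0 / k2))
    return np.exp(-np.abs(dx / wx) ** k1 - np.abs(dy / wy) ** k2)

def physical_oversample(xg, yg, observations):
    """Return the level 3 average C and the summed spatial response D."""
    X, Y = np.meshgrid(np.asarray(xg, float), np.asarray(yg, float))
    A = np.zeros_like(X)  # sum of weighted observations
    B = np.zeros_like(X)  # sum of weights
    D = np.zeros_like(X)  # effective number of observations per grid cell
    for o in observations:
        S = super_gaussian_2d(X - o["x"], Y - o["y"], o["fwhm_x"], o["fwhm_y"])
        w = S / o["sigma"] ** 2  # assumed weighting by retrieval uncertainty
        A += w * o["value"]
        B += w
        D += S
    C = np.divide(A, B, out=np.full_like(A, np.nan), where=B > 0)
    return C, D

# Two collocated example pixels with equal uncertainty on a 1 km grid.
xg = yg = np.arange(-20.0, 21.0, 1.0)
pixels = [
    dict(x=0.0, y=0.0, fwhm_x=24.0, fwhm_y=13.0, value=2.0, sigma=1.0),
    dict(x=0.0, y=0.0, fwhm_x=24.0, fwhm_y=13.0, value=4.0, sigma=1.0),
]
C, D = physical_oversample(xg, yg, pixels)
```

Because every observation contributes a smooth weight to many grid cells, D(j) varies continuously across the domain instead of taking the integer counts of the drop-in-the-box method.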
The drop-in-the-box approach shows significant data gaps (5-day averaging)
and a high level of noise (5-month averaging), even when its target grid is 10
times coarser than the other oversampling approaches. There are two gaps
where no observation is available for point oversampling over the 5 days
(column 2, rows 1–2 in Fig. ), which is an example of
false negatives as these gaps are actually covered by OMI pixels (column 3,
rows 1–2 in Fig. ). The physical oversampling in the
fourth column consistently shows the smoothest results with clear
identification of the point source at the center of the domain, because the
spatial response function of OMI is properly incorporated. The oversampled
NO2 TVCD is biased high for the point oversampling approach because all
observations within the averaging radius are weighted equally, even though larger
observation values are generally associated with larger uncertainties. The
results from tessellation become increasingly similar to those from physical
oversampling for longer averaging times, because the tessellation error is
randomly distributed and will eventually be averaged out. The physical
oversampling also does not require more computational resources than point
oversampling and tessellation, making it suitable for a wide range of spatial
scales and target grids.
Level 3 results using the drop-in-the-box method (10 km grid, a, e, i, m),
point oversampling (averaging radius: 12 km, 1 km grid, b, f, j, n),
tessellation (pixel corners from the OMPIXCOR product, 1 km grid,
c, g, k, o), and physical oversampling (2-D super Gaussian with k1=4 and
k2=2, 1 km grid, d, h, l, p). The domain size is 200 km × 200 km. The
first and third rows show the oversampled NO2 TVCD for 5 days and 5 months,
and the second and fourth rows show the corresponding numbers of OMI
observations used in the averaging for each grid cell. Note that panel (m) is on a different color scale than the other panels in the
same row.
Physical oversampling using IASI data with smoother spatial sensitivity distributions
Although the physical oversampling using the true satellite spatial response
functions produces the optimal estimation, the result is sometimes noisy and
even unphysical, especially when the observations are noisy and sparse. In
these cases, some spatial interpolation or smoothing methods are often
needed. In addition to the specialized interpolation and smoothing methods
discussed in Sect. , some smoothing can be applied
within the oversampling framework. For example, the level of smoothing can be
adjusted by the averaging radius in the point oversampling approach.
A Gaussian filter has also been used to smooth tessellation results
for OMI HCHO and CHOCHO products. When using the generalized 2-D super
Gaussian function to characterize the satellite spatial response function
(Eq. ), it is also simple to tune the exponents (k3 in the
cases of circular FOVs such as IASI and CrIS and k1 and k2 in the cases of
quadrilateral FOVs such as OMI) so that the assumed satellite spatial
sensitivity distribution is smoother than the true spatial response function.
This often leads to better visualization and identification of local hot
spots, especially for products with a high noise level or sparse spatial
sampling. The advantage of this approach is that the smoothing is applied at
the satellite pixel level (level 2) instead of grid level (level 3), so the
geometry and error information for each satellite observation are preserved.
Similar to Fig. but using the IASI NH3 total column
product for 2015. The drop-in-the-box approach is not included. Instead, the
physical oversampling results using a smoother version of the IASI spatial
response function are shown in panels (d, h, l, p). The true IASI spatial
response function has much sharper edges than OMI, such that the physical
oversampling results (c, g, k, o) are very similar to tessellation results (b, f, j, n).
Figure shows similar oversampling results as
Fig. , but using IASI NH3 total column density
data for 2015 in eastern Colorado, centered around a
large cattle feedlot. The drop-in-the-box approach is not shown for IASI. The
results from point oversampling, tessellation, and physical oversampling to a
1 km grid are presented in the first three columns. The true IASI spatial
response functions have rather sharp edges (see Appendix A), so the physical
oversampling shown in the third column of Fig. is very
similar to tessellation shown in the second column. Although this is the
optimal estimation based on the physics of IASI observation, the spatial
gradients are hard to identify for 5-day averaging and noisy for 5-month
averaging. Instead of applying smoothing after the oversampling process, the
fourth column uses a smooth spatial sensitivity distribution of a 2-D
standard Gaussian function (2k3=2, rather than the true IASI spatial
response function with 2k3∼18). As illustrated by the first row in
Fig. , the physical oversampling using smoother spatial
sensitivity distributions provides the best results by clearly identifying the
central point source using only sparse (5-day) data. The third row in
Fig. demonstrates that with 5 months of averaging, the
local NH3 gradients are well resolved. The point oversampling using a
12 km radius overly smooths the results, making the central hot spot
artificially larger, whereas the general spatial gradients are still noisy
(column 1, row 3). The overall number of IASI observations used in point
oversampling is also significantly higher than in tessellation and physical
oversampling, as shown by the fourth row. This is because the 12 km
averaging circle is much larger than most IASI footprints, and hence many
IASI observations are double counted as false positives. The smoothing based
on physical oversampling is much more effective in suppressing the noise, and
the spatial gradients are adequately preserved (column 4, row 3). This is
because each satellite FOV keeps the same FWHM and overall weight, and only
the distribution of sensitivity becomes more spread out.
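The double counting in point oversampling can be illustrated with a simple count at one grid point. The sketch assumes circular footprints with a uniform 12 km diameter and uniformly random pixel centers, which simplifies the real IASI geometry (footprints become larger and elliptical away from nadir). The function name and the numbers are illustrative only.

```python
import numpy as np

def counts_at_grid_point(cx, cy, gx, gy, averaging_radius, footprint_diameter):
    """Observations used by point oversampling versus footprints covering the point."""
    d = np.hypot(cx - gx, cy - gy)
    n_point = int(np.count_nonzero(d <= averaging_radius))
    n_cover = int(np.count_nonzero(d <= footprint_diameter / 2.0))
    return n_point, n_cover

rng = np.random.default_rng(42)
cx = rng.uniform(-100.0, 100.0, size=5000)  # pixel centers in a 200 km domain
cy = rng.uniform(-100.0, 100.0, size=5000)
n_point, n_cover = counts_at_grid_point(cx, cy, 0.0, 0.0, 12.0, 12.0)
```

With a 12 km averaging radius and 6 km footprint radius, the averaging circle has four times the area of a footprint, so point oversampling is expected to include roughly four times as many observations as actually cover the grid point. The additional ones are the false positives.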
Physical oversampling results using IASI-A NH3 total columns
under southerly wind (a, c) and northerly wind (b, d) and high
PBL temperature (>15 °C, a, b) and low
PBL temperature (<15 °C, c, d). The text arrows show the average
wind speed and wind direction at the locations and times of all IASI observations
in each category. The size and location of large CAFOs are overlaid.
Oversampling based on Eqs. ()–() also provides a flexible way
to categorize the results according to environmental and temporal variables.
The conventional way is to save the averaging weights for each level 2
observation (i.e., the level 2G product, where level 2 pixels are assigned to
points of the latitude and longitude grid), but the averaging weights can only be
defined for a specific grid. When representing each level 2 observation as a
spatial sensitivity distribution (the actual instrument spatial response
function or a smoother version of it), A(j) and B(j) can be calculated at
fine spatial and temporal grids and then aggregated spatially and/or
temporally. The level 3 map C(j) is just the grid-by-grid ratio of the
aggregated A(j) and B(j). Similarly, A(j) and B(j) can be calculated
according to environmental variables such as wind and temperature at fine
intervals and binned to coarser categories as needed.
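A sketch of this bookkeeping is shown below, assuming that the spatial sensitivity of each observation has already been evaluated on the target grid and that each observation carries an integer bin index (e.g., a month or a wind direction interval). The function names and array layout are hypothetical, and the weights are simplified to the spatial sensitivity alone.

```python
import numpy as np

def accumulate_by_bin(S, values, bins, n_bins):
    """Accumulate A and B separately for each fine bin.

    S:      (n_obs, ny, nx) spatial sensitivity of each observation on the grid
    values: (n_obs,) observed column amounts
    bins:   (n_obs,) integer bin index of each observation
    """
    A = np.zeros((n_bins,) + S.shape[1:])
    B = np.zeros((n_bins,) + S.shape[1:])
    np.add.at(A, bins, S * values[:, None, None])
    np.add.at(B, bins, S)
    return A, B

def aggregate(A, B, selected_bins):
    """Level 3 map for a coarser category: ratio of the summed A and B."""
    idx = list(selected_bins)
    A_sum, B_sum = A[idx].sum(axis=0), B[idx].sum(axis=0)
    return np.divide(A_sum, B_sum, out=np.full_like(A_sum, np.nan), where=B_sum > 0)

# Synthetic example: 50 observations on a 3 x 4 grid, sorted into 6 fine bins.
rng = np.random.default_rng(1)
S = rng.uniform(0.1, 1.0, size=(50, 3, 4))
values = rng.normal(5.0, 1.0, size=50)
bins = rng.integers(0, 6, size=50)
A, B = accumulate_by_bin(S, values, bins, 6)
C_first_half = aggregate(A, B, range(0, 3))
C_all = aggregate(A, B, range(6))
```

Since only sums are stored, any grouping of the fine bins can be formed afterwards without revisiting the level 2 data, and aggregating all bins reproduces the map that would be obtained by processing all observations at once.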
Figure shows the physical oversampling of NH3 total
column under southerly winds (meridional wind component >0, panels a and c) and
northerly winds (meridional wind component <0, panels b and d) and high PBL
temperature (>15 °C, panels a and b) and low PBL temperature (<15 °C,
panels c and d). Here the PBL temperature is the average air temperature from
the surface to the top of the PBL, weighted by pressure. The average wind speed
and wind direction under each category are labeled in the corresponding
panels. IASI-A daytime data from 2008 to 2017 over northeastern Colorado are
included in the oversampling, and a 2-D standard Gaussian is used as the
spatial sensitivity distribution to smooth the results. The 3-D wind field,
atmospheric temperature, surface pressure, and PBL height are interpolated
from the North American Regional Reanalysis (NARR), from their native resolutions of 32 km
and 3 h, to
the IASI pixel locations and overpass time. Using the concentrated animal
feeding operation (CAFO) locations (colored dots; data courtesy of Daniel Bon, Colorado Department of Public Health and Environment) as a spatial
reference, the downwind dispersion of the total NH3 column under different
wind directions is clearly seen. The close match between large cattle CAFOs
and the NH3 hot spots seen from space confirms that they are the dominant
source of atmospheric NH3 in this region. The overall abundance of NH3
is significantly higher at warmer temperatures, in agreement with the
previous in situ quantification of CAFO NH3 emissions in the same
region.
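The categorization described above can be sketched as follows. The pressure weighting is implemented here as a simple average with weights proportional to the pressure at each level, which is one possible reading of the definition. The function names, the category labels, and the assignment of exact ties (zero meridional wind, exactly 15 °C) to the second category are assumptions made for this example.

```python
import numpy as np

def pbl_temperature(level_temps, level_pressures):
    """Average temperature over PBL levels, weighted by pressure (assumed scheme)."""
    temps = np.asarray(level_temps, dtype=float)
    pressures = np.asarray(level_pressures, dtype=float)
    return float(np.average(temps, weights=pressures))

def categorize(v_wind, pbl_temp_c, threshold_c=15.0):
    """Assign an observation to a wind direction and temperature category."""
    wind = "southerly" if v_wind > 0 else "northerly"
    temp = "warm" if pbl_temp_c > threshold_c else "cold"
    return wind, temp

# Example: two PBL levels (temperature in degrees C, pressure in hPa).
t_pbl = pbl_temperature([20.0, 14.0], [850.0, 750.0])
category = categorize(v_wind=2.5, pbl_temp_c=t_pbl)
```

Each observation would then contribute its A(j) and B(j) terms only to the accumulators of its own category.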
Conclusions
A physics-based approach is developed to oversample diverse satellite
observational products to high-resolution destination grids. It represents
each FOV as a sensitivity distribution on the ground, which is physically a
more realistic representation of satellite observations. This sensitivity
distribution can be determined by the spatial response function of each
satellite sensor. We propose a generalized 2-D super Gaussian function that
can standardize the spatial response functions of many satellite sensors with
distinct observation mechanisms and viewing geometries. This generalized 2-D
super Gaussian function can be reduced to a rotating super Gaussian to
characterize the circular FOV of IASI and CrIS or a 2-D super Gaussian to
characterize the quadrilateral FOV of OMI and its successors. It can also
represent hybrid cases where the FOV is quadrilateral but with rounded
corners. When the shape-determining exponents in the generalized 2-D super
Gaussian function approach infinity, the FOV is equivalent to a polygon, as
assumed in the tessellation approach.
Synthetic OMI and IASI observations were generated assuming the spatial
response functions are perfectly known to compare the tessellation error and
the discretization error. The balance between these two error sources depends
on the target grid size, the ground size of FOV, and the smoothness of
spatial response functions. The proposed oversampling approach is generally
more accurate for fine-grid oversampling of satellite observations with
smooth spatial responses, whereas tessellation is more accurate for coarse
grids and sharper spatial responses. For OMI, CrIS, and IASI, the threshold
target grid sizes where the two errors are equal are ∼16, ∼4, and
∼2 km, respectively. Therefore, it is recommended to oversample to 1 km
(0.01°) and then co-add to coarser grids if necessary for regional
studies. Tessellation may be more desirable for generating global level 3
products with coarse grids. The generalized 2-D super Gaussian function also
enables smoothing of the level 3 results by decreasing the shape-determining
exponents, which is useful for noisy or sparse satellite datasets. Because this
smoothing is performed at each observation and leaves the weightings of level 2
pixels unchanged, it is more physically realistic than arbitrarily tuning the
averaging radius or spatially filtering the level 3 map.
The new physical oversampling approach is applied to OMI NO2 products and
IASI NH3 products, showing substantially improved visualization of trace
gas distribution and local gradients. With proper consideration of the
spatial response functions, this approach can be applied to multiple
previous, current, and future satellite datasets, which will help to create
long-term consistent data records for atmospheric composition.