New sampling strategy removes imaging spectroscopy solar-smearing bias in sub-km vapour scaling statistics

Upcoming spaceborne imaging spectrometers will allow retrieval of total column water vapour (TCWV) over land at horizontal resolution of 30—80 m. Here we show how to obtain, from these retrievals, exponents describing the power-law scaling of sub-km horizontal variability in clear-sky bulk planetary boundary layer (PBL) water vapour (q). Using large-eddy simulations (LES) of shallow convective PBLs we show how sunlight entering the PBL up to several km away from the 10 footprint location degrades estimates of these exponents. We address this by calculating exponents perpendicular to the solar azimuth, that is to say flying “across” the sunlight path rather than “towards” or “away” from the Sun. Across 23 LES snapshots, at SZA=60° the mean bias in calculated exponent is 38±12 % (95 % range) along the solar azimuth, while following our strategy it is 3±9 % and no longer significant. Both bias and root-mean-square error RMSE decrease with lower SZA. We include retrieval errors from several sources including: (1) the Earth Surface Mineral Dust Source Investigation (EMIT) 15 instrument noise model, (2) requisite assumptions about the atmospheric thermodynamic profile, and (3) spatially nonuniform aerosol distributions. This technique can be used to obtain unique information about sub-km PBL q scaling from upcoming spaceborne spectrometer missions, while mitigating errors due to challenging solar geometries. Copyright statement. ©2021 California Institute of Technology. Government sponsorship acknowledged. 20

Such that zn is the log-log gradient of Sn as a function of r. There is strong motivation to quantify and understand these exponents and the ranges Δ within which they are valid, and here we specify second-order structure functions S2 with exponent z2, describing variance scaling. This is related to the commonly-referenced Fourier power spectrum exponent b: One motivation for obtaining these exponents is that climate model sub-grid variability in q is strongly linked to cloud 5 formation (Golaz et al., 2002;Perraud et al., 2011;Sommeria and Deardorff, 1977). In principle sub-grid variance can be tuned for each model setup, but scale-aware variance relationships allow a smooth and consistent transition between low (order ~hundreds of km) and high (order ~km) resolution models (Arakawa et al., 2011;Schemann et al., 2013).
At scales larger than model grid cells, observational estimates of variance scaling can also be used to assess model performance, as has been done using estimates of temperature (T) and q from Atmospheric Infrared Sounder (AIRS) and airborne campaign 10 data (Kahn et al., 2011).
Furthermore, the scaling exponents are related to the physical processes that generate the cascade of turbulent eddies in the atmosphere. For example, while mean-scale statistics retrieved by AIRS are approximately isotropic in the horizontal, there are differences in scaling between the horizontal and vertical (Pressel and Collins, 2012). Scaling following z2=2/3 is predicted for a passive tracer in the inertial range of three-dimensional locally isotropic turbulence following Kolmogorov theory. 15 Accounting for the buoyancy effects can strongly modify that scaling (Bolgiano, 1959;Obukhov, 1959), with different exponents expected for the velocity (z2=6/5-7/5) and scalars (z2=2/5), as shown in, e.g. Kunnen et al. (2008), Wroblewski et al. (2010) or Boffetta et al. (2012). Exponents of 7/5 have been commonly measured for vertical wind profiles from dropsondes (Lovejoy et al., 2007), and values typically near z2 of 6/5 for horizontal water vapour in non-convective areas, and z2=0.72 in convective areas have been determined from airborne lidar retrievals over horizontal ranges of up to 100 km (Fischer et al., 20 2012(Fischer et al., 20 , 2013, meanwhile the sub-km regime remains undermeasured. Of particular interest for modellers are the existence of "scale breaks", distances at which the exponents change such that a smooth transition between model resolutions may not be possible. These have been calculated to occur at distances over a broad range of 10-1000 km (Bacmeister et al., 1996;Gage and Nastrom, 1985;Kahn et al., 2011;Pinel et al., 2012) and the range of scales has been speculated to be related to the size of convective systems (Dorrestijn et al., 2018) or changes in the 25 nature of turbulence (Kurowski et al., 2015;Skamarock et al., 2014).
This study focusses on the estimation of horizontal scaling of clear-sky TCWV at sub-km scales, and specifically on the integrated PBL water vapour which we refer to as the partial column water vapour (PCWVPBL). Recent work using airborne data  and large eddy simulation (LES) output  has provided evidence that subkm horizontal variability in total column water vapour (TCWV) is almost perfectly correlated with PCWVPBL variability, such 30 that high-spatial-resolution retrievals of TCWV from visible and shortwave infrared (VSWIR) imaging spectrometers can provide unique information about PBL q variability. This is not a statement that all water vapour is inside the PBL, rather that on sub-km spatial scales, the variability in low-altitude water vapour dominates the horizontal column variability. It remains a https://doi.org /10.5194/amt-2021-163 Preprint. Discussion started: 11 August 2021 c Author(s) 2021. CC BY 4.0 License. challenge to disentangle variability from different heights within the PBL such as the sub-cloud layer or a conditionally unstable cloud layer, which may have different scaling properties.
In particular, the PBL depth is typically 1-2.5 km in these simulations, while we obtain variability statistics over horizontal ranges of under 1 km. The vertical averaging over a scale larger than the horizontal calculation means that the interpretation of the physical meaning of the derived exponents is challenging and may not be directly related to the theoretically derived 5 exponents discussed above. This study aims only to determine whether the measurement problem of the solar path can be overcome, and leaves the physical interpretation out of scope. This point will be revisited in Section 4.
Several modern and upcoming missions obtain or will obtain VSWIR spectra that could allow retrieval of TCWV at horizontal resolutions from 20-100 m. Current examples include the Multi-Spectral Imager (MSI) on Sentinel-2 (Drusch et al., 2012), the PRecursore IperSpettrale della Missione Applicativa (PRISMA, Candela et al. (2016)) and DLR Earth Sensing Imaging 10 Spectrometer (DESIS, Krutz et al. (2019)). Upcoming missions such as NASA's Earth Surface Mineral Dust Source Investigation (EMIT; Green and Thompson (2020)) and ESA's Copernicus Hyperspectral Imaging Mission for the Environment (CHIME; e.g. Rast et al. (2019)) will obtain horizontal resolution of order 30-80 m, and this study will assess performance assuming footprint sizes of 40-50 m, i.e. at the mid-point of that range.
This footprint selection was made based on the resolution of available LES output, and the analysis method is based on the 15 output of retrievals developed specifically for EMIT, which primarily retrieves TCWV to allow atmospheric correction for its surface reflectance target observable. Other instruments such as PRISMA or MSI may have different error characteristics, but we treat retrieval errors in a general manner that could be expanded to these other instruments.
This study attempts to determine whether EMIT will be able to obtain z2 over 0.5-1 km after accounting for (1) random retrieval error, (2) systematic biases in retrieval mean and sensitivity and (3) solar zenith angle. 20 The PBL is of particular interest since it is the location where reflective low clouds form and in Dorrestijn et al. (2018)'s analysis, z2 varied more in the 850 hPa layer than the 300 hPa or 500 hPa layers. However, previous analyses have generally been restricted to far larger spatial ranges with Dorrestijn et al. (2018) referring to 55-165 km scale variance as occurring at the "tiny scale". Other examples at higher resolution generally use airborne measurements, with few calculations for separations under 1 km using onboard sensors (Cho et al., 1999), using lidar at 5-100 km (Fischer et al., 2013) or evaluating 25 simulations with lidar for separations >11 km (Selz et al., 2017).
This study uses LES outputs and does not make any statements about the realism or cause of the LES output z2 values, but instead aims solely to identify and quantify retrieval biases and errors. Its greatest contribution is to demonstrate that directional calculation strategies can remove biases in estimates of z2 introduced by the solar path through the atmosphere. For illustration, consider a PBL of 2 km depth being viewed at nadir with a solar zenith angle (SZA) of 30°, and note that the sunlight enters 30 the PBL >1 km away from the nominal surface target. This effectively coarsens the horizontal resolution, potentially destroying the benefits of a nominal 50 m footprint for atmospheric investigation. For this reason Thompson et al. (2021) only used flight lines with very low SZA to calculate z2. This paper uses the solar-path-traced outputs of Richardson et al. (2021)  that by calculating the structure function in a direction perpendicular to the solar azimuth, the bias in z2 is removed for calculations within the 23 LES snapshots considered.
If this result can be extended to the real-world, then upcoming high-spatial-resolution spaceborne VSWIR spectrometers will provide a breakthrough for analysis of moisture scaling across unprecedented spatial scales. Our retrievals are restricted to clear sky areas over land, but this still represents a substantial advance on the current capacities of other instrument types. 5 Lidar measurements avoid the solar-path issue, and can retrieve vertical information including above clouds, but current and anticipated spaceborne lidars do not offer VSWIR's fine spatial resolution or broad spatial coverage from a wide swath.
Sounders such as infrared can profile the atmosphere, but have footprints that are too large for sub-km exploration.
Even airborne measurements, which offer far less coverage than spaceborne sensors, may suffer from their own challenges.
Evidence suggests that the tendency of flights to follow isobars rather than maintain altitude can introduce height variations 10 that blend vertical variation into horizontal calculations and result in exponents similar to those predicted due to buoyancy's vertical effect (Lovejoy et al., 2004;Pinel et al., 2012).
The VSWIR sampling technique introduced here could greatly expand the range of conditions under which spatial scaling of water vapour can be quantified. In Section 2 we describe the LES output, simulated retrievals, and how retrieval errors and solar path are accounted for in calculation of z2. Section 3 presents the results and Section 4 discusses and concludes.

Large Eddy Simulation output
We use output from the five LES runs of shallow convection as in Richardson et al. (2021), with four cloudy cases (ARM, ARM_lsconv, BOMEX, RICO) and one case in which clouds do not form (DRY). Simulations use two models, EULAG for the ARM cases (Prusa et al., 2008) and JPL-UCONN LES for the others (Matheou and Chung, 2014). Simulation setups are 20 described in Richardson et al. (2021) and the associated references (Brown et al., 2002;Kurowski et al., 2020;Matheou and Chung, 2014;Siebesma et al., 2003;vanZanten et al., 2011), and while some simulations represent oceanic boundary layers we simply assume a land surface for the retrievals. Static reanalysis profiles from MERRA-2 (Gelaro et al., 2017) are appended above the LES domain, but the LES domains were shown to capture horizontal variability in q from analysis of LES output and airborne lidar profiles over the Pacific (Bedka et al., 2021). The 23 selected snapshots are labelled by their timestamp, e.g. 25 DRY_7200s represents two hours into the DRY simulation. The vapour qv and liquid water qc fields are extracted from these snapshots and then TCWV is calculated from the appropriate pressure-weighted vertical sum. The LES output horizontal resolution ranges from Dx=20-50 m, here we degrade the Dx=20 m cases to Dx=40 m resolution to make the spatial difference statistics more consistent.
SZA ranges from 0-60° inclusive in increments of 15°, and the solar ray is traced along the selected SZA from the top of 30 atmosphere to the centre of a surface grid cell, and then directly up to represent a nadir viewing instrument. The solar azimuth angle is the positive y direction in each simulation, such that the sunlight's horizontal path component is in the negative y direction. Path qv is pressure weighted to obtain path column water vapour, which we also refer to as TCWV. Footprints are flagged as cloudy or shaded if their cloud water path CWP>1´10 -3 mm, with CWP calculated in the same manner as the TCWV but using liquid water content, and this threshold represents approximately t<0.3 in a subadiabatic cloud (Szczodrak et al., 2001). A footprint is masked if it is cloudy or shaded for any SZA, and by using a common mask we isolate the effect of SZA 5 and retrieval errors from changes due to sampling. Richardson et al. (2021) adopted an emulator approach for full-scene simulation due to computational constraints. We use those emulators here, which obtain retrieved TCWVret from input TCWV, which is in fact the integrated water path along the solar path, via: 10

Generating retrieved TCWV fields
Where a1 and a2 are the slope and intercept and e a random sample from a Normal distribution whose standard deviation se quantifies the random retrieval error. The parameter a1 represents the sensitivity dTCWVret/dTCWV and a2 is related to the bias, although it is only equal to the bias if a1=1. Some errors in a1 can be understood as due to erroneous atmospheric profiles assumed in the current retrieval implementation. The parameters were fit to a sample of retrievals that used plane-parallel 15 radiative transfer with profiles directly from the LES columns (i.e. without solar path tracing) and SZA=45°. In sensitivity tests, SZA>45° led to somewhat larger se but later we show that our structure-function conclusions will not be affected even if different SZAs change the emulator parameters.
The emulators were fit to subsets of retrievals that used the Imaging Spectrometer Optimal Fitting (Isofit) code (Thompson et al., 2018(Thompson et al., , 2019, with the MODTRAN6.0 radiative transfer model used for both forward and inverse calculations (Berk et al., 20 2014(Berk et al., 20 , 2015. Richardson et al. (2021) showed that Eq. (4) is valid for a scene with varying aerosol optical depth, but that the emulator parameters depend on the surface classification and atmospheric conditions. The primary requirements are that parameters are fit separately to each type of PBL, represented by different LES runs (i.e. for each case, all snapshots within a case use the same parameters), and that an observed region is either a mixture of vegetation or a mixture of mineral surfaces alone. Each LES simulation has one emulator to convert TCWV to TCWVret, and the same emulator is used to obtain TCWVret 25 from all SZA path TCWVs.

Calculation of spatial statistics and removal of random error
S2(r) is calculated from the LES field of TCWVret in one horizontal direction at a time, by including all pairs of footprints separated by r in that direction provided that neither of the footprints in the pair is flagged as cloudy or shadowed for any SZA.
To calculate along the LES x axis, each different y location is effectively treated as a 1-D field and its set of f(x+r)-f(x) values 30 are calculated, then all the 1-D subsets are concatenated and the reported S2(r) is the expectation value of this combined dataset.
The horizontal directions are either along the x axis ("perpendicular" to the sunlight) or along the y axis (the "parallel" case). that z2 will depend on the range of r over which it is calculated. The variation of z2 with Dr can be due to changes in physical processes in addition to imperfect process representation in the LES, such as nonphysical dissipation at separations smaller than several grid cells (Brown et al., 2002). This study is not concerned with the interpretation of z2 but rather with the accuracy 5 with which it can be obtained, so we do not explore this further and follow Thompson et al. (2021) in calculating the fit over r=0.5-1 km.
We estimate and remove the measurement noise following the method of Richardson et al. (2021). They derived a structurefunction-based method to estimate se from the retrieved TCWV field, thereby allowing better estimation of S2 by subtracting 2 ( # from Eq. (5). This method involves calculating S2 in one horizontal direction at r=Dx (i.e. separation of 1 grid cell, 40 m 10 or 50 m), then smoothing the field by a factor of two in the perpendicular direction and recalculating S2, which we label S2,´2.
We evaluate the sensitivity of our derived z2 to uncertainty in the retrieval emulator, when calculating S2 from the retrieved fields the emulator changes the retrieved value S2,ret as: And the estimated z2,ret is; This shows that both biases in a1 ("retrieval sensitivity") and the magnitude of random errors can change the derived z2. Richardson et al., (2021) showed that errors in a1 were the largest source of uncertainty in the estimating the spatial standard deviation. We will therefore perform sensitivity tests for a range of a1 values, and show only one se case since results were 20 insensitive to changes in se up to a factor of four scaling.
The final question we address is whether the scaling exponent estimated at the very high spatial resolution of missions such as EMIT will be fundamentally different from that obtained by current sensors such as MERIS and MODIS that can provide TCWV at a nominal resolution near Dx»250 m. Large-scale sx narrows as spatial resolution coarsens, but it is not clear how this affects the spatial scaling properties considered here, so for this test we sequentially degrade a TCWV field from Dx=50 25 m to Dx=250 m and calculate S2 and z2 at each resolution. Figure 2 shows how the z2 calculated for PCWV integrated up to different heights varies within the PBL, but that the value becomes fixed by the PBL top. That is, estimates made from TCWV refer to the value for PCWVPBL, but this conceals vertical structure within the PBL. We can therefore confirm that our derived values are indeed representative of bulk PBL statistics, but further work is needed to determine the precise utility of statistics of PCWVPBL and furthermore, we note that corresponding estimates of PBL height from other sources may be necessary to help interpret measurements of PCWVPBL scaling.

Retrieval errors and z2
The results from all sensitivity tests applied to the clear-sky TCWV in the ARM_18000s snapshot are shown in Figure 3, 5 results are similar for other snapshots (not shown). Figure 3(a) shows that random errors that we estimate will be typical for EMIT result in substantial changes to calculated ln(S2)/ln(r), with a notable flattening over much of the r range and an unacceptably large error in z2 derived from retrieved TCWV. Our error correction reduces the z2 error from 53.1 % to 1.4 %. Furthermore, more footprints will be partially cloudy, meaning that the samples included will be too small for robust estimation 15 of z2. We did not apply a matched cloud mask for this test, and expect that this will lead to a more uncertain estimates of z2.
Tests across all 23 snapshots show that most return a higher, and more uncertain, z2 when spatial resolution is degraded (not shown). This points to new information being obtained from finer-spatial-resolution retrievals, provided that the slanted solar paths do not destroy the correspondence between true and retrieved z2.

Solar zenith angle, calculation direction and z2 20
In our final tests we compare z2 calculated on the TCWVret fields with SZA changed from 15-60° as a function of the "true" value obtained from the TCWV field. The TCWVret values are those using each simulation's emulator parameters and include the random error correction, while the true values refer to the columns directly over each footprint with no retrieval error. All use Dx=40-50 m with fits calculated over r=500-1000 m, and calculations are either along the y axis and parallel to the solar azimuth, or along the x axis and perpendicular to the solar azimuth. 25 Figure 4 shows that there is substantial spread introduced for realistic SZA, with a strong tendency for bias in z2 to increase with SZA when calculated parallel to the solar azimuth. This spread and apparent bias is greatly reduced when the calculation is performed perpendicular to the solar azimuth. In the parallel case there are still significant differences (p<0.05) between true and retrieved z2. The p=0.05 value threshold is calculated as 1.96 times the standard error of the difference in best-fit parameters derived from the ln(S2) as a function of ln(r) gradient. Figure 5 shows that both bias and error range expand with SZA when calculating in the solar azimuth direction, but the median (and mean, not shown) bias is eliminated by calculating S2 in the direction perpendicular to the solar azimuth. The spread is also wider in the parallel case, the 5-95 % range of differences relative to the truth is -0.04-0.49 versus -0.20-0.22 when calculating perpendicular, while the standard deviation is 0.18 for parallel versus 0.13 for perpendicular.
These results show that realistic SZA result in path-integrated water vapour structures that have somewhat different 5 characteristics at a 0.5-1 km scale than that of the PCWVPBL over each footprint. This results in additional error to estimated z2, but appropriate calculation strategies that account for solar azimuth angle can reduce the magnitude of this error and suppress systematic error.

Discussion and conclusions
We have shown that for the q fields simulated in five LES runs of shallow convective PBLs, a novel strategy accounting for 10 solar azimuth eliminates the SZA-induced bias in calculated z2 over r of 0.5-1 km from high-spatial-resolution VSWIR retrievals. This substantially increases the range of applicable scattering geometries for which VSWIR retrievals can be used to estimate spatial scaling statistics. For example, in the airborne case studies of Thompson et al. (2021), only flight lines with SZA<15° could be used, while our method promises unbiased estimates of sub-km z2 for SZA up to 60°. Removal of the bias is particularly important for applications, since q-scaling analyses using spaceborne data typically group or average over many 15 sets of measurements (e.g. Kahn and Teixeira (2009)), and this averaging will not reduce bias in the way that it reduces RMSE.
For the first time it also allows calculation of high-resolution z2 statistics over midlatitude and polar areas of the globe that do not experience low solar zenith angles.
This approach should be applicable to any instrument that obtains TCWV from VSWIR with horizontal resolution approaching 50 m, not just EMIT. Operationally, this requires sufficiently long, continuous sampling perpendicular to the solar azimuth. 20 We have not analysed length requirements here, but note that for airborne campaigns this is easily addressed on a flight-toflight basis. For spaceborne instruments, the sampling will depend on the swath size and orbital geometry. EMIT's approximately 75 km swath (Bradley et al., 2020) spans distances larger than those considered here, and so we expect sampling to be sufficient regardless of orbital configuration. Similarly, ESA's CHIME contractor notes a 128 km swath, and similar capacities are expected for the missions that address NASA's Surface Biology and Geology (SBG) and Aerosols, Clouds, 25 Convection and Precipitation (ACCP) designated observables. However, we note that it will be necessary to understand the implications of non-uniform footprint size and footprint-dependent errors determined from the on-orbit instrument performance in order to increase the confidence in estimates of z2 from these spaceborne sensors.
Our analysis is based on the retrieval outputs of Richardson et al. (2021), which showed how random retrieval error se can be identified and removed from TCWVret fields by exploiting the properties of TCWV S2 at small scales. It also showed that 30 errors in the T and q profiles assumed in the retrieval could affect retrieval sensitivity a1=dTCWVret/dTCWV. We have shown here that to obtain z2 it is critical to remove the effect of se, but that while errors in a1 are the largest error source for estimates of spatial standard deviation, they do not greatly affect the z2.
Ideally this sampling strategy could be field tested, and a strategy to do so would be to perform flights both parallel and perpendicular to the solar azimuth with collocated retrievals of TCWV from VSWIR and an independent instrument that is not affected by SZA, such as a differential absorption lidar (DIAL) or passive sounding instrument. The non-VSWIR 5 instrument would allow calculation of z2 that is not affected by sunlight path, and the conclusions of this study would be supported if the VSWIR and non-VSWIR estimates showed good agreement for the perpendicular but not parallel flight paths.
We highlight the High Altitude Lidar Observatory (Bedka et al., 2021) as a candidate sensor for such an experiment.
For science applications, it would also be necessary to identify which scenes are likely to satisfy our requirements, for example conditions proximate to deep convection may have more substantial above-PBL variability at <1 km scales and result in weaker 10 correspondence between TCWV and PCWVPBL. In addition, it may be necessary to identify PBL height, which would require independent information from weather forecast models or reanalysis, in situ measurements or other instruments. Furthermore, users would also need to consider the effect of sampling biases that may depend on cloud fraction or on the presence of nonisotropic variability in features such as horizontal convective rolls (e.g. as explored in Carbajal Henken et al. (2015)). If locations commonly experience meteorological features with a preferential orientation, for example due to orography or a 15 coastline, then this method may not adequately capture its full structure.
Finally, the interpretation of these exponents has not been considered in detail here but future work could proceed either observationally or theoretically. For an observational study, the relationship between retrieved exponents and other properties, such as later convective initiation could be investigated. A theoretical study would need to apply current understanding of turbulent physics to address precisely the problem of the horizontal variability of vertically-averaged profiles. Further in the 20 future, perhaps multi-angle imaging spectroscopy could provide profiling from VSWIR measurements via computed tomography, which has been demonstrated to retrieve aerosol profiles in some conditions using the Multi-Angle Imaging SpectroRadiometer (MISR) on Terra (Garay et al., 2016) and upper-tropospheric water vapour profiles from airborne measurements with the Gimballed Limb Observer for Radiance Imaging of the Atmosphere (GLORIA, Ungermann et al. (2015)). We are not aware of any likely short-term space missions that would allow water vapour tomography from VSWIR 25 retrievals at EMIT-like horizontal resolution, with the upcoming Multiangle Imager for Aerosols (MAIA, Diner et al. (2018)) having horizontal resolution similar to that of MODIS. If such an instrument were launched, high-spatial-resolution tomography would also be subject to the smearing effect from the non-vertical solar path and so may benefit from our proposal, although a detailed study of the particular instrument and retrieval setup would be required.
Our main conclusion is that a novel sampling strategy can allow a breakthrough in space-based measurement of PBL water 30 vapour scaling, and that while individual uncertainties and geographical sampling may vary with the mission, this principle will apply to the array of upcoming VSWIR instruments whose spectra will allow column water vapour retrievals at resolutions from 30-80 m or better. The resulting statistics offer a new check on the validity of high-resolution atmospheric models and inform sub-grid parameterisations of coarser ESMs.  Figure 3. Sensitivity of derived S2 to retrieval errors and spatial resolution, the legends in each case report the calculated exponent from a fit over separations 0.5-1 km, the region which is shaded grey in each panel. In each case "truth" refers to the value calculated from TCWV at native LES resolution of 50 m. (a) blue shows S2 derived from retrieved TCWV including random error only, and orange the S2 after subtracting the estimated retrieval variance. (b) S2 calculated with no random error, but non-unity