Shortwave Radiative Effect of Arctic Low-Level Clouds: Evaluation of Imagery-Derived Irradiance with Aircraft Observations

1 University of Colorado, Department of Atmospheric and Oceanic Sciences, Boulder, CO, USA
2 University of Colorado, Laboratory for Atmospheric and Space Physics, Boulder, CO, USA
3 Science Systems and Applications, Inc., Lanham, MD, USA
4 Naval Research Lab, Monterey, CA, USA
5 Bay Area Environmental Research Institute Sonoma, Sonoma, CA, USA
6 NASA Ames Research Center, Moffett Field, CA, USA
7 Department of Geophysics, Porter School of the Environment and Earth Sciences, Tel-Aviv University, Israel
8 NASA Langley Research Center, Climate Science Branch, Hampton, VA, USA

Spectral surface albedo measurements for snow and ice have been collected during ground-based field experiments in polar regions (e.g., Perovich et al., 2002; Brandt et al., 2005). However, they do not necessarily capture the spatial and temporal variability of Arctic surface conditions. Finally, water vapor, the most abundant natural greenhouse gas, plays a key role in modulating the surface radiation budget (Schmidt et al., 2010). In the Arctic, water vapor has a special significance due to frequent temperature inversions, complex cloud regimes, and moisture advection (Curry et al., 1995). Doyle et al. (2011) showed that an average increase of 17 W m^-2 in downwelling longwave irradiance (LWD) was associated with a moisture intrusion during 9-11 February 2010 on Ellesmere Island in the Canadian High Arctic, and that optically thin clouds further increased LWD by 20 W m^-2. Sedlar and Devasthale (2012) showed that the water vapor anomaly positively covaries with the temperature anomaly, resulting in an anomaly as large as ±10 W m^-2 in clear-sky LWD. Without adequate information, the varying water vapor content in the Arctic atmosphere can contribute to uncertainties in radiative calculations.
In summary, the challenges for deriving CRE from passive remote sensing are (a) inaccurate detection of clouds and cloud optical property retrievals over snow or ice surfaces; (b) lack of accurate surface albedo as a constraint in the radiative transfer model (RTM); (c) insufficient knowledge about the water vapor content.
The aim of this paper is to use aircraft radiation measurements collected during the NASA Arctic Radiation - IceBridge Sea & Ice Experiment (ARISE, Smith et al., 2017) to evaluate irradiance as derived from coincident satellite imagery, and to investigate the causes of any biases. In the first step, the spectral snow surface albedo was derived from upwelling and downwelling irradiance measurements, accounting for partially snow-covered scenes by the snow fraction estimated from aircraft camera imagery. In the second step, we used an RTM to calculate the upwelling and downwelling broadband and spectral irradiance at flight level, incorporating the MODIS-derived COPs and the spectral surface albedo derived from the aircraft measurements as inputs. The calculated irradiances were then compared with the measured broadband and spectral irradiance pixel by pixel for two cases: above-cloud and below-cloud. Section 2 describes the data and methods used in this study. Section 3 provides the results and discussion for the measured spectral surface albedo, as well as for the comparisons between irradiance calculations and measurements. Conclusions are drawn in Section 4.

Data and Methods
ARISE was a NASA airborne measurement campaign to study snow and ice properties in the Arctic marginal ice zone (MIZ) in conjunction with cloud microphysics and radiation (Smith et al., 2017). The NASA C-130 aircraft was instrumented with shortwave and longwave radiometers, described in this section, along with cloud microphysics probes, aerosol optical properties instruments, and snow and ice remote sensors. The experiment was based at Eielson Air Force Base near Fairbanks, Alaska, from 2 September to 2 October 2014, to capture the September sea ice minimum. One of the primary objectives of ARISE was to validate irradiance (or flux densities) derived from CERES-MODIS observations with aircraft radiation measurements.
In this paper, we use imagery from a downward-looking video camera together with measurements of shortwave broadband and spectral irradiance. In the Arctic, overpasses of polar-orbiting satellites are fairly common. ARISE targeted multiple overpasses of MODIS and CERES on Aqua or Terra, or of VIIRS on Suomi NPP, on almost every flight. We focus on two science flights, on 11 September and 13 September, that sampled above-cloud and below-cloud conditions, respectively. These flights include so-called "lawnmower" patterns, a series of parallel flight legs laterally offset by about 20 km as shown in Fig. 1. They were specifically designed for ARISE to sample one or two 100 × 100 km^2 grid boxes per flight with sufficient coincident CERES footprints (each with a 20-km diameter at nadir) to acquire statistically significant above- or below-cloud aircraft measurements for the validation of CERES-MODIS derived irradiance.
https://doi.org/10.5194/amt-2019-344. Preprint. Discussion started: 28 October 2019. © Author(s) 2019. CC BY 4.0 License.
Comparing the aggregated data from ARISE directly with the CERES-MODIS flux products within the grid box, e.g., using histograms, is challenging because of the heterogeneity of the scenes in terms of surface albedo, cloud conditions, and changing solar zenith angle. Therefore, we instead compare satellite-based radiative transfer calculations and aircraft observations pixel by pixel along the flight track. We use the MODIS cloud products (see Section 2.4) from collection 6.1 instead of CERES fluxes because of their higher spatial resolution (1 km versus 20 km). From these cloud products, above- and below-cloud spectral and broadband irradiance are calculated, with the additional input of spectral surface albedo derived from ARISE aircraft measurements. The details of each dataset and instrument are described in the following subsections.

BroadBand Radiometer System (BBR)
The BBRs deployed during ARISE are modified CM 22 Precision Pyranometers from Kipp & Zonen. They measured upwelling and downwelling broadband irradiance (unit: W m^-2), that is, the spectrally integrated irradiance from 200 nm to 3600 nm. In addition, a sunshine pyranometer (SPN-1) was flown to measure diffuse and global radiative fluxes (Badosa et al., 2014; Long et al., 2010). The SPN-1 radiometer was originally intended for ground-based use, but is suited for airborne measurements of global and diffuse radiative fluxes because it does not have any moving parts, unlike traditional instruments such as the Multifilter Rotating Shadowband Radiometer (MFRSR).

Solar Spectral Flux Radiometer (SSFR)
To attribute discrepancies between satellite-derived irradiance and airborne observations to causes such as erroneous water vapor, cloud properties, or three-dimensional radiative transfer effects, spectrally resolved measurements are needed (Schmidt and Pilewskie, 2012). SSFR is a moderate-resolution flux spectrometer built at the Laboratory for Atmospheric and Space Physics (LASP, University of Colorado Boulder). It is an updated version of the heritage spectrometer system originally developed at NASA Ames (Pilewskie et al., 2003).
SSFR is typically flown in conjunction with an Active Leveling Platform (ALP, also built at LASP), which was originally developed for the ER-2 and was rebuilt for the C-130. Counteracting the changing aircraft attitude with a system that keeps the zenith light collector horizontally aligned is particularly important in the Arctic, where low sun elevations lead to large systematic errors for fix-mounted or poorly stabilized sensors (Wendisch et al., 2001). One reason is that radiation from the lower hemisphere (for example, from clouds below or at the aircraft altitude) is registered by the zenith detector when it is tilted, which leads to systematic biases that cannot be corrected. Another reason lies in the specific design of the SSFR light collectors, which are realized as integrating spheres with a circular aperture on top. They diffuse the incoming light collected by the aperture and bundle it into a fiber optics cable that transmits it to the spectrometer system inside the aircraft (Schmidt and Pilewskie, 2012). The integrating sphere has an imperfect response to the incidence angle θ (Kindel, 2010), in contrast to the response of broadband radiometers such as BBR, which is closer to cos(θ) as required for irradiance. At high sun elevations, a so-called hot spot arises from a baffle that prevents light from being directly transmitted into the fiber optics. Since the response deviates significantly from the 1:1 line, the direct and the diffuse light need to be corrected differently. The diffuse/direct correction is done by separating the diffuse and direct components from spectral radiative transfer calculations based on broadband SPN-1 measurements (details are provided in Appendix A), and further assuming that the downwelling diffuse radiation is close to isotropic. This assumption becomes invalid if parts of the lower hemisphere are in the light collector's field of view.
The light collector's response to the azimuthal angle also needs to be considered. Throughout the course of the mission, the zenith light collector revealed a response that depended on the relative azimuth of the sun to the aircraft, which was characterized by two calibration circles flown on 2 October. The non-homogeneous azimuthal response of the zenith light collector occurred for solar zenith angles greater than 66°. In many cases, an azimuthally variable response (i.e., a signal that depends on either the sun azimuth angle or the aircraft heading angle) can be attributed to the tail and/or propellers of the host aircraft. BBR and SPN-1, which were both fix-mounted on the C-130, would also be affected. To assess their azimuthal response, the attitude-corrected BBR data (Bannehr and Schwiesow, 1993; Bucholtz et al., 2008; Long et al., 2010) were compared with the SPN-1 global irradiance data, as well as with radiative transfer calculations. This comparison revealed that in this case, aircraft interferences were actually minor compared to atmospheric effects. Only the SSFR measurements, and not those of BBR and SPN-1, had a significant azimuthal dependence, suggesting the SSFR light collector as the source, rather than aircraft interferences. To determine the SSFR azimuthal response during the mission, SSFR's measurements were referenced to BBR during a full calibration circle flown on 2 October (details are provided in Appendix B). The calibration circle constitutes SSFR's azimuthal response at this solar zenith angle, which was then used to correct SSFR's downwelling irradiance for the conditions encountered during other research flights. By using BBR, SPN-1, and SSFR in such a way, the redundancies between the instruments were used to capitalize on the strengths of the individual instruments (BBR: unbiased angular response; SPN-1: diffuse/global separation; SSFR: spectral resolution for a sub-range of BBR and SPN-1).
In addition to the angular calibrations, wavelength and radiometric calibrations were performed in the laboratory before and after the mission. The wavelength calibrations ensured spectral accuracy by referencing the SSFR measurements to several line sources. The primary radiometric calibration, performed with a NIST-traceable calibrated lamp, links SSFR-measured digital counts to spectral irradiance. In addition, the radiometric calibration was transferred to a so-called secondary radiometric field standard, which monitored the stability of the radiometers throughout the mission. From the SSFR measurements, spectral albedo, net flux, and absorption can be derived at a spectral resolution of 8-12 nm (4-6 nm sampling). From the spectral albedo, cloud optical properties or surface albedo can be derived. To increase the signal-to-noise ratio, spectral or spatial binning is possible. A spatial data aggregation technique is pursued here (Section 3.1).

Imagery from Downward-Looking Video Camera
A downward-looking video camera (referred to as "nadir camera") is often included as a standard payload on NASA aircraft. It typically records scenes for context only and is not radiometrically or geometrically calibrated. Despite this shortcoming, the videos recorded by the nadir camera are used here for quantitative image analysis. From the video, we first extract image frames at an average rate of 2 Hz (2 frames per second). To co-register the nadir imagery with the measurements from other instruments, the times of the individual image frames are also needed. However, the image frames themselves did not contain a digitally stored time. Instead, they included a visible timestamp at the lower left of the frame. We used Optical Character Recognition (OCR) to retrieve the time from this timestamp.
In the second step, the nadir camera imagery was used to quantify the fractional snow coverage. The snow fraction was estimated by calculating the fraction of bright pixels in the image. To this end, the image was converted from RGB (red, green, and blue) into a grayscale image (Gray) for each pixel. One issue with the nadir camera imagery was the darkening from the center to the edge of its field of view, which is known as the vignette effect. To compensate, the brightness of the image was increased, progressively more toward the edge, through an image blending and interpolation technique by Haeberli and Voorhies (1994):
$$\mathrm{Gray}_{\mathrm{adj}} = (1 - \alpha) \times \mathrm{Black} + \alpha \times \mathrm{Gray}, \qquad (2)$$
where Black is a black image with the same dimensions as Gray, and $\alpha$ is the image blending factor, a 2D matrix with values increasing from 1.1 to 1.5 from the image center to the edge. The operator "×" denotes element-by-element multiplication. To avoid the vignetting extremes in the corners, only the imagery within a circle centered on the image was used to derive the snow fraction (left panel of Fig. 2a). The key step of the snow fraction detection algorithm is the separation of dark versus bright pixels. An adaptive thresholding technique was applied to estimate the snow fraction. Adaptive thresholding is an approach for handling an image with unevenly distributed intensities by dividing the image into subimages and assigning a different threshold to each subimage (Gonzalez et al., 2002). The details of the adaptive thresholding are described in Appendix C. The snow fraction (SF) is estimated as
$$\mathrm{SF} = \frac{N_{\mathrm{bright}}}{N_{\mathrm{total}}}, \qquad (3)$$
where $N_{\mathrm{bright}}$ is the number of pixels classified as bright by the thresholding and $N_{\mathrm{total}}$ is the total number of pixels within the circle. The detection results for the left panel in Fig. 2a are illustrated in the right panel. The method provides robust estimates of snow fraction from the nadir camera imagery, even for scenes with pure snow. Figure 2b shows the simultaneously measured upwelling and downwelling spectral flux from the SSFR for the scene shown in Fig. 2a.
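The image-processing chain above (grayscale conversion, vignette compensation by blending with a black image, circular mask, bright-pixel counting) can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used for ARISE: the grayscale weights are the common luma coefficients (the weights used in the study are not given here), the blending factor is assumed to grow linearly with distance from the center, and a single fixed `threshold` stands in for the adaptive thresholding of Appendix C. The function name and all parameter names are hypothetical.

```python
import math

def snow_fraction(rgb, threshold=0.5, gain_center=1.1, gain_edge=1.5):
    """Return the fraction of bright pixels inside the image-centered circle.

    rgb: list of rows, each row a list of (r, g, b) tuples scaled to [0, 1].
    threshold: fixed brightness cut (stand-in for an adaptive threshold).
    """
    n_rows, n_cols = len(rgb), len(rgb[0])
    cy, cx = (n_rows - 1) / 2.0, (n_cols - 1) / 2.0
    radius = min(cy, cx)  # inscribed circle; the image corners are ignored
    n_bright = n_total = 0
    for i, row in enumerate(rgb):
        for j, (r, g, b) in enumerate(row):
            dist = math.hypot(i - cy, j - cx)
            if dist > radius:
                continue
            # Assumed luma weights for the RGB-to-grayscale conversion.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            # Blending factor, assumed linear from center (1.1) to edge (1.5).
            rel = dist / radius if radius > 0 else 0.0
            alpha = gain_center + (gain_edge - gain_center) * rel
            black = 0.0
            adjusted = (1.0 - alpha) * black + alpha * gray  # Eq. (2), per pixel
            n_total += 1
            if adjusted > threshold:
                n_bright += 1
    return n_bright / n_total  # Eq. (3)

# Synthetic 5 x 5 frame: two bright columns on the left, dark elsewhere.
WHITE, DARK = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
frame = [[WHITE if j < 2 else DARK for j in range(5)] for _ in range(5)]
sf = snow_fraction(frame)
```

Because the black image is zero everywhere, the blend reduces to a per-pixel gain of α, which is why a dim pixel near the edge can cross the threshold while an equally dim pixel at the center does not.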

C-130 Thermometer and Hygrometer and Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2)
The NASA C-130 aircraft was equipped with a thermometer and a hygrometer to measure air temperature and relative humidity. Figure 3b shows the profiles derived from the C-130 during a descending leg from 19:31:14 (altitude: 6.447 km) to 19:50:05 (altitude: 0.258 km) on 13 September 2014. Due to a malfunction of the hygrometer on 11 September 2014, no profile from the C-130 is available for this day. Instead, we use the temperature and water vapor content profiles from MERRA-2 (Fig. 3a), which is an atmospheric reanalysis dataset from NASA (Bosilovich et al., 2015). MERRA-2 (M2I3NVASM) provides 3-hourly assimilated 3D meteorological fields (dimensions: 576 in longitude; 361 in latitude; 72 pressure levels from 985 hPa to 0.01 hPa). The comparison of in-situ profiles and MERRA-2 in Figure 3b shows good agreement, although the reanalysis does not reproduce the details of the vertical profile. A more systematic comparison of reanalysis and in-situ data from ARISE is done by Rozenhaimer et al. (2018) and is not the focus of this paper. The observations reveal much drier and slightly colder conditions than captured in the widely used subarctic climatology from Anderson et al. (1986), referred to here as AFGL. Above 6.5 km, we nevertheless used the climatology to provide complete temperature and water vapor profiles from 0 to 120 km, after rescaling it to the observed temperature and water vapor values at 6.5 km.

Moderate Resolution Imaging Spectroradiometer (MODIS)
The publicly available pixel-level MODIS cloud products (MOD/MYD06, collection 6.1), which are provided in 5-minute granules, are used in this study (Platnick et al., 2017). The MODIS cloud product includes COPs such as COT, CER, and cloud thermodynamic phase, which are essential parameters for calculating cloud radiative effects. As described before, the MODIS COT and CER are retrieved simultaneously using a bi-spectral reflectance method (Nakajima and King, 1990). To minimize the influence of the surface on cloud retrievals, the 1630 nm and 2130 nm bands are used since snow and ice surfaces are relatively dark in those two bands (Platnick et al., 2001; King et al., 2004). These retrievals are included in the MOD/MYD06 files and will be referred to as the "1621" cloud product. Since the clouds evaluated in this study are all liquid, the cloud thermodynamic phase is not assessed. The data include COPs for cloudy and partially cloudy conditions (the latter are indicated by "PCL" in the MODIS data variable name), as described in Platnick et al. (2017).
The RTM uses a solar spectrum with 1 nm resolution as the solar source at TOA (Kurucz, 1992). The Discrete Ordinates Radiative Transfer Program (DISORT, Stamnes et al., 1988) is used as the radiative transfer solver. LOWTRAN 7 (Pierluissi and Peng, 1985) is used for the molecular absorption parameterization. The RTM output includes downwelling (global and direct) and upwelling irradiance at a specified output altitude and wavelength. The output altitude is set to the flight altitude. The wavelength range of the calculations is set to 200 to 3600 nm.

Results and Discussions
This section shows the results for the mixed-scene spectral surface albedo derived from SSFR, and the comparisons of broadband and spectral irradiance between aircraft measurements and MODIS-COP based radiative transfer calculations (referred to as "calculations").

Spectral Surface Albedo
The spectral surface albedo results shown in this subsection stem from the aircraft measurements from 20:00:26 UTC to 20:10:51 UTC on 13 September. Only the SSFR measurements under clear-sky scenes (no clouds above or below) above snow or ice, as identified from the forward and nadir cameras, were used to calculate the spectral flight-level albedo through
$$\alpha_{\mathrm{FL}}(\lambda) = \frac{F^{\uparrow}(\lambda)}{F^{\downarrow}(\lambda)}, \qquad (4)$$
where $F^{\uparrow}(\lambda)$ and $F^{\downarrow}(\lambda)$ are the SSFR-measured spectral upwelling and downwelling irradiance. An atmospheric correction was applied to the flight-level albedo calculated from Equation (4) using the RTM and the atmospheric profiles derived in Section 2.4. This was done through RT calculations with five surface albedo spectra: the measured flight-level albedo from SSFR, $\alpha_{\mathrm{FL}}$, scaled by 0.6, 0.7, 0.8, 0.9, and 1.0. The five scaled $\alpha_{\mathrm{FL}}$ spectra and the corresponding RTM calculations provided a modeled (nearly linear) relationship between surface and flight-level albedo. The linear relationship at each wavelength was then inverted to infer the surface albedo from the measurements at flight level. To capture the spatial and spectral variability of the surface, we developed a data aggregation technique that combines collective measurements in a partially snow-covered environment into one spectral surface albedo dataset that is parameterized by snow fraction ("binary" representation of the radiative surface properties), which is estimated from the nadir camera imagery using the method described in Section 2.3. Each surface albedo data point was linked to the associated snow fraction. From the collection of data points, the relationship between the spectral surface albedo and snow fraction can be investigated. The SSFR wavelengths in water absorption bands, and those less than 350 nm or greater than 1800 nm, were excluded from the albedo data set because of a low signal-to-noise ratio. Figure 4 shows the surface albedo at 640 nm, 1240 nm, and 1630 nm plotted versus the snow fraction.
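The atmospheric correction described above can be illustrated for a single wavelength. In this minimal sketch, `toy_rtm_flight_albedo` is a hypothetical linear stand-in for the radiative transfer model (the real calculation would run the RTM with the measured profiles), and all function names are illustrative. The five trial surface albedos are the measured flight-level albedo scaled by 0.6 to 1.0, a line is fitted to the modeled flight-level albedo, and the line is inverted at the measured value.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def toy_rtm_flight_albedo(surface_albedo):
    """Hypothetical stand-in for the RTM at one wavelength (not a real model)."""
    return 0.03 + 0.97 * surface_albedo

def surface_albedo_from_flight_level(measured_fl, rtm=toy_rtm_flight_albedo,
                                     scales=(0.6, 0.7, 0.8, 0.9, 1.0)):
    """Invert the modeled surface-to-flight-level relation at the measured value."""
    trial_surface = [s * measured_fl for s in scales]   # five trial surface albedos
    modeled_fl = [rtm(a) for a in trial_surface]        # modeled flight-level albedo
    slope, intercept = linear_fit(trial_surface, modeled_fl)
    return (measured_fl - intercept) / slope

surface = surface_albedo_from_flight_level(0.80)
```

In practice the same inversion would be repeated independently at every SSFR wavelength, since the slope and intercept depend on the atmospheric scattering and absorption at that wavelength.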
Linear regression can be used to establish a simple relationship between snow fraction and albedo, assuming that each observed spectrum is a mixture of only two so-called end-members: the spectral albedos of a dark and a bright surface. These end-members can vary depending on the local conditions. For example, the dark component can either be open ocean or young ice. The bright component can either be thick ice or a snow-covered surface. The resulting spectral surface albedo for a mixed sampling region is established through the slopes and intercepts of the linear fit, with the snow fraction ranging from 0 to 1 as the independent variable. The representative snow albedo for the region at a particular wavelength can be read off on the right side of Fig. 4 (snow fraction of 1). This simple separation method has obvious drawbacks; for example, the implicit linear-mixing assumption, the variability of the end-members, and the data sparsity of the individual end-members (in the example in Fig. 4, snow fractions below 0.6 rarely occur). However, it is the most effective way to characterize mixed regions, which are the norm for the MIZ.
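Written out, the two end-member assumption amounts to the following linear mixing model. The symbols $f$ (snow fraction), $m(\lambda)$ and $b(\lambda)$ (slope and intercept of the per-wavelength linear fit), and the end-member names are notation introduced here for illustration, not taken from the original text.

```latex
% Linear mixing of a dark and a bright end-member, with snow fraction 0 <= f <= 1
\begin{align}
\alpha(\lambda; f) &= (1 - f)\,\alpha_{\mathrm{dark}}(\lambda) + f\,\alpha_{\mathrm{bright}}(\lambda)
                    = b(\lambda) + m(\lambda)\, f, \\
% End-members read off the fitted line at f = 0 and f = 1
\alpha_{\mathrm{dark}}(\lambda) &= b(\lambda), \qquad
\alpha_{\mathrm{bright}}(\lambda) = b(\lambda) + m(\lambda).
\end{align}
```

With sparse data at low snow fractions, the dark end-member $b(\lambda)$ is an extrapolation of the fit and is correspondingly less well constrained than the bright end-member.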
The snow spectral end-member (snow fraction of 1) of the mixed-scene spectral surface albedo (referred to as "13 September surface albedo") is shown in Fig. 5. As expected, the surface albedo is high in the shortwave range from 400 to 900 nm and decreases in the near-infrared. The SSFR-derived albedo spectrum resembles the ground-based measurements of thick snow over ice near Davis Station, Antarctica (Brandt et al., 2005), and is fairly consistent with the dry-season climatology (Kay and L'Ecuyer, 2013) except at 1200 nm, where springtime aircraft measurements near Barrow (Alaska, Lyapustin et al., 2010) are closer. Overall, the data sets differ significantly in the visible wavelength range. Figure 5 also shows the surface albedo for zero snow fraction. As pointed out above, snow fractions below 0.6 were extremely rare during 20:00:26 UTC - 20:10:51 UTC on 13 September. Nevertheless, the mixed-surface data, extrapolated to a snow fraction of 0, compare surprisingly well to ground-based measurements of young gray ice taken during the Australian National Antarctic Research Expeditions (ANARE) in 1996 (Warren et al., 1997). This suggests that during the sampled time period, the dark pixels in the nadir camera imagery are dominated by ice in various freezing states.
As mentioned above, the binary representation of surface types oversimplifies the actual mixture of ice at different freezing stages, but is adequate to serve as surface albedo input for the RTM to constrain the irradiance calculations over mixed surfaces in the next subsections.

Broadband Irradiance Comparison
In this section, we show broadband irradiance comparisons between SSFR and BBR measurements and MODIS-COPs based RTM calculations at aircraft flight level for an above-cloud case and a below-cloud case, collected by the research flights on 11 September and 13 September, respectively.
The RTM irradiances were calculated for wavelengths from 200 nm to 3600 nm. Since the SSFR-derived surface albedo described in the previous subsection was not available at wavelengths shorter than 350 nm, in gas absorption bands, and at wavelengths greater than 1800 nm due to a low signal-to-noise ratio, several techniques were applied to fill in the surface albedo spectra. Figure 6 is an illustration of how the surface albedo was calculated for 11 September. First, the "13 September surface albedo" (Section 3.1) with a constant snow fraction of 70% is calculated from the linear regression coefficients (marked in red). This is justified because this flight occurred in a similar geographical area. In the gas absorption bands (red area), the surface albedo was replaced with interpolated values; from 1800 nm to 1900 nm, a polynomial fit was used for extrapolation, based on the spectral dependence from 1650 nm to 1800 nm. For the wavelengths shorter than 350 nm and greater than 1900 nm, we used modeled snow albedo (Wiscombe and Warren, 1981) times a scale factor to match the measurements at the joining wavelength.
... where MODIS does not detect any. In this case, undetected optically thin clouds made up almost one fifth of the points along the flight track. Fig. 7b indicates that the undetected clouds lead to an underestimation of the upwelling irradiance by 30 W m^-2 averaged over these pixels (>10% discrepancy). By contrast, the calculated irradiances for the locations where MODIS does detect clouds are only 10 W m^-2 lower than the measurements (4%), which is only slightly larger than the BBR/SSFR measurement uncertainty and can be explained either by (a) incorrect COPs (optical thickness, effective radius, or thermodynamic phase) and/or (b) inaccurate or variable surface albedo. To quantify the contributions of these effects to the total discrepancy, the spectral information from SSFR is used in the next section.
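Two of the gap-filling steps used for the surface albedo spectrum (interpolation across gas absorption bands, and scaling a modeled snow albedo to match the measurement at the joining wavelength) can be sketched as follows. This is an illustration only: the function names are hypothetical, linear interpolation is an assumption (the interpolation scheme is not specified above), the numbers are made up, and the polynomial extrapolation between 1800 nm and 1900 nm is not covered.

```python
def interpolate_gaps(wavelength, albedo, valid):
    """Linearly interpolate albedo over flagged points bounded by valid neighbors."""
    filled = list(albedo)
    good = [i for i, ok in enumerate(valid) if ok]
    for i, ok in enumerate(valid):
        if ok:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None or right is None:
            continue  # no valid point on one side: leave for another technique
        w = (wavelength[i] - wavelength[left]) / (wavelength[right] - wavelength[left])
        filled[i] = (1.0 - w) * albedo[left] + w * albedo[right]
    return filled

def scale_model_to_measurement(model_albedo, model_at_join, measured_at_join):
    """Scale a modeled albedo spectrum so that it matches at the joining wavelength."""
    factor = measured_at_join / model_at_join
    return [factor * a for a in model_albedo]

# Made-up example: the 1400 nm point lies in a water vapor band and is flagged.
wl = [1300.0, 1350.0, 1400.0, 1450.0, 1500.0]
alb = [0.50, 0.45, 0.00, 0.35, 0.30]
ok = [True, True, False, True, True]
filled = interpolate_gaps(wl, alb, ok)

# Made-up modeled albedo beyond the measured range, matched at the joining point.
scaled = scale_model_to_measurement([0.90, 0.80], model_at_join=0.80,
                                    measured_at_join=0.60)
```

Scaling by a single factor preserves the spectral shape of the modeled albedo while removing the step at the joining wavelength.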
After the investigation of the above-cloud case for MODIS-derived irradiance, we turn our attention to the below-cloud case, which relates to near-surface irradiance. The primary cloud layer consisted of stratocumulus cloud and was located between 0.8 and 1.2 km. A secondary cloud layer close to the surface, located below the aircraft's minimum flight altitude of 500 ft (approximately 150 m), frequently occurs due to a temperature inversion close to the surface, where leads and cracks in the ice provide the necessary moisture for cloud formation. These clouds also need to be considered to quantify the radiative surface budget, but they are excluded from the analysis here because the aircraft could not underfly them. As a result, only the data from 22:21:00 to 22:25:48 (minimal occurrence of the secondary cloud layer as indicated by the forward and nadir camera imagery) were selected for comparison. In contrast to the above-cloud case, where the surface albedo was held constant in the RTM, the surface albedo variability on the below-cloud leg was considered here. The missing wavelengths were filled in as illustrated in Fig. 6. Deriving the surface albedo directly from the measurements was impossible because the zenith irradiance varied due to the overhead cloud layer. Instead, it was obtained from the ARISE parameterization via the snow fraction, derived from the nadir camera imagery for each point along the leg (see Section 3.1). Figure 8 shows the upwelling and downwelling broadband irradiance comparison between calculations and observations from BBR and SSFR. When incorporating the "13 September surface albedo" into the RTM, the upwelling irradiance calculations resemble the SSFR and BBR measurements (Fig. 8b). The calculations agreed well with SSFR and BBR when clouds were detected, except for the time period before 22:22:48 when the aircraft was entering the cloud field.
The MODIS granule from Aqua was a snapshot of the cloud scene at 22:10, 10 minutes prior to the beginning of the flight leg. Measurement-model discrepancies for specific pixels can therefore be explained by changes of the cloud field over time. The bimodal behavior that is apparent in the time series (Fig. 8a and 8b) as well as in the histograms (Fig. 8c) stems from time periods with and without clouds in the model input. The observations show no evidence of any cloud gap; hence only one mode appears.
From the distance between the cloudy and clear modes, one can estimate the pixel-level bias caused by undetected clouds: 45 W m^-2 for the downwelling and 19 W m^-2 for the upwelling shortwave irradiance. One can also estimate the cloud radiative effect (CRE, provided in Tables 1 and 2 for 11 September and 13 September) from the difference of the net irradiance between the calculations when clouds are detected and when they are not detected (clear-sky).
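The CRE estimate described above can be written compactly as follows. The sign convention (net irradiance as downwelling minus upwelling) is the common one and is assumed here rather than taken from the text; the superscripts "cloudy" and "clear" denote calculations with and without detected clouds.

```latex
% Net shortwave irradiance (assumed convention: downwelling minus upwelling)
\begin{align}
F_{\mathrm{net}} &= F^{\downarrow} - F^{\uparrow}, \\
% Cloud radiative effect as the cloudy-minus-clear difference in net irradiance
\mathrm{CRE} &= F_{\mathrm{net}}^{\mathrm{cloudy}} - F_{\mathrm{net}}^{\mathrm{clear}}
             = \left(F^{\downarrow}_{\mathrm{cloudy}} - F^{\uparrow}_{\mathrm{cloudy}}\right)
             - \left(F^{\downarrow}_{\mathrm{clear}} - F^{\uparrow}_{\mathrm{clear}}\right).
\end{align}
```

Under this convention, a negative shortwave CRE corresponds to cooling, and its magnitude depends on the surface albedo through both upwelling terms.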

Spectral Irradiance Comparison
Although the model-measurement biases in the broadband shortwave CRE are negligible when clouds were detected, the time series shown in Fig. 7b do not quite match, especially for the thin parts of the clouds near the edge of a field. To diagnose the cause, we use the spectrally resolved measurements by SSFR in this section.
For the above-cloud case (11 September), Fig. 9 presents the spectral irradiance comparison at 860 nm and 1640 nm. To put these results into context, the RTM calculations (based on the "13 September surface albedo" with SF = 70%) were also performed with climatological surface albedos of the Arctic dry and wet seasons (0.85 and 0.75) for 860 nm from Kay and L'Ecuyer (2013).
As shown in Fig. 9a, the baseline of the clear-sky RTM calculations varied significantly with surface albedo. The Arctic dry-season surface albedo is clearly inconsistent with the conditions during the flight; instead, the measurements support a surface albedo somewhere between the ARISE parameterization and the wet-season climatology value. These results reveal the importance of the spectral surface albedo since any inaccuracies will propagate into model biases for both cloudy and clear-sky conditions. The irradiance range apparent in the time series varies considerably with surface albedo; the climatological surface albedos result in a much smaller range than obtained from the measured surface albedo. Consequently, the clouds' shortwave cooling effect might be underestimated by some climate models. In this context, it is important to note that the small broadband model-measurement discrepancy of 8 W m^-2 (Fig. 6) is only achieved when the SSFR-derived surface albedo is used in the RTM calculations; otherwise it would be larger. At 1640 nm (Fig. 9b), there is excellent model-measurement agreement for the clear-sky baseline and for cloudy pixels that MODIS detects. This is because snow is dark in the shortwave infrared, and because MODIS COPs in the Arctic are primarily based on these wavelengths. Because of the obvious distinction between cloudy and clear pixels in the measurements and calculations, it is possible to estimate the fraction of partially or fully cloudy pixels that are not detected by MODIS. Of all pixels along the flight leg with a MODIS COT below the detection threshold of 0.5 (i.e., "clear"), 22% (highlighted in green) are actually cloudy. One interesting finding from the broadband irradiance comparison (Fig. 6b) is that the calculations are low-biased relative to the observations. However, the spectral comparison (Fig. 9) does not show this low bias relative to the SSFR measurements at 860 nm and 1640 nm.
To reconcile the apparently contradictory results, we use the full spectrum from the calculations and observations at 21:24 UTC on 11 September, when the broadband calculation indicates a 6 W m^-2 low bias. Figures 10a and 10b show the spectral upwelling irradiance from the RTM calculations and from the SSFR measurements, as well as the ratio of RTM and SSFR. The agreement in the water vapor absorption bands indicates that MERRA-2 is sufficient to prescribe the water vapor content in the calculations. Outside of the gas absorption bands, the calculations agree with the measurements at wavelengths smaller than around 850 nm, but are slightly low-biased at near-infrared wavelengths. Spectral discrepancies are caused by the use of inaccurate (1) surface albedo and (2) cloud optical parameters, some of which compensate each other in the broadband integral. Such error compensation may lead to an improved model-measurement agreement for the "wrong reasons"; therefore, validation efforts should include spectrally resolved measurements.
So far, the analysis has not revealed whether the observed model-measurement discrepancies are due to biases in the COPs or in the surface albedo. Figures 11 and 12 are an attempt to disentangle the two sources of uncertainty despite the limited number of observations during ARISE. Figure 11 shows the ratio between modeled (labeled "RTM") and measured ("SSFR") upwelling broadband irradiance at flight level as a function of the retrieved COT for the collection of cloudy pixels from 11 September. At large COT, clouds dominate the upwelling irradiance, whereas the surface dominates in the limit of zero COT (as stated above, the retrieved minimum is 0.5). The RTM/SSFR ratio therefore indicates how biased the surface albedo is in the RTM as COT approaches 0, and how biased the cloud optical properties are as COT becomes large. The data reveal a functional relationship between COT and the RTM/SSFR ratio. An exponential fit of the form $y = a - b\,e^{-cx}$, where $x$ is the COT and $y$ is the RTM/SSFR ratio, is used to parameterize the upwelling irradiance ratio as a function of COT. The black curve in Fig. 11 suggests that the surface albedo in the calculations is biased low by about 9% (y-axis intercept of ~0.91, $y_0 = a - b$), whereas almost no bias is detectable in the cloud properties ($a$ of ~1.01). Figure 12 shows the fits for the spectrum between 350 and 1800 nm. Two spectra are calculated: the intercept at $x = 0$, $y_0(\lambda)$, corresponding to cloud-free conditions; and the spectrum in the limit of large $x$, denoted as $y_\infty(\lambda) = a(\lambda)$, corresponding to cloudy conditions. The $y_0(\lambda)$ spectrum (red) is consistently lower than 1.0 at short wavelengths (< 1300 nm) and slightly greater than 1.0 for wavelengths longer than 1500 nm. This suggests that the surface albedo is underestimated for the shorter wavelengths and overestimated for the longer wavelengths. Simply changing the snow fraction does not improve the agreement; it is the spectrum itself that seems to have changed somewhat over the course of two days. This could be due to physical changes of the surface, a slightly different sun angle, or changes in instrument performance. The $y_\infty(\lambda)$ spectrum (blue) oscillates around 1.0 for the shorter wavelengths and is consistently larger than 1.0 for longer wavelengths, which might suggest that the retrieved effective radius is slightly biased. With these qualifications in mind, the agreement between MODIS-derived and measured irradiance is remarkable.
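The fitting step can be sketched in Python as follows. The sketch assumes a saturating exponential of the form y = a − b·exp(−c·x), which is consistent with the intercept (a − b) and large-COT limit (a) quoted above. The fitting routine actually used for Figs. 11 and 12 is not stated, so the grid scan over c, the function name, and the synthetic data are all hypothetical.

```python
import math

def fit_saturating_exponential(x, y, c_grid):
    """Fit y = a - b*exp(-c*x): scan c, solve a and b by linear least squares."""
    best = None
    n = len(x)
    for c in c_grid:
        e = [math.exp(-c * xi) for xi in x]
        # For fixed c the model is linear in (a, b); build the normal equations.
        se = sum(e)
        see = sum(ei * ei for ei in e)
        sy = sum(y)
        sye = sum(yi * ei for yi, ei in zip(y, e))
        det = n * see - se * se
        if abs(det) < 1e-12:
            continue  # degenerate basis for this c, skip it
        a = (sy * see - se * sye) / det
        b = (se * sy - n * sye) / det
        sse = sum((yi - (a - b * ei)) ** 2 for yi, ei in zip(y, e))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c

# Synthetic RTM/SSFR ratios (not ARISE data), generated from assumed parameters.
true_a, true_b, true_c = 1.01, 0.10, 0.30
cot = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ratio = [true_a - true_b * math.exp(-true_c * t) for t in cot]

a, b, c = fit_saturating_exponential(cot, ratio, [k / 100 for k in range(1, 201)])
clear_sky_limit = a - b  # intercept at COT = 0: indicates surface albedo bias
cloudy_limit = a         # large-COT limit: indicates cloud property bias
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.2f}")
```

Applied per wavelength, the same fit would yield the two limiting spectra, one from `a - b` and one from `a`.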
Unfortunately, owing to limited sampling time, the below-cloud flight leg (13 September) does not lend itself to any conclusions from a cloud transmittance perspective, since it did not sample the same cloud field as on 11 September. In future flight campaigns, coordinated above- and below-cloud legs will furnish more information for bias analyses than was possible from ARISE.

Conclusions
In this paper, we used aircraft observations to validate shortwave irradiance derived from satellite passive imagery (MODIS) of low-level cloud fields. This was done with two consecutive flights from the NASA ARISE campaign, which sampled the radiation well as surface angles. In addition, accurate knowledge of the surface albedo and of the water vapor vertical distribution is required to derive the net fluxes at the surface, above the cloud layer, and at the top of atmosphere. The two cases analyzed here focused on only one region with one specific surface and cloud type, but this allowed us to develop a validation approach that can help answer specific questions such as:
1. What is the reliability of passive imagery cloud detection in the MIZ and over solid snow-covered regions?
2. How much do undetected clouds bias imagery-derived irradiance, especially at the surface?
3. What is the relative magnitude of irradiance errors caused by undetected clouds, biased cloud properties, incorrect surface albedo parameterization, and water vapor?
This paper sheds some light on these questions using the combined measured broadband and spectral irradiance in the study region, but these results are far from representative of the Arctic as a whole. To gain a statistically based understanding, validation data from multiple experiments will have to be combined. By aggregating data from multiple missions, it should be possible to answer more general questions, which a single case study cannot address:
• Do existing cloud climatologies from space-borne passive imagery observations accurately reproduce the frequency of low-level optically thin clouds over different surface types?
• Do existing climatologies of surface albedo capture the spatial and temporal variability sufficiently to keep errors in the derived all-sky irradiance and cloud radiative effects to an acceptable level?
It is unclear what "acceptable" would mean for the second question, but our study showed that the actual surface albedo may deviate from commonly used climatologies. Throughout the Arctic, inaccurate knowledge of the surface albedo and its variability may lead to an inaccurate estimation of cloud radiative effects and net surface fluxes, even under clear-sky conditions. This is especially important in the visible part of the spectrum where most of the shortwave energy resides, and where the albedo of different surface types (ice, fresh and old snow) varies significantly. Of course, knowledge of the near-infrared variability of snow and ice albedo (via grain size) is also important because it affects the accuracy of imagery-derived cloud products.
To capture the spatial and spectral variability of the surface, we developed a data aggregation technique that combines collective measurements in a partially snow-covered environment into one spectral surface albedo dataset that is parameterized by snow fraction ("binary" representation of the radiative surface properties). The dataset we obtained agrees with ground-based measurements for the two extremes (called spectral end-members): snow and thin ice. In our case, ice-free open ocean was radiatively insignificant, and the two end-members were sufficient to represent the surface variability. In more complex, more general cases, more end-members may be required.
In assessing the relative magnitude of different errors (question 3 above), we found that undetected clouds have the most significant impact on the imagery-derived irradiance. In the case studied here, MODIS did not detect clouds below a threshold of 0.5 in optical thickness, even when including partially cloud-covered pixels. Undetected thin clouds (COT < 0.5) led to a high bias of about 45 W m⁻² below clouds for the downwelling and a low bias of 19 W m⁻² above clouds for the upwelling shortwave irradiance, making them the primary error source. Secondary error sources are (a) surface albedo and (b) cloud optical properties. By using an SSFR-derived surface albedo and atmospheric profiles from aircraft measurements and MERRA-2 along with MODIS COPs in RTM calculations, we achieved excellent agreement with the measured spectral and broadband shortwave irradiance. An overarching goal of studies such as this should be to quantify the absolute and relative magnitude of some of the errors discussed here. This will be done by analyzing more data from ARISE and other field experiments in a fashion similar to that proposed here. In addition, targeted field operations will be required to do so systematically for a range of cloud types. A critical piece of such studies should be the characterization of surface properties, which are almost as important as the cloud properties themselves.
Generalizing the findings from airborne studies such as these will only be possible by improving satellite remote sensing along the way, which in turn requires airborne observations for the development and validation of a new generation of cloud retrievals in the Arctic. Such retrievals will need to account for surface and cloud variability, and address the issue of undetected thin clouds. A database of spectral albedos, acquired with techniques similar to those proposed here, would provide the necessary testbed for developing operational space-based retrievals of surface reflectance, as are available for the lower latitudes. With lower COT thresholds for cloud detection, spatially and temporally dependent surface albedo, and accurate cloud retrievals even for thin clouds, passive remote sensing will significantly improve our current understanding of cloud radiative effects in the Arctic. Finally, it will be important to pursue a similar strategy for the thermal wavelength range.

A. Diffuse/Direct Correction
The diffuse/direct correction is made under the assumption of where n and t are the wavelength range of SPN1 and F↓(λ) is the calculated downwelling spectral irradiance from the RTM. Plug Equation (1)

B. Azimuth Response
Since SSFR only covers part of BBR's bandwidth from 200 to 3600 nm, RTM calculations were used to fill in SSFR spectra beyond its nominal wavelength range of 350-2050 nm. Subsequently, the SSFR spectral irradiance was spectrally integrated (referred to as FSSFR). Fig. 13 shows the ratio (FSSFR/FBBR) as a function of reference azimuth angle, defined as the azimuth angle of the sun with respect to the light collector, with 0 degrees corresponding to the aircraft flying towards the North. A second-order Fourier series was applied to fit the azimuthal dependence captured by FSSFR/FBBR. It constitutes SSFR's azimuthal response at this solar zenith angle, which was then used to correct SSFR's downwelling irradiance for the conditions encountered during other research flights.
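A second-order Fourier fit of this kind can be sketched in Python with an ordinary least-squares solve. The sketch below uses synthetic ratios rather than ARISE data, and all function names are hypothetical. Dividing the SSFR irradiance by the fitted response is an assumption about how the correction is applied, since the text does not spell out that step.

```python
import math

def basis(phi_deg):
    """Second-order Fourier basis evaluated at azimuth phi (degrees)."""
    p = math.radians(phi_deg)
    return [1.0, math.cos(p), math.sin(p), math.cos(2 * p), math.sin(2 * p)]

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_fourier2(azimuth_deg, ratio):
    """Least-squares coefficients of the second-order Fourier series."""
    rows = [basis(p) for p in azimuth_deg]
    m = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Atb = [sum(r[i] * y for r, y in zip(rows, ratio)) for i in range(m)]
    return solve_linear(AtA, Atb)

def response(coeffs, phi_deg):
    """Fitted FSSFR/FBBR ratio at azimuth phi."""
    return sum(c * b for c, b in zip(coeffs, basis(phi_deg)))

def correct_irradiance(f_ssfr, phi_deg, coeffs):
    """Assumed correction: divide by the fitted azimuthal response."""
    return f_ssfr / response(coeffs, phi_deg)

# Synthetic ratios (not ARISE data) generated from assumed coefficients.
true_coeffs = [1.0, 0.03, -0.02, 0.01, 0.005]
azimuths = list(range(0, 360, 15))
ratios = [response(true_coeffs, p) for p in azimuths]
coeffs = fit_fourier2(azimuths, ratios)
```

With azimuth samples spread over the full circle the normal equations are well conditioned; a flight that covers only part of the azimuth range would constrain the fit less well.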

C. Adaptive Thresholding
The threshold value at each pixel location of the image depends on the neighboring pixel intensities I. For a pixel located at (x, y), the threshold value T(x, y) is calculated through the following steps:
1. A subdomain of d × d pixels is selected with (x, y) at its center;
2. The weighted average C(x, y) is calculated for the subdomain using Gaussian weights W(x, y) (Davies 1990), $C(x, y) = \sum_{i} \sum_{j} I(i, j)\, W(i, j)$, where the sums run over the pixels (i, j) of the subdomain;
3. The threshold for the pixel at (x, y) is the difference between the weighted average calculated in the previous step and a constant $C_0$, $T(x, y) = C(x, y) - C_0$.
d and $C_0$ are input parameters that can be adjusted to improve the results. In this study, d is set to 1501 and $C_0$ is set to 0.
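The three steps can be sketched in plain Python as follows. This is a minimal illustration on a tiny synthetic image, not the implementation used in this study. The Gaussian width `sigma`, the renormalization of the weights at the image border, and the rule that a pixel counts as bright when its intensity exceeds the threshold are all assumptions; `c0` stands for the constant subtracted from the weighted average.

```python
import math

def adaptive_threshold(image, d, c0, sigma):
    """Return T(x, y) = C(x, y) - c0, with C a Gaussian-weighted local mean."""
    half = d // 2
    ny, nx = len(image), len(image[0])
    # Gaussian weights on the d x d window (normalized per pixel below).
    w = [[math.exp(-(di * di + dj * dj) / (2.0 * sigma ** 2))
          for dj in range(-half, half + 1)]
         for di in range(-half, half + 1)]
    T = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            num = den = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    i, j = y + di, x + dj
                    # Assumption: window pixels outside the image are skipped
                    # and the remaining weights are renormalized.
                    if 0 <= i < ny and 0 <= j < nx:
                        wij = w[di + half][dj + half]
                        num += image[i][j] * wij
                        den += wij
            T[y][x] = num / den - c0
    return T

def classify(image, T):
    """Assumed rule: a pixel is 'bright' if its intensity exceeds T."""
    return [[image[y][x] > T[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]

# Tiny synthetic image: dark background (0.2) with one bright pixel (1.0).
image = [[0.2] * 5 for _ in range(5)]
image[2][2] = 1.0
T = adaptive_threshold(image, d=3, c0=0.0, sigma=1.0)
mask = classify(image, T)
```

The nested loops cost on the order of d² operations per pixel, so for a window as large as d = 1501 a separable or FFT-based Gaussian filter from an image-processing library would be the practical choice.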