A kernel-driven BRDF model to inform satellite-derived visible anvil cloud detection

Satellites routinely observe deep convective clouds across the world. The cirrus outflow from deep convection, commonly referred to as anvil cloud, has a ubiquitous appearance in visible and infrared (IR) wavelength imagery. Anvil clouds appear as broad areas of highly reflective and cold pixels relative to the darker and warmer clear sky background, often with 10 embedded textured and colder pixels that indicate updrafts and gravity waves. These characteristics would suggest that creating automated anvil cloud detection products useful for weather forecasting and research should be straightforward, yet in practice such product development can be challenging. Some anvil detection methods have used reflectance or temperature thresholding, but anvil reflectance varies significantly throughout a day as a function of combined solar illumination and satellite viewing geometry, and anvil cloud top temperature varies as a function of convective equilibrium level and tropopause height. This paper 15 highlights a technique for facilitating anvil cloud detection based on visible observations that relies on comparative analysis with expected cloud reflectance for a given set of angles, thereby addressing limitations of previous methods. A one-year database of anvil-identified pixels, as determined from IR observations, from several geostationary satellites was used to construct a bidirectional reflectance distribution function (BRDF) model to quantify typical anvil reflectance across almost all expected viewing, solar, and azimuth angle configurations, in addition to the reflectance uncertainty for each angular bin. Application of 20 the BRDF model for cloud optical depth retrieval in deep convection is described as well.

develop the BRDF model, IR calibration is based on hourly adjustments to GSICS-referenced VIRS ray-matched pairs (Scarino et al. 2017).
The Meteosat Second Generation (MSG) satellites are not included in this analysis because 1-km VIS imagery is not collected across the entire 65 °N to 65 °S domain throughout a day. Much of the Northern Hemisphere is observed at 1 km but, over the Southern Hemisphere, a moving window of 1-km data is collected, which follows the Sun and captures data during well-5 illuminated periods of the day. Visible data are only available at 3-km resolution across the full disk view, which would be inconsistent with the GOES and Himawari-based analyses. Given that MSG data is incorporated into the GSICS inter-calibration analysis, we expect that methods developed from GOES and Himawari will perform consistently when applied to MSG data.
Note that for some analyses, satellite data are supplemented by modeled atmospheric profiles provided by the Global Modeling and Assimilation Office (GMAO) Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) 10 product.

Multi-angle lookup table for anvil cloud reflectance
A three-dimensional lookup table (LUT) of anvil cloud reflectance was built from pixels classified as anvil using IR observations that satisfy a set of conditions (Bedka and Khlopenkov 2016;Khlopenkov and Bedka 2018). The specifics of the IR-based anvil classification process are described in Appendix A. The three dimensions of the LUT are VZA, SZA, and RAA, yielding the 15 mean anvil reflectance and standard deviation derived from one year of satellite observations. There are 18 bins along each dimension, with 5° bin increments from 2.5° to 87.5° (±2.5°) for both VZA and SZA, and 10° bin increments from 5° to 175° (±5°) for RAA, where 0° RAA is the backscattering angle. Figure 3 illustrates the average anvil reflectance for overhead Sun, for a solar zenith angle near 45°, and approaching early sunset. Although the allowable VZA limit for actual observations is ~88.4°, it should be noted that the maximum observed VZA for this model is ~77.6°, and sampling per VZA bin can suffer with 20 increasing viewing angle in the poleward direction, where deep convection is less likely to be founFu, D.rthermore, because VISbased observations can become highly shadowed and variable at large SZAs, we caution the use of this method where SZA is greater than 82° despite an allowable LUT limit of up to 90°. Even below this limit, increased sun angle can cause significant shadowing from OT, thereby excluding potential samples (see next paragraph). Therefore, it is not surprising that sample size is sparser, and uncertainty is higher, at the most extreme angular bins, especially when compounded by both high SZA and VZA.

25
This pattern can be observed in Figs. 4 and 5, which illustrate the bin sampling and reflectance standard deviation (σ), respectively.
Anvil reflectance pixels must satisfy three homogeneity criteria before being included in the LUT. First, each pixel of a 5 × 5 window/array centered on the pixel of interest must initially be classified as an anvil using the IR-based method (see Appendix A). As such, each pixel of the 5 × 5 window must have a rating greater than 0, and the rating of the pixel being considered for the 30 LUT (i.e., the center pixel of the 5 × 5 window) must be greater than 75. These requirements help ensure that each included pixel is in an area of reasonably contiguous anvil detection as determined by the IR method, and is therefore likely to be separated from anvil cloud edges where the quality of anvil detection is more suspect due to semitransparency effects. This 5 × 5 array technique must inherently exclude pixels within two spaces of the image edge, but this is necessary given that the continuity of the anvil cloud beyond the image boundary is unknown, and therefore we are unable to reliably judge those edge pixels.

35
The IR anvil rating threshold of 75 was chosen for the LUT based on empirical judgement of the relationship between the detection rating and remarkably obvious, spatially coherent cold clouds, which leads to higher certainty in the model. Compared to having no anvil threshold criterion, setting the anvil rating cutoff to 75 reduces the overall dataset size by 6.5% while slightly lowering LUT bin standard deviations, but the magnitude of the LUT does not meaningfully change. Nominally and in this study, https://doi.org /10.5194/amt-2020-206 Preprint. Discussion started: 11 June 2020 c Author(s) 2020. CC BY 4.0 License. both the IR and VIS anvil masks are defined by an anvil detection rating of 15 or more, which is again a determination made based on empirical assessment of satellite imagery from varied regions and seasons. Having different thresholds for defining a mask and developing the LUT of expected reflectance is acceptable because it is critically important that the LUT observations are exceptionally consistent and predictable. As such, with this first criterion we establish a solid foundation for the BRDF model.
The second and third homogeneity criteria require that the standard deviation of the 5 × 5 VIS reflectance array is less than 3% 5 of the 5 × 5 reflectance average, and that the standard deviation of the 5 × 5 IR BT is less than 1 K. These requirements build upon the first criterion by independently quantifying a standard for homogeneity of the surrounding pixels. The thresholds of 3% and 1 K are well-suited for filtering pixels near anvil edges as a secondary check to the anvil continuity test above. More importantly, perhaps, is the ability of these standard deviation checks to exclude anomalous anvil pixels, particularly those that may be associated with OT or gravity waves. These features and other irregularities in the anvil generate localized temperature 10 variability and shadowing effects that would not satisfy the filter thresholds. In the VIS case, comparing the standard deviation to the 5 × 5 average rather than the center pixel reflectance helps mitigate the rare occurrence that an abnormally bright center pixel surrounded by dark anvil pixels will satisfy the homogeneity filter, given that the standard deviation in such a case is likely to be less than 3% of the bright pixel but not less than 3% of the array average.

15
Despite the fact that the three-dimensional LUT approach described in the previous section is relatively simple to construct and computationally fast for estimating anvil reflectance across the spectrum of viewing and illumination angles, it suffers from certain drawbacks. The anisotropy of an Earth target is expected to vary continuously with viewing and solar geometry. The finite discretization of angular bins in the LUT, however, can generate sharp discontinuities in the anvil reflectance between neighboring angular bins. The non-uniformity in the sample sizes between bins also impairs the smoother transition of 20 reflectance across the bins resulting in the discontinuous patterns that are seen in the reflectance contour lines of Fig. 3, especially at higher SZA. In addition, the LUT approach is unable to define an anvil reflectance for bins without convection.
Here we describe the construction of a kernel-based BRDF model for characterization of anvil top-of-atmosphere reflectance at continuously varying SZA, VZA, and RAA. This approach not only mitigates the discretization discontinuities that are an effect of the LUT approach, but also fills in the missing intermediate bins. The BRDF is described by a linear superposition of a set of 25 weighting functions, e.g., geometric and optical, that characterize its shape, which is the defining concept of a kernel-driven model (Roujean et al. 1992;Wanner et al. 1995). Despite needing to be flexible enough for application to a variety of inhomogeneous scene types, kernel-driven BRDF models are able to adequately provide description of the anisotropic reflectance of natural surfaces (Hu et al. 1997;Wanner et al. 1997;Hu et al. 1999;Breon and Maignan 2017). The BRDF model is based on the work of Roujean et al. (1992) that describes a bidirectional reflectance R as a linear sum of three kernels in the 30 following form: where θ, ψ, and φ are the SZA, VZA, and RAA, respectively. Expressions f1 and f2 are the model kernels defined as analytical functions of θ, ψ, and φ that represent the geometric and volume scattering components, respectively. These functions are given in the forms: Terms K0, K1, and K2 are scene-specific kernel coefficients that are determined using the least-squares solution of the linear 5 BRDF function for a given set of observations. That is, K0, K1, and K2 are derived from the solution to Eq. (1) for the available empirical data and at the certain angular configuration, thereby providing the best fit for the analytical functions and the observed reflectance of each bin. Furthermore, the coefficients are linearly interpolated across adjacent three-dimensional bins in order to yield an even smoother transition of predicted R across continuous angular variation. Finally, the analytical expressions of f1 and f2 are defined such that these terms vanish at nadir viewing and overhead sun conditions. Thus, K0 is the isotropic component 10 representing the overhead sun reflectance at nadir view. An illustration of this model is seen in Fig. 6, using input data shown in . For anvil measurements from GEO imagers, the fact that atmospheric absorption is minimal above anvils and that the GEO imagers have consistent imaging schedules with repeating angular combinations supports the argument for using the kernel-based approach for modeling TOA anvil reflectance (Hu et al. 2004). In this study, K0, K1, and K2 are computed for each angular bin utilizing the satellite-observed reflectance acquired within the bin supplemented by additional measurements from the neighboring bins that are ±5° apart in SZA and VZA, and ±25° apart in RAA from the center bin. Using the extended set of 20 input data for computing the kernel coefficients ensures that the transition of modeled reflectance is uniform across the bins. In addition, this method also allows for modeling of the bidirectional reflectance for the non-filled bins based on the measurements from surrounding bins. The 1-σ uncertainty of the modeled reflectance for a given angular bin is defined by the standard error of the regression that is computed for the least-squares fit between the analytical functions and the observed reflectance values during the determination of K0, K1, and K2.

Visible anvil mask overview
As was noted in the Sect. 1, many satellite-based methods have been developed to identify anvil clouds. These methods typically rely on IR and/or WV absorption-band BT and are perhaps augmented by ancillary information such as tropopause temperature from weather prediction models or reanalysis. Bedka and Khlopenkov (2016) demonstrated a method that incorporates spatial analysis for identification of anvil cloud pixels, but this approach is designed to capture anvil regions near OT and not the entire 30 anvil cloud. A COD threshold-based method for recognizing DCC or anvil features, like that described by Hong et al. (2007), can help to address assumptions of full anvil extents, but because the method is pixel-based and does not incorporate spatial analysis it can be adversely impacted by shadowing due to texture or OT. As such, there existed a need for a reliable VIS-based anvil mask as an important prerequisite to the Bedka and Khlopenkov (2016) VIS texture and OT detection algorithm, in that a search for texture and OT should only occur within the accurate full extents of an anvil cloud.

35
Anvil reflectance prediction using the method described in Sects. 2.2 and 2.3 coupled with spatial analysis offers an opportunity to address previous limitations and enable efficient texture and OT detection. The VIS texture detection process is based on Fourier analysis of spatial frequencies in the VIS imagery. Spatial frequencies consistent with texture are present not only within convection, but also amongst scattered clouds, and especially near cloud edges. The anvil mask is used to limit the Fourier https://doi.org/10.5194/amt-2020-206 Preprint. Discussion started: 11 June 2020 c Author(s) 2020. CC BY 4.0 License.
analysis to only the actual anvil clouds, thereby eliminating false detections associated with other cloud types while also reducing processing time. Classification of the VIS anvil mask is similar to that for the IR mask (Appendix A), except with scoring based on VIS input with reference to the BRDF model.
As described by Bedka and Khlopenkov (2016), the VIS anvil mask is determined by a scoring system that uses an accumulation of histogram-derived information from a nearby ensemble of pixels relative to the assessed pixel. The input VIS reflectance 5 imagery is processed in subsets described by a 50-km-diameter circular window evaluated at every other pixel and every other line. The peak of the reflectance-based histogram within each subset is evaluated in a way such that the smooth, uniform signature associated with an anvil cloud is detected, which should ideally exhibit a tall and narrow distribution. An initial VIS anvil rating is constructed based on three main considerations: 1) the width and height of the histogram peak, excluding a possible peak at low-reflectance bins that correspond to clear-sky areas, 2) the difference between the observed reflectance and 10 some nominal reflectance predicted by the BRDF model, and 3) the existence of saturated pixels corresponding to bright OT edges. The second consideration listed above exemplifies the main purpose of this article -marking the major distinction between the VIS anvil mask derivation described here and the original method of Bedka and Khlopenkov (2016), who relied on an empirically-derived function of cos(θ) to formulate a nominal anvil reflectance that lacked consideration for bidirectional effects. The nominal reflectance relative to the histogram maximum prominence determines whether a VIS anvil rating of either 15 8 or 16 is assigned for the assessed subset pixel and surrounding pixels. As such, anvil rating for a given pixel will accumulate as the window moves through the image, until finally the accumulated mask rating is resampled to the original non-subset resolution (Bedka and Khlopenkov 2016).
The result of the initial considerations is a preliminary mask of pixels corresponding to evenly bright areas, although possibly with some saturated pixels. The area for inclusion in the mask is then expanded by 6-10 km (larger expansion is used for higher 20 SZA) to include any regions that potentially contain cloud shadows such as those around OT cores, which may have been missed by the histogram analysis. A shadowed pixel is then included in the mask if it is sufficiently surrounded by pixels defined in the preliminary mask. Finally, the expanded mask is multiplied by a scaled difference between the tropopause temperature and the pixel BT. This last step ensures that the resulting anvil mask corresponds to sufficiently cold areas that match the actual extents of an anvil cloud, and furthermore indicates that there remains an IR component to the VIS mask despite its designation.

Comparison with CloudSat anvil cloud detection
Anvil detections from GOES-16 based on the VIS mask, the IR mask, the WV−IR BTD method, and a tropopause-normalized IR temperature threshold method (i.e., IR−Trop BTD) were validated against independent determinations of anvil clouds from CloudSat using ~800 CloudSat granules from January, April, July, and October of 2018. The CloudSat definition of anvil is based on the method of Young et al. (2013) and relies on the 2B-CLDCLASS product. Following their technique, an anvil cloud 30 is determined when high/cirriform clouds are connected to within 33 product profiles of a vertical DCC cluster of sufficient depth, provided that the region below the high clouds is cloud free or only partially filled by single-layer, low-level clouds (Young et al. 2013). Receiver operating characteristics (ROC) curves, which report true positive rate against false positive rate, are then determined relating the rate of agreement with CloudSat anvil indications (probability of anvil detection) within 5 minutes using each of the four methods listed above to the rate of false alarms (positive indications that disagree with CloudSat).

35
Note that because of the nature of CloudSat measurements (two-dimensional vertical profiles along the satellite track), these comparisons can only be considered validations in a relative sense, rather than an absolute sense. That is, because the CloudSat profile must encounter a DCC cluster within 33 profiles of high/cirriform clouds in order for the clouds to be assessed as anvils, it is possible that true anvil clouds remain unidentified by CloudSat simply because the DCC cluster associated with them was https://doi.org/10.5194/amt-2020-206 Preprint. Discussion started: 11 June 2020 c Author(s) 2020. CC BY 4.0 License.
not along the scan path. Therefore, false alarm rates may be incorrectly inflated in this validation approach. Also, given the ~13:30 local equator crossing time of CloudSat, these validation results are only representative of low-SZA conditions.
Nevertheless, being that all four methods are assessed with these same limitations, we believe their relative comparison remains fair.

5
A simple parameterization for anvil COD based on the difference between observed (Obs) VIS reflectance and BRDF-modelpredicted anvil reflectance was developed. This feature is important for nowcasting, i.e., flight routing for airborne science campaigns, because it allows for rapid estimation of COD based on readily available input, whereas multi-band cloud retrieval algorithms, such as that of SatCORPS (Minnis et al. 2011;Minnis et al. 2020), require ancillary datasets, additional preliminary computations, and longer processing time (e.g., cloud masking is required before COD retrieval). That is, this parameterization 10 can approximate imager multi-band-retrieved COD in a matter of seconds rather than minutes, which is significant for real-time weather applications. The parameterization is developed based on the SatCORPS COD product, and thus will introduce additional error on top of the uncertainty of those retrievals. Therefore, the approximation should not be used as a replacement for the true retrievals but rather should be employed as a general estimate of COD when timeliness is the chief concern.
The approximation is defined by an exponential fit of COD as a function of Obs minus BRDF (Obs−BRDF) reflectance, 15 calculated as a function of SZA, with twenty-eight 3° SZA bins from 0°-3° to 81°-84°. Figure 9 shows the 0°-3°, 45°-48°, and 78°-81° fits in order to highlight how the shape of the functional relationship of COD to Obs−BRDF reflectance changes with SZA. The fits are based on calibrated VIS reflectance (Scarino et al. 2017;Doelling et al. 2018) and ice-cloud COD using Minnis et al. (2020) methodology, which was adapted to GOES-16 imagery over the CONUS in July 2018 for pixels coincident with the VIS anvil mask. Rather than being fit to the entirety of the dataset that satisfies each SZA bin, the two-term exponential model is 20 simply guided by the maximum density of data found along the curve, as indicated by black circles in Fig. 9. Fitting as such prevents influence from outliers and bad retrievals, and therefore better models the most common functional relationship between COD and Obs−BRDF reflectance.
The COD parameterization consistency was evaluated relative to its SatCORPS reference by comparing with GOES-16 COD that is independent from the training dataset, derived from daytime imagery of Hurricane Florence on 11 September 2018.

25
This date was chosen because Florence maintained Category 4 intensity throughout the day and thus sustained a large area of persistent anvil cloud across the full spectrum of SZA, which is unlike land-based convection that typically exist for a few hours in the late afternoon. With land-based convection, it is difficult to distinguish whether variations in COD are due to increasing/decreasing storm intensity, or if they are being caused by a parameterization that is dependent on SZA. It is important to note that although an intense hurricane should have relatively consistent COD throughout a day when averaged across the 30 storm anvil, variations in reflectance associated with spiral band development or eyewall replacement do occur, which can appear in our data.

Features of the kernel-driven model
As was described in Sect. 2.3, the most significant result of the kernel-driven BRDF is the mitigation of discretization 35 discontinuities between adjacent LUT angular bins and completion of bidirectional reflectance for non-filled bins based on measurements from surrounding bins. These effects are apparent, as the patterns seen in Fig. 6 are smoother and more coherent https://doi.org/10.5194/amt-2020-206 Preprint. Discussion started: 11 June 2020 c Author(s) 2020. CC BY 4.0 License. than those of Fig. 3, with gaps filled, thereby creating a more complete three-dimensional model. Keep in mind that the kerneldriven approach fills gaps in LUT based on an interpolation scheme that draws not only from neighboring VZA and RAA bins, but also from adjacent SZA increments. This is the reason why Fig. 6c, for example, appears exceptionally more continuous than Fig. 3c despite an apparent lack of valid VZA and RAA bins with which to interpolate from to complete the model as shown.
The kernel-driven model allows for filling of angular bins to exactly one bin beyond the valid coverage of the LUT.  Figs. 4b and 5b, respectively. The large differences in Fig. 7c are a result of the greater amount of interpolation required given the discrete nature of the LUT in this volume, which has a much higher associated 1-σ uncertainty compared to that of the lower SZA conditions (compare Fig. 5c to Figs. 5a and 5b), and also overall higher uncertainty than that of the model (Fig. 8). Note that the 1-σ uncertainty of Fig  Another way to qualitatively evaluate anvil mask performance is through comparison with other IR-based methods described 30 above. Examples of these methods are provided in Fig. 11, which shows developing thunderstorms near Kansas and Missouri observed by GOES-16. Figure 11a overlays the IR anvil mask (Appendix A), which had been used to construct the VIS anvil reflectance database for the LUT, and the VIS anvil mask described above. The VIS mask identifies the bright DCC whereas the IR mask extends beyond the boundaries of the VIS mask into pixels that are still cold but not as bright. Even in the northeast portion of the image where the explanation for disagreement is perhaps more questionable, close examination reveals that the 35 VIS mask is outlining a narrow region of bright cloud whereas the IR mask identifies less cohesive shapes. Both approaches define the anvil, and the preferable approach depends on application and tolerances of those that may use the data. Commercial aircraft, for instance, may choose to avoid any indication of anvil cloud out of an abundance of caution. In contrast, airborne https://doi.org/10.5194/amt-2020-206 Preprint. Discussion started: 11 June 2020 c Author(s) 2020. CC BY 4.0 License.
contour, which signifies the 34-kt wind radii (one radius per quadrant for a total of four 34-kt radii). The NOAA National Hurricane Center provides wind quadrant radii for maximum sustained wind values of 34, 50, and 64 kts. The coordinates defining each quadrant extend in the NE, SE, SW, and NW directions, radiating outward from the center of the storm to a distance where the indicated windspeed is expected to be possible. This study uses these radii as a basis for COD evaluation, with each one drawn on the Fig. 13 imagery in red (34 kts), magenta (50 kts), and blue (64 kts).

5
In the morning, the high SZA creates shadows on Florence cloud tops due to OT and gravity waves, seen most prominently near the eyewall but also within the outflow shield (Fig. 13a). Compared to mid-day (Fig. 13d), the overall observed reflectance is lower at around 0.7 to 0.8 on average, rather than ³ 0.9 when the Sun is higher overhead. According to the BRDF model for this morning angular combination (Fig. 14), the predicted anvil reflectance is between 0.71 and 0.81 -gradually increasing from the southwest to the northeast part of the image. The COD derived by the SZA-dependent functional relationship (Fig. 9), namely 10 Fig. 13a minus Fig. 14, is shown in Fig. 13 b. The areas where observed reflectance is much less than predicted reflectance is where low COD values are expected as signified by the darkest shades. The veracity of these dark shades within the outflow shield is questionable, however, because it is unlikely that COD is fluctuating so rapidly across these short distances where deep convective clouds are located. In other words, shadowing and texture generates severely under-estimated COD within threedimensional cloud-top structures. This pattern, however, is consistent with SatCORPS results (Fig. 13c), which is our current 15 baseline for comparison as the parameterization is dependent on the SatCORPS reference. Bedka et al. (2019) show that reflectance smoothing prior to COD computation dampens shadowing effects at high SZA, which results in a more spatially consistent product. That smoothing approach, however, was purposefully excluded for this study. Observed reflectance that is only slightly less than predicted values signifies a greater COD, shown in gray shades. Cloud optical depth grows exponentially as the observed reflectance matches and surpasses the predicted anvil reflectance. Note that despite the potential for the 20 exponential function to predict excessive COD values given rather modest changes in either reflectance value, COD is capped at a maximum value of 150 to be consistent with SatCORPS output.
At high SZA, the exponential growth of COD with Obs−BRDF is rather extreme and therefore uncertainty is high. Based on Fig.   9c, when Obs−BRDF is close to 0, a combined 0.05 error in observed or predicted reflectance could amount to the difference between 70 and 150 COD, or an 80-COD unit variance. At such early and late times in the day (as Figs. 13g, 13h, and 13i behave 25 similarly to 13a, 13b, and 13c) the function is steep and therefore highly sensitive to reflectance uncertainty, not to mention the increased standard error of the fit itself due to variable SatCORPS retrievals at these high-SZA conditions. On the other hand, at midday (Figs. 13d, 13e, and 13f), the exponential function is less steep, with a shape somewhere between that of Figs. 9a and 9b.

Appendix A: Calculating the IR anvil mask
The steps below describe how spatial infrared temperature patterns are quantified to identify convective anvil clouds in the form of an "anvil mask." The anvil mask is a rating that indicates a confidence in anvil detection, with values above 10-15 roughly corresponding to human perception of anvil cloud extents and values above 100 indicating a high level of confidence.
Quantification of anvil confidence is a crucial step in development of the kernel-driven BRDF.

5
Pixel-level IR BT data are first subtracted from the local tropopause temperature in order to obtain the brightness temperature difference (BTD) relative to the tropopause. This BTD is processed in circular subsets of 22-km diameter extracted at every other pixel and every other line. The local distribution of BTD within each subset is analyzed by constructing a histogram H having N=32 bins and covering the range from -35 K (i.e., warmer than the tropopause), which is a low tropopause-relative bound for anvil clouds, to 13 K colder than the tropopause, which only occurs in updraft regions. Pixels colder than the 13-K threshold are 10 accumulated in the last histogram bin. Figure A1 shows examples of BTD histograms calculated for a typical anvil cloud within a convective system (red columns) and for a region outside the anvil cloud (blue columns).
The BTD within anvils should exhibit spatially uniform cold temperature values, which in most cases will result in a sharply peaked histogram. As such, it follows that anvil rating should be made proportional to the peak height Hi, which indicates the number of counts in the i-th bin, and therefore is also proportional to that bin's index i given that higher bin indices correspond to 15 colder regions, e.g., Fig. A1. The following formula, refined through extensive testing, describes the dependence of anvil rating ranvil on index i: Here, D is the diameter of the histogram window in pixels and the term in parentheses acts to gradually flatten the ranvil(i) dependence at higher levels of confidence in anvil detection as BTD reaches zero and becomes strongly positive. Based on the 20 example shown in red in Fig. A1, using an 11-pixel diameter D at 2-km pixel resolution, the peak in the red histogram at bin 20 has a height of 23, which yields an IR anvil rating of 69. In most cases the formula above describes uniformly cold anvil clouds reasonably well. If non-uniform regions around OT cores are causing the histogram peak to split over several bins, the major peak can be counted together with neighboring bins to make the total contribution equivalent to a single strong peak, thereby lending stability in resultant ranvil across the whole anvil.

25
Finally, the obtained anvil mask has to be expanded in order to include pixels along the anvil boundary, where there is only partial anvil coverage in the subsetting window. This expansion is implemented by raising the rating for all anvil pixels inside the 22-km circular window that have BTD larger than 7.5 K below the histogram's peak. Their anvil rating is increased to reach the level of the peak bin. After this spatial expansion, the IR anvil mask presents a reasonable match relative to the actual anvil cloud region, with the anvil extents filled with the nearly uniform field of ranvil.

30
Although this method attempts to identify uniform cloud areas near the tropopause, it does not guarantee that a broad area of extremely cold cloud is indeed an anvil cloud. For example, a large area of cold jet stream cirrus in a winter storm may be assigned a significant anvil rating if a local histogram happens to have sufficient criteria. This leaves some room for improving the IR anvil rating, for instance by 1) incorporating a difference relative to the regional background in order to help define convective environments (i.e., cold cloud vs. much warmer clear sky background), or 2) using model-derived atmospheric Without tropopause normalization, the relative intensity of the northern storms appears less than that of the southern storms despite both being significant producers of severe weather.