the Creative Commons Attribution 4.0 License.
Radar and Environment-based Hail Damage Estimates using Machine Learning
Luis Ackermann
Joshua Soderholm
Alain Protat
Rhys Whitley
Lisa Ye
Nina Ridder
Abstract. Large hail events are typically infrequent, with significant time gaps between occurrences at specific locations. However, when these events do happen, they can cause rapid and substantial economic losses within a matter of minutes. Therefore, it is crucial to be able to accurately observe and understand hail phenomena in order to better mitigate this impact. While in-situ observations are accurate, they are limited in number for an individual storm. Weather radars, on the other hand, provide a larger observation footprint, but current radar-derived hail size estimates exhibit low accuracy due to horizontal advection of hailstones as they fall, the variability of hail size distributions (HSD), complex scattering and attenuation, and mixed hydrometeor types. In this paper, we propose a new radar-derived hail product that is developed using a large dataset of hail damage insurance claims and radar observations. We use these datasets coupled with environmental information to calculate a Hail Damage Estimate (HDE) using a deep neural network approach that aims to quantify hail impact, achieving a critical success index of 0.88 and a coefficient of determination against observed damage of 0.78. Furthermore, we compared HDE to a popular hail size product, the Maximum Estimated Size of Hail (MESH), allowing us to identify meteorological conditions that are associated with biases in MESH. Environments with relatively low specific humidity, high CAPE and CIN, low wind speeds aloft, and southerly winds at the ground are associated with a negative MESH bias, potentially due to differences in HSD or mixed hydrometeors. In contrast, environments with low CAPE, high CIN, and relatively high specific humidity aloft are associated with a positive MESH bias.
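The skill scores quoted in the abstract can be made concrete with a short sketch. The Python below computes the critical success index (CSI), probability of detection (POD), and false alarm ratio (FAR) from a 2x2 contingency table, plus a coefficient of determination, using their standard textbook definitions. The function names, the damage threshold, and the sample arrays are illustrative assumptions, not code or data from the study, and the study may define damage events or R² differently (for example as a squared Pearson correlation).

```python
import numpy as np

def categorical_scores(observed, predicted, threshold=0.0):
    """Return (CSI, POD, FAR) for 'damage exceeds threshold' events.

    The threshold is a hypothetical choice made for this illustration.
    No guard against empty categories (division by zero) is included.
    """
    obs_event = np.asarray(observed) > threshold
    pred_event = np.asarray(predicted) > threshold
    hits = np.sum(obs_event & pred_event)           # damage observed and predicted
    misses = np.sum(obs_event & ~pred_event)        # damage observed, not predicted
    false_alarms = np.sum(~obs_event & pred_event)  # damage predicted, not observed
    csi = hits / (hits + misses + false_alarms)
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    return csi, pod, far

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up relative damage values, for illustration only.
obs = np.array([0.0, 0.0, 0.1, 0.3, 0.5, 0.0])
pred = np.array([0.0, 0.05, 0.12, 0.25, 0.0, 0.0])
csi, pod, far = categorical_scores(obs, pred)
print(f"CSI={csi:.2f} POD={pod:.2f} FAR={far:.2f}")
```

Because CSI ignores correct negatives, the way zero-damage samples are included or filtered changes the score only through the miss and false-alarm counts.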
Luis Ackermann et al.
Status: closed
- RC1: 'Referee Comment on amt-2023-161', Tanya Brown-Giammanco, 28 Aug 2023
General Comments:
This paper presents an analysis of the performance of MESH in predicting insured property hail damage. The authors determined that while MESH alone does not predict damage magnitude well, it has reasonable skill in determining whether damage will occur when corrected for horizontal advection of hail, shown nicely in Figure 6. They developed a neural network that incorporated additional variables from ERA5 and determined which variables were most important in predicting hail damage. They found that the neural network improved the hail damage prediction and identified environments that would lead to positive and negative biases. I found Figures 10 and 11 representing the Brisbane case studies to be particularly helpful in visualizing the calculations and resulting differences between MESH and the newly developed tool.
The techniques presented in this paper advance the state of hail damage prediction knowledge. I commend the authors on their work to relate hail damage data to the meteorological and environmental characteristics. However, there are a few assumptions that need to be stated more clearly. The hail size itself is not the only contributor to damageability. Our paper in ASCE’s Natural Hazards Review (Brown-Giammanco, Giammanco, and Estes 2021) showed that the hardness characteristic also plays a role. I realize that it’s impractical to gather these data for all events, but this can certainly contribute to explaining why damage may not be perfectly related to the hail size determinations. In addition, the underlying property characteristics (material type, age, maintenance, sheltering) will also play a role in the damageability. Finally, the assumption that the claims data are the “truth” is somewhat problematic, as there will be human judgment biases incorporated. To be clear, I think this dataset is about the best that can be used at this point, so I’m not at all suggesting it’s not appropriate; I am just pointing out that its limitations should be stated.
Specific Comments:
- Line 19: It seems as though “measured” may not be the appropriate term to use here. Perhaps “estimated” or “approximated” would be better, since none of the methodologies listed, or the combination of them, really gives a spatial measurement. Related, “or” seems as though it should be “and/or”, since you probably increase accuracy by using a combination of methods where possible.
- Lines 29-32: I understand that insurance data may be more widely available in Australia, but that is not the case in the United States, and I’m not sure about elsewhere. This is a limitation that should also be noted.
- Line 65: It could be useful to understand the scope of the data, by providing some info on Suncorp. Are they a top-5 carrier in Australia? Or is there an estimate of the percent of properties that they insure? Can you also clarify whether just residential, just commercial, or both kinds of policies were included in the dataset?
- Line 66: Can you please expand on the property characteristics by providing some examples? Did you get info on age, size, number of stories, any roofing or siding materials used? It looks like maybe the information contained in line 77 is all of the property characteristics? If so, I think it would be good to include at least some examples earlier in line 66.
- Lines 100-101: This choice could introduce some unintended bias. What if a property that was filtered out truly didn’t have any claimable damage? Maybe it was a new property or made of superior materials compared to its neighbors. I understand that you’re trying to filter out what could potentially be bad data, but you may be filtering out interesting situations.
- Figure 4: Is there a reason why only the blue circle radars are labeled, while Table 1 includes several others that were seemingly included in the study? Seems like all that were included should be labeled in the figure.
- Line 110: You mention several other data types in this section, so maybe “Radar data” isn’t the best section title?
- Lines 121-123: It’s unclear why this statement about the ERA5 data is included here. Please provide a connection to the SHI. I found more information about this in lines 133-134, so maybe it does not need to be duplicated in lines 121-123?
- Lines 130-131: Can you please provide more explanation as to why you’re doing +/- one day while in the event identification you’re doing +/- seven days? I’m actually not quite following why you need the +/- one day on the SHI data, since it should have far more temporal accuracy than the damage data.
- Lines 162-163: While not stated, you are also assuming that hail size is the controlling factor for damage. While it is a factor, it is not the only one: property age, materials, sheltering, etc. also play a role. It is OK to make assumptions, but the stipulations of those assumptions must be clearly stated.
- Lines 210-211: Again noting that the assumption that the damage is only related to meteorological factors is not correct. Property factors will also play a role.
Technical Corrections:
- Line 28: reference should be in parentheses.
- Line 35: add “and” before “therefore”.
- Line 42: the information about the second limitation is fairly far removed from the original instance in the paragraph, so it may be worth restructuring this sentence to remind the reader of what the limitation is, something like “the second limitation of a mismatch between the radar-estimated hail location and actual ground observations can be mitigated by…”
- Line 47: add “and” before “low-level”.
- Line 69: remove the closing parenthesis. In addition, you state that the ratio is just referred to as “damage”, yet in Figure 1 you have “relative damage”. Are these intended to be one and the same? If so, please choose one and be consistent with it.
- Figure 1: label the x-axis.
- Line 92: remove “g” from “withing” and change “where” to “were”.
- Line 103: “was” should be “were”, and “match” should be “matched”. I also think “radar’s” should be “radars’” (radars followed by apostrophe) since multiple radars seem to be used?
- Line 104: should this be “the borders” instead of “a borders”? And should “radar’s” be “radars’”?
- Line 105: “ares” should be “areas”?
- Line 127: add “the” before “Murrillo”.
- Line 145: change “these were” to “including”.
- Line 154: change “is” to “its”.
- Line 159: add a period at the end of the sentence.
- Lines 171-177: you’ve switched to present tense in this section, while all others seem to be in past tense. Update these lines accordingly.
- Line 172: I think the end of this line needs to be adjusted to something like “allows for correction for…”.
- Figure 6 caption: I’m not sure what the last sentence in the caption is for? I don’t see a dashed ring?
- Line 191: add “the” before “Brook”.
- Line 213: seems like a word such as “so” needs to be included after the comma.
- Lines 221-222: check line formatting.
- Line 249: add “and” before “this”.
- Line 252: add “the” before “VA”.
- Line 275: this sentence should be restructured to something like: “due to the inability of low MESH values to cause any…”. As is written now, I think values would need to be possessive (values’) which is a bit awkward.
- Line 320: second “by” should be changed to “and”.
- Line 325: add “and” following the comma.
- Line 328: add “the” before “ground”.
- Line 331: add “a” before “higher”.
- Line 366: should “mix” be “mixed”? Or “of a mix”?
- Line 369: add “and” before “future”.
- Line 372: change “showing” to “determine”.
Citation: https://doi.org/10.5194/amt-2023-161-RC1
- AC1: 'Reply on RC1', Luis Ackermann, 03 Oct 2023
- RC2: 'Comment on amt-2023-161', Anonymous Referee #2, 29 Aug 2023
Dear authors:
The authors built a Hail Damage Estimate method using radar and ERA5 datasets. Some technical methods were adopted in generating the data samples, such as the filtering of claims and the virtual advection of SHI. These methods make the sample more reliable. However, one question confuses me: neither the bulk advection nor the virtual advection was needed, but the advection algorithm needs the damage dataset as input. Is this a conflict?
Major Comments:
1. In Figure 1, the relative damage distributions of 12 archetypes are shown, but I did not see these archetypes used in the subsequent analysis. Were they used only for normalization? Differences in archetypes between areas may affect the final result.
2. The relative damage percentage needs further explanation: were 0% damage cases included in the damage sample or not? Many zero-damage points are scattered through the subsequent figures, such as Figures 7, 8, 9, 10 and 11.
3. In the claims filtering step, an obvious white ring surrounding the hail area is created after filtering (Figure 2). The discontinuous scatter around zero percent damage (the left panel of Figure 7 and the bottom right panel of Figure 10) needs an explanation: is it caused by the 0% damage samples or by the white ring left after claims filtering? The impact on CSI, POD and FAR should also be discussed.
4. Besides SHI, why not include vertically integrated liquid water, echo top and composite reflectivity in your machine learning model?
5. What is the time interval of the newly developed MESH? Is it daily? Can this method be applied to instantaneous radar data, which I believe would be more meaningful for hail warning?
6. Some composite reflectivity maps should be shown to see whether the claims in the red circle in Figure 11 are spurious.
Minor Comments:
1. Panels should be labeled (a), (b), (c), (d) in multi-panel figures.
2. Figure 6: what is the meaning of ‘The dashed range ring is 150 km in radius’?
Citation: https://doi.org/10.5194/amt-2023-161-RC2
- AC2: 'Reply on RC2', Luis Ackermann, 03 Oct 2023
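Both referee reports refer to SHI (the severe hail index) and MESH. For readers unfamiliar with how the two are linked, the sketch below applies the commonly cited power law from Witt et al. (1998), MESH = 2.54 · SHI^0.5, with MESH in mm and SHI in J m⁻¹ s⁻¹. This relation is an assumption made for illustration; the manuscript under review may use a different fit, and the function name is hypothetical.

```python
import numpy as np

def mesh_from_shi(shi):
    """Convert SHI (J m^-1 s^-1) to MESH (mm) using MESH = 2.54 * SHI**0.5.

    Assumes the Witt et al. (1998) power law; other fits use different
    coefficients. Negative SHI values are clipped to zero before the root.
    """
    shi = np.asarray(shi, dtype=float)
    return 2.54 * np.sqrt(np.clip(shi, 0.0, None))

# Example SHI values chosen only to illustrate the nonlinearity.
for shi_value in (25.0, 100.0, 400.0):
    print(shi_value, float(mesh_from_shi(shi_value)))
```

Because MESH grows with the square root of SHI, quadrupling SHI only doubles the estimated hail size. This is one reason a damage model may benefit from using SHI directly alongside environmental predictors.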