Reply on RC1

The authors have produced good work in comparing inundation models and some discrepancies between them in representing flood inundation on maps for compound coastal flooding. Some revision should occur before publication to improve the manuscript’s clarity and organization. Generally, I agree that Event Maps should be vetted, particularly in cases of compound flooding or the broader case of many inundation mapmakers producing flood estimates with differing data sources for a flood event; however, the three models appearing in this paper have very different purposes that users of the inundation estimates would likely know about—so the paper is more a demonstration that the models produce different flood footprints. Why not use three coastal flood models and include Adcirc or RIFT, among other models designed for the hydrodynamics of compound coastal flooding? If these models are not available for use in this study, please state so. The point is well taken that any one map could be wrong because it omits variables or flood sources, but the authors do not show how the vetting framework proposed would apply to official products, like, for example, the NHC hurricane storm surge inundation graphic. In forecasting, NHC uses numerous models and real-time parameterization decisions in the production of the hurricane storm surge inundation graphic, so it seems there is something to highlight or learn from NHC's model review and communication processes. I’ve provided some more detailed comments below and would be happy to assist the authors for further review to get this paper ready for publication. Again, it’s good work, but some clarifications are needed.

We certainly agree that the National Hurricane Center (NHC) flood inundation products, along with others such as the Coastal Emergency Risks Assessment (CERA) (https://cera.coastalrisk.live/), should be included in the vetting and adjudication process that identifies the most appropriate flood inundation maps for a given time and location on the coast. However, we think that their omission does not detract from the focus of the manuscript. That focus is to evaluate a sample of flood inundation maps that could be used during a flood event, evaluate whether each map has a different spatial composition, determine whether each spatial composition is imperfect, and determine whether the differences and imperfections lead to different estimates of exposure and consequences (i.e., in other words, are the differences impactful). The three flood inundation mapping frameworks we selected could indeed be deployed as flood inundation guidance during a compound flood event. If the user is a subject matter expert, they will likely be able to discern key differences and employ the most correct map. However, if the user is an emergency manager or a member of the public, knowing which map to use can be difficult. This makes the vetting process critical, given that this manuscript provides evidence that each flood map can lead to different conclusions on the spatial pattern and severity of a flood's impacts. An NHC, CERA, RIFT, or other flood inundation product would likely differ from the flood inundation maps we sample, but these differences are not necessary to prove the point of the manuscript, and all such products will still have imperfections. We will ensure that the revised manuscript and its title better define the scope of this research and address the central tenets of the manuscript appropriately. We will also evaluate the NHC's process and incorporate any necessary points into the manuscript.
Line 45: assumes that sources are authoritative and/or unknown, whereas event maps are typically sourced according to authority as in lines 39-43
Response: Good point. We can clarify here that event maps are authoritative but that non-authoritative maps can and do occur during flood events.
Lines 71-76 present a conclusion/appears out of order
Response: We will add a concluding statement to the end of this paragraph for clarification.
Line 88: passive tense
Response: We will change this sentence to the active voice.
Figure 2: graphic introduces use of different DEMs and resolutions, introduces hindcast, introduces multiple unexplained acronyms and data sources (are these public?), national water model uses a land surface (NOAA) whereas auto route uses DEM-are these post processes? Do these inconsistencies in land models relate to the differences in inundation maps and/or accuracies? Figure 2 should be introduced in the proceeding text (lines 115-149).

Response: We use the acronyms in Figure 2 to simplify the graphic. All data are public except the RainVieux precipitation product used in the HEC-RAS modeling framework. We will ensure all acronyms are defined in the preceding text.
We discuss how some of these differences lead to different flood inundation maps (see Section 3.2).
Line 126: inconsistent units (meter vs arc second in figure 2)
Response: We will correct this in Figure 2 and in the text of the manuscript.
Lines 120-149: streamflow data appears to be a consistent variable across the frameworks whereas elevation, roughness coefficients, and bathymetry appear differently-the fathom framework appears well cited/situated in literature but auto route and HEC RAS approaches are less clear

Response: The HEC-RAS approach is a typical methodology used by U.S. Army Corps of Engineers (USACE) Districts to develop flood inundation mapping frameworks for design purposes. The AutoRoute approach is available in the literature (Follum et al., 2017; Follum et al., 2020). We will clarify these aspects in the text.
Lines 150-160: comparison with HWM, here and more broadly, could be problematic given different elevation sources and other spatial/vertical corrections

Response: Any form of evaluation of a flood map will have problematic components because of limited observations and the need to approximate the observed data. For example, comparing simulated and observed flood inundation extents relies on an approximate observed extent, derived either from an elevation source and HWM data or from remotely sensed observations. Though the reviewer is correct in this assertion, there will be no perfect comparison because of gaps in the observation data. We will include this detail in the manuscript.
Line 178-180: awkward phrasing/qualifier to damage estimates; insurance uptake is a separate but interesting issue-does lower uptake relate to poorer damage estimation? Would you have a better estimate of flood damage bounding if uptake rates increased to 100%?
Response: Incomplete insurance uptake does make comparison to reality a challenge in damage estimation because it is difficult to create a 1:1 comparison. However, even with 100% insurance uptake, a direct 1:1 comparison would still be problematic when using the National Structure Inventory (NSI), because NSI attributes such as structure value are not necessarily consistent with those that insurers hold for the policy coverage. In addition, for NFIP claims, the coverage is truncated on the lower end by deductibles (losses are not recorded because no claim is made) and on the upper end by policy caps (losses in excess of the policy may be truncated to the payout rather than the actual loss). Converting point estimates of exposure and damage to a kernel density map does allow us to visualize whether our estimated spatial pattern of exposure and damage matches our approximation of reality (e.g., insurance claim locations), allowing for a relative comparison. When comparing a kernel density map of estimated NSI exposure and consequences, a 100% uptake rate should improve the assessment of whether our estimated spatial pattern of exposure and consequences matches the observed pattern. We will incorporate this discussion into the manuscript.
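As an illustration only, the conversion of point estimates to a kernel density raster could be sketched as follows. This is a minimal sketch, not the workflow used to produce Figure 3: it assumes a quartic kernel (common in GIS tools, but our kernel choice is not stated here), projected coordinates in meters, and hypothetical claim locations and amounts. The function name `kernel_density` and all sample values are illustrative. The 1 km search radius and 1 km cell size follow the kernel density settings described for Figure 3.

```python
import numpy as np

def kernel_density(points_xy, weights, bounds, cell=1000.0, radius=1000.0):
    """Weighted quartic-kernel density on a regular grid (weight per square meter)."""
    xmin, ymin, xmax, ymax = bounds
    # Cell-centre coordinates of the output raster
    xs = np.arange(xmin + cell / 2, xmax, cell)
    ys = np.arange(ymin + cell / 2, ymax, cell)
    gx, gy = np.meshgrid(xs, ys)
    density = np.zeros_like(gx, dtype=float)
    for (px, py), w in zip(points_xy, weights):
        # Squared distance from the point to every cell centre, scaled by the radius
        d2 = ((gx - px) ** 2 + (gy - py) ** 2) / radius ** 2
        inside = d2 < 1.0
        # Quartic kernel, normalized so each point integrates to its weight
        density[inside] += w * 3.0 / (np.pi * radius ** 2) * (1.0 - d2[inside]) ** 2
    return density

# Hypothetical example: one claim of weight 2.0 at the centre of a 5 km x 5 km area
claims_xy = [(2500.0, 2500.0)]
claim_weights = [2.0]
density = kernel_density(claims_xy, claim_weights, bounds=(0.0, 0.0, 5000.0, 5000.0))
```

Running the same function on the estimated NSI damages and on the claim locations would give two rasters on a common grid that can be compared visually, which is the relative comparison described above.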
Figure 3: kernel typo; why is depreciation mentioned (seems more a benefit-cost assessment than a damage assessment)? What is the resolution of the kernel density analysis? Does that represent up/downscaling?

Response: Depreciated or economic values are what is present in the NSI while claims represent only replacement costs (not depreciated replacement costs).
We wanted to ensure that the use of depreciated values is clear to the reader in Figure 3. The kernel density analysis uses a 1 km search radius and outputs a 1 km resolution raster. The kernel density analysis provides a visual means to compare our exposure and consequence estimates to FEMA insurance claims, as a direct 1:1 comparison is not possible. We will include these details in the revised manuscript.
: are there HWM from Harris County Flood Control District that might supplement the analysis? I understand HCFCD to have collected this data, though it may not be publicly available. The qualitative descriptors for USGS HWM typically refer to whether the HWM itself is recognizable and is not necessarily a description of elevation differences or potential height uncertainty. Further, sources of flooding leaving the HWM is typically noted in USGS data, so it would be appropriate to describe flood sources (are all the HWM coastal, riverine, or compound? Any ponding or other disconnected flooding? Similar comments for Section 3.2).
Response: The inclusion of additional HWM data will not offer additional data to enhance the conclusions of this manuscript given that the primary focus of the study was not to analyze the performance of each flood inundation mapping framework. The focus of analyzing HWM data in this manuscript was to describe how the flood inundation maps are different and imperfect. We think that the current analysis sufficiently proves this point.
The USGS includes a qualitative and a quantitative uncertainty measure based upon the type of mark, the material the mark is upon, and the environment that creates the mark (Koenig et al., 2016). The quantitative height uncertainty attribute is present within the USGS HWM data, and the USGS associates that uncertainty with the qualitative quality rating of the data.
We can add the flooding source (coastal, riverine, or compound) to the summary of the HWM data in this section of the manuscript. We will attempt to discern whether evidence of pluvial flooding occurs in the test region.
Line 198: intersecting rather than capturing (word choice)

Response: This change will be made in the manuscript.
Lines 199-200: from where was this assumption previously stated? Please state hypotheses in the introduction or methods sections. way to state this: "the user should understand the parameterizations made by the modeler." Choosing to omit or include certain parameterizations is the key message here and relates to the discussion in Section 3.3 at line 326-the user, and reader for that matter, needs the parameterizations. A user can make assumptions that one model is "better" than another based on same reported aspect of accuracy; however, the modeler's role is to express what is and what is not accounted for in an analysis-like, as lines 342-343 reflect, *not intending to represent flood inundation* because data is insufficient, erroneous, or non-existent. The modeler chose to not parameterize inundation for Armand Bayou in the HEC RAS analysis; therefore, the HEC RAS analysis should not be compared to the other models because it is incomplete, reflected in the statement at lines 343-344.

Response: Very good suggestion. This change will be made in the manuscript.
Lines 278-280: this appears to be explaining conditions very specific to HEC RAS, whereas the comment is directed to users of Event Maps-what other explanations of very specific modeling parameters or assumptions can be made more generally to apply to each of the models?
Response: This is a question the authors will address in Section 3.4 of the manuscript. In general, the composition of the frameworks (e.g., Figure 2) should be presented in the metadata of each flood inundation map to assist with the vetting process. The metadata should also include a general descriptive narrative in which the modeler, using best professional judgement, conveys the specifics they think are appropriate for the user of the flood inundation map.
Line 283-284: Why was AutoRoute chosen to model this explicitly compound flood event? Why not use a combination of other models to consider coastal vs riverine vs pluvial vs compound events?
Response: AutoRoute was chosen as an example of the terrain-filling flood inundation mapping frameworks, such as the Height Above Nearest Drainage (HAND) methodology. The intent of the manuscript was to gather a subset of various flood inundation mapping frameworks, estimate flood inundation extents with each, and assess whether they are different and whether those differences are consequential. Conventional wisdom would state that simplified riverine-only flood inundation maps are inaccurate along the coast. However, they are still deployable, and we intended to evaluate whether the inaccuracies of the terrain-filling, riverine-only flood inundation maps would be consequential to our exposure and consequence estimates.
Lines 300-315: This section is unclear, particularly lines 311-312 which appears to set out the overall differences in dollar/damage exposure: greater water depths should have greater expected losses, per the damage functions used in the study; however, little attention is given to water depths across the three modeled flood inundations. There is also no comparison of modeled depths to HWM depths. Line 313 suggests a bias in the AutoRoute model whereas greater depths may simply be a feature of the model or given its configuration for this analysis (e.g., it doesn't do coastal, so WSE will be higher given upstream/inland ground elevations and thus a potential for greater depths or depth errors from DEM).

Response: Our apologies for the confusion!
Lines 298-305 summarize a general comparison of exposure and consequences from all three flood inundation mapping frameworks. These differences clearly demonstrate that the different spatial composition of the flood inundation maps leads to quantified differences in the exposure and consequences.
Lines 305-314 examine why AutoRoute inundates 6,279 structures and estimates $0.9 billion in damages, while HEC-RAS inundates 19,281 structures and estimates $0.7 billion in damages (from Table 3). The relationship found is that when HEC-RAS and AutoRoute inundate the same buildings, AutoRoute estimates $0.3 billion more in damages than HEC-RAS. The only explanation for this difference in damage is a higher depth, as go-consequences uses the same location and depth-damage function for these buildings. If we then look at structures where only HEC-RAS estimated damage, the sum total is $0.5 billion and the average water depth is 1.1 meters. Likewise, for structures where only AutoRoute estimates inundation and damage, the sum total is $0.3 billion and the average water depth is 3.8 meters. Thus, AutoRoute estimates more damage than HEC-RAS because of a tendency to estimate a higher water depth.
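The partition behind this comparison (structures inundated by both frameworks versus by only one) can be sketched as below. This is a hedged illustration with hypothetical structure IDs, depths, and damages; it is not go-consequences output, and the function name `compare_frameworks` is illustrative only.

```python
# Hypothetical per-structure results: structure id -> (depth in m, damage in $)
hecras = {"a": (0.5, 10_000), "b": (1.2, 40_000), "c": (0.8, 20_000)}
autoroute = {"b": (3.0, 90_000), "c": (2.5, 60_000), "d": (4.0, 120_000)}

def compare_frameworks(model_a, model_b):
    """Split structures into shared / only-A / only-B and summarize each group."""
    shared = model_a.keys() & model_b.keys()
    only_a = model_a.keys() - shared
    only_b = model_b.keys() - shared

    def total_damage(model, ids):
        return sum(model[i][1] for i in ids)

    def mean_depth(model, ids):
        return sum(model[i][0] for i in ids) / len(ids) if ids else float("nan")

    return {
        # Extra damage estimated by B on structures both frameworks inundate
        "shared_damage_diff": total_damage(model_b, shared) - total_damage(model_a, shared),
        "only_a_damage": total_damage(model_a, only_a),
        "only_a_mean_depth": mean_depth(model_a, only_a),
        "only_b_damage": total_damage(model_b, only_b),
        "only_b_mean_depth": mean_depth(model_b, only_b),
    }

summary = compare_frameworks(hecras, autoroute)
```

Because the location and depth-damage function are identical for shared structures, a positive `shared_damage_diff` in such a comparison can only come from higher estimated depths.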
We will clarify this section in the manuscript.
Section 3.3: recommend not using the term "impact(s)" without discussing or getting into vulnerability assessment; recommend sticking to "exposure" to reduce confusion about the assessment
Response: We think the reviewer is referencing lines 322-350, as no references to impact(s) are made in Section 3.3. We will amend the manuscript here, based upon the reviewer's suggestions.
Line 356: what is the "quantitative pattern" referenced here? Spatial pattern? Depths? Differences in elevations?
Response: The proportion of insurance claims within each flood inundation map mirrors the proportion of HWMs within each flood inundation map. We will add this clarification to the manuscript.
Line 360: Please clarify-is this the correct use of "deterministic" in this statement? It seems that the implication or operative term is single event, not single source. Merwade et al 2008 presents a method to display a single, deterministic (i.e. static) inundation map with possible spatial errors-that is, a flood inundation map that includes visualization of quantifiable uncertainties affecting the spatial extent of estimated flooding. Applying Merwade et al 2008 here infers that the maps produced by the 3 evaluated models each do not account for uncertainties that may include sources of flooding, different DEM resolutions and vertical errors, different roughness coefficients, etc. The follow-on reference to the national hurricane center interactions with stakeholders (NOAA 2013) does not refer to stakeholders favoring probabilistic storm surge maps and appears to conflate the approach offered in Merwade et al 2008 (that is, cartographic representation of uncertainty versus numerical or forecast uncertainties). The report states that stakeholders found the map colors and water depth classifications useful and easy to understand; however, the report details hazard-specific probabilistic maps (wind, storm location uncertainty, arrival timing of wind speeds-standard NHC advisory products) but not probabilistic storm surge inundation maps.
Response: Excellent correction here! We intended to convey that a single deterministic modeling chain leading to a flood inundation map will inherently possess imperfections and limitations, as Table 1 shows, and that multiple modeling chains leading to multiple flood inundation maps may better convey risk. We will refine this section and the associated references to improve our discussion of the point we want to make.
Lines 369-374: Comparison of NFIP claims to NSI valuation is problematic and likely underestimates damages.
Response: We've attempted to partially rectify this problematic comparison by scaling the NFIP claim totals with an approximation of insurance rates in the region from Shao et al. (2017). However, we are comparing financial losses (NFIP) to economic/depreciated losses (NSI). We will remove this comparison, given the problematic nature of comparing these datasets.
Section 3.3 offers an interesting solution to a complex problem in producing and applying flood inundation maps in emergency management situations. It would be interesting to delve further into the reasons that inundation mapping is not a primary function of the federal agencies partnered in IWRSS or that any one entity does not produce an authoritative map, like NHC does for hurricane storm surges. (Are the authors implying that the NHC storm surge map should also be refereed?) However, this seems somewhat beyond the scope or intent of this paper unfortunately-but one can't help but wonder what the reasons are for NWS or USGS or USACE not producing real-time, publicly accessible inundation maps beyond technical limitations. Is there a statutory reason for not producing inundation maps in real-time? Budgetary or staffing shortages? Clearly these data and maps can be made, and many in near-real-time, so is adjudication the right solution over, say, accounting for mapping uncertainties cartographically and explaining the use cases for the maps and data?
Response: Our use of the term Event Map appears to have confused the reviewer. We will change our references to Event Map to reduce confusion. Each agency in IWRSS has its own means of producing flood inundation maps for emergency situations and distributing them in real time (e.g., lines 376-381, along with our three examples). The issue has historically been that each agency produced its respective flood inundation map without fully coordinating with the other agencies. To further complicate things, non-IWRSS flood inundation maps can also be created during a flood event. The Fathom-US maps are one example of a non-IWRSS flood inundation map, and there are likely a number of others that are local and regional. With the number of flood inundation maps that may be available for a given flood event, a way of consolidating, adjudicating, and promoting the appropriate flood map for a given location seems to be the most logical first step. That is what the integrated Flood Inundation Mapping (iFIM) effort intends to do (Mason et al., 2020). The idea is that if, for instance, a U.S. Geological Survey flood inundation map is most appropriate for a given time and location, that map will be promoted by all of IWRSS to emergency managers and the public. The result is the authoritative, consolidated Event Map.