Reply on RC2

The data have some issues, which are explicitly acknowledged and discussed in the text (Section 2.2.3 and Appendix B). We also respectfully disagree on the claim that the study catchment is so unusual. Most of the headwater catchments in the Dolomitic region, and in general in the southern side of the Alps, share similar features in terms of substrate heterogeneity and possible presence of internal karst regions. Nevertheless, the revised version of the paper will better emphasize the specific geological features of the study catchment, the duration of the study period, and the impact of these factors on the main results of the paper (see below).

1 -I am sure that these field data were hard-won, but they span only two months (or maybe only one month -Figures 6 and B2 both refer to "the study period" but one is only about half as long as the other…?), and include only a small handful of precipitation events.
Thanks for noting that. The study period actually spans two months, as shown in the plots of Figure 5 and explicitly stated in the Methods section. Figure B2 shows only the period during which the data of sensor S 39 were corrected; we will modify the figure caption accordingly.
It is hard to draw robust conclusions from such limited evidence. The study by Jensen et al. (2019, cited in the references) provides an illustrative contrast, with a much more extensive set of observations, and thus more robust inferences, drawn from a similar number of sensors along a similarly sized channel network (but a longer study period with more precipitation events). I will leave it to the editors to decide whether HESS wants to publish such a limited data set -speaking for myself I would have waited for a more comprehensive picture to emerge.
The length of the study period is constrained by the characteristics of the study catchment, which is a small, high-relief catchment located in the Alps. The basin is snow-covered from late November until early summer (June). This constrains the time window that can be used to take field measurements and study the network dynamics (potentially to 4-5 months) and renews the underlying network dynamics every winter. Moreover, the sensors' deployment is highly time-consuming, and the setup of all the sensors requires at least some months. The 2019 fall season was particularly short, as the snow season came earlier (early November instead of early December). At any rate, it is important to stress that, though relatively short, the study period covers a wide range of climatic conditions and network configurations, as discussed below. A detailed statistical analysis was performed to assess the representativeness of the climatic conditions observed during the period of record. Figure 1 of the supplementary material compares the distribution of the daily rainfall depths (h) observed during the reference time window (September and October 2019) with that observed in the long term (2010-2020) during the whole period within which the network is dynamical (from July 1 to November 30). The comparison shows that the frequency distribution of the rain depth for our study period matches the corresponding long-term distribution quite well, which suggests that the rain regime during the analyzed period is in line with that driving the observed longer-term network dynamics. Thus, we do not see any particular reason why the results of the paper should be strongly impaired by the specific duration of the reference time window. This is especially true if one considers the diversity of the hydrological conditions observed during the period of record.
More specifically, the main issues associated with the use of water presence sensors highlighted in the text would not be changed by the use of more data, and the nature of the relationship between the mean persistency of the nodes and the statistical properties of the ER signal is unlikely to be significantly modified either. Likewise, the procedure identified for reconstructing the high-frequency space-time dynamics of the stream network using ER data would be the same even if more data were considered. The hysteresis in the Q vs. L relationship would also remain, as it is observed both within individual events and across all the events. Of course, it would have been great to have a longer study period. Based on the duration of the snow season and the time needed for the sensors' deployment, we argue that the maximum possible duration of the period of record in an Alpine field site of this type could hardly exceed 3 months, after which the winter would freeze the catchment and renew the underlying network dynamics. However, repeating this experiment to get 3 months of data instead of 2 is practically unfeasible: it would imply a huge experimental effort (e.g. another deployment of all the sensors) and would eventually lead to results that are unlikely to be significantly different from those obtained in this study. All these arguments will be clarified in the revised version of the paper. In any case, we do not see significant discrepancies between our results and those of Jensen et al. (2019): in both cases there is good consistency between ER data and the visual observations of the network, expansions/contractions of the network occur by growth of disconnections within the streams, and different types of hysteresis between L and Q are observed.
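The comparison between the rainfall distribution of the study period and the long-term one, described above, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rainfall series are synthetic placeholders (not the Valfredda record), the helper names are hypothetical, and the maximum gap between the two empirical cumulative distributions is just one possible summary metric, not necessarily the one behind Figure 1 of the supplement.

```python
import numpy as np

def ecdf_distance(sample_a, sample_b):
    """Maximum absolute gap between the empirical CDFs of two samples."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])  # evaluate both ECDFs at every observed value
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Synthetic placeholder data (NOT the observed rainfall record).
rng = np.random.default_rng(42)

def synthetic_daily_rain(n_days, p_wet=0.4, mean_depth_mm=8.0):
    """Hypothetical daily rainfall: dry days are 0, wet days are exponential."""
    wet = rng.random(n_days) < p_wet
    return np.where(wet, rng.exponential(mean_depth_mm, n_days), 0.0)

long_term = synthetic_daily_rain(11 * 153)  # 2010-2020, July 1 to November 30
study = synthetic_daily_rain(61)            # September and October

d = ecdf_distance(study, long_term)
print(f"max ECDF distance between study period and long term: {d:.3f}")
```

A small distance indicates that the two frequency distributions are close; how small is small enough depends on the sample sizes and is left open here.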

2 -The resulting uncertainties are very large (see Figure 9), but this is not adequately accounted for in the presentation. The text (line 327) says that b varies by about 1% as the temporal resolution changes, but given that the uncertainty in b can be over 10%, it is actually unknown how stable b really is (or isn't). The text (line 331) even argues for a systematic increase in R^2 from 0.485 to 0.522, even though the uncertainty in R^2 can be over 20%, making this "systematic" increase statistically meaningless.
We think the referee is highlighting the fact that the estimate of b and the agreement of the power-law model with the data can depend on the specific dates of the surveys. As for the exponent b, we believe that the pattern shown by the mean scaling exponent is quite interesting, and that variations of the order of 10% can be considered acceptable in the light of the simplicity of the power-law model and the presence of possible measurement errors. We acknowledge, however, that the pattern of R^2 is less meaningful owing to the large standard deviation, as suggested by the referee. In the revised version of the paper, we will discuss more explicitly the variability of b and R^2 across the different sub-samples of the data. Thanks for noting the issue.
Even these very large uncertainties may be underestimates, because the underlying data are serially correlated, meaning that (for example) few of the points in Figure 8 are statistically independent of one another. From the methods it is unclear whether this has been taken into account, as it should be.
We are puzzled by this comment. Serial correlation is unavoidable in all high-frequency joint data sets for L and Q. Here we are reproducing a fictitious sampling campaign with sporadic measurements of active length and discharge, assuming that a standard power law is fitted to the sampled data, as typically done in previous studies where the relationship between Q and L was analyzed. The standard deviation shown in this figure is a measure of the diversity of the model parameters as a function of the sampling dates, not a standard model uncertainty associated with a given data set (which would rely on some likelihood function, possibly with correlated residuals). This standard deviation simply reflects the heterogeneity of the power-law fits obtained from a set of different sub-samples of the same data. In the revised paper we will discuss more explicitly how this analysis should be interpreted, to avoid misunderstandings of this kind. Thanks for bringing this issue to our attention.
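The interpretation given above can be made concrete with a minimal sketch of such a fictitious sampling campaign. It rests on assumptions introduced only for illustration: the Q and L series are synthetic and serially correlated, the helper names are hypothetical, each campaign takes one sample every `step` time steps, and campaigns differ only in their starting offset (that is, in the sampling dates). The actual sub-sampling scheme used in the paper may differ.

```python
import numpy as np

def fit_power_law(q, l):
    """Fit L = a * Q**b by linear regression in log-log space; return (a, b, R^2)."""
    x, y = np.log(q), np.log(l)
    b, log_a = np.polyfit(x, y, 1)
    resid = y - (log_a + b * x)
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)
    return np.exp(log_a), b, r2

def subsample_spread(q, l, step):
    """Mean and standard deviation of b and R^2 over all campaigns of interval `step`."""
    fits = np.array([fit_power_law(q[k::step], l[k::step])[1:] for k in range(step)])
    b, r2 = fits[:, 0], fits[:, 1]
    return b.mean(), b.std(), r2.mean(), r2.std()

# Synthetic, serially correlated series (hypothetical values, not field data).
rng = np.random.default_rng(0)
n, a_true, b_true = 1440, 100.0, 0.3
log_q = np.empty(n)
log_q[0] = 0.0
for t in range(1, n):  # AR(1) process in log space
    log_q[t] = 0.95 * log_q[t - 1] + rng.normal(0.0, 0.1)
q = 130.0 * np.exp(log_q)                               # discharge, l/s
l = a_true * q**b_true * np.exp(rng.normal(0.0, 0.1, n))  # active length, m

b_mean, b_std, r2_mean, r2_std = subsample_spread(q, l, step=24)
print(f"b = {b_mean:.3f} +/- {b_std:.3f}, R^2 = {r2_mean:.3f} +/- {r2_std:.3f}")
```

The printed spread describes how much the fitted parameters change with the choice of sampling dates; it is not a confidence interval for a single fit.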
The last main conclusion of the paper is that (lines 411ff): "The mean value of the exponent of the power law relationship between catchment discharge and total active length was found to be almost independent on the frequency of the observational data, which instead had a larger impact on the goodness of fit of the power-law model. When the frequency of the data is lower, the observed values of R^2 are, on average, larger…" In view of the vast uncertainties in Fig. 9, these conclusions are reckless. Within the uncertainties, either of these trends could be strongly increasing, strongly decreasing, or zero. There is simply no robust conclusion that can be drawn from the data.
Thanks for the comment. Given that the R^2 values are not particularly high and the available samples of Q and L are relatively small, we deem a range of variability of b of about 10% to be acceptable. Therefore, we maintain the point about the relative stability of b for different values of T. On the other hand, we recognize that the observed pattern in R^2 might not be particularly meaningful, in view of the standard deviation of the estimate shown in Figure 8. This will be emphasized more clearly in the revised version of the paper.
3 -The limitations of the study site are severe, particularly for analyses of network dynamics.
We respectfully disagree on this claim, for the reasons detailed in this rebuttal.
The basic problem is that roughly 80% of the basin seems to have no surficial drainage network at all, consisting instead of talus slopes and moraines. The critical issue here -which is not acknowledged anywhere in the paper -is that this ~80% of the basin is still generating discharge (at least some of which is presumably measured at the outlet), but the accompanying network dynamics are invisible because they are occurring beneath piles of rock debris. Outside of the mapped network there appears to be roughly two square kilometers of drainage area with no surficial drainage at all.
The comment is certainly relevant -and we are grateful to the referee for the insightful input -but the referee's estimate of the percentage of catchment area drained by the existing channel network is a large underestimate. In fact, in the upper Valfredda catchment 1.7 km^2 out of 2.6 km^2 (total catchment area) drains directly through the hydrographic network (65%), according to the analysis of the drainage flow paths performed with a high-resolution DTM; thus, 35% of the catchment has no surface drainage network, not 80% as guessed by the referee.
At best, that means that any observations here cannot be compared with the rest of the network dynamics literature, in which the discharge from the whole basin is compared with the flowing stream network across the entire basin. Thus, for example, there is no way to compare Figure 8 with similar diagrams from other studies, because in this case most of the discharge appears to be generated by subsurface flow that is presumably strongly damped and lagged, suppressing the variability in Q (this may account for the sharp vertical lines in Figure 8, for example).
Let us note that the drainage density in the upper Valfredda is comparable with that of other study catchments used to study network dynamics (e.g. South Fork of Potts Creek, Fernow, Turbolo). Nevertheless, we recognize that the presence of a karst area which originates a localized spring that releases a quite constant discharge of about 40 l/s (roughly 30% of the mean discharge) and feeds a perennial stream needs to be taken into account when analyzing the relationship between active length and discharge. This will be done in the revised version of the paper, as explained below.
The manuscript doesn't confront (or even disclose) this problem anywhere, which is surprising given the abstract's mention of "the diversity of the hydrological behaviour of the study catchment" -by which the paper seems to mean only the two small drainage networks that were studied, not the other roughly three-fourths of the catchment.
Thanks for the comment. We will add a detailed description of the impact of the portion of the catchment without a drainage network on the main results of the paper (see below). Let us note that the fraction guessed by the referee is a large overestimate (three-fourths should read one third).
It is virtually a truism in catchment studies that each site has its own idiosyncrasies, but here this particular "uniqueness of place" makes a network dynamics study particularly difficult. Why study network dynamics in a catchment where the great majority of the drainage area has no network at all? Such a site makes it particularly difficult to draw any mechanistic inferences from the observed network behavior.
The Valfredda catchment is one of the study sites of the ERC project Dynet, because it is quite representative of the headwater catchments in the Dolomitic region. It was used in this study because of the heterogeneity of the substrate and the availability of long-term data on rainfall, discharge, water quality and persistency of the nodes of the network, which underlie several results presented in the paper (Figures 3 and 4). We respectfully disagree with the claim that mechanistic inferences can hardly be made here, as shown by existing literature such as the study by Durighetto et al. (2020). In the cited paper, three different models were developed and statistically validated to describe the dynamics of the active drainage network length (ADNL), starting from the wet length values measured during nine field surveys, the atmospheric forcing and the geologic characteristics of the catchment. Both the empirical data and the model results showed that the antecedent precipitation influences the measured/modelled dynamics of the stream network as much as the geologic features of the study catchment do. All these arguments about the choice of the site will be further stressed in the revised text.
The manuscript says (lines 367ff): "Network length was found to be more sensitive than discharge to small precipitation inputs: while most rain events induced visible changes in the active channel length, the catchment stream flow was sensitive only to the rain events lasting for several consecutive days (6-9/09, 13-18/10, 20-24/10) and to intense storms (more than 20-30 mm in 9-12 hours)." This is exactly the behavior that one would expect from a field site like this one, with most of the discharge being generated by relatively slow subsurface flow paths over ~80 percent of the catchment, but with network lengths being measured on the very few surface drainages in the remaining small fraction of the catchment.
Thanks for the comment, which gives us the opportunity to clarify the issue. First of all, in the upper Valfredda catchment analyzed in this paper the karst area does not contribute the large majority of the discharge but only approximately 30% of it. In fact, the water that infiltrates in the region where terrain depressions and karst substrates dominate mostly exfiltrates through a permanent spring in the northern portion of the network, generating a constant discharge contribution of about 40 l/s (while the average observed discharge exceeds 130 l/s). The large majority of the total discharge is instead released through the dynamical drainage network mapped in this study. Moreover, we contest that the presence of a permanent water source (Q_p) which feeds a perennial stream of length L_p is able to produce the hysteresis in the L vs. Q relationship shown by our data. Within a single intense rain event, the hysteresis arises because the active length increases faster than Q in the early stages of the event, while it decreases much more slowly than Q in the recession. Across different events, instead, the hysteresis is generated by shifts in the response of Q and L to different types of rain events. In particular, when rain events are moderate the channel network activates but the amount of water conveyed to the outlet is limited, owing to disconnections and limited flow velocities; conversely, when rain events are more intense the same active length contributes a much larger discharge to the outlet. In any case, we think that the point made by the referee about the role of a permanent water source in the analysis of L vs. Q relationships is quite interesting. To further investigate the impact of a perennial channel with a constant wet length (say, L_p) supplied by a permanent constant discharge Q_p on the underlying L vs. Q relationship, we have performed a numerical simulation.
Therein, we assumed that the total active length is the sum of a constant active length L_p and a dynamical stream network length L_d(t), while the total discharge was assumed to be the sum of a dynamical discharge Q_d(t) and a constant discharge Q_p. The dynamical length was then assumed to be linked to the dynamical discharge through an exact power-law relationship of the type L_d(t) = a Q_d(t)^b. In our exercise, we studied the relationship between the total length L_T(t) = L_d(t) + L_p and the total discharge Q_T(t) = Q_d(t) + Q_p when different values of Q_p and L_p are used. In particular, we focus on the exponent of the power-law model fitted to the overall values of Q_T and L_T (Figures 2 to 4 of the supplementary material of this rebuttal), which was compared to the b value of the dynamical power-law relationship linking L_d(t) and Q_d(t). The plots shown in Figures 2 to 4 of this rebuttal indicate that the variations of the apparent b induced by the presence of a permanent discharge and a permanent active length are in the range of 10-20%. Such variations are particularly low when Q_p and L_p follow the proportion of the power-law model holding for the dynamical fractions of Q and L, as observed in our dataset, where Q_p and L_p were estimated from field measurements. In all cases, however, the R^2 does not change considerably, indicating that the goodness of fit of the power-law model is not impacted by the presence of positive values of Q_p and L_p. All these arguments indicate that the presence of the karst area in the northern part of the catchment does not significantly affect the fitting of the power-law model or the extent of the observed hysteresis in the L vs. Q relationship. This also applies to the time series of L and Q studied in the manuscript.
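The structure of the numerical exercise described above can be sketched in a few lines. This is only an illustration under stated assumptions: the dynamical discharge is drawn from a synthetic lognormal distribution, the values of a, b, Q_p and L_p are hypothetical placeholders, the fit is done by linear regression in log-log space, and the function names are not taken from the paper. It does not reproduce Figures 2 to 4 of the supplement.

```python
import numpy as np

def fit_power_law(q, l):
    """Fit L = a * Q**b by linear regression in log-log space; return (a, b, R^2)."""
    x, y = np.log(q), np.log(l)
    b, log_a = np.polyfit(x, y, 1)
    resid = y - (log_a + b * x)
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)
    return np.exp(log_a), b, r2

def apparent_fit(q_d, a, b, q_p, l_p):
    """Fit the power law to Q_T = Q_d + Q_p and L_T = L_d + L_p, with L_d = a * Q_d**b."""
    l_d = a * q_d**b  # exact power law for the dynamical part
    return fit_power_law(q_d + q_p, l_d + l_p)

# Hypothetical inputs (placeholders, not the Valfredda estimates).
rng = np.random.default_rng(1)
q_d = rng.lognormal(mean=np.log(90.0), sigma=0.6, size=2000)  # dynamical discharge, l/s
a_true, b_true = 100.0, 0.5

for q_p, l_p in [(0.0, 0.0), (40.0, 300.0), (40.0, 600.0), (80.0, 600.0)]:
    _, b_app, r2 = apparent_fit(q_d, a_true, b_true, q_p, l_p)
    change = 100.0 * (b_app - b_true) / b_true
    print(f"Q_p={q_p:5.1f} l/s, L_p={l_p:5.0f} m -> apparent b={b_app:.3f} "
          f"({change:+.1f}% vs dynamical b), R^2={r2:.3f}")
```

With Q_p = 0 and L_p = 0 the fit recovers the dynamical exponent exactly, which serves as a sanity check; the other rows show how the apparent exponent and R^2 respond to the permanent contributions for the chosen placeholder values.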
All these arguments will be included in the revised text, in which we plan to isolate the impact of Q_p and L_p on the fitting and the scaling exponent and we will be able to quantitatively demonstrate that the presence of a constant discharge feeding a perennial channel does not change the goodness of fit of the power-law model and has a moderate impact on the exponent b. We thank the referee for raising this important point.
The manuscript continues (lines 375ff): "In our case study, the standard deviation of the wet length as derived from the sensors' data is 360 m, while the standard deviation of L predicted by the power-law model based on the observed variability of the discharges is only 224 m (about 40% lower). This underestimation is induced by the poor ability of the power law model to capture the observed network dynamics produced by small precipitation inputs." It would rather seem that the problem is that *no* model could possibly capture the relationship between the network dynamics in a small part of the catchment, and the discharge generated by completely different mechanisms in the great majority of the catchment.
As pointed out above, the discharge generated by the unchanneled part of the catchment is about 40 l/s, while the mean discharge is more than three times larger (130 l/s). The reasons behind the observed hysteresis in the L vs Q relationship are described above, and pertain to differences in the velocity and connectivity of the dynamical streams across different events --or asymmetries in the rate of change of L and Q within individual events. At any rate, we will include an in-depth analysis of how the presence of a permanent spring which feeds a perennial stream impacts the discharge vs. active length relationship, as indicated above.
All of the conclusions concerning the relationship between stream length and discharge (essentially everything after line 10 in the abstract and after line 406 in the conclusions) are based on very thin data from a catchment in which discharge mostly comes from subsurface flow through rock debris (with the result that changes in network length in the small fraction of the catchment with surface drainage are unsurprisingly not clearly related to the discharge, which mostly comes from the rest of the catchment).
As pointed out above, the permanent spring through rock debris contributes about 40 l/s, while the average discharge during the study period is 130 l/s. See also our previous responses on the same point.
Thus all of those conclusions are based on very thin data that does not allow straightforward interpretation even in this study catchment, and cannot be extrapolated to the great majority of catchments that lack this particularly exotic geometry.
We believe that our conclusions could be of interest for the HESS readers, and not necessarily restricted to the Valfredda site. In fact, hysteresis in the Q vs. ) though the underlying reasons were not discussed in-depth. However, the potential impact of the specific features of the study catchment on the results will be better emphasized in the revised text, as indicated above.
If those conclusions are excluded -as they really should be, given their weak empirical support and their inherently problematic interpretation (network lengths are not measured in the part of the catchment that generates most of the discharge) -then we have essentially a technical note outlining a new way to deploy conductivity sensors, and conveying some lessons learned from a first deployment of these sensors. That would seem to be a more appropriate way to go, rather than trying to draw strong conclusions about length/discharge relationships from such limited data and such a problematic study site.
The potential role of the specific features of the study catchment will be better emphasized in the revised text, as per the referee's suggestions. As for the suggestion to transform this paper into a technical note, we cannot agree. There is a strong length constraint for HESS technical notes (a few pages) that makes this choice unsuited to this rather long manuscript. In particular, we do believe that the analysis of the joint response of Q and L to the observed precipitation forcing (Figures 7 and 8) should not be removed, because it is an important outcome of our field campaign. Moreover, both the reconstruction of the spatial and temporal dynamics of the active stream network (Figure 5) and the analysis of the relationship between the statistical properties of the ER signal and the persistencies (Figure 4) represent important outcomes of the study, which complement our experimental data set and the discussion of the advantages and disadvantages of the sensors' deployment. Incidentally, the text quite clearly acknowledges the inherently specific nature of the findings presented in the manuscript (e.g. "our results indicate that in some cases the use of a bijective L vs. Q function to infer active length changes in catchments where discharge time series are available, might lead to significant underestimation of the actual variations of the flowing channel network. In our case study, the standard deviation of the wet length as derived from the sensors' data is 360 m, while the standard deviation of L predicted by the power-law model is [...] 40% lower)."). Nevertheless, in the revised text we will try to emphasize even more that these results might not necessarily apply to all the catchments in the world, and that further investigations are needed to determine the generalizability of the behaviour observed in the Valfredda in the light of the uniqueness of the place.
As noted by at least another reviewer, the language would also need work (e.g., "customized" rather than "personalized"), but any revised manuscript is likely to be substantially different so I have not marked those issues in this go-around.
Thanks for the comment. The text will be polished and revised (and "customized" will replace "personalized").