Reply on RC2

In this work, Noll et al. examine controls on protein depolymerization rates, a known key step in the production of LMWON compounds that can be used by microbes and (sometimes) plants. They find that substrate availability is a key control on depolymerization, and in turn they identify soil pH, MAP, and Al/Fe oxyhydroxides as key controls on substrate availability. The study is a wide-ranging longitudinal soil survey across Europe, from the Mediterranean to the Barents Sea. They find that land use has a negligible effect on substrate availability and depolymerization rates, somewhat surprisingly. They reached these conclusions through a combination of ANOVA/linear regression approaches and structural equation models. The observational breadth and depth of this study are quite impressive. Forty-three sites across Europe were sampled, and exhaustive chemical and biological analyses were performed on the soils. The key measurements are well-supported from a theoretical standpoint. This seems like a monumental effort that was well-planned and carefully executed.

The manuscript definitely needs some honing to make the central story stand out more, though. I also encourage the authors to rethink their statistical approaches; I'm not asking them to redo all of their analyses, but I think a shift in emphasis toward highlighting the analyses that deal with the highly correlated nature of the predictors is warranted. I honestly really struggled through reading this paper. There are SO many measurements taken, and the results are presented in such exhaustive detail, that I found myself losing the thread often and wondering why data were being presented / what the main thrust of the argument was. I very strongly encourage the authors to revisit all of the topic sentences for each paragraph and make sure that the conclusions that should be drawn from a paragraph are clearly stated up-front. I also encourage the authors to think very carefully about what data are actually central to the story, and to shunt a lot of their other results to the supplement. Regarding the analyses, the authors are dealing with a ton of very highly correlated predictor variables, an issue which they recognize. They lead their results and discussion, though, with exhaustive treatment of single-variable ANOVAs (over sixty ANOVAs) which do not do justice to this rich but highly correlated predictor dataset. The authors have a very nice conceptual model (Fig 1) that is very nicely examined through an SEM. I think that this should be the centerpiece of the story! Some ordination approaches also would make more sense to me in terms of understanding the highly correlated nature of the data, rather than picking apart tables of bivariate correlation coefficients.
The paper also needs to be brought into compliance with EGU's data policy. This is a nice body of work, and some careful editing will go a long way to making the story in this paper shine. (I'm also sorry the authors have waited 5 months and had many declined review requests. Frustrating!)

Best wishes,
Richard Marinos, U @ Buffalo

Dear Richard,

Thank you for your very positive response.
We highly appreciate your thorough review of the manuscript and your critical comments. We agree that the complexity of the data set and the high number of measurements and analyses make it difficult to stay focused on the central theme. We fully understand this criticism and will thoroughly revisit our manuscript, better highlighting the central story, building it around the SEM in the Discussion section, moving more side data from the Results to the Supplement, and trying to further reduce the data set using multivariate approaches. In the revised version of the manuscript we will shift even more of the "primary data" into the Supplement, though we need to mention that all single-parameter analyses not central to the story were already in the Supplement in the first version. In any case, the Results section will be streamlined and important side results will be put in a Supplementary Results section. Moreover, in all paragraphs we will take care to put the "take-home message" up front and then explain our reasoning. Certainly some parameters are highly correlated (e.g. positively: soil microbial biomass C and N; negatively: sand and clay; soil pH, exchangeable Ca and base saturation; etc.), while others are not, or not necessarily (leaving aside strong but spurious correlations). In our previous statistical analyses, in the PCAs and CCAs and particularly in the SEM analysis, we were fully aware of this highly correlated nature of specific soil properties and accounted for it by omitting highly co-varying parameters. This will be stated more clearly in the revised manuscript.
We also agree that the Results section focused too strongly on the single-variable ANOVAs, which are only used to support the results of our multivariate modelling and provide only limited information on their own. We will shorten the Results section and transfer part of the results to the Supplement.
You also suggest showing some ordination approaches. During the course of analyses we also tested some ordination approaches, mainly PCAs. But in our opinion the multivariate statistical results provided only limited insight. However, we will re-evaluate those analyses and examine if the multivariate analyses would help to increase the understanding of the highly correlated nature of the data.
Please find below our replies to the itemized referee comments. The line numbers refer to those of the original manuscript.

Line items:
180 -I don't understand what the other factor, besides land use type, is in these models. More broadly, this analysis scheme doesn't make too much sense to me... your conclusions are that bedrock type is a key driver of depoly rates, and land use type is not. But bedrock type was only subject to a 1-way anova, while land use is subject to a 2-way anova which controls for bedrock type/climate. Why, for example, was the effect of bedrock type not analyzed with a 2-way ANOVA that controlled for the effects of land use? Given relatively low sample #s, it is unsurprising that there is not enough statistical power to detect an effect of land use type in a 2-way ANOVA when controlling for bedrock/climate/soil type, but the bedrock type analysis was not subject to the same dilution of statistical power, so it seems to me that the conclusions are drawn from incommensurable statistical approaches.
For the analyses of land use effects we ran a two-way ANOVA for the main effects of "land use" and "site" (with no interaction, see below), where the factor "site" controlled for any difference in climate, geological substrate and soil type across the sites. We did this because at each site only one composite sample was analyzed per land use (no replication within land use x site) and therefore the single observations were not independent. We sampled the three land use types at each site in close vicinity, and therefore only minor differences in bedrock, climate and soil properties were expected within a site. But we did not specifically control the "land use" ANOVA for bedrock, climate or soil properties; for this we would have needed at least 10-fold larger numbers of samples/sites. However, we are aware that land use effects are difficult to compare to the rather large-scale controls, i.e. climate and soil properties. The chosen sampling scheme was more suitable for investigating large-scale controls, whereas land use effects might be more predominant on regional to local scales. We will clarify this in the revised manuscript.
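
The design described above (one value per land use x site cell, main effects only) can be sketched as follows. This is a minimal illustration with made-up numbers, not the code or data used in the study; the function name `two_way_anova_no_replication` and the example table are hypothetical. Because there is no replication within cells, the interaction cannot be separated from error and the remaining variation serves as the residual term.

```python
# Two-way ANOVA without replication (one value per site x land-use cell).
# All numbers are invented for illustration; they are not data from the study.

def two_way_anova_no_replication(table):
    """table[i][j] = value for site i (block) and land use j (treatment)."""
    n_sites = len(table)
    n_uses = len(table[0])
    grand = sum(sum(row) for row in table) / (n_sites * n_uses)
    site_means = [sum(row) / n_uses for row in table]
    use_means = [sum(table[i][j] for i in range(n_sites)) / n_sites
                 for j in range(n_uses)]

    ss_site = n_uses * sum((m - grand) ** 2 for m in site_means)
    ss_use = n_sites * sum((m - grand) ** 2 for m in use_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    # With one observation per cell, interaction and error are confounded,
    # so whatever is left after the two main effects is the residual.
    ss_resid = ss_total - ss_site - ss_use

    df_site = n_sites - 1
    df_use = n_uses - 1
    df_resid = df_site * df_use
    ms_resid = ss_resid / df_resid
    return {
        "F_site": (ss_site / df_site) / ms_resid,
        "F_land_use": (ss_use / df_use) / ms_resid,
        "df": (df_site, df_use, df_resid),
    }

# Hypothetical rates: rows = 4 sites, columns = 3 land use types.
rates = [
    [1.0, 1.2, 1.1],
    [2.1, 2.0, 2.3],
    [3.2, 3.1, 3.0],
    [4.0, 4.3, 4.1],
]
result = two_way_anova_no_replication(rates)
print(result)
```

In this invented table the between-site variation dominates and the land use F ratio is small, which mirrors the situation where "site" absorbs differences in climate, bedrock and soil type.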

205-Are the +/-numbers one standard error of the mean? Confidence interval? Please state at the first instance.
The +/-numbers refer to one standard error of the mean. We will indicate this at the first instance in the revised manuscript.

Figure 4 -Is there really a clear enough justification to use polynomial regression?
The visual fit improved strongly when using a polynomial regression instead of a linear regression. We will provide further justification for the chosen model in the revised manuscript.
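
One possible way to go beyond visual fit is to compare the two nested models with an information criterion. The sketch below is only an illustration under stated assumptions: it assumes a second-order polynomial (the order is not given here), uses invented data, and the helper names `fit_polynomial` and `aic` are hypothetical. It is not the procedure used in the manuscript.

```python
import math

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns (coefficients, lowest order first; residual sum of squares)."""
    k = degree + 1
    # Normal equations A c = b with A[r][c] = sum(x^(r+c)).
    a = [[sum(xi ** (r + c) for xi in x) for c in range(k)] for r in range(k)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = sum(a[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (b[r] - s) / a[r][r]
    rss = sum((yi - sum(cf * xi ** p for p, cf in enumerate(coef))) ** 2
              for xi, yi in zip(x, y))
    return coef, rss

def aic(rss, n, n_params):
    """AIC for a Gaussian least-squares model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params

# Invented data with a hump-shaped response (not from the study).
x_vals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y_vals = [2.05, 2.39, 2.70, 2.74, 2.73, 2.53, 2.10, 1.62, 0.83, 0.02]

coef_lin, rss_lin = fit_polynomial(x_vals, y_vals, 1)
coef_quad, rss_quad = fit_polynomial(x_vals, y_vals, 2)
aic_lin = aic(rss_lin, len(x_vals), 2)
aic_quad = aic(rss_quad, len(x_vals), 3)
print("linear:    RSS", rss_lin, "AIC", aic_lin)
print("quadratic: RSS", rss_quad, "AIC", aic_quad)
```

A lower AIC for the polynomial model, despite its extra parameter, would be one form of justification; an F-test on the added term would be an alternative.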

-I have a hard time wrapping my head around how Fe and Al oxyhydroxides can simultaneously increase SOM stabilization AND increase SON availability.
First, this conclusion is based on the fact that soils with a finer texture and with larger amounts of minerals with very high specific surface areas, such as Fe/Al oxyhydroxides, potentially store more SOM than soils with a coarser texture. Therefore the overall organic N pool size is expected to be larger in fine-textured soils and in soils high in Fe/Al oxyhydroxides. Second, the strength of the binding interaction between Fe/Al oxyhydroxides and SOM, and more specifically organic N including proteins, is high, and higher than with typical clay minerals (Newcomb et al., 2017). Overall, this means that soils rich in Fe/Al oxyhydroxides contain larger pools of proteolytic substrates (organic N and proteins), but these substrates are more strongly bound and therefore less accessible. The net effect of these opposing interactions is currently unknown; this study is therefore among the first to show a net positive effect on the in situ rates of depolymerization of high-molecular-weight ON substrates. Moreover, our conclusion is supported by the findings of Leinemann et al. (2018), showing that stabilized C was re-dissolved by progressing percolation of OM, indicating that organic compounds can be easily exchanged from e.g. goethite. Hence a higher amount of Fe/Al oxyhydroxides might correspond to a larger fraction of weakly bound organic N, which is continuously re-dissolved and thereby becomes available for microbial utilization. We will further clarify this in the revised manuscript.
We rephrased the sentence as follows: In acidic soils, column experiments with embedded goethite revealed that sufficiently large amounts of stabilized C were re-dissolved by progressing percolation of dissolved OM and consequent exchange of adsorbed compounds. The re-dissolved compounds thus become available for microbial utilization (Leinemann et al., 2018).

-"As demonstrated by partial correlations..." This statement makes an assumption that Al/Fe oxyhydroxides and pH are the TRUE controls, and MAT is just a latent predictor, which I don't think has been fully justified.
In the statement we concluded that "part of the negative effect of MAT on depolymerization rates can be explained by concomitant changes in amorphous Al and Fe oxyhydroxides and soil pH". Increasing temperatures would be expected to directly and positively affect soil enzyme activities, to promote substrate and enzyme diffusion for enzyme-substrate encounter, and to trigger catalytic action, rather than to directly and negatively affect depolymerization rates. The MAT-depolymerization relationship therefore must be indirect. Partial correlations are one way to separate direct from indirect effects on, and primary from secondary drivers of, biogeochemical processes. They showed a significant decrease of the correlation coefficient between mean annual temperature and depolymerization when the effects of soil mineral Fe and Al contents were removed. The decrease of the correlation coefficient when the effects of soil pH were removed was not significant. However, after removing the effects of soil pH and Al/Fe oxyhydroxides, the effect of mean annual temperature on depolymerization rates was still significant. We will clarify this in the revised manuscript: MAT likely controls depolymerization indirectly through multiple effects on vegetation, soil weathering, microbial community structure, etc.
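
The logic of the partial correlation can be illustrated with the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below uses invented values constructed so that the raw correlation weakens but does not vanish once the third variable is controlled for. The variable names (`mat`, `oxides`, `depoly`) and all numbers are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Invented example: oxide content falls with temperature, and the rate
# rises with oxide content (hypothetical units throughout).
oxides = [1, 2, 3, 4, 5, 6]
mat = [13, 9, 8, 6, 3, 3]
depoly = [0.5, 1.2, 1.3, 1.8, 2.7, 3.0]

r_raw = pearson(mat, depoly)
r_partial = partial_corr(mat, depoly, oxides)
print("raw r(MAT, depolymerization):", r_raw)
print("partial r, controlling for oxides:", r_partial)
```

In this constructed case the strongly negative raw correlation shrinks considerably after controlling for the oxide variable but stays negative, which is the qualitative pattern described in the reply.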

-Biogeosciences requires data to be published in a FAIR repository, or else have the reasons for data remaining unpublished be clearly explained. This statement is not in compliance with those requirements. I also encourage the authors to archive their code.