Reply on RC1

1. The authors compare downscaled SIF with FLUXCOM GPP. Downscaled SIF from GOME-2 can include two sources of error: I) GOME-2 retrievals are known to be somewhat less accurate than, say, OCO-2 and TROPOMI, and II) the downscaling itself might introduce errors. Given that we have more than 2 years of TROPOMI data, I don’t understand why a simple test of downscaled GPP with “original” TROPOMI SIF data cannot be performed. This would help evaluate the robustness of the product used.

Such comparisons are already available in the Duveiller et al. paper (https://essd.copernicus.org/articles/12/1101/2020/), where the downscaled SIF dataset is independently validated with OCO-2 SIF observations and a further comparison of downscaled SIF against TROPOMI data is provided. The paper shows that there is a high spatio-temporal agreement with the first TROPOMI retrievals, which justifies the use of this global high-resolution SIF (with a long temporal archive) in the current analysis.
As we mention in the manuscript, due to GOME-2 sensor degradation we chose not to extend the analysis in the paper beyond 2014. The time period considered therefore does not overlap with OCO-2 or TROPOMI, so a direct comparison with either sensor could suffer from data consistency issues.
2. Please always provide the reference wavelength for SIF (which is wavelength dependent) and clearly state whether it was length-of-day corrected or not.
This is a good point, thank you for raising it. We will add the following lines to the manuscript:
(L146) The two retrievals have a spectral wavelength around 740 nm, and differ in the retrieval method…
(L148) A correction factor to convert the instantaneous SIF to the daily average is applied to both datasets to ensure comparability with estimates at different acquisition times (Frankenberg et al., 2011b; Köhler et al., 2018a).
L270 In order to explore the potential capabilities of SIF as an early indicator of stress across different type of vegetation type, the response of downscaled SIF to anomalies in a number of meteorological variables is analysed. -> The response of length-of-day corrected downscaled SIF to anomalies in a number of meteorological variables is analysed.
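For readers unfamiliar with the length-of-day correction referred to in the L148 addition, the following is a minimal sketch of one common form of such a factor: the 24-hour mean of the cosine of the solar zenith angle (set to zero at night) divided by its value at the overpass time. It is an illustration only, not the processing code used for the datasets in the manuscript; the function names, the simple hour-angle geometry and the 10-minute integration step are assumptions made for this example.

```python
import math


def cos_sza(lat_deg, decl_deg, hour_angle_deg):
    """Cosine of the solar zenith angle from latitude, declination and hour angle."""
    lat, decl, ha = (math.radians(v) for v in (lat_deg, decl_deg, hour_angle_deg))
    return math.sin(lat) * math.sin(decl) + math.cos(lat) * math.cos(decl) * math.cos(ha)


def daily_correction_factor(lat_deg, decl_deg, overpass_hour, step_minutes=10):
    """Ratio of the 24 h mean of cos(SZA), clipped at zero, to cos(SZA) at overpass.

    overpass_hour is local solar time in hours (hypothetical input for this sketch).
    """
    n_steps = int(24 * 60 / step_minutes)
    total = 0.0
    for i in range(n_steps):
        hour = (i + 0.5) * step_minutes / 60.0  # midpoint of each time step
        hour_angle = 15.0 * (hour - 12.0)       # degrees, zero at solar noon
        total += max(cos_sza(lat_deg, decl_deg, hour_angle), 0.0)
    mean_cos = total / n_steps

    cos_overpass = cos_sza(lat_deg, decl_deg, 15.0 * (overpass_hour - 12.0))
    if cos_overpass <= 0.0:
        raise ValueError("Sun is below the horizon at the overpass time")
    return mean_cos / cos_overpass


def daily_mean_sif(sif_instantaneous, lat_deg, decl_deg, overpass_hour):
    """Scale an instantaneous SIF value to an approximate daily average."""
    return sif_instantaneous * daily_correction_factor(lat_deg, decl_deg, overpass_hour)
```

Under these assumptions, a noon overpass at the equator on an equinox gives a factor close to 1/π, and an earlier overpass gives a larger factor because the instantaneous illumination is lower than at noon.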
3. The dataset by Koehler et al wasn't used but that decision is not well motivated (or described). What "bias" are the authors talking about? Statements like these really need to be rigorous, right now it is rather sloppy.
We agree that the wording here is quite sloppy and can be misinterpreted. For nearly all of the figures we have also run the analysis with SIFPK. However, the differences are marginal and we wanted to avoid duplication in the paper (which would double the number of figures).
The decision to use the SIFJJ dataset follows the findings of Duveiller et al. (https://essd.copernicus.org/articles/12/1101/2020/). That paper compares each of the two SIF retrievals (at both coarse resolution and downscaled) with OCO-2 SIF (at 0.05deg) in terms of agreement, correlation and bias. Figures 1 & 3 show that there is 1) a slightly higher level of agreement between downscaled SIFJJ and OCO-2 than between downscaled SIFPK and OCO-2, 2) similar levels of correlation between the downscaled SIF and OCO-2 for both retrievals, and 3) a lower level of bias between SIFJJ and OCO-2 than between SIFPK and OCO-2. The (un-downscaled) SIFJJ tends to have more noise than SIFPK; however, the downscaling process smooths over some of this noise ('The JJ retrieval, which is known to be noisier and to have a smaller bias than PK, benefited particularly from the downscaling procedure, probably due to the embedded spatial smoothing step'). We therefore chose the SIFJJ dataset because it has slightly less bias and the downscaling process reduces its noise.
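To make the three comparison criteria concrete, here is a minimal sketch of how correlation, bias and an overall error measure could be computed between a candidate SIF dataset and a reference. It is a generic illustration: root-mean-square error is used here as a simple stand-in for agreement and is not necessarily the agreement metric used by Duveiller et al., and the function name and inputs are hypothetical.

```python
import math


def comparison_metrics(candidate, reference):
    """Pearson correlation, mean bias and RMSE of candidate against reference.

    Pairs in which either value is NaN (for example, missing retrievals) are skipped.
    """
    pairs = [(c, r) for c, r in zip(candidate, reference)
             if not (math.isnan(c) or math.isnan(r))]
    n = len(pairs)
    if n < 2:
        raise ValueError("Need at least two valid pairs")

    mean_c = sum(c for c, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    cov = sum((c - mean_c) * (r - mean_r) for c, r in pairs)
    var_c = sum((c - mean_c) ** 2 for c, _ in pairs)
    var_r = sum((r - mean_r) ** 2 for _, r in pairs)
    if var_c == 0.0 or var_r == 0.0:
        raise ValueError("Correlation is undefined for a constant series")

    return {
        "correlation": cov / math.sqrt(var_c * var_r),
        "bias": mean_c - mean_r,  # positive when the candidate overestimates
        "rmse": math.sqrt(sum((c - r) ** 2 for c, r in pairs) / n),
    }
```

Applied separately to downscaled SIFJJ and downscaled SIFPK against the same reference, metrics of this kind would let the two retrievals be ranked on each criterion.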
We propose to rephrase our motivation and make greater reference to the Duveiller paper, and we will replace L155-L159 in the next version with (changes underlined):
(L155) Duveiller et al. show that the downscaled SIFJJ dataset has a slightly higher level of agreement with the OCO-2 validation data than the downscaled SIFPK dataset, and so it is primarily used in this paper and is henceforth referred to as 'downscaled SIF' (or SIFDS). The higher agreement likely results from the spatial smoothing step of the downscaling process, which benefited the noisier SIFJJ more than the SIFPK.

4. To me, there is some circularity in the interpretations. Most importantly, the authors state that: "Proving this technique at a global scale provides evidence for the use of high-resolution SIF in monitoring the resilience of local ecosystems to environmental fluctuations, an area of growing importance as extreme weather events become more frequent and more severe". This statement is far reaching but it is actually based on just a comparison with FLUXCOM GPP, which implies that FLUXCOM GPP has the same potential (and could be provided in near real time as well). Thus, it is unclear what SIF could do that FLUXCOM (or other pure remote sensing products) can't. The interesting cases would be those in which the products disagree but the authors' statement is based on the agreement in the IAV between the two.
We think the key points of this part of the analysis are: 1) SIF is measured independently of the meteorological variables (it is remotely sensed, not modelled); 2) the FLUXCOM GPP is modelled using remotely sensed data, including temperature and water (normalized difference water index) inputs; and 3) the SIF response to meteorological fluctuations is more or less the same as the FLUXCOM GPP response, as evidenced by the study. We therefore have a remote sensing proxy for GPP, independent of meteorological variables, which we demonstrate is sensitive to meteorological fluctuations in a similar way to GPP. We hope that this early study shows the potential of SIF for measuring the response of global plant growth to meteorological fluctuations.
The value of the analysis lies in demonstrating that downscaled SIF follows the pattern observed in the FLUXCOM GPP, and as such provides a global, (mostly) independent, near-real-time remote sensing (RS) observation.
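The kind of comparison described here can be sketched as follows: remove the seasonal cycle from each series by standardising within each calendar month, then compare how strongly the SIF anomalies and the GPP anomalies each correlate with the anomalies of a meteorological variable. This is a simplified illustration under assumed inputs (plain lists of values with matching month labels), not the analysis code behind Figure 9, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def monthly_standardized_anomalies(values, months):
    """Subtract each calendar month's mean and divide by that month's standard deviation."""
    by_month = defaultdict(list)
    for value, month in zip(values, months):
        by_month[month].append(value)

    stats = {}
    for month, vals in by_month.items():
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        stats[month] = (mean, std)

    # A month with zero spread carries no anomaly information, so return 0.0 there.
    return [(v - stats[m][0]) / stats[m][1] if stats[m][1] > 0 else 0.0
            for v, m in zip(values, months)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def anomaly_response(vegetation_series, met_series, months):
    """Correlation between vegetation anomalies and meteorological anomalies."""
    veg_anom = monthly_standardized_anomalies(vegetation_series, months)
    met_anom = monthly_standardized_anomalies(met_series, months)
    return pearson(veg_anom, met_anom)
```

Calling `anomaly_response` once with the SIF series and once with the GPP series, against the same meteorological variable, gives two numbers whose similarity is the quantity of interest in this argument.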
We will make the following changes for clarification (changes underlined):
L280 'For comparative purposes, the FLUXCOM GPP is also included, however it should be noted that the product takes several remotely-sensed climatic variables as input.' -> The FLUXCOM GPP is also included in the analysis, though, as noted, the FLUXCOM GPP product takes several remotely-sensed climatic variables as input and so is not independent of the meteorological drivers. The inclusion of the GPP product enables a comparison with the SIFDS, giving insight into whether the SIF behaves as may be expected of an independent proxy for GPP.
L580 Proving this technique at a global scale provides evidence for the use of high-resolution SIF in monitoring the resilience of local ecosystems to environmental fluctuations, an area of growing importance as extreme weather events become more frequent and more severe -> Proving this technique at a global scale demonstrates that high-resolution SIF responds to meteorological fluctuations in a similar way to FLUXCOM GPP. As such it has potential as a near real-time indicator of vegetation status that, unlike FLUXCOM GPP, is independent of meteorological variables.
We also agree that some of the wording overreaches in its conclusions from the analysis, in particular the use of the word 'resilience' and, elsewhere, 'monitoring environmental stress'. We would like to draw your attention to a softening of the wording in the meteorological analysis. Some of this is described in our responses to other reviewer comments, and includes the following changes:
L27 and demonstrates the utility of SIF as a measure of environmental stress. -> and explores the similarity of the SIF and GPP responses to meteorological fluctuations.
L469 In this context the study suggests that it is possible to use high-resolution SIF as a near-real time measure of the resilience of ecosystems to climate fluctuations -> In this context the study suggests that it may be possible to use high-resolution SIF as a near-real time measure of the response of vegetation productivity to climate fluctuations
L490 This suggests the possibility of using SIF in the near-real-time monitoring of vegetation stress in reaction to environmental conditions -> This suggests the possibility of using SIF in the near-real-time monitoring of vegetation reaction to environmental conditions
L580 Proving this technique at a global scale provides evidence for the use of high-resolution SIF in monitoring the resilience of local ecosystems to environmental fluctuations, an area of growing importance as extreme weather events become more frequent and more severe -> Proving this technique at a global scale demonstrates that high-resolution SIF responds to meteorological fluctuations in a similar way to FLUXCOM GPP. As such it has potential as a near real-time indicator of vegetation status that, unlike FLUXCOM GPP, is independent of meteorological variables.

5. Some (if not all?) of the variables analyzed (VPD, radiation) are also included as driver variables for FLUXCOM. It is thus unclear whether we are learning something new. The authors could do the same analysis as in Figure 10 but for FLUXCOM-GPP as well to evaluate whether the drivers (or limitations) between the datasets are identical or not. Only then would we learn something in my mind, right now a lot of the analysis is somewhat phenomenological.
I think this follows from the comment above, and we apologise if this does not come across clearly enough in the text. Hopefully the changes mentioned in our response to the previous comment make this clearer. These variables are drivers for FLUXCOM GPP; however, the object of interest in the paper is SIF, and so we are learning that SIF, an RS product measured independently of these variables, behaves in a similar way to one of the best global estimates of GPP.
As for Figure 10, it is possible to repeat it with FLUXCOM GPP (and indeed we have looked at this), but we agree that we would not be learning something new, as it would simply confirm a circular argument: GPP is modelled with meteorological variables, and then we see how GPP responds to changes in those variables. We would like to stress that Figure 10 is not the main aim of the analysis; Figure 9 is, as it shows that independent RS SIF responds to meteorological fluctuations in a way that we would perhaps expect if SIF serves as somewhat of a proxy for GPP, with the implication that this is useful in detecting future plant growth in response to short-term meteorological variations. SIF offers value here as it is a near-instantaneously measured, independent RS product (not modelled), and the analysis (demonstrated in Fig. 9) confirms that its response to meteorological fluctuations follows what we expect from theory.
Figure 10 is simply an addition to show where SIF suggests a given meteorological variable dominates. We are of course happy to remove the figure if it is not considered useful to the analysis, and we agree that it is not so surprising and does not reveal a new insight into how vegetation growth responds to meteorological fluctuations. It is also possible to create a GPP version or a SIF-GPP difference, but I am not sure this would be so useful to the paper, as fundamentally the paper is about downscaled SIF. Hopefully the suggested changes to the text in our response to the previous comment make clearer the reasoning for not recreating the figure for GPP.