Comments on amt-2021-196 entitled "The FORUM End-to-End Simulator project: architecture and results" by Sgheri, L. et al

This paper describes a complete End-to-End Simulator (E2ES) for the FORUM mission, with the purpose of estimating the performance of the FORUM Sounding Instrument, which will observe the Earth from orbit in the Far Infrared (FIR) for the first time. The authors showcase different simulations in clear and cloudy sky and perform retrievals on synthetic data, reporting accuracy and sensitivity to atmospheric and surface parameters and comparing the E2ES performance to that of other codes (KLIMA and SACR). Applications to real scenes from MODIS data are also presented as an example of how the E2ES deals with real data.

Overall, the manuscript presents the fundamental aspects of the E2ES and gives interesting elements to evaluate the impact of including the FIR in the observed spectral interval when a simultaneous retrieval of many parameters is performed. The content of this work is therefore certain to be of interest for future activities involving the FORUM mission preparation. However, several revisions need to be implemented before the manuscript can be considered for final publication in AMT, as the text suffers in particular from a lack of clarity on several aspects.

Major comments:
1) The work does not provide any specific detail about the computational performance of the E2ES, and in particular of the SGM and L2M_I modules, which are heavily involved in operational use. While the accuracy is assessed (and is also supported by a long series of studies and by the fact that the E2ES is based on well-established codes such as LBLRTM), the work should provide many more details about: (i) the computational time required to generate a scene with the SGM, and (ii) the average time required to perform retrievals in each of the cases explored in this work. Based on this, the work should also contain a discussion of possible strategies to improve performance.
2) While the work contains a lot of detail about retrieved parameters, very little is shown about retrieved vs. modeled radiances. These are especially important as the work deals with the FIR, for which no hyperspectral observations from orbit exist. More specifically, the work should show: (i) spectra for cases 1.1 to 7.1 (or at least a sample of those) with simulated observations, best fit from L2M_I and KLIMA, and residuals; (ii) spectra and residuals related to the results in Figs. 16 and 17; (iii) spectra and residuals related to the results in Figs. 18 and 19; (iv) the spectra for the MODIS scene case.
3) Regarding the MODIS case, because the spectra are not shown, it is unclear at first glance whether actual MODIS data are fitted (obviously adapting the radiative transfer to the spectral response of MODIS) or whether SGM scenes are generated using the MODIS L2 products and the described databases. The latter seems to be the case, and this should be properly conveyed by the text, as the title of the section mentioning "MODIS data" is misleading.
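To make the request in major comment 2 concrete, the following is a minimal sketch of the residual diagnostics that could accompany each spectrum. It assumes uncorrelated measurement noise (a diagonal noise covariance), which may not hold for the FSI, and the function names and example numbers are hypothetical; none of this is taken from the E2ES code.

```python
def fit_residuals(observed, best_fit, noise_sigma):
    """Return (residuals, noise-normalized residuals) per spectral channel.

    Assumption: the noise is uncorrelated between channels, so each residual
    is simply divided by its own 1-sigma noise value.
    """
    if not (len(observed) == len(best_fit) == len(noise_sigma)):
        raise ValueError("all inputs must have the same number of channels")
    residuals = [o - f for o, f in zip(observed, best_fit)]
    normalized = [r / s for r, s in zip(residuals, noise_sigma)]
    return residuals, normalized


def reduced_chi_square(normalized, n_retrieved):
    """Sum of squared normalized residuals divided by the degrees of freedom."""
    dof = len(normalized) - n_retrieved
    if dof <= 0:
        raise ValueError("need more channels than retrieved parameters")
    return sum(z * z for z in normalized) / dof


# Made-up radiances for four channels, for illustration only.
observed = [10.0, 12.0, 11.0, 9.0]
best_fit = [9.0, 12.5, 11.0, 10.0]
noise_sigma = [0.5, 0.5, 0.5, 0.5]

residuals, normalized = fit_residuals(observed, best_fit, noise_sigma)
chi2 = reduced_chi_square(normalized, n_retrieved=1)
print(residuals, normalized, chi2)
```

Plotting the normalized residuals against wavenumber, together with the reduced chi-square, would let the reader judge at a glance whether the fit is consistent with the noise in the FIR and in the mid-infrared.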
Other/minor comments:
1) Lines 78-79: This sentence is not very clear: more than "routines", the FORUM data will likely provide feedback about the foundations of radiative transfer in the FIR in the Earth's atmosphere itself. Unless this is a quote from the document ESA (2019), it should be slightly clarified.
2) Line 81: A more recent work on the continuum in the FIR, which extends to wavelengths as long as 200 µm, is the following, and it is recommended to cite it: https://www.sciencedirect.com/science/article/pii/S0022407320310141?via%3Dihub
3) Line 82: point ii) needs to be more specific, as in its present form it seems redundant with respect to the other elements mentioned here. Is the text referring to, e.g., the way in which aerosol scattering is handled, or to testing how high opacities due to water are treated?
4) Line 88: among the works regarding parametrization of cirrus clouds, perhaps it is also appropriate to add the most recent reference to the work by Martinazzo et al. (2021): https://www.sciencedirect.com/science/article/pii/S0022407321002326
5) Line 113: can the text be more specific about the requirement of having a spatial sampling of 0.6 km for FSI to be sensitive to clouds?
6) Lines 136-137: This is not necessarily true, as the chosen approach is not equivalent to knowing how "radiative transfer works". Rather, it is useful for "making sure that the radiative transfer introduces no bias in the retrieval w.r.t. simulated observations". The text should be revised to be more precise in this sense.
7) Line 179: can the authors specify the 11 gases considered in the radiative transfer?
8) Line 188: the text is not sufficiently clear about how the aerosol properties are derived over the whole FORUM spectral range, as only the ones at 900 cm-1 are mentioned.
9) Line 193: the text should be explicit about the reasons why two different resolutions are adopted for cloudy and clear cases. This is customary in other codes.
10) Lines 240-241: the "level 1 data analysis code" component of the OSS is never mentioned before and, if it is part of the OSS, it seems redundant to mention it here. Is this the same "level 1 module" as the one mentioned in line 273?
11) Line 281: How is the resampling performed? Can the text provide some more details on this?
12) Lines 355-356: the text does not explain why the choices made in the simulations "do not have an impact on the assessment of the performances of the FEES". Those performances (as shown later in the paper) indeed also relate to the degree of inhomogeneity of the FoV.
13) Lines 384-390: The nomenclature of some of the variables used in the retrieval process is confusing. Is one case the background term in the OE equations and the other one the first guess? The authors should clarify the nomenclature of the different terms ("initial condition" and "a-priori" vs "first guess" and "background") and there should be no ambiguity between the two.
14) Lines 391-392: This is not entirely accurate. In cases where the surface temperature and emissivity are retrieved simultaneously, the retrieval could easily be driven to a local minimum if the two are not properly regularized. The best choice of the first-guess value for emissivity is indeed the background in real-data applications (as it comes from climatology); however, the regularization of the combined retrieval of emissivity and surface temperature is a separate problem.
15) Lines 394-400: as is evident from this description and from the references it contains, the KLIMA FM is heavily based on the same line list and continuum used by LBLRTM, and it works with the same line-by-line approach. The SGM itself uses LBLRTM. Since this part of the work describes the comparison between the KLIMA and L2M performances, it would be useful to also state the key differences between the KLIMA FM and the way LBLRTM is used in the SGM, as this gives the full perspective about which components of the two codes are effectively independent.
16) Lines 433-434: the retrieved emissivity seems to have a lot of unphysical structures (on its sampling of 5 cm-1), which is problematic as it could, in principle, bias the retrieval of atmospheric parameters. The text should discuss whether such biases have been observed and whether they are significant, and ideally include this aspect among the directions to further improve retrievals in view of FORUM operational retrievals.
17) Lines 503-506: This part of the text is contradictory, as it says that the objective is "to perform a realistic validation" and then it says "we cannot truly speak of validation". Also, the text does not fully explain why the atmospheric parameters are perturbed and not retrieved, and why this should grant a realistic validation.
18) Line 523: In an Optimal Estimation approach, it is customary to internally normalize the matrices and arrays such that the value of each parameter is comparable to the others. Therefore this point is not necessarily granted.
19) Line 524: to prove this point it is necessary (and very informative to the readers, given the novelty of the spectral interval as well) to show a full a posteriori covariance matrix resulting from the analysis for this case.
20) Lines 562-563: This is not shown anywhere in the section, and it should be rephrased or not mentioned.
21) Figure 16: it would be very useful to indicate the boundaries of the cloud in the plots (e.g., with a gray patch or horizontal lines).
22) Lines 645-646: While this is certainly true, the emissivity exhibits much lower variability than in other cases (e.g. a scene with sand + canopy). A much more challenging test would include a scenario like this and would give valuable information on the validity of the scheme. Have the authors considered including this in the analysis?
23) Lines 647-648: is the surface temperature retrieved? Can it be added to Figure 19 (e.g. as a marker on the bottom x axis)? Also, are the emissivity fractions assumed as a priori input in the retrieval?
24) Figure 18: Using a light grey for the errorbars would significantly improve the readability of the figure.
25) Figure 20: at which wavenumber is the OD represented? (perhaps 900 cm-1, but it should be specified)
26) Line 680: the reason for this limitation and how this will be dealt with in an operational context should be specified.
27) Figure 21: as in the previous cases, the difference between true and retrieved profiles compared to the errorbars should be shown, as it is difficult to see here.
28) Lines 735-736: as said in one of my previous comments, the text should be more detailed on how to improve retrievals, perhaps gathering the main critical issues found during this work in a bullet point list.
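As an illustration of minor comments 18 and 19, the sketch below evaluates the standard Optimal Estimation a posteriori covariance, (K^T Se^-1 K + Sa^-1)^-1, for two state parameters and shows that the resulting correlation coefficient does not change when the state variables are rescaled. It is a toy example under stated assumptions: the Jacobian values, noise variances and a priori covariance are invented, the noise covariance is taken as diagonal, and the function names are hypothetical. It is not the L2M_I or KLIMA implementation.

```python
import math


def posterior_covariance_2x2(K, noise_var, Sa):
    """A posteriori covariance (K^T Se^-1 K + Sa^-1)^-1 for two parameters.

    K: list of Jacobian rows [dy/dx0, dy/dx1], one row per channel.
    noise_var: per-channel noise variance (diagonal Se, an assumption).
    Sa: 2x2 a priori covariance as nested lists.
    """
    # Accumulate the symmetric information matrix K^T Se^-1 K.
    a = b = d = 0.0
    for (k0, k1), var in zip(K, noise_var):
        a += k0 * k0 / var
        b += k0 * k1 / var
        d += k1 * k1 / var
    # Add the inverse of the a priori covariance.
    det_a = Sa[0][0] * Sa[1][1] - Sa[0][1] * Sa[1][0]
    a += Sa[1][1] / det_a
    b -= Sa[0][1] / det_a
    d += Sa[0][0] / det_a
    # Invert the 2x2 sum analytically.
    det = a * d - b * b
    return [[d / det, -b / det], [-b / det, a / det]]


def correlation(S):
    """Off-diagonal correlation coefficient of a 2x2 covariance matrix."""
    return S[0][1] / math.sqrt(S[0][0] * S[1][1])


# Invented Jacobian for three channels and two parameters.
K = [[1.0, 0.5], [0.8, 0.9], [0.2, 1.1]]
noise_var = [0.01, 0.01, 0.01]
Sa = [[4.0, 0.0], [0.0, 1.0]]
S = posterior_covariance_2x2(K, noise_var, Sa)

# Rescale the state: x_i' = x_i / s_i, so the Jacobian columns are multiplied
# by s_i and the a priori covariance entries are divided by s_i * s_j.
s = (10.0, 0.1)
K_scaled = [[k0 * s[0], k1 * s[1]] for k0, k1 in K]
Sa_scaled = [[Sa[i][j] / (s[i] * s[j]) for j in range(2)] for i in range(2)]
S_scaled = posterior_covariance_2x2(K_scaled, noise_var, Sa_scaled)

print(correlation(S), correlation(S_scaled))
```

Because the correlation is independent of the internal scaling, showing the full a posteriori covariance (or correlation) matrix for this case would be a scale-free way to support the claim made at line 524.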

Technical comments:
1) There is a series of acronyms not spelled out in the text: OLR (line 74), FEES (line 346, perhaps FORUM End-to-End Simulator?), GN (line 390), NLSF (line 410), and possibly others, for which the authors should carefully check the text. Also, the use of "E2E simulator" in a couple of instances should be replaced by "E2ES".
2) Line 38: "absorption" instead of "coefficients".
3) Lines 128-130: This passage is a little repetitive; the text can be shortened.
4) Table 5: Is this the total OD or the OD per km?
5) Table 6: The table could be improved if reformatted with two columns, one for KLIMA and one for L2M_I.
6) Line 464: the "substantial features" are usually the quartz Reststrahlen bands, and this should be made explicit.
7) Line 587: The FSI is the instrument. Perhaps the authors mean the OSS?