Reply on RC1

The paper entitled "Assimilating realistically simulated wide-swath altimeter observations in a high-resolution shelf-seas forecasting system" presents a very interesting and innovative study, using state-of-the-art methods and models, and is clearly presented. The main objective of the study is to prepare for the assimilation of future wide-swath altimetry observations from the SWOT satellite and to quantify the impact and expected improvement of this future mission. The authors address this problem in a realistic operational high-resolution ocean forecasting system, using an OSSE protocol that is perfectly defined and justified in terms of the ocean processes represented in the model, the adequacy of the model resolution, the complementarity of the observations assimilated in the system, and a state-of-the-art data assimilation method fully validated and used in an operational context.

The results presented don't provide more information than the bias or RMSE. However, I think that the analysis increments provide important information to fully understand how the data assimilation scheme works. I expect that the spatial scales of the increments should be different between experiments; this could be illustrated in Figure 7, for example. The increments could also be useful to illustrate the discussion in Sections 5.2.2 and/or 5.2.3 on the improvement/degradation of the solution depending on the assimilated observations.
Our analysis focussed on the bias and RMSE because, with an OSSE, we have the truth everywhere, so comparing the bias and RMSE between experiments gives a clear indication of the benefits or detriments of changes due to the observations assimilated. However, we agree that the analysis increments are useful in understanding how the assimilation scheme works, and so we have updated Sections 5.1 & 5.3 to include a figure (new Figure 9) and discussion of the increments in the Control experiment compared to the LowErrSWOT experiment.
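For reference, the new diagnostic is simple to compute: the increment is the analysis minus the background. A minimal Python sketch is given below; the file and variable names are hypothetical placeholders, not those of our system.

```python
# Minimal sketch: daily SSH analysis increments and their summary statistics.
# File and variable names here are illustrative placeholders.
import numpy as np
import xarray as xr

background = xr.open_dataset("ssh_background.nc")["ssh"]  # (time, y, x)
analysis = xr.open_dataset("ssh_analysis.nc")["ssh"]      # (time, y, x)

# The analysis increment added by the assimilation each day
increment = analysis - background

# Domain-mean (bias) and RMS of each daily increment field
bias = increment.mean(dim=("y", "x"))
rms = np.sqrt((increment ** 2).mean(dim=("y", "x")))
```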
One of the objectives of assimilating SWOT observations is to constrain small-scale structures in the ocean. This is not addressed in the paper (except for a remark at the end of Section 5.3, without any illustration or explanation); the authors do not present any results illustrating the impact on mesoscale structures in the different simulations, or a spectral analysis presenting differences in terms of energy between the experiments. I understand that this is not the aim of the paper, which really focuses on the different sources of error in the SWOT observations, and especially the very important topic related to uncorrelated errors. I recommend adding at least a paragraph in the discussion section on the impact of SWOT on mesoscale structures, and a perspective on this topic in the conclusion. Ideally, the authors would add a subsection in Chapter 5, for example in Section 5.1 about SSH.
As you say, the main aim of this paper was to investigate the limitations posed by correlated errors on the impact of SWOT. We intend to explore the use of power spectral analysis in upcoming global OSSEs, where the limited size of this regional model domain would not be an issue. However, the figure showing example daily increments, added in response to the first comment, demonstrates that the assimilation scheme attempts to add more small-scale structures when swath altimetry is included. Additionally, in Section 5.3 we have attempted to demonstrate qualitatively the impact of assimilating SWOT observations on the surface currents. For example, Figure 15 shows improved errors in the surface currents at large and small scales, suggesting the mesoscale structures are better initialised when assimilating the SWOT observations without correlated errors. Sections 5.1 & 5.3 and the discussion section have been updated to discuss this in more detail.
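To indicate the kind of diagnostic we intend for the global OSSEs, a minimal sketch of an along-track SSH wavenumber spectrum follows; the windowing and normalisation are schematic, and the grid spacing `dx` is an assumed input.

```python
# Minimal sketch: along-track wavenumber power spectrum of SSH.
import numpy as np

def ssh_spectrum(ssh_track, dx):
    """ssh_track: 1-D array of SSH samples (m); dx: sample spacing (km)."""
    anomaly = ssh_track - ssh_track.mean()       # remove the mean first
    window = np.hanning(anomaly.size)            # taper to reduce spectral leakage
    coeffs = np.fft.rfft(anomaly * window)
    psd = (np.abs(coeffs) ** 2) * dx / anomaly.size  # schematic normalisation
    wavenumber = np.fft.rfftfreq(anomaly.size, d=dx) # cycles per km
    return wavenumber, psd
```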
The experimental protocol is well described and fully justified, especially with regard to how SWOT errors are represented in the system and the impact of these errors when the data are assimilated. The authors should provide recommendations in the discussion or conclusion section about how SWOT data should be post-processed for optimal use in a data assimilation scheme. How could correlated errors be removed or reduced? Is the HalfSWOT or the 5 km and 20 km filtering solution a recommendation or an ad hoc solution? Is it realistic to expect that only the KaRIn error will remain?
We have shown that using the inner half of the swath and median-averaging with a 5 km radius can reduce many of the problems caused by the correlated errors in the simulated SWOT observations. However, these choices will not be optimal for all systems; the precise level of averaging will depend on the resolution of the model used. Although some pre-processing by the data providers may be possible, we are not experts in this area, and the current literature leads us to believe that correlated errors will be a significant issue when using near-real-time swath altimetry from SWOT in operational systems. As mentioned in our discussion, we expect that methods to account for correlated errors in the assimilation scheme will be necessary to fully exploit the observations from SWOT. Our discussion and conclusions have been updated to emphasise these aspects.
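As an illustration, this pre-processing can be sketched as follows. This is a simplified version: the ~2 km pixel spacing, and the array layout with the first cross-track columns nearest nadir, are assumptions made for this example only.

```python
# Minimal sketch: keep the inner half of a SWOT swath and median-average
# over roughly a 5 km radius. ssh is an (along-track, cross-track) array.
import numpy as np
from scipy.ndimage import median_filter

def preprocess_swath(ssh, pixel_km=2.0, radius_km=5.0):
    # Inner half of the swath (columns assumed nearest nadir), where the
    # cross-track correlated errors (e.g. roll) are smallest.
    inner = ssh[:, : ssh.shape[1] // 2]

    # Odd-sized median window spanning roughly the chosen radius.
    size = 2 * int(radius_km / pixel_km) + 1  # 5 pixels for ~5 km at 2 km spacing
    return median_filter(inner, size=size)
```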

In Chapter 1 (Introduction), the authors could add a citation of the recent publication Benkiran et al. (2021), "Assessing the Impact of the Assimilation of SWOT Observations in a Global High-Resolution Analysis and Forecasting System Part 1: Methods".
References to this recent paper and the associated Part2 paper have been added. Thank you for the suggestion.
In Section 3.1.1, the authors comment on an important point regarding the differences between the nature run and the free run; in the OSSE protocol it is important to understand these differences and how the data assimilation scheme will move the model onto another trajectory. In this section it is not clear why there is a systematic cold and fresh bias. Is there a mistake in the explanation "due to broadly similar irradiative fluxes between the atmospheric forcing datasets"? Is there a systematic bias between the two atmospheric forcings used in the experiment for the wind? For the heat fluxes? The paper doesn't address the question of whether this systematic bias between the nature run and the free run has an impact on the results. Could you expect a different impact on the sea level analysis in an unbiased system? The authors don't provide an OSSE calibration, comparing the SLA differences between the nature run and the free run with what would be obtained in a real case assimilating real data. This is recommended to understand whether, in the OSSE experiment, the data assimilation scheme will work as in a real case. I suggest providing in Figure 2 an additional map showing the classical SLA increment obtained in the operational system.
The different atmospheric forcing is one of the main methods we have used to introduce realistic differences between the Nature Run (NR) and OSSE runs. Both forcing sets are high-quality atmospheric forecasts, and the difference between these forcing datasets is a fair reflection of the uncertainty in the true atmospheric forcing. The use of different surface forcing leads to changes in the ocean model mixed layer depth, which is the cause of the initial cold bias in the Free Run compared to the NR. This is readily corrected by SST assimilation (as shown in Figure 11). However, the initial fresh bias is primarily due to the different initial conditions used in the NR and OSSE runs. Again, this was deliberate, to ensure the NR and OSSE runs had differences which reflect those between the real world and our forecast systems. Both initial conditions came from assimilative runs at the correct time of year, and the differences reflect the uncertainty given the relative lack of sub-surface observations. Section 3.1.1 has been updated to clarify this.
To better understand whether the SLA differences between the nature run and OSSE runs are similar to what might be expected in an operational system, we have compared the mean and RMS of the SSH increments from the Control and LowErrSWOT experiments with those from a separate experiment assimilating real observations in the same model and over the same time period. We found the biases were near zero in all cases, while the RMS of the SSH increments was 1.21 cm in the Control, 1.19 cm in our experiment assimilating real observations, and 1.63 cm in the LowErrSWOT experiment when simulated SWOT observations were included. We believe this demonstrates that our Control run is applying similar increments to an operational system assimilating real observations, and that the simulated SWOT observations allow more of the SSH variability to be observed and assimilated. Section 3.1.1 has been updated to discuss this.
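Explicitly, with delta_i the SSH increment at model grid point i and N the number of grid points (pooled over the daily analyses), the statistics quoted above are the usual ones:

```latex
\mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}\delta_i , \qquad
\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}
```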

In Section 4.2, it might be useful to provide a brief definition of each error and to comment on each panel of Figure 5, from a) to f). Could the authors provide more information on the following remark: "The length-scale of these correlations can also be of the same order as the size of the domain"? Is this deduced from one of the figures?
A brief description of each error has been added as suggested. Our comment on the length-scales was deduced from the figure; this has been clarified in the text.

One important difference between the Control, SWOT, and HalfSWOT runs is the number of SLA observations in the system during each data assimilation cycle. The authors don't provide any information on the number of observations assimilated during an assimilation cycle, or on the expected impact when the data assimilation scheme assimilates half the observations.
We have updated Sections 4.1 & 4.2 with details on the average number of each observation type assimilated in each assimilation cycle. Each day, there are approximately 10^5 SST observations, 10 T/S profiles with ~1000 total observations, a few thousand nadir SLA observations, and 10^5 SWOT observations. The inclusion of SWOT therefore approximately doubles the total number of observations assimilated.
The number of iterations (40) used to minimise our 3D-Var cost function was sufficient to reach a similar level of convergence in all our experiments, and so we do not expect the quantity of observations in itself to be a factor in the resulting impacts. Rather, we have shown the impact of the increased spatial coverage from SWOT. The differences between the HalfSWOT and SWOT experiments (as discussed in Section 5) balance the effects of reducing the spatial coverage of swath altimeter observations while at the same time removing those with the largest correlated errors.
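For reference, the incremental 3D-Var cost function being minimised takes the standard form (written schematically here; our operational formulation includes additional detail):

```latex
J(\delta\mathbf{x}) =
  \tfrac{1}{2}\,\delta\mathbf{x}^{\mathrm{T}}\mathbf{B}^{-1}\delta\mathbf{x}
+ \tfrac{1}{2}\,(\mathbf{H}\delta\mathbf{x} - \mathbf{d})^{\mathrm{T}}
    \mathbf{R}^{-1}(\mathbf{H}\delta\mathbf{x} - \mathbf{d})
```

where delta-x is the increment to the background state, d the vector of innovations (observation minus background), B the background-error covariance, and R the observation-error covariance. The number of observations enters through d and R, while the 40 iterations refer to the minimisation of J.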

In Table 3, it is unclear how the RMSE is computed. Is it computed in observation space or in model space?
The RMSE values have been calculated in model space over the full domain (and for the on- and off-shelf regions marked in Figure 1). This has been clarified in the table caption and in the main text in Section 5. The table has also been updated as the units were not displayed very prominently.
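That is, with x_i the experiment field and x_i^t the Nature Run truth at model grid point i, and N the number of grid points in the chosen region:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - x_i^{t}\right)^{2}}
```
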
In Section 5.1, the authors note considerable seasonal variation in the off-shelf SSH RMSE, but the high-frequency variability is even larger and is not mentioned.
In Section 5.1 we had addressed the high-frequency RMSE variation seen in the on-shelf region, but had not commented specifically on the off-shelf RMSE. To a lesser extent, high-frequency RMSE increases are also seen in the off-shelf RMSE time-series. This may be due to the misplacement of eddies: given the distance and time between SWOT swaths, many of the mesoscale structures will still be unobserved. We have updated Section 5.1 to discuss this.

Chapter 5, Section 5.1: why don't the authors keep the same section structure, with a separation between on- and off-shelf results, for each variable?
There is only one subsection in Section 5.1. For SSH and surface currents, although the time-series separate the on- and off-shelf regions, the maps show the impacts over the full domain. We therefore felt that the on- and off-shelf regions could best be described in one section. For temperature and salinity, the additional subsections were used to make the discussion easier to follow, given the number of figures involved.