Reply on RC2

We sincerely thank you for evaluating this work and for your valuable comments and suggestions. We carefully considered the points raised and worked to improve the rigor, logic, and clarity of our manuscript titled "A comprehensive geospatial database of nearly 100,000 reservoirs in China". We hereby submit the revised version, which has been modified according to the comments from the editor and reviewers. In the response letter below, we indicate the paragraphs and sections of the manuscript where each change was made. The major changes in the revised manuscript are summarized as follows:
(1) To further illustrate the accuracy of the CRD database, we added a validation experiment that follows the same sampling scheme (Create Random sampling Points method): we randomly selected ten sub-basins from the remaining sub-basins, which together include 1,752 reservoirs. The results were added to the 'Accuracy evaluation of the CRD database' section.
(2) We added one paragraph to the 'Comparisons with other reservoir databases' section to state the contributions of the CRD database. We also added Figure 10, which compares GRanD v1.3, GeoDAR v1.2, GOODD, and CRD in selected regions of China.
(3) We added reservoir residence time information to the revised manuscript and database and supplemented the 'Methodology' section accordingly.
(4) As suggested, we changed the unit of reservoir storage to 'km³' and replaced all basin abbreviations with their full names.
(5) We updated the database at the same time. Three reservoir attributes (river order, discharge, and residence time) were added to the revised database. The revised China Reservoir Dataset (CRD v1.1) is publicly available at https://doi.org/10.5281/zenodo.6984619.
A detailed item-by-item response to all comments and suggestions follows.
Yours sincerely, Chunqiao Song and co-authors

Referee #2:
This manuscript describes the new dataset of most reservoirs in China (China Reservoir Dataset), including the development methodology and the characteristics of the dataset. As the reservoir is important for understanding the water resources and water risk, the constructed database is very useful for many hydrology and climate studies. The manuscript is well designed with clear explanations. I think it can be accepted after some small revisions.
Response: Thank you for the concise summary of this work and its highlights. The points raised by the reviewer are very helpful for improving the manuscript. We have carefully addressed each point below and revised the manuscript accordingly.
1. L29: 979.62 Gt
I (personally) think "km³" is more common as the unit for reservoir storage.
Response: Thanks for the suggestion. The unit of reservoir storage has been changed to 'km³', and its usage has been checked and corrected throughout the revised manuscript.
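For readers checking the unit change, the conversion can be sketched as follows. This is a minimal illustration rather than the code used for the CRD: it assumes a freshwater density of 1000 kg/m³, and the function name `gt_to_km3` is hypothetical. Under that assumption 1 Gt of water occupies 1 km³, so the numerical value stays the same.

```python
# Assumption: fresh water at 1000 kg/m^3 (not stated in the manuscript).
WATER_DENSITY_KG_PER_M3 = 1000.0
KG_PER_GT = 1e12    # 1 gigatonne = 10^12 kg
M3_PER_KM3 = 1e9    # 1 km^3 = 10^9 m^3


def gt_to_km3(mass_gt):
    """Convert a water mass in Gt to a volume in km^3 (hypothetical helper)."""
    volume_m3 = mass_gt * KG_PER_GT / WATER_DENSITY_KG_PER_M3
    return volume_m3 / M3_PER_KM3


# The storage value quoted by the referee at L29.
storage_km3 = gt_to_km3(979.62)
```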

2. L249: water inundation extent
How is the boundary of the reservoir and connecting rivers decided from remote-sensing water extent map? Please explain.
Response: This is a good question. Most reservoirs are formed by damming a valley, which intercepts natural river runoff and raises the water level. Therefore, to determine the boundary between a reservoir and its connecting river, we first used high-resolution images to roughly identify the width of the upstream channel relative to its width before the reservoir was built. Then, we used topographic data to determine where the channel was widened by the water-level rise caused by dam construction. Finally, at the upstream end of the impounded reach, we used manual visual interpretation to cut off the river where it tapers relative to the channel width in the reservoir area, and took this cut as the reservoir boundary.
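The tapering criterion described above was applied by manual visual interpretation. Purely as an illustration of the idea, a hypothetical automated analogue might look like the sketch below. The function name `find_reservoir_tail`, the `tolerance` factor, and the sample widths are all assumptions introduced here and are not part of the CRD workflow.

```python
def find_reservoir_tail(widths_m, river_width_m, tolerance=1.5):
    """Return the index of the first cross-section, moving upstream from the
    dam, whose width has tapered to at most `tolerance` times the natural
    river width. Returns None if no cross-section is that narrow.

    Hypothetical helper: the authors made this decision by eye.
    """
    for index, width in enumerate(widths_m):
        if width <= tolerance * river_width_m:
            return index
    return None


# Made-up channel widths (m), ordered from the dam towards the upstream river.
widths = [820.0, 760.0, 540.0, 310.0, 140.0, 70.0, 62.0]

# Assumed pre-dam river width of 60 m, so the cut-off threshold is 90 m.
tail_index = find_reservoir_tail(widths, river_width_m=60.0)
```

In this toy profile the cut falls at the sixth cross-section (index 5), the first one narrower than the 90 m threshold. A real application would still need the image and topographic checks described in the response.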

Response:
We apologize for omitting the full name of SMAPE (Symmetric Mean Absolute Percentage Error). It has been added in the revised manuscript. (Lines 298-300) "We calculated the SMAPE (Symmetric Mean Absolute Percentage Error) of estimated storage capacity was biased of 32.62-32.64% at the 95% confidence interval based on the fitted model."
4. P303. Figure 3.
Why can we observe some step-wise increase in storage capacity? Please explain. (I guess the effective digits of the storage capacity data, which extent is more continuous).

Response:
If we understand correctly, the question relates to Figure 2, which shows the fitted relationship between the area and storage capacity of small and medium-sized reservoirs. The upper and right subplots of Figure 2 show the counts of reservoir area and storage capacity values, respectively. In response to this concern, we carefully checked the storage capacity data used for fitting and found two main reasons for the observed step-wise increase. First, we did our best to collect records for 4,323 small and medium-sized reservoirs to establish the statistical relationship between inundation area and storage, which we then used to estimate the capacity of the remaining unrecorded reservoirs. However, these records are unevenly distributed across ranges of storage capacity. For example, 903 reservoirs (20.89%) fall between 0.0001 and 0.0002 km³ (5.00-5.30 log10[m³]). In addition, taking the logarithm of storage capacity and area compresses the scale of both variables, which makes the data appear more clustered. Second, as the reviewer guessed, the limited number of significant digits in the recorded storage capacity data makes originally continuous values equal (see Table R1), so sample points overlap and cluster in Figure 2.
5. L326: smaller than 0.01km² are complete.
This should be "larger than 0.01km²".
Response: Thanks for pointing out this unclear statement. We revised it to clarify that, if our data for reservoirs larger than 0.01 km² are complete, trend lines can be fitted to the Pareto distribution (Figure 3 in the manuscript) and extrapolated to estimate the smaller reservoirs not included in the CRD database.
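The extrapolation logic can be sketched as below, assuming the Pareto relationship takes the common power-law form N(>A) = k · A^c, which is linear in log-log space. This is a minimal illustration and not the authors' fitting code. The function names and the cumulative counts are hypothetical, and the numbers are made up rather than taken from the CRD.

```python
import math


def fit_power_law(areas_km2, counts_exceeding):
    """Ordinary least-squares fit of log10(N) = intercept + slope * log10(A)."""
    xs = [math.log10(a) for a in areas_km2]
    ys = [math.log10(n) for n in counts_exceeding]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def count_exceeding(area_km2, slope, intercept):
    """Predicted number of reservoirs with area larger than `area_km2`."""
    return 10 ** (intercept + slope * math.log10(area_km2))


# Made-up cumulative counts for sizes where the inventory is assumed complete.
areas = [0.01, 0.1, 1.0, 10.0]          # area thresholds in km^2
counts = [5000.0, 500.0, 50.0, 5.0]     # reservoirs larger than each threshold

slope, intercept = fit_power_law(areas, counts)

# Extrapolate below the completeness threshold: reservoirs of 0.001-0.01 km^2.
missing_small = (count_exceeding(0.001, slope, intercept)
                 - count_exceeding(0.01, slope, intercept))
```

The estimate is only as reliable as the completeness assumption above the threshold, which is why the corrected wording ("larger than 0.01 km²") matters.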

6. L349. The main causes of errors
For user's viewpoint, the size of the lakes errors are found is better to be provided. For example, if we know there is almost no error for lakes >10 km², users can safely use the dataset for large-scale studies.
Response: [...] error reservoirs in each basin. The results indicate that 67.86% of the commission and omission error reservoirs are smaller than 0.10 km², and the remainder are between 0.10 and 1 km². This is because smaller reservoirs are more likely to be missed during manual visual inspection. Although these validation statistics can be considered a measure of accuracy for our data product, the errors identified in the validation samples have been corrected as far as possible in our new release.
In addition, we have added a relevant description to the updated manuscript. (Line 382) "Also, these ponds and paddy fields are generally less than 0.10 km²."
7. L376: YZR (and other abbreviation names)
I think you don't have to use abbreviations except for Figures and Tables in the main text. Using full name improves the readability.