the Creative Commons Attribution 4.0 License.
A hybrid algorithm for ship clutter identification in pulse compression polarimetric radar observations
Abstract. With the rapid development of active-phased arrays and solid-state transmitters, pulse compression technology has become increasingly important. Currently, pulse compression waveforms with peak sidelobe levels better than -50 dB have been developed, enabling the broader application of pulse compression technology in weather radar systems. However, existing sidelobe suppression levels are still insufficient to ensure that radar data quality is unaffected by range sidelobes for ship clutter, which have a high echo intensity and cannot be removed by conventional quality control methods. In this study, we introduce a Hybrid Ship Clutter Identification (HSCI) algorithm to address this issue in pulse compression polarimetric radar observations. The HSCI algorithm comprises two parts: mainlobe and sidelobe identification (including the range and antenna sidelobes). Mainlobe identification uses a random forest model that integrates multiple features to identify the mainlobe of ship clutter. Sidelobe identification uses a series of heuristic criteria derived from the statistical characteristics of ship clutter to distinguish them from precipitation echoes. The analysis results of two typical cases indicate that after implementing the HSCI algorithm, the impact of ship clutter on radar data is visually imperceptible. The statistical results show that the HSCI algorithm achieves a ship clutter mainlobe identification rate of 97.25 % with a misidentification rate of only 0.08 % in the precipitation data. Application of this algorithm to the University of Helsinki C-band dual-polarization Doppler weather radar data successfully reproduced ship tracks in the Gulf of Finland.
Status: final response (author comments only)
RC1: 'Comment on amt-2024-194', Anonymous Referee #1, 07 Apr 2025
General Comments
This manuscript presents a Hybrid Ship Clutter Identification (HSCI) algorithm aimed at improving data quality in pulse compression polarimetric radar observations. The problem is timely and operationally relevant, especially as solid-state transmitters and long-pulse waveforms become more common in weather radar networks.
The algorithm is structured in two stages: (1) machine-learning-based mainlobe detection using a random forest classifier, and (2) heuristic sidelobe identification based on empirical analysis of pulse compression and antenna patterns. The authors evaluate the method using C-band radar data from the Kumpula radar and show promising qualitative and statistical performance.
While the contribution is promising, several aspects require major clarification and strengthening before this work can be considered for publication in AMT. The current form lacks sufficient quantitative validation, generalizability testing, and clear discussion of algorithm limitations. These issues must be addressed to ensure scientific rigor and reproducibility.
Major Concerns
Lack of Rigorous Validation Metrics
The model evaluation focuses on overall accuracy and a small overlap percentage in histograms. However, these metrics are insufficient for a classification task with imbalanced classes (e.g., 400 vs. 2,500 gates in the test set). The manuscript should report precision, recall, and F1-score, especially for ship clutter, as false negatives can lead to significant data quality issues, and false positives can unnecessarily degrade precipitation data.
Limited Generalization and Dataset Diversity
The random forest model is trained and tested on data derived from the same radar (Kumpula), location (Gulf of Finland), waveform (LFM), and limited events. There is no evidence that the algorithm generalizes to other waveform types (e.g., NLFM), elevation angles, or environmental conditions (e.g., high sea clutter, near-shore echoes, different clutter types). The authors must either test the model on independent cases or clearly state the generalization limitations.
Overreliance on Manual Labeling
Both the ship clutter and precipitation echo datasets are manually labeled, and the methodology for doing so is not sufficiently described. This introduces potential biases. What criteria were used to define clutter? Were multiple annotators involved? Was any inter-annotator agreement measured? These questions should be addressed or acknowledged as limitations.
Sidelobe Suppression Logic May Be Overly Aggressive
The PSD definition and filtering logic, especially the combination of velocity and SNR thresholds, may lead to over-removal of precipitation echoes, particularly in overlap regions. While some case studies suggest selective filtering is achieved, the possibility of precipitation loss is real and must be quantified. For example, is there a statistical estimate of how many precipitation gates were removed in mixed scenes?
No Independent Test Dataset or Cross-validation
The authors should demonstrate model robustness through k-fold cross-validation or by holding out an entire day or event as an independent test case. Without this, it is difficult to assess whether the model is overfitting or merely capturing spatiotemporal autocorrelation patterns.
Lack of Code or Reproducibility Path
For a method combining machine learning and empirical filtering, reproducibility is essential. At a minimum, a flowchart covering the entire algorithmic sequence, and pseudocode or a link to a repository, should be provided. Currently, implementation details are scattered and would be difficult for others to reproduce.
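As an illustration of the metrics and the event-wise hold-out requested above, a minimal Python sketch follows. It is not the authors' code: the function names `clutter_scores` and `leave_one_event_out`, the label convention (1 = ship clutter, 0 = precipitation), and the error counts are all hypothetical. Only the 400 vs. 2,500 class sizes come from the comment on validation metrics.

```python
from collections import namedtuple

Scores = namedtuple("Scores", "accuracy precision recall f1")


def clutter_scores(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1, treating `positive` as ship clutter."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores((tp + tn) / len(pairs), precision, recall, f1)


def leave_one_event_out(event_ids):
    """Yield (held_out_event, train_indices, test_indices), one split per event.

    Holding out a whole day or event keeps neighbouring gates of the same
    scene from appearing in both the training and the test set.
    """
    for event in sorted(set(event_ids)):
        train = [i for i, e in enumerate(event_ids) if e != event]
        test = [i for i, e in enumerate(event_ids) if e == event]
        yield event, train, test


# Hypothetical test set: 400 ship clutter gates (1), 2,500 precipitation gates (0).
y_true = [1] * 400 + [0] * 2500
# Hypothetical classifier: misses 100 clutter gates, flags 25 precipitation gates.
y_pred = [1] * 300 + [0] * 100 + [1] * 25 + [0] * 2475

scores = clutter_scores(y_true, y_pred)
print(scores)
```

With these invented counts the accuracy stays near 0.96 while the clutter recall is only 0.75, which is the kind of gap that overall accuracy hides on imbalanced data. The event-wise splitter is one simple way to realize the hold-out suggestion; any grouping key (date, scan, ship track) could play the role of `event_ids`.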
Minor Comments
- Feature Descriptions (Sect. 3.1.2): Clarify if all features are used as-is or normalized/scaled. Include distributions (box plots or ranges) for ship vs. precip echoes for added transparency.
- Figures 6 and 9: Use larger labels and more distinct color schemes. Consider adding contour lines for better interpretability.
- Discussion Section: Expand on the feasibility of signal-level suppression in commercial processors (e.g., how would this be integrated in RVP900 or similar systems?). A brief note on real-time considerations would also help.
Typographic Fixes:
- Line 134: “efficiency is exceedingly low” → consider “computational efficiency is low”
- Line 373: Remove extra parenthesis in “Palmer et al., 2023))”.
Citation: https://doi.org/10.5194/amt-2024-194-RC1
RC2: 'Comment on amt-2024-194', Anonymous Referee #2, 11 Apr 2025
The paper presents a novel method for detecting the mainlobe of ship clutter using multiple features and machine learning, instead of relying on a single parameter. While the manuscript is well-organized and generally easy to follow, it contains several significant issues that should be addressed before publication.
Major revisions:
- Precipitation Dataset Clarification (Line 110): The manuscript mentions that the precipitation dataset was manually extracted. However, it is unclear why this was necessary, especially since the algorithm is intended to work under both clear-sky and rainy conditions. If the algorithm is designed to perform differently depending on weather conditions, this distinction should be clearly stated and justified in the paper.
- Threshold Justification (Line 136): The choice of a 20 dBZ threshold is not explained. Where does this value come from? Additionally, does ship clutter with lower reflectivity (e.g., 15 dBZ) not affect data quality? This requires clarification.
- Manual Selection of Sidelobe Region (Line 244): The sidelobe region is manually defined within a 13.5 km radial range and 15° in the tangential direction. The origin and justification for these values are not provided. Were these same values used later in operational scenarios? Furthermore, what automated method replaces the manual process in operational applications?
- Case Study Placement (Section 3): Two case studies are introduced in the 'method' section. However, they seem more appropriate for Section 4, where other case analyses are presented. Additionally, the purpose of the second case (Fig. 8) is unclear: what characteristics is it meant to highlight, and why was it chosen?
- Discussion Section Issues: The final paragraph of the discussion section appears to be more suitable for the introduction or summary. Once this paragraph is removed, the discussion becomes quite short. It would benefit from more in-depth coverage of how the algorithm operates in operational cases, especially regarding the parts that were handled manually (e.g., sidelobe removal). Does the algorithm only remove sidelobes near the mainlobe, or is there a more general solution? These points should be explicitly addressed.
- Operational Performance for Winter Events: Since the algorithm is intended for operational use, it is strongly recommended to include at least one winter precipitation case to demonstrate its robustness in varying weather conditions.
Minor corrections:
- Figure 3 is explained in the introduction but appears on page 4. Consider repositioning it closer to the relevant text.
- In several instances, figures are referenced before they are introduced or explained (e.g., Line 321: “This selective filtering approach is demonstrated in Figs. 11b and 12d…”). However, Figure 12 is not described until Line 345. This disrupts the flow and may confuse readers. Please revise for consistency.
- Fig. 12c: It is not clear whether the plot shows the difference 'before-after' filtering or vice versa.