the Creative Commons Attribution 4.0 License.
Mid-Atlantic Nocturnal Low-Level Jet Characteristics: A machine learning analysis of radar wind profiles
Abstract. This paper introduces a machine-learning-driven approach for automated Nocturnal Low-Level Jet (NLLJ) identification using observations of wind profiles from a Radar Wind Profiler (RWP). The work discussed here is an effort to lay the groundwork for a systematic study of the Mid-Atlantic NLLJ’s formation mechanisms and their influence on nocturnal and diurnal air quality in major urban regions by establishing a general framework of NLLJ features and characteristics with an identification algorithm. Leveraging a comprehensive wind profile dataset maintained by the Maryland Department of Environment’s RWP network, our methodology employs supervised machine learning techniques to isolate the features of the south-westerly NLLJ. This methodology was developed to illuminate spatiotemporal patterns and nuanced characteristics of NLLJ events, unveiling their significant role in shaping the planetary boundary layer. This paper discusses the construction of this methodology, its performance against known NLLJs in the current literature, intended usage, and a preliminary statistical analysis. First light results from this analysis have identified a total of 90 south-westerly NLLJs from May–September of 2017–2021 as captured by the RWP stationed in Beltsville, MD (39.05°, -76.87°, 135 m ASL). A composite of these 90 jets is presented to better illustrate many of the bulk parameters, such as core height, duration, and maximum wind speed, associated with the onset and decay of the Mid-Atlantic NLLJ. We hope our study equips researchers and policymakers with further means to monitor, predict, and address these nocturnal dynamics phenomena that frequently influence boundary layer composition and air quality in the U.S. Mid-Atlantic and Northeastern regions.
Status: final response (author comments only)
RC1: 'Comment on amt-2024-37', Anonymous Referee #2, 16 May 2024
General comments:
The document addresses scientific issues that are relevant to AMT, provided that some elements are clarified.
The scientific methods and assumptions seem valid but are not clearly outlined.
The results are not sufficient to support the interpretations and conclusions.
The description of experiments and calculations is not sufficiently complete and precise to allow their reproduction by fellow scientists.
The authors do not give proper credit to related work but clearly indicate their own contribution.
The title clearly reflects the contents of the paper.
The abstract provides a concise and complete summary.
The overall presentation is structured and clear.
The language is fluent and precise.
Some parts of the paper (figures) should be clarified.
The number and quality of references are not appropriate.
This article deals with a technique for identifying NLLJs using machine learning. Has machine learning already been used in this type of identification or is it an innovation? This should be specified and references introduced if not.
Automatic identification of NLLJs has already been developed using physics-based algorithms, so why use machine learning when these algorithms have already proved their worth? A paragraph could be added to discuss these aspects.
Many references do not appear in the bibliography (e.g. Delgado 2013, Weldegauber 2009, Caroll 2021, ...) while others present in the bibliography are not cited in the text (e.g. Bonner 1968, Dejong 2024, Doubler 2015, Hu 2013a & b, Karipot 2009, Liang 2018, Lima, 2018 & 2019, ...). Please, review the entire bibliography. As it stands, it is impossible to verify the veracity of everything written.
The context is well presented but is repeated many times in the text (for example: NLLJs are nocturnal events).
As far as the figures are concerned, each of the panels should be identified by a letter and the captions should be more detailed. The legends are not complete.
There is not enough detail on how the algorithm works to reproduce it.
Specific comments :
L.27 "noctural events" already introduced in L. 26.
L.37-39 Wind speed does not decrease as far as the free troposphere in Figure 2.
L.54 LLJs not defined
L.61 NLLJ already defined
L66. MD not defined
L.75 This paper is not based on a radar network but on a single radar forming part of a network.
L.78 identified by which radar?
Section 2 contains a single sub-section 2.1 and an excessively long introduction. Please, fix this section.
L.84 - 85 This paper is not based on a network of radars but on a single radar forming part of a network.
L.85 Is there a reference with the full characteristics of this RWP?
L 86. MD not defined
Figure 1: Is it necessary to indicate the location of the Cambridge and Cumberland RWPs? If so, highlight the location of the Beltsville RWP.
L.94 What is the value of the wind speed identified by sonde?
L.95 NOAA NCEP not defined.
L.95 Unlikely date and please add time.
L.96 Wind speed measured by sonde not known. The measured wind profile could be added to figure 1 in order to clearly see the jet.
L.99-110 This part should be included in the previous one, some things are repeated. It is not about data or method.
L.112 Figure 5 is too far from this paragraph.
L.116 Why are some data not available?
L.117 Are we to understand that these are daily files? This is not specified.
L.120 Only one sub-section follows, not several.
Figure 2(B) SPD not defined, present the curves in the legend to Figure 2. The figure should be centred on the NLLJ event in order to show its development clearly (from 20:00 UTC on 19 May, for example).
L.123-124 Insert a reference.
L.130 What time does night begin?
L.133 -134 repetition of encountered
L.137 what does a file represent? The number of columns and rows is irrelevant.
L 138. Section 2.2 is missing.
Section 3 should be expanded to provide a better understanding of how the algorithm works by focusing on its more detailed relevant phases. For example, section 3.1 could be the subject of a section in its own right, giving step-by-step details of how the algorithm works. Presented like this, it is difficult to understand clearly how it works. Note that there are no references in this key paragraph. At the end of the paragraph, the test results are not clear enough to be properly understood.
L.147 Detail this analysis.
L.148 Why use radial speeds when wind strength and direction are already taken into account? Why use the signal-to-noise ratio, what does it provide?
Figure 3, some elements are illegible and some acronyms are not defined.
Section 3.2 already contains results and could be introduced in section 4. In addition, this section only focuses on 2 cases, which does not seem sufficient to properly qualify the algorithm's performance. A more detailed study would enable us to test it more thoroughly by comparing the NLLJs identified by the algorithm with all the NLLJs identified by the manual method in a year other than the one used for training (the test set already identified, for example). This would make it possible to better characterise the algorithm's shortcomings, by quantifying the number of false events not taken into account, the % of missing data on average per event, etc. Without this kind of statistic, no conclusions can be drawn.
Simple post-processing could be used directly to eliminate outliers if no neighbours are present and to include all the data between the ground and the jet.
Figure 4: Each sub-panel should be indicated by a letter. Perhaps this figure should be split into two separate figures focusing on each event. As with figure 2, the events should be centred to better see their development.
L.218-220 Without seeing the winter months, such a conclusion cannot be drawn.
Please show the missing months in Figure 5. In addition, in Figure 5, it can be seen that May contains the most events and not June, July or August.
L.222-224 include references.
L.226 The year 2017 contains more events than 2019.
L.238-239 Give examples of synergy and cite references.
L 246 Core time not defined
L.247 Replace all « m/s » with m.s-1
L.257-259 What percentage of events does this represent? Using simple post-processing, why not exclude this erroneous data?
Section 4.3 As NLLJs are nocturnal events and sunset times vary according to the season, the data should be standardised according to these times. Otherwise, the morphology of NLLJs could not be correctly presented.
L.276 EDT is local time? Mention this earlier in the article and add sunset and sunrise hours on all figures.
Section 5: Some conclusions may need to be modified in the light of the above changes.
L.333 I-95 corridor not defined
L.346 delete « also »
Citation: https://doi.org/10.5194/amt-2024-37-RC1
AC1: 'Reply on RC1', Maurice Roots, 12 Sep 2024
RC1 (Referee #2):
General comments:
The document addresses scientific issues that are relevant to AMT, provided that some elements are clarified. The scientific methods and assumptions seem valid but are not clearly outlined. The results are not sufficient to support the interpretations and conclusions. The description of experiments and calculations is not sufficiently complete and precise to allow their reproduction by fellow scientists. The authors do not give proper credit to related work but clearly indicate their own contribution. The title clearly reflects the contents of the paper. The abstract provides a concise and complete summary. The overall presentation is structured and clear. The language is fluent and precise. Some parts of the paper (figures) should be clarified. The number and quality of references are not appropriate.
• Thank you for pointing out the need for clarification on our methods and assumptions for reproducibility by other researchers. A further discussion of related works is indeed relevant and pertinent, and we have expanded our literature review in the introduction. Our refined results now support the interpretations and conclusions.
This article deals with a technique for identifying NLLJs using machine learning. Has machine learning already been used in this type of identification or is it an innovation? This should be specified and references introduced if not.
• To our knowledge, at the time of submission and exploration of this work, no NLLJ isolation method utilized supervised machine learning on wind profiles. Thus, this work is an innovation that explores the use of machine learning for this task. For further context on this effort, please see the revised introduction and section 3.
Automatic identification of NLLJs has already been developed using physics-based algorithms, so why use machine learning when these algorithms have already proved their worth? A paragraph could be added to discuss these aspects.
• Several previous works have been published regarding the identification of Low-Level Jets in wind profiles. These methods have employed peak detection of wind speed maxima in single profiles with threshold criteria on coherent height, speed, direction, and duration. These methods are robust in their objective of identifying continuous low-level wind maxima. However, for our objective of detecting solely a specific type of low-level wind maximum for the Mid-Atlantic region, we have explored the use of supervised machine learning. Our study region encompasses complex terrain, with mountainous terrain towards the west and coastal plains towards the east. Our overall goal is to construct a fully automated machine learning algorithm for the network of wind profilers that is adept at isolating, classifying, and characterizing low-level wind regimes, and thus we report our exploration of supervised machine learning for this task of isolating NLLJs. We see this manuscript as the first logical step in publishing our fundamental methods, and we expect to bring other data resources into later versions of this process in subsequent manuscripts. Contrary to work published by other researchers, the conceptual model of our detection method presented here relies on single measured points in vertical and temporal space, together with the multiple dimensions of the dataset [wind speed (SPD), wind direction (DIR), radial velocity (RAD 1-5), and signal-to-noise ratio (SNR 1-5)]. This was attempted in order to mask the profiles for solely NLLJ activity, regardless of data gaps in the profile.
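For context, the threshold-based family of methods mentioned in this reply (peak detection of a wind speed maximum in a single profile) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' algorithm and not any specific published method: the function name `find_jet_core` and all threshold values are hypothetical, and the sketch assumes a gap-free profile sorted by increasing height.

```python
from typing import Optional, Sequence, Tuple


def find_jet_core(
    heights_m: Sequence[float],
    speeds_ms: Sequence[float],
    min_speed: float = 10.0,          # hypothetical minimum core speed (m s-1)
    min_falloff: float = 2.0,         # hypothetical required decrease above the core (m s-1)
    max_core_height: float = 1500.0,  # hypothetical upper limit for the core (m)
) -> Optional[Tuple[float, float]]:
    """Return (height, speed) of the lowest qualifying wind speed maximum, or None.

    Assumes heights_m is sorted in increasing order and contains no missing values.
    """
    for i in range(1, len(speeds_ms) - 1):
        if heights_m[i] > max_core_height:
            break  # cores above this height are not considered low-level
        speed = speeds_ms[i]
        if speed < min_speed:
            continue
        # Local maximum: no slower than the gate below, faster than the gate above.
        if not (speed >= speeds_ms[i - 1] and speed > speeds_ms[i + 1]):
            continue
        # Wind must decrease by at least min_falloff somewhere above the maximum.
        if speed - min(speeds_ms[i + 1:]) >= min_falloff:
            return heights_m[i], speed
    return None
```

A profile with missing range gates would break the local-maximum test in this sketch, which is the kind of data-gap sensitivity the reply gives as a motivation for a point-wise classifier.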
Many references do not appear in the bibliography (e.g. Delgado 2013, Weldegauber 2009, Caroll 2021, ...) while others present in the bibliography are not cited in the text (e.g. Bonner 1968, Dejong 2024, Doubler 2015, Hu 2013a & b, Karipot 2009, Liang 2018, Lima, 2018 & 2019, ...). Please, review the entire bibliography. As it stands, it is impossible to verify the veracity of everything written.
• Thank you for pointing this out. This is an oversight on our part from multiple versions of the manuscript. We present an updated references list in the revised manuscript.
The context is well presented but is repeated many times in the text (for example: NLLJs are nocturnal events).
• This too is an oversight on our part resulting from multiple versions of the manuscript. We have removed these unnecessary repetitions in the revised manuscript.
As far as the figures are concerned, each of the panels should be identified by a letter and the captions should be more detailed. The legends are not complete.
• We appreciate this comment to improve the clarity of our discussion. All of our panel figures have been updated to include lettering in the figure itself and where mentioned in the text.
There is not enough detail on how the algorithm works to reproduce it.
• We have extended the discussion of how the algorithm is developed. We hope that a longer discussion along with the mentioned packages used will make it easier for other researchers to follow along and be able to reproduce it.
Specific comments:
L.27 "noctural events" already introduced in L. 26.
● Removed
L.37-39 Wind speed does not decrease as far as the free troposphere in Figure 2.
● Clarified – Wind speed decreases with increasing altitude
L.54 LLJs not defined
● Corrected – Low-Level Jets introduced in the introduction
L.61 NLLJ already defined
● Removed
L66. MD not defined
● Corrected – MD introduced in the introduction section as Maryland (MD)
L.75 This paper is not based on a radar network but on a single radar forming part of a network.
● Clarified – This work uses Howard University – Beltsville Campus site which is a part of the Maryland Department of Environment Network of Radar Wind Profilers
L.78 identified by which radar?
● Clarified – In the Observations section we discuss the measurement site (i.e., Howard University – Beltsville Campus site)
Section 2 contains a single sub-section 2.1 and an excessively long introduction. Please, fix this section.
● Clarified – We have converted Section 2 into a discussion of the Observations and Measurement Site. Section 3 then is solely about the NLLJ isolation Algorithm
L.84 - 85 This paper is not based on a network of radars but on a single radar forming part of a network.
● Clarified – This work uses Howard University – Beltsville Campus site which is a part of the Maryland Department of Environment Network of Radar Wind Profilers
L.85 Is there a reference with the full characteristics of this RWP?
● Clarified – We have included the full name and model of the radar wind profiler.
L 86. MD not defined
● Corrected – MD introduced in the introduction section as Maryland (MD)
Figure 1: Is it necessary to indicate the location of the Cambridge and Cumberland RWPs? If so, highlight the location of the Beltsville RWP.
● Corrected - The other sites do not need to be shown.
L.94 What is the value of the wind speed identified by sonde?
● Clarified – We initially used the location of the wind speed maximum from the sonde to choose the pressure level needed for plotting the reanalysis wind contour
L.95 NOAA NCEP not defined.
● Removed – We will use ERA5 Reanalysis instead and include a definition
L.95 Unlikely date and please add time.
● Corrected – This is a typo and should say May 20, 2021
L.96 Wind speed measured by sonde not known. The measured wind profile could be added to figure 1 in order to clearly see the jet.
● Removed – We now use ERA5 Reanalysis
L.99-110 This part should be included in the previous one, some things are repeated. It is not about data or method.
● Corrected – We have converted Section 2 into a discussion of the Observations and Measurement Site. Section 3 then is solely about the NLLJ isolation Algorithm
L.112 Figure 5 is too far from this paragraph.
● Corrected – We have converted Section 2 into a discussion of the Observations and Measurement Site. Figure 5 will now be Figure 2 located in Section 2 to supplement the discussion of the wind profile observations dataset, and just before the discussion of the training dataset. Section 3 then is solely about the NLLJ isolation Algorithm (e.g. training, development, etc.)
L.116 Why are some data not available?
● Clarified - The grey lines indicate the areas where the BELT daily file was available from the MDE record, while the red lines indicate days that are unavailable because of instrument failure or scheduled maintenance.
L.117 Are we to understand that these are daily files? This is not specified.
● Clarified - The grey lines indicate the areas where the BELT daily file was available from the MDE record, while the red lines indicate days that are unavailable because of instrument failure or scheduled maintenance.
L.120 Only one sub-section follows, not several.
● Removed
Figure 2(B) SPD not defined, present the curves in the legend to Figure 2. The figure should be centred on the NLLJ event in order to show its development clearly (from 20:00 UTC on 19 May, for example).
● Clarified - [wind speed (SPD), wind direction (DIR), radial velocity (RAD 1-5), and signal-to-noise ratio (SNR 1- 5)].
L.123-124 Insert a reference.
● Clarified – To gather a suitable dataset for machine learning we have compiled scenarios expected in operation (i.e. incomplete daily files, missing data, large scale weather systems, etc.).
L.130 What time does night begin?
● Clarified – Sunrise and sunset times have been added to our figures as vertical dashed lines.
L.133 -134 repetition of encountered
● Removed
L.137 what does a file represent? The number of columns and rows is irrelevant.
● Clarified - These RWP instruments measure the radial velocity of wind from one zenith and four azimuthal beams at 915 MHz. These are used to calculate the horizontal speed and direction with sub-100-meter vertical resolution (100 m – 3000 m AGL) at a sub-30-minute temporal resolution. Each file contains a continuous daily profile of [Wind Speed (SPD), Wind Direction (DIR), Radial Velocity (RAD 1-5), Number of Profiles to Consensus (CNT 1-5), Signal-to-Noise Ratio (SNR 1- 5)].
L 138. Section 2.2 is missing.
● Removed
Section 3 should be expanded to provide a better understanding of how the algorithm works by focusing on its more detailed relevant phases. For example, section 3.1 could be the subject of a section in its own right, giving step-by-step details of how the algorithm works. Presented like this, it is difficult to understand clearly how it works. Note that there are no references in this key paragraph. At the end of the paragraph, the test results are not clear enough to be properly understood.
● We will extend the discussion of how the algorithm is developed. We hope that a longer discussion along with the mentioned packages used will make it easier for other researchers to follow along and be able to reproduce it.
L.147 Detail this analysis.
● Clarified – The analysis description and definitions are expounded
L.148 Why use radial speeds when wind strength and direction are already taken into account? Why use the signal-to-noise ratio, what does it provide?
• Clarified – We used the radial speed as well as the speed and direction to supply higher dimensional analysis to the algorithm. When studying the covariance matrix, it appears to be relevant to the detection but does not create collinearity. Previous researchers in our group have used the signal-to-noise ratio from radar wind profilers to demonstrate boundary layer height detection and thus we supply this as well. However, the signal-to-noise ratio of each beam does create collinearity as it is dependent on the thermodynamic profile of the atmosphere which does not change much with direction in the field of view of the instrument. Thus, we take only the average signal-to-noise ratio of all five beams.
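The collinearity argument in this reply can be illustrated with a small sketch: if the per-beam signal-to-noise ratios are almost perfectly correlated with one another, they can be collapsed into a single beam-averaged feature. The code below is only an illustration under stated assumptions, not the authors' preprocessing; the function names and the sample values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_snr(snr_beams: Sequence[Sequence[float]]) -> List[float]:
    """Average the SNR over all beams, sample by sample."""
    return [sum(values) / len(values) for values in zip(*snr_beams)]


# Hypothetical SNR values (dB) for five beams at three range gates.
snr_beams = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [9.0, 11.0, 13.0],
    [10.0, 12.0, 14.0],
    [10.0, 12.0, 14.0],
]
beam_correlation = pearson(snr_beams[0], snr_beams[1])  # near 1 means redundant features
snr_feature = mean_snr(snr_beams)                        # single feature replacing SNR 1-5
```

In this made-up example the beams differ only by a constant offset, so their correlation is essentially 1 and the averaged series carries the same information as any single beam.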
Figure 3, some elements are illegible and some acronyms are not defined.
● Clarified – The figure is adjusted for height resolution and structural clarity
Section 3.2 already contains results and could be introduced in section 4. In addition, this section only focuses on 2 cases, which does not seem sufficient to properly qualify the algorithm's performance. A more detailed study would enable us to test it more thoroughly by comparing the NLLJs identified by the algorithm with all the NLLJs identified by the manual method in a year other than the one used for training (the test set already identified, for example). This would make it possible to better characterise the algorithm's shortcomings, by quantifying the number of false events not taken into account, the % of missing data on average per event, etc. Without this kind of statistic, no conclusions can be drawn.
• Clarified – There is no absolute truth value for where an NLLJ is present in the wind profiles. This is why we rely on qualitative performance testing of the algorithm against NLLJs previously reported and depicted by Delgado et al. (2013), Weldegauber (2009), and Sullivan et al. (2017), which were captured by the same instrument and station (i.e., BELT RWP). We also include a figure of the confusion matrix of the algorithm training and testing instead of simply listing its testing results as before.
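As a reading aid for the confusion matrix mentioned in this reply and for the macro-averaged F1 score that Referee #1 asks about, the sketch below shows how both quantities are commonly computed for a point-wise classifier. It is a generic illustration, not the authors' evaluation code; the label convention (1 = NLLJ point, 0 = background) and the sample labels are assumptions.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     labels: Sequence[int] = (0, 1)) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def macro_f1(matrix: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(matrix[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical labels: 1 = NLLJ point, 0 = background.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
cm = confusion_matrix(y_true, y_pred)
score = macro_f1(cm)
```

Because the macro average weights each class equally, a rare NLLJ class counts as much as the abundant background class, which is one reason it is often preferred over plain accuracy for imbalanced masks.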
Simple post-processing could be used directly to eliminate outliers if no neighbours are present and to include all the data between the ground and the jet.
• Clarified – This was mentioned in the manuscript as an option for actual use of the algorithm. However, we decided to leave these errors in so we may discuss its shortcomings and show them visually in the isolation and how this will propagate into the later analysis if not removed.
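The first part of the referee's suggestion, removing flagged points that have no flagged neighbours, could look roughly like the sketch below. It is a hypothetical illustration of such a post-processing step, which the authors state they deliberately did not apply; the function name and the 8-neighbour definition are assumptions, and the mask is assumed to be a rectangular time-by-height grid of booleans.

```python
from typing import List, Sequence


def drop_isolated(mask: Sequence[Sequence[bool]]) -> List[List[bool]]:
    """Clear flagged points that have no flagged neighbour among the 8 adjacent cells."""
    n_time = len(mask)
    n_height = len(mask[0]) if mask else 0
    cleaned = [list(row) for row in mask]
    for t in range(n_time):
        for z in range(n_height):
            if not mask[t][z]:
                continue
            has_neighbour = any(
                mask[tt][zz]
                for tt in range(max(0, t - 1), min(n_time, t + 2))
                for zz in range(max(0, z - 1), min(n_height, z + 2))
                if (tt, zz) != (t, z)
            )
            if not has_neighbour:
                cleaned[t][z] = False  # isolated detection, treated as an outlier
    return cleaned
```

The neighbour test reads from the original mask rather than the cleaned copy, so the result does not depend on the order in which points are visited.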
Figure 4: Each sub-panel should be indicated by a letter. Perhaps this figure should be split into two separate figures focusing on each event. As with figure 2, the events should be centred to better see their development.
• We appreciate this comment to improve the clarity of our discussion. All of our panel figures have been lettered in the figure itself and where mentioned in the text.
L.218-220 Without seeing the winter months, such a conclusion cannot be drawn.
• Clarified – We only seek to search for the warm-season Mid-Atlantic NLLJ as we are primarily interested in its implications for air quality. As we develop this approach further, we plan to expand into the full year, but many of those low-level wind maxima are of a different classification.
Please show the missing months in Figure 5. In addition, in Figure 5, it can be seen that May contains the most events and not June, July or August.
• Clarified – See above response
L.222-224 include references.
● Corrected
L.226 The year 2017 contains more events than 2019.
● Corrected
L.238-239 Give examples of synergy and cite references.
● Clarified – The synergy we seek is from all available observational datasets in our study region (e.g., Aerosol and Ozone Lidars, Sondes, AERONET, Pandora) along with Reanalysis Datasets.
L 246 Core time not defined
● Clarified – The analysis description and definitions are expounded
L.247 Replace all « m/s » with m.s-1
● Corrected
L.257-259 What percentage of events does this represent? Using simple post-processing, why not exclude this erroneous data?
● Clarified – This was mentioned in the manuscript as an option for actual use of the algorithm. However, we decided to leave these errors in so we may discuss its shortcomings and show them visually in the isolation and how this will propagate into the later analysis if not removed.
Section 4.3 As NLLJs are nocturnal events and sunset times vary according to the season, the data should be standardised according to these times. Otherwise, the morphology of NLLJs could not be correctly presented.
● Corrected – Thank you for this insight. We have adjusted the plotting to be centered and scaled around sunset and sunrise.
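One simple way to standardise event times against a night whose length changes through the season is to express each timestamp as a fraction of the sunset-to-sunrise interval. The sketch below illustrates that idea only; it is not necessarily the scaling the authors adopted, and the function name and the example times are hypothetical.

```python
from datetime import datetime


def night_fraction(t: datetime, sunset: datetime, sunrise: datetime) -> float:
    """Map a timestamp to 0 at sunset and 1 at the following sunrise.

    Values below 0 fall before sunset and values above 1 fall after sunrise.
    """
    night_length = (sunrise - sunset).total_seconds()
    if night_length <= 0:
        raise ValueError("sunrise must be later than sunset")
    return (t - sunset).total_seconds() / night_length


# Hypothetical times for a single night (all in the same time zone).
sunset = datetime(2021, 5, 19, 20, 0)
sunrise = datetime(2021, 5, 20, 6, 0)
core_time = datetime(2021, 5, 20, 1, 0)
fraction = night_fraction(core_time, sunset, sunrise)
```

With this scaling, jets from May and from July can be composited on a common axis even though the nights differ in length.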
L.276 EDT is local time? Mention this earlier in the article and add sunset and sunrise hours on all figures.
● Corrected – EDT stands for Eastern Daylight Time and will be mentioned before the Figures.
Section 5: Some conclusions may need to be modified in the light of the above changes.
● Noted – The conclusions are updated with discussion from the adjustments to the figures
L.333 I-95 corridor not defined
● Clarified – I-95 refers to the U.S. Interstate 95 highway which spans the Eastern Coast of the United States. The I-95 corridor refers to the portion of this highway that spans the most populated portion of the United States – Washington, DC to New York City, New York.
L.346 delete « also »
● Corrected
RC2: 'Comment on amt-2024-37', Anonymous Referee #1, 13 Jul 2024
The study described in the manuscript is the development of an ML-based algorithm for the identification of nocturnal low level jet events in Beltsville, MD from radar wind profiler data. The wind profiler data is used to identify the events and to characterize the wind characteristics and seasonality of events at this location. There are some major issues with the manuscript that I explain below related to the articulation of the need for the ML-based algorithm, the methods used for evaluation of the algorithm and finally with the claims made in the summary that are not based on findings of the study. Based on these issues I recommend publication of this manuscript only after the major revisions detailed below.
Major Issues:
1) Motivation of the need for a new NLLJ identification algorithm - The manuscript quite clearly cites the previous studies that have examined NLLJs in the Mid-Atlantic and explains how the events are identified. These events are used as part of the evaluation of the ML-based algorithm developed in the study. If there is a robust enough method to identify these events, robust enough to be used to evaluate the ML model(s), why is there a need for an ML-based identification method at all? Please articulate this, that is, what the benefit of an ML based algorithm is and why the present method is inadequate.
2) The method for the identification of the events for training and then for evaluation is not well explained in the manuscript as written. Based on what is written, it appears that a year's worth of RWP data and a pre-defined set of NLLJ events were used as training, and then the evaluation was done based on a subjective "by eye" examination of a selection of events found in the literature. The algorithm is then put to use for the 2017-2021 period and the events' wind speed characterized. This is not a robust training and evaluation method and should be improved before the study is published.
3) The manuscript's introduction contains a description of LLJ events and their characteristics from the literature that include inertial oscillations, temperature profiles and wind characteristics, as well as the influence of these events on the local atmosphere. Both the introduction and the summary refer to the study as characterizing NLLJ events and helping to understand them, but the characterization here is limited to wind characteristics. I recommend the use of some auxiliary dataset (perhaps a reanalysis) to characterize the events properly once they are identified by the algorithm.
Line by line:
Line 46 - "It is believed that the mid-Atlantic NLLJ is akin to the SGP NLLJ..."
Need a reference here or say that you will show this here.
Line 96 - "...clear disagreement..." - what clear disagreement is being referred to? between what?
Line 98 - NARR is not an operational model - it's an analysis (reanalysis).
{Fig 1 - what is the shading? 900 mb wind speed from NARR? Also - manuscript says for the case of May 20, 2024 and the figure says "composite" - what is plotted?}
Line 114 - This section talks about Figure 5 (before any manuscript mention of figs 2-4) - please reorder the figures to be consistent with the order they are referred to in the manuscript
Lines 111-121 - should move to inside section 2.1.
Lines 127-129 - There is not enough explanation here about the "inflection points", what they are and why they are important. I assume this detail is included in Zhang et al 2006, but some more of the detail is needed here.
Line 154 - "...visually conceptualized in Figure 3." How/why is the data pre-processing included in the algorithm execution loop? is this done more than once?
Line 176 - Please explain what an f1 macro test is, what the scores mean, and how this was evaluated. Alternatively, remove this statement.
Line 186 - "...more than satisfactory..." is not quantifiable, particularly when the algorithm testing is by visual inspection (of 50 cases used for training or for all the cases identified in fig 5?).
Line 211 - Why does the present study not include the "ongoing model refinement"?
line 315 - The connection to synoptic situation not established in study - connection to season, yes.
line 317 - "..understanding the atmospheric at play..." was not part of the study, and a connection to predictive capability was not established.
line 322 - "critical characteristics"... also not established.
Citation: https://doi.org/10.5194/amt-2024-37-RC2
AC2: 'Reply on RC2', Maurice Roots, 12 Sep 2024
RC2 (Referee #1)
The study described in the manuscript is the development of an ML-based algorithm for the identification of nocturnal low level jet events in Beltsville, MD from radar wind profiler data. The wind profiler data is used to identify the events and to characterize the wind characteristics and seasonality of events at this location. There are some major issues with the manuscript that I explain below related to the articulation of the need for the ML-based algorithm, the methods used for evaluation of the algorithm and finally with the claims made in the summary that are not based on findings of the study. Based on these issues I recommend publication of this manuscript only after the major revisions detailed below.
Major Issues:
1) Motivation of the need for a new NLLJ identification algorithm - The manuscript quite clearly cites the previous studies that have examined NLLJs in the Mid-Atlantic and explains how the events are identified. These events are used as part of the evaluation of the ML-based algorithm developed in the study. If there is a robust enough method to identify these events, robust enough to be used to evaluate the ML model(s), why is there a need for an ML-based identification method at all? Please articulate this, that is, what the benefit of an ML based algorithm is and why the present method is inadequate.
• Thank you for drawing attention to our oversight in the manuscript. We have addressed a similar concern from Referee #2 (see lines 22 - 25 of this document) and have elaborated on this in the revision, largely in the introduction and section 3.
2) The method for the identification of the events for training and then for evaluation is not well explained in the manuscript as written. Based on what is written, it appears that a year's worth of RWP data and a pre-defined set of NLLJ events were used as training, and then the evaluation was done based on a subjective "by eye" examination of a selection of events found in the literature. The algorithm is then put to use for the 2017-2021 period and the events' wind speed characterized. This is not a robust training and evaluation method and should be improved before the study is published.
• This is an important concern that you and Referee #2 both shared (see lines 185 – 190 of this document). To clarify: the training dataset is used for both training and testing. Our oversight was omitting the results of the testing against the gradient method, out of concern that they might be outside the scope of the manuscript and journal. The “by-eye” testing against previously reported and depicted NLLJs is a performance test of the trained algorithm, using qualitative inspection by experts on the Mid-Atlantic NLLJ, because there is no “absolute truth” validation set.
3) The manuscript's introduction contains a description of LLJ events and their characteristics from the literature that include inertial oscillations, temperature profiles and wind characteristics, as well as the influence of these events on the local atmosphere. Both the introduction and the summary refer to the study as characterizing NLLJ events and helping to understand them, but the characterization here is limited to wind characteristics. I recommend the use of some auxiliary dataset (perhaps a reanalysis) to characterize the events properly once they are identified by the algorithm.
• Our contribution to the study of the Mid-Atlantic NLLJ is in two phases. First, to establish the groundwork for automated NLLJ isolation and preliminary statistics and morphology. Second, to establish a long-term (~20 year) climatology, formation, and impact analysis of NLLJ events in the Mid-Atlantic that combines all available datasets in the study area (see lines 215 – 217 of this document). This endeavor is too large to conduct “by-eye”, and thus we sought to develop an automated system that will classify wind regimes; the exploration of machine learning for this task of NLLJ isolation is described in the manuscript.
Line by line:
Line 46 - "It is believed that the mid-Atlantic NLLJ is akin to the SGP NLLJ..."
Need a reference here or say that you will show this here.
● Corrected
Line 96 - "...clear disagreement..." - what clear disagreement is being referred to? between what?
● Clarified – Disagreement in the Wind Speeds (Observations vs Reanalysis).
Line 98 - NARR is not an operational model - it's an analysis (reanalysis).
● Removed – We now use ERA5 Reanalysis
{Fig 1 - what is the shading? 900 mb wind speed from NARR? Also - the manuscript says this is for the case of May 20, 2024 and the figure says "composite" - what is plotted?}
● Removed and Clarified – We now use ERA5 Reanalysis with more described figures
Line 114 - This section talks about Figure 5 (before any manuscript mention of figs 2-4) - please reorder the figures to be consistent with the order they are referred to in the manuscript
● Corrected – We have converted Section 2 into a discussion of the Observations and Measurement Site. Figure 5 will now be Figure 2, located in Section 2 to supplement the discussion of the wind profile observations dataset, just before the discussion of the training dataset. Section 3 is then solely about the NLLJ isolation algorithm (e.g. training, development, etc.)
Lines 111-121 - should move to inside section 2.1.
● Corrected – We have converted Section 2 into a discussion of the Observations and Measurement Site. Section 3 is then solely about the NLLJ isolation algorithm
Lines 127-129 - There is not enough explanation here about the "inflection points", what they are and why they are important. I assume this detail is included in Zhang et al 2006, but some more of the detail is needed here.
● Clarified - The training dataset for this attempt was hand-selected from NLLJ events during 2021, while the validation dataset was selected from events previously reported and depicted by Delgado et al. (2013), Weldegauber (2009), and Sullivan et al. (2017) that were captured by the same instrument and station (i.e. BELT RWP). To gather a suitable dataset for machine learning, we compiled scenarios expected in operation (i.e. incomplete daily files, missing data, large-scale weather systems, etc.). A manual and rudimentary isolation method was applied using gradient detection solely on the southerly winds (180 – 270 degrees from North) with maxima greater than 5 m s-1 in both time and altitude to capture the evolution and vertical extent of the NLLJ. This approach is demonstrated in Figure 3, where (A) depicts the final isolated NLLJ events from the speed and direction profiles (C and D), and (B) shows the gradient detection in the temporal evolution. This method takes the wind speed evolution averaged from 0 - 2000 m, which is then interpolated and smoothed. The resulting time series is used to find the first positive gradient and the last negative gradient, which are taken as the start and end of the NLLJ event. This process is then repeated for the vertical extent, using each profile to find the top and bottom at each time step. We found that manual tuning of the thresholds on time constraint, continuity, and direction evolution was important for isolating NLLJs but required attention in many different cases; thus, we used the well-isolated cases from this method as a training set for the supervised machine learning ensemble. The training set comprises 50 NLLJ events that were sufficiently isolated and 50 events that either contained no low-level wind maxima or contained low-level wind maxima that we do not consider LLJs relevant to this study for reasons of direction or evolution.
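For readers who want a concrete picture of the gradient detection described above, the following is a minimal sketch in plain Python, not the code used in the study. It assumes wind speed is stored as one list per time step with `None` for missing gates; the function names, the 3-point moving average, and the linear gap filling are illustrative assumptions, while the 0 - 2000 m layer and the 5 m s-1 threshold come from the description above.

```python
def layer_mean_speed(profiles, heights, top=2000.0):
    """Mean wind speed between the surface and `top` (m) for each time step."""
    series = []
    for profile in profiles:
        vals = [s for s, z in zip(profile, heights) if z <= top and s is not None]
        series.append(sum(vals) / len(vals) if vals else None)
    return series


def fill_gaps(series):
    """Linearly interpolate over None gaps; edge gaps take the nearest valid value."""
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        return []
    filled = list(series)
    for i in range(len(series)):
        if filled[i] is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def moving_average(values, window=3):
    """Centered moving average; the edges average over the available neighbours."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def find_event_bounds(series, min_peak=5.0, window=3):
    """Return (start, end) indices from the first positive and last negative gradient.

    Returns None when the smoothed maximum does not exceed `min_peak` (m s-1)
    or when no rise followed by a decay is found.
    """
    smooth = moving_average(fill_gaps(series), window)
    if not smooth or max(smooth) <= min_peak:
        return None
    gradient = [b - a for a, b in zip(smooth, smooth[1:])]
    rising = [i for i, g in enumerate(gradient) if g > 0]
    falling = [i for i, g in enumerate(gradient) if g < 0]
    if not rising or not falling:
        return None
    start, end = rising[0], falling[-1] + 1
    return (start, end) if end > start else None
```

The same `find_event_bounds` call could in principle be applied to a single vertical profile (speed versus altitude) to estimate the bottom and top of the jet at each time step, which mirrors the "repeated for the vertical extent" step. The direction filter (180 – 270 degrees) and the continuity thresholds mentioned above are omitted from this sketch.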
Line 154 - "...visually conceptualized in Figure 3." How/why is the data pre-processing included in the algorithm execution loop? is this done more than once?
● Clarified - The pre-processing step is merely a set of transforms that format the data for analysis by the model. It is always necessary so that the model has the data in the expected format. Since the model works on single measurement points, it is independent of both vertical and temporal resolution, but the input needs to be a properly formatted matrix for analysis.
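To illustrate why a point-wise model is independent of the grid resolution, here is a minimal sketch of one possible formatting transform. It is an assumption made for illustration only: the function name `to_point_matrix` and the column layout (time, height, speed, direction) are hypothetical and are not taken from the manuscript.

```python
def to_point_matrix(times, heights, speed, direction):
    """Flatten a time-height grid into one row per valid measurement point.

    Hypothetical column order: [time, height, speed, direction].
    Points with a missing speed or direction (None) are skipped.
    """
    rows = []
    for i, t in enumerate(times):
        for j, z in enumerate(heights):
            s, d = speed[i][j], direction[i][j]
            if s is None or d is None:
                continue
            rows.append([t, z, s, d])
    return rows


# Example: two profiles with three range gates each, two gates missing.
times = [0.0, 1.0]
heights = [100, 200, 300]
speed = [[5, 6, None], [7, 8, 9]]
direction = [[200, 210, 220], [190, None, 230]]
matrix = to_point_matrix(times, heights, speed, direction)
```

Because each row carries its own time and height, the number of range gates or profiles per day only changes how many rows are produced, not the shape of a single sample.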
Line 176 - Please explain what an f1 macro test is, what the scores mean, and how this was evaluated. Alternatively, remove this statement.
● Clarified - We will extend the discussion of how the algorithm is developed. We hope that a longer discussion, along with the mentioned packages used, will make it easier for other researchers to follow along and reproduce it. We elect to keep this statement in the manuscript so that our methods may be reproduced by other researchers. It is important to note the scoring method used in training the algorithm. F1-macro is a common scoring metric for supervised machine learning classifiers: the F1 score is computed for each class and the per-class scores are averaged with equal weight. We will explain it in the revised text, together with the discussion of the training and preliminary testing results.
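As a brief illustration of what the F1-macro score measures, the sketch below computes it from scratch in plain Python. It follows the standard definition (per-class F1 = 2TP / (2TP + FP + FN), averaged without class weighting) and is not the authors' scoring code; the labels 1 = "NLLJ" and 0 = "not NLLJ" are assumed for the example. In scikit-learn the equivalent call would be `f1_score(y_true, y_pred, average="macro")`.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes a score of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# A classifier that never predicts the jet class (1) on an imbalanced sample.
y_true = [1, 0, 0, 0]
y_pred = [0, 0, 0, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = f1_macro(y_true, y_pred)
```

In this example the accuracy is 0.75 while the F1-macro is 3/7 (about 0.43), because the missed jet class contributes a per-class F1 of zero. This is why a macro-averaged score is informative when jet and non-jet points are unevenly represented.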
Line 186 - "...more than satisfactory..." is not quantifiable, particularly when the algorithm testing is by visual inspection (of 50 cases used for training or for all the cases identified in fig 5?).
● Clarified (see lines 277 – 290 of this document)
Line 211 - Why does the present study not include the "ongoing model refinement"?
● Clarified – We feel that the exploration of supervised machine learning for this task is complete, and we seek to continue its use to further our overall goal (see lines 289 – 296 of this document). We feel that only this portion of the overall goal is sufficient to pursue publication.
line 315 - The connection to synoptic situation not established in study - connection to season, yes.
● Removed – We explore synoptic situation in our project.
line 317 - "..understanding the atmospheric at play..." was not part of the study, and a connection to predictive capability was not established.
● Clarified – This is a pursuit of our next project.
322 - "critical characteristics"... also not established.
• Clarified – We have included further discussion of our motives and nomenclature in the introduction section. “Critical Characteristics” - maximum wind speed, height of maximum, duration, wind direction, etc.
AC2: 'Reply on RC2', Maurice Roots, 12 Sep 2024