Atmospheric Measurement Techniques An interactive open-access journal of the European Geosciences Union
Preprints
https://doi.org/10.5194/amt-2019-495
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

  17 Feb 2020


Review status
A revised version of this preprint was accepted for the journal AMT and is expected to appear here in due course.

Classification of Lidar Measurements Using Supervised and Unsupervised Machine Learning Methods

Ghazal Farhani1, Robert J. Sica1, and Mark Joseph Daley2
  • 1Department of Physics and Astronomy, The University of Western Ontario, 1151 Richmond St., London, ON, N6A 3K7
  • 2Department of Computer Science, The Vector Institute for Artificial Intelligence, The University of Western Ontario, 1151 Richmond St., London, ON, N6A 3K7

Abstract. While it is relatively straightforward to automate the processing of lidar signals, it is more difficult to choose periods of "good" measurements to process. Groups use various ad hoc procedures, ranging from very simple (e.g. signal-to-noise ratio) to more complex (e.g. Wing et al., 2018), to perform a task which is easy to train humans to perform but is time consuming. Here, we use machine learning techniques to train the machine to sort the measurements before processing. The presented methods are generic and can be applied to most lidars. We test the techniques using measurements from the Purple Crow Lidar (PCL) system located in London, Canada. The PCL has over 200,000 raw scans in Rayleigh and Raman channels available for classification. We classify raw (level-0) lidar measurements as "clear" sky scans with strong lidar returns, "bad" scans, and scans which are significantly influenced by clouds or aerosol loads. We examined different supervised machine learning algorithms, including the random forest, the support vector machine, and gradient boosting trees, all of which can successfully classify scans. The algorithms were trained using about 1500 scans for each PCL channel, selected randomly from different nights of measurements in different years. The success rate of identification for all the channels is above 95 %. We also used the t-distributed stochastic neighbor embedding (t-SNE) method, which is an unsupervised algorithm, to cluster our lidar scans. Because t-SNE is a data-driven method in which no labelling of the training set is needed, it is an attractive algorithm for finding anomalies in lidar scans. The method has been tested on several nights of PCL measurements. The t-SNE can successfully cluster the PCL data scans into meaningful categories. To demonstrate the use of the technique, we have used the algorithm to identify stratospheric aerosol layers due to wildfires.
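The workflow outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes scikit-learn as the library (the abstract names none) and uses synthetic profiles in place of PCL level-0 scans. The names `synthetic_scans`, `N_BINS`, `N_PER_CLASS` and `LABELS`, the profile shapes, and all hyperparameters are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_BINS = 64         # hypothetical number of altitude bins per scan
N_PER_CLASS = 100   # hypothetical number of labelled scans per class
LABELS = ("clear", "cloud", "bad")


def synthetic_scans(n_per_class=N_PER_CLASS, n_bins=N_BINS):
    """Return (X, y): toy log-count profiles standing in for level-0 scans."""
    z = np.linspace(0.0, 1.0, n_bins)       # normalised altitude
    clear = 5.0 - 4.0 * z                   # smooth decay with altitude
    cloud = clear - 3.0 * (z > 0.4)         # return attenuated above a layer
    bad = np.full(n_bins, 1.0)              # flat, noise-dominated profile
    X = np.vstack([
        profile + 0.3 * rng.standard_normal((n_per_class, n_bins))
        for profile in (clear, cloud, bad)
    ])
    y = np.repeat(np.arange(len(LABELS)), n_per_class)
    return X, y


X, y = synthetic_scans()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Supervised step: the three classifier families named in the abstract.
models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": SVC(kernel="rbf", gamma="scale"),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # fraction classified correctly

# Unsupervised step: embed all scans in 2-D with t-SNE; no labels are used.
embedding = TSNE(
    n_components=2, perplexity=30.0, init="pca", random_state=0
).fit_transform(X)

for name, score in scores.items():
    print(f"{name}: test accuracy = {score:.3f}")
print("t-SNE embedding shape:", embedding.shape)
```

On real measurements the feature matrix would hold one raw scan per row, and the t-SNE embedding would be inspected visually or passed to a separate clustering step to flag unusual scans. The toy classes here are deliberately easy to separate, so the scores say nothing about performance on real lidar data.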


Interactive discussion

Status: closed


Viewed

Total article views: 391 (including HTML, PDF, and XML)
  • HTML: 287
  • PDF: 92
  • XML: 12
  • Total: 391
  • BibTeX: 13
  • EndNote: 15
Views and downloads (calculated since 17 Feb 2020)

Viewed (geographical distribution)

Total article views: 311 (including HTML, PDF, and XML) Thereof 306 with geography defined and 5 with unknown origin.

Latest update: 27 Nov 2020
Short summary
While it is relatively straightforward to automate the processing of lidar signals, it is difficult to automatically preprocess the measurements to distinguish between "good" and "bad" scans. It is easy to train humans to perform the task; however, given the growing number of measurements, it is a time-consuming, ongoing process. We have tested some machine learning algorithms for lidar signal classification and had success with both supervised and unsupervised methods.