Introduction
Volcanic eruptions pose a global hazard due to the potential for emissions
to be entrained into the upper atmosphere and transported globally.
Emissions from volcanoes can have significant impacts on human health
(Delmelle et al., 2002; Hansell and Oppenheimer, 2004), on the aviation
industry (Miller and Casadevall, 1999; Prata, 2009), and on atmospheric
radiative transfer as seen following the eruption of Mt Pinatubo (Self et
al., 1993). In order to mitigate the possible impacts of volcanic eruptions,
timely warning of events is essential. Since the installation of a global
network of ground-based monitoring stations is both costly and impractical,
satellite-based remote sensing data are currently used to provide the
spatial and temporal coverage necessary for near-real-time (NRT) monitoring
of volcanic eruption clouds (Brenot et al., 2014). Existing volcanic cloud
detection techniques employ a threshold approach to identify volcanic
eruptions, but this can limit their capabilities with respect to smaller
events. The following work outlines a method to distinguish smaller volcanic
plumes through the implementation of a background correction factor. The
Ozone Monitoring Instrument (OMI), launched on NASA's Aura satellite in July
2004, provides near-global daily monitoring of multiple atmospheric trace
gases with absorption features in the ultraviolet (UV) spectral band. OMI
was designed to supersede the Total Ozone Mapping Spectrometer (TOMS)
instrument and provide higher-spatial-resolution data than were previously
available. Strong absorption bands in the UV allow sulfur dioxide
(SO2) to be discerned by instruments designed to measure ozone
(Krueger, 1983). The capability of satellite-based volcanic SO2
detection was first demonstrated following the eruption of El Chichón in
1982 (Krueger, 1983), leading to the implementation of satellite-based UV
measurements as a volcano monitoring tool (Schneider et al., 1999; Krueger
et al., 2008; Carn et al., 2016). The low spatial resolution of the TOMS
instruments precluded the measurement of SO2 in small volcanic
eruptions (Carn et al., 2003). OMI's higher spatial resolution (13×24 km
at nadir) permits detection of smaller eruptions and passive volcanic
degassing of SO2, whilst providing daily, global coverage (Krotkov et
al., 2006; Carn et al., 2013, 2016). This work utilises the continuous
global coverage of OMI to identify and automatically classify volcanic
eruptions based on common characteristics established through the use of
statistical modelling.
Existing alert systems
The Support to Aviation Control Service (SACS) is one
operational alert system currently used to detect SO2 and ash emitted
from volcanoes (Brenot et al., 2012). This service provides NRT
alerts of anomalously high SO2 amounts and ash indices recorded by
four UV instruments: OMI, the Global Ozone Monitoring Experiment-2 (GOME-2;
flown on board two meteorological satellites: MetOp-A and MetOp-B), and the
Ozone Mapping and Profiler Suite (OMPS). Three infrared (IR) instruments are
also used: the Atmospheric Infrared Sounder (AIRS) and Infrared Atmospheric
Sounding Interferometer (IASI; also flown on the MetOp-A and -B platforms).
The method of SO2 alert generation used by SACS (Brenot et al., 2014;
http://sacs.aeronomie.be/info/index.php) involves the initial
identification of an anomalously high SO2 column amount (> 2 DU).
When a pixel is flagged, the area is analysed in greater detail, and an alert
is only generated if more than half of the neighbouring pixels also display
high SO2 values (> 2 DU). The technique developed by Brenot et
al. (2014) is subject to certain limitations when utilising UV data,
including the systematic noise in the data leading to false alerts and the
restriction of retrievals to those that assume a SO2 plume altitude in
the lower stratosphere (STL). Therefore, in the development of an algorithm
based on OMI data we aim to account for variable background SO2 levels
and systematic noise. Additionally, SO2 retrievals representing a lower
plume altitude are used, in an attempt to resolve plumes with lower SO2
amounts, lower injection altitude, and more diffuse characteristics.
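The two-step check described above (flag a pixel above 2 DU, then require more than half of its neighbours to also exceed 2 DU) might be sketched as follows. This is only an illustration on a regular 2-D grid: the grid layout, the 8-pixel neighbourhood, and the function name `sacs_style_alerts` are assumptions made here, not details of the SACS implementation.

```python
from typing import List, Tuple

THRESHOLD_DU = 2.0  # column amount treated as anomalously high (from the text)


def sacs_style_alerts(columns: List[List[float]]) -> List[Tuple[int, int]]:
    """Return (row, col) of flagged pixels confirmed by their neighbourhood.

    Assumption: pixels lie on a regular grid and "neighbouring pixels" means
    the up-to-8 surrounding cells; the operational system may differ.
    """
    n_rows, n_cols = len(columns), len(columns[0])
    alerts = []
    for r in range(n_rows):
        for c in range(n_cols):
            if columns[r][c] <= THRESHOLD_DU:
                continue  # the pixel itself is not anomalous
            neighbours = [
                columns[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            high = sum(1 for value in neighbours if value > THRESHOLD_DU)
            # Alert only if more than half of the neighbours are also high.
            if high > len(neighbours) / 2:
                alerts.append((r, c))
    return alerts
```

Under these assumptions an isolated noisy pixel never raises an alert, which is the purpose of the neighbourhood test, but a small plume covering only a few pixels can also be rejected.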
Methodology
To distinguish volcanic events from background SO2 levels, the
characteristics of volcanic emissions must be assessed. In this work we
implement statistical classification techniques such as logistic regression
to isolate observed variations in volcanic SO2 plumes compared to
ambient SO2 levels. The use of statistical modelling also facilitates
the review of misclassified events, providing some insight into the
limitations of the detection algorithm. The aim of this work is to
distinguish volcanic events from control samples with a binary
classification algorithm and identify the strengths and weaknesses of the
resulting methodology. To develop a model and identify characteristics
specific to volcanic plumes, three datasets must be compiled. Firstly, a
database of emissions corresponding to known volcanic eruptions must be
obtained, with a complementary dataset incorporating measurements collected
during known inactive periods against which to compare volcanic events.
These two elements are used to train the statistical models developed and
establish whether volcanic events can be distinguished from non-volcanic ones.
The final dataset is a collection of volcanic and non-volcanic events not
used to train the original model, against which the efficacy of the model
can be tested. The following section details the data collection and
statistical applications used.
Satellite data
For this analysis we use OMI Level 2 total column SO2 (OMSO2) data,
which are publicly available from NASA Goddard Earth Sciences (GES) Data and
Information Services Center (DISC;
http://disc.sci.gsfc.nasa.gov/Aura/data-holdings/OMI/omso2_v003.shtml).
These data provide global coverage with a temporal resolution of 1 day at low
latitudes and increasing daily observations towards the poles, where
measurement swaths overlap. Until June 2016, OMSO2 data provided volcanic
SO2 total column amounts calculated using a linear fit (LF) algorithm
(Yang et al., 2007), which are the products used here. Previous works have
provided in-depth descriptions of the OMI retrieval algorithms (Carn et al.,
2013; Krotkov et al., 2006; McCormick et al., 2013; Yang et al., 2007), which
have a proven track record in the assessment of volcanic and anthropogenic
emissions, including identification of volcanic plume sources (e.g. Carn et
al., 2008, 2013, 2016; McCormick et al., 2012, 2013), volcanic plume tracking
(e.g. Carn and Prata 2010; Krotkov et al., 2010; Lopez et al., 2013), and
identification of copper smelter emissions (Carn et al., 2007) and other
large SO2 emission sources (e.g. Fioletov et al., 2011, 2013).
Table 1. Characteristics of the methods incorporated in the development of an
automatic classification technique.

| Method | Sample area size | Position | Correction technique |
|---|---|---|---|
| M1 | 4° × 4° | Centred over the volcano | None applied |
| M2 | 2° × 2° | Centred over the volcano | None applied |
| M3 | 2° × 2° | Centred over the volcano | Assumes that the plume is predominantly confined to the M2 region and utilises the M1 region to define the background SO2 level (Eq. 1) |

The LF algorithm retrieves three SO2 column amounts corresponding to
a priori SO2 vertical profiles with centre of mass altitudes (CMAs) of
approximately 3 km (lower troposphere; TRL), 8 km (mid-troposphere; TRM),
and 17 km (STL). These altitudes are based upon
atmospheric pressure levels and therefore can display slight variations
depending upon the local temperature profile (Carn et al., 2013). In order to
obtain an accurate estimation of the SO2 column amount, the appropriate
SO2 retrieval must be selected based upon the known or inferred
injection altitude of the volcanic plume (Yang et al., 2007), which can be
poorly constrained particularly in remote regions with minimal or no
monitoring capabilities. Differences between the altitude assumed in the LF
algorithm and the true altitude of the plume can lead to errors of up to
20 %, provided the assumption is approximately correct (Yang et al.,
2007). Since our aim is to develop an algorithm capable of detecting volcanic
eruptions regardless of magnitude, including diffuse SO2 clouds, we use
the TRL SO2 product to permit identification of small eruptions confined
to the lower troposphere. The use of one retrieval altitude reduces the need
for user input or prior knowledge of the injection altitude of the plume but
will result in the overestimation of SO2 mass for plumes injected into
the mid-troposphere or above. This method is hence sufficient for plume
identification and alert purposes but precludes accurate plume mass
calculation for some eruptions. SO2 retrievals corresponding to higher
altitudes (TRM or STL) feature lower background noise but significantly
underestimate SO2 columns in low-altitude volcanic clouds, possibly
preventing detection.
OMI data collected since 2008 are influenced by a row anomaly (the OMI row
anomaly; ORA) which results in data gaps in particular rows along the OMI
measurement swath. Information on the status of this anomaly is provided by
the Royal Netherlands Meteorological Institute
(http://projects.knmi.nl/omi/research/product/rowanomaly-background.php).
The ORA data gaps combined with the variation in viewing angle produced by
the 16-day orbital cycle of the Aura satellite result in varying influence
on OMI SO2 measurements (Flower et al., 2016). Any eruptions identified
after the appearance of the ORA were investigated with greater scrutiny and
excluded where the effect was significant.
Volcanic plume quantification
As a training dataset for our plume identification technique, we identified
79 volcanic eruptions at 27 different volcanoes (Table 2) using the Volcanoes
of the World (VOTW) database curated by the Smithsonian Institution's Global
Volcanism Program (GVP; see Global Volcanism Program, 2013). Note that, as a
result of the way in which eruptions are defined in the VOTW database,
several of the eruptions listed in Table 2 actually correspond to the onset
of extended periods of volcanic activity, rather than discrete eruptions. For
each identified eruption, total SO2 mass detected by OMI was obtained
for the registered day of the eruptive event (or the start of the period of
unrest) with the preceding and subsequent days analysed where no
corresponding plume could be identified on the reported day of eruption. This
allowance accounts for any inaccuracies in the assigned eruption date and
allows for the identification of eruption plumes generated after the Aura
overpass time (∼ 13:45 local time) resulting in a delay in detection.
Identification and quantification of volcanic SO2 emissions is
complicated by the presence of variable biases and noise levels in the data.
These variations are influenced by several factors, including the latitude of
the volcano, time of year, proximity to pollution sources, and the presence
of meteorological clouds (Krotkov et al., 2006; Yang et al., 2007).
In our analysis, three methods (M1, M2, and M3; Table 1) were used
to quantify the SO2 loading detected at each location, with the goal of
distinguishing volcanic SO2 from background noise. The procedures were
developed with the intention of allowing the calculation of volcanic SO2
loading with minimal user input, reducing the possible effects of human error
in the classification of what constitutes the bounds of an identified plume.
Figure 1. Analysis regions for method 1 (M1) and method 2 (M2) for an
SO2 plume detected by OMI at Piton de la Fournaise, Réunion, on
24 February 2010.
Method 1 (M1) and method 2 (M2) differ only in the geographic extent over
which OMI SO2 columns are integrated to obtain total SO2 mass
(Fig. 1). For each eruption analysed, M1 calculates integrated SO2
mass in a 4° × 4° box centred over the volcano
location (thus capturing plumes regardless of wind direction). The
4° × 4° box encompasses an area which captures most
small–moderate volcanic plumes with few instances of dispersion of emissions
outside the region; however, this relatively large sample area also
potentially includes greater background noise, particularly where other
nearby volcanoes are also active. Regions with increased background SO2
concentrations from multiple sources would result in a higher number of false
alerts. As an alternative to M1, M2 uses a 2° × 2° region which, whilst more
susceptible to possible plume
dispersion beyond the defined limits, is less influenced by contamination
(Fig. 1). Manual inspection indicated that plume dispersion beyond the
defined geographic limits was only an issue for the largest eruptions in
Table 2. Figure 1 shows an example of a small volcanic SO2 plume at
Piton de la Fournaise volcano (Réunion); here, the M2 region captures
most of the SO2 plume that is visually apparent, only excluding some
very diffuse SO2 further downwind that is included in the M1 region.
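The box integration used by M1 and M2 can be sketched as below. This is a minimal illustration, not the authors' processing code: the pixel tuple layout `(lat, lon, column_du, area_km2)` and the function name are hypothetical, pixels are assigned by their centre coordinates, and no handling of the dateline or of partially overlapping pixels is attempted. The conversion constant follows from 1 DU = 2.6867 × 10^16 molecules cm^-2 and an SO2 molar mass of 64.066 g mol^-1.

```python
from typing import Iterable, Tuple

# Tonnes of SO2 per Dobson unit per square kilometre:
# molecules per cm^2 -> per km^2 (x 1e10), -> moles (/ Avogadro),
# -> grams (x molar mass), -> tonnes (/ 1e6).
TONNES_PER_DU_KM2 = 2.6867e16 * 1e10 / 6.02214e23 * 64.066 / 1e6

Pixel = Tuple[float, float, float, float]  # (lat, lon, column_du, area_km2)


def box_mass_tonnes(pixels: Iterable[Pixel], volcano_lat: float,
                    volcano_lon: float, box_deg: float) -> float:
    """Sum SO2 mass over pixels centred inside a box_deg x box_deg box."""
    half = box_deg / 2.0
    total = 0.0
    for lat, lon, column_du, area_km2 in pixels:
        inside = (abs(lat - volcano_lat) <= half
                  and abs(lon - volcano_lon) <= half)
        if inside:
            total += column_du * area_km2 * TONNES_PER_DU_KM2
    return total


# Hypothetical pixels near a volcano at (0, 0); values are illustrative only.
example_pixels = [(0.2, -0.3, 1.5, 312.0), (1.5, 1.4, 0.4, 312.0)]
m1 = box_mass_tonnes(example_pixels, 0.0, 0.0, 4.0)  # 4 x 4 degree box
m2 = box_mass_tonnes(example_pixels, 0.0, 0.0, 2.0)  # 2 x 2 degree box
```

In this toy case the second pixel lies outside the 2° box but inside the 4° box, so it contributes to M1 only.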
Table 2. Test dataset of volcanic eruption dates and control dates (organised
alphabetically by volcano). Dates are displayed as DD/MM/YYYY.

| Volcano | Location | Eruption date | Control date |
|---|---|---|---|
| Ambrym | Vanuatu | 08/11/2006 | 31/03/2005 |
| | | 23/05/2008 | 04/06/2008 |
| Anatahan | Mariana Islands | 06/01/2005 | 11/05/2009 |
| | | 05/04/2005 | 16/06/2006 |
| | | 17/03/2006 | 15/01/2007 |
| | | 24/02/2007 | 22/11/2005 |
| | | 27/11/2007 | 29/06/2005 |
| Bagana | Papua New Guinea | 17/03/2005 | 26/11/2007 |
| | | 06/06/2005 | 14/06/2008 |
| | | 09/01/2007 | 25/04/2005 |
| | | 10/03/2007 | 23/10/2006 |
| | | 20/05/2007 | 06/11/2006 |
| | | 14/07/2007 | 01/07/2009 |
| | | 23/08/2007 | 22/01/2006 |
| | | 12/09/2007 | 26/12/2009 |
| | | 06/10/2007 | 01/03/2009 |
| Bezymianny | Kamchatka, Russia | 10/05/2007 | 31/01/2005 |
| | | 11/07/2008 | 22/05/2008 |
| Chaitén | Chile | 02/05/2008 | 11/12/2009 |
| Dukono | Indonesia | 25/05/2008 | 03/03/2006 |
| | | 25/07/2008 | 02/10/2008 |
| Fuego | Guatemala | 27/12/2005 | 03/04/2008 |
| Ibu | Indonesia | 04/04/2008 | 16/07/2008 |
| Karthala | Comoros | 16/04/2005 | 07/07/2005 |
| | | 24/11/2005 | 12/09/2006 |
| | | 28/05/2006 | 17/10/2008 |
| | | 12/01/2007 | 24/05/2009 |
| Kelut | Indonesia | 18/05/2006 | 21/02/2006 |
| Manam | Papua New Guinea | 27/01/2005 | 15/07/2005 |
| | | 17/07/2006 | 25/02/2005 |
| | | 05/10/2007 | 24/11/2008 |
| | | 29/12/2007 | 07/08/2005 |
| | | 11/05/2008 | 04/07/2008 |
| Mayon | Philippines | 17/08/2005 | 02/08/2006 |
| | | 21/02/2006 | 27/03/2008 |
| Merapi | Indonesia | 07/03/2006 | 29/05/2008 |
| | | 11/03/2006 | 31/03/2007 |
| Nyamuragira | DR Congo | 27/11/2006 | 09/12/2007 |
| | | 02/01/2010 | 11/05/2009 |
| | | 06/11/2011 | 03/09/2008 |
| | | 22/06/2014 | 31/08/2008 |
| Nyiragongo | DR Congo | 07/09/2005 | 22/06/2006 |
| | | 10/10/2005 | 27/08/2005 |
| | | 07/11/2005 | 08/03/2007 |
| | | 01/01/2009 | 09/05/2009 |
| Ol Doinyo Lengai | Tanzania | 20/07/2005 | 20/07/2005 |
| | | 30/03/2006 | 11/03/2008 |
| Pagan | Mariana Islands | 11/01/2007 | 24/06/2007 |
| Piton de la Fournaise | Réunion | 24/02/2005 | 01/02/2006 |
| | | 20/07/2006 | 22/03/2007 |
| | | 30/08/2006 | 13/11/2008 |
| Popocatépetl | Mexico | 06/04/2006 | 22/02/2006 |
| | | 23/05/2006 | 27/08/2005 |
| | | 11/04/2007 | 08/03/2007 |
| | | 01/12/2007 | 09/05/2009 |
| | | 22/02/2008 | 25/02/2005 |
| | | 16/11/2008 | 11/03/2008 |
| Rabaul | Papua New Guinea | 07/10/2006 | 24/06/2007 |
| | | 04/08/2007 | 01/02/2006 |
| | | 22/08/2007 | 22/03/2007 |
| Santa Ana | El Salvador | 01/10/2005 | 13/11/2008 |
| Santa Maria | Guatemala | 26/10/2005 | 22/02/2006 |
| SHV | Montserrat | 20/05/2006 | 10/04/2009 |
| | | 08/01/2007 | 12/03/2008 |
| | | 29/07/2008 | 05/06/2007 |
| | | 11/02/2010 | 02/04/2007 |
| Soputan | Indonesia | 19/04/2005 | 07/02/2005 |
| | | 15/12/2006 | 10/05/2009 |
| | | 15/12/2006 | 17/11/2008 |
| | | 06/06/2008 | 20/12/2006 |
| Tinakula | Solomon Islands | 12/02/2006 | 03/02/2009 |
| | | 22/09/2009 | 15/06/2009 |
| | | 17/01/2010 | 05/10/2005 |
| Tungurahua | Ecuador | 14/07/2006 | 16/06/2005 |
| | | 16/08/2006 | 31/12/2009 |
| | | 10/01/2008 | 04/05/2009 |
| | | 06/02/2008 | 11/02/2007 |
| Turrialba | Costa Rica | 06/01/2010 | 06/09/2005 |
| | | 12/01/2012 | 04/08/2005 |

A third method (M3) was developed in an attempt to intrinsically account
for the variable noise levels in SO2 data collected in different
geographic regions (Carn et al., 2013). We posit that in order to effectively
develop a global volcanic plume detection methodology without a significant
number of false alerts a background noise correction may be necessary. Our
technique is analogous to contextual thermal infrared (TIR) anomaly detection
procedures used at active volcanoes, where a background radiance value is
calculated as a reference against which anomalously high radiance values can
be compared (e.g. Wright et al., 2002; Murphy et al., 2011). In the M3
method, the 2° × 2° region (M2) is considered the
active emission region with a background SO2 offset value derived from
the total SO2 mass in the 4° × 4° M1 region
(Eq. 1).

$$ M3 = M2 - \frac{M1 - M2}{3} \qquad (1) $$

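A short sketch of the background correction, reading Eq. (1) as M3 = M2 − (M1 − M2)/3. The interpretation of the divisor is an inference from the box sizes: the part of the 4° × 4° box lying outside the 2° × 2° box covers 12 of its 16 square degrees, three times the area of the inner box, so one-third of the surrounding mass estimates the background expected within M2. The function name is hypothetical.

```python
def background_corrected_mass(m1_tonnes: float, m2_tonnes: float) -> float:
    """Return the M3 mass (tonnes) from the M1 and M2 integrated masses."""
    # SO2 mass in the outer region (inside M1 but outside M2).
    surrounding = m1_tonnes - m2_tonnes
    # The outer region has three times the area of the M2 box, so scale by 1/3
    # to estimate the background mass expected within the M2 box.
    background = surrounding / 3.0
    return m2_tonnes - background


# Uniform background: M1 holds four times the M2 mass, so M3 is zero.
uniform = background_corrected_mass(400.0, 100.0)
# Plume fully confined to M2 with no background: M3 equals M2.
confined = background_corrected_mass(1000.0, 1000.0)
```

The two limiting cases show the behaviour the method relies on: spatially uniform noise cancels, whereas a plume that drifts into the outer region inflates the background term and suppresses M3.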
Classification based on a latitudinal range leads to variations in the
physical dimensions of the analysis region depending upon the latitude of the
volcano. The maximum such variation in this analysis would occur between
equatorial volcanoes and those located in Kamchatka and Alaska, equating to
2.8 and 1.4 km in the north–south dimensions of the M1 and M2 regions
respectively. This equates to the loss of less than one pixel at the furthest
extent of the analysis region and is not likely to influence the resulting
analyses. In contrast the variation in the longitudinal dimensions equates to
a ∼ 35 % decrease in the east–west dimensions of high-latitude
regions relative to the Equator. The high-latitude samples analysed here will
be investigated to identify whether this variation in sample size influenced
the sample classification techniques employed.
Eruptive events that post-date the appearance of the ORA were manually
assessed in order to identify whether the ORA data gap significantly impacted
the detection of SO2, such as complete masking of the plume in extreme
cases (Flower et al., 2016). Additional factors impacting the selection of
eruptive events are the presence of meteorological clouds, which can
effectively mask any volcanic plume at lower altitudes from a satellite-based
sensor (Carn et al., 2013; Krotkov et al., 2006), and the seasonal variation
in UV radiation at high latitudes. Cloud masking is due to the high UV albedo
of clouds, and this, coupled with low UV irradiance, can make SO2
detection at high latitudes during winter months particularly challenging
(Telling et al., 2015). Through implementation of a cloud fraction threshold
of ∼ 20 % within each scene the majority of the eruptions analysed
here were restricted to latitudes below 30°.
Control samples
A control group is required to assess whether volcanic eruptions can be
distinguished from background SO2 levels. Therefore, for each volcanic
eruption analysed (Table 2) a control SO2 mass was calculated using each
of the three incorporated methodologies (M1, M2, and M3) for a second
date at the same volcano. Assignment of control group analysis dates was
limited to a period between 1 January 2005 and 31 December 2009. The 2009
cut-off date was employed due to the increasing influence of the ORA after
this time, in an attempt to reduce the influence of data gaps on the model
output. Control dates were assigned for comparison with each identified
volcanic eruption, using an online random-number generator (Haahr, 2015;
http://www.random.org) to assign a value between 0 and 1825 to each
data point. These random values were used to determine the number of days from
the beginning of the analysis period at which to assign a control date
(Table 2). The identified dates were then assigned to each target volcano
alphabetically, with each location receiving as many control dates as it has
volcanic eruption analyses (Table 2).
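The mapping from a random integer to a control date can be sketched as follows. The source drew its values from random.org; Python's pseudo-random generator is substituted here purely for illustration, and the function names and seed are hypothetical.

```python
import random
from datetime import date, timedelta
from typing import List

PERIOD_START = date(2005, 1, 1)
MAX_OFFSET_DAYS = 1825  # 31 December 2009 lies 1825 days after the start


def control_date(offset_days: int) -> date:
    """Convert a day offset (0 to 1825) into a calendar date."""
    if not 0 <= offset_days <= MAX_OFFSET_DAYS:
        raise ValueError("offset outside the 2005-2009 analysis period")
    return PERIOD_START + timedelta(days=offset_days)


def draw_control_dates(n_events: int, seed: int = 0) -> List[date]:
    """Draw one control date per volcanic event (stand-in for random.org)."""
    rng = random.Random(seed)
    return [control_date(rng.randint(0, MAX_OFFSET_DAYS))
            for _ in range(n_events)]
```

Because the draw ignores eruptive state, a control date can coincide with ongoing activity at the same volcano.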
Modelling techniques
Modelling procedures were conducted with the Weka 3 software package: a
collection of algorithms that can be implemented for data mining tasks (Hall
et al., 2009) provided by the University of Waikato
(http://www.cs.waikato.ac.nz/ml/weka/). The quantity of SO2
present within each analysis region is a complex function of eruption
composition and magnitude in addition to ambient SO2 levels and local
sources of interference such as neighbouring volcanoes. This precludes the
use of a fixed threshold, whereas statistical models permit a probabilistic
approach to volcanic eruption identification. Multiple statistical
analyses were trialled using the Weka 3 package. A simple logistic regression
analysis (Eq. 2) was found to be the most effective technique for the
classification of volcanic and non-volcanic events. Simple logistic
regression is a binary classification technique, here defining volcanic (v)
and non-volcanic control (c) events and facilitating the development of a
linear model constructed from a transformed target variable (Witten and
Frank, 2005). The logistic regression equation used here assigns the
probability P of the occurrence of a volcanic eruption or degassing event:

$$ P = 1 - \frac{1}{1 + e^{-(a + bX)}}, \qquad (2) $$

where e is the base of the natural logarithm, a is the intercept, which sets
the probability when the independent variable (X; here, the volcanic plume
SO2 mass measured in tonnes) is equal to 0, and b represents the rate at which
probabilities vary with incremental changes in X.
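A minimal sketch of Eq. (2), transcribed as P = 1 − 1/(1 + e^{−(a + bX)}). The coefficient values used below are hypothetical and chosen only to exercise the function; the fitted values are not given in this section. With this form P = 0.5 wherever a + bX = 0, and the sign of b determines whether P rises or falls with SO2 mass.

```python
import math


def eruption_probability(x_tonnes: float, a: float, b: float) -> float:
    """Evaluate Eq. (2) as transcribed: P = 1 - 1 / (1 + exp(-(a + b * X)))."""
    return 1.0 - 1.0 / (1.0 + math.exp(-(a + b * x_tonnes)))


# Hypothetical coefficients, for illustration only.
A_EXAMPLE, B_EXAMPLE = 0.3, -0.01
p_small = eruption_probability(50.0, A_EXAMPLE, B_EXAMPLE)
p_large = eruption_probability(500.0, A_EXAMPLE, B_EXAMPLE)
```

Algebraically this expression equals 1/(1 + e^{a + bX}), so a model written in the more common form 1/(1 + e^{−(a + bX)}) describes the same curve with the signs of a and b reversed. For very large |a + bX| the call to `math.exp` can overflow; a production implementation would guard against that.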
Output of a logistic regression analysis is assessed against a series of
validation statistics that test the accuracy of the generated model. These
statistics include overall accuracy, precision, and recall, in addition to
receiver operating characteristic (ROC) curves. In this analysis, the overall
accuracy relates to the percentage of correctly classified events in both the
volcanic and control (non-volcanic) samples; however, this statistic alone
cannot account for preferential classification of one sample over another
(Oommen et al., 2010). Hence precision and recall statistics, characterised
by values between 0 and 1, are used to identify whether preferential
classification is occurring. Precision relates to the accuracy of prediction
of a single sample group (volcanic or non-volcanic), whilst recall measures
the effectiveness of the predictions themselves (Oommen et al., 2010). In the
context of this study, if a volcanic classification has a precision of 0.9,
then 90 % of the events predicted as being volcanic in nature are
volcanic events, whilst the remaining 10 % are non-volcanic events
misclassified as volcanic and will be termed here as “false alerts”. In
contrast, a recall value of 0.8 would correspond to 80 % of observed volcanic
events being correctly classified, with the remaining 20 % misclassified as
non-volcanic, referred to here as “missed alerts”. Recall alone does not take
into account any false alerts. The final validation statistic used here is the ROC curve,
which represents a method for assessing the rate of accurately classified
events against possible falsely classified events. ROC values relate to the
accuracy of the classification system implemented, with a value of 1
indicating accurate prediction of all events (Oommen et al., 2010; Witten and
Frank, 2005).
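The relationship between these statistics and the two kinds of misclassification can be made concrete with a small sketch. The counts below are hypothetical, chosen to reproduce the worked example in the text (volcanic precision 0.9, volcanic recall 0.8); they are not results from this study.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute validation statistics from a 2 x 2 confusion matrix.

    tp: volcanic events classified as volcanic
    fn: volcanic events classified as control (missed alerts)
    fp: control events classified as volcanic (false alerts)
    tn: control events classified as control
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "volcanic_precision": tp / (tp + fp),  # lowered by false alerts
        "volcanic_recall": tp / (tp + fn),     # lowered by missed alerts
        "control_precision": tn / (tn + fn),
        "control_recall": tn / (tn + fp),
    }


# Hypothetical counts: 36 hits, 9 missed alerts, 4 false alerts, 50 correct controls.
example = classification_metrics(tp=36, fn=9, fp=4, tn=50)
```

Note the symmetry: missed alerts reduce volcanic recall and control precision together, while false alerts reduce volcanic precision and control recall together.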
Logistic regression model calculation was conducted using the k-fold
cross-validation technique incorporated into the Weka 3 software package. This
method segregates the data into k partitions, allowing k-1 segments of the
data to be used as a training set, with the remaining data used for validation
purposes. This method is then repeated with each of the k partitions being
used to validate the corresponding model from which it was withheld, with the
final statistics comprising an average of the output of all k models
(Oommen et al., 2010). We implement a k value of 10 due to the associated
reduction in bias compared to k values < 5 (Rodríguez et al., 2010;
Witten and Frank, 2005).
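The partitioning step of k-fold cross-validation can be sketched in a few lines. This illustrates the general technique rather than Weka's implementation: it neither shuffles nor stratifies the samples (which Weka may do), and the function name is hypothetical.

```python
from typing import Iterator, List, Tuple


def k_fold_splits(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Deal indices into k folds in round-robin order so sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx
                    for fold_number, fold in enumerate(folds)
                    if fold_number != held_out
                    for idx in fold]
        yield training, validation


# 79 volcanic plus 79 control events give 158 samples in total.
splits = list(k_fold_splits(158, k=10))
```

Each sample is used for validation exactly once and for training in the other k − 1 models, and the reported statistics are then averaged over the k models.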
Results
Of the three SO2 mass calculation procedures employed (M1, M2, and
M3), the most success was achieved with the background-corrected dataset
(M3). None of the logistic regression model investigations undertaken with
the M1 and M2 datasets produced more than 55 % overall accuracy in
the classification of volcanic events, and therefore these data were not
investigated further. In contrast, the M3 technique provided the best
results, with a 77 % overall accuracy and no additional data
pre-processing required; therefore this technique was employed for all
further assessments and model development.
OMI SO2 measurements
Of the 79 volcanic eruptions analysed, 13 displayed low SO2 amounts
(< 100 t), following application of the SO2 correction (M3),
on the identified day of eruption. Two eruptions produced very large amounts
of SO2: Nyamuragira (November 2006; 46 kt) and Rabaul (October 2006;
550 kt); however, use of the OMI TRL SO2 columns is likely to
overestimate the actual SO2 amounts in these upper-tropospheric or
lower-stratospheric plumes (Carn et al., 2013).
Table 3. Average SO2 column amounts (tonnes) for volcanic and control
events.

| Sample | Method | Average SO2 mass (tonnes) |
|---|---|---|
| Volcanic | M1 (4°) | 2680 |
| | M2 (2°) | 1150 |
| | M3 (corrected) | 680 |
| Control | M1 (4°) | 450 |
| | M2 (2°) | 170 |
| | M3 (corrected) | 90 |

Excluding the aforementioned very high values, the average M3 plume
contained 680 t of SO2, approximately 60 % of the average of the
M2 analysis and 25 % of the M1 average (Table 3). The control
dataset displays significantly lower SO2 loadings than the volcanic
events, with an average corrected SO2 mass of 90 t and a maximum
corrected SO2 mass of 1040 t. This variation indicates that the
volcanic data displays generally higher SO2 levels than the control
data, as would be expected. In all of the selection methodologies the
SO2 mass detected on control dates was 14–17 % of the average mass
detected in the volcanic dataset. Box plots were generated to assess the
general dynamics of the volcanic and control datasets (Fig. 2). Comparison of
these plots confirms the pattern identified in Table 3, with the SO2
measurements on “eruption” days displaying significantly higher values than
the control data.
Figure 2. Box-and-whisker plots displaying the spread and distribution of
volcanic and control data, with lines indicating upper and lower quartiles of
the data and the remainder represented by the box region. Additional data
points indicate the individual missed alerts in the volcanic data and false
alerts in the control data detailed in Table 4.
Model output
The most accurate model consisted of a simple logistic regression applied to
the M3 SO2 dataset with an overall accuracy of 76.6 % and an ROC
of 0.843. This model favoured volcanic precision (volcanic precision of 0.83
vs. control precision of 0.72) and control recall (control recall of 0.86
vs. volcanic recall of 0.67), which indicates that the model preferentially
classifies events as control samples. This model therefore generates fewer
false alerts than missed alerts. Investigations were undertaken to
identify characteristics of volcanic events that facilitated classification
and to elucidate the likely cause of the 23 % error associated with the
model. Removal of volcanic plumes containing less than 50 t of SO2 from
the M3 dataset resulted in a ∼ 6 % increase in model accuracy.
Eight data points produced false alerts with control events classified as
volcanic eruptions, whilst 18 volcanic events were misclassified as controls,
producing missed alerts (Table 4). The misclassified alerts were isolated to
assess if any common characteristics of these events could be identified,
with each individual alert incorporated into Fig. 2 for comparison with the
overall dynamics of the data. The comparison of missed alerts indicates that
each one falls within the lower quartile of the volcanic dataset, whilst the
false alerts displayed values consistent with the upper quartile of the
control data range (with one exception; see Fig. 2). The potential causes of the
misclassification of events are discussed further in Sect. 4.1.
Table 4. Misclassified alerts identified in the initial logistic regression
model.

| Sample (S) | Name | Date (DD/MM/YYYY) | Plume SO2 mass (tonnes) | Predicted classification | Original classification | Error generation |
|---|---|---|---|---|---|---|
| C1 | Ambrym | 31/03/2005 | 1040 | Volcanic | Control | Persistent degassing |
| C8 | Bagana | 26/11/2007 | 340 | Volcanic | Control | Persistent degassing |
| C10 | Bagana | 23/10/2006 | ? | Volcanic | Control | Missing data |
| C24 | Karthala | 07/07/2005 | ? | Volcanic | Control | Missing data |
| C29 | Manam | 15/07/2005 | 350 | Volcanic | Control | Localised noise |
| C32 | Manam | 07/08/2005 | 450 | Volcanic | Control | Small eruption |
| C54 | Popocatépetl | 08/03/2007 | 600 | Volcanic | Control | Ongoing eruption |
| C57 | Popocatépetl | 11/03/2008 | 340 | Volcanic | Control | Ongoing eruption |
| V3 | Anatahan | 06/01/2005 | 230 | Control | Volcanic | Diffuse plume |
| V5 | Anatahan | 17/03/2006 | 120 | Control | Volcanic | Drifting plume |
| V13 | Bagana | 14/07/2007 | 320 | Control | Volcanic | Diffuse plume |
| V17 | Bezymianny | 10/05/2007 | 140 | Control | Volcanic | Drifting plume |
| V19 | Chaitén | 02/05/2008 | 250 | Control | Volcanic | High noise |
| V20 | Dukono | 25/05/2008 | 300 | Control | Volcanic | Diffuse plume |
| V21 | Dukono | 25/07/2008 | 270 | Control | Volcanic | Drifting plume |
| V23 | Ibu | 04/04/2008 | 210 | Control | Volcanic | Diffuse plume |
| V24 | Karthala | 16/04/2005 | 110 | Control | Volcanic | Drifting plume |
| V28 | Kelut | 18/05/2006 | 170 | Control | Volcanic | Diffuse plume |
| V32 | Manam | 29/12/2007 | 80 | Control | Volcanic | Diffuse plume |
| V33 | Manam | 11/05/2008 | 190 | Control | Volcanic | Diffuse plume |
| V34 | Mayon | 17/08/2005 | 120 | Control | Volcanic | Drifting plume |
| V43 | Nyiragongo | 10/10/2005 | 230 | Control | Volcanic | Drifting plume |
| V48 | Pagan | 11/01/2007 | 160 | Control | Volcanic | Diffuse plume |
| V53 | Popocatépetl | 23/05/2006 | 250 | Control | Volcanic | Interfering signal |
| V64 | SHV | 08/01/2007 | 240 | Control | Volcanic | Drifting plume |
| V67 | Soputan | 19/04/2005 | 170 | Control | Volcanic | Drifting plume |

Discussion
Analysis of inaccurate classifications
False alerts
Investigation of the incorrectly classified false alerts (Fig. 2; Table 4)
revealed that, due to the random selection procedure used for assigning
control sample dates, some of the control SO2 values corresponded to
periods of ongoing volcanic activity. These anomalous control values relate
to stronger, persistent plumes, despite not being associated with large or
“initiating” events as reported in the VOTW database; this was the case for
five of the eight false alerts (C1, 8, 32, 54, and 57; Table 4). Two additional
alerts were generated as a result of a data gap in the OMI measurements (C10
and 24). Missing values (characterised by a blank cell to differentiate these
from days with data available but no recordable SO2 emissions) are
incorrectly classified by the incorporated model as volcanic events.
Pre-screening of samples for data gaps prior to incorporation into the model
is required to prevent the classification of missing values as volcanic
events. The one remaining false alert (C29) was the result of increased noise
levels preferentially affecting the M2 over the M1 region, resulting in
an artificially high SO2 mass derived from the M3 calculation and a
false alert.
Missed alerts
Missed alerts occurred at a higher frequency than false alerts, but a common
characteristic of all missed alerts is an SO2 plume mass below 325 t
(Fig. 2; Table 4). We attribute the misclassification of volcanic events to
four main causes. The first influenced eight of the volcanic events (V3, 13,
20, 23, 28, 32, 33, and 48; Table 4) and is the result of eruptions producing
diffuse plumes containing low SO2 amounts close to the OMI detection
limit (e.g. small eruptions and/or eruptions to low altitudes). The second
cause of misclassification affecting eight samples (V5, 17, 21, 24, 34, 43,
64, and 67; Table 4) is the drifting of the volcanic plume out of the
geographic area of analysis (M2) into the region utilised for background
classification (M1), causing signal suppression in the M3 methodology.
Implementation of this system on a global grid would allow the identification
of drifting plumes in addition to those located directly above the
corresponding emitting source. One event (V19; Table 4) was impacted by
increased noise in the background classification region, also suppressing the
plume SO2 loading in the M3 calculation. The final factor preventing
the correct identification of a volcanic eruption (V53; Table 4) occurred at
Popocatépetl (Mexico), through the masking of a moderate eruption plume when
a large SO2 cloud from another volcano (Soufrière Hills, Montserrat)
drifted into the M1 region, causing an anomalously high background SO2
mass in the M3 calculation.
Optimisation of event classification
We assessed the impact of varying the minimum SO2 plume mass included in
the logistic regression model to investigate whether the use of a threshold
SO2 loading improved the classification capabilities of the model. The
volcanic dataset was incrementally filtered to remove a growing proportion
of the lowest-mass events, to identify how this influenced the validation
statistics. Each reduced volcanic dataset was incorporated into a logistic
regression model with a k-fold validation system; however, the control sample was maintained
throughout all of the analyses. The variation in class size produced by the
removal of volcanic data actually provides a more accurate representation of
the natural system (Oommen et al., 2011), with more control samples than
volcanic, as more days are characterised by quiescence than volcanic
activity. In each instance the overall accuracy, precision, and recall
statistics were tracked (Fig. 3) to assess the changes in the model as the
minimum incorporated SO2 mass varied. The linear correlation between
control recall and volcanic precision is evident in the comparison of these
statistics (Fig. 3b) as well as that between the control precision and
volcanic recall.
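The incremental filtering step can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `filter_lowest` and the plume masses are hypothetical, and the logistic regression and k-fold validation applied to each reduced dataset are omitted. It shows only how removing the lowest-mass events sets the minimum incorporated SO2 mass.

```python
def filter_lowest(masses_t, keep_fraction):
    """Drop the lowest-mass events, keeping `keep_fraction` of the dataset.

    Returns (retained masses sorted ascending, minimum retained mass).
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    ordered = sorted(masses_t)
    n_keep = max(1, round(len(ordered) * keep_fraction))
    retained = ordered[len(ordered) - n_keep:]
    return retained, retained[0]


# Hypothetical SO2 plume masses in tonnes (illustrative values only,
# not taken from the volcanic dataset).
example_masses_t = [40, 95, 150, 220, 360, 510, 900, 1500, 4200, 12000]

for keep in (1.0, 0.8, 0.6):
    retained, threshold_t = filter_lowest(example_masses_t, keep)
    print(f"keep {keep:.0%}: {len(retained)} events, minimum mass {threshold_t} t")
```

Each retained subset would then be paired with the unchanged control sample before model fitting, which is why the class sizes diverge as more volcanic events are removed.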
Figure 3. Result of the application of a threshold SO2 loading to the
volcanic dataset on (a) accurately classified events and
(b) the precision (no false alerts) and recall (no missed alerts)
values for both the volcanic and control datasets.
When all data are incorporated, the model appears to favour volcanic
precision and control recall, resulting in a model that will display a larger
number of missed volcanic alerts than false classification of control
samples. When 60 % of the dataset is used, the volcanic precision and
recall are equal, as are the control precision and recall, all displaying
values greater than 0.9. The threshold SO2 loading in this case is
360 t; i.e. if this model were to be implemented, any volcanic plume
containing less than this amount would not be identified as a volcanic event.
The use of 75 % of the volcanic dataset appears to represent a good
compromise between variation in the statistics and the elimination of smaller
plumes (Fig. 3). The volcanic and control precision are almost equal,
indicating that this model is equally effective at predicting volcanic and
non-volcanic events, with a higher control recall than volcanic
recall (Fig. 3). The tendency of this model is to miss smaller volcanic
events rather than falsely classify control samples displaying moderate noise
levels. Favouring missed over false alerts is also a characteristic of the MODVOLC
MODIS-based automatic volcanic alert system (http://modis.higp.hawaii.edu/),
designed to detect volcanic thermal anomalies. Quantitative comparison of these models could not be conducted as
assessment of the MODVOLC system was performed in a more qualitative manner,
assessing whether alerts were identified in locations where they would be
expected (e.g. lava flow fields) (Wright et al., 2002, 2004).
Figure 4 shows the variation of ROC values associated with each of the
logistic regression models and minimum SO2 plume mass with the
percentage of the total dataset analysed, with the total change in each
normalised. The trends in both ROC and SO2 mass threshold show second-order
polynomial characteristics with R2 values of 0.985 and 0.993
respectively. The intersection of these trend lines represents model
optimisation, offering the greatest gain in accuracy (ROC) combined with the
least impact on the identifiable SO2 plume mass. This optimisation point
corresponds to the removal of 22 % of the volcanic data, resulting in a
minimum incorporated SO2 mass of ∼ 150 t, and correlates with
that inferred through the comparison of precision and recall statistics
(Fig. 3). Application of a 150 t SO2 mass threshold prevents the
resolution of smaller plumes, but the original assessment (Fig. 2; Table 4)
indicates that SO2 loadings below this value tended to be misclassified
anyway.
Figure 4. The effect of proportional removal of the lowest data points from
the volcanic dataset on the minimum incorporated SO2 mass and on the ROC
(receiver operating characteristic) statistic of each model, where ROC = 1
implies all events correctly classified.
The model based on 78 % of the volcanic dataset has an overall accuracy
of 85.7 % and an ROC of 0.95, producing eight false alerts that correspond to
those identified in the original assessment, with the exception of C8
(Table 4), which was accurately classified with this model. In contrast,
27.8 % of the missed alerts originally identified were no longer flagged;
of these five instances, four were eliminated due to their low SO2
loadings, with the remaining alert correctly classified as a result of
improvements in event classification by the optimised model.
Parameterisation of Eq. (2) using the 78 % model output facilitates the
validation of individual records and allows the incorporation of new data
points (Eq. 3) through the substitution of X with measured volcanic
SO2 mass in tonnes:
$$P = 1 - \frac{1}{1 + e^{-(-2.943 + 0.0091X)}}. \qquad (3)$$
Independent validation
A secondary testing procedure was employed to assess the efficacy of the
developed logistic regression models on an independent test dataset
consisting of 12 volcanic eruptions not initially included (Global Volcanism
Program, 2013) and displaying variable plume characteristics, and 12
corresponding control samples, resulting in 24 data points (Table 5).
Table 5. Locations and dates of volcanic and control “eruptions” for the
validation dataset. Dates are given as DD/MM/YYYY. The "Original" and
"Optimised" columns indicate whether each model classified the sample
correctly (Y/N).

| Volcano | Location | Eruption date | Eruption: Original | Eruption: Optimised | Control date | Control: Original | Control: Optimised |
|---|---|---|---|---|---|---|---|
| Ambrym | Vanuatu | 01/05/2007 | N | Y | 12/12/2008 | N | Y |
| Anatahan | Mariana Islands | 29/05/2006 | Y | Y | 06/09/2005 | Y | Y |
| Cleveland | Aleutian Islands | 06/02/2006 | Y | Y | 17/05/2009 | Y | Y |
| | | 23/05/2006 | Y | Y | 06/01/2008 | Y | Y |
| | | 28/10/2006 | N | N | 28/02/2006 | N | N |
| Colima | Mexico | 24/04/2005 | N | N | 04/05/2009 | Y | Y |
| Lascar | Chile | 04/05/2005 | N | N | 11/02/2007 | Y | Y |
| Lopevi | Vanuatu | 21/04/2007 | Y | Y | 13/12/2006 | Y | Y |
| Okmok | Aleutian Islands | 12/07/2008 | Y | Y | 01/07/2008 | Y | Y |
| Sierra Negra | Galápagos Islands | 22/10/2005 | Y | Y | 04/08/2005 | Y | Y |
| Soputan | Indonesia | 25/10/2007 | Y | Y | 02/07/2008 | Y | Y |
| Soufrière Hills | Montserrat | 20/05/2006 | N | N | 24/08/2006 | Y | Y |
The incorporation of an independent investigation allowed the data
characteristics isolated in the original analysis to be tested against data
not utilised in the training of the model. Classification of the data with
the original model containing all data points resulted in an accuracy of
75 %, whereas analysis with the optimised model (78 % of the data)
produced an overall accuracy of 79.2 %; a detailed overview of the
validation statistics of each model is given in Table 6. The optimised model
resulted in no false detections although four volcanic events were missed;
these consisted of one sample in which the SO2 plume had drifted out of
the analysis area (Soufrière Hills), two weak plumes with SO2 loadings
below 60 t (Cleveland and Lascar), and one moderate plume with an SO2
loading of 255 t (Colima). All SO2 plumes exceeding 390 t were
correctly classified as volcanic; therefore we conclude that events emitting
less than 390 t SO2 are likely to be misclassified with this
methodology. Taking into account the thresholds of the incorporated methods
(Table 6) and solving Eq. (3), we find that the minimum SO2 mass that
would be classified as volcanic in origin by this model is 378 t.
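The threshold mass can be checked numerically. The sketch below evaluates and inverts the logistic term 1/(1 + e^{-(-2.943 + 0.0091X)}) that appears in Eq. (3), using the coefficients as printed. Applying the optimised-model threshold of 0.620 (Table 6) to this term is an assumption made here, adopted because it yields a mass close to the value quoted above; the function names are hypothetical.

```python
import math

# Coefficients as printed in Eq. (3); X is the SO2 mass in tonnes.
INTERCEPT = -2.943
SLOPE_PER_TONNE = 0.0091


def logistic_term(mass_t):
    """Evaluate 1 / (1 + exp(-(INTERCEPT + SLOPE_PER_TONNE * mass_t)))."""
    z = INTERCEPT + SLOPE_PER_TONNE * mass_t
    return 1.0 / (1.0 + math.exp(-z))


def mass_at(term_value):
    """Invert logistic_term: SO2 mass (t) at which the term equals term_value."""
    if not 0.0 < term_value < 1.0:
        raise ValueError("term_value must lie strictly between 0 and 1")
    logit = math.log(term_value / (1.0 - term_value))
    return (logit - INTERCEPT) / SLOPE_PER_TONNE


# Assumed reading: the Table 6 optimised-model threshold applies to this term.
threshold_mass_t = mass_at(0.620)
print(f"{threshold_mass_t:.0f} t")
```

With the rounded coefficients this gives roughly 377 t, within about 1 t of the 378 t stated in the text; the small difference is consistent with rounding of the printed coefficients.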
Table 6. Validation statistics generated through the assessment of the test
data with three methods.

| Validation statistic | Original model | Optimised model | Optimised threshold model |
|---|---|---|---|
| Overall accuracy (%) | 75 | 79.2 | 95 |
| Volcanic precision | 0.8 | 1 | 1 |
| Volcanic recall | 0.667 | 0.583 | 0.889 |
| Control precision | 0.714 | 0.706 | 0.917 |
| Control recall | 0.833 | 1 | 1 |
| ROC | 0.813 | 0.84 | 0.979 |
| Threshold (P) | 0.432 | 0.620 | 0.660 |
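For reference, the accuracy, precision, and recall statistics in Table 6 follow from a two-class confusion matrix. The sketch below is illustrative only: the function name is hypothetical, and the counts are not reported in the text but are inferred here (an assumption) as those that reproduce the original-model column for 12 volcanic and 12 control samples.

```python
def validation_stats(tp, fn, fp, tn):
    """Return accuracy (%), precision, and recall for both classes.

    tp: volcanic events classified as volcanic
    fn: volcanic events classified as control (missed alerts)
    fp: control samples classified as volcanic (false alerts)
    tn: control samples classified as control
    """
    total = tp + fn + fp + tn
    return {
        "accuracy_pct": 100.0 * (tp + tn) / total,
        "volcanic_precision": tp / (tp + fp),
        "volcanic_recall": tp / (tp + fn),
        "control_precision": tn / (tn + fn),
        "control_recall": tn / (tn + fp),
    }


# Counts inferred from the original-model column of Table 6 (an assumption).
original = validation_stats(tp=8, fn=4, fp=2, tn=10)
for name, value in original.items():
    print(f"{name}: {value:.3f}")
```

Volcanic precision falls as false alerts increase, whereas volcanic recall falls as missed alerts increase, which is the trade-off explored in the optimisation above.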
Limitations
This analysis has indicated that, prior to implementation of the incorporated
classification technique (logistic regression), pre-screening of data samples
is required to account for the influence of missing data points and
meteorological cloud cover. The incorporated modelling technique
automatically interpreted missing values as volcanic alerts, thus influencing
the calculated threshold, and therefore data gaps must be removed prior to
logistic regression analysis. Persistent meteorological cloud cover can mask
SO2 plumes at lower altitudes from satellite sensors, precluding
detection. This effect can be significant at higher latitudes, particularly
in winter, and therefore the methodology described here may be limited at
these locations. Where high-latitude data were available and incorporated
into this trial (Bezymianny, Okmok, and Cleveland), correct classification
occurred on all but one of those days for which data were available (one
additional control sample characterised by no available data was
misclassified). Consistent high-latitude classification indicates the robust
nature of the M3 pre-processing technique employed, with no indication
that differences in sample region size due to latitudinal variations
(discussed in Sect. 2.2) influenced the identification of volcanic clouds.
Further investigation is necessary to accurately assess the capabilities of
the technique in high-latitude regions, particularly regarding the influence
of persistent cloud cover.
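The pre-screening requirement described above could be implemented as a simple filter applied before classification. The following is a minimal sketch under stated assumptions: the record layout (`id`, `masses_t`) and function names are hypothetical, and data gaps are assumed to be encoded as `None` or NaN, which may differ from the encoding in the operational OMI products.

```python
import math


def has_data_gap(so2_masses_t):
    """True if any regional SO2 mass is missing (None or NaN)."""
    return any(m is None or math.isnan(m) for m in so2_masses_t)


def prescreen(samples):
    """Split samples into (usable, rejected) before classification."""
    usable, rejected = [], []
    for sample in samples:
        if has_data_gap(sample["masses_t"]):
            rejected.append(sample)
        else:
            usable.append(sample)
    return usable, rejected


# Hypothetical records: SO2 masses (t) for the background and analysis regions.
samples = [
    {"id": "A", "masses_t": [120.0, 480.0]},
    {"id": "B", "masses_t": [float("nan"), 300.0]},
    {"id": "C", "masses_t": [None, None]},
]

usable, rejected = prescreen(samples)
print([s["id"] for s in usable], [s["id"] for s in rejected])
```

Only the usable samples would be passed to the logistic regression, so that missing values cannot be interpreted as volcanic alerts or influence the calculated threshold.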
The main constraint on SO2 plume detection using this methodology is the
detection limit of the satellite measurements used as input (here, the OMI
TRL SO2 columns). Indeed, this analysis indicates that the minimum
SO2 mass that could be reliably classified as volcanic in origin using
the OMI TRL SO2 data is on the order of 400 t. The lack of a priori
knowledge of volcanic SO2 plume altitude restricts the classification
technique to SO2 retrievals corresponding to a single CMA, and our use
of the TRL SO2 product does not imply any knowledge of SO2 altitude
(which is not required for eruption detection). The use of OMI SO2
products with lower noise (e.g. STL columns) or more sensitive SO2
algorithms (e.g. Li et al., 2013) would result in lower detection limits,
although STL retrievals would also inhibit the detection of low-altitude or
diffuse plumes. Future UV satellite instruments such as the Tropospheric
Monitoring Instrument (TROPOMI; http://www.tropomi.eu/), with better
spatial resolution than OMI, should also have lower SO2 detection
limits. In order to resolve smaller plumes, an instrument with a higher
spatial resolution would be required, but such instruments typically offer
lower temporal resolution. Reduced temporal resolution would not provide the
daily coverage necessary for the implementation of this technique in a global
near-real-time alert system.
Although the technique described here was designed for global detection of
volcanic eruptions, it could also be adapted for regional-scale assessments.
For initial investigative purposes and model training the location of
analysis regions was fixed based upon the location of the known emitting
source, although this method could also be implemented on a global or
regional scale using a reference grid. A gridded analysis over a wider
area, with the developed method applied within each grid cell, could allow
plumes to be classified both above the volcano and in adjacent
regions into which they may have drifted. For volcanoes with frequent
activity during the OMI mission (2004–present), the developed method of data
collection and model training could be applied. Application to a single or
small cluster of volcanoes would specifically tailor the resulting output to
the eruptive style of the volcano or volcanoes in question and could allow
refinement of the background correction based on local conditions (e.g.
avoiding other known regional SO2 sources). High frequencies of
eruptions and emissions in locations such as Vanuatu or Indonesia could
facilitate the training of the model and might prove effective in the
monitoring of persistently degassing volcanoes.
Conclusions
Through the analysis of operational OMI SO2 measurements (TRL SO2
columns) for 79 volcanic eruptions, a simple logistic regression model
distinguished volcanic from non-volcanic control events with an
accuracy of 80 %. Optimisation of the model by progressive removal of
input data enabled volcanic plumes containing at least ∼ 400 t of
SO2 to be consistently resolved and correctly classified. With an
appropriate training dataset, this technique could form the basis of a
near-real-time volcanic eruption detection scheme, with minimal user input
necessary.
We identified some common factors resulting in misclassification of control
or volcanic events, including contamination of the background analysis
region with SO2 emissions from another volcano, low SO2 emissions
and/or low plume altitude (i.e. resulting in emissions below detection
limits), advection of SO2 emissions out of the analysis region prior to
the satellite overpass, and data gaps.
The implementation of a NRT volcanic eruption alert system based on the
technique described here would represent an advance in small eruption
identification over current systems, such as SACS, which use a simple
threshold SO2 column amount to identify significant volcanic degassing
events (Brenot et al., 2014). In dispersed volcanic clouds, SO2 column
amounts may be low, yet the total SO2 loading could be high; hence alerts
based on SO2 mass rather than column amount may be more effective in
certain situations. Development of this technique within a global or regional
grid system would be effective at identifying drifting volcanic clouds far
from the source, which is a current limitation. A combination of both the
developed technique and existing SO2 threshold approaches would likely
yield an optimal NRT volcanic cloud detection system suitable for both large
drifting plumes and smaller eruptions.