Genetic Algorithms and Support Vector Machines
for Time Series Classification
Damian Eads*(a,b), Daniel Hill(b), Sean Davis(a),
Simon Perkins(a), Junshui Ma(a), Reid Porter(a), and James Theiler(a)
(a) Nonproliferation and International Security Division, Los Alamos National Laboratory, MS D436, Los Alamos, NM 87545
(b) Department of Computer Science, Rochester Institute of Technology, 102 Lomb Memorial Drive, Rochester, NY 14623
ABSTRACT
We introduce an algorithm for classifying time series data. Since our initial application is for lightning data,
we call the algorithm Zeus. Zeus is a hybrid algorithm that employs evolutionary computation for feature
extraction, and a support vector machine for the final “backend” classification. Support vector machines have a
reputation for classifying in high-dimensional spaces without overfitting, so the utility of reducing dimensionality
with an intermediate feature selection step has been questioned. We address this question by testing Zeus on a
lightning classification task using data acquired from the Fast On-orbit Recording of Transient Events (FORTE)
satellite.
Keywords: time series classification, genetic algorithm, genetic programming, support vector machines, feature
selection, lightning, tornado, n-fold cross validation
1. INTRODUCTION
In this paper, we address the task of creating classification algorithms for time series data. Machine learning
techniques can be used to automatically derive such classifiers from sets of labeled sequences, but care must
be taken to achieve adequate performance on the task without overfitting the training data. Overfitting occurs
when the learning algorithm effectively “memorizes” the training data, and then performs poorly when applied
to data not in the training set.
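The effect described above can be seen in a deliberately extreme toy case. The following is a minimal sketch, not part of Zeus: the `Memorizer` class, the `make_series` helper, and the synthetic data are all hypothetical and introduced only for illustration. The "classifier" stores each training series verbatim, so it scores perfectly on the training set but can only fall back on a default guess for series it has never seen.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def make_series(label, length=16):
    """Hypothetical toy data: class 1 is noise with a small upward mean shift."""
    shift = 0.5 if label == 1 else 0.0
    return tuple(round(rng.gauss(shift, 1.0), 3) for _ in range(length))


class Memorizer:
    """Stores every training series verbatim; guesses class 0 for unseen series."""

    def fit(self, X, y):
        self.table = dict(zip(X, y))
        return self

    def predict(self, x):
        return self.table.get(x, 0)


def accuracy(model, X, y):
    """Fraction of series whose predicted label matches the true label."""
    return sum(model.predict(x) == t for x, t in zip(X, y)) / len(y)


# Balanced labels (alternating 0 and 1) for both the training and held-out sets.
train_y = [i % 2 for i in range(40)]
train_X = [make_series(t) for t in train_y]
test_y = [i % 2 for i in range(40)]
test_X = [make_series(t) for t in test_y]

model = Memorizer().fit(train_X, train_y)
train_acc = accuracy(model, train_X, train_y)
test_acc = accuracy(model, test_X, test_y)
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Because the held-out labels are balanced and the fallback guess is always class 0, the held-out accuracy is no better than chance even though the training accuracy is perfect. Real learners overfit less blatantly, but the gap between the two scores is the same symptom.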
Machine learning methods for classification can often be broken down into two steps: (a) selecting features,
and (b) fitting a model which takes these features as input and provides a classification as output. We have
developed a time series classification syste