Abstract
Change detection is one of the most important tasks in time series analysis. When the series is very long, or when it is rapidly updated, it has to be treated as a stream. This means that the change detection algorithm must process each sample in O(1) time and memory. A good algorithm must be generic in terms of the type of changes it can detect. Above all, a good algorithm must present a favorable and controlled ratio of the number of samples needed to detect a change to the rate of false positives. We present a change-point detection algorithm called ProTO which dynamically manages a set of candidate change-points whose expected size is a controllable constant. In terms of sample processing, ProTO is comparable with the fastest known algorithm, the Page-Hinkley Test (PHT). Yet, because PHT is limited to just one candidate, ProTO outperforms it in terms of the ratio of the delay to the false positive rate, as well as in terms of robustness. We provide variants of ProTO for detecting changes in the mean or the variance of the stream, and experiment with two realistic applications, as well as with synthetic data. On real problems, ProTO compares favorably with state-of-the-art algorithms implemented in the R-package, which require more than O(1) time per sample.
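For orientation, the single-candidate baseline named in the abstract can be sketched as follows. This is a minimal version of the textbook Page-Hinkley test for an upward shift in the mean, not ProTO and not the authors' code. The class name, the helper `first_alarm`, and the default values of `delta` and `threshold` are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Textbook Page-Hinkley test for an increase in the stream mean.

    State is a handful of scalars, so each update costs O(1) time and memory.
    """
    delta: float = 0.005      # tolerated drift magnitude (assumed value)
    threshold: float = 50.0   # alarm threshold, often written lambda (assumed value)
    n: int = 0                # number of samples seen so far
    mean: float = 0.0         # running mean of the stream
    cum: float = 0.0          # cumulative deviation from the running mean
    cum_min: float = 0.0      # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one sample and return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        # The gap between the current sum and its minimum tracks a single
        # candidate change-point: the time at which the minimum was reached.
        return self.cum - self.cum_min > self.threshold


def first_alarm(stream: Iterable[float], **params: float) -> Optional[int]:
    """Return the 0-based index of the first alarm, or None if none is raised."""
    detector = PageHinkley(**params)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None
```

Calling `first_alarm` on a sequence that jumps from one level to a clearly higher one returns an index shortly after the jump, while a constant sequence yields `None`. Because the statistic tracks only one running minimum, the test effectively holds a single candidate change-point, which is the limitation the abstract contrasts with ProTO's managed set of candidates.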
Original language | English |
---|---|
Pages (from-to) | 125-139 |
Number of pages | 15 |
Journal | Statistical Analysis and Data Mining |
Volume | 7 |
Issue number | 2 |
State | Published - Apr 2014 |
Bibliographical note
Publisher Copyright: © 2014 Wiley Periodicals, Inc.
Keywords
- Big-data
- Change detection
- Data stream
- Two-sample test
ASJC Scopus subject areas
- Analysis
- Information Systems
- Computer Science Applications