Abstract
We present a new, simple algorithm that constructs an Aho-Corasick automaton for a set of patterns P of total length n in O(n) time and space for integer alphabets. Processing a text of size m over an alphabet Σ with the automaton costs O(m log|Σ| + k) time, where k is the number of occurrences of patterns in the text. We also introduce a new, efficient implementation of nodes in the Aho-Corasick automaton, which works for suffix trees as well.
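For readers unfamiliar with the data structure, the following is a minimal sketch of the textbook Aho-Corasick automaton (a trie with failure links filled in breadth-first). It is not the construction or the node implementation introduced in this paper. The names `build_automaton` and `search` are hypothetical. Using Python dictionaries for child lookup is an assumption made for brevity, so the sketch does not reproduce the O(m log|Σ| + k) bound stated above.

```python
from collections import deque


def build_automaton(patterns):
    """Textbook Aho-Corasick construction (illustrative, not the paper's method).

    goto[v] maps a character to a child of node v, fail[v] is the failure
    link of v, and out[v] lists the indices of all patterns ending at v.
    """
    goto, fail, out = [{}], [0], [[]]

    # Phase 1: insert every pattern into a trie rooted at node 0.
    for idx, pat in enumerate(patterns):
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(idx)

    # Phase 2: compute failure links breadth-first. Depth-1 nodes keep
    # their initial failure link to the root.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Simplification: copy the outputs of the failure target instead
            # of keeping output links, which can cost more than linear space.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def search(text, patterns, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    node, hits = 0, []
    for i, ch in enumerate(text):
        # Follow failure links until some node has a transition on ch.
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for idx in out[node]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits


patterns = ["he", "she", "his", "hers"]
automaton = build_automaton(patterns)
print(search("ushers", patterns, automaton))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```

Each text character is consumed once and the failure links let the scan continue without restarting, which is the source of the additive k term for reporting the occurrences.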
Original language | English |
---|---|
Pages (from-to) | 66-72 |
Number of pages | 7 |
Journal | Information Processing Letters |
Volume | 98 |
Issue number | 2 |
State | Published - 30 Apr 2006 |
Bibliographical note
Funding Information:
* Corresponding author: S. Dori. Tel.: +972 4 828 8375, fax: +972 4 824 9331. E-mail address: shiri@cri.haifa.ac.il.
1 Tel.: +972 4 824 0103, fax: +972 4 824 9331. Partially supported by the Israel Science Foundation grants 282/01 and 35/05.
Keywords
- Design of algorithms
- String matching
- Suffix array
- Suffix tree
ASJC Scopus subject areas
- Theoretical Computer Science
- Signal Processing
- Information Systems
- Computer Science Applications