TY - GEN
T1 - Optimal packed string matching
AU - Ben-Kiki, Oren
AU - Bille, Philip
AU - Breslauer, Dany
AU - Gąsieniec, Leszek
AU - Grossi, Roberto
AU - Weimann, Oren
PY - 2011
Y1 - 2011
N2 - In the packed string matching problem, each machine word accommodates α characters, thus an n-character text occupies n/α memory words. We extend the Crochemore-Perrin constant-space O(n)-time string matching algorithm to run in optimal O(n/α) time and even in real-time, achieving a factor α speedup over traditional algorithms that examine each character individually. Our solution can be efficiently implemented, unlike prior theoretical packed string matching work. We adapt the standard RAM model and only use its AC0 instructions (i.e., no multiplication) plus two specialized AC0 packed string instructions. The main string-matching instruction is available in commodity processors (i.e., Intel's SSE4.2 and AVX Advanced String Operations); the other maximal-suffix instruction is only required during pattern preprocessing. In the absence of these two specialized instructions, we propose theoretically-efficient emulation using integer multiplication (not AC0) and table lookup.
AB - In the packed string matching problem, each machine word accommodates α characters, thus an n-character text occupies n/α memory words. We extend the Crochemore-Perrin constant-space O(n)-time string matching algorithm to run in optimal O(n/α) time and even in real-time, achieving a factor α speedup over traditional algorithms that examine each character individually. Our solution can be efficiently implemented, unlike prior theoretical packed string matching work. We adapt the standard RAM model and only use its AC0 instructions (i.e., no multiplication) plus two specialized AC0 packed string instructions. The main string-matching instruction is available in commodity processors (i.e., Intel's SSE4.2 and AVX Advanced String Operations); the other maximal-suffix instruction is only required during pattern preprocessing. In the absence of these two specialized instructions, we propose theoretically-efficient emulation using integer multiplication (not AC0) and table lookup.
KW - Bit parallelism
KW - Real time
KW - Space efficiency
KW - String matching
UR - http://www.scopus.com/inward/record.url?scp=84863112223&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.FSTTCS.2011.423
DO - 10.4230/LIPIcs.FSTTCS.2011.423
M3 - Conference contribution
AN - SCOPUS:84863112223
SN - 9783939897347
T3 - Leibniz International Proceedings in Informatics, LIPIcs
SP - 423
EP - 432
BT - 31st International Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2011
T2 - 31st International Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2011
Y2 - 12 December 2011 through 14 December 2011
ER -