Abstract
We revisit the fundamental problem of dictionary look-up with mismatches. Given a set (dictionary) of d strings of length m and an integer k, we must preprocess it into a data structure to answer the following queries: Given a query string Q of length m, find all strings in the dictionary that are at Hamming distance at most k from Q. Chan and Lewenstein (CPM 2015) showed a data structure for k = 1 with optimal query time O(m/w + occ), where w is the size of a machine word and occ is the size of the output. The data structure occupies O(wd log^{1+ε} d) extra bits of space (beyond the entropy-bounded space required to store the dictionary strings). In this work we give a solution with similar bounds for a much wider range of values of k. Namely, we give a data structure that has O(m/w + log^k d + occ) query time and uses O(wd log^k d) extra bits of space.
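To make the query semantics concrete, the following is a minimal sketch of the naive baseline: it scans the whole dictionary and compares every string against Q, which costs O(dm) time per query. It is not the data structure described in the abstract and only illustrates what a query must return. The function names `hamming_distance` and `lookup_with_mismatches` and the sample strings are hypothetical, chosen for illustration.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def lookup_with_mismatches(dictionary: List[str], query: str, k: int) -> List[str]:
    """Naive baseline (hypothetical helper, not the paper's structure):
    return every dictionary string within Hamming distance k of the query
    by scanning all d strings, i.e. O(d * m) time per query."""
    return [s for s in dictionary if hamming_distance(s, query) <= k]


if __name__ == "__main__":
    # Illustrative dictionary of d = 4 strings of length m = 4.
    words = ["abca", "abcb", "bbcb", "dddd"]
    print(lookup_with_mismatches(words, "abcb", 1))
```

With k = 1 and Q = "abcb", the scan reports "abca", "abcb", and "bbcb", so occ = 3. The point of a preprocessed data structure is to avoid this linear scan over the dictionary.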
| Original language | English |
|---|---|
| Title of host publication | 43rd International Symposium on Mathematical Foundations of Computer Science, MFCS 2018 |
| Editors | Igor Potapov, James Worrell, Paul Spirakis |
| Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
| ISBN (Print) | 9783959770866 |
| State | Published - 1 Aug 2018 |
| Event | 43rd International Symposium on Mathematical Foundations of Computer Science, MFCS 2018, Liverpool, United Kingdom. Duration: 27 Aug 2018 → 31 Aug 2018 |
Publication series
| Name | Leibniz International Proceedings in Informatics, LIPIcs |
|---|---|
| Volume | 117 |
| ISSN (Print) | 1868-8969 |
Conference
| Conference | 43rd International Symposium on Mathematical Foundations of Computer Science, MFCS 2018 |
|---|---|
| Country/Territory | United Kingdom |
| City | Liverpool |
| Period | 27/08/18 → 31/08/18 |
Bibliographical note
Publisher Copyright: © Paweł Gawrychowski, Gad M. Landau, and Tatiana Starikovskaya.
Keywords
- Compact data structures
- Dictionary look-up
- Hamming distance
ASJC Scopus subject areas
- Software