Top Tree Compression of Tries

Philip Bille, Paweł Gawrychowski, Inge Li Gørtz, Gad M. Landau, Oren Weimann

Research output: Contribution to journal › Article › peer-review

Abstract

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to preprocess a set of strings of total length n over an alphabet of size σ into a compressed data structure of worst-case optimal size O(n / log_σ n) that, given a pattern string P of length m, determines whether P is a prefix of one of the strings in time O(min(m log σ, m + log n)). We show that this query time is in fact optimal regardless of the size of the data structure. Existing solutions either use Ω(n) space or rely on word RAM techniques, such as tabulation, hashing, address arithmetic, or word-level parallelism, and hence do not work on a pointer machine. Our result is the first solution on a pointer machine that achieves worst-case o(n) space. Along the way, we develop several data structures that work on a pointer machine and are of independent interest. These include an optimal data structure for random access to a grammar-compressed string and an optimal data structure for a variant of the level ancestor problem.
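To make the query concrete, the following is a minimal sketch of prefix search on a plain, uncompressed trie. It is not the top-tree-compressed structure of the paper: it uses Ω(n) space, which is exactly the baseline the abstract improves on. The names `TrieNode`, `insert`, `is_prefix`, and `build_trie` are hypothetical, chosen for illustration. Children are kept in sorted order and located by comparisons only, giving O(m log σ) time per query; the sketch uses Python lists and `bisect` for brevity, where a true pointer machine would use a balanced search tree over each node's children.

```python
from bisect import bisect_left


class TrieNode:
    """Node of a plain, uncompressed trie (illustrative, not from the paper)."""

    def __init__(self):
        self.labels = []    # sorted edge labels (single characters)
        self.children = []  # children[i] is the child reached via labels[i]


def insert(root, s):
    """Insert string s into the trie rooted at root."""
    node = root
    for ch in s:
        i = bisect_left(node.labels, ch)
        if i == len(node.labels) or node.labels[i] != ch:
            # No edge labeled ch yet: create it, keeping labels sorted.
            node.labels.insert(i, ch)
            node.children.insert(i, TrieNode())
        node = node.children[i]


def is_prefix(root, pattern):
    """Return True if pattern is a prefix of some inserted string.

    Each step is a comparison-based search among at most sigma children,
    so a query makes O(m log sigma) comparisons for a pattern of length m.
    """
    node = root
    for ch in pattern:
        i = bisect_left(node.labels, ch)
        if i == len(node.labels) or node.labels[i] != ch:
            return False
        node = node.children[i]
    return True


def build_trie(strings):
    """Build a trie over an iterable of strings."""
    root = TrieNode()
    for s in strings:
        insert(root, s)
    return root
```

For example, after `root = build_trie(["banana", "band", "bee"])`, the call `is_prefix(root, "ban")` succeeds while `is_prefix(root, "bad")` fails at the third character.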

Original language: English
Pages (from-to): 3602-3628
Number of pages: 27
Journal: Algorithmica
Volume: 83
Issue number: 12
State: Published - 2021

Bibliographical note

Funding Information:
Philip Bille and Inge Li Gørtz: Supported by the Danish Research Council (DFF—4005-00267, DFF—1323-00178). Gad M. Landau: Supported by the Israel Science Foundation Grants 1475/18, and No. 2018141 from the United States-Israel Binational Science Foundation. Oren Weimann: Supported by the Israel Science Foundation Grant 592/17.

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Keywords

  • Pattern matching
  • Pointer machine
  • Top trees
  • Tree compression

ASJC Scopus subject areas

  • General Computer Science
  • Computer Science Applications
  • Applied Mathematics
