Abstract
Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string pw from p and a determinate string tw from t by choosing one character per position. Then, we say that p and t match when pw and tw match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to denote an upper bound on the number of uncertain characters per position. First, we provide the first polynomial-time algorithm for the Cartesian tree version; for constant r, it runs in deterministic O(n log² n) and expected O(n log n log log n) time using O(n log n) space. Second, we establish NP-hardness of the order-preserving version for r = 2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r = 3. Third, we establish NP-hardness of the parameterized version for r = 2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r = 1, this provides a complete classification for these three variants.
2012 ACM Subject Classification: Theory of computation → Pattern matching.
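To make the definition concrete, here is a minimal brute-force sketch of indeterminate matching under the four notions named in the abstract. It is not the algorithm from the paper: it tries every choice of characters and so takes exponential time. The function names are hypothetical, and it assumes that Cartesian tree matching is tested by comparing parent-distance encodings (distance to the nearest earlier position holding a value that is not larger, or 0 if there is none), with ties resolved so that the leftmost minimum is the root.

```python
from itertools import product

def exact_match(pw, tw):
    return tuple(pw) == tuple(tw)

def parameterized_match(pw, tw):
    # A bijection between the used characters must map pw onto tw.
    pairs = set(zip(pw, tw))
    return len(pairs) == len(set(pw)) == len(set(tw))

def order_preserving_match(pw, tw):
    # Relative order of every pair of positions must agree.
    n = len(pw)
    return all((pw[i] < pw[j]) == (tw[i] < tw[j])
               for i in range(n) for j in range(n))

def parent_distances(s):
    # Assumed encoding of the Cartesian tree shape (see lead-in).
    stack, out = [], []
    for i, v in enumerate(s):
        while stack and s[stack[-1]] > v:
            stack.pop()
        out.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return out

def cartesian_tree_match(pw, tw):
    return parent_distances(pw) == parent_distances(tw)

def indeterminate_match(p, t, same):
    """p, t: equal-length lists of character sets; same: a determinate matcher.

    Brute force over all determinate strings pw, tw (exponential time).
    """
    if len(p) != len(t):
        raise ValueError("strings must have equal length")
    return any(same(pw, tw) for pw in product(*p) for tw in product(*t))

# Here n = 3 and r = 2: only the middle position of p is uncertain.
p = [{2}, {1, 3}, {2}]
t = [{5}, {4}, {6}]
print(indeterminate_match(p, t, cartesian_tree_match))    # True
print(indeterminate_match(p, t, order_preserving_match))  # False
```

In this example, choosing 1 for the middle position of p gives (2, 1, 2), whose Cartesian tree has the same shape as that of (5, 4, 6), while no choice reproduces the full relative order of (5, 4, 6).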
Original language | English |
---|---|
Title of host publication | 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 |
Editors | Inge Li Gørtz, Oren Weimann |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
Pages | 14:1-14:14 |
ISBN (Electronic) | 9783959771498 |
State | Published - 1 Jun 2020 |
Event | 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 - Copenhagen, Denmark; Duration: 17 Jun 2020 → 19 Jun 2020 |
Publication series
Name | Leibniz International Proceedings in Informatics, LIPIcs |
---|---|
Volume | 161 |
ISSN (Print) | 1868-8969 |
Conference
Conference | 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 |
---|---|
Country/Territory | Denmark |
City | Copenhagen |
Period | 17/06/20 → 19/06/20 |
Bibliographical note
Funding Information: Samah Ghazawi: Partially supported by the Israel Science Foundation (ISF) grant 1475/18. Gad M. Landau: Partially supported by the Israel Science Foundation (ISF) grant 1475/18, and the United States-Israel Binational Science Foundation (BSF) grant No. 2018141.
Publisher Copyright:
© 2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
Keywords
- Cartesian trees
- Indeterminate strings
- Order-preserving matching
- Parameterized matching
- String matching
ASJC Scopus subject areas
- Software