On Indeterminate Strings Matching

Paweł Gawrychowski, Samah Ghazawi, Gad M. Landau

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Given two indeterminate equal-length strings p and t with a set of characters per position in both strings, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. Then, we say that p and t match when p_w and t_w match for some choice of the characters. While the most standard notion of a match for determinate strings is that they are simply identical, in certain applications it is more appropriate to use other definitions, with the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for indeterminate equal-length strings, for different notions of matching. We use n to denote the length of both strings, and r to denote an upper bound on the number of uncertain characters per position. First, we provide the first polynomial-time algorithm for the Cartesian tree version that runs in deterministic O(n log² n) and expected O(n log n log log n) time using O(n log n) space, for constant r. Second, we establish NP-hardness of the order-preserving version for r = 2, thus solving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r = 3. Third, we establish NP-hardness of the parameterized version for r = 2. As both parameterized and order-preserving indeterminate matching reduce to the standard determinate matching for r = 1, this provides a complete classification for these three variants.

2012 ACM Subject Classification: Theory of computation → Pattern matching.
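To make the definitions in the abstract concrete, the following is a minimal brute-force sketch, not the algorithm from the paper. It enumerates every choice of characters (exponential in n, so suitable only for tiny inputs) and compares canonical signatures under each notion of matching. All function names here are hypothetical, and the parameterized variant assumes every character is a parameter (no static characters).

```python
from itertools import product


def parent_distance(s):
    """Cartesian-tree signature: distance to the nearest earlier
    position holding a value <= s[i], or 0 if there is none.
    Ties are broken in favor of the leftmost minimum (an assumption)."""
    out, stack = [], []
    for i, x in enumerate(s):
        while stack and s[stack[-1]] > x:
            stack.pop()
        out.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return tuple(out)


def dense_ranks(s):
    """Order-preserving signature: rank of each value among the
    distinct values of s."""
    order = {v: k for k, v in enumerate(sorted(set(s)))}
    return tuple(order[v] for v in s)


def first_occurrence_code(s):
    """Parameterized signature: each character is replaced by the
    order in which it first appears."""
    seen = {}
    return tuple(seen.setdefault(c, len(seen)) for c in s)


def indeterminate_match(p, t, signature):
    """p and t are equal-length lists of character sets. Returns True
    if some choice p_w from p and t_w from t have equal signatures."""
    assert len(p) == len(t)
    sigs_t = {signature(tw) for tw in product(*t)}
    return any(signature(pw) in sigs_t for pw in product(*p))
```

For instance, the determinate strings (2, 1, 3) and (3, 1, 2) have the same Cartesian tree shape but are not order-isomorphic, so `indeterminate_match([{2}, {1}, {3}], [{3}, {1}, {2}], parent_distance)` holds while the same call with `dense_ranks` does not.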

Original language: English
Title of host publication: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020
Editors: Inge Li Gørtz, Oren Weimann
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Pages: 14:1-14:14
ISBN (Electronic): 9783959771498
State: Published - 1 Jun 2020
Event: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020 - Copenhagen, Denmark
Duration: 17 Jun 2020 → 19 Jun 2020

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 161
ISSN (Print): 1868-8969

Conference

Conference: 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020
Country/Territory: Denmark
City: Copenhagen
Period: 17/06/20 → 19/06/20

Bibliographical note

Funding Information:
Samah Ghazawi: Partially supported by the Israel Science Foundation (ISF) grant 1475/18. Gad M. Landau: Partially supported by the Israel Science Foundation (ISF) grant 1475/18, and the United States-Israel Binational Science Foundation (BSF) grant No. 2018141.

Publisher Copyright:
© 2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.

Keywords

  • Cartesian trees
  • Indeterminate strings
  • Order-preserving matching
  • Parameterized matching
  • String matching

ASJC Scopus subject areas

  • Software
