TY - GEN

T1 - Consequences of faster alignment of sequences

AU - Abboud, Amir

AU - Williams, Virginia Vassilevska

AU - Weimann, Oren

PY - 2014

Y1 - 2014

N2 - The Local Alignment problem is a classical problem with applications in biology. Given two input strings and a scoring function on pairs of letters, one is asked to find the substrings of the two input strings that are most similar under the scoring function. The best algorithms for Local Alignment run in time that is roughly quadratic in the string length. It is a big open problem whether substantially subquadratic algorithms exist. In this paper we show that for all ε > 0, an O(n^{2-ε}) time algorithm for Local Alignment on strings of length n would imply breakthroughs on three longstanding open problems: it would imply that for some δ > 0, 3SUM on n numbers is in O(n^{2-δ}) time, CNF-SAT on n variables is in O((2-δ)^n) time, and Max Weight 4-Clique is in O(n^{4-δ}) time. Our result for CNF-SAT also applies to the easier problem of finding the longest common substring of binary strings with don't cares. We also give strong conditional lower bounds for the more general Multiple Local Alignment problem on k strings, under both k-wise and SP scoring, and for other string similarity problems such as Global Alignment with gap penalties and normalized Longest Common Subsequence.

AB - The Local Alignment problem is a classical problem with applications in biology. Given two input strings and a scoring function on pairs of letters, one is asked to find the substrings of the two input strings that are most similar under the scoring function. The best algorithms for Local Alignment run in time that is roughly quadratic in the string length. It is a big open problem whether substantially subquadratic algorithms exist. In this paper we show that for all ε > 0, an O(n^{2-ε}) time algorithm for Local Alignment on strings of length n would imply breakthroughs on three longstanding open problems: it would imply that for some δ > 0, 3SUM on n numbers is in O(n^{2-δ}) time, CNF-SAT on n variables is in O((2-δ)^n) time, and Max Weight 4-Clique is in O(n^{4-δ}) time. Our result for CNF-SAT also applies to the easier problem of finding the longest common substring of binary strings with don't cares. We also give strong conditional lower bounds for the more general Multiple Local Alignment problem on k strings, under both k-wise and SP scoring, and for other string similarity problems such as Global Alignment with gap penalties and normalized Longest Common Subsequence.

UR - http://www.scopus.com/inward/record.url?scp=84904205204&partnerID=8YFLogxK

U2 - 10.1007/978-3-662-43948-7_4

DO - 10.1007/978-3-662-43948-7_4

M3 - Conference contribution

AN - SCOPUS:84904205204

SN - 9783662439470

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 39

EP - 51

BT - Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Proceedings

PB - Springer Verlag

T2 - 41st International Colloquium on Automata, Languages, and Programming, ICALP 2014

Y2 - 8 July 2014 through 11 July 2014

ER -