TY - CHAP

T1 - Sparse LCS common substring alignment

AU - Landau, Gad M.

AU - Schieber, Baruch

AU - Ziv-Ukelson, Michal

PY - 2003

Y1 - 2003

N2 - The "Common Substring Alignment" problem is defined as follows. The input consists of a set of strings S1,S2...Sc, with a common substring appearing at least once in each of them, and a target string T. The goal is to compute similarity of all strings Si with T, without computing the part of the common substring over and over again. In this paper we consider the Common Substring Alignment problem for the LCS (Longest Common Subsequence) similarity metric. Our algorithm gains its efficiency by exploiting the sparsity inherent to the LCS problem. Let Y be the common substring, n be the size of the compared sequences, Ly be the length of the LCS of T and Y, denoted |LCS[T, Y]|, and L be max{|LCS[T, Si]|}. Our algorithm consists of an O(nLy) time encoding stage that is executed once per common substring, and an O(L) time alignment stage that is executed once for each appearance of the common substring in each source string. The additional running time depends only on the length of the parts of the strings that are not in any common substring.

AB - The "Common Substring Alignment" problem is defined as follows. The input consists of a set of strings S1,S2...Sc, with a common substring appearing at least once in each of them, and a target string T. The goal is to compute similarity of all strings Si with T, without computing the part of the common substring over and over again. In this paper we consider the Common Substring Alignment problem for the LCS (Longest Common Subsequence) similarity metric. Our algorithm gains its efficiency by exploiting the sparsity inherent to the LCS problem. Let Y be the common substring, n be the size of the compared sequences, Ly be the length of the LCS of T and Y, denoted |LCS[T, Y]|, and L be max{|LCS[T, Si]|}. Our algorithm consists of an O(nLy) time encoding stage that is executed once per common substring, and an O(L) time alignment stage that is executed once for each appearance of the common substring in each source string. The additional running time depends only on the length of the parts of the strings that are not in any common substring.

UR - http://www.scopus.com/inward/record.url?scp=35248816884&partnerID=8YFLogxK

U2 - 10.1007/3-540-44888-8_17

DO - 10.1007/3-540-44888-8_17

M3 - Chapter

AN - SCOPUS:35248816884

SN - 3540403116

SN - 9783540403111

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 225

EP - 236

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

A2 - Baeza-Yates, Ricardo

A2 - Chavez, Edgar

A2 - Crochemore, Maxime

PB - Springer Verlag

ER -