Given a text of length n, a pattern of length m and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k substitutions. The algorithm runs in O(k(m log m + n)) time, and requires O(nk) space. This algorithm has direct implications for nucleotide and amino acid sequence comparisons.
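To make the problem statement concrete, the following is a minimal sketch of the matching task itself. It is a naive O(nm) baseline, not the O(k(m log m + n)) algorithm described above. The function name `k_mismatch_occurrences` and the sample sequences are hypothetical and chosen only for illustration.

```python
def k_mismatch_occurrences(text, pattern, k):
    """Return every start index at which `pattern` aligns with `text`
    using at most `k` substitutions (mismatched positions).

    Naive baseline for illustration only; it does not implement the
    algorithm from the abstract.
    """
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[start + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    # Too many substitutions; abandon this alignment early.
                    break
        if mismatches <= k:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Hypothetical nucleotide example: allow at most one substitution.
    print(k_mismatch_occurrences("ACGTACGA", "ACGT", 1))
```

With k = 1, the sample reports the exact occurrence at index 0 and the one-substitution occurrence at index 4.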
Bibliographical note
Funding Information:
U. Vishkin has been supported by NSF grants NSF-CCR-8615337 and NSF-DCR-8413359 and by ONR grant N00014-85-K-0046. U. Vishkin and G. M. Landau have been supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under contract number DE-AC02-76ER03077.
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Biochemistry, Genetics and Molecular Biology (all)
- Immunology and Microbiology (all)
- Agricultural and Biological Sciences (all)
- Applied Mathematics