Approximate runs - revisited

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The problem of finding repeats within a string is an important computational problem with applications in data compression and in the field of molecular biology. Both exact and inexact repeats occur frequently in the genome, and certain repeats are known to be related to human diseases. A multiple tandem repeat in a sequence S is a (periodic) substring r of S of the form r = u^a u′, where u (the period) is a prefix of r, u′ is a prefix of u and a ≥ 2. A run is a maximal (non-extendable) multiple tandem repeat. An approximate run is a run with errors (i.e., the repeated subsequences are similar but not identical). Many measures have been proposed that capture the similarity among all periods. We may measure the number of errors between consecutive periods, between all periods, or between each period and a consensus string. Another possible measure is the number of positions in the periods that may differ. In this talk I will survey a range of our results in this area. Various parts of this work are joint work with Maxime Crochemore, Gene Myers, Jeanette Schmidt and Dina Sokol.
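The definitions in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not the algorithm surveyed in the talk: the function names are hypothetical, the checks are naive, and Hamming distance is assumed as the error measure between consecutive periods (the abstract does not fix one).

```python
def is_multiple_tandem_repeat(r, p):
    """Return True if r = u^a u' with |u| = p, a >= 2 and u' a prefix of u.

    Equivalent to: r has period p and contains at least two full periods.
    """
    if p <= 0 or len(r) < 2 * p:
        return False
    return all(r[i] == r[i - p] for i in range(p, len(r)))


def extend_to_run(s, start, end, p):
    """Extend the repeat s[start:end] (period p) left and right while the
    periodicity still holds, giving a maximal (non-extendable) repeat."""
    while start > 0 and s[start - 1] == s[start - 1 + p]:
        start -= 1
    while end < len(s) and s[end] == s[end - p]:
        end += 1
    return start, end


def consecutive_period_errors(r, p):
    """Hamming distance between each pair of consecutive full periods of
    length p (an assumed measure; a trailing partial period is ignored)."""
    periods = [r[i:i + p] for i in range(0, len(r) - p + 1, p)]
    return [sum(x != y for x, y in zip(a, b))
            for a, b in zip(periods, periods[1:])]
```

For example, `extend_to_run("xabababy", 1, 5, 2)` grows the repeat `abab` to the run `ababab`, and `consecutive_period_errors("abcabdabc", 3)` reports one mismatch between each pair of neighbouring periods, which is the kind of string an approximate-run measure is meant to capture.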

Original language: English
Title of host publication: String Processing and Information Retrieval - 15th International Symposium, SPIRE 2008, Proceedings
Editors: Andrew Turpin, Alistair Moffat, Amihood Amir
Publisher: Springer Verlag
Pages: 2
Number of pages: 1
ISBN (Print): 9783540890966
DOIs
State: Published - 2008
Event: 15th International Symposium on String Processing and Information Retrieval, SPIRE 2008 - Melbourne, VIC, Australia
Duration: 10 Nov 2008 → 12 Nov 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5280 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Symposium on String Processing and Information Retrieval, SPIRE 2008
Country/Territory: Australia
City: Melbourne, VIC
Period: 10/11/08 → 12/11/08

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2008.

Keywords

  • Prefix

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
