A linear time approximation scheme for maximum quartet consistency on sparse sampled inputs

Research output: Contribution to journal › Article › peer-review

Abstract

Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation, the task of combining a set of trees over four taxa into a tree over the full set, stands at the heart of many phylogenetic reconstruction methods. This task has attracted many theoretical as well as practical works. However, even reconstruction from a consistent set of quartet trees, i.e., a set in which all quartets agree with some tree, is NP-hard, and the best approximation ratio known is 1/3. For a dense input of Θ(n^4) quartets that are not necessarily consistent, the problem has a polynomial time approximation scheme. When the number of taxa grows, considering such dense inputs is impractical and some sampling approach is imperative. It is known that given a randomly sampled consistent set of quartets from an unknown phylogeny, one can find, in polynomial time and with high probability, a tree satisfying a 0.425 fraction of them, an improvement over the 1/3 ratio. In this paper we further show that given a randomly sampled consistent set of quartets from an unknown phylogeny, where the size of the sample is at least Θ(n^2 log n), there is a randomized approximation scheme that runs in time linear in the number of quartets. The previously known polynomial time approximation scheme for that problem required a very dense sample of size Θ(n^4). We note that samples of size Θ(n^2 log n) are sparse in the full quartet set. The result is obtained by a combinatorial technique that may be of independent interest.
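
To make the notion of "a tree satisfying a fraction of the quartets" concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it only evaluates a given candidate tree against a list of quartets. The function names, the adjacency-dictionary representation of an unrooted tree, and the toy five-taxon example are all assumptions made for illustration. A quartet ab|cd is treated as satisfied when the tree separates {a, b} from {c, d}, which is checked with the four-point condition on hop distances.

```python
from collections import deque


def leaf_distances(adj, leaves):
    """Return hop distances from every leaf to every node (one BFS per leaf)."""
    dist = {}
    for src in leaves:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist


def satisfies(dist, quartet):
    """True if the tree induces the split ab|cd (four-point condition)."""
    (a, b), (c, d) = quartet
    within = dist[a][b] + dist[c][d]
    cross1 = dist[a][c] + dist[b][d]
    cross2 = dist[a][d] + dist[b][c]
    # The induced split is the pairing with the strictly smallest distance sum.
    return within < cross1 and within < cross2


def fraction_satisfied(adj, leaves, quartets):
    """Fraction of the given quartets that the tree satisfies."""
    dist = leaf_distances(adj, leaves)
    return sum(satisfies(dist, q) for q in quartets) / len(quartets)


# Hypothetical toy input: a caterpillar tree on taxa a..e with internal nodes x, y, z.
edges = [("a", "x"), ("b", "x"), ("x", "y"), ("c", "y"),
         ("y", "z"), ("d", "z"), ("e", "z")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

leaves = ["a", "b", "c", "d", "e"]
# Each quartet ab|cd is written as ((a, b), (c, d)).
quartets = [(("a", "b"), ("c", "d")),
            (("a", "c"), ("b", "d")),   # conflicts with the tree above
            (("a", "b"), ("d", "e"))]

print(fraction_satisfied(adj, leaves, quartets))
```

In this toy input, two of the three quartets agree with the tree. In the setting of the abstract, the quartets would instead be sampled from an unknown phylogeny and the goal would be to construct a tree that makes this fraction as large as possible.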

Original language: English
Pages (from-to): 1722-1736
Number of pages: 15
Journal: SIAM Journal on Discrete Mathematics
Volume: 25
Issue number: 4
State: Published - 2011

Keywords

  • Approximation scheme
  • Phylogenetic reconstruction
  • Quartet amalgamation

ASJC Scopus subject areas

  • General Mathematics

Related research output

  • A linear time approximation scheme for maximum quartet consistency on sparse sampled inputs

    Snir, S. & Yuster, R., 2011, Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques - 14th International Workshop, APPROX 2011 and 15th International Workshop, RANDOM 2011, Proceedings. p. 339-350 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 6845 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
