Abstract
Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation (combining a set of trees over four taxa into a tree over the full set of taxa) stands at the core of many phylogenetic reconstruction methods. This task has attracted much theoretical and practical work. However, even reconstruction from a consistent set of quartet trees is NP-hard, and the best approximation ratio known is 1/3. Despite the importance of the problem, the only rigorous results for approximating quartets are the naive 1/3 approximation, which applies to the general case, and a polynomial time approximation scheme (PTAS) when the input is the complete set of all $\binom{n}{4}$ possible quartets. Even when it is possible to determine the correct quartet induced by every four taxa, the time needed to generate the complete set of all quartets may be impractical. A faster approach is to sample at random just $m \ll \binom{n}{4}$ quartets and provide this sample as an input. In this work we present the first polynomial time approximation algorithm whose expected guaranteed approximation is strictly better than 1/3 when the input is any random sample of m consistent quartets. The approximation ratio of the algorithm is greater than 0.425. An important ingredient in our algorithm involves solving a weighted maximum cut problem in a certain weighted graph that corresponds to the set of input quartets. Our second main result generalizes the aforementioned PTAS to handle dense, rather than complete, inputs.
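To give a sense of the sampling setting described above, the following minimal Python sketch counts the $\binom{n}{4}$ possible four-taxon subsets and draws m of them uniformly at random without repetition. It is an illustration only, not the algorithm from the paper: the function names `total_quartets` and `sample_quartet_taxa` are hypothetical, and the sketch selects only which four taxa form each sampled quartet. Determining the tree topology on each chosen subset is assumed to be done separately.

```python
import math
import random


def total_quartets(n):
    """Number of distinct 4-taxon subsets of n taxa, i.e. C(n, 4)."""
    return math.comb(n, 4)


def sample_quartet_taxa(taxa, m, seed=None):
    """Draw m distinct 4-taxon subsets uniformly at random.

    Hypothetical helper: it only picks the taxa of each quartet. Uses
    rejection sampling, which is efficient when m is much smaller
    than C(n, 4), the regime the abstract is concerned with.
    """
    taxa = list(taxa)
    if m > total_quartets(len(taxa)):
        raise ValueError("m exceeds the number of possible quartets")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < m:
        # Sort so that the same subset always maps to the same tuple.
        chosen.add(tuple(sorted(rng.sample(taxa, 4))))
    return sorted(chosen)
```

For n = 100 taxa there are already 3,921,225 four-taxon subsets, which suggests why generating the complete quartet set can be impractical and why a sample of size m may be preferred.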
Original language | English |
---|---|
Pages (from-to) | 1466-1480 |
Number of pages | 15 |
Journal | SIAM Journal on Computing |
Volume | 41 |
Issue number | 6 |
DOIs | |
State | Published - 2012 |
Keywords
- Approximation scheme
- Maxcut
- Phylogenetic reconstruction
- Quartet amalgamation
ASJC Scopus subject areas
- General Computer Science
- General Mathematics