Abstract
Supertree methods are used to construct a large tree over a large set of taxa from a set of small trees over overlapping subsets of the complete taxa set. Since accurate reconstruction methods are currently limited to a maximum of a few dozen taxa, the use of a supertree method in order to construct the tree of life is inevitable. Supertree methods are broadly divided according to the input trees. When the input trees are unrooted, the basic reconstruction unit is a quartet tree, and the basic decision problem of whether there exists a tree that agrees with all quartets is NP-complete. When the input trees are rooted, the basic reconstruction unit is a rooted triplet, and the corresponding decision problem has a polynomial-time algorithm. When no tree agrees with all triplets, it is desirable to find a tree that agrees with the maximum number of triplets; this optimization problem, however, was shown to be NP-hard. Current heuristic approaches perform a min cut on a graph representing the triplet inconsistency and return a tree that is guaranteed to satisfy some required properties. In this work, we present a different heuristic approach that guarantees the properties provided by the current methods, and we give experimental evidence that it significantly outperforms currently used methods. The method is based on a divide-and-conquer approach in which the min cut in the divide step is replaced by a max cut in a variant of the same graph. The max cut is computed by a lightweight semidefinite programming-like heuristic that leads to very fast running times.
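To make the rooted-triplet decision problem concrete, the following is a minimal Python sketch of one well-known polynomial-time procedure for it, usually attributed to Aho et al. and often called BUILD. It is not the max-cut heuristic presented in the paper and only handles the case where the triplets are fully consistent. The function name `build`, the encoding of a triplet `(a, b, c)` as "a and b are closer to each other than either is to c", and the nested-tuple tree representation are all assumptions made for this illustration.

```python
from collections import defaultdict


def build(taxa, triplets):
    """Return a nested-tuple tree that agrees with every triplet, or None.

    Illustrative assumption: a triplet (a, b, c) encodes ab|c, and every
    triplet only mentions members of `taxa`.
    """
    taxa = list(taxa)
    if len(taxa) == 1:
        return taxa[0]
    if len(taxa) == 2:
        return tuple(taxa)

    # Union-find over the taxa: each triplet ab|c joins a and b.
    parent = {t: t for t in taxa}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in triplets:
        parent[find(a)] = find(b)

    # Connected components become the children of the current node.
    groups = defaultdict(list)
    for t in taxa:
        groups[find(t)].append(t)

    # A single component means no split is possible: inconsistent input.
    if len(groups) == 1:
        return None

    subtrees = []
    for members in groups.values():
        inside = set(members)
        # Keep only the triplets that lie entirely inside this component.
        sub = [t for t in triplets if inside.issuperset(t)]
        child = build(members, sub)
        if child is None:
            return None
        subtrees.append(child)
    return tuple(subtrees)


consistent = build("abcd", [("a", "b", "c"), ("c", "d", "a")])
conflicting = build("abc", [("a", "b", "c"), ("b", "c", "a")])
```

In this sketch, a single connected component at any level signals that no tree agrees with all triplets. That is the situation in which the heuristics described in the abstract must cut the graph instead of giving up.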
Original language | English |
---|---|
Pages (from-to) | 323-333 |
Number of pages | 11 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 3 |
Issue number | 4 |
State | Published - Oct 2006 |
Externally published | Yes |
Bibliographical note
Funding Information: The authors would very much like to thank David Fernandez Baca and Duhong Chen for their help with the many technicalities involved and Usman Roshan for providing the rbcL data. They also thank Benny Chor, Oliver Eulenstein, Rod Page, Mauricio Resende, and Mike Steel for helpful discussions. Finally, they thank the two anonymous referees who gave very helpful comments on the manuscript. This research was supported by US National Institutes of Health Grant R01-HG02362-02. Satish Rao was partially supported by US National Science Foundation Award 0331494.
Keywords
- Phylogenetic trees
- Rooted triplets
- Semidefinite programming
- Supertrees
ASJC Scopus subject areas
- Biotechnology
- Genetics
- Applied Mathematics