Abstract
With the availability of enormous quantities of genetic data, it has become common to construct very accurate trees describing the evolutionary history of the species under study, as well as of every single gene of these species. These trees allow us to examine the evolutionary compliance of given markers (characters). A marker compliant with the history of the species investigated has undergone mutations along the species tree branches, such that every subtree of that tree exhibits a different state. Convex recoloring (CR) uses a combinatorial representation to measure the adequacy of a taxonomic classifier to a given tree. Despite its biological origins, research on CR has been almost exclusively dedicated to mathematical properties of the problem, or to variants of it with little, if any, relationship to taxonomy. In this work we return to the origins of CR. We put CR in a statistical framework and introduce and study the notion of the statistical significance of a character. We apply this measure to two data sets, Passerine birds and prokaryotes, and to four examples. These examples demonstrate various applications of CR, from evolutionary relatedness, through lateral evolution, to supertree construction. The study was carried out with new software that we provide, which contains algorithmic improvements and produces a graphical output of an (optimally) recolored tree. Availability: Code implementing these features, together with a README, is available at http://research.haifa.ac.il/ssagi/software/convexrecoloring.zip.
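The idea of a character that is compliant with a tree can be illustrated with a minimal sketch. It assumes the usual combinatorial definition of convexity: a coloring of the leaves is convex when the minimal subtrees connecting the leaves of each state are pairwise disjoint. The function names, the adjacency-dictionary representation, and the toy four-leaf tree are hypothetical choices made for illustration and are not taken from the software described above. The sketch only tests convexity; it does not compute an optimal recoloring cost or a statistical significance.

```python
from collections import defaultdict


def spanning_subtree(adj, targets):
    """Return the nodes of the minimal subtree connecting all `targets`.

    `adj` maps each node of an undirected tree to a list of its neighbours.
    The subtree is found by repeatedly pruning leaves that are not targets.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    alive = set(adj)
    stack = [v for v in adj if degree[v] <= 1 and v not in targets]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already pruned via an earlier stack entry
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] <= 1 and u not in targets:
                    stack.append(u)
    return alive


def is_convex(adj, leaf_state):
    """True if the subtrees spanned by each state share no node."""
    by_state = defaultdict(set)
    for leaf, state in leaf_state.items():
        by_state[state].add(leaf)
    owner = {}
    for state, leaves in by_state.items():
        for node in spanning_subtree(adj, leaves):
            if node in owner:
                return False  # two states need the same node
            owner[node] = state
    return True


# Toy tree ((a,b),(c,d)) with internal nodes x and y (illustrative only).
tree = {
    "a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"],
    "x": ["a", "b", "y"], "y": ["c", "d", "x"],
}
compliant = {"a": "red", "b": "red", "c": "blue", "d": "blue"}
conflicting = {"a": "red", "c": "red", "b": "blue", "d": "blue"}

print(is_convex(tree, compliant))    # True
print(is_convex(tree, conflicting))  # False
```

In the second coloring both states need the internal edge between x and y, so at least one leaf would have to change state before the character fits the tree. The number of such changes is the kind of quantity a recoloring cost measures.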
Original language | English |
---|---|
Pages (from-to) | 209-220 |
Number of pages | 12 |
Journal | Molecular Phylogenetics and Evolution |
Volume | 107 |
State | Published - 1 Feb 2017 |
Bibliographical note
Publisher Copyright: © 2016 Elsevier Inc.
Keywords
- Character compatibility
- Maximum parsimony
- Optimal convex recoloring cost
- Perfect phylogeny
- Phylogenetics
- Statistical significance
- Supertree
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Molecular Biology
- Genetics