This work deals with symbolic mathematical solutions to maximum likelihood on small phylogenetic trees. Maximum likelihood (ML) is increasingly used as an optimality criterion for selecting evolutionary trees, but finding the global optimum is a hard computational task. In this work, we give general analytic solutions for a family of trees with four taxa and two-state characters, under a molecular clock. Previously, analytical solutions were known only for three-taxon trees. The change from three to four taxa incurs a major increase in the complexity of the underlying algebraic system, and requires novel techniques and approaches. Despite the simplicity of our model, solving ML analytically under it is close to the limit of what is tractable today. Rooted four-taxon trees have two topologies: the fork (two subtrees with two leaves each) and the comb (one subtree with three leaves, the other with a single leaf). Combining the properties of molecular clock fork trees with the Hadamard conjugation, and employing the symbolic algebra software Maple, we derive a number of topology-dependent identities. Using these identities, we substantially simplify the system of polynomial equations for the fork. We finally employ the symbolic algebra software to obtain closed-form analytic solutions (expressed parametrically in the input data).
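To make the setting concrete, the following is a minimal numerical sketch, not the symbolic derivation described above. It assumes the standard two-state symmetric model and the usual form of the Hadamard conjugation, s = H^{-1} exp(Hq), to compute expected site-pattern frequencies for a four-taxon fork under a molecular clock. The function names, the parameters `a`, `b`, `h` (heights of the two cherries and of the root) and the bitmask indexing of patterns are illustrative assumptions, not notation from the paper.

```python
import numpy as np


def hadamard(m):
    """Sylvester Hadamard matrix of order 2**m; entry (i, j) is (-1)**popcount(i & j)."""
    H = np.array([[1.0]])
    H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
    for _ in range(m):
        H = np.kron(H, H2)
    return H


def fork_pattern_probs(a, b, h):
    """Expected pattern frequencies for the clock fork ((1,2),(3,4)).

    Assumed parametrisation: a and b are the heights of the cherries (1,2)
    and (3,4), h is the root height, all in expected changes per site.
    Patterns are indexed by subsets of taxa {1,2,3} as bitmasks (taxon 1 is
    bit 0, taxon 2 is bit 1, taxon 3 is bit 2); taxon 4 is the reference.
    """
    if not (0 < a <= h and 0 < b <= h):
        raise ValueError("need 0 < a <= h and 0 < b <= h")
    q = np.zeros(8)
    q[0b001] = a                    # pendant edge to taxon 1
    q[0b010] = a                    # pendant edge to taxon 2
    q[0b100] = b                    # pendant edge to taxon 3
    q[0b111] = b                    # pendant edge to taxon 4 (split 4 | 123)
    q[0b011] = (h - a) + (h - b)    # internal edge, split 12 | 34
    q[0] = -q[1:].sum()             # convention: entries of q sum to zero
    H = hadamard(3)
    return H @ np.exp(H @ q) / 8.0  # H^{-1} = H / 8 for the 8 x 8 matrix


def log_likelihood(counts, a, b, h):
    """Multinomial log-likelihood (up to a constant) of observed pattern counts."""
    return float(np.dot(counts, np.log(fork_pattern_probs(a, b, h))))
```

In this sketch the symmetry of the clock fork shows up as equalities among the expected frequencies, for example the patterns separating taxon 1 or taxon 2 from the rest are equally likely. Maximising `log_likelihood` over `a`, `b`, `h` numerically would give an ML estimate for this topology; the paper's contribution is solving the corresponding polynomial system in closed form instead.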
Bibliographical note
Funding Information:
Research supported by ISF grant 418/00. Parts of these results were presented at the RECOMB 2003 conference in Berlin.
Keywords
- Analytic solutions
- Hadamard conjugation
- Maximum likelihood
- Molecular clock
- Phylogenetic trees
- Symbolic manipulation
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Biochemistry, Genetics and Molecular Biology (all)
- Immunology and Microbiology (all)
- Agricultural and Biological Sciences (all)
- Applied Mathematics