Using homologous sequences from eight vertebrates, we present a concrete example of the estimation of mutation rates in the models of evolution introduced in Chapter 4. We detail the process of data selection from a multiple alignment of the ENCODE regions, and compare rate estimates for each of the models in the Felsenstein hierarchy of Figure 4.7.

We also address a standing problem in vertebrate evolution, namely the resolution of the phylogeny of the Eutherian orders, and discuss several challenges of molecular sequence analysis in inferring the phylogeny of this subclass. In particular, we consider the question of the position of the rodents relative to the primates, carnivores and artiodactyls; we affectionately dub this question the rodent problem.

Estimating mutation rates

Given an alignment of sequence homologs from various taxa, and an evolutionary model from Section 4.5, we are naturally led to ask the question, “what tree (with what branch lengths) and what values of the parameters in the rate matrix for that model are suggested by the alignment?” One answer to this question, the so-called maximum-likelihood solution, is, “the tree and rate parameters which maximize the probability that the given alignment would be generated by the given model.” (See also Sections 1.3 and 3.3.)

There are a number of available software packages which attempt to find, to varying degrees, this maximum-likelihood solution. For example, for a few of the most restrictive models in the Felsenstein hierarchy, the package PHYLIP [Felsenstein, 2004] will very efficiently search the tree space for the maximum-likelihood tree and rate parameters.
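To make the maximum-likelihood idea concrete, the following is a minimal sketch for the smallest possible case: two aligned sequences and the Jukes-Cantor model, the most restrictive model in the Felsenstein hierarchy. Here the "tree" is a single branch, so the only parameter to estimate is its length t (expected substitutions per site). The function names, the golden-section search, and the two toy sequences are assumptions made for illustration; this is not the chapter's procedure, nor the algorithm PHYLIP uses.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, t):
    """Log-likelihood of a gap-free pairwise alignment under Jukes-Cantor.

    t is the branch length in expected substitutions per site (t > 0).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay   # P(base unchanged after time t)
    p_diff = 0.25 - 0.25 * decay   # P(change to one specific other base)
    log_lik = 0.0
    for a, b in zip(seq_a, seq_b):
        # Uniform stationary distribution (1/4) times transition probability.
        log_lik += math.log(0.25 * (p_same if a == b else p_diff))
    return log_lik


def ml_branch_length(seq_a, seq_b, lo=1e-6, hi=5.0, tol=1e-8):
    """Golden-section search for the t that maximizes the log-likelihood.

    Assumes the sequences differ at some, but fewer than 3/4, of the sites,
    so that the maximum lies strictly inside (lo, hi).
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0

    def f(t):
        return jc69_log_likelihood(seq_a, seq_b, t)

    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    while hi - lo > tol:
        if f(c) > f(d):
            hi = d   # maximum lies in [lo, d]
        else:
            lo = c   # maximum lies in [c, hi]
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
    return 0.5 * (lo + hi)


# Toy data (made up for illustration): 20 sites, 3 of which differ.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACCTACTT"
t_hat = ml_branch_length(seq_a, seq_b)
print(f"estimated branch length: {t_hat:.4f}")
```

For this two-sequence case the numerical optimum can be checked against the known closed form t = -(3/4) ln(1 - 4p/3), where p is the fraction of differing sites. With more taxa or richer rate matrices no such closed form is generally available, which is why dedicated software searches over trees and parameters.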
Title of host publication: Algebraic Statistics for Computational Biology
Publisher: Cambridge University Press
Published: 1 Jan 2005
Bibliographical note: Publisher Copyright: © Cambridge University Press 2005.
ASJC Scopus subject areas
- General Mathematics