A statistical model is presented for analyzing genotypic frequency data obtained from a single population observed over a run of consecutive generations. The model accounts for possible correlations between generations by conditioning the marginal probability distribution of each generation on the previously observed generation. Maximum likelihood estimates of the fitness parameters are derived and a hypothesis-testing framework is developed. The model is very general; in this paper it is applied to random-mating, selfing, parthenogenetic, and mixed random-mating and selfing populations under a single-locus, g-allele model with constant genotypic fitness differences and with all selection occurring either before or after sampling. The assumptions behind this model are contrasted with those of alternative techniques such as minimum chi-square or "unconditional" maximum likelihood estimation, in which the marginal likelihood for any one generation is conditioned only on the initial conditions and not on the previous generation. The conditional model is most appropriate when the sample size per generation is large, either in an absolute sense or in relation to the total population size. Minimum chi-square and the unconditional likelihood are most appropriate when the population size is effectively infinite and the samples are small. Both approaches are appropriate when the samples are large and the population size is effectively infinite. Under these last conditions, the conditional model may be preferred because it is more robust to small deviations from the underlying assumptions and has a simpler form. Furthermore, if any genetic drift occurs in the experiment, the minimum chi-square and unconditional likelihood approaches can create spurious evidence for selection, whereas the conditional approach will not. Worked examples are presented.
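As an illustration of the conditional idea, the following is a minimal sketch and not the paper's own derivation. It assumes the simplest case: a parthenogenetic population with two genotypes, multinomial sampling in each generation, and selection acting between consecutive samples. The expected frequencies in each generation are computed from the frequencies observed in the previous sample, and the relative fitness `w` is estimated by maximizing the resulting log-likelihood. The function names, the golden-section optimizer, and the likelihood-ratio test against `w = 1` are all assumptions made for this example.

```python
import math


def conditional_loglik(w, counts):
    """Conditional log-likelihood for two clonal genotypes.

    counts: list of (n1, n2) sample counts, one pair per generation.
    w: fitness of genotype 2 relative to genotype 1 (assumed constant).
    Each generation is conditioned on the frequencies observed in the
    previous sample. Multinomial constants are dropped, and both counts
    in a conditioning generation are assumed to be non-zero.
    """
    total = 0.0
    for (a1, a2), (b1, b2) in zip(counts, counts[1:]):
        f1 = a1 / (a1 + a2)          # observed frequencies, generation t
        f2 = a2 / (a1 + a2)
        mean_w = f1 + w * f2         # mean fitness given those frequencies
        q1 = f1 / mean_w             # expected frequencies, generation t+1
        q2 = w * f2 / mean_w
        total += b1 * math.log(q1) + b2 * math.log(q2)
    return total


def fit_fitness(counts, lo=-5.0, hi=5.0, tol=1e-8):
    """Maximize the log-likelihood over log(w) by golden-section search."""
    phi = (math.sqrt(5) - 1) / 2

    def f(x):
        return conditional_loglik(math.exp(x), counts)

    c = hi - phi * (hi - lo)
    d = lo + phi * (hi - lo)
    while hi - lo > tol:
        if f(c) > f(d):
            hi = d                   # maximum lies in [lo, d]
        else:
            lo = c                   # maximum lies in [c, hi]
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
    return math.exp((lo + hi) / 2)


def likelihood_ratio_test(counts):
    """Test w = 1 (no selection) with a 1-df likelihood-ratio statistic."""
    w_hat = fit_fitness(counts)
    g = 2 * (conditional_loglik(w_hat, counts)
             - conditional_loglik(1.0, counts))
    g = max(g, 0.0)
    # Upper tail of a chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2))
    return w_hat, g, p_value
```

Calling `likelihood_ratio_test([(50, 50), (40, 60), (35, 65)])` on hypothetical counts returns the fitness estimate, the test statistic, and an approximate p-value. Because each term depends only on the immediately preceding sample, sampling error or drift in one generation does not carry forward into the expectations for later generations.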
ASJC Scopus subject areas
- Agronomy and Crop Science