Predictive validity of discriminant analysis for genetic data

Research output: Contribution to journal › Article › peer-review


We examined the predictive validity of discriminant analysis used to distinguish statistically among two or more populations when the sample of random amplified polymorphic DNA (RAPD) loci is large but the sample of genotypes from each population is small. Using the data of three studies, we compared results from the real data with results from randomized data generated by 100 random shufflings of genotypes among the populations. Results from the randomized data generally differed substantially from those from the real data in several characteristics of discriminant analysis. We showed that a high percentage of correctly classified genotypes is also obtainable from randomized data, mainly when the number of populations is low. However, the percentage obtained from the real data was generally significantly higher than that obtained from the randomized data. We suggested that the high level of real differences in allele frequencies at the polymorphic RAPD loci clearly distinguished the various populations and that the populations differ significantly in their RAPD contents in accordance with ecological heterogeneity. The correct classification rate obtained by the leaving-one-out procedure differed little or not at all from that obtained from the original data, which we attribute to the low number of loci selected by the stepwise method. These results support our conclusion and lead us to focus discriminant analysis on only a small number of discriminating variables.
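The comparison of real and randomized data described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the study: the data are synthetic band presence/absence scores, a simple nearest-centroid rule stands in for stepwise discriminant analysis, and all function names (`simulate`, `loo_correct_rate`, `permutation_test`) are hypothetical. It shows only the general shape of the procedure: compute a leaving-one-out correct classification rate for the real population labels, then repeat after shuffling genotypes among populations 100 times.

```python
import random


def simulate(n_per_pop=10, n_loci=30, seed=1):
    """Synthetic RAPD-like data: 1 = band present, 0 = band absent (assumed)."""
    rng = random.Random(seed)
    genotypes, labels = [], []
    # Two hypothetical populations with different band frequencies.
    for pop, freq in (("A", 0.8), ("B", 0.2)):
        for _ in range(n_per_pop):
            genotypes.append([1 if rng.random() < freq else 0
                              for _ in range(n_loci)])
            labels.append(pop)
    return genotypes, labels


def loo_correct_rate(genotypes, labels):
    """Leaving-one-out correct classification rate.

    A nearest-centroid rule is used here as a simple stand-in for
    discriminant analysis; it is not the method used in the study.
    """
    n = len(genotypes)
    correct = 0
    for i in range(n):
        # Per-population band sums and counts, excluding genotype i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            pop = labels[j]
            counts[pop] = counts.get(pop, 0) + 1
            acc = sums.setdefault(pop, [0.0] * len(genotypes[j]))
            for k, band in enumerate(genotypes[j]):
                acc[k] += band
        # Assign genotype i to the population with the closest centroid.
        best_pop, best_dist = None, float("inf")
        for pop in sorted(sums):
            dist = sum((genotypes[i][k] - s / counts[pop]) ** 2
                       for k, s in enumerate(sums[pop]))
            if dist < best_dist:
                best_pop, best_dist = pop, dist
        if best_pop == labels[i]:
            correct += 1
    return correct / n


def permutation_test(genotypes, labels, n_shuffles=100, seed=0):
    """Compare the real-label rate with rates from shuffled labels."""
    rng = random.Random(seed)
    observed = loo_correct_rate(genotypes, labels)
    shuffled = list(labels)
    null_rates = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # reassign genotypes to populations at random
        null_rates.append(loo_correct_rate(genotypes, shuffled))
    # Share of shuffles classifying at least as well as the real labels,
    # with the usual +1 correction so the estimate is never exactly zero.
    exceed = sum(1 for r in null_rates if r >= observed)
    p_value = (1 + exceed) / (n_shuffles + 1)
    return observed, null_rates, p_value


if __name__ == "__main__":
    data, pops = simulate()
    obs, null, p = permutation_test(data, pops)
    print("real data rate:", obs)
    print("mean randomized rate:", sum(null) / len(null))
    print("permutation p-value:", p)
```

If the rate from the real labels sits well above the distribution of rates from shuffled labels, the classification is unlikely to be an artifact of having many loci and few genotypes, which is the kind of check the abstract describes.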

Original language: English
Pages (from-to): 259-267
Number of pages: 9
Issue number: 3
State: Published - Nov 2003

Bibliographical note

Funding Information:
The authors thank Dr Avigdor Beiles for valuable comments on an earlier draft of the manuscript. We also thank Dr Tzion Fahima, Dr Edward D Owuor, and Dr Youchun Li for kindly permitting us to use their data sets in this study. This study was supported by the Israel Discount Bank Chair of Evolutionary Biology and the Ancell-Teicher Research Foundation for Genetics and Molecular Evolution.


Keywords

  • Best differentiating loci
  • Canonical discriminant functions
  • Correct genotype classification
  • Leaving-one-out
  • RAPD
  • Randomized data
  • Wilks' lambda

ASJC Scopus subject areas

  • Animal Science and Zoology
  • Genetics
  • Plant Science
  • Insect Science

