A new quartet-based statistical method for comparing sets of gene trees is developed using a generalized Hoeffding inequality

Research output: Contribution to journal › Article › peer-review

Abstract

Extracting the strength of the tree signal encompassed by a collection of gene trees is an exceptionally challenging problem in phylogenomics. The problem involves constructing individual phylogenies from different genes, which can be difficult in its own right, and it is further exacerbated by the many factors that create conflicts between the evolutionary histories of different gene families: duplications or losses of genes, hybridization events, incomplete lineage sorting, and horizontal gene transfer, the latter two of which play central roles in the evolution of eukaryotes and prokaryotes, respectively. In this work, we tackle this problem by focusing on quartet trees, which are the most basic unit of information in the context of unrooted phylogenies. In the first part, we show how a theorem of Janson that generalizes the classical Hoeffding inequality can be used to develop a statistical test involving quartets. In the second part, we apply this theoretical advancement to real and simulated data, demonstrating how the significance of the differences between sets of quartets can be assessed. Our results are of particular interest because they require, unusually, the analysis of dependent random variables.
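
To make the role of dependence concrete, the following minimal sketch compares the classical Hoeffding tail bound with a Janson-type bound for sums of partly dependent bounded variables. It assumes the theorem in question has the form P(X ≥ E[X] + t) ≤ exp(−2t² / (χ · Σ(bᵢ − aᵢ)²)), where χ bounds the fractional chromatic number of a dependency graph; the abstract does not state the exact form used in the article. The function names, the toy dependency graph, and the use of maximum degree plus one as an upper bound on χ are illustrative assumptions, not the authors' procedure.

```python
import math


def hoeffding_tail(t, ranges):
    """Classical Hoeffding bound on P(X >= E[X] + t) for a sum of
    independent variables, each confined to an interval (a_i, b_i)."""
    denom = sum((b - a) ** 2 for a, b in ranges)
    return math.exp(-2.0 * t * t / denom)


def janson_type_tail(t, ranges, chi):
    """Assumed Janson-type bound for partly dependent variables.
    chi is an upper bound on the fractional chromatic number of the
    dependency graph; chi = 1 recovers the classical Hoeffding bound."""
    denom = chi * sum((b - a) ** 2 for a, b in ranges)
    return math.exp(-2.0 * t * t / denom)


def degree_plus_one(adjacency):
    """Crude upper bound on the chromatic number (and hence on the
    fractional chromatic number): maximum degree plus one."""
    return 1 + max((len(nbrs) for nbrs in adjacency.values()), default=0)


# Hypothetical example: four indicator variables (e.g. one per quartet),
# where neighbouring variables in a path 0-1-2-3 are treated as dependent.
ranges = [(0.0, 1.0)] * 4
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

chi = degree_plus_one(adjacency)
independent_bound = hoeffding_tail(2.0, ranges)
dependent_bound = janson_type_tail(2.0, ranges, chi)
print(chi, independent_bound, dependent_bound)
```

In this sketch the dependent bound is never tighter than the independent one, which reflects the price paid for dropping the independence assumption: the more variables each one can depend on, the larger χ and the weaker the guarantee.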

Original language: English
Pages (from-to): 27-37
Number of pages: 11
Journal: Journal of Computational Biology
Volume: 26
Issue number: 1
State: Published - Jan 2019

Bibliographical note

Publisher Copyright:
© 2019 Mary Ann Liebert, Inc., publishers.

Keywords

  • Hoeffding inequality
  • horizontal gene transfer
  • prokaryotic evolution
  • quartet plurality

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics
