Bounds on identification of genome evolution pacemakers

Research output: Contribution to journal › Article › peer-review

Abstract

Several studies have pointed out that the tight correlation between genes' evolutionary rates is better explained by a model denoted the Universal PaceMaker (UPM) than by the simple rate constancy posited by the classical molecular clock (MC) hypothesis. Under UPM, each gene is associated with a single pacemaker (PM) and varies its evolutionary rate according to this PM's ticks. Hence, the relative rates of all genes associated with the same PM remain nearly constant, whereas the absolute rates can change arbitrarily according to the PM's ticks. A natural follow-up question is how to find the gene-PM association from the gene sequence data alone. This, however, turns out to be a nontrivial task, affected by the number of variables, their random noise, and the amount of available information. To this end, a clustering heuristic was devised that exploits the correlation between corresponding edge lengths across thousands of gene trees. Nevertheless, no theoretical study of the relationship between these parameters has been done. We here study this question by providing theoretical bounds, expressed in terms of the system parameters, on the probabilities of positive and negative results. We corroborate these results with a simulation study that reveals the critical role of the variances.
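
The UPM idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's method: it assumes PM ticks and gene rates are log-normal, that noise is Gaussian on the log scale, and that edge lengths are compared by Pearson correlation. The names `simulate_log_edge_lengths` and `pearson` and all parameter values are hypothetical.

```python
import math
import random


def simulate_log_edge_lengths(n_edges, gene_to_pm, n_pms, noise_sd, seed=0):
    """Return one list of log edge lengths per gene (illustrative model only).

    Assumption: log(edge length) = log(gene rate) + log(PM tick on that edge)
    + Gaussian noise. Genes sharing a PM share the same tick sequence.
    """
    rng = random.Random(seed)
    # One tick value per (pacemaker, edge), drawn on the log scale.
    log_ticks = [[rng.gauss(0.0, 1.0) for _ in range(n_edges)]
                 for _ in range(n_pms)]
    log_lengths = []
    for pm in gene_to_pm:
        log_rate = rng.gauss(0.0, 1.0)  # gene-specific intrinsic rate
        log_lengths.append([
            log_rate + log_ticks[pm][e] + rng.gauss(0.0, noise_sd)
            for e in range(n_edges)
        ])
    return log_lengths


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Genes 0 and 1 follow pacemaker 0; gene 2 follows pacemaker 1.
lengths = simulate_log_edge_lengths(
    n_edges=200, gene_to_pm=[0, 0, 1], n_pms=2, noise_sd=0.1)
same_pm = pearson(lengths[0], lengths[1])
diff_pm = pearson(lengths[0], lengths[2])
print(f"same PM: {same_pm:.3f}, different PMs: {diff_pm:.3f}")
```

In this toy setting, genes that share a PM show strongly correlated edge lengths, while genes on different PMs do not. Raising `noise_sd` weakens the within-PM correlation, which is one informal way to see why the variances matter for recovering the gene-PM association.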

Original language: English
Pages (from-to): 806-821
Number of pages: 16
Journal: Journal of Computational Biology
Volume: 26
Issue number: 8
State: Published - Aug 2019

Bibliographical note

Funding Information:
We thank Eugene Koonin and Yuri Wolf for the inspiring question, and Ilan Newman and Nick Harvey for helpful discussions. Part of this study was done while the author was visiting the NIH, United States. The authors wish to acknowledge the Israel Science Foundation (ISF) for its kind support in doing this research.

Publisher Copyright:
© Copyright 2019, Mary Ann Liebert, Inc., publishers.

Keywords

  • Chernoff bounds
  • DNA sequence evolution
  • chi square distribution
  • probabilistic geometrical clustering

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics

Fingerprint

  • Bounds on identification of genome evolution pacemakers

    Snir, S., 2018, Bioinformatics Research and Applications - 14th International Symposium, ISBRA 2018, Proceedings. Zhang, F., Zhang, S., Cai, Z. & Skums, P. (eds.). Springer Verlag, p. 51-62 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 10847 LNBI).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
