Gene-Adjacency-Based Phylogenetics Under a Stochastic Gain-Loss Model

Yoav Dvir, Shelly Brezner, Sagi Snir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A key task in molecular systematics is to decipher the evolutionary history of strains of a species. Standard markers are often too crude at this fine systematic resolution to provide a phylogenetic signal. However, among prokaryotes, genome dynamics (GD) events, such as gene gain through horizontal gene transfer (HGT) between organisms and gene loss, seem to provide a quite sensitive signal. The synteny index (SI) marker captures differences between a pair of genomes in terms of both gene order and gene content. Recently, it was shown to be consistent under the Jump model, a simple model of GD where the only operation is a gene jump. In this work, we extend the Jump model to a richer model that allows for gene gain/loss events, the most prevalent GD events in prokaryotic evolution. Despite the increased model complexity, our new representation yields a significant reduction in the number of variables, leading to a simple equation for estimating the model parameter and, consequently, to the consistency of the phylogenetic reconstruction. Additionally, with a more straightforward representation, we can easily calculate the asymptotic variance of the parameter estimation, allowing us to obtain a bound for the expected error. We tested the new model and its associated reconstruction approach on real and simulated data, where the theoretical asymptotic assumptions do not hold. Our simulation results show very high accuracy under short evolutionary distances. Applying the method to several families in the ATGC database resulted in relative agreement with reconstruction approaches based on other signals. The code is available on GitHub at https://github.com/shellybre/indels_project.
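
To make the SI marker concrete, here is a minimal sketch of one common formulation from the earlier SI literature: for each gene shared by two genomes, take the set of genes within k positions on either side in each genome, and average the fraction of neighbors the two sets have in common. The abstract does not define SI, so everything below is an assumption for illustration, including the circular genome layout, the unique gene identifiers, the normalization by 2k, averaging over shared genes only, and the function names `k_neighborhood` and `synteny_index`. It is not taken from the authors' repository and may differ from the definition used in the paper.

```python
def k_neighborhood(genome, idx, k):
    """Set of genes within k positions of genome[idx] on a circular genome.

    The gene at idx itself is excluded. Assumes len(genome) > 2 * k so the
    window does not wrap around onto itself.
    """
    n = len(genome)
    return {genome[(idx + d) % n] for d in range(-k, k + 1) if d != 0}


def synteny_index(genome_a, genome_b, k):
    """Average neighborhood overlap over the genes shared by both genomes.

    Each genome is a list of unique gene identifiers in chromosomal order.
    Returns a value in [0, 1]; 1 means every shared gene has the same
    k-neighborhood in both genomes.
    """
    if min(len(genome_a), len(genome_b)) <= 2 * k:
        raise ValueError("each genome must contain more than 2*k genes")

    pos_a = {gene: i for i, gene in enumerate(genome_a)}
    pos_b = {gene: i for i, gene in enumerate(genome_b)}
    shared = pos_a.keys() & pos_b.keys()
    if not shared:
        return 0.0

    total = 0.0
    for gene in shared:
        hood_a = k_neighborhood(genome_a, pos_a[gene], k)
        hood_b = k_neighborhood(genome_b, pos_b[gene], k)
        # Fraction of the 2k neighbor slots occupied by the same genes.
        total += len(hood_a & hood_b) / (2 * k)
    return total / len(shared)


if __name__ == "__main__":
    ancestor = list(range(10))
    after_loss = [g for g in ancestor if g != 9]  # gene 9 lost
    print(synteny_index(ancestor, ancestor, k=2))
    print(synteny_index(ancestor, after_loss, k=2))
```

Under this formulation, both a change in gene order and a change in gene content disturb the neighborhoods of nearby genes, which is how a single pairwise score can reflect both kinds of event.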

Original language: English
Title of host publication: Comparative Genomics - 21st International Conference, RECOMB-CG 2024, Proceedings
Editors: Celine Scornavacca, Maribel Hernández-Rosales
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 69-85
Number of pages: 17
ISBN (Print): 9783031580710
State: Published - 2024
Event: 21st RECOMB International Workshop on Comparative Genomics, RECOMB-CG 2024 - Boston, United States
Duration: 27 Apr 2024 to 28 Apr 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14616 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st RECOMB International Workshop on Comparative Genomics, RECOMB-CG 2024
Country/Territory: United States
City: Boston
Period: 27/04/24 to 28/04/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • Birth-Death Theory
  • Markovian Processes
  • Phylogenetics
  • Prokaryotic Genome Dynamics
  • Statistical Consistency

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
