The NET-HMM approach: Phylogenetic network inference by combining maximum likelihood and hidden Markov models

Sagi Snir, Tamir Tuller

Research output: Contribution to journal › Article › peer-review

Abstract

Horizontal gene transfer (HGT) is the transfer of genetic material from one lineage in the evolutionary tree to a different lineage. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Although the prevailing assumption is of complete HGT, cases of partial HGT (also called chimeric HGT), in which only part of a gene is horizontally transferred, have also been reported, albeit less frequently. In this work we propose a new probabilistic model, the NET-HMM, for analyzing and modeling phylogenetic networks. The model captures the biologically realistic assumption that neighboring sites of DNA or amino acid sequences are not independent, which increases the accuracy of the inference. It describes the phylogenetic network as a Hidden Markov Model (HMM) in which each hidden state corresponds to one of the network's trees. One advantage of the NET-HMM is its ability to infer partial HGT as well as complete HGT. We describe the properties of the NET-HMM, devise efficient algorithms for solving a set of problems related to it, and implement them in software. We also provide a novel complementary significance test for evaluating how well a model (NET-HMM) fits a given dataset. Using NET-HMM, we are able to answer interesting biological questions, such as inferring the length of partial HGTs and the affected nucleotides in the genomic sequences, as well as inferring the exact location of HGT events along the tree branches. These advantages are demonstrated through the analysis of synthetic inputs and three different biological inputs.
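
The idea of treating each tree of the network as a hidden state can be illustrated with a small decoding sketch. This is not the authors' algorithm or software. It is a minimal Viterbi decoder written under stated assumptions: the per-site log-likelihood of the alignment column under each tree is taken as already computed (for example by a standard tree-likelihood routine), the initial distribution over trees is uniform, and transitions between trees are governed by a single hypothetical parameter `stay_prob`. The function name `viterbi_tree_path` and all parameter names are illustrative only.

```python
import math


def viterbi_tree_path(site_loglik, stay_prob=0.9):
    """Return the most probable tree index for every site.

    site_loglik[i][k] is the log-likelihood of site i under tree k.
    These values are assumed to be precomputed elsewhere.
    stay_prob is a hypothetical probability that adjacent sites
    share the same tree; the remaining mass is split evenly among
    the other trees.
    """
    n_sites = len(site_loglik)
    n_trees = len(site_loglik[0])
    if n_trees == 1:
        return [0] * n_sites

    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_trees - 1))
    log_init = -math.log(n_trees)  # uniform start (assumption)

    def trans(j, k):
        return log_stay if j == k else log_switch

    # score[k]: best log-probability of any path ending in tree k
    score = [log_init + site_loglik[0][k] for k in range(n_trees)]
    back = []  # back[i-1][k]: best predecessor of tree k at site i

    for i in range(1, n_sites):
        new_score, pointers = [], []
        for k in range(n_trees):
            best_j = max(range(n_trees), key=lambda j: score[j] + trans(j, k))
            new_score.append(score[best_j] + trans(best_j, k) + site_loglik[i][k])
            pointers.append(best_j)
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(range(n_trees), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy input: two trees, six sites; sites 2-4 fit tree 1 much better.
toy = [[-1, -5], [-1, -5], [-5, -1], [-5, -1], [-5, -1], [-1, -5]]
print(viterbi_tree_path(toy))
```

In this sketch, a contiguous run of sites assigned to a tree other than the dominant one plays the role of a candidate partial-HGT segment, and a path that stays in one tree throughout corresponds to no within-sequence switching. Because switching carries a cost, an isolated site that only weakly prefers another tree does not flip the decoded path, which is the practical effect of modeling dependence between neighboring sites.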

Original language: English
Pages (from-to): 625-644
Number of pages: 20
Journal: Journal of Bioinformatics and Computational Biology
Volume: 7
Issue number: 4
State: Published - 2009

Bibliographical note

Funding Information:
T.T. was supported by the Edmond J. Safra Bioinformatics Program at Tel Aviv University and the Yeshaya Horowitz Association through the Center for Complexity Science.

Keywords

  • Hidden Markov Models
  • Horizontal gene transfer
  • Maximum Likelihood
  • Phylogenetic network

ASJC Scopus subject areas

  • Molecular Biology
  • Biochemistry
  • Computer Science Applications
