Abstract
The haplotype inference problem (HIP) asks for a set of haplotypes that resolves a given set of genotypes. This problem is important in practical fields such as the investigation of diseases or other types of genetic mutations. To find haplotypes that are as close as possible to the real set of haplotypes that comprise the genotypes, two models have been suggested, both of which are by now well studied: the perfect phylogeny model and the pure parsimony model. All algorithms known to date for haplotype inference may find haplotypes that are not necessarily plausible, i.e., very rare haplotypes or haplotypes that were never observed in the population. To overcome this disadvantage, we study in this paper a new constrained version of HIP under the above-mentioned models. In this new version, a pool of plausible haplotypes H̃ is given together with the set of genotypes G, and the goal is to find a subset H ⊆ H̃ that resolves G. For constrained perfect phylogeny haplotyping (CPPH), we provide initial insights and polynomial-time algorithms for some restricted cases of the problem. For constrained parsimony haplotyping (CPH), we show that the problem is fixed-parameter tractable when parameterized by the size of the solution set of haplotypes.
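To make the constrained setting concrete, the following is a minimal sketch of what it means for a subset of a pool to resolve a set of genotypes. It assumes the common encoding, which the abstract does not spell out: haplotypes are strings over 0/1, genotypes are strings over 0/1/2, and 2 marks a heterozygous site. The function names are hypothetical, and the exhaustive search is only an illustration of the problem statement, not the fixed-parameter algorithm of the paper.

```python
from itertools import combinations, combinations_with_replacement


def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) resolves genotype g.

    Assumed encoding: at a site where g is '0' or '1' both haplotypes
    must carry that value; at a site where g is '2' they must differ.
    """
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, c in zip(h1, h2, g):
        if c == "2":
            if a == b:
                return False
        elif not (a == b == c):
            return False
    return True


def subset_resolves(haplotypes, genotypes):
    """Return True if every genotype is resolved by some pair from haplotypes.

    A haplotype may be paired with itself (homozygous genotypes).
    """
    return all(
        any(
            resolves(h1, h2, g)
            for h1, h2 in combinations_with_replacement(haplotypes, 2)
        )
        for g in genotypes
    )


def smallest_resolving_subset(pool, genotypes):
    """Brute-force search for a smallest subset of pool that resolves genotypes.

    Tries subsets in order of increasing size, so the running time is
    exponential in the size of the pool. Returns None if no subset works.
    """
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            if subset_resolves(subset, genotypes):
                return list(subset)
    return None


if __name__ == "__main__":
    pool = ["00", "01", "10", "11"]      # plausible haplotypes
    genotypes = ["22", "02"]             # genotypes to resolve
    print(smallest_resolving_subset(pool, genotypes))
```

With the pool and genotypes in the example, no subset of size two resolves both genotypes, so the search returns a subset of size three. A pool that lacks a needed haplotype yields `None`, which is the situation the constrained version has to handle and the unconstrained version does not.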
Original language | English |
---|---|
Article number | 5557846 |
Pages (from-to) | 1692-1699 |
Number of pages | 8 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 8 |
Issue number | 6 |
State | Published - 2011 |
Bibliographical note
Funding Information: The authors would like to thank an anonymous referee for pointing out a bug in the proof of Lemma 2 which appeared in an early version of the paper. Work done while at CRI, Haifa University, and Department of Computer Science, Bar-Ilan University. G. Landau is partially supported by the US National Science Foundation Award 0904246, the Israel Science Foundation grant 347/09, Yahoo, Grant No. 2008217 from the United States-Israel Binational Science Foundation (BSF), and DFG.
Keywords
- Haplotyping
- Parameterized complexity
- perfect phylogeny
- polynomial-time algorithms
- pure parsimony
ASJC Scopus subject areas
- Biotechnology
- Genetics
- Applied Mathematics