Department of Computer Engineering and Information Technology, Isfahan University of Technology, Isfahan 84156-83111, Iran.
Gene. 2012 Oct 10;507(2):177-82. doi: 10.1016/j.gene.2012.06.032. Epub 2012 Jul 7.
Haplotypes include essential SNP information used for a variety of purposes such as investigating potential links between certain diseases and genetic variations. Given a set of genotypes, the haplotype inference problem based on pure parsimony is the problem of finding a minimum set of haplotypes that explains all the given genotypes. The problem is especially important because, while it is fairly inexpensive to obtain genotypes, other approaches to obtaining haplotypes are significantly expensive. There are two types of methods proposed for the problem, namely exact and inexact methods. Existing exact methods guarantee obtaining purely parsimonious solutions but have exponential time-complexities and are not practical for large number or length of genotypes. However, inexact methods are relatively fast but do not always obtain optimum solutions. In this paper, an improved heuristic is proposed, based on which new inexact and exact methods are provided. Experimental results indicate that the proposed methods replace the state-of-the-art inexact and exact methods for the problem.
单倍型包含用于各种目的的重要 SNP 信息,例如研究某些疾病与遗传变异之间的潜在联系。给定一组基因型,基于纯简约的单倍型推断问题是找到一个最小的单倍型集,该集合可以解释所有给定的基因型。该问题尤为重要,因为虽然获取基因型的成本相当低廉,但获取单倍型的其他方法的成本则要高得多。针对该问题,已经提出了两种类型的方法,即精确方法和不精确方法。现有的精确方法保证得到纯简约的解决方案,但时间复杂度为指数级,不适用于大量或长的基因型。然而,不精确方法相对较快,但并不总是获得最佳解决方案。在本文中,提出了一种改进的启发式方法,并在此基础上提供了新的不精确和精确方法。实验结果表明,所提出的方法可以替代该问题的最新不精确和精确方法。