Graça Ana, Lynce Inês, Marques-Silva João, Oliveira Arlindo L
Instituto Superior Técnico (IST), Technical University of Lisbon, INESC-ID Lisboa, Lisbon, Portugal.
J Comput Biol. 2010 Aug;17(8):969-92. doi: 10.1089/cmb.2009.0101.
Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference. The haplotype inference problem under the assumption of pure parsimony consists in finding the smallest number of haplotypes that explain a given set of genotypes. This problem is NP-hard. The original formulations for solving the Haplotype Inference by Pure Parsimony (HIPP) problem were based on integer linear programming and branch-and-bound techniques. More recently, solutions based on Boolean satisfiability, pseudo-Boolean optimization, and answer set programming have been shown to be remarkably more efficient. HIPP can now be regarded as a feasible approach for haplotype inference, which can be competitive with other different approaches. This article provides an overview of the methods for solving the HIPP problem, including preprocessing, bounding techniques, and heuristic approaches. The article also presents an empirical evaluation of exact HIPP solvers on a comprehensive set of synthetic and real problem instances. Moreover, the bounding techniques to the exact problem are evaluated. The final section compares and discusses the HIPP approach with a well-established statistical method that represents the reference algorithm for this problem.
给定一组来自某个群体的基因型,找出能够解释这些基因型的单倍型的过程称为单倍型推断。在纯简约假设下的单倍型推断问题在于找到能解释给定基因型集的最少单倍型数量。这个问题是NP难问题。最初用于通过纯简约法解决单倍型推断(HIPP)问题的公式基于整数线性规划和分支定界技术。最近,基于布尔可满足性、伪布尔优化和回答集编程的解决方案已被证明效率显著更高。现在,HIPP可被视为一种可行的单倍型推断方法,它能与其他不同方法相竞争。本文概述了解决HIPP问题的方法,包括预处理、定界技术和启发式方法。本文还对一系列综合的合成和实际问题实例上的精确HIPP求解器进行了实证评估。此外,还对精确问题的定界技术进行了评估。最后一部分将HIPP方法与一种成熟的统计方法进行了比较和讨论,该统计方法是此问题的参考算法。