Suppr超能文献

一种用于存在重组和突变的家系中单体型推断的高效算法。

An efficient algorithm for haplotype inference on pedigrees with recombinations and mutations.

机构信息

DISCo, University Milano-Bicocca, Milan, and Parco Tecnologico Padano, Lodi.

出版信息

IEEE/ACM Trans Comput Biol Bioinform. 2012 Jan-Feb;9(1):12-25. doi: 10.1109/TCBB.2011.51. Epub 2011 Mar 3.

Abstract

Haplotype Inference (HI) is a computational challenge of crucial importance in a range of genetic studies. Pedigrees allow to infer haplotypes from genotypes more accurately than population data, since Mendelian inheritance restricts the set of possible solutions. In this work, we define a new HI problem on pedigrees, called MINIMUM-CHANGE HAPLOTYPE CONFIGURATION (MCHC) problem, that allows two types of genetic variation events: recombinations and mutations. Our new formulation extends the MINIMUM-RECOMBINANT HAPLOTYPE CONFIGURATION (MRHC) problem, that has been proposed in the literature to overcome the limitations of classic statistical haplotyping methods. Our contribution is twofold. First, we prove that the MCHC problem is APX-hard under several restrictions. Second, we propose an efficient and accurate heuristic algorithm for MCHC based on an L-reduction to a well-known coding problem. Our heuristic can also be used to solve the original MRHC problem and can take advantage of additional knowledge about the input genotypes. Moreover, the L-reduction proves for the first time that MCHC and MRHC are O(nm/(log nm))-approximable on general pedigrees, where n is the pedigree size and m is the genotype length. Finally, we present an extensive experimental evaluation and comparison of our heuristic algorithm with several other state-of-the-art methods for HI on pedigrees.

摘要

单体型推断 (HI) 是一系列遗传研究中至关重要的计算挑战。家系可通过基因型更准确地推断单体型,因为孟德尔遗传限制了可能的解决方案集。在这项工作中,我们在系谱上定义了一个新的 HI 问题,称为最小变化单体型配置 (MCHC) 问题,它允许两种类型的遗传变异事件:重组和突变。我们的新公式扩展了最小重组单体型配置 (MRHC) 问题,该问题已在文献中提出,以克服经典统计单体型分析方法的局限性。我们的贡献有两点。首先,我们证明了在几种限制下,MCHC 问题是 APX 难的。其次,我们提出了一种基于 L 归约到一个著名编码问题的高效准确的 MCHC 启发式算法。我们的启发式算法也可以用于解决原始的 MRHC 问题,并可以利用关于输入基因型的额外知识。此外,L 归约首次证明了在一般系谱上,MCHC 和 MRHC 可以以 O(nm/(log nm)) 的近似值求解,其中 n 是系谱的大小,m 是基因型的长度。最后,我们对我们的启发式算法与其他几种用于系谱 HI 的最先进方法进行了广泛的实验评估和比较。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验