A linear-time algorithm for reconstructing zero-recombinant haplotype configuration on a pedigree.

Affiliation

Institute of Biomedical Informatics, National Yang Ming University, Taipei 112, Taiwan.

Publication information

BMC Bioinformatics. 2012;13 Suppl 17(Suppl 17):S19. doi: 10.1186/1471-2105-13-S17-S19. Epub 2012 Dec 13.

DOI:10.1186/1471-2105-13-S17-S19

PMID:23281626

Original link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3521470/

Abstract

BACKGROUND

When studying genetic diseases in which genetic variations are passed on to offspring, the ability to distinguish between paternal and maternal alleles is essential. Determining haplotypes from genotype data is called haplotype inference. Most existing computational algorithms for haplotype inference are designed to use genotype data collected from individuals in the form of a pedigree. A haplotype is regarded as a hereditary unit, so input pedigrees that are free of mutational events and have a minimum number of genetic recombination events are preferred. These ideas motivated the zero-recombinant haplotype configuration (ZRHC) problem, which strictly follows the Mendelian law of inheritance: one haplotype of each child is inherited from the father and the other from the mother, both without any mutation. So far, no linear-time algorithm for ZRHC has been proposed for general pedigrees, even though the number of mating loops in a human pedigree is usually very small and can be regarded as constant.
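The zero-recombinant constraint described above can be illustrated on a single father-mother-child trio. The following is a minimal sketch, not the algorithm of the paper: it only checks a candidate haplotype assignment, and the names `genotype_of` and `is_zero_recombinant`, as well as the encoding of alleles as 0/1 integers, are assumptions made for illustration.

```python
from typing import Tuple

# Assumed encoding: a haplotype is a tuple with one allele (0 or 1) per marker locus.
Haplotype = Tuple[int, ...]
HaplotypePair = Tuple[Haplotype, Haplotype]


def genotype_of(pair: HaplotypePair) -> Tuple[Tuple[int, int], ...]:
    """Return the unordered genotype implied by a haplotype pair.

    At each locus the two alleles are sorted, so the parental origin
    (the phase) is discarded, as it is in observed genotype data.
    """
    h1, h2 = pair
    return tuple(tuple(sorted((a, b))) for a, b in zip(h1, h2))


def is_zero_recombinant(father: HaplotypePair,
                        mother: HaplotypePair,
                        child: HaplotypePair) -> bool:
    """Check the zero-recombinant Mendelian constraint for one trio.

    One child haplotype must equal an entire paternal haplotype and the
    other must equal an entire maternal haplotype (no recombination,
    no mutation).
    """
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


father = ((0, 1, 0), (1, 1, 0))
mother = ((0, 0, 1), (1, 0, 1))
child_ok = ((1, 1, 0), (0, 0, 1))   # whole haplotypes copied from each parent
child_bad = ((0, 1, 0), (0, 0, 0))  # second haplotype matches neither maternal one
```

Here `is_zero_recombinant(father, mother, child_ok)` holds, while `child_bad` violates the constraint. ZRHC itself is the harder inverse problem: only the genotypes (as produced by `genotype_of`) are observed, and a consistent haplotype assignment must be found for every individual in the pedigree.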

RESULTS

Given a pedigree with n individuals, m marker loci, and k mating loops, we propose an algorithm that provides a general solution to the zero-recombinant haplotype configuration problem in O(kmn + k^2 m) time. In addition, this algorithm can be modified to detect inconsistencies within the genotype data without loss of efficiency. The proposed algorithm was subjected to 12,000 experiments to verify its performance using different (n, m) combinations. The value of k was uniformly distributed between zero and six throughout all experiments. The experimental results show that execution time grows linearly with input size when both n and m are larger than 100. For those experiments where n or m is less than 100, the proposed algorithm runs very fast, in thousandths to hundredths of a second, on a personal desktop computer.
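The "linear-time" claim follows from the stated bound once k is treated as a constant. A short worked derivation, assuming the input size is taken to be the mn genotype entries (n individuals at m loci) and that k is bounded by a constant c:

```latex
\begin{align*}
  kmn + k^{2}m &\le c\,mn + c^{2}m            && (k \le c) \\
               &\le (c + c^{2})\,mn           && (n \ge 1) \\
               &= O(mn).
\end{align*}
% Example with the largest k used in the experiments, c = 6:
%   6mn + 36m \le 42\,mn .
```

Under these assumptions the running time is linear in the size of the genotype data.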

CONCLUSIONS

We have developed the first deterministic linear-time algorithm for the zero-recombinant haplotype configuration problem. Our experimental results demonstrated the linearity of its execution time in relation to the input size. The proposed algorithm can be modified to detect inconsistency within the genotype data without loss of efficiency and is expected to be able to handle recombinant and missing data with further extension.
