A consensus tree approach for reconstructing human evolutionary history and detecting population substructure.

Affiliation

Joint Carnegie Mellon University/University of Pittsburgh PhD Program in Computational Biology and Lane Center for Computational Biology, 4400 Fifth Avenue, Pittsburgh, PA 15213, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jul-Aug;8(4):918-28. doi: 10.1109/TCBB.2011.23.

Abstract

The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem, but it has important applications both to basic research and to the discovery of genotype-phenotype correlations. We present a novel approach to inferring human evolutionary history from genetic variation data. We use the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. Validation on both simulated and real data shows the method to be effective in recapitulating the known true structure of the data, closely matching our best current understanding of human evolutionary history. Additional comparison with results of leading methods for the problem of population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups, in addition to inferring relationships among them. The consensus tree approach thus provides a promising new model for the robust inference of substructure and ancestry from large-scale genetic variation data.
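To make the consensus idea concrete, the sketch below shows a generic majority-rule consensus over a set of local phylogenies. It is an illustration only and is not the authors' algorithm, which the abstract does not specify. The nested-tuple tree encoding, the function names `clades` and `majority_clades`, the `threshold` parameter, and the toy sample labels are all assumptions made for this example.

```python
from collections import Counter


def clades(tree):
    """Collect the non-trivial clades of a rooted tree.

    A tree is a nested tuple whose leaves are sample labels (strings).
    This encoding is an assumption made for the sketch. Each clade is
    returned as a frozenset of the leaf labels below an internal node;
    the root clade (all leaves) is dropped because every tree shares it.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    root = walk(tree)
    found.discard(root)
    return found


def majority_clades(trees, threshold=0.5):
    """Return clades found in more than `threshold` of the trees, with their support."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    total = len(trees)
    return {c: k / total for c, k in counts.items() if k / total > threshold}


# Three hypothetical local trees over four samples, e.g. one per genomic region.
local_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
support = majority_clades(local_trees)
for clade, fraction in sorted(support.items(), key=lambda item: sorted(item[0])):
    print(sorted(clade), round(fraction, 2))
```

In this toy input, the groupings {A, B} and {C, D} each appear in two of the three local trees and so survive the majority threshold, while the groupings from the discordant third tree do not. Groupings that recur across many genomic regions are the kind of robust relationship a consensus approach is meant to surface.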
