Fast model-based estimation of ancestry in unrelated individuals.

Authors

Alexander David H, Novembre John, Lange Kenneth

Affiliation

Department of Biomathematics, University of California at Los Angeles, Los Angeles, California 90095, USA.

Publication details

Genome Res. 2009 Sep;19(9):1655-64. doi: 10.1101/gr.094052.109. Epub 2009 Jul 31.

Abstract

Population stratification has long been recognized as a confounding factor in genetic association studies. Estimated ancestries, derived from multi-locus genotype data, can be used to perform a statistical correction for population stratification. One popular technique for estimation of ancestry is the model-based approach embodied by the widely applied program structure. Another approach, implemented in the program EIGENSTRAT, relies on Principal Component Analysis rather than model-based estimation and does not directly deliver admixture fractions. EIGENSTRAT has gained in popularity in part owing to its remarkable speed in comparison to structure. We present a new algorithm and a program, ADMIXTURE, for model-based estimation of ancestry in unrelated individuals. ADMIXTURE adopts the likelihood model embedded in structure. However, ADMIXTURE runs considerably faster, solving problems in minutes that take structure hours. In many of our experiments, we have found that ADMIXTURE is almost as fast as EIGENSTRAT. The runtime improvements of ADMIXTURE rely on a fast block relaxation scheme using sequential quadratic programming for block updates, coupled with a novel quasi-Newton acceleration of convergence. Our algorithm also runs faster and with greater accuracy than the implementation of an Expectation-Maximization (EM) algorithm incorporated in the program FRAPPE. Our simulations show that ADMIXTURE's maximum likelihood estimates of the underlying admixture coefficients and ancestral allele frequencies are as accurate as structure's Bayesian estimates. On real-world data sets, ADMIXTURE's estimates are directly comparable to those from structure and EIGENSTRAT. Taken together, our results show that ADMIXTURE's computational speed opens up the possibility of using a much larger set of markers in model-based ancestry estimation and that its estimates are suitable for use in correcting for population stratification in association studies.
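The abstract names the likelihood model and an EM baseline but gives no formulas. As a rough illustration only, the sketch below assumes the usual binomial admixture model: individual i carries g_ij copies (0, 1, or 2) of a reference allele at marker j, with per-copy probability p_ij = sum over k of q_ik * f_kj, where q_ik are admixture fractions and f_kj are ancestral allele frequencies. It implements a plain EM update of the kind the abstract attributes to FRAPPE. It is not ADMIXTURE's block relaxation, sequential quadratic programming, or quasi-Newton acceleration, and the names `log_likelihood`, `em_step`, `G`, `Q`, `F`, and `eps` are hypothetical choices made for this example.

```python
import math

def allele_freq(q_i, F, j):
    """Mixture allele frequency p_ij = sum_k q_ik * f_kj (assumed model)."""
    return sum(q_i[k] * F[k][j] for k in range(len(q_i)))

def log_likelihood(G, Q, F):
    """Binomial admixture log-likelihood.

    G[i][j]: allele count in {0, 1, 2} for individual i at marker j
    Q[i][k]: admixture fraction of individual i from population k (rows sum to 1)
    F[k][j]: reference-allele frequency at marker j in population k
    """
    total = 0.0
    for i, row in enumerate(G):
        for j, g in enumerate(row):
            p = allele_freq(Q[i], F, j)
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total

def em_step(G, Q, F, eps=1e-6):
    """One EM update of (Q, F); F is clipped to [eps, 1 - eps] to keep logs finite."""
    I, J, K = len(G), len(G[0]), len(F)
    q_acc = [[0.0] * K for _ in range(I)]  # expected allele copies from population k
    f_num = [[0.0] * J for _ in range(K)]  # expected reference-allele copies from k
    f_den = [[0.0] * J for _ in range(K)]  # expected copies of either allele from k
    for i in range(I):
        for j in range(J):
            g = G[i][j]
            p = allele_freq(Q[i], F, j)
            for k in range(K):
                # Posterior probability that a reference (a) or alternate (b)
                # allele copy was drawn from population k.
                a = Q[i][k] * F[k][j] / p
                b = Q[i][k] * (1.0 - F[k][j]) / (1.0 - p)
                f_num[k][j] += g * a
                f_den[k][j] += g * a + (2 - g) * b
                q_acc[i][k] += g * a + (2 - g) * b
    Q_new = [[x / (2.0 * J) for x in row] for row in q_acc]
    F_new = [
        [
            # Keep the old value if population k received no weight at marker j.
            min(max(f_num[k][j] / f_den[k][j], eps), 1.0 - eps)
            if f_den[k][j] > 0.0 else F[k][j]
            for j in range(J)
        ]
        for k in range(K)
    ]
    return Q_new, F_new

# Toy data (made up for illustration): 3 individuals, 4 markers, K = 2.
G = [[0, 1, 2, 0], [2, 1, 0, 2], [1, 1, 1, 1]]
Q = [[0.6, 0.4], [0.3, 0.7], [0.5, 0.5]]
F = [[0.2, 0.4, 0.6, 0.3], [0.7, 0.5, 0.3, 0.8]]
for _ in range(10):
    Q, F = em_step(G, Q, F)
print(log_likelihood(G, Q, F))
```

Each EM step leaves the rows of `Q` summing to one and does not decrease the log-likelihood, but convergence of plain EM can be slow, which is the kind of cost the abstract says ADMIXTURE's accelerated scheme is designed to reduce.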

Similar articles

1
Fast model-based estimation of ancestry in unrelated individuals.
Genome Res. 2009 Sep;19(9):1655-64. doi: 10.1101/gr.094052.109. Epub 2009 Jul 31.
2
Fast and efficient estimation of individual ancestry coefficients.
Genetics. 2014 Apr;196(4):973-83. doi: 10.1534/genetics.113.160572. Epub 2014 Feb 4.
