Efficient Bayesian mixed-model analysis increases association power in large cohorts.

Author information

Loh Po-Ru, Tucker George, Bulik-Sullivan Brendan K, Vilhjálmsson Bjarni J, Finucane Hilary K, Salem Rany M, Chasman Daniel I, Ridker Paul M, Neale Benjamin M, Berger Bonnie, Patterson Nick, Price Alkes L

Affiliations

[1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA. [2] Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.

[1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA. [2] Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [3] Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts, USA.

Publication information

Nat Genet. 2015 Mar;47(3):284-90. doi: 10.1038/ng.3190. Epub 2015 Feb 2.

Abstract

Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN^2) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts.
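To illustrate how an iteration can cost O(MN) rather than O(MN^2), here is a minimal sketch. It is not the BOLT-LMM implementation. It assumes a mixed-model covariance of the form V = sigma_g2 * X X^T / M + sigma_e2 * I, where X is an N x M genotype matrix, and solves V x = y by conjugate gradient. Each iteration needs only two matrix-vector products with X, so the N x N matrix X X^T is never formed. The function names, variance values, and simulated data are illustrative assumptions, not taken from the abstract.

```python
import numpy as np

def covariance_matvec(X, v, sigma_g2, sigma_e2):
    """Return (sigma_g2 * X X^T / M + sigma_e2 * I) @ v in O(MN) time.

    X X^T is never built explicitly: X.T @ v costs O(MN) and so does X @ (...).
    """
    M = X.shape[1]
    return sigma_g2 * (X @ (X.T @ v)) / M + sigma_e2 * v

def conjugate_gradient(matvec, b, tol=1e-8, max_iter=200):
    """Solve A x = b for symmetric positive-definite A given only v -> A v."""
    x = np.zeros_like(b)
    r = b - matvec(x)          # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = matvec(p)         # the only O(MN) step per iteration
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * b_norm:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Hypothetical toy data: N samples, M markers (simulated, not real genotypes).
rng = np.random.default_rng(0)
N, M = 200, 500
X = rng.standard_normal((N, M))
y = rng.standard_normal(N)
sigma_g2, sigma_e2 = 0.5, 0.5   # assumed variance components

x_cg = conjugate_gradient(
    lambda v: covariance_matvec(X, v, sigma_g2, sigma_e2), y
)

# Reference solution that does form the N x N matrix (O(MN^2) to build).
V = sigma_g2 * (X @ X.T) / M + sigma_e2 * np.eye(N)
x_direct = np.linalg.solve(V, y)
```

The direct solve is included only as a correctness check on small data. At cohort scale, building V is the step that becomes impractical, whereas the iterative version touches X a fixed number of times per iteration.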

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f6b/4342297/6480a9111787/nihms650284f1.jpg
