A flexible and parallelizable approach to genome-wide polygenic risk scores.

Affiliations

MRC Biostatistics Unit, School of Clinical Medicine, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Cambridge, UK.

Department of Cardiovascular Sciences, Cardiovascular Research Centre, Glenfield Hospital, University of Leicester, Leicester, UK.

Publication information

Genet Epidemiol. 2019 Oct;43(7):730-741. doi: 10.1002/gepi.22245. Epub 2019 Jul 22.

Abstract

The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome-wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two-step approach to constructing genome-wide polygenic risk scores from meta-GWAS summary statistics. Local linkage disequilibrium (LD) is adjusted for in Step 1, followed by, uniquely, long-range LD in Step 2. Our algorithm is highly parallelizable, since block-wise analyses in Step 1 can be distributed across a high-performance computing cluster, and flexible, since sparsity and heritability are estimated within each block. Inference is obtained through a formal Bayesian variable selection framework, meaning final risk predictions are averaged over competing models. We compared our method to two alternative approaches, LDPred and lassosum, using all seven traits in the Wellcome Trust Case Control Consortium as well as meta-GWAS summaries for type 1 diabetes (T1D), coronary artery disease, and schizophrenia. Performance was generally similar across methods, although our framework provided more accurate predictions for T1D, for which there are multiple heterogeneous signals in regions of both short- and long-range LD. With sufficient compute resources, our method also allows the fastest runtimes.
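Two ideas in the abstract can be illustrated with a small Python sketch: a polygenic risk score is additive across variants, so per-block contributions can be computed independently and summed, and model-averaged predictions weight each candidate model's effects by its posterior probability. This is only an illustration under stated assumptions, not the authors' algorithm: it performs no LD adjustment or variable selection, and the function names (`block_score`, `polygenic_score`, `model_averaged_weights`), the input layout, and the use of a local thread pool in place of a computing cluster are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def block_score(dosages, weights):
    """Partial score for one block: sum of allele dosage times effect weight."""
    return sum(d * w for d, w in zip(dosages, weights))


def polygenic_score(blocks, max_workers=4):
    """Sum per-block partial scores for one individual.

    `blocks` is a list of (dosages, weights) pairs, one pair per block
    (hypothetical layout). Blocks are scored independently, so the work
    can be farmed out; a thread pool stands in for a cluster here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_scores = list(pool.map(lambda b: block_score(*b), blocks))
    return sum(partial_scores)


def model_averaged_weights(models):
    """Average effect weights over competing models.

    `models` is a list of (posterior_probability, weights) pairs whose
    probabilities are assumed to sum to 1 (hypothetical layout).
    """
    n_variants = len(models[0][1])
    return [sum(p * w[j] for p, w in models) for j in range(n_variants)]


if __name__ == "__main__":
    # Toy data: two blocks with made-up dosages (0, 1, or 2) and weights.
    toy_blocks = [([0, 1, 2], [0.1, -0.2, 0.05]), ([2, 0], [0.3, 0.4])]
    print(polygenic_score(toy_blocks))

    # Two competing models for a two-variant block, with made-up probabilities.
    print(model_averaged_weights([(0.75, [0.2, 0.0]), (0.25, [0.0, 0.4])]))
```

Because the score is a plain sum over blocks, the order in which blocks finish does not matter, which is what makes this kind of block-wise work easy to distribute.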

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5670/6790684/f1cd40f4dc38/GEPI-43-730-g001.jpg