Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction.

Affiliations

The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, 8210 Aarhus V, Denmark; National Centre for Register-Based Research, Aarhus University, 8210 Aarhus V, Denmark.

The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, 8210 Aarhus V, Denmark; Department of Biomedicine and Center for Integrative Sequencing, iSEQ, Aarhus University, 8000 Aarhus C, Denmark; Center for Genomics and Personalized Medicine, CGPM, Aarhus University, 8000 Aarhus C, Denmark; Bioinformatics Research Centre, Aarhus University, 8000 Aarhus C, Denmark.

Publication information

Am J Hum Genet. 2021 Jun 3;108(6):1001-1011. doi: 10.1016/j.ajhg.2021.04.014. Epub 2021 May 7.

Abstract

The accuracy of polygenic risk scores (PRSs) in predicting complex diseases increases with the training sample size. PRSs are generally derived from summary statistics of large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to also have access to large individual-level datasets, such as the UK Biobank. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and individual-level data) to optimize polygenic prediction. The most widely used approach to combining data is the meta-analysis of GWAS summary statistics (meta-GWAS), but we show that it does not always provide the most accurate PRS. Through simulations and using 12 real case-control and quantitative traits from both iPSYCH and UK Biobank along with external GWAS summary statistics, we compare meta-GWAS with two alternative data-combining approaches: stacked clumping and thresholding (SCT) and meta-PRS. We find that, when large individual-level data are available, the linear combination of PRSs (meta-PRS) is both a simple alternative to meta-GWAS and often more accurate.
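
The abstract describes meta-PRS only as a linear combination of PRSs, so the following is a minimal sketch of that idea rather than the authors' procedure. It assumes two precomputed scores per individual (one from external summary statistics, one from individual-level data) and assumes the combination weights are estimated by ordinary least squares in a training subset; for case-control traits a logistic model might be used instead. All names (`fit_meta_prs_weights`, `meta_prs`, `prs_external`, `prs_internal`) and the simulated data are hypothetical.

```python
import numpy as np


def fit_meta_prs_weights(prs_external, prs_internal, phenotype):
    """Estimate (intercept, w_external, w_internal) by ordinary least squares.

    Assumption: the weights of the linear combination are learned by
    regressing the phenotype on the two scores in a training subset.
    """
    design = np.column_stack(
        [np.ones_like(prs_external), prs_external, prs_internal]
    )
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return coef


def meta_prs(prs_external, prs_internal, coef):
    """Apply the fitted linear combination to two PRS vectors."""
    return coef[0] + coef[1] * prs_external + coef[2] * prs_internal


def squared_correlation(prediction, phenotype):
    """Squared Pearson correlation between a score and the phenotype."""
    return np.corrcoef(prediction, phenotype)[0, 1] ** 2


# Toy data (entirely simulated, not from the paper): both scores are noisy
# proxies of the same hypothetical genetic liability.
rng = np.random.default_rng(0)
n = 2000
liability = rng.normal(size=n)
prs_external = liability + rng.normal(scale=1.0, size=n)
prs_internal = liability + rng.normal(scale=1.5, size=n)
phenotype = liability + rng.normal(size=n)

# Fit the weights on the first half and apply them to the second half.
train, test = slice(0, n // 2), slice(n // 2, n)
coef = fit_meta_prs_weights(
    prs_external[train], prs_internal[train], phenotype[train]
)
combined_test = meta_prs(prs_external[test], prs_internal[test], coef)

print("weights (intercept, external, internal):", coef)
print("r2 external:", squared_correlation(prs_external[test], phenotype[test]))
print("r2 internal:", squared_correlation(prs_internal[test], phenotype[test]))
print("r2 meta-PRS:", squared_correlation(combined_test, phenotype[test]))
```

In this sketch the weights are fitted and evaluated on disjoint subsets, so the reported accuracy of the combined score is not inflated by the fitting step. By contrast, meta-GWAS combines the data one step earlier, at the level of per-variant effect estimates, before any score is built.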
