Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores.

Authors

Vilhjálmsson Bjarni J, Yang Jian, Finucane Hilary K, Gusev Alexander, Lindström Sara, Ripke Stephan, Genovese Giulio, Loh Po-Ru, Bhatia Gaurav, Do Ron, Hayeck Tristan, Won Hong-Hee, Kathiresan Sekar, Pato Michele, Pato Carlos, Tamimi Rulla, Stahl Eli, Zaitlen Noah, Pasaniuc Bogdan, Belbin Gillian, Kenny Eimear E, Schierup Mikkel H, De Jager Philip, Patsopoulos Nikolaos A, McCarroll Steve, Daly Mark, Purcell Shaun, Chasman Daniel, Neale Benjamin, Goddard Michael, Visscher Peter M, Kraft Peter, Patterson Nick, Price Alkes L

Affiliations

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark.

Queensland Brain Institute, University of Queensland, Brisbane, 4072 QLD, Australia; Diamantina Institute, Translational Research Institute, University of Queensland, Brisbane, 4101 QLD, Australia.

Publication information

Am J Hum Genet. 2015 Oct 1;97(4):576-92. doi: 10.1016/j.ajhg.2015.09.001.

Abstract

Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
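
The contrast the abstract draws between pruning followed by thresholding and posterior-mean shrinkage can be illustrated with a small sketch. This is not the LDpred software. The abstract does not specify the prior, so the code assumes a Gaussian (infinitesimal) prior on standardized effects, under which the posterior mean is approximately the solution of a ridge-like linear system involving the LD matrix. The function names, the pruning cutoff `r2_max`, and the threshold `z_min` are hypothetical choices made for illustration only.

```python
import numpy as np


def posterior_mean_gaussian_prior(beta_marginal, ld_matrix, n_samples, h2):
    """Shrink marginal effect estimates jointly, using LD.

    Assumption (not stated in the abstract): a Gaussian prior on
    standardized effects, so the posterior mean approximately solves
    (M / (N * h2) * I + D) beta = beta_marginal, where D is the LD matrix,
    M the number of markers, N the training sample size, and h2 the
    heritability explained by the markers.
    """
    m = len(beta_marginal)
    system = ld_matrix + (m / (n_samples * h2)) * np.eye(m)
    return np.linalg.solve(system, beta_marginal)


def prune_and_threshold(beta_marginal, z_scores, ld_matrix, r2_max=0.2, z_min=1.96):
    """Greedy LD pruning by |z|, then a significance threshold.

    Markers are visited from strongest to weakest association. A marker is
    kept only if it passes the threshold and its squared correlation with
    every marker already kept is at most r2_max. All other weights are zero.
    """
    order = np.argsort(-np.abs(z_scores))
    kept = []
    for j in order:
        if abs(z_scores[j]) < z_min:
            break  # remaining markers are weaker still
        if all(ld_matrix[j, k] ** 2 <= r2_max for k in kept):
            kept.append(j)
    kept = np.array(kept, dtype=int)
    weights = np.zeros(len(beta_marginal), dtype=float)
    weights[kept] = beta_marginal[kept]
    return weights


if __name__ == "__main__":
    # Toy example: two markers in strong LD (r = 0.9); only the first is causal,
    # so the second marker's marginal effect is entirely due to LD.
    ld = np.array([[1.0, 0.9], [0.9, 1.0]])
    beta_hat = np.array([0.10, 0.09])
    z = np.array([5.0, 4.5])
    print(posterior_mean_gaussian_prior(beta_hat, ld, n_samples=1_000_000, h2=0.5))
    print(prune_and_threshold(beta_hat, z, ld))
```

In this toy case both approaches assign essentially all weight to the first marker. They diverge when several correlated markers each carry some signal: pruning keeps one and discards the rest, whereas the joint solve redistributes the marginal estimates across all of them.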

