

An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction.

Author information

Zhang Jingning, Zhan Jianan, Jin Jin, Ma Cheng, Zhao Ruzhang, O'Connell Jared, Jiang Yunxuan, Koelsch Bertram L, Zhang Haoyu, Chatterjee Nilanjan

Affiliations

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

23andMe Inc., Sunnyvale, CA, USA.

Publication information

bioRxiv. 2024 Apr 10:2023.03.15.532652. doi: 10.1101/2023.03.15.532652.

Abstract

Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of L1 (lasso) and L2 (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
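The two ingredients named in the abstract, a combined lasso and ridge penalty and an ensemble over penalty settings, can be illustrated with a small toy sketch. This is not the PROSPER implementation: the abstract does not state the objective function, so the sketch assumes a plain single-population elastic net fitted on simulated individual-level data, whereas PROSPER works from GWAS summary statistics across several populations. All names (`elastic_net_cd`, `ensemble_weights`, the penalty grid, the simulated data) are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso-type updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net_cd(X, y, lam1, lam2, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||_2^2  (toy stand-in)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # running residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam1) / (col_sq[j] + lam2)
            r += X[:, j] * (b[j] - new)  # keep residual in sync
            b[j] = new
    return b

def ensemble_weights(prs_matrix, y_tune):
    """Least-squares weights (intercept first) combining candidate PRS."""
    A = np.column_stack([np.ones(len(y_tune)), prs_matrix])
    w, *_ = np.linalg.lstsq(A, y_tune, rcond=None)
    return w

def r2(y, score):
    """Squared correlation between outcome and a score."""
    return np.corrcoef(y, score)[0, 1] ** 2

# Simulated data (assumption: 50 SNP-like features, 5 causal).
rng = np.random.default_rng(0)
n, p = 600, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 0.5
y = X @ beta + rng.standard_normal(n)
X_train, y_train = X[:300], y[:300]
X_tune, y_tune = X[300:450], y[300:450]
X_test, y_test = X[450:], y[450:]

# One candidate PRS per (lasso, ridge) penalty pair.
grid = [(l1, l2) for l1 in (0.01, 0.05, 0.1) for l2 in (0.0, 0.1)]
betas = np.array([elastic_net_cd(X_train, y_train, l1, l2) for l1, l2 in grid])

# Ensemble step: learn combination weights on the tuning set.
prs_tune = X_tune @ betas.T
weights = ensemble_weights(prs_tune, y_tune)
ens_tune = weights[0] + prs_tune @ weights[1:]
ens_test = weights[0] + (X_test @ betas.T) @ weights[1:]

r2_tune_single = [r2(y_tune, prs_tune[:, k]) for k in range(len(grid))]
r2_tune_ens = r2(y_tune, ens_tune)
print("tuning R2, best single PRS:", max(r2_tune_single))
print("tuning R2, ensemble:", r2_tune_ens)
print("test R2, ensemble:", r2(y_test, ens_test))
```

On the tuning set the ensemble cannot fit worse than any single candidate, because each candidate is a special case of the linear combination. Whether it helps out of sample has to be checked on held-out data, which is why the sketch keeps a separate test split.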


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/516e/11005619/ada1714309f1/nihpp-2023.03.15.532652v3-f0001.jpg
