Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data.

Authors

Jiang Wei, Chen Ling, Girgenti Matthew J, Zhao Hongyu

Affiliations

Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

Department of Statistics, Columbia University, New York, NY, USA.

Publication information

Res Sq. 2023 May 31:rs.3.rs-2939390. doi: 10.21203/rs.3.rs-2939390/v1.

DOI: 10.21203/rs.3.rs-2939390/v1
PMID: 37398263
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10312948/
Abstract

Predicting genetic risks for common diseases may improve their prevention and early treatment. In recent years, various additive-model-based polygenic risk score (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) using data collected from genome-wide association studies (GWAS). Some of these methods require access to another external individual-level GWAS dataset to tune the hyperparameters, which can be difficult because of privacy and security-related concerns. Additionally, leaving out partial data for hyperparameter tuning can reduce the predictive accuracy of the constructed PRS model. In this article, we propose a novel method, called PRStuning, to automatically tune hyperparameters for different PRS methods using only GWAS summary statistics from the training data. The core idea is to first predict the performance of the PRS method with different parameter values, and then select the parameters with the best predicted performance. Because directly using the effects observed from the training data tends to overestimate the performance in the testing data (a phenomenon known as overfitting), we adopt an empirical Bayes approach to shrink the predicted performance in accordance with the estimated genetic architecture of the disease. Results from extensive simulations and real data applications demonstrate that PRStuning can accurately predict the PRS performance across PRS methods and parameters, and it can help select the best-performing parameters.
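The idea in the abstract (predict performance from training summary statistics, shrink that prediction, then pick the best parameter) can be illustrated with a toy simulation. The sketch below is not the PRStuning implementation. It assumes independent standardized SNPs (no linkage disequilibrium), a quantitative trait, a |z|-score cutoff as the only tuning parameter, and a two-component normal mixture fitted by EM as the empirical Bayes prior. All names, sample sizes, and effect sizes are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_snps, n_causal = 2000, 2000, 500, 25

# Toy genetic architecture (assumed): sparse additive effects.
beta = np.zeros(n_snps)
beta[:n_causal] = rng.normal(0.0, 0.15, n_causal)

def simulate(n):
    """Independent standardized genotypes and a quantitative trait."""
    X = rng.standard_normal((n, n_snps))
    y = X @ beta + rng.standard_normal(n)
    return X, y

X_tr, y_tr = simulate(n_train)
X_te, y_te = simulate(n_test)

# GWAS-style summary statistics computed from the training data only.
beta_hat = X_tr.T @ (y_tr - y_tr.mean()) / n_train
var_y = y_tr.var()
se = np.sqrt(var_y / n_train)
z = beta_hat / se

def normal_pdf(x, var):
    return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Empirical Bayes step: fit z ~ (1 - pi) N(0, 1) + pi N(0, v) by EM.
pi, v = 0.1, 4.0
for _ in range(200):
    slab = pi * normal_pdf(z, v)
    resp = slab / (slab + (1 - pi) * normal_pdf(z, 1.0))  # P(non-null | z)
    pi = resp.mean()
    v = max(np.sum(resp * z**2) / np.sum(resp), 1.0 + 1e-6)

# Posterior mean of each true effect: shrink the observed estimate.
beta_post = resp * (v - 1.0) / v * beta_hat

def predicted_corr(w, effects):
    """Predicted test correlation of the score X @ w with the trait,
    valid for independent standardized SNPs."""
    return float(w @ effects / np.sqrt((w @ w) * var_y))

def observed_corr(w):
    """Correlation actually achieved in the held-out test sample."""
    return float(np.corrcoef(X_te @ w, y_te)[0, 1])

rows = {}
for t in [0.5, 1.0, 2.0, 3.0, 4.0]:  # candidate |z| cutoffs
    w = np.where(np.abs(z) >= t, beta_hat, 0.0)
    if not w.any():
        continue
    rows[t] = (predicted_corr(w, beta_hat),   # naive: training effects as truth
               predicted_corr(w, beta_post),  # shrunk: empirical Bayes
               observed_corr(w))              # reference: held-out data

for t, (naive, shrunk, test) in rows.items():
    print(f"|z|>={t}: naive={naive:.3f} shrunk={shrunk:.3f} test={test:.3f}")

# Tuning without test data: pick the cutoff with the best shrunk prediction.
best_t = max(rows, key=lambda t: rows[t][1])
print("selected cutoff:", best_t)
```

In this setup the naive prediction treats the noisy training estimates as the true effects, so it is inflated most for loose cutoffs that admit many null SNPs. The shrunk prediction down-weights estimates that are likely noise. The held-out column is computed only to check the predictions and is not used for selection.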

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a959/10312948/66b3b0dfc354/nihpp-rs2939390v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a959/10312948/04b73bc62ace/nihpp-rs2939390v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a959/10312948/47988c0f3a44/nihpp-rs2939390v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a959/10312948/e3b1251fb7a0/nihpp-rs2939390v1-f0004.jpg

Similar articles

1. Tuning Parameters for Polygenic Risk Score Methods Using GWAS Summary Statistics from Training Data.
Res Sq. 2023 May 31:rs.3.rs-2939390. doi: 10.21203/rs.3.rs-2939390/v1.
2. Tuning parameters for polygenic risk score methods using GWAS summary statistics from training data.
Nat Commun. 2024 Jan 2;15(1):24. doi: 10.1038/s41467-023-44009-0.
3. A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information.
J Am Stat Assoc. 2021;116(533):133-143. doi: 10.1080/01621459.2020.1764849. Epub 2020 Oct 12.
4. Optimizing and benchmarking polygenic risk scores with GWAS summary statistics.
Genome Biol. 2024 Oct 8;25(1):260. doi: 10.1186/s13059-024-03400-w.
5. PUMAS: fine-tuning polygenic risk scores with GWAS summary statistics.
Genome Biol. 2021 Sep 6;22(1):257. doi: 10.1186/s13059-021-02479-9.
6. Development and validation of genome-wide polygenic risk scores for predicting breast cancer incidence in Japanese females: a population-based case-cohort study.
Breast Cancer Res Treat. 2023 Feb;197(3):661-671. doi: 10.1007/s10549-022-06843-6. Epub 2022 Dec 20.
7. Efficient Implementation of Penalized Regression for Genetic Risk Prediction.
Genetics. 2019 May;212(1):65-74. doi: 10.1534/genetics.119.302019. Epub 2019 Feb 26.
8. Improving polygenic risk prediction from summary statistics by an empirical Bayes approach.
Sci Rep. 2017 Feb 1;7:41262. doi: 10.1038/srep41262.
9. Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad470.
10. Pharmacogenomics polygenic risk score for drug response prediction using PRS-PGx methods.
Nat Commun. 2022 Sep 8;13(1):5278. doi: 10.1038/s41467-022-32407-9.