Optimizing and benchmarking polygenic risk scores with GWAS summary statistics.

Author information

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.

Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA.

Publication information

Genome Biol. 2024 Oct 8;25(1):260. doi: 10.1186/s13059-024-03400-w.

DOI:10.1186/s13059-024-03400-w

PMID:39379999

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11462675/

Abstract

BACKGROUND

Polygenic risk scores (PRS) are a major research topic in human genetics. However, a significant gap exists between PRS methodology and practical applications, because the individual-level data needed for PRS tasks such as model fine-tuning, benchmarking, and ensemble learning are often unavailable.

RESULTS

We introduce an innovative statistical framework to optimize and benchmark PRS models using summary statistics of genome-wide association studies. This framework builds upon our previous work and can fine-tune virtually all existing PRS models while accounting for linkage disequilibrium. In addition, we provide an ensemble learning strategy named PUMAS-ensemble to combine multiple PRS models into an ensemble score without requiring external data for model fitting. Through extensive simulations and analysis of many complex traits in the UK Biobank, we demonstrate that this approach closely approximates gold-standard analytical strategies based on external validation, and substantially outperforms state-of-the-art PRS methods.

CONCLUSIONS

Our method is a powerful and general modeling technique that, through ensemble learning, can continue to combine the best-performing PRS methods as they become available, and it could become an integral component of all future PRS applications.
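
To make the idea of benchmarking and combining PRS models from summary-level data concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It relies only on a standard identity: for standardized genotypes X and phenotype y, a PRS with weight vector w has squared correlation with y equal to (wᵀr)² / (wᵀΣw), where r = Xᵀy/n holds the marginal SNP-phenotype correlations and Σ = XᵀX/n is the linkage disequilibrium (LD) matrix. The function names, the simulated data, and the two candidate weight vectors are all hypothetical. The abstract does not describe how the framework obtains its training and validation statistics, so that step is not reproduced here; the example evaluates in-sample purely to show that the summary-level formula matches the individual-level computation.

```python
import numpy as np

def prs_r2_from_summary(w, r, ld):
    """Squared correlation between the PRS X @ w and y, using only
    r = X^T y / n (marginal correlations) and ld = X^T X / n (LD matrix)."""
    cov_prs_y = w @ r          # covariance of the PRS with the phenotype
    var_prs = w @ ld @ w       # variance of the PRS, accounting for LD
    return cov_prs_y ** 2 / var_prs

def ensemble_weights_from_summary(W, r, ld):
    """Least-squares coefficients for combining the K PRS models whose
    SNP weights are the columns of W, computed without individual-level data."""
    gram = W.T @ ld @ W        # K x K covariance matrix of the PRS models
    cross = W.T @ r            # covariance of each PRS with the phenotype
    return np.linalg.solve(gram, cross)

# Simulated data (hypothetical), used only to check the formulas.
rng = np.random.default_rng(0)
n, m = 500, 20
X = rng.normal(size=(n, m))
X = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize genotypes
y = X @ rng.normal(scale=0.2, size=m) + rng.normal(size=n)
y = (y - y.mean()) / y.std()                   # standardize phenotype

r = X.T @ y / n                                # summary-level association statistics
ld = X.T @ X / n                               # LD matrix

# Two hypothetical PRS models: marginal effects and ridge-style shrunken effects.
w_marginal = r
w_ridge = np.linalg.solve(ld + 0.5 * np.eye(m), r)
W = np.column_stack([w_marginal, w_ridge])

r2_single = [prs_r2_from_summary(W[:, k], r, ld) for k in range(W.shape[1])]
alpha = ensemble_weights_from_summary(W, r, ld)
w_ensemble = W @ alpha                         # SNP weights of the ensemble score
r2_ensemble = prs_r2_from_summary(w_ensemble, r, ld)

# Individual-level check of the summary-level R^2 for the ensemble score.
r2_check = np.corrcoef(X @ w_ensemble, y)[0, 1] ** 2
```

Because the ensemble coefficients come from a least-squares fit, the in-sample R² of the combined score in this sketch cannot be lower than that of either input model. An honest benchmark would compute r from data not used to train the PRS weights.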
