RápidoPGS: a rapid polygenic score calculator for summary GWAS data without a test dataset.

Affiliations

Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AW, UK.

Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge CB2 2QQ, UK.

Publication information

Bioinformatics. 2021 Dec 7;37(23):4444-4450. doi: 10.1093/bioinformatics/btab456.

DOI:10.1093/bioinformatics/btab456

PMID:34145897

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8652106/

Abstract

MOTIVATION

Polygenic scores (PGS) aim to genetically predict complex traits at an individual level. PGS are typically trained on genome-wide association summary statistics and require an independent test dataset to tune parameters. More recent methods allow parameters to be tuned on the training data, removing the need for independent test data, but approaches are computationally intensive. Based on fine-mapping principles, we present RápidoPGS, a flexible and fast method to compute PGS requiring summary-level Genome-wide association studies (GWAS) datasets only, with little computational requirements and no test data required for parameter tuning.
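
The abstract does not spell out how fine-mapping principles turn summary statistics into score weights. The sketch below shows one plausible reading, under assumptions not stated in the text: each SNP gets a Wakefield approximate Bayes factor, posterior probabilities are normalised within a single LD block under an at-most-one-causal-variant assumption, and each weight is the posterior probability times the reported effect size. The function names, the prior values (`pi_i`, `sd_prior`) and the toy numbers are all hypothetical and are not taken from the RápidoPGS package.

```python
import math

def log_abf(beta, se, sd_prior=0.2):
    """Wakefield log approximate Bayes factor (association vs. null).

    sd_prior is an assumed prior standard deviation of the true effect.
    """
    z = beta / se
    r = sd_prior ** 2 / (sd_prior ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)

def block_posteriors(betas, ses, pi_i=1e-4, sd_prior=0.2):
    """Posterior probability that each SNP in ONE LD block is causal,
    assuming at most one causal variant in the block."""
    log_terms = [log_abf(b, s, sd_prior) + math.log(pi_i)
                 for b, s in zip(betas, ses)]
    # The null model (no causal SNP in the block) is given log weight 0,
    # i.e. its prior is approximated as 1.
    all_terms = log_terms + [0.0]
    m = max(all_terms)
    log_total = m + math.log(sum(math.exp(t - m) for t in all_terms))
    return [math.exp(t - log_total) for t in log_terms]

def pgs_weights(betas, ses, pi_i=1e-4, sd_prior=0.2):
    """Weight = posterior probability of causality x reported effect size."""
    posteriors = block_posteriors(betas, ses, pi_i, sd_prior)
    return [p * b for p, b in zip(posteriors, betas)]

# Toy LD block of three SNPs (made-up effect sizes and standard errors).
betas = [0.30, 0.05, -0.02]
ses = [0.03, 0.03, 0.03]
posteriors = block_posteriors(betas, ses)
weights = pgs_weights(betas, ses)
```

In this toy block the first SNP carries almost all of the posterior mass, so its weight stays close to its reported effect while the other two are shrunk towards zero. Nothing in the sketch needs individual-level data or a tuning dataset, which matches the requirement described above.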

RESULTS

We show that RápidoPGS performs slightly less well than two out of three other widely used PGS methods (LDpred2, PRScs and SBayesR) for case-control datasets, with median r² difference: -0.0092, -0.0042 and 0.0064, respectively, but up to 17 000-fold faster with reduced computational requirements. RápidoPGS is implemented in R and can work with user-supplied summary statistics or download them from the GWAS catalog.
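
To make the r² comparison concrete, the sketch below applies a set of weights to genotype dosages and computes the squared Pearson correlation between the resulting scores and a trait. The abstract does not define the metric, so reading r² as a squared correlation is an assumption, and the weights, genotypes and phenotype values are invented for illustration only.

```python
def polygenic_score(weights, dosages):
    """Score for one individual: sum of weight x allele dosage (0, 1 or 2)."""
    return sum(w * g for w, g in zip(weights, dosages))

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical per-SNP weights and four individuals (rows) by three SNPs.
toy_weights = [0.30, -0.10, 0.05]
genotypes = [
    [0, 1, 2],
    [1, 0, 0],
    [2, 2, 1],
    [0, 0, 0],
]
phenotype = [0.1, 0.4, 0.5, 0.0]  # made-up trait values

scores = [polygenic_score(toy_weights, g) for g in genotypes]
r2 = r_squared(scores, phenotype)
```

Differences such as -0.0092 would then correspond to subtracting the r² obtained with one method's weights from the r² obtained with another's on the same individuals.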

AVAILABILITY AND IMPLEMENTATION

Our method is available with a GPL license as an R package from CRAN and GitHub.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a95/8652106/229234d99ca5/btab456f1.jpg

Similar articles

Improving on polygenic scores across complex traits using select and shrink with summary statistics (S4) and LDpred2.

BMC Genomics. 2024 Sep 18;25(1):878. doi: 10.1186/s12864-024-10706-3.

SummaryAUC: a tool for evaluating the performance of polygenic risk prediction models in validation datasets with only summary level statistics.

Bioinformatics. 2019 Oct 15;35(20):4038-4044. doi: 10.1093/bioinformatics/btz176.

Metasubtract: an R-package to analytically produce leave-one-out meta-analysis GWAS summary statistics.

Bioinformatics. 2020 Aug 15;36(16):4521-4522. doi: 10.1093/bioinformatics/btaa570.

Evaluation of polygenic prediction methodology within a reference-standardized framework.

PLoS Genet. 2021 May 4;17(5):e1009021. doi: 10.1371/journal.pgen.1009021. eCollection 2021 May.

Integrate multiple traits to detect novel trait-gene association using GWAS summary data with an adaptive test approach.

Bioinformatics. 2019 Jul 1;35(13):2251-2257. doi: 10.1093/bioinformatics/bty961.

simGWAS: a fast method for simulation of large scale case-control GWAS summary statistics.

Bioinformatics. 2019 Jun 1;35(11):1901-1906. doi: 10.1093/bioinformatics/bty898.

CoMM-S2: a collaborative mixed model using summary statistics in transcriptome-wide association studies.

Bioinformatics. 2020 Apr 1;36(7):2009-2016. doi: 10.1093/bioinformatics/btz880.

GWASBrewer: An R Package for Simulating Realistic GWAS Summary Statistics.

Genet Epidemiol. 2025 Jan;49(1):e22594. doi: 10.1002/gepi.22594. Epub 2024 Oct 6.

LDpred2: better, faster, stronger.

Bioinformatics. 2021 Apr 1;36(22-23):5424-5431. doi: 10.1093/bioinformatics/btaa1029.

Cited by

Multi-omics Integrative Analysis for Incomplete Data Using Weighted p-Value Adjustment Approaches.

J Agric Biol Environ Stat. 2025;30(3):601-617. doi: 10.1007/s13253-024-00603-3. Epub 2024 Feb 28.

shaPRS: Leveraging shared genetic effects across traits or ancestries improves accuracy of polygenic scores.

Am J Hum Genet. 2024 Jun 6;111(6):1006-1017. doi: 10.1016/j.ajhg.2024.04.009. Epub 2024 May 3.

Genetic influences on circulating retinol and its relationship to human health.

Nat Commun. 2024 Feb 19;15(1):1490. doi: 10.1038/s41467-024-45779-x.

References

A simple new approach to variable selection in regression, with application to genetic fine mapping.

J R Stat Soc Series B Stat Methodol. 2020 Dec;82(5):1273-1300. doi: 10.1111/rssb.12388. Epub 2020 Jul 10.

Estimation of Parental Effects Using Polygenic Scores.

Behav Genet. 2021 May;51(3):264-278. doi: 10.1007/s10519-020-10032-w. Epub 2021 Jan 2.

LDpred2: better, faster, stronger.

Bioinformatics. 2021 Apr 1;36(22-23):5424-5431. doi: 10.1093/bioinformatics/btaa1029.

Improving the trans-ancestry portability of polygenic risk scores by prioritizing variants in predicted cell-type-specific regulatory elements.

Nat Genet. 2020 Dec;52(12):1346-1354. doi: 10.1038/s41588-020-00740-8. Epub 2020 Nov 30.

Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses.

PLoS Genet. 2020 Apr 20;16(4):e1008720. doi: 10.1371/journal.pgen.1008720. eCollection 2020 Apr.

Making the Most of Clumping and Thresholding for Polygenic Scores.

Am J Hum Genet. 2019 Dec 5;105(6):1213-1221. doi: 10.1016/j.ajhg.2019.11.001. Epub 2019 Nov 21.

Improved polygenic prediction by Bayesian multiple regression on summary statistics.

Nat Commun. 2019 Nov 8;10(1):5086. doi: 10.1038/s41467-019-12653-0.

A flexible and parallelizable approach to genome-wide polygenic risk scores.

Genet Epidemiol. 2019 Oct;43(7):730-741. doi: 10.1002/gepi.22245. Epub 2019 Jul 22.

Polygenic prediction via Bayesian regression and continuous shrinkage priors.

Nat Commun. 2019 Apr 16;10(1):1776. doi: 10.1038/s41467-019-09718-5.

Clinical use of current polygenic risk scores may exacerbate health disparities.

Nat Genet. 2019 Apr;51(4):584-591. doi: 10.1038/s41588-019-0379-x. Epub 2019 Mar 29.
