AoUPRS: A cost-effective and versatile PRS calculator for the All of Us Program.

Authors

Khattab Ahmed, Chen Shang-Fu, Wineinger Nathan, Torkamani Ali

Affiliations

Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA.

Scripps Research Translational Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA, 92037, USA.

Publication information

BMC Genomics. 2025 May 22;26(1):521. doi: 10.1186/s12864-025-11693-9.

Abstract

BACKGROUND

The All of Us (AoU) Research Program provides a comprehensive genomic dataset to accelerate health research and medical breakthroughs. Despite its potential, researchers face significant challenges, including high costs and inefficiencies associated with data extraction and analysis. AoUPRS addresses these challenges by offering a versatile and cost-effective tool for calculating polygenic risk scores (PRS), enabling both experienced and novice researchers to leverage the AoU dataset for large-scale genomic discoveries.

METHODS

We evaluated three PRS models from the PGS Catalog (coronary artery disease, atrial fibrillation, and type 2 diabetes) using two distinct approaches in the Hail framework: MatrixTable (MT), a dense representation, and Variant Dataset (VDS), a sparse representation optimized for large-scale genomic data. Computational cost, resource usage, and processing time were compared. To assess the similarity of PRS performance between these two approaches, we compared odds ratios (ORs) and area under the curve (AUC). Lin's concordance correlation coefficient (CCC) was also computed to quantify agreement between PRS scores generated by MT and VDS.
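The agreement measure named above, Lin's concordance correlation coefficient, can be illustrated with a short sketch. This is not the AoUPRS code; the function name `lins_ccc` and the toy inputs are hypothetical, and the sketch assumes the standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) moments.

```python
import numpy as np


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for two paired score vectors.

    Hypothetical helper for illustration; uses population (ddof=0) moments.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D arrays of equal length >= 2")

    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()  # ddof=0 by default
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()

    # Penalizes both poor correlation and any shift in location or scale.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Toy values only; these are not scores from the study.
    scores_mt = [0.10, 0.25, 0.40, 0.55]
    scores_vds = [0.11, 0.24, 0.41, 0.53]
    print(lins_ccc(scores_mt, scores_vds))
```

Unlike Pearson's correlation, this coefficient drops below 1 when one set of scores is systematically offset from the other, which is why it suits a check that two pipelines produce the same values rather than merely correlated ones.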

RESULTS

The VDS approach reduced computational costs by up to 99.51% (e.g., from $32 to $0.036 for a 51-SNP score) while maintaining PRS estimates that were highly similar to those obtained using the MT approach. Across all three PRS models, AUC comparisons showed minimal differences between MT and VDS, indicating that both approaches yield consistent PRS performance. Agreement between PRS scores calculated by both approaches was further supported by Lin's CCC values ranging from 0.9199 to 0.9944, confirming strong concordance. Empirical cumulative distribution function (ECDF) plots further illustrated the near-identical distribution of PRS values across methods.
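The ECDF comparison mentioned above can be sketched as follows. The abstract reports a visual comparison only; the numeric summary here (the largest vertical gap between the two ECDFs, i.e. the two-sample Kolmogorov-Smirnov statistic) is an added illustration, not something the authors state they computed. The function names `ecdf` and `max_ecdf_gap` are hypothetical.

```python
import numpy as np


def ecdf(values):
    """Return sorted values and the empirical CDF evaluated at each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


def max_ecdf_gap(a, b):
    """Largest vertical distance between the ECDFs of two samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # the gap can only change at observed points
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


if __name__ == "__main__":
    # Toy values only; these are not scores from the study.
    scores_mt = [0.10, 0.25, 0.40, 0.55]
    scores_vds = [0.11, 0.24, 0.41, 0.53]
    print(max_ecdf_gap(scores_mt, scores_vds))
```

A gap of 0 means the two distributions coincide exactly, and a gap of 1 means they do not overlap at all.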

CONCLUSIONS

AoUPRS enables efficient and cost-effective PRS computation within AoU, providing substantial cost savings while maintaining highly consistent PRS estimates. These findings support the use of AoUPRS for large-scale genomic risk assessment, making the AoU dataset more accessible and practical for diverse research applications. The tool's open-source availability on GitHub, coupled with detailed documentation and tutorials, ensures accessibility and ease of use for the scientific community.
