Optimizing UK biobank cloud-based research analysis platform to fine-map coronary artery disease loci in whole genome sequencing data.

Authors

Sng Letitia M F, Kaphle Anubhav, O'Brien Mitchell J, Hosking Brendan, Reguant Roc, Verjans Johan, Jain Yatish, Twine Natalie A, Bauer Denis C

Affiliations

Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, New South Wales, Australia.

Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Melbourne, Victoria, Australia.

Publication information

Sci Rep. 2025 Mar 25;15(1):10335. doi: 10.1038/s41598-025-95286-2.

Abstract

We conducted the first comprehensive association analysis of a coronary artery disease (CAD) cohort within the recently released UK Biobank (UKB) whole genome sequencing dataset. Using the fine-mapping tool PolyFun, we pinpointed rs10757274 as the most likely causal single-nucleotide variant (SNV) within the 9p21.3 CAD risk locus. Notably, we show that the machine-learning (ML) approaches REGENIE and VariantSpark exhibited greater sensitivity than traditional single-SNV logistic regression, uncovering rs28451064, a known risk locus in 21q22.11. Our findings underscore the utility of advanced computational techniques and cloud-based resources for mega-biobank analyses. In line with the paradigm shift toward bringing compute to the data, we demonstrate a 44% cost reduction and a 94% speedup through compute architecture optimisation on UK Biobank's Research Analysis Platform using our RAPpoet approach. We discuss three considerations for researchers implementing novel workflows for datasets hosted on cloud platforms, to pave the way for harnessing mega-biobank-sized data through scalable, cost-effective cloud computing solutions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1816/11937306/53c4bbac9006/41598_2025_95286_Fig1_HTML.jpg
