AskBeacon-performing genomic data exchange and analytics with natural language.

Authors

Wickramarachchi Anuradha, Tonni Shakila, Majumdar Sonali, Karimi Sarvnaz, Kõks Sulev, Hosking Brendan, Rambla Jordi, Twine Natalie A, Jain Yatish, Bauer Denis C

Affiliations

Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Adelaide, SA 5000, Australia.

Data61, Commonwealth Scientific and Industrial Research Organisation, Sydney, NSW 2015, Australia.

Publication

Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf079.

DOI:10.1093/bioinformatics/btaf079
PMID:39985504
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11889448/
Abstract

MOTIVATION

Enabling clinicians and researchers to directly interact with global genomic data resources by removing technological barriers is vital for medical genomics. AskBeacon enables large language models (LLMs) to be applied to securely shared cohorts via the Global Alliance for Genomics and Health Beacon protocol. By simply "asking" Beacon, actionable insights can be gained, analyzed, and made publication-ready.
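
To make "asking" Beacon more concrete, the sketch below shows the kind of structured request a natural-language question might be translated into. It is illustrative only and is not AskBeacon's actual output: the field layout follows the general shape of a Beacon v2 request body as an assumption, and the helper name `build_beacon_query`, the set of accepted granularity values, the coordinates, and the filter term are hypothetical placeholders.

```python
import json

# Granularity levels this sketch accepts (an assumption, not the full specification).
GRANULARITIES = {"boolean", "count", "record"}


def build_beacon_query(reference_name, start, end, filter_ids, granularity="count"):
    """Assemble a Beacon v2-style request body (hypothetical helper)."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity}")
    if start > end:
        raise ValueError("start must not exceed end")
    return {
        "meta": {"apiVersion": "v2.0"},
        "query": {
            # Genomic region of interest.
            "requestParameters": {
                "referenceName": reference_name,
                "start": [start],
                "end": [end],
            },
            # Cohort restrictions expressed as ontology-term filters.
            "filters": [{"id": term} for term in filter_ids],
            # Asking only for counts keeps individual-level records out of the response.
            "requestedGranularity": granularity,
        },
    }


if __name__ == "__main__":
    # Placeholder region and placeholder ontology term, not values from the paper.
    query = build_beacon_query("X", 1000, 2000, ["EXAMPLE:0000001"])
    print(json.dumps(query, indent=2))
```

Requesting counts rather than records is one way a question about group differences could be answered without returning individual genomes; whether AskBeacon does exactly this is not stated in the abstract.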

RESULTS

In the Parkinson's Progression Markers Initiative (PPMI), we use natural language to ask whether the sex-differences observed in Parkinson's disease are due to X-linked or autosomal markers. AskBeacon returns a publication-ready visualization showing that for PPMI the autosomal marker occurred 1.4 times more often in males with Parkinson's disease than females, compared to no differences for the X-linked marker. We evaluate commercial and open-weight LLM models, as well as different architectures to identify the best strategy for translating research questions to Beacon queries. AskBeacon implements extensive safety guardrails to ensure that genomic data is not exposed to the LLM directly, and that generated code for data extraction, analysis and visualization process is sanitized and hallucination resistant, so data cannot be leaked or falsified.
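
One way to picture the "sanitized" guardrail is a static check that rejects generated analysis code before it is ever run. The sketch below is an assumption about how such a check could look, not AskBeacon's implementation: the allow-list `ALLOWED_IMPORTS`, the `BLOCKED_CALLS` set, and the function `check_generated_code` are all hypothetical names chosen for illustration.

```python
import ast

# Hypothetical policy: only these top-level packages may be imported.
ALLOWED_IMPORTS = {"pandas", "numpy", "matplotlib"}
# Hypothetical policy: these built-in calls are never permitted.
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "__import__"}


def check_generated_code(source: str) -> list[str]:
    """Return the policy violations found in `source`; an empty list means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        # Collect the module names referenced by import statements.
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {module}")

        # Flag direct calls to blocked built-ins such as eval(...) or open(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"call not allowed: {node.func.id}")
    return problems


if __name__ == "__main__":
    print(check_generated_code("import pandas as pd\ndf = pd.DataFrame()"))
    print(check_generated_code("import os\neval('1 + 1')"))
```

A check like this inspects the syntax tree without executing anything, so rejected code never touches the data. It only catches direct imports and direct calls by name, so a real deployment would likely pair it with a restricted execution environment.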

AVAILABILITY AND IMPLEMENTATION

AskBeacon is available at https://github.com/aehrc/AskBeacon.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e64/11889448/53e045a146a4/btaf079f1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e64/11889448/51a4918f7866/btaf079f2.jpg
