Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models.

Authors

Rehana Hasin, Zheng Jie, Yeh Leo, Bansal Benu, Çam Nur Bengisu, Jemiyo Christianah, McGregor Brett, Özgür Arzucan, He Yongqun, Hur Junguk

Affiliations

Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA.

School of Electrical Engineering & Computer Science, University of North Dakota, Grand Forks, North Dakota, 58202, USA.

Publication information

ArXiv. 2025 Feb 12:arXiv:2502.09659v1.

PMID: 40196147
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11975310/

Abstract

MOTIVATION

An adjuvant is a chemical incorporated into vaccines that enhances their efficacy by improving the immune response. Identifying adjuvant names from cancer vaccine studies is essential for furthering research and enhancing immunotherapies. However, the manual curation from the constantly expanding biomedical literature poses significant challenges. This study explores the automated recognition of vaccine adjuvant names using state-of-the-art Large Language Models (LLMs), specifically Generative Pretrained Transformers (GPT) and Large Language Model Meta AI (Llama).

METHODS

We utilized two datasets: 97 clinical trial records from AdjuvareDB and 290 PubMed abstracts annotated with the Vaccine Adjuvant Compendium (VAC). Two LLMs, GPT-4o and Llama 3.2, were employed in zero-shot and few-shot learning paradigms with up to four examples per prompt. Prompts explicitly targeted adjuvant names, testing the impact of contextual information such as substances or interventions. Outputs underwent automated and manual validation for accuracy and consistency.
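
As a rough illustration of the zero-shot and few-shot setup described above, the following Python sketch assembles a prompt with zero to four worked examples and an optional interventions field. The instruction wording, the field labels, and the `build_prompt` helper are assumptions made for this illustration; they are not the prompts used in the study, which are available in the authors' repository.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical instruction text; the study's actual prompt wording may differ.
INSTRUCTION = (
    "Extract every vaccine adjuvant name mentioned in the text. "
    "Return the names separated by semicolons, or 'None' if there are none."
)


def build_prompt(
    text: str,
    shots: Sequence[Tuple[str, str]] = (),
    interventions: Optional[str] = None,
    max_shots: int = 4,
) -> str:
    """Assemble a zero-shot (no shots) or few-shot prompt as plain text.

    `shots` holds (example_text, expected_answer) pairs. The cap of four
    mirrors the "up to four examples per prompt" setting in the abstract.
    """
    if len(shots) > max_shots:
        raise ValueError(f"at most {max_shots} examples are allowed per prompt")

    parts = [INSTRUCTION, ""]
    for example_text, example_answer in shots:
        parts += [f"Text: {example_text}", f"Adjuvants: {example_answer}", ""]

    parts.append(f"Text: {text}")
    if interventions:
        # Optional contextual enrichment, e.g. a trial record's interventions field.
        parts.append(f"Interventions: {interventions}")
    parts.append("Adjuvants:")
    return "\n".join(parts)


if __name__ == "__main__":
    # Illustrative one-shot prompt; the example texts are invented placeholders.
    print(
        build_prompt(
            "Patients received a peptide vaccine emulsified in Montanide ISA-51.",
            shots=[("The vaccine was given with GM-CSF.", "GM-CSF")],
            interventions="Biological: peptide vaccine; Drug: Montanide ISA-51",
        )
    )
```

Keeping the prompt builder free of any model call makes it easy to send the same text to different models and to toggle the contextual field on and off when comparing settings.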

RESULTS

GPT-4o consistently attained 100% Precision across all situations, while also exhibiting notable enhancements in Recall and F1-scores, particularly with the incorporation of interventions. On the VAC dataset, GPT-4o achieved a maximum F1-score of 77.32% with interventions, surpassing Llama-3.2-3B by approximately 2%. On the AdjuvareDB dataset, GPT-4o reached an F1-score of 81.67% for three-shot prompting with interventions, surpassing Llama-3.2-3B's maximum F1-score of 65.62%. These results highlight the critical role of contextual information in enhancing model performance, with GPT-4o demonstrating a superior ability to leverage this enrichment.
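
For readers less familiar with the metrics reported above, the sketch below computes precision, recall, and F1 for one set of predicted adjuvant names against a gold-standard set. The abstract does not state the matching rule or how scores were averaged, so exact matching after case and whitespace normalization is an assumption here, and the adjuvant names in the usage example are illustrative rather than drawn from the study's data.

```python
from typing import Iterable, Tuple


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of whitespace (assumed matching rule)."""
    return " ".join(name.lower().split())


def precision_recall_f1(
    predicted: Iterable[str], gold: Iterable[str]
) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 over normalized names."""
    pred = {normalize(n) for n in predicted}
    ref = {normalize(n) for n in gold}
    true_positives = len(pred & ref)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Every predicted name is correct, but one gold name is missed.
    p, r, f = precision_recall_f1(
        predicted=["GM-CSF", "Montanide ISA-51"],
        gold=["GM-CSF", "Montanide ISA-51", "poly-ICLC"],
    )
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

The usage example shows the pattern the results describe: when no predicted name is wrong, precision is 100%, and F1 is then limited entirely by how many gold-standard names the model misses.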

CONCLUSION

Our findings demonstrate that LLMs excel at accurately identifying adjuvant names, including rare and novel variations of naming representation. This study emphasizes the capability of LLMs to enhance cancer vaccine development by efficiently extracting insights from clinical trial data. Future work aims to broaden the framework to encompass a wider array of biomedical literature and enhance model generalizability across various vaccines and adjuvants.

AVAILABILITY

Source code is available at https://github.com/hurlab/Vaccine-Adjuvant-LLM.
