Enhancing Cross-Domain Generalizability in Social Determinants of Health Extraction with Prompt-Tuning Large Language Models.

Authors

Peng Cheng, Yu Zehao, Smith Kaleb E, Lo-Ciganic Wei-Hsuan, Bian Jiang, Wu Yonghui

Affiliations

Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA.

NVIDIA, Santa Clara, California, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:432-440. eCollection 2025.

PMID: 40502248
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12150740/
Abstract

The progress in natural language processing (NLP) using large language models (LLMs) has greatly improved patient information extraction from clinical narratives. However, most methods based on the fine-tuning strategy have limited transfer learning ability for cross-domain applications. This study proposed a novel approach that employs a soft prompt-based learning architecture, which introduces trainable prompts to guide LLMs toward desired outputs. We examined two types of LLM architectures, including encoder-only GatorTron and decoder-only GatorTronGPT, and evaluated their performance for the extraction of social determinants of health (SDoH) using a cross-institution dataset from the 2022 n2c2 challenge and a cross-disease dataset from the University of Florida (UF) Health. The results show that decoder-only LLMs with prompt tuning achieved better performance in cross-domain applications. GatorTronGPT achieved the best F1 scores for both datasets, outperforming traditional fine-tuned GatorTron by 8.9% and 21.8% in a cross-institution setting, and 5.5% and 14.5% in a cross-disease setting.
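The soft-prompt idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `SoftPromptInput`, its methods, and the tiny dimensions are hypothetical, and the sketch assumes the common prompt-tuning setup in which the pretrained token embeddings stay frozen and only the prepended prompt vectors are updated. A real system would feed the combined sequence into a transformer such as GatorTron or GatorTronGPT and obtain gradients from a task loss.

```python
import random


class SoftPromptInput:
    """Toy input layer: trainable prompt vectors prepended to frozen token embeddings.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, vocab_size, dim, prompt_len, seed=0):
        rng = random.Random(seed)

        def init(rows):
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]

        # Stand-in for the pretrained embedding table; never updated here.
        self.token_embeddings = init(vocab_size)
        # The soft prompt: the only parameters this sketch trains.
        self.soft_prompt = init(prompt_len)

    def build(self, token_ids):
        """Return the sequence the model would see: prompt vectors, then token vectors."""
        prompt_part = [list(vec) for vec in self.soft_prompt]
        token_part = [list(self.token_embeddings[t]) for t in token_ids]
        return prompt_part + token_part

    def sgd_step(self, prompt_grads, lr=0.1):
        """Apply one gradient-descent step to the soft prompt only."""
        for row, grad_row in zip(self.soft_prompt, prompt_grads):
            for j, g in enumerate(grad_row):
                row[j] -= lr * g


layer = SoftPromptInput(vocab_size=100, dim=4, prompt_len=3)
sequence = layer.build([7, 42, 9])
# The sequence length is the prompt length plus the number of input tokens.
print(len(sequence), len(sequence[0]))
```

Because only the prompt vectors change during training, the number of trainable values in this sketch is `prompt_len * dim`, regardless of how large the frozen part is.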
