


Zero-Shot Extraction of Seizure Outcomes from Clinical Notes Using Generative Pretrained Transformers.

Authors

Ojemann William K S, Xie Kevin, Liu Kevin, Chang Ellie, Roth Dan, Litt Brian, Ellis Colin A

Affiliations

Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104 USA.

Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA 19104 USA.

Publication

J Healthc Inform Res. 2025 Apr 29;9(3):380-400. doi: 10.1007/s41666-025-00198-5. eCollection 2025 Sep.

DOI: 10.1007/s41666-025-00198-5
PMID: 40726746
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12290146/
Abstract


Emerging evidence has shown that pre-trained encoder transformer models can extract information from unstructured clinic note text but require manual annotation for supervised fine-tuning. Large Generative Pre-trained Transformer (GPT) models may streamline this process. In this study, we explore GPTs in zero- and few-shot learning scenarios to analyze clinical health records. We prompt-engineered Llama2 13B to optimize performance in extracting seizure freedom from epilepsy clinic notes and compared it against zero-shot and fine-tuned Bio + ClinicalBERT (BERT) models. Our evaluation encompasses different prompting paradigms, including one-word answers, elaboration-based responses, prompts with date formatting instructions, and prompts with dates in context. We found promising median accuracy rates in seizure freedom classification for zero-shot GPTs: one-word, 62%; elaboration, 50%; prompts with formatted dates, 62%; and prompts with dates in context, 74%. These outperform the zero-shot BERT model (25%) but fall short of the fully fine-tuned BERT model (84%). Furthermore, in sparse contexts, such as notes from general neurologists, the best performing GPT (76%) surpasses the fine-tuned BERT model (67%) in extracting seizure freedom. This study demonstrates the potential of GPTs in extracting clinically relevant information from unstructured EHR text, offering insights into population trends in seizure management, drug effects, risk factors, and healthcare disparities. Moreover, GPTs exhibit superiority over task-specific models in contexts with the potential to include less precise descriptions of epilepsy and seizures, highlighting their versatility. Additionally, simple prompt engineering techniques enhance model accuracy, presenting a framework for leveraging EHR data with zero clinical annotation.
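The four prompting paradigms evaluated in the study can be illustrated as simple template functions. The sketch below is hypothetical: the wording, function names, and answer options are illustrative assumptions, not the authors' exact prompts, and the templates would be passed to a Llama2-style generation call.

```python
# Illustrative zero-shot prompt templates for seizure-freedom extraction.
# These mirror the four paradigms named in the abstract; the exact wording
# used in the study is not reproduced here.

def one_word_prompt(note: str) -> str:
    # Paradigm 1: constrain the model to a one-word answer.
    return (f"Clinic note:\n{note}\n\n"
            "Has this patient been seizure-free since the last visit? "
            "Answer with one word: Yes, No, or Unclear.")

def elaboration_prompt(note: str) -> str:
    # Paradigm 2: ask the model to explain before concluding.
    return (f"Clinic note:\n{note}\n\n"
            "Summarize the evidence in the note, then state whether the "
            "patient has been seizure-free since the last visit.")

def formatted_date_prompt(note: str) -> str:
    # Paradigm 3: add explicit date-formatting instructions.
    return (f"Clinic note:\n{note}\n\n"
            "Report any dates in YYYY-MM-DD format. Has the patient had any "
            "seizures since the last visit? Answer Yes, No, or Unclear.")

def date_in_context_prompt(note: str, visit_date: str) -> str:
    # Paradigm 4: supply the visit date directly in the context.
    return (f"Visit date: {visit_date}\nClinic note:\n{note}\n\n"
            "Has the patient had any seizures since the last visit? "
            "Answer Yes, No, or Unclear.")

# Example with a synthetic (fabricated) note:
note = "Patient reports no events since last visit; last seizure over a year ago."
print(date_in_context_prompt(note, "2024-01-15"))
```

The study's finding that dates-in-context performed best (74% median accuracy) suggests that grounding the temporal reference directly in the prompt, rather than instructing the model about date formats, is the more effective design choice.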

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s41666-025-00198-5.


Similar Articles

1. Zero-Shot Extraction of Seizure Outcomes from Clinical Notes Using Generative Pretrained Transformers.
   J Healthc Inform Res. 2025 Apr 29;9(3):380-400. doi: 10.1007/s41666-025-00198-5. eCollection 2025 Sep.
2. A dataset and benchmark for hospital course summarization with adapted large language models.
   J Am Med Inform Assoc. 2025 Mar 1;32(3):470-479. doi: 10.1093/jamia/ocae312.
3. Keyword-optimized template insertion for clinical note classification via prompt-based learning.
   BMC Med Inform Decis Mak. 2025 Jul 3;25(1):247. doi: 10.1186/s12911-025-03071-y.
4. Trajectory-Ordered Objectives for Self-Supervised Representation Learning of Temporal Healthcare Data Using Transformers: Model Development and Evaluation Study.
   JMIR Med Inform. 2025 Jun 4;13:e68138. doi: 10.2196/68138.
5. CACER: Clinical concept Annotations for Cancer Events and Relations.
   J Am Med Inform Assoc. 2024 Nov 1;31(11):2583-2594. doi: 10.1093/jamia/ocae231.
6. Enhancing Clinical Relevance of Pretrained Language Models Through Integration of External Knowledge: Case Study on Cardiovascular Diagnosis From Electronic Health Records.
   JMIR AI. 2024 Aug 6;3:e56932. doi: 10.2196/56932.
7. Automated Transformation of Unstructured Cardiovascular Diagnostic Reports into Structured Datasets Using Sequentially Deployed Large Language Models.
   medRxiv. 2024 Oct 8:2024.10.08.24315035. doi: 10.1101/2024.10.08.24315035.
8. The potential of Generative Pre-trained Transformer 4 (GPT-4) to analyse medical notes in three different languages: a retrospective model-evaluation study.
   Lancet Digit Health. 2025 Jan;7(1):e35-e43. doi: 10.1016/S2589-7500(24)00246-2.
9. Evaluating the Reasoning Capabilities of Large Language Models for Medical Coding and Hospital Readmission Risk Stratification: Zero-Shot Prompting Approach.
   J Med Internet Res. 2025 Jul 30;27:e74142. doi: 10.2196/74142.
10. Language Models for Multilabel Document Classification of Surgical Concepts in Exploratory Laparotomy Operative Notes: Algorithm Development Study.
    JMIR Med Inform. 2025 Jul 9;13:e71176. doi: 10.2196/71176.
