Performance and improvement strategies for adapting generative large language models for electronic health record applications: A systematic review.

Authors

Du Xinsong, Zhou Zhengyang, Wang Yifei, Chuang Ya-Wen, Li Yiming, Yang Richard, Hong Pengyu, Bates David W, Zhou Li

Affiliations

Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA 02115, United States; Department of Medicine, Harvard Medical School, Boston, MA 02115, United States.

Department of Computer Science, Brandeis University, Waltham, MA 02453, United States.

Publication information

Int J Med Inform. 2025 Aug 28;205:106091. doi: 10.1016/j.ijmedinf.2025.106091.


DOI:10.1016/j.ijmedinf.2025.106091
PMID:40885071
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12413914/
Abstract

PURPOSE: To synthesize performance and improvement strategies for adapting generative LLMs in EHR analyses and applications.

METHODS: We followed the PRISMA guidelines to conduct a systematic review of articles from PubMed and Web of Science published between January 1, 2023 and November 9, 2024. Multiple reviewers, including biomedical informaticians and a clinician, took part in the article review process. Studies were included if they used generative LLMs to analyze real-world EHR data and reported quantitative performance evaluations for an improvement technique. The review identified key clinical applications and summarized performance and improvement strategies.

RESULTS: Of the 18,735 articles retrieved, 196 met our criteria. Of these, 112 (57.1%) studies used generative LLMs for clinical decision support tasks, 40 (20.4%) involved documentation tasks, 39 (19.9%) involved information extraction tasks, 11 (5.6%) involved patient communication tasks, and 10 (5.1%) included summarization tasks. Among the 196 studies, most (88.8%) did not quantitatively evaluate LLM performance improvement strategies. The remaining 24 studies (12.2%) quantitatively evaluated the effectiveness of in-context learning (9 studies), fine-tuning (12 studies), multimodal integration (8 studies), and ensemble learning (2 studies). Three studies highlighted that few-shot prompting, fine-tuning, and multimodal data integration might not improve performance, and another two studies found that fine-tuning a smaller model could outperform a larger model.

CONCLUSION: Applying a performance improvement strategy does not necessarily lead to performance improvement. Detailed guidelines on how to apply these strategies more effectively and safely are needed, and more quantitative analyses in the future could inform them.
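The task percentages in the RESULTS can be recomputed from the reported counts. The short Python sketch below does this under the assumption that every percentage uses the 196 included studies as its denominator; the dictionary labels and the `share` helper are illustrative names, not taken from the article.

```python
# Counts as reported in the abstract; keys are illustrative labels.
TOTAL_STUDIES = 196
task_counts = {
    "clinical decision support": 112,
    "documentation": 40,
    "information extraction": 39,
    "patient communication": 11,
    "summarization": 10,
}


def share(count, total=TOTAL_STUDIES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of the included studies that addressed each task.
task_shares = {task: share(n) for task, n in task_counts.items()}

# How far the summed task counts exceed the number of included studies.
overlap = sum(task_counts.values()) - TOTAL_STUDIES

for task, pct in task_shares.items():
    print(f"{task}: {task_counts[task]} studies ({pct}%)")
print(f"task assignments beyond {TOTAL_STUDIES} studies: {overlap}")
```

The task counts sum to more than 196, which suggests that some studies were counted under more than one task category, so the percentages are not expected to add up to 100%.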
