Leveraging Large Language Models for Accurate Retrieval of Patient Information From Medical Reports: Systematic Evaluation Study.

Authors

Garcia-Carmona Angel Manuel, Prieto Maria-Lorena, Puertas Enrique, Beunza Juan-Jose

Affiliations

Research and Doctorate School, Universidad Europea de Madrid, Madrid, Spain.

Department of Computing and Technology, Universidad Europea de Madrid, Madrid, Spain.

Publication

JMIR AI. 2025 Jul 3;4:e68776. doi: 10.2196/68776.

DOI:10.2196/68776
PMID:40608403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12271962/
Abstract

BACKGROUND

The digital transformation of health care has introduced both opportunities and challenges, particularly in managing and analyzing the vast amounts of unstructured medical data generated daily. There is a need to explore the feasibility of generative solutions in extracting data from medical reports, categorized by specific criteria.

OBJECTIVE

This study aimed to investigate the application of large language models (LLMs) for the automated extraction of structured information from unstructured medical reports, using the LangChain framework in Python.

METHODS

Through a systematic evaluation of leading LLMs (GPT-4o, Llama 3, Llama 3.1, Gemma 2, Qwen 2, and Qwen 2.5) using zero-shot prompting techniques and embedding results into a vector database, this study assessed the performance of LLMs in extracting patient demographics, diagnostic details, and pharmacological data.
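To make the zero-shot setup concrete, the following is a minimal sketch in plain Python of an instruction-only extraction prompt and a parser for the model's answer. It is not the authors' code and does not use LangChain; the field names, the prompt wording, the sample report, and the model reply are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical field names; the study's actual categories and prompts may differ.
FIELDS = ["name", "age", "diagnosis", "medication"]


def build_zero_shot_prompt(report: str, fields=FIELDS) -> str:
    """Build an instruction-only (zero-shot) prompt: no worked examples are included."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the medical report below and answer "
        f"with a single JSON object using exactly these keys: {field_list}. "
        "Use null for any field that is not stated in the report.\n\n"
        f"Report:\n{report}\n"
    )


def parse_extraction(raw_answer: str, fields=FIELDS) -> dict:
    """Parse the model's JSON answer, keeping only the expected keys."""
    try:
        data = json.loads(raw_answer)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    # Missing keys become None so every record has the same shape.
    return {field: data.get(field) for field in fields}


# Made-up report and made-up model reply, used only to exercise the functions.
report = "Patient John Doe, 54 years old, diagnosed with type 2 diabetes."
prompt = build_zero_shot_prompt(report)
fake_reply = '{"name": "John Doe", "age": 54, "diagnosis": "type 2 diabetes"}'
record = parse_extraction(fake_reply)
```

In a real pipeline the prompt would be sent to one of the evaluated models and the parsed record compared against a manually annotated reference; that step is omitted here because the abstract does not describe it in detail.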

RESULTS

Evaluation metrics, including accuracy, precision, recall, and F-score, revealed high efficacy across most categories, with GPT-4o achieving the highest overall performance (91.4% accuracy).
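As a reminder of how the four reported metrics relate to one another, here is a small sketch that computes them from confusion-matrix counts. It assumes that "F-score" means the F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly, and the counts in the example are invented rather than taken from the study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented counts for one extraction category, for illustration only.
metrics = classification_metrics(tp=80, fp=10, fn=5, tn=5)
```

Because precision penalizes spurious extractions and recall penalizes missed ones, two models with similar accuracy can still differ noticeably on these two measures.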

CONCLUSIONS

The findings highlight notable differences in precision and recall between models, particularly in extracting names and age-related information. There were challenges in processing unstructured medical text, including variability in model performance across data types. Our findings demonstrate the feasibility of integrating LLMs into health care workflows; LLMs offer substantial improvements in data accessibility and support clinical decision-making processes. In addition, the paper describes the role of retrieval-augmented generation techniques in enhancing information retrieval accuracy, addressing issues such as hallucinations and outdated data in LLM outputs. Future work should explore the need for optimization through larger and more diverse training datasets, advanced prompting strategies, and the integration of domain-specific knowledge to improve model generalizability and precision.
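The retrieval step behind retrieval-augmented generation can be illustrated with a toy example. The sketch below ranks report chunks against a question using bag-of-words cosine similarity and then places the best chunk into the prompt. It is a deliberately simplified stand-in: the study used learned embeddings stored in a vector database, and the `embed`, `cosine`, and `retrieve` helpers and the sample chunks here are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list, k: int = 1) -> list:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Made-up report chunks, for illustration only.
chunks = [
    "patient is a 54 year old male",
    "diagnosis type 2 diabetes mellitus",
    "treatment metformin 500 mg twice daily",
]
question = "what treatment was prescribed"
top = retrieve(question, chunks, k=1)

# The retrieved text is placed in the prompt so the model answers from the report
# rather than from its own parametric memory.
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: " + question
```

Grounding the answer in retrieved text is the mechanism by which this approach is expected to reduce hallucinated or outdated output.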

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a435/12271962/d177ea81040d/ai_v4i1e68776_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a435/12271962/0f693f7af530/ai_v4i1e68776_fig1.jpg

Similar articles

1
Leveraging Large Language Models for Accurate Retrieval of Patient Information From Medical Reports: Systematic Evaluation Study.
JMIR AI. 2025 Jul 3;4:e68776. doi: 10.2196/68776.
2
Data extraction from free-text stroke CT reports using GPT-4o and Llama-3.3-70B: the impact of annotation guidelines.
Eur Radiol Exp. 2025 Jun 19;9(1):61. doi: 10.1186/s41747-025-00600-2.
3
Enhancing Pulmonary Disease Prediction Using Large Language Models With Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study Based on Radiology Report.
J Med Internet Res. 2025 Jun 11;27:e72638. doi: 10.2196/72638.
4
Assessing Retrieval-Augmented Large Language Model Performance in Emergency Department ICD-10-CM Coding Compared to Human Coders.
medRxiv. 2024 Oct 17:2024.10.15.24315526. doi: 10.1101/2024.10.15.24315526.
5
Extracting epilepsy-related information from unstructured clinic letters using large language models.
Epilepsia. 2025 Jul 10. doi: 10.1111/epi.18475.
6
Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study.
J Med Internet Res. 2025 Jun 3;27:e75052. doi: 10.2196/75052.
7
Utilizing large language models for detecting hospital-acquired conditions: an empirical study on pulmonary embolism.
J Am Med Inform Assoc. 2025 May 1;32(5):876-884. doi: 10.1093/jamia/ocaf048.
8
Leveraging Medical Knowledge Graphs Into Large Language Models for Diagnosis Prediction: Design and Application Study.
JMIR AI. 2025 Feb 24;4:e58670. doi: 10.2196/58670.
9
From text to data: Open-source large language models in extracting cancer related medical attributes from German pathology reports.
Int J Med Inform. 2025 Nov;203:106022. doi: 10.1016/j.ijmedinf.2025.106022. Epub 2025 Jul 2.
10
Using Generative Artificial Intelligence in Health Economics and Outcomes Research: A Primer on Techniques and Breakthroughs.
Pharmacoecon Open. 2025 Apr 29. doi: 10.1007/s41669-025-00580-4.
