Evaluating large language model workflows in clinical decision support for triage and referral and diagnosis.

Authors

Gaber Farieda, Shaik Maqsood, Allega Fabio, Bilecz Agnes Julia, Busch Felix, Goon Kelsey, Franke Vedran, Akalin Altuna

Affiliations

Berlin Institute for Medical Systems Biology (BIMSB), Max Delbrück Center for Molecular Medicine, Berlin, Germany.

Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany.

Publication information

NPJ Digit Med. 2025 May 9;8(1):263. doi: 10.1038/s41746-025-01684-1.

DOI: 10.1038/s41746-025-01684-1
PMID: 40346344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12064692/
Abstract

Accurate medical decision-making is critical for both patients and clinicians. Patients often struggle to interpret their symptoms, determine their severity, and select the right specialist. Simultaneously, clinicians face challenges in integrating complex patient data to make timely, accurate diagnoses. Recent advances in large language models (LLMs) offer the potential to bridge this gap by supporting decision-making for both patients and healthcare providers. In this study, we benchmark multiple LLM versions and an LLM-based workflow incorporating retrieval-augmented generation (RAG) on a curated dataset of 2000 medical cases derived from the Medical Information Mart for Intensive Care database. Our findings show that these LLMs are capable of providing personalized insights into likely diagnoses, suggesting appropriate specialists, and assessing urgent care needs. These models may also support clinicians in refining diagnoses and decision-making, offering a promising approach to improving patient outcomes and streamlining healthcare delivery.
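As a rough illustration of what benchmarking a model on triage and specialist-referral labels can involve, the sketch below computes exact-match accuracy over a handful of toy cases. It is not the authors' evaluation code: the `CaseResult` fields, the label strings, and the `score` helper are all hypothetical names chosen for this example, and the study's actual metrics and label scheme are not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One benchmark case: reference labels plus a model's predictions.

    All field names are hypothetical and chosen only for this sketch.
    """
    true_triage: str
    pred_triage: str
    true_specialty: str
    pred_specialty: str


def accuracy(pairs):
    """Fraction of (reference, prediction) pairs that match exactly,
    ignoring letter case and surrounding whitespace."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        1 for ref, pred in pairs
        if ref.strip().lower() == pred.strip().lower()
    )
    return hits / len(pairs)


def score(results):
    """Per-task accuracy for a list of CaseResult objects."""
    return {
        "triage_accuracy": accuracy(
            (r.true_triage, r.pred_triage) for r in results
        ),
        "specialty_accuracy": accuracy(
            (r.true_specialty, r.pred_specialty) for r in results
        ),
    }


# Toy data invented for illustration; not drawn from the study's dataset.
toy_results = [
    CaseResult("urgent", "urgent", "cardiology", "Cardiology"),
    CaseResult("non-urgent", "urgent", "neurology", "neurology"),
    CaseResult("urgent", "urgent", "gastroenterology", "cardiology"),
    CaseResult("non-urgent", "non-urgent", "pulmonology", "pulmonology"),
]

print(score(toy_results))
```

Exact matching is the simplest possible scoring rule. Free-text diagnoses would likely need a more forgiving comparison (for example, expert or model-based judgment of equivalence), which this sketch does not attempt.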


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/30cd0d5e3b76/41746_2025_1684_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/8644f1279ca8/41746_2025_1684_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/7fac4476155a/41746_2025_1684_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/68670fa24570/41746_2025_1684_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/965976db0070/41746_2025_1684_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/76e0559b9e0a/41746_2025_1684_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/e8626ea383fd/41746_2025_1684_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d87f/12064692/f52543112724/41746_2025_1684_Fig8_HTML.jpg
