Gaber Farieda, Shaik Maqsood, Allega Fabio, Bilecz Agnes Julia, Busch Felix, Goon Kelsey, Franke Vedran, Akalin Altuna
Berlin Institute for Medical Systems Biology (BIMSB), Max Delbrück Center for Molecular Medicine, Berlin, Germany.
Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany.
NPJ Digit Med. 2025 May 9;8(1):263. doi: 10.1038/s41746-025-01684-1.
Accurate medical decision-making is critical for both patients and clinicians. Patients often struggle to interpret their symptoms, determine their severity, and select the right specialist. Simultaneously, clinicians face challenges in integrating complex patient data to make timely, accurate diagnoses. Recent advances in large language models (LLMs) offer the potential to bridge this gap by supporting decision-making for both patients and healthcare providers. In this study, we benchmark multiple LLM versions and an LLM-based workflow incorporating retrieval-augmented generation (RAG) on a curated dataset of 2000 medical cases derived from the Medical Information Mart for Intensive Care database. Our findings show that these LLMs are capable of providing personalized insights into likely diagnoses, suggesting appropriate specialists, and assessing urgent care needs. These models may also support clinicians in refining diagnoses and decision-making, offering a promising approach to improving patient outcomes and streamlining healthcare delivery.