Towards accurate differential diagnosis with large language models.

Authors

McDuff Daniel, Schaekermann Mike, Tu Tao, Palepu Anil, Wang Amy, Garrison Jake, Singhal Karan, Sharma Yash, Azizi Shekoofeh, Kulkarni Kavita, Hou Le, Cheng Yong, Liu Yun, Mahdavi S Sara, Prakash Sushant, Pathak Anupam, Semturs Christopher, Patel Shwetak, Webster Dale R, Dominowska Ewa, Gottweis Juraj, Barral Joelle, Chou Katherine, Corrado Greg S, Matias Yossi, Sunshine Jake, Karthikesalingam Alan, Natarajan Vivek

Affiliations

Google Research, Seattle, WA, USA.

Google Research, Toronto, Ontario, Canada.

Publication

Nature. 2025 Apr 9. doi: 10.1038/s41586-025-08869-4.

Abstract

A comprehensive differential diagnosis is a cornerstone of medical care that is often reached through an iterative process of interpretation that combines clinical history, physical examination, investigations and procedures. Interactive interfaces powered by large language models present new opportunities to assist and automate aspects of this process. Here we introduce the Articulate Medical Intelligence Explorer (AMIE), a large language model that is optimized for diagnostic reasoning, and evaluate its ability to generate a differential diagnosis alone or as an aid to clinicians. Twenty clinicians evaluated 302 challenging, real-world medical cases sourced from published case reports. Each case report was read by two clinicians, who were randomized to one of two assistive conditions: assistance from search engines and standard medical resources; or assistance from AMIE in addition to these tools. All clinicians provided a baseline, unassisted differential diagnosis prior to using the respective assistive tools. AMIE exhibited standalone performance that exceeded that of unassisted clinicians (top-10 accuracy 59.1% versus 33.6%, P = 0.04). Comparing the two assisted study arms, the differential diagnosis quality score was higher for clinicians assisted by AMIE (top-10 accuracy 51.7%) compared with clinicians without its assistance (36.1%; McNemar's test: 45.7, P < 0.01) and clinicians with search (44.4%; McNemar's test: 4.75, P = 0.03). Further, clinicians assisted by AMIE arrived at more comprehensive differential lists than those without assistance from AMIE. Our study suggests that AMIE has potential to improve clinicians' diagnostic reasoning and accuracy in challenging cases, meriting further real-world evaluation for its ability to empower physicians and widen patients' access to specialist-level expertise.
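The two statistics quoted above, top-10 accuracy and McNemar's test, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. The function names, the toy inputs and the discordant-pair counts in the usage note are hypothetical. The abstract does not say whether a continuity correction was applied, so the sketch exposes it as an option.

```python
import math


def top_k_accuracy(ranked_lists, true_diagnoses, k=10):
    """Fraction of cases whose true diagnosis appears in the first k entries
    of the corresponding ranked differential list (exact string match here;
    a real evaluation would need clinical judgement of equivalence)."""
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_lists, true_diagnoses)
    )
    return hits / len(true_diagnoses)


def mcnemar(b, c, continuity=True):
    """McNemar chi-square statistic and p-value for paired binary outcomes.

    b: cases correct under condition A only (hypothetical count)
    c: cases correct under condition B only (hypothetical count)
    Cases where both conditions agree do not enter the statistic.
    """
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c)
    if continuity:
        # Edwards' continuity correction; an assumption, not stated in the abstract
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

For example, `mcnemar(10, 25, continuity=False)` compares two conditions that disagree on 35 made-up cases. Because the test uses only the discordant pairs, it suits a design like this one, in which the same cases are scored under each condition.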
