Comparative analysis of large language models on rare disease identification.

Authors

Ao Guangyu, Chen Min, Li Jing, Nie Huibing, Zhang Lei, Chen Zejun

Affiliations

Department of Nephrology, Chengdu First People's Hospital, No.18 Wanxiang North Road, High-tech District, Chengdu, 610095, Sichuan, China.

Sichuan Provincial Geriatrics Clinical Medical Research Center, Chengdu, China.

Publication information

Orphanet J Rare Dis. 2025 Apr 1;20(1):150. doi: 10.1186/s13023-025-03656-w.

DOI:10.1186/s13023-025-03656-w

PMID:40165285

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11959745/

Abstract

Diagnosing rare diseases is challenging due to their low prevalence, diverse presentations, and limited recognition, often leading to diagnostic delays and errors. This study evaluates the effectiveness of multiple large language models (LLMs) in identifying rare diseases, comparing their performance with that of human physicians using real clinical cases. We analyzed 152 rare disease cases from the Chinese Medical Case Repository using four LLMs: ChatGPT-4o, Claude 3.5 Sonnet, Gemini Advanced, and Llama 3.1 405B. Overall, the LLMs performed better than human physicians, and Claude 3.5 Sonnet achieved the highest accuracy at 78.9%, significantly surpassing the accuracy of human physicians, which was 26.3%. These findings suggest that LLMs can improve rare disease diagnosis and serve as valuable tools in clinical settings, particularly in regions with limited resources. However, further validation and careful consideration of ethical and privacy issues are necessary for their effective integration into medical practice.
