Generative pre-trained transformer (GPT)-4 support for differential diagnosis in neuroradiology.

Author information

Sorin Vera, Klang Eyal, Sobeh Tamer, Konen Eli, Shrot Shai, Livne Adva, Weissbuch Yulian, Hoffmann Chen, Barash Yiftach

Affiliations

Department of Diagnostic Imaging, Chaim Sheba Medical Center, Ramat Gan, Israel.

The Faculty of Medicine, Tel-Aviv University, Tel Aviv-Yafo, Israel.

Publication information

Quant Imaging Med Surg. 2024 Oct 1;14(10):7551-7560. doi: 10.21037/qims-24-200. Epub 2024 Sep 23.

DOI:10.21037/qims-24-200

PMID:39429611

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11485343/

Abstract

BACKGROUND

Differential diagnosis in radiology relies on the accurate identification of imaging patterns. Large language models (LLMs) hold promise in radiology, with many potential applications that may make radiologists' workflow more efficient. This study aimed to evaluate the efficacy of generative pre-trained transformer (GPT)-4, an LLM, in providing differential diagnoses in neuroradiology, and to compare its performance with that of board-certified neuroradiologists.

METHODS

Sixty neuroradiology reports covering a variety of diagnoses were entered into GPT-4, which was tasked with generating a top-3 differential diagnosis for each case. The results were compared with the true diagnoses and with the differential diagnoses provided by three blinded neuroradiologists. Diagnostic accuracy and agreement between readers were assessed.

RESULTS

Of the 60 patients (mean age 47.8 years, 65% female), GPT-4 included the true diagnosis in its differential in 61.7% (37/60) of cases, while the neuroradiologists' accuracy ranged from 63.3% (38/60) to 73.3% (44/60). Agreement between GPT-4 and the neuroradiologists, and among the neuroradiologists themselves, was fair to moderate [Cohen's kappa (κw) 0.34-0.44 and 0.39-0.54, respectively].
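The reported percentages and the agreement statistic can be checked with a few lines of Python. The following is a minimal sketch, not the authors' analysis code: the case counts come from the abstract, while `rater_a` and `rater_b` are hypothetical per-case outcomes (1 = true diagnosis included, 0 = missed) invented purely for illustration. The abstract's "kw" notation may denote a weighted kappa; this sketch implements only the plain unweighted Cohen's kappa.

```python
from collections import Counter


def accuracy_pct(correct: int, total: int) -> float:
    """Percentage of cases in which the true diagnosis was in the differential."""
    return round(100 * correct / total, 1)


def cohen_kappa(a, b) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(a)
    # Observed agreement: fraction of cases where both raters gave the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        return 1.0  # degenerate case: both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Counts reported in the abstract.
print(accuracy_pct(37, 60))  # GPT-4: 61.7
print(accuracy_pct(38, 60))  # lowest neuroradiologist: 63.3
print(accuracy_pct(44, 60))  # highest neuroradiologist: 73.3

# Hypothetical labels for ten cases (NOT data from the study).
rater_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
rater_b = [1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
print(round(cohen_kappa(rater_a, rater_b), 2))  # 0.4
```

In this toy example the two raters agree on 7 of 10 cases, but half of that agreement is expected by chance, so kappa is about 0.4. That value falls inside the range the abstract describes as fair to moderate.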

CONCLUSIONS

GPT-4 shows potential as a support tool for differential diagnosis in neuroradiology, though it was outperformed by human experts. Radiologists should remain mindful of the limitations of LLMs while harnessing their potential to enhance educational and clinical work.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a02d/11485343/a7ed8af3b85c/qims-14-10-7551-f1.jpg
