Generative pre-trained transformer (GPT)-4 support for differential diagnosis in neuroradiology.

Author information

Sorin Vera, Klang Eyal, Sobeh Tamer, Konen Eli, Shrot Shai, Livne Adva, Weissbuch Yulian, Hoffmann Chen, Barash Yiftach

Affiliations

Department of Diagnostic Imaging, Chaim Sheba Medical Center, Ramat Gan, Israel.

The Faculty of Medicine, Tel-Aviv University, Tel Aviv-Yafo, Israel.

Publication information

Quant Imaging Med Surg. 2024 Oct 1;14(10):7551-7560. doi: 10.21037/qims-24-200. Epub 2024 Sep 23.

Abstract

BACKGROUND

Differential diagnosis in radiology relies on the accurate identification of imaging patterns. The use of large language models (LLMs) in radiology holds promise, with many potential applications that may enhance the efficiency of radiologists' workflow. This study aimed to evaluate the efficacy of generative pre-trained transformer (GPT)-4, an LLM, in providing differential diagnoses in neuroradiology, and to compare its performance with that of board-certified neuroradiologists.

METHODS

Sixty neuroradiology reports with varied diagnoses were entered into GPT-4, which was tasked with generating a top-3 differential diagnosis for each case. The results were compared to the true diagnoses and to the differential diagnoses provided by three blinded neuroradiologists. Diagnostic accuracy and agreement between readers were assessed.
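As a rough illustration of the top-3 scoring idea described above, the following Python sketch counts a case as correct when the true diagnosis appears among the first three entries of a differential. The function names, the case-insensitive exact string match, and the toy cases are all assumptions made for illustration; the abstract does not state how matches were adjudicated, and in practice this is likely a matter of expert judgment rather than string comparison.

```python
def top3_hit(true_diagnosis: str, differential: list[str]) -> bool:
    """Return True if the true diagnosis is among the first three entries.

    Assumption: a case-insensitive exact match stands in for whatever
    adjudication the study actually used.
    """
    normalized = [d.strip().lower() for d in differential[:3]]
    return true_diagnosis.strip().lower() in normalized


def top3_accuracy(cases: list[tuple[str, list[str]]]) -> float:
    """Fraction of cases whose true diagnosis is in the top-3 differential."""
    hits = sum(top3_hit(truth, diff) for truth, diff in cases)
    return hits / len(cases)


# Toy cases invented for illustration only; not data from the study.
toy_cases = [
    ("meningioma", ["Meningioma", "Schwannoma", "Metastasis"]),
    ("glioblastoma", ["Lymphoma", "Metastasis", "Abscess"]),
]
print(top3_accuracy(toy_cases))  # 0.5
```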

RESULTS

Of the 60 patients (mean age 47.8 years, 65% female), GPT-4 correctly included the diagnoses in its differentials in 61.7% (37/60) of cases, while the neuroradiologists' accuracy ranged from 63.3% (38/60) to 73.3% (44/60). Agreement between GPT-4 and the neuroradiologists, and among the neuroradiologists, was fair to moderate [Cohen's kappa (kw) 0.34-0.44 and kw 0.39-0.54, respectively].
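The reported percentages follow directly from the case counts, and the agreement statistic can be sketched in a few lines. The snippet below recomputes the three accuracy figures and shows a minimal unweighted Cohen's kappa for two raters. It is an illustration under assumptions: the notation "kw" suggests the study may have used a weighted kappa, which this sketch does not implement, and the two toy rating vectors are invented, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Accuracy figures recomputed from the counts reported in the abstract.
for hits in (37, 38, 44):
    print(f"{hits}/60 = {hits / 60:.1%}")

# Toy per-case correctness (1 = correct, 0 = incorrect); hypothetical values.
toy_a = [1, 1, 0, 0]
toy_b = [1, 0, 0, 0]
print(cohen_kappa(toy_a, toy_b))  # 0.5
```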

CONCLUSIONS

GPT-4 shows potential as a support tool for differential diagnosis in neuroradiology, though it was outperformed by human experts. Radiologists should remain mindful of the limitations of LLMs while harnessing their potential to enhance educational and clinical work.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a02d/11485343/a7ed8af3b85c/qims-14-10-7551-f1.jpg
