Layer by Layer: Assessing AI Diagnostic Accuracy With Incremental Case Information in Neuroradiology.

Authors

Lotfian Golnaz, Jhaveri Miral, Dua Sumeet G, Suthar Pokhraj P

Affiliation

Department of Diagnostic Radiology and Nuclear Medicine, Rush University Medical Center, Chicago, USA.

Publication

Cureus. 2025 Jun 12;17(6):e85874. doi: 10.7759/cureus.85874. eCollection 2025 Jun.

Abstract

Aim: Artificial intelligence (AI) has shown considerable potential for improving diagnostic accuracy and efficiency in radiology. This study assesses the diagnostic performance of Google Gemini (version 1.5 Flash; Google DeepMind, Mountain View, California, USA), a proprietary large language model, in interpreting challenging diagnostic cases from the "Case of the Month" series.

Materials and methods: We analyzed 143 neuroradiology cases spanning the brain, head and neck, and spine. Each case unfolded over four weeks, starting with the clinical history and followed by incremental imaging findings. Google Gemini was often prompted with the question, "What is the diagnosis?" Its accuracy was assessed at each stage and across specialty categories. The data used were publicly available, and no ethical approval was necessary.

Results: Gemini's diagnostic accuracy improved as case information accumulated, from 3.5% with history alone to 45.7% once complete imaging was supplied. By category, accuracy was highest in spine cases (51.9%), followed by head and neck (45.5%) and brain (44.0%). A chi-square test for trend confirmed that the improvement over the successive stages was statistically significant (p < 0.0000000001).

Conclusion: Google Gemini shows moderate diagnostic accuracy that improves with accumulated information. While encouraging, its shortcomings underline the need for continual validation and transparency. This study highlights the growing relevance of AI in neuroradiology and the need for comprehensive evaluation before clinical integration.
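For readers unfamiliar with a chi-square test for trend, the following is a minimal sketch of one common form, the Cochran-Armitage test, written in plain Python. The abstract does not say which variant the authors used, so the function name `chi_square_trend`, the equally spaced stage scores, and above all the per-stage counts are assumptions made for illustration. Only the first and last counts are chosen to roughly match the reported 3.5% and 45.7% of 143 cases; the intermediate counts are invented. The sketch also treats the stages as independent groups, which simplifies a design in which the same cases are re-evaluated at each stage.

```python
import math


def chi_square_trend(successes, totals, scores=None):
    """Cochran-Armitage chi-square test for linear trend (1 degree of freedom).

    successes[i] / totals[i] is the proportion correct at stage i.
    Returns (chi2 statistic, two-sided p-value).
    """
    if scores is None:
        # Assumption: equally spaced scores 0, 1, 2, ... for ordered stages.
        scores = list(range(len(successes)))

    n_total = sum(totals)
    p_bar = sum(successes) / n_total  # pooled proportion under the null

    # Score-weighted sum of (observed - expected) successes per stage.
    t_stat = sum(s * (r - n * p_bar)
                 for s, r, n in zip(scores, successes, totals))

    sum_ns = sum(n * s for n, s in zip(totals, scores))
    sum_ns2 = sum(n * s * s for n, s in zip(totals, scores))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    if variance == 0:
        raise ValueError("variance is zero; trend test is undefined")

    chi2 = t_stat ** 2 / variance
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL counts of correct diagnoses at four stages of 143 cases each.
# The middle two values are invented; they are not reported in the abstract.
hypothetical_correct = [5, 30, 50, 65]
cases_per_stage = [143, 143, 143, 143]

chi2, p = chi_square_trend(hypothetical_correct, cases_per_stage)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```

With monotonically rising proportions like these, the statistic is large and the p-value very small, which is the kind of result the abstract reports. Flat proportions across stages give a statistic of zero.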

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f844/12255534/8b1aba9e4dc3/cureus-0017-00000085874-i01.jpg
