
Can large language models be new supportive tools in coronary computed tomography angiography reporting?

Affiliations

Department of Radiology, Ministry of Health Ankara 29 Mayis State Hospital, Ankara, Türkiye.

Department of Radiology, Ankara Mamak State Hospital, Ankara, Türkiye.

Publication information

Clin Imaging. 2024 Oct;114:110271. doi: 10.1016/j.clinimag.2024.110271. Epub 2024 Aug 31.

DOI: 10.1016/j.clinimag.2024.110271
PMID: 39236553
Abstract

The advent of large language models (LLMs) marks a transformative leap in natural language processing, offering unprecedented potential in radiology, particularly in enhancing the accuracy and efficiency of coronary artery disease (CAD) diagnosis. While previous studies have explored the capabilities of specific LLMs like ChatGPT in cardiac imaging, a comprehensive evaluation comparing multiple LLMs in the context of CAD-RADS 2.0 has been lacking. This study addresses this gap by assessing the performance of various LLMs, including ChatGPT 4, ChatGPT 4o, Claude 3 Opus, Gemini 1.5 Pro, Mistral Large, Meta Llama 3 70B, and Perplexity Pro, in answering 30 multiple-choice questions derived from the CAD-RADS 2.0 guidelines. Our findings reveal that ChatGPT 4o achieved the highest accuracy at 100 %, with ChatGPT 4 and Claude 3 Opus closely following at 96.6 %. Other models, including Mistral Large, Perplexity Pro, Meta Llama 3 70B, and Gemini 1.5 Pro, also demonstrated commendable performance, though with slightly lower accuracy ranging from 90 % to 93.3 %. This study underscores the proficiency of current LLMs in understanding and applying CAD-RADS 2.0, suggesting their potential to significantly enhance radiological reporting and patient care in coronary artery disease. The variations in model performance highlight the need for further research, particularly in evaluating the visual diagnostic capabilities of LLMs, a critical component of radiology practice. This study provides a foundational comparison of LLMs in CAD-RADS 2.0 and sets the stage for future investigations into their broader applications in radiology, emphasizing the importance of integrating both text-based and visual knowledge for optimal clinical outcomes.
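The reported percentages are consistent with whole-question scores on a 30-item test. The following is a minimal sketch of that arithmetic, not the authors' scoring procedure: the function name `accuracy_percent` is hypothetical, and the raw correct-answer counts (30, 29, 28, 27) are inferred from the stated percentages rather than given in the abstract.

```python
# Sketch only: maps a number of correct answers on a 30-question test
# to an accuracy percentage. The counts below are inferred from the
# percentages in the abstract; they are not reported there directly.

TOTAL_QUESTIONS = 30  # stated in the abstract


def accuracy_percent(correct: int, total: int = TOTAL_QUESTIONS) -> float:
    """Return accuracy as a percentage of questions answered correctly."""
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return 100.0 * correct / total


if __name__ == "__main__":
    for correct in (30, 29, 28, 27):
        pct = accuracy_percent(correct)
        print(f"{correct}/{TOTAL_QUESTIONS} correct -> {pct:.2f} %")
```

Under this reading, 29 of 30 correct gives about 96.67 %, which the abstract appears to report truncated as 96.6 %, and the 90 % to 93.3 % range would correspond to 27 or 28 correct answers. With only 30 questions, each one shifts accuracy by roughly 3.3 percentage points, so the gaps between models amount to one to three questions.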

Similar articles

1
Can large language models be new supportive tools in coronary computed tomography angiography reporting?
Clin Imaging. 2024 Oct;114:110271. doi: 10.1016/j.clinimag.2024.110271. Epub 2024 Aug 31.
2
ChatGPT vs Gemini: Comparative Accuracy and Efficiency in CAD-RADS Score Assignment from Radiology Reports.
J Imaging Inform Med. 2024 Nov 11. doi: 10.1007/s10278-024-01328-y.
3
Large Language Models for CAD-RADS 2.0 Extraction From Semi-Structured Coronary CT Angiography Reports: A Multi-Institutional Study.
Korean J Radiol. 2025 Sep;26(9):817-831. doi: 10.3348/kjr.2025.0293.
4
Accuracy of large language models in generating differential diagnosis from clinical presentation and imaging findings in pediatric cases.
Pediatr Radiol. 2025 Jul 12. doi: 10.1007/s00247-025-06317-z.
5
Data extraction from free-text stroke CT reports using GPT-4o and Llama-3.3-70B: the impact of annotation guidelines.
Eur Radiol Exp. 2025 Jun 19;9(1):61. doi: 10.1186/s41747-025-00600-2.
6
Comparative Analysis of LLMs' Performance On a Practice Radiography Certification Exam.
Radiol Technol. 2025 May-Jun;96(5):334-342.
7
Evaluating text and visual diagnostic capabilities of large language models on questions related to the Breast Imaging Reporting and Data System Atlas 5 edition.
Diagn Interv Radiol. 2025 Mar 3;31(2):111-129. doi: 10.4274/dir.2024.242876. Epub 2024 Sep 9.
8
Performance of ChatGPT-4o and Four Open-Source Large Language Models in Generating Diagnoses Based on China's Rare Disease Catalog: Comparative Study.
J Med Internet Res. 2025 Jun 18;27:e69929. doi: 10.2196/69929.
9
Stench of Errors or the Shine of Potential: The Challenge of (Ir)Responsible Use of ChatGPT in Speech-Language Pathology.
Int J Lang Commun Disord. 2025 Jul-Aug;60(4):e70088. doi: 10.1111/1460-6984.70088.
10
Evaluating Bard Gemini Pro and GPT-4 Vision Against Student Performance in Medical Visual Question Answering: Comparative Case Study.
JMIR Form Res. 2024 Dec 17;8:e57592. doi: 10.2196/57592.

Cited by

1
Large Language Models for CAD-RADS 2.0 Extraction From Semi-Structured Coronary CT Angiography Reports: A Multi-Institutional Study.
Korean J Radiol. 2025 Sep;26(9):817-831. doi: 10.3348/kjr.2025.0293.
2
Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis.
J Med Internet Res. 2025 Apr 30;27:e64486. doi: 10.2196/64486.