

Evaluation of ChatGPT 4.0 in Thoracic Imaging and Diagnostics

Authors

Lotfian Golnaz, Parekh Keyur, Abdul Sami Mohammed, Suthar Pokhraj P

Affiliation

Department of Diagnostic Radiology and Nuclear Medicine, Rush University Medical Center, Chicago, USA.

Publication

Cureus. 2024 Nov 15;16(11):e73741. doi: 10.7759/cureus.73741. eCollection 2024 Nov.

DOI: 10.7759/cureus.73741
PMID: 39677135
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11646414/
Abstract

Recent advancements in natural language processing (NLP) have profoundly transformed the medical industry, enhancing large cohort data analysis, improving diagnostic capabilities, and streamlining clinical workflows. Among the leading tools in this domain is ChatGPT 4.0 (OpenAI, San Francisco, California, US), a commercial NLP model widely used across various applications. This study evaluates the diagnostic performance of ChatGPT 4.0 specifically in thoracic imaging by assessing its ability to answer diagnostic questions related to this field. We utilized the model to respond to multiple-choice questions derived from thoracic imaging scenarios, followed by rigorous statistical analysis to assess its accuracy and variability across different subgroups. Our analysis revealed significant variability across different subgroups. Overall, the model achieved an impressive accuracy of 84.9% in diagnosing thoracic radiology questions. It excelled in terminology and diagnostic signs, achieving perfect scores, and demonstrated strong performance in the intensive care and normal anatomy categories, with accuracies of 90% and 80%, respectively. In pathology subgroups, ChatGPT achieved an average accuracy of 89.1%, particularly excelling in diagnosing infectious pneumonia and atelectasis, though it scored lower in diffuse alveolar disease (66.7%). For disease-related questions, the mean accuracy was 79.1%, with perfect scores in several specific subcategories. However, accuracy was notably lower for vascular disease (50%) and lung cancer (66.7%). In conclusion, while ChatGPT 4.0 shows strong potential in diagnosing thoracic conditions, the variability identified underscores the necessity for ongoing research and refinement of its transformer architecture. This will enhance its reliability and applicability in broader clinical and patient care settings.
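The study's core measurement — grading multiple-choice answers and reporting accuracy per subgroup (e.g. terminology, intensive care, vascular disease) — can be sketched as a simple tally. This is an illustrative reconstruction, not the authors' code, and the records below are hypothetical examples rather than the paper's data.

```python
from collections import defaultdict

# Hypothetical graded responses: (subgroup, answered correctly?).
# These rows are invented for illustration; they are not the study's data.
responses = [
    ("terminology", True), ("terminology", True),
    ("intensive care", True), ("intensive care", False),
    ("vascular disease", True), ("vascular disease", False),
]

def subgroup_accuracy(records):
    """Return each subgroup's accuracy as the fraction of correct answers."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {g: correct[g] / totals[g] for g in totals}

print(subgroup_accuracy(responses))
# → {'terminology': 1.0, 'intensive care': 0.5, 'vascular disease': 0.5}
```

Aggregating per subgroup rather than overall is what exposes the variability the abstract highlights: an 84.9% overall accuracy can coexist with 50% accuracy in a small subgroup such as vascular disease.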
