Can GPT-4o Accurately Diagnose Trauma X-Rays? A Comparative Study with Expert Evaluations.

Author information

Öztürk Ahmet, Günay Serkan, Ateş Serdal, Yiğit Yavuz

Affiliations

Department of Emergency Medicine, Hitit University Çorum Erol Olçok Education and Research Hospital, Çorum, Turkey.

Department of Emergency Medicine, Health Sciences University, Ankara Training and Research Hospital, Ankara, Turkey.

Publication information

J Emerg Med. 2025 Jun;73:71-79. doi: 10.1016/j.jemermed.2024.12.010. Epub 2025 Jan 4.

DOI:10.1016/j.jemermed.2024.12.010

PMID:40348690

Abstract

BACKGROUND

The latest artificial intelligence (AI) model, GPT-4o, introduced by OpenAI, can process visual data, presenting a novel opportunity for radiographic evaluation in trauma patients.

OBJECTIVE

This study aimed to assess the efficacy of GPT-4o in interpreting radiographs for traumatic bone pathologies and to compare its performance with that of emergency medicine and orthopedic specialists.

METHODS

The study involved 10 emergency medicine specialists, 10 orthopedic specialists, and the GPT-4o AI model, evaluating 25 cases of traumatic bone pathologies of the upper and lower extremities selected from the Radiopaedia website. Participants were asked to identify fractures or dislocations in the radiographs within 45 minutes. GPT-4o was instructed to perform the same task in 10 different chat sessions.

RESULTS

Emergency medicine specialists and orthopedic specialists demonstrated an average accuracy of 82.8% and 87.2%, respectively, in radiograph interpretation. In contrast, GPT-4o achieved an accuracy of only 11.2%. Statistical analysis revealed significant differences among the three groups (p < 0.001), with GPT-4o performing significantly worse than both groups of specialists.

CONCLUSION

GPT-4o's ability to interpret radiographs of traumatic bone pathologies is currently limited and significantly inferior to that of trained specialists. These findings underscore the ongoing need for human expertise in trauma diagnosis and highlight the challenges of applying AI to complex medical imaging tasks.
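To put the reported percentages on a common scale, the following minimal sketch converts each group's mean accuracy into an implied number of correct readings. It rests on an assumption that the abstract implies but does not state outright: every specialist, and every GPT-4o chat session, read all 25 cases, giving 10 x 25 = 250 readings per group. The function name `implied_correct` and the constants are illustrative only and do not come from the study.

```python
# Assumption (not stated explicitly in the abstract): each of the 10 raters
# or chat sessions in a group evaluated all 25 cases, i.e. 250 readings per group.
RATERS_PER_GROUP = 10
CASES = 25

# Mean accuracies as reported in the RESULTS section (percent).
reported_accuracy = {
    "Emergency medicine specialists": 82.8,
    "Orthopedic specialists": 87.2,
    "GPT-4o (10 chat sessions)": 11.2,
}


def implied_correct(accuracy_pct, raters=RATERS_PER_GROUP, cases=CASES):
    """Return (correct readings, total readings) implied by a mean accuracy."""
    total = raters * cases
    return round(accuracy_pct / 100 * total), total


if __name__ == "__main__":
    for group, accuracy in reported_accuracy.items():
        correct, total = implied_correct(accuracy)
        per_rater = correct / RATERS_PER_GROUP
        print(f"{group}: about {correct}/{total} correct "
              f"(mean {per_rater:.1f} of {CASES} cases per rater)")
```

Under this assumption, GPT-4o's 11.2% corresponds to roughly 28 correct readings out of 250, compared with roughly 207 and 218 for the emergency medicine and orthopedic groups.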

Similar articles

The accuracy of Gemini, GPT-4, and GPT-4o in ECG analysis: A comparison with cardiologists and emergency medicine specialists.

Am J Emerg Med. 2024 Oct;84:68-73. doi: 10.1016/j.ajem.2024.07.043. Epub 2024 Jul 30.

An Evaluation of the Performance of OpenAI-o1 and GPT-4o in the Japanese National Examination for Physical Therapists.

Cureus. 2025 Jan 6;17(1):e76989. doi: 10.7759/cureus.76989. eCollection 2025 Jan.

Advancing medical AI: GPT-4 and GPT-4o surpass GPT-3.5 in Taiwanese medical licensing exams.

PLoS One. 2025 Jun 4;20(6):e0324841. doi: 10.1371/journal.pone.0324841. eCollection 2025.

Performance Evaluation of GPT-4o and o1-Preview Using the Certification Examination for the Japanese 'Operations Chief of Radiography With X-rays'.

Cureus. 2024 Nov 22;16(11):e74262. doi: 10.7759/cureus.74262. eCollection 2024 Nov.

Patient Triage and Guidance in Emergency Departments Using Large Language Models: Multimetric Study.

J Med Internet Res. 2025 May 15;27:e71613. doi: 10.2196/71613.

AI-Assisted Hypothesis Generation to Address Challenges in Cardiotoxicity Research: Simulation Study Using ChatGPT With GPT-4o.

J Med Internet Res. 2025 May 15;27:e66161. doi: 10.2196/66161.

ChatGPT-4 Omni Performance in USMLE Disciplines and Clinical Skills: Comparative Analysis.

JMIR Med Educ. 2024 Nov 6;10:e63430. doi: 10.2196/63430.

GPT-4o’s competency in answering the simulated written European Board of Interventional Radiology exam compared to a medical student and experts in Germany and its ability to generate exam items on interventional radiology: a descriptive study.

J Educ Eval Health Prof. 2024;21:21. doi: 10.3352/jeehp.2024.21.21. Epub 2024 Aug 20.

Assessing the accuracy and clinical utility of GPT-4O in abnormal blood cell morphology recognition.

Digit Health. 2024 Nov 5;10:20552076241298503. doi: 10.1177/20552076241298503. eCollection 2024 Jan-Dec.

Cited by

Effectiveness of the GPT-4o Model in Interpreting Electrocardiogram Images for Cardiac Diagnostics: Diagnostic Accuracy Study.

JMIR AI. 2025 Aug 22;4:e74426. doi: 10.2196/74426.
