Geneş Muhammet, Deveci Bülent
Cardiology Residency, Department of Cardiology, Sincan Training and Research Hospital, Ankara 06930, Turkey.
Diagnostics (Basel). 2024 Dec 4;14(23):2731. doi: 10.3390/diagnostics14232731.
Artificial intelligence (AI) tools, such as ChatGPT, are gaining attention for their potential in supporting clinical decisions. This study evaluates the performance of ChatGPT-4o in acute cardiological cases compared to cardiologists and emergency physicians. Twenty acute cardiological scenarios were used to compare the responses of ChatGPT-4o, cardiologists, and emergency physicians in terms of accuracy, completeness, and response time. Statistical analyses included the Kruskal-Wallis H test and post hoc comparisons using the Mann-Whitney U test with Bonferroni correction. ChatGPT-4o and cardiologists both achieved 100% correct response rates, while emergency physicians showed lower accuracy. ChatGPT-4o provided the fastest responses and obtained the highest accuracy and completeness scores. Statistically significant differences were found between ChatGPT-4o and emergency physicians (p < 0.001), and between cardiologists and emergency physicians (p < 0.001). A Cohen's kappa value of 0.92 indicated a high level of inter-rater agreement. ChatGPT-4o outperformed human clinicians in accuracy, completeness, and response time, highlighting its potential as a clinical decision support tool. However, human oversight remains essential to ensure safe AI integration in healthcare settings.
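The statistical workflow described above (an omnibus Kruskal-Wallis H test followed by pairwise Mann-Whitney U tests with Bonferroni correction) can be sketched as follows. This is a minimal illustration using SciPy; the group names mirror the study's three responder groups, but the score values are hypothetical placeholders, not the study's data.

```python
# Sketch of the abstract's analysis: Kruskal-Wallis omnibus test,
# then post hoc pairwise Mann-Whitney U tests with Bonferroni correction.
# Scores below are hypothetical illustrative ratings, not the study's data.
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

groups = {
    "chatgpt4o":  [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
    "cardiology": [5, 5, 4, 5, 5, 5, 4, 5, 5, 5],
    "emergency":  [3, 4, 3, 2, 4, 3, 3, 4, 2, 3],
}

# Omnibus test across all three groups
h_stat, p_omnibus = kruskal(*groups.values())
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_omnibus:.4f}")

# Post hoc pairwise comparisons; Bonferroni divides alpha by the
# number of comparisons (3 pairs -> corrected alpha of ~0.0167).
pairs = list(combinations(groups, 2))
alpha_corrected = 0.05 / len(pairs)
for a, b in pairs:
    u_stat, p = mannwhitneyu(groups[a], groups[b])
    print(f"{a} vs {b}: U = {u_stat:.1f}, p = {p:.4f}, "
          f"significant at corrected alpha = {p < alpha_corrected}")
```

With three groups there are three pairwise comparisons, so the Bonferroni-corrected threshold is 0.05 / 3; a pairwise p-value counts as significant only below that corrected level.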