
ChatGPT With GPT-4 Outperforms Emergency Department Physicians in Diagnostic Accuracy: Retrospective Analysis.

Affiliations

Department of Medicine IV, LMU University Hospital, Munich, Germany.

Department of Medicine I, LMU University Hospital, Munich, Germany.

Publication

J Med Internet Res. 2024 Jul 8;26:e56110. doi: 10.2196/56110.


DOI: 10.2196/56110
PMID: 38976865
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11263899/
Abstract

BACKGROUND: OpenAI's ChatGPT is a pioneering artificial intelligence (AI) in the field of natural language processing, and it holds significant potential in medicine for providing treatment advice. Additionally, recent studies have demonstrated promising results using ChatGPT for emergency medicine triage. However, its diagnostic accuracy in the emergency department (ED) has not yet been evaluated.

OBJECTIVE: This study compares the diagnostic accuracy of ChatGPT with GPT-3.5 and GPT-4 and primary treating resident physicians in an ED setting.

METHODS: Among 100 adults admitted to our ED in January 2023 with internal medicine issues, the diagnostic accuracy was assessed by comparing the diagnoses made by ED resident physicians and those made by ChatGPT with GPT-3.5 or GPT-4 against the final hospital discharge diagnosis, using a point system for grading accuracy.

RESULTS: The study enrolled 100 patients with a median age of 72 (IQR 58.5-82.0) years who were admitted to our internal medicine ED primarily for cardiovascular, endocrine, gastrointestinal, or infectious diseases. GPT-4 outperformed both GPT-3.5 (P<.001) and ED resident physicians (P=.01) in diagnostic accuracy for internal medicine emergencies. Furthermore, across various disease subgroups, GPT-4 consistently outperformed GPT-3.5 and resident physicians. It demonstrated significant superiority in cardiovascular (GPT-4 vs ED physicians: P=.03) and endocrine or gastrointestinal diseases (GPT-4 vs GPT-3.5: P=.01). However, in other categories, the differences were not statistically significant.

CONCLUSIONS: In this study, which compared the diagnostic accuracy of GPT-3.5, GPT-4, and ED resident physicians against a discharge diagnosis gold standard, GPT-4 outperformed both the resident physicians and its predecessor, GPT-3.5. Despite the retrospective design of the study and its limited sample size, the results underscore the potential of AI as a supportive diagnostic tool in ED settings.

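The METHODS describe grading each diagnosis against the discharge diagnosis with a point system, then comparing raters. A minimal sketch of that kind of comparison, assuming a hypothetical 0-2 point scale and invented per-case scores (the paper's actual rubric, data, and significance tests are not reproduced here):

```python
# Hypothetical point-based grading sketch. Assumed scale (not the paper's
# published rubric): 2 = matches discharge diagnosis, 1 = partially
# correct, 0 = incorrect.

def mean_accuracy(scores):
    """Mean points per case for one rater."""
    return sum(scores) / len(scores)

# Invented scores for 10 toy cases; not data from the study.
gpt4 = [2, 2, 1, 2, 2, 1, 2, 2, 2, 1]
gpt35 = [1, 2, 0, 1, 2, 1, 1, 2, 1, 0]
ed_physicians = [2, 1, 1, 2, 1, 1, 2, 1, 2, 1]

for name, scores in [("GPT-4", gpt4), ("GPT-3.5", gpt35),
                     ("ED physicians", ed_physicians)]:
    # Prints each rater's mean points per case.
    print(f"{name}: {mean_accuracy(scores):.2f} points/case")
```

The study itself reports pairwise significance tests (e.g., GPT-4 vs GPT-3.5, P<.001) on top of such per-rater scores; a paired nonparametric test would be the natural next step on data of this shape.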

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47a2/11263899/ce81ad18b74e/jmir_v26i1e56110_fig1.jpg

Similar Articles

[1]
ChatGPT With GPT-4 Outperforms Emergency Department Physicians in Diagnostic Accuracy: Retrospective Analysis.

J Med Internet Res. 2024-7-8

[2]
Assessing the precision of artificial intelligence in ED triage decisions: Insights from a study with ChatGPT.

Am J Emerg Med. 2024-4

[3]
Triage Performance Across Large Language Models, ChatGPT, and Untrained Doctors in Emergency Medicine: Comparative Study.

J Med Internet Res. 2024-6-14

[4]
Emergency department triaging using ChatGPT based on emergency severity index principles: a cross-sectional study.

Sci Rep. 2024-9-27

[5]
Patient-Representing Population's Perceptions of GPT-Generated Versus Standard Emergency Department Discharge Instructions: Randomized Blind Survey Assessment.

J Med Internet Res. 2024-8-2

[6]
Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study.

JMIR Mhealth Uhealth. 2023-10-3

[7]
Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4.

J Med Internet Res. 2024-6-27

[8]
The diagnostic and triage accuracy of the GPT-3 artificial intelligence model: an observational study.

Lancet Digit Health. 2024-8

[9]
Comparative analysis of ChatGPT, Gemini and emergency medicine specialist in ESI triage assessment.

Am J Emerg Med. 2024-7

[10]
Accuracy of a Commercial Large Language Model (ChatGPT) to Perform Disaster Triage of Simulated Patients Using the Simple Triage and Rapid Treatment (START) Protocol: Gage Repeatability and Reproducibility Study.

J Med Internet Res. 2024-9-30

Cited By

[1]
Artificial Intelligence Chatbots in Pediatric Emergencies: A Reliable Lifeline or a Risk?

Cureus. 2025-8-1

[2]
A bibliometric analysis of large language model-based AI chatbots in surgery.

Ann Med Surg (Lond). 2025-5-12

[3]
The performance of ChatGPT on medical image-based assessments and implications for medical education.

BMC Med Educ. 2025-8-23

[4]
Co-production of Diagnostic Excellence - Patients, Clinicians, and Artificial Intelligence Comment on "Achieving Diagnostic Excellence: Roadmaps to Develop and Use Patient-Reported Measures With an Equity Lens".

Int J Health Policy Manag. 2025

[5]
Can AI match emergency physicians in managing common emergency cases? A comparative performance evaluation.

BMC Emerg Med. 2025-7-31

[6]
Use of a Medical Communication Framework to Assess the Quality of Generative Artificial Intelligence Replies to Primary Care Patient Portal Messages: Content Analysis.

JMIR Form Res. 2025-7-31

[7]
Artificial intelligence in coronary angiography: benchmarking the diagnostic accuracy of ChatGPT-4o against interventional cardiologists.

Open Heart. 2025-7-20

[8]
Large Language Models in Medical Diagnostics: Scoping Review With Bibliometric Analysis.

J Med Internet Res. 2025-6-9

[9]
ChatGPT-o1 Preview Outperforms ChatGPT-4 as a Diagnostic Support Tool for Ankle Pain Triage in Emergency Settings.

Arch Acad Emerg Med. 2025-4-5

[10]
A Practical Guide to the Utilization of ChatGPT in the Emergency Department: A Systematic Review of Current Applications, Future Directions, and Limitations.

Cureus. 2025-4-6

References

[1]
ChatGPT-Generated Differential Diagnosis Lists for Complex Case-Derived Clinical Vignettes: Diagnostic Accuracy Evaluation.

JMIR Med Inform. 2023-10-9

[2]
Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study.

JMIR Mhealth Uhealth. 2023-10-3

[3]
Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study.

J Med Internet Res. 2023-8-22

[4]
Machine learning for ECG diagnosis and risk stratification of occlusion myocardial infarction.

Nat Med. 2023-7

[5]
ChatGPT: A Valuable Tool for Emergency Medical Assistance.

Ann Emerg Med. 2023-9

[6]
The ChatGPT Era: Artificial Intelligence in Emergency Medicine.

Ann Emerg Med. 2023-6

[7]
Diagnostic Accuracy of Differential-Diagnosis Lists Generated by Generative Pretrained Transformer 3 Chatbot for Clinical Vignettes with Common Chief Complaints: A Pilot Study.

Int J Environ Res Public Health. 2023-2-15

[8]
Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model.

JAMA. 2023-3-14

[9]
AI bot ChatGPT writes smart essays - should professors worry?

Nature. 2022-12-9

[10]
Review of the Basics of Cognitive Error in Emergency Medicine: Still No Easy Answers.

West J Emerg Med. 2020-11-2
