

Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department.

Authors

Shah-Mohammadi Fatemeh, Finkelstein Joseph

Affiliation

Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA.

Publication Information

Diagnostics (Basel). 2024 Aug 15;14(16):1779. doi: 10.3390/diagnostics14161779.

Abstract

In emergency department (ED) settings, rapid and precise diagnostic evaluations are critical to ensure better patient outcomes and efficient healthcare delivery. This study assesses the accuracy of differential diagnosis lists generated by the third-generation ChatGPT (ChatGPT-3.5) and the fourth-generation ChatGPT (ChatGPT-4) based on electronic health record notes recorded within the first 24 h of ED admission. These models process unstructured text to formulate a ranked list of potential diagnoses. The accuracy of these models was benchmarked against actual discharge diagnoses to evaluate their utility as diagnostic aids. Results indicated that both GPT-3.5 and GPT-4 predicted diagnoses at the body system level with reasonable accuracy, with GPT-4 slightly outperforming its predecessor. However, their performance at the more granular category level was inconsistent, often showing decreased precision. Notably, GPT-4 demonstrated improved accuracy in several critical categories, underscoring its advanced capabilities in managing complex clinical scenarios.
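The abstract describes the evaluation only at a high level: an ED note is passed to the model, the model returns a ranked differential, and the list is compared with the discharge diagnosis. The sketch below is a minimal illustration of that kind of pipeline, not the authors' published code; the prompt wording, the 10-item cutoff, and the string-containment matching rule are assumptions made for this example (the paper instead compares diagnoses at body-system and category levels).

```python
# Hypothetical sketch of the evaluation loop outlined in the abstract:
# prompt an OpenAI chat model with an ED note, parse its ranked
# differential, and score top-k agreement with the discharge diagnosis.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ranked_differential(ed_note: str, model: str = "gpt-4") -> list[str]:
    """Ask the model for a ranked differential diagnosis list."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are assisting an emergency physician. "
                        "Return a numbered list of the 10 most likely "
                        "diagnoses, most likely first."},
            {"role": "user", "content": ed_note},
        ],
    )
    text = response.choices[0].message.content
    # Keep only numbered lines such as "1. Acute appendicitis".
    diagnoses = []
    for line in text.splitlines():
        line = line.strip()
        if line and line[0].isdigit() and "." in line:
            diagnoses.append(line.split(".", 1)[1].strip())
    return diagnoses


def top_k_hit(candidates: list[str], discharge_dx: str, k: int = 5) -> bool:
    """Crude string-containment match against the discharge diagnosis.
    The paper's actual scoring maps diagnoses onto body-system and
    category levels; this simpler rule is an assumption."""
    target = discharge_dx.lower()
    return any(target in c.lower() or c.lower() in target
               for c in candidates[:k])
```

Aggregating `top_k_hit` over a set of ED notes would yield the kind of accuracy figures the study reports when comparing GPT-3.5 and GPT-4.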


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/feb1/11354035/7ed4e072f2b2/diagnostics-14-01779-g001.jpg
