Diagnostic Accuracy of ChatGPT for Patients' Triage; a Systematic Review and Meta-Analysis.

Author Information

Kaboudi Navid, Firouzbakht Saeedeh, Shahir Eftekhar Mohammad, Fayazbakhsh Fatemeh, Joharivarnoosfaderani Niloufar, Ghaderi Salar, Dehdashti Mohammadreza, Mohtasham Kia Yasmin, Afshari Maryam, Vasaghi-Gharamaleki Maryam, Haghani Leila, Moradzadeh Zahra, Khalaj Fattaneh, Mohammadi Zahra, Hasanabadi Zahra, Shahidi Ramin

Affiliations

Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran.

Department of Pediatrics, School of Medicine, Bushehr University of Medical Sciences, Bushehr, Iran.

Publication Information

Arch Acad Emerg Med. 2024 Jul 30;12(1):e60. doi: 10.22037/aaem.v12i1.2384. eCollection 2024.

Abstract

INTRODUCTION

Artificial intelligence (AI), particularly ChatGPT developed by OpenAI, has shown the potential to improve diagnostic accuracy and efficiency in emergency department (ED) triage. This study aims to evaluate the diagnostic performance and safety of ChatGPT in prioritizing patients based on urgency in ED settings.

METHODS

A systematic review and meta-analysis were conducted following PRISMA guidelines. Comprehensive literature searches were performed in Scopus, Web of Science, PubMed, and Embase. Studies evaluating ChatGPT's diagnostic performance in ED triage were included. Quality assessment was conducted using the QUADAS-2 tool. Pooled accuracy estimates were calculated using a random-effects model, and heterogeneity was assessed with the I² statistic.
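The pooled accuracy and I² figures reported below come from a standard random-effects meta-analysis. A minimal sketch of that computation, using the DerSimonian-Laird estimator for between-study variance; the per-study accuracies and variances here are illustrative placeholders, not data from the included studies:

```python
# Sketch of random-effects pooling (DerSimonian-Laird) and the I² statistic.
# Inputs: per-study effect estimates (e.g. accuracies) and their variances.
def random_effects_pool(effects, variances):
    k = len(effects)
    w = [1.0 / v for v in variances]               # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, effects))
    # Between-study variance tau², floored at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau² to each within-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    # I²: share of total variability attributable to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, i2

# Hypothetical per-study accuracies and variances for illustration
acc = [0.90, 0.75, 0.85, 0.60, 0.95]
var = [0.002, 0.004, 0.003, 0.005, 0.001]
pooled, i2 = random_effects_pool(acc, var)
print(f"pooled accuracy = {pooled:.2f}, I² = {i2:.0f}%")
```

Under a random-effects model, widely scattered study results inflate tau², which pulls the weights toward equality and widens the confidence interval — consistent with the broad intervals and high I² values reported in the results.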

RESULTS

Fourteen studies with a total of 1,412 patients or scenarios were included. ChatGPT 4.0 demonstrated a pooled accuracy of 0.86 (95% CI: 0.64-0.98) with substantial heterogeneity (I² = 93%). ChatGPT 3.5 showed a pooled accuracy of 0.63 (95% CI: 0.43-0.81) with significant heterogeneity (I² = 84%). Funnel plots indicated potential publication bias, particularly for ChatGPT 3.5. Quality assessments revealed varying levels of risk of bias and applicability concerns.

CONCLUSION

ChatGPT, especially version 4.0, shows promise in improving ED triage accuracy. However, significant variability and potential biases highlight the need for further evaluation and enhancement.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7060/11407534/5b02504c5fe5/aaem-12-e60-g001.jpg
