Geneş Muhammet, Çelik Murat
Cardiology Residency, Department of Cardiology, Sincan Training and Research Hospital, Sincan, Ankara 06930, Turkey.
Department of Cardiology, Gulhane Training and Research Hospital, Health Science University, Ankara 06000, Turkey.
Life (Basel). 2024 Sep 27;14(10):1235. doi: 10.3390/life14101235.
Despite ongoing advancements in healthcare, acute coronary syndromes (ACS) remain a leading cause of morbidity and mortality. The 2023 European Society of Cardiology (ESC) guidelines have introduced significant improvements in ACS management. Concurrently, artificial intelligence (AI), particularly models like ChatGPT, is showing promise in supporting clinical decision-making and education. This study evaluates the performance of ChatGPT-v4 in adhering to ESC guidelines for ACS management over a 30-day interval. Based on ESC guidelines, a dataset of 100 questions was used to assess ChatGPT's accuracy and consistency. The questions were divided into binary (true/false) and multiple-choice formats. The AI's responses were initially evaluated and then re-evaluated after 30 days, using accuracy and consistency as primary metrics. ChatGPT's accuracy in answering ACS-related binary and multiple-choice questions was evaluated at baseline and after 30 days. For binary questions, accuracy was 84% initially and 86% after 30 days, with no significant change (P = 0.564). Cohen's Kappa was 0.94, indicating excellent agreement. Multiple-choice question accuracy was 80% initially, improving to 84% after 30 days, also without significant change (P = 0.527). Cohen's Kappa was 0.93, reflecting similarly high consistency. These results suggest stable AI performance with minor fluctuations. Despite variations in performance on binary and multiple-choice questions, ChatGPT shows significant promise as a clinical support tool in ACS management. However, it is crucial to consider limitations such as fluctuations and hallucinations, which could lead to severe issues in clinical applications.
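The two primary metrics named in the abstract, accuracy and consistency, can be illustrated with a minimal Python sketch. It assumes that Cohen's Kappa was computed between the baseline and 30-day responses, which the abstract does not state explicitly. The function names and the ten-item answer key are hypothetical toy values for illustration, not the study's questions, data, or code, and the significance test behind the reported P values is not reproduced because the abstract does not name it.

```python
from collections import Counter


def accuracy(responses, answer_key):
    """Fraction of responses that match the answer key."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    return sum(r == k for r, k in zip(responses, answer_key)) / len(answer_key)


def cohen_kappa(round_a, round_b):
    """Cohen's kappa between two rounds of categorical responses."""
    if len(round_a) != len(round_b) or not round_a:
        raise ValueError("rounds must be non-empty and of equal length")
    n = len(round_a)
    # Observed agreement: share of items answered identically in both rounds.
    observed = sum(a == b for a, b in zip(round_a, round_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a, count_b = Counter(round_a), Counter(round_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both rounds used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data (NOT from the study): ten true/false items.
answer_key = [True, False, True, True, False, False, True, False, True, False]
baseline = [True, False, True, False, False, False, True, False, True, True]
day_30 = [True, False, True, True, False, False, True, False, True, True]

print("baseline accuracy:", accuracy(baseline, answer_key))
print("30-day accuracy:", accuracy(day_30, answer_key))
print("kappa between rounds:", cohen_kappa(baseline, day_30))
```

Kappa corrects raw agreement for the agreement expected by chance, so two rounds can agree closely with each other while both remain wrong on the same items. That is why accuracy against the guideline-based key and consistency across rounds are reported as separate metrics.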