Avidan Yuval, Naoum Ibrahim, Khoury Razi, Zahra Sameha, Dov Nissan Ben, Schliamser Jorge E, Danon Asaf, Aker Amir
Department of Cardiology, Lady Davis Carmel Medical Center, Haifa, Israel.
Heart Lung. 2025 Sep-Oct;73:90-94. doi: 10.1016/j.hrtlng.2025.04.032. Epub 2025 May 8.
Current guidelines require physician confirmation for smartwatch-diagnosed atrial fibrillation (AF), increasing telemedicine workloads. The newest ChatGPT-4o (GPT-4o) incorporates advanced image input capabilities.
To assess GPT-4o's performance in identifying AF from smartwatch recordings.
A total of 120 consecutive patients with AF and 60 controls with sinus rhythm (SR), each confirmed by conventional 12-lead ECG, recorded single-lead ECGs using an Apple Watch (AW) Series 6®. Two blinded cardiologists independently classified the smartwatch recordings as AF, SR, or undetermined. GPT-4o was subsequently prompted to analyze all smartwatch ECGs.
Six AF cases were excluded because their AW-ECG recordings were undetermined, leaving 114 AF patients (mean age: 73.4 ± 10.4 years) and 60 controls. The AW algorithm achieved 97.3% and 100% accuracy for AF and SR, respectively, while GPT-4o correctly classified 47.3% of AF and 71.6% of SR tracings. None of the AF characteristics examined (chronicity, heart rate, QRS width, fibrillatory wave amplitude, or R-wave amplitude and polarity) was predictive of GPT-4o's diagnostic accuracy.
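As a rough illustration of how the per-class figures above combine, the following Python sketch pools them into a single accuracy over all 174 tracings. The pooled values are not reported in the abstract; they are derived here under the assumption that every percentage refers to the same 114 AF and 60 SR recordings, and the function and variable names are hypothetical.

```python
# Figures taken from the abstract.
N_AF = 114  # AF tracings remaining after exclusions
N_SR = 60   # sinus rhythm control tracings

# Per-class rates of correct classification, as reported.
GPT4O_AF, GPT4O_SR = 0.473, 0.716
AW_AF, AW_SR = 0.973, 1.000


def pooled_accuracy(rate_af: float, rate_sr: float,
                    n_af: int = N_AF, n_sr: int = N_SR) -> float:
    """Weight each per-class rate by its class size and pool the result.

    Assumption (not stated in the abstract): both rates were computed
    on the full n_af and n_sr denominators.
    """
    correct = rate_af * n_af + rate_sr * n_sr
    return correct / (n_af + n_sr)


gpt4o_overall = pooled_accuracy(GPT4O_AF, GPT4O_SR)
aw_overall = pooled_accuracy(AW_AF, AW_SR)

print(f"GPT-4o pooled accuracy: {gpt4o_overall:.1%}")
print(f"AW algorithm pooled accuracy: {aw_overall:.1%}")
```

Under these assumptions the pooled figure for GPT-4o is about 56%, against about 98% for the AW algorithm. Because the cohort has roughly twice as many AF as SR tracings, the pooled values depend on that case mix and should be read as illustrative only.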
The current capabilities of GPT-4o are insufficient for reliable diagnosis of AF from smartwatch ECGs. Despite the theoretical appeal of using this technology for this purpose, the findings indicate that human expertise remains indispensable. Consumers should be aware of the technology's current limitations.