Alam Sreyoshi F, Thongprayoon Charat, Miao Jing, Pham Justin H, Sheikh Mohammad S, Garcia Valencia Oscar A, Schwartz Gary L, Craici Iasmina M, Gonzalez Suarez Maria L, Cheungpasitporn Wisit
Division of Nephrology and Hypertension, Mayo Clinic Minnesota, Rochester, MN, USA.
*Maria L. Gonzalez Suarez and Wisit Cheungpasitporn are senior co-authors.
Digit Health. 2025 Mar 14;11:20552076251326014. doi: 10.1177/20552076251326014. eCollection 2025 Jan-Dec.
The use of artificial intelligence (AI) for interpreting ambulatory blood pressure monitoring (ABPM) data is gaining traction in clinical practice. Evaluating the accuracy of AI models such as ChatGPT 4.0 in clinical settings can inform their integration into healthcare processes. However, few studies have validated the performance of such models against expert interpretations in real-world clinical scenarios. A total of 53 ABPM records from Mayo Clinic, Minnesota, were analyzed. ChatGPT 4.0's interpretations were compared with consensus results from two experienced nephrologists, based on the American College of Cardiology/American Heart Association (ACC/AHA) guidelines. The study assessed ChatGPT's accuracy and reliability over two rounds of testing, with a three-month interval between rounds. ChatGPT achieved an accuracy of 87% for identifying hypertension, 89% for nocturnal hypertension, 81% for nocturnal dipping, and 94% for abnormal heart rate. ChatGPT correctly identified all conditions in 60% of ABPM records. The percentage agreement between the first and second rounds of ChatGPT's analysis was 81% for hypertension, 85% for nocturnal hypertension, 89% for nocturnal dipping, and 94% for abnormal heart rate. There was no significant difference in accuracy between the first and second rounds (all p > 0.05). The kappa statistic was 0.63 for hypertension, 0.66 for nocturnal hypertension, 0.76 for nocturnal dipping, and 0.70 for abnormal heart rate. ChatGPT 4.0 demonstrates potential as a reliable tool for interpreting 24-h ABPM data, achieving substantial agreement with expert nephrologists. These findings underscore the potential for AI integration into hypertension management workflows, while highlighting the need for further validation in larger, diverse cohorts.
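The abstract reports both percentage agreement and a kappa statistic. As an illustration of how the two measures differ, the following minimal sketch computes raw agreement and Cohen's kappa for two sets of binary labels. The function names and the ten example labels are hypothetical and are not taken from the study, and the abstract does not state which kappa variant or software the authors used.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of records on which the two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both sequences assign the same label
    # if each labelled independently at its own marginal rates.
    p_expected = sum(
        counts_a[label] * counts_b[label] for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        # Both sequences use a single identical label; kappa is undefined,
        # so this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels (1 = condition present, 0 = absent); NOT study data.
round_one = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
round_two = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

agreement = percent_agreement(round_one, round_two)
kappa = cohens_kappa(round_one, round_two)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

In this toy example the two sequences agree on 8 of 10 records, but half of that agreement would be expected by chance, so kappa is lower than the raw agreement. The same effect explains why the kappa values in the abstract are lower than the corresponding percentage agreements.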