Harmon David M, Carter Rickey E, Cohen-Shelly Michal, Svatikova Anna, Adedinsewo Demilade A, Noseworthy Peter A, Kapa Suraj, Lopez-Jimenez Francisco, Friedman Paul A, Attia Zachi I
Department of Internal Medicine, Mayo Clinic School of Graduate Medical Education, Rochester, MN.
Department of Quantitative Health Sciences, Mayo Clinic College of Medicine, Jacksonville, FL.
Eur Heart J Digit Health. 2022 Jun;3(2):238-244. doi: 10.1093/ehjdh/ztac028. Epub 2022 May 17.
Some artificial intelligence models applied in medical practice require ongoing retraining, introduce unintended racial bias, or have variable performance among different subgroups of patients. We assessed the real-world performance of the artificial intelligence-enhanced electrocardiogram to detect left ventricular systolic dysfunction with respect to multiple patient and electrocardiogram variables to determine the algorithm's long-term efficacy and potential bias in the absence of retraining.
Electrocardiograms acquired in 2019 at Mayo Clinic in Minnesota, Arizona, and Florida with an echocardiogram performed within 14 days were analyzed (n = 44 986 unique patients). The area under the curve (AUC) was calculated to evaluate performance of the algorithm among age groups, racial and ethnic groups, patient encounter location, electrocardiogram features, and over time. The artificial intelligence-enhanced electrocardiogram to detect left ventricular systolic dysfunction had an AUC of 0.903 for the total cohort. Time series analysis of the model validated its temporal stability. Areas under the curve were similar for all racial and ethnic groups (0.90-0.92) with minimal performance difference between sexes. Patients with a 'normal sinus rhythm' electrocardiogram (n = 37 047) exhibited an AUC of 0.91. All other electrocardiogram features had areas under the curve between 0.79 and 0.91, with the lowest performance occurring in the left bundle branch block group (0.79).
The artificial intelligence-enhanced electrocardiogram to detect left ventricular systolic dysfunction is stable over time in the absence of retraining and robust with respect to multiple variables including time, patient race, and electrocardiogram features.
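The subgroup comparison described above amounts to computing one AUC per stratum and comparing it with the whole-cohort value. The following is a minimal sketch of that idea, not the authors' code: the function names (`auc`, `auc_by_group`), the record fields (`rhythm`, `label`, `score`), and the eight toy records are all hypothetical and invented for illustration, so the numbers they produce bear no relation to the study's results. AUC is computed here through its Mann-Whitney interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC via the Mann-Whitney statistic; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_group(records, key):
    """Split records on `key` and compute the AUC inside each stratum."""
    groups = defaultdict(lambda: ([], []))
    for r in records:
        labels, scores = groups[r[key]]
        labels.append(r["label"])
        scores.append(r["score"])
    return {g: auc(labels, scores) for g, (labels, scores) in groups.items()}


# Hypothetical toy data: label 1 = dysfunction present on echocardiogram,
# score = model output. These values are invented, not taken from the study.
records = [
    {"rhythm": "sinus", "label": 1, "score": 0.9},
    {"rhythm": "sinus", "label": 0, "score": 0.2},
    {"rhythm": "sinus", "label": 0, "score": 0.4},
    {"rhythm": "sinus", "label": 1, "score": 0.7},
    {"rhythm": "lbbb", "label": 1, "score": 0.6},
    {"rhythm": "lbbb", "label": 0, "score": 0.7},
    {"rhythm": "lbbb", "label": 1, "score": 0.8},
    {"rhythm": "lbbb", "label": 0, "score": 0.3},
]

overall = auc([r["label"] for r in records], [r["score"] for r in records])
per_group = auc_by_group(records, "rhythm")
print(overall, per_group)
```

The pairwise comparison is quadratic in the stratum size, which is fine for a sketch but would likely be replaced by a rank-based computation for a cohort of tens of thousands of patients. The same `auc_by_group` call could be keyed on any other stratifying variable (age band, site, acquisition month) to mirror the other comparisons listed in the abstract.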