Department of Cardiovascular Medicine (P.A.N., Z.I.A., L.C.B., S.N.H., X.Y., S.K., P.A.F., F.L.-J.), Mayo Clinic, Rochester, MN.
Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery (P.A.N., X.Y.), Mayo Clinic, Rochester, MN.
Circ Arrhythm Electrophysiol. 2020 Mar;13(3):e007988. doi: 10.1161/CIRCEP.119.007988. Epub 2020 Feb 16.
Deep learning algorithms derived in homogeneous populations may be poorly generalizable and have the potential to reflect, perpetuate, and even exacerbate racial/ethnic disparities in health and health care. In this study, we aimed to (1) assess whether the performance of a deep learning algorithm designed to detect low left ventricular ejection fraction using the 12-lead ECG varies by race/ethnicity and to (2) determine whether its performance is determined by the derivation population or by racial variation in the ECG.
We performed a retrospective cohort analysis that included 97 829 patients with paired ECGs and echocardiograms. We tested the performance, by race/ethnicity, of a convolutional neural network designed to identify patients with a left ventricular ejection fraction ≤35% from the 12-lead ECG.
The convolutional neural network that was previously derived in a homogeneous population (derivation cohort, n=44 959; 96.2% non-Hispanic white) demonstrated consistent performance to detect low left ventricular ejection fraction across a range of racial/ethnic subgroups in a separate testing cohort (n=52 870): non-Hispanic white (n=44 524; area under the curve [AUC], 0.931), Asian (n=557; AUC, 0.961), black/African American (n=651; AUC, 0.937), Hispanic/Latino (n=331; AUC, 0.937), and American Indian/Native Alaskan (n=223; AUC, 0.938). In secondary analyses, a separate neural network was able to discern racial subgroup category (black/African American [AUC, 0.84], and white, non-Hispanic [AUC, 0.76] in a 5-class classifier), and a network trained only in non-Hispanic whites from the original derivation cohort performed similarly well across a range of racial/ethnic subgroups in the testing cohort with an AUC of at least 0.930 in all racial/ethnic subgroups.
Our study demonstrates that while ECG characteristics vary by race, this did not impact the ability of a convolutional neural network to predict low left ventricular ejection fraction from the ECG. We recommend reporting of performance among diverse ethnic, racial, age, and sex groups for all new artificial intelligence tools to ensure responsible use of artificial intelligence in medicine.
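The subgroup reporting recommended above can be illustrated with a minimal sketch that computes the AUC separately for each racial/ethnic subgroup. This is not the study's code: the function names `auc` and `auc_by_group`, the group labels, and the toy scores are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any method the authors describe.

```python
from collections import defaultdict

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Returns None when a group lacks either class, since AUC is undefined.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(groups, labels, scores):
    """Split predictions by subgroup and compute the AUC within each one."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)   # true label: 1 = low ejection fraction
        buckets[g][1].append(s)   # model output score
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}

# Hypothetical toy data, for illustration only (not from the study).
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1, 0.3, 0.4]

result = auc_by_group(groups, labels, scores)
print(result)
```

The pairwise formulation is quadratic in group size, which is acceptable for a sketch; for cohorts of the size reported here, a rank-based implementation from an established statistics library would be the more practical choice, ideally with confidence intervals for the smaller subgroups.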