Ambale-Venkatesh Bharath, Yang Xiaoying, Wu Colin O, Liu Kiang, Hundley W Gregory, McClelland Robyn, Gomes Antoinette S, Folsom Aaron R, Shea Steven, Guallar Eliseo, Bluemke David A, Lima João A C
From the Department of Radiology (B.A.-V.), Bloomberg School of Public Health (E.G.), and Department of Medicine, Cardiology and Radiology (J.A.C.L.), Johns Hopkins University, Baltimore, MD; George Washington University, DC (X.Y.); Office of Biostatistics, NHLBI, NIH, Bethesda, MD (C.O.W.); Department of Preventive Medicine, Northwestern University Medical School, Chicago, IL (K.L.); Department of Cardiology, Wake Forest University Health Sciences, Winston-Salem, NC (W.G.H.); Department of Biostatistics, University of Washington, Seattle (R.M.); Department of Radiology, UCLA School of Medicine, Los Angeles, CA (A.S.G.); Division of Epidemiology and Community Health, University of Minnesota, Minneapolis (A.R.F.); Departments of Medicine and Epidemiology, Columbia University, New York, NY (S.S.); and Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD (D.A.B.).
Circ Res. 2017 Oct 13;121(9):1092-1101. doi: 10.1161/CIRCRESAHA.117.311312. Epub 2017 Aug 9.
RATIONALE: Machine learning may be useful to characterize cardiovascular risk, predict outcomes, and identify biomarkers in population studies.
OBJECTIVE: To test the ability of random survival forests, a machine learning technique, to predict 6 cardiovascular outcomes in comparison to standard cardiovascular risk scores.
METHODS AND RESULTS: We included participants from the MESA (Multi-Ethnic Study of Atherosclerosis). Baseline measurements were used to predict cardiovascular outcomes over 12 years of follow-up. MESA was designed to study progression of subclinical disease to cardiovascular events in participants who were initially free of cardiovascular disease. All 6814 participants from MESA, aged 45 to 84 years, from 4 ethnicities and 6 centers across the United States, were included. Seven hundred thirty-five variables from imaging and noninvasive tests, questionnaires, and biomarker panels were obtained. We used the random survival forests technique to identify the top-20 predictors of each outcome. Imaging, electrocardiography, and serum biomarkers featured more heavily on the top-20 lists than traditional cardiovascular risk factors. Age was the most important predictor for all-cause mortality. Fasting glucose levels and carotid ultrasonography measures were important predictors of stroke. Coronary artery calcium score was the most important predictor of coronary heart disease and of the combined outcome of all atherosclerotic cardiovascular disease. Left ventricular structure and function and cardiac troponin-T were among the top predictors for incident heart failure. Creatinine, age, and ankle-brachial index were among the top predictors of atrial fibrillation. TNF-α (tumor necrosis factor-α) and IL (interleukin)-2 soluble receptors and NT-proBNP (N-Terminal Pro-B-Type Natriuretic Peptide) levels were important across all outcomes. The random survival forests technique performed better than established risk scores, with increased prediction accuracy (Brier score decreased by 10%-25%).
CONCLUSIONS: Machine learning in conjunction with deep phenotyping improves the accuracy of cardiovascular event prediction in an initially asymptomatic population. These methods may lead to greater insight into subclinical disease markers without a priori assumptions of causality.
CLINICAL TRIAL REGISTRATION: URL: http://www.clinicaltrials.gov. Unique identifier: NCT00005487.
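The abstract reports accuracy as a relative decrease in Brier score. As a rough illustration of what that metric measures, here is a minimal Python sketch of the Brier score at a single fixed time horizon, together with the relative reduction between two models. It is a simplification, not the study's method: it ignores censoring, which a time-to-event analysis such as this one would need to handle (for example with inverse-probability-of-censoring weights). The function names and the toy numbers are hypothetical and are not MESA data.

```python
from typing import Sequence


def brier_score(predicted_risk: Sequence[float], event_occurred: Sequence[int]) -> float:
    """Mean squared difference between predicted risk and the observed 0/1 outcome.

    Assumption: every participant's outcome at the horizon is known (no censoring).
    Lower is better; 0.0 means perfect predictions.
    """
    if len(predicted_risk) != len(event_occurred) or len(predicted_risk) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_risk, event_occurred)]
    return sum(squared_errors) / len(squared_errors)


def relative_reduction(baseline_score: float, model_score: float) -> float:
    """Fractional decrease of model_score relative to baseline_score."""
    return (baseline_score - model_score) / baseline_score


# Toy example (hypothetical numbers): 1 = event by the horizon, 0 = no event.
outcomes = [0, 0, 1, 0, 1]
risk_score_predictions = [0.30, 0.20, 0.50, 0.40, 0.40]  # stand-in for a standard risk score
forest_predictions = [0.20, 0.10, 0.60, 0.30, 0.60]      # stand-in for a forest model

baseline = brier_score(risk_score_predictions, outcomes)
improved = brier_score(forest_predictions, outcomes)
print(f"baseline={baseline:.3f} improved={improved:.3f} "
      f"reduction={relative_reduction(baseline, improved):.1%}")
```

Under this reading, a 10%-25% decrease means the model's mean squared prediction error is that much smaller than the comparison risk score's.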