Amorim J L, Bensenor I M, Alencar A P, Pereira A C, Goulart A C, Lotufo P A, Santos I S
Centro de Pesquisa Clínica e Epidemiológica, Hospital Universitário, Universidade de São Paulo, São Paulo, SP, Brasil.
Departamento de Clínica Médica, Faculdade de Medicina, Universidade de São Paulo, São Paulo, SP, Brasil.
Braz J Med Biol Res. 2025 Aug 22;58:e14986. doi: 10.1590/1414-431X2025e14986. eCollection 2025.
It is unclear who benefits most from screening imaging for atherosclerotic cardiovascular disease (ASCVD). This study aimed to identify features associated with positive coronary artery calcium scores (CACS) in individuals with diabetes using machine learning (ML) techniques. ELSA-Brasil is a cohort study with 15,105 participants aged 35 to 74 years in six Brazilian cities. We analyzed 25 sociodemographic, medical history, symptom-related, and laboratory variables from 585 participants from the São Paulo investigation center with CACS data and no overt cardiovascular disease at baseline. We used six ML algorithms to build models to identify individuals with positive CACS. Feature importance was determined by SHapley Additive exPlanations (SHAP) values. The best-performing ML algorithm was the XGBoost Classifier (accuracy: 94.8%). Age (SHAP: 0.220), systolic blood pressure (SHAP: 0.102), and body mass index (SHAP: 0.075) were the most important variables for identifying ASCVD in individuals with diabetes in the XGBoost models. Across all ML models in our analysis, age, systolic blood pressure, and sex were frequently influential variables. Our best model achieved high accuracy using information generally available in current clinical practice. ML models may help clinicians select patients with characteristics most likely to be associated with a positive CACS. Age, systolic blood pressure, body mass index, and sex may be useful markers to identify those at higher risk for subclinical ASCVD.
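The abstract ranks features by SHAP values, which build on Shapley values from cooperative game theory. As a rough illustration of that idea only, the sketch below computes exact Shapley values for a single hypothetical patient against a single baseline point. Everything in it is an assumption made for illustration: the function `shapley_values`, the linear `toy_score` model, its coefficients, and the input values are invented and are not the study's model, data, or pipeline, which used XGBoost and SHAP on ELSA-Brasil participants.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one instance against a single baseline point."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance's value; the rest stay at baseline.
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_score(f):
    # Hypothetical linear score; coefficients are made up for illustration.
    return 0.04 * f["age"] + 0.01 * f["sbp"] + 0.02 * f["bmi"]


# Hypothetical reference point and patient (not taken from the study).
baseline = {"age": 50, "sbp": 120, "bmi": 25}
patient = {"age": 65, "sbp": 150, "bmi": 31}

phi = shapley_values(toy_score, patient, baseline)

# Rank features by the magnitude of their contribution for this one patient.
ranking = sorted(phi, key=lambda k: abs(phi[k]), reverse=True)
```

Exact enumeration grows exponentially with the number of features, so it is practical only for small toy cases like this one; with 25 variables and a tree ensemble, a dedicated SHAP implementation would normally be used instead. The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that gives SHAP its name.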