Department of Radiology, New York-Presbyterian Hospital and Weill Cornell Medicine, New York, New York, United States of America.
Dalio Institute of Cardiovascular Imaging, Weill Cornell Medicine, New York, New York, United States of America.
PLoS One. 2020 Jun 25;15(6):e0233791. doi: 10.1371/journal.pone.0233791. eCollection 2020.
Machine learning (ML) is able to extract patterns and develop algorithms to construct data-driven models. We use ML models to gain insight into the relative importance of variables for predicting obstructive coronary artery disease (CAD) using the Coronary Computed Tomographic Angiography for Selective Cardiac Catheterization (CONSERVE) study, and to compare their prediction of obstructive CAD with that of the CAD consortium clinical score (CAD2). We further perform ML analysis to gain insight into the role of imaging and clinical variables for revascularization.
For prediction of obstructive CAD, the entire invasive coronary angiography (ICA) arm of the study, comprising 719 patients, was used. For revascularization, 1,028 patients were randomized to ICA or coronary computed tomographic angiography (CCTA). Data were randomly split into 80% training and 20% test sets for model building and validation. Models used extreme gradient boosting (XGBoost).
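As a rough illustration of the random 80%/20% split described above, the following minimal sketch uses only the Python standard library. The function name `split_80_20`, the seed, and the integer patient identifiers are hypothetical placeholders; the study's actual splitting procedure and its XGBoost model fitting are not reproduced here.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of `records` and return (train, test) in an 80/20 ratio."""
    shuffled = list(records)
    # A seeded generator makes the illustrative split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 719 patients of the ICA arm.
patient_ids = list(range(719))
train_ids, test_ids = split_80_20(patient_ids)
print(len(train_ids), len(test_ids))
```

With 719 records, this rounding rule places 575 in the training set and 144 in the test set; a different rounding or stratification choice would shift the counts slightly.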
Mean age was 60.6 ± 11.5 years and 64.3% were female. For the prediction of obstructive CAD, the area under the receiver operating characteristic curve (AUC) was significantly higher for ML at 0.779 (95% CI: 0.672-0.886) than for CAD2 at 0.696 (95% CI: 0.594-0.798) (P = 0.01). Body mass index (BMI), age, and angina severity were the most important variables. For revascularization, the model obtained an overall AUC of 0.958 (95% CI: 0.933-0.983). Performance did not differ whether the imaging parameters used were from ICA (AUC 0.947, 95% CI: 0.903-0.990) or CCTA (AUC 0.941, 95% CI: 0.895-0.988) (P = 0.90). The ML model obtained sensitivity and specificity of 89.2% and 92.9%, respectively. Number of vessels with ≥70% stenosis, maximum segment stenosis severity (SSS), and BMI were the most important variables. Exclusion of imaging variables resulted in performance deterioration, with an AUC of 0.705 (95% CI: 0.614-0.795) (P < 0.0001).
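The metrics reported above (AUC, sensitivity, specificity) can be computed from predicted scores and true labels. The sketch below is a minimal standard-library illustration: the helper names, the 0.5 threshold, and the eight toy labels and scores are assumptions made for demonstration and are not study data. It computes AUC as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only (1 = event, 0 = no event).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(roc_auc(labels, scores))                  # 0.9375
print(sensitivity_specificity(labels, scores))  # (0.75, 0.75)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a test set of this size; a rank-based formulation would give the same value more efficiently.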
For obstructive CAD, the ML model outperformed CAD2. BMI is an important variable, although currently not included in most scores. In this ML model, imaging variables were most associated with revascularization. Imaging modality did not influence model performance. Removal of imaging variables reduced model performance.