Khalaji Amirmohammad, Behnoush Amir Hossein, Jameie Mana, Sharifi Ali, Sheikhy Ali, Fallahzadeh Aida, Sadeghian Saeed, Pashang Mina, Bagheri Jamshid, Ahmadi Tafti Seyed Hossein, Hosseini Kaveh
Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran.
School of Medicine, Tehran University of Medical Sciences, Tehran, Iran.
Front Cardiovasc Med. 2022 Aug 24;9:977747. doi: 10.3389/fcvm.2022.977747. eCollection 2022.
As the era of big data analytics unfolds, machine learning (ML) might be a promising tool for predicting clinical outcomes. This study aimed to evaluate the predictive ability of ML models for estimating mortality after coronary artery bypass grafting (CABG).
Baseline and follow-up features were obtained from the CABG data registry established in 2005 at Tehran Heart Center. After key variables were selected using the random forest method, prediction models were developed using Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Random Forest (RF) algorithms. Performance was assessed using the Area Under the Curve (AUC) and other indices.
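The workflow described above (random forest feature ranking, then several classifiers compared by AUC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic, the train/test split, hyperparameters, and feature scaling are arbitrary choices, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only one library. Ranking features on the training split only is also an assumption made here to avoid information leakage.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for registry data (assumption): 30 candidate features
# and a rare positive class, loosely mimicking a low event rate.
X, y = make_classification(
    n_samples=3000, n_features=30, n_informative=8,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: rank features by random forest importance and keep the top 11
# (11 matches the number of features reported; the ranking rule is assumed).
ranker = RandomForestClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
top_k = 11
selected = np.argsort(ranker.feature_importances_)[::-1][:top_k]

# Step 2: fit each classifier on the selected features only.
# "GB" is a stand-in for XGBoost, not XGBoost itself.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "NB": GaussianNB(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15)),
    "GB": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Step 3: score the held-out set and compute the ROC AUC per model.
auc_by_model = {}
for name, model in models.items():
    model.fit(X_train[:, selected], y_train)
    scores = model.predict_proba(X_test[:, selected])[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, scores)

for name, auc in sorted(auc_by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```

The AUC values printed here reflect the synthetic data only and say nothing about the registry results.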
A total of 16,850 patients with isolated CABG (mean age: 67.34 ± 9.67 years) were included. Of these, 16,620 had one-year follow-up, of whom 468 (2.8%) died. Eleven features were chosen to train the models. Total ventilation hours and left ventricular ejection fraction were by far the most predictive factors of mortality. All the models had AUC > 0.7 (acceptable performance) for 1-year mortality. Nonetheless, LR (AUC = 0.811) and XGBoost (AUC = 0.792) outperformed NB (AUC = 0.783), RF (AUC = 0.783), SVM (AUC = 0.738), and KNN (AUC = 0.715). The trend was similar for two-to-five-year mortality, with LR demonstrating the highest predictive ability.
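As an aid to reading the AUC values above: for a binary outcome, the ROC AUC equals the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as half. The sketch below computes this directly from pairs; the function name `pairwise_auc` and the risk scores are hypothetical and chosen only for illustration.

```python
from itertools import product


def pairwise_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted risks (not study data).
died = [0.9, 0.6, 0.4]
survived = [0.7, 0.3, 0.2, 0.1]

# 10 of the 12 pairs are ordered correctly, so the AUC is 10/12.
print(round(pairwise_auc(died, survived), 3))
```

On this reading, an AUC of 0.811 means the model ranks a deceased patient above a survivor in roughly 81% of such pairs.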
Various ML models showed acceptable performance for estimating CABG mortality, with LR showing the highest predictive performance. These models can help clinicians make decisions according to the risk of mortality in patients undergoing CABG.