Department of Neurology, The Second Hospital of Dalian Medical University, Dalian, Liaoning, China.
Department of Neurology, Shenyang First People's Hospital, Shenyang, Liaoning, China.
PLoS One. 2024 Feb 8;19(2):e0296402. doi: 10.1371/journal.pone.0296402. eCollection 2024.
To construct several prediction models for the risk of stroke in coronary artery disease (CAD) patients receiving coronary revascularization based on machine learning methods.
In total, 5757 CAD patients receiving coronary revascularization who were admitted to the ICU in the Medical Information Mart for Intensive Care IV (MIMIC-IV) database were included in this cohort study. The data were randomly split into a training set (n = 4029) and a testing set (n = 1728) at a ratio of 7:3. Pearson correlation analysis and a least absolute shrinkage and selection operator (LASSO) regression model were applied for feature screening. Variables with a Pearson correlation coefficient < 0.9 were included, and LASSO shrank the regression coefficients of uninformative features to 0. Features more closely related to the outcome were selected by 10-fold cross-validation, and features with non-zero coefficients were retained and included in the final model. The predictive value of the models was evaluated by sensitivity, specificity, area under the curve (AUC), and accuracy, with 95% confidence intervals (CIs).
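The split and the correlation screen described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the random seed, the pairwise reading of the Pearson filter, and the 0.9 cutoff (a common collinearity threshold) are all assumptions made for illustration, and the LASSO step is not shown.

```python
import random

def train_test_split_indices(n, train_frac=0.7, seed=0):
    """Shuffle row indices and cut them at train_frac (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)  # truncates toward zero
    return idx[:n_train], idx[n_train:]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def drop_collinear(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below
    the threshold. The 0.9 default is an assumption, not a value confirmed
    by the source."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# A 7:3 split of 5757 rows gives the set sizes reported in the abstract.
train_idx, test_idx = train_test_split_indices(5757)

# Toy data: "b" is almost a multiple of "a" and is screened out.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.1],
    "c": [5, 1, 4, 2, 3],
}
kept = drop_collinear(toy)
```

In a full pipeline, the features that survive this screen would then be passed to a cross-validated LASSO fit, with only the features carrying non-zero coefficients retained.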
The CatBoost model showed the best predictive performance, with an AUC of 0.831 (95%CI: 0.811-0.851) in the training set and 0.760 (95%CI: 0.722-0.798) in the testing set. The AUC of the logistic regression model was 0.789 (95%CI: 0.764-0.814) in the training set and 0.731 (95%CI: 0.686-0.776) in the testing set. The DeLong test showed that the predictive value of the CatBoost model was significantly higher than that of the logistic regression model (P<0.05). The Charlson Comorbidity Index (CCI) was the most important variable associated with the risk of stroke in CAD patients receiving coronary revascularization.
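For readers unfamiliar with how an AUC and its confidence interval are obtained, the following is a minimal sketch on toy data. The abstract does not state how the CIs were computed, so the Hanley-McNeil approximation used here is an assumption, as are the function names and the example scores. The DeLong comparison between two models is not implemented.

```python
import math

def auc_rank(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95% CI from the Hanley-McNeil standard error.
    An assumed method, chosen for illustration only."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(var)
    # Clip to the valid range of an AUC.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

# Toy labels (1 = stroke, 0 = no stroke) and made-up predicted risks.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]

auc = auc_rank(y_true, scores)  # 14 of 16 pairs ranked correctly: 0.875
low, high = hanley_mcneil_ci(auc, n_pos=4, n_neg=4)
```

With only eight cases the interval is very wide. The narrow intervals reported above reflect the much larger training and testing sets.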
The CatBoost model was the optimal model for predicting the risk of stroke in CAD patients receiving coronary revascularization and might provide a tool to quickly identify CAD patients at high risk of postoperative stroke.