Institute of Neuroscience, Kunming Medical University, Kunming, Yunnan, China.
Department of Neurology, Nanbu People's Hospital, Nanbu, Sichuan, China.
Biosci Rep. 2022 Sep 30;42(9). doi: 10.1042/BSR20220995.
Embolic stroke (ES) is characterized by high morbidity and mortality, but its mortality predictors remain unclear. The present study aimed to use machine learning (ML) to identify the key predictors of mortality for ES patients in the intensive care unit (ICU). Data were extracted from two large ICU databases: Medical Information Mart for Intensive Care (MIMIC)-IV for training and internal validation, and the eICU Collaborative Research Database (eICU-CRD) for external validation. We developed predictive models of ES mortality based on 15 ML algorithms. We used the synthetic minority oversampling technique (SMOTE) to address class imbalance. Our main performance metric was the area under the receiver operating characteristic curve (AUROC). We adopted recursive feature elimination (RFE) for feature selection. We assessed model performance using three disease-severity scoring systems as benchmarks. Of the 1566 and 207 ES patients enrolled from the two databases, 173 (15.70%), 73 (15.57%), and 36 (17.39%) died in hospital in the training, internal validation, and external validation cohorts, respectively. The random forest (RF) model had the largest AUROC (0.806) in the internal validation phase and was chosen as the best model. The AUROC of the RF compact (RF-COM) model containing the top six features identified by RFE was 0.795. In the external validation phase, the AUROC was 0.838 for the RF model and 0.830 for the RF-COM model, both outperforming the other models. Our findings suggest that the RF model was the best model and that the top six predictors of ES hospital mortality were Glasgow Coma Scale score, white blood cell count, blood urea nitrogen, bicarbonate, age, and mechanical ventilation.
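The abstract names SMOTE and AUROC but does not say how they were computed. The following is a minimal standard-library sketch of both ideas, not the study's code. The function names `auroc` and `smote_like`, the neighbour count `k`, and the toy numbers are all illustrative assumptions, and `smote_like` is a simplified version of the interpolation step rather than a full SMOTE implementation.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows, each interpolated between a
    randomly chosen minority row and one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by distance to the chosen base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical, not from MIMIC-IV or eICU-CRD).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75: 3 of 4 positive/negative pairs are ranked correctly

toy_minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]]
toy_synthetic = smote_like(toy_minority, n_new=5)
print(len(toy_synthetic))  # 5
```

In a pipeline of this kind, oversampling is normally applied to the training cohort only, so that the internal and external validation AUROCs are measured on unaltered data; whether the study did this is not stated in the abstract.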