Wen Rui, Wang Miaoran, Bian Wei, Zhu Haoyue, Xiao Ying, He Qian, Wang Yu, Liu Xiaoqing, Shi Yangdi, Hong Zhe, Xu Bing
Shenyang Tenth People's Hospital, Shenyang, China.
Affiliated Central Hospital of Shenyang Medical College, Shenyang Medical College, Shenyang, China.
Front Neurol. 2023 Oct 20;14:1247492. doi: 10.3389/fneur.2023.1247492. eCollection 2023.
This study aimed to compare the performance of different machine learning models in predicting symptomatic intracranial hemorrhage (sICH) after thrombolysis treatment for ischemic stroke.
This multicenter study used the Shenyang Stroke Emergency Map database, which comprises 8,924 patients with acute ischemic stroke from 29 comprehensive hospitals who underwent thrombolysis between January 2019 and December 2021. An independent testing cohort of 1,921 patients from the First People's Hospital of Shenyang was also established. The structured dataset comprised 15 variables covering clinical and therapeutic metrics. The primary outcome was the occurrence of sICH after thrombolysis. Models were developed using an 80/20 split for training and internal validation. Five machine learning classifiers were evaluated: logistic regression with lasso regularization, support vector machine (SVM), random forest, gradient-boosted decision tree (GBDT), and multilayer perceptron (MLP). The model with the highest area under the curve (AUC) was used to assess feature importance.
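The evaluation protocol described above (an 80/20 split followed by an AUC comparison) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `stratified_split` and `auc` are hypothetical, stratifying the split by outcome is an assumption the abstract does not confirm, and the labels are toy data chosen only to mimic a rare outcome.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each outcome class separately.

    Stratification is an assumption made here so that a rare outcome
    is represented in both partitions.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5).

    Quadratic in the number of cases; fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels: 16 events among 1,000 cases (illustrative only).
toy_labels = [1] * 16 + [0] * 984
train_idx, test_idx = stratified_split(toy_labels)
print(len(train_idx), len(test_idx))

# Each candidate model would be fitted on train_idx and scored on test_idx;
# the scores below are made up to show the AUC call.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With an outcome this rare, an unstratified 20% partition can contain very few events, which is why the sketch splits each class separately.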
Baseline characteristics were compared between the training cohort (n = 6,369) and the external validation cohort (n = 1,921); the incidence of sICH was slightly higher in the training cohort (1.6%) than in the validation cohort (1.1%). Among the evaluated models, logistic regression with lasso regularization achieved the highest AUC, 0.87 (95% confidence interval [CI]: 0.79-0.95; P < 0.001), followed by the MLP model with an AUC of 0.766 (95% CI: 0.637-0.894; P = 0.04). The reference model and SVM had AUCs of 0.575 and 0.582, respectively, while the random forest and GBDT models performed worse, with AUCs of 0.536 and 0.436, respectively. Decision curve analysis showed net benefit primarily for the SVM and MLP models. Feature importance from the logistic regression model identified anticoagulation therapy as the strongest negative predictor (coefficient: -2.0833) and recombinant tissue plasminogen activator as the principal positive predictor (coefficient: 0.5082).
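Decision curve analysis, mentioned above, compares models by their net benefit across threshold probabilities. The sketch below uses the standard net-benefit formula, TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability. It is an illustration only: the function names are hypothetical, the data are invented, and nothing here reproduces the study's curves.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of flagging every case whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: flag every case regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented example: one event among four cases.
toy_labels = [1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.2, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_labels, toy_probs, pt),
          net_benefit_treat_all(toy_labels, pt))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero, the net benefit of flagging no one.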
After comprehensive evaluation, the MLP model is recommended for its superior ability to predict the risk of symptomatic hemorrhage after thrombolysis in patients with ischemic stroke. It was selected on the basis of decision curve analysis and showed better discrimination than the reference model. The model may help clinicians plan treatment and forecast patient outcomes more precisely.