Lin Yanan, Li Yan, Luo Yayin, Han Jie
Department of Neurology, The First Affiliated Hospital of Dalian Medical University, Dalian, China.
Interdisciplinary Research Center for Biology and Chemistry, Liaoning Normal University, Dalian, China.
Front Neurol. 2025 Jan 15;15:1446250. doi: 10.3389/fneur.2024.1446250. eCollection 2024.
To develop and validate an explainable machine learning (ML) model predicting the risk of hemorrhagic transformation (HT) after intravenous thrombolysis.
We retrospectively enrolled patients who received intravenous tissue plasminogen activator (IV-tPA) thrombolysis within 4.5 h of symptom onset to form the original modeling cohort. HT was defined as any hemorrhage on a head CT scan completed within 48 h after IV-tPA administration. We used the Random Forest (RF), Multilayer Perceptron (MLP), Adaptive Boosting (AdaBoost), and Gaussian Naive Bayes (GauNB) algorithms to develop ML-HT models. The models' predictive performance was evaluated using confusion-matrix metrics (accuracy, precision, recall, and F1 score) and discrimination analysis (area under the receiver operating characteristic curve, ROC-AUC) in the original cohort, followed by validation in an independent external cohort. The models' explainability was assessed using the SHapley Additive exPlanations (SHAP) global feature plot, the SHAP summary plot, and partial dependence plots.
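The evaluation metrics named above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the choice of HT as the positive class (label 1) are all assumptions made for illustration, since the abstract does not state them.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from binary labels (1 = HT, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): true labels and predicted HT probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold and on which class is treated as positive, whereas ROC-AUC depends only on how the predicted scores rank the cases.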
A total of 1,007 patients were included in the original modeling cohort, with an HT incidence of 8.94%. The RF-based ML-HT model achieved an accuracy of 0.874, precision of 0.972, recall of 0.890, and F1 score of 0.929, with an ROC-AUC of 0.7847 in the original cohort and 0.7119 in the external validation cohort. The corresponding values were 0.878, 0.967, 0.989, 0.978, 0.7710, and 0.6768 for the MLP model; 0.907, 0.967, 0.989, 0.978, 0.7798, and 0.6606 for the AdaBoost model; and 0.848, 0.983, 0.598, 0.716, 0.6953, and 0.6289 for the GauNB model. The explainability analysis of the RF-based ML model indicated that the National Institutes of Health Stroke Scale (NIHSS) score, age, platelet count, and atrial fibrillation were the primary determinants of HT following IV-tPA thrombolysis.
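As a worked illustration of how the reported metrics relate, the F1 score is the harmonic mean of precision and recall. Using the precision and recall reported for the RF model (the symbols $P$ and $R$ are notation introduced here, not the authors'):

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.972 \times 0.890}{0.972 + 0.890}
    = \frac{1.730}{1.862}
    \approx 0.929
```

This agrees with the F1 score of 0.929 reported for the RF model.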
The RF-based explainable ML model demonstrated promising predictive ability for estimating the risk of HT after IV-tPA thrombolysis and may help support clinical decision-making in emergency settings.