Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA, USA.
Digital Innovation and Consumer Experience (DICE) Group, Thomas Jefferson University, Philadelphia, PA, USA.
J Stroke Cerebrovasc Dis. 2021 Jul;30(7):105796. doi: 10.1016/j.jstrokecerebrovasdis.2021.105796. Epub 2021 Apr 19.
Novel machine learning (ML) methods are being investigated across medicine for their predictive capabilities and for their potential to improve adaptability and generalizability. In this study, we compare logistic regression with ML models for feature importance analysis and prediction of first-pass reperfusion.
We retrospectively identified cases of ischemic stroke treated with mechanical thrombectomy (MT) at our institution from 2012 to 2018. Significant variables used in predictive modeling were demographic characteristics, medical history, admission NIHSS, and stroke characteristics. The outcome was TICI grade on first pass, binarized as 0-2a vs 2b-3. Shapley feature importance plots were used to identify the variables that most strongly affected the outcome.
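To illustrate what a Shapley feature importance value measures, the following is a minimal sketch that computes exact Shapley values for a toy scoring function. It is not the study's model or pipeline: the function `toy_model`, its coefficients, and the three binary features (aspiration, hyperlipidemia, stent retriever) are hypothetical, and "feature absent" is approximated by a single all-zero baseline, whereas practical tools for tree ensembles typically average over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    A feature that is "absent" from a coalition is set to its baseline
    value (a simplifying assumption made for this sketch).
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # HYPOTHETICAL score with made-up coefficients, not fitted to any data.
    # z = [aspiration, hyperlipidemia, stent_retriever], each 0 or 1.
    return 0.4 + 0.2 * z[0] - 0.1 * z[1] + 0.1 * z[0] * z[2]


x = [1, 1, 1]          # hypothetical patient with all three features present
baseline = [0, 0, 0]   # hypothetical reference patient
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, and the interaction term is split equally between the two features involved. A Shapley importance plot ranks features by the magnitude of such contributions aggregated over many patients.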
Accuracy was 67.1% for both the Random Forest and SVM models, compared with 65.8% for the logistic regression model. The Brier score was lower for the Random Forest model (0.329 vs 0.342), indicating better predictive capability. The other supervised learning models performed worse than the logistic regression model, with accuracy of 56.2% for Naïve Bayes and 61.6% for XGBoost. Shapley plots for the Random Forest model showed use of aspiration, hyperlipidemia, hypertension, use of stent retriever, and time between symptom onset and catheterization as the top five predictors of first-pass reperfusion.
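The two metrics reported here can be sketched in a few lines. The Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower is better. When it is computed on hard 0/1 predictions instead of probabilities, it reduces to the misclassification rate, that is, 1 minus accuracy. The reported values are numerically consistent with that relationship (1 - 0.671 = 0.329 and 1 - 0.658 = 0.342), although the abstract does not state how the scores were computed. The labels and probabilities below are made up for illustration and are not study data.

```python
def accuracy(y_true, y_pred):
    """Fraction of hard 0/1 predictions that match the observed outcome."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def brier_score(y_true, p_pred):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, p_pred)) / len(y_true)


# HYPOTHETICAL outcomes (1 = TICI 2b-3 on first pass) and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
probs = [0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6, 0.3, 0.9, 0.45]

# Hard predictions obtained by thresholding the probabilities at 0.5.
hard = [1 if p >= 0.5 else 0 for p in probs]

acc = accuracy(y_true, hard)
brier_hard = brier_score(y_true, hard)    # equals 1 - acc for 0/1 predictions
brier_prob = brier_score(y_true, probs)   # uses the full probability estimates

print(acc, brier_hard, brier_prob)
```

Computed on probabilities, the Brier score also rewards calibration, which accuracy alone does not capture.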
For the study of MT outcomes in our dataset, machine learning models such as Random Forest are more accurate than logistic regression and identify new factors that contribute to achieving first-pass reperfusion. The benefits of machine learning, such as improved predictive capabilities, integration of new data, and generalizability, establish ML as the preferred approach for studying outcomes in stroke.