Zheng Yan, Lin Yuan-Xiang, He Qiu, Zhuo Ling-Yun, Huang Wei, Gao Zhu-Yu, Chen Ren-Long, Zhao Ming-Pei, Xie Ze-Feng, Ma Ke, Fang Wen-Hua, Wang Deng-Liang, Chen Jian-Cai, Kang De-Zhi, Lin Fu-Xin
Department of Neurosurgery, Neurosurgery Research Institute, The First Affiliated Hospital, Fujian Medical University, Fuzhou, China.
Department of Neurosurgery, Binhai Branch of National Regional Medical Center, The First Affiliated Hospital, Fujian Medical University, Fuzhou, China.
Front Neurol. 2022 Aug 25;13:955271. doi: 10.3389/fneur.2022.955271. eCollection 2022.
Stroke-associated pneumonia (SAP) contributes to high mortality in patients with spontaneous intracerebral hemorrhage (sICH). Accurate prediction of SAP and early intervention are associated with prognosis, yet none of the previously developed predictive scoring systems is widely accepted. We aimed to derive and validate novel supervised machine learning (ML) models to predict SAP in patients with supratentorial sICH.
Data from eligible individuals with supratentorial sICH were extracted from the database and split into training, internal validation, and external validation datasets. The primary outcome was SAP during hospitalization. Univariate and multivariate analyses were used for variable selection. Logistic regression (LR), Gaussian naïve Bayes (GNB), random forest (RF), K-nearest neighbor (KNN), support vector machine (SVM), extreme gradient boosting (XGB), and an ensemble soft voting model (ESVM) were used to derive the ML models. Accuracy, sensitivity, specificity, and area under the curve (AUC) were used to evaluate the predictive value of each model in internal validation, cross-validation, and external validation.
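The soft voting ensemble described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, the data are synthetic (only the sample size of 468 and the count of six predictors echo the abstract), all hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGB so the example needs no extra dependency. Soft voting averages the class probabilities predicted by the base models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 468 rows and 6 features mirror the
# cohort size and predictor count in the abstract; the values are random.
X, y = make_classification(
    n_samples=468, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Six base learners. Scale-sensitive models are wrapped in a scaling pipeline.
base_models = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("gnb", GaussianNB()),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    # probability=True is required so the SVM exposes predict_proba.
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    # Stand-in for XGB (assumption): a gradient-boosted tree model.
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# voting="soft" averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(estimators=base_models, voting="soft")
ensemble.fit(X_train, y_train)

proba = ensemble.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"Soft-voting AUC on held-out synthetic data: {auc:.3f}")
```

Because each base model contributes a probability rather than a hard label, a confident model can outweigh several uncertain ones, which is one reason a soft voting ensemble may behave more stably across datasets than any single learner.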
A total of 468 individuals with sICH were included. Six independent predictors of SAP [nasogastric feeding, airway support, unconscious onset, surgery for external ventricular drainage (EVD), larger sICH volume, and intensive care unit (ICU) stay] were identified and used to derive and validate the ML prediction models. In internal validation and cross-validation, the GNB model showed superior and robust performance, with the highest AUC (0.861, 95% CI: 0.793-0.930), whereas the LR model had the highest AUC (0.867, 95% CI: 0.812-0.923) in external validation. The ESVM, which combines the other six methods, showed moderate but robust performance in both cross-validation and external validation and achieved an AUC of 0.843 (95% CI: 0.784-0.902) in external validation.
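For readers unfamiliar with how an AUC and its 95% CI can be obtained, the following is a minimal sketch using only the Python standard library. The abstract does not state how its intervals were computed; the percentile bootstrap used here is an assumption, as are the function names `auc_score` and `bootstrap_auc_ci` and the toy labels and scores.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; AUC is undefined
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical): 1 = SAP, 0 = no SAP, with predicted probabilities.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60, 0.65, 0.70, 0.80, 0.90]

point = auc_score(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

With a sample this small the interval is wide; with a few hundred cases, as in the external validation set, the same procedure yields intervals of roughly the width reported above.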
The ML models effectively predicted SAP in sICH populations, and our novel ensemble model showed reliable and robust performance despite population and algorithmic differences. These findings suggest that ML may aid the early identification of SAP.