Xu Jixiang, Li Yuan, Zhu Fumin, Han Xiaoxiao, Chen Liang, Qi Yinliang, Zhou Xiaomei
Department of Hyperbaric Oxygen, The Second People's Hospital of Hefei, Hefei Hospital Affiliated to Anhui Medical University, Hefei, Anhui Province, China.
Department of Neurology, Dazhou Central Hospital, Dazhou, Sichuan, China.
Front Neurol. 2025 Jun 18;16:1571755. doi: 10.3389/fneur.2025.1571755. eCollection 2025.
Pulmonary infection (PI) remains a prevalent and severe complication in patients recovering from spontaneous deep subcortical intracerebral hemorrhage (deep SICH). Accurate prediction of PI risk is crucial for early intervention and optimized clinical management. The aim of this study was to develop a machine learning (ML) model for predicting PI risk in patients during the recovery phase of deep SICH and to investigate the contributions of individual risk factors through explainable artificial intelligence techniques.
We conducted a retrospective study involving 649 patients diagnosed with PI during the recovery phase of deep SICH between 2021 and 2023. The cohort was divided into a training set (70%, n = 454) and a testing set (30%, n = 195). Eight key clinical features were identified using the Boruta algorithm: mechanical ventilation, nasogastric feeding, tracheotomy, antibacterial drug use, hyperbaric oxygen therapy, procalcitonin levels, sedative drug use, and consciousness scores. Seven ML algorithms were used to build predictive models, and performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The best-performing model was selected, and SHAP (SHapley Additive exPlanations) analysis was performed to interpret feature importance.
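The split and the four evaluation metrics named above can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the helper names (`train_test_split_indices`, `roc_auc`, `evaluate`), the random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the study's actual splitting procedure is not described in the abstract.

```python
import random


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into train/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary assumption
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Return AUC plus sensitivity, specificity and accuracy at a threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "auc": roc_auc(y_true, scores),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(y_true),
    }


# A 70/30 split of 649 records yields the set sizes reported in the abstract.
train_idx, test_idx = train_test_split_indices(649)

# Toy labels (1 = PI) and predicted probabilities; NOT study data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = evaluate(y_true, scores)
```

AUC is threshold-free, whereas sensitivity, specificity, and accuracy depend on the chosen cut-off, which is why the sketch takes the threshold as an explicit parameter.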
Among 649 patients with deep SICH, no significant baseline differences were found between the training (n = 454) and testing (n = 195) sets. The Boruta algorithm identified eight key predictors of PI. The random forest (RF) model achieved the highest AUCs: 0.994 (95% CI: 0.989-0.998) in training and 0.931 (95% CI: 0.899-0.963) in testing. DeLong tests showed that RF significantly outperformed several models (DT, SVM, LightGBM), while performance differences with XGBoost (p = 0.95), KNN (p = 0.80), and LR (p = 0.22) were not significant. SHAP analysis revealed mechanical ventilation, nasogastric feeding, and tracheotomy as key risk factors, with hyperbaric oxygen therapy and higher consciousness scores showing protective effects.
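To make the SHAP attributions concrete, the sketch below computes exact Shapley values for a three-feature toy risk function by enumerating feature subsets. SHAP libraries estimate the same quantity efficiently for real models; this brute-force version is only an illustration. The function `toy_risk`, its baseline, and its weights are hypothetical: only the direction of each effect follows the abstract, and none of the magnitudes come from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values: each feature's marginal contribution to
    value_fn, averaged over all subsets of the other features with the
    standard weight |S|! (n - |S| - 1)! / n!."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score for the set of features that are present.
    All numbers are invented for illustration."""
    risk = 0.2  # assumed baseline risk
    if "mechanical_ventilation" in present:
        risk += 0.3
    if "tracheotomy" in present:
        risk += 0.2
    if "hyperbaric_oxygen" in present:
        risk -= 0.1  # protective direction, as reported in the abstract
    if {"mechanical_ventilation", "tracheotomy"} <= present:
        risk += 0.1  # assumed interaction term
    return risk


features = ["mechanical_ventilation", "tracheotomy", "hyperbaric_oxygen"]
phi = shapley_values(toy_risk, features)
```

In this toy case the interaction term is shared equally between mechanical ventilation and tracheotomy, and the attributions sum to the difference between the risk with all features present and the baseline, which is the additivity property that makes per-patient SHAP plots readable.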
This study provides a high-performing and interpretable ML-based risk stratification tool for pulmonary infection in patients during the recovery phase of deep SICH. The integration of SHAP enhances clinical applicability by demystifying complex model outputs, thereby supporting individualized preventive strategies. These findings underscore the promise of explainable AI in advancing neurocritical care and call for prospective multicenter validation and real-time dynamic model adaptation in future research.