Machine learning-based prediction of 6-month functional recovery in hypertensive cerebral hemorrhage: insights from XGBoost and SHAP analysis.

Author information

He Menghui, Lu Zhongsheng, Lv Yiwei, Cheng Zihai, Zhang Qiang, Jin Xiaoqing, Han Pei

Affiliations

Department of Graduate School, Qinghai University, Xining, China.

Department of Neurosurgery, Qinghai Provincial People's Hospital, Xining, China.

Publication information

Front Neurol. 2025 Jun 4;16:1608341. doi: 10.3389/fneur.2025.1608341. eCollection 2025.

Abstract

BACKGROUND

The rate of poor prognosis in hypertensive cerebral hemorrhage (HICH) remains high. The period from 3 to 6 months after onset is the phase of most rapid neurological recovery in hemorrhagic stroke patients. Accurate early prediction of 6-month functional outcomes is critical for optimizing therapeutic strategies. This study compared the predictive performance of multiple machine learning models to identify the best model for forecasting long-term prognosis in HICH patients.

METHODS

We conducted a retrospective analysis of clinical data from 807 HICH patients admitted to the Department of Neurosurgery of Qinghai Provincial People's Hospital between June 2020 and June 2024. After data preprocessing, data from June 2020 to December 2023 (n = 716) were randomly split into training (n = 497) and test (n = 219) sets at a 7:3 ratio. Data from January to June 2024 (n = 91) served as an external validation set. Recursive Feature Elimination (RFE) was performed to identify optimal features, and repeated five-fold cross-validation was used to reduce the risk of overfitting. Model performance was evaluated using Area Under the Curve (AUC) and Decision Curve Analysis (DCA) across XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The optimal model was interpreted via SHapley Additive exPlanations (SHAP).
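The workflow described above (7:3 split, RFE, repeated five-fold cross-validation, AUC) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the abstract does not name the software used, the number of features retained, or the number of cross-validation repeats, so `n_features_to_select=8`, `n_repeats=3`, the stratified split, and all random seeds are assumptions. Logistic regression (one of the five compared models) stands in for XGBoost so that the sketch depends on scikit-learn only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (
    RepeatedStratifiedKFold,
    cross_val_score,
    train_test_split,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 716-patient development cohort (NOT study data).
X, y = make_classification(
    n_samples=716, n_features=20, n_informative=6,
    weights=[0.72, 0.28], random_state=0,
)

# Random split at a 7:3 ratio; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Keeping RFE inside the pipeline means features are re-selected within
# each cross-validation fold, which avoids leaking held-out information.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Repeated five-fold cross-validation on the training set, scored by AUC.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
cv_auc = cross_val_score(pipeline, X_train, y_train, cv=cv, scoring="roc_auc")

# Refit on the full training set and evaluate once on the held-out test set.
pipeline.fit(X_train, y_train)
selected = np.flatnonzero(pipeline.named_steps["rfe"].support_)
test_auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])

print("mean CV AUC:", cv_auc.mean())
print("selected feature indices:", selected)
print("test AUC:", test_auc)
```

A later time window, such as the January to June 2024 cohort in the study, would be scored with the fitted pipeline in the same way as `X_test`, without any refitting.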

RESULTS

The 6-month poor prognosis rate among the 807 HICH patients was 27.51%. The XGBoost model performed best in the training set (AUC = 0.921, 95% CI: 0.896-0.944) and remained stable in the external validation set (AUC = 0.813, 95% CI: 0.728-0.899). DCA showed that the XGBoost model provided higher net benefit than the other models across threshold probabilities of 0%-20% and 56%-100%. SHAP analysis identified hematoma volume as the most important predictor, with secondary contributions from Glasgow Coma Scale score, white blood cell count, age, serum albumin, and systolic blood pressure, among others.
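For readers unfamiliar with DCA, the net benefit compared above is conventionally defined, at threshold probability p_t, as TP/N - (FP/N) * p_t / (1 - p_t), where patients with predicted risk at or above p_t count as test-positive. The sketch below implements that standard definition in plain Python. The four-patient toy data are invented for illustration only, and the function names are hypothetical, not taken from the paper.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (invented values): 1 = poor outcome, 0 = good outcome.
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.1]

for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots these values over a grid of thresholds for each model, alongside the treat-all and treat-none (net benefit 0) strategies. A model is preferred over a threshold range when its curve lies above all the others there.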

CONCLUSION

XGBoost models demonstrate strong accuracy in predicting the long-term prognosis of HICH patients. The SHAP framework quantifies the specific contribution of each key pathophysiological indicator to the model's prediction for an individual patient, enabling individualized risk stratification and strategic allocation of medical resources.
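To illustrate what "contribution to an individual prediction" means, the sketch below computes exact Shapley values for one patient by enumerating feature subsets, with absent features replaced by a single reference value. This is a simplification: the SHAP library averages over a background dataset and uses a tree-specific algorithm for models such as XGBoost. The risk function, its coefficients, and the patient and reference values are all hypothetical and not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(present):
        # Features outside `present` are set to their baseline value.
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical risk score over (hematoma volume in mL, GCS score, age in years).
def toy_risk(z):
    volume, gcs, age = z
    return 0.03 * volume - 0.05 * gcs + 0.01 * age


patient = [40.0, 9.0, 70.0]     # invented patient
reference = [15.0, 14.0, 60.0]  # invented reference values

phi = shapley_values(toy_risk, patient, reference)
print(dict(zip(["hematoma_volume", "gcs", "age"], phi)))
```

The contributions sum to the difference between the patient's prediction and the reference prediction, which is the additivity property that lets SHAP decompose a single risk estimate feature by feature.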

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15e3/12173871/e193609039d1/fneur-16-1608341-g0001.jpg
