Shang Shuai, Wei Meng, Lv Huasheng, Liang Xiaoyan, Lu Yanmei, Tang Baopeng
Department of Cardiac Pacing and Electrophysiology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, People's Republic of China.
Xinjiang Key Laboratory of Cardiac Electrophysiology and Remodeling, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, People's Republic of China.
Int J Gen Med. 2025 Jun 20;18:3277-3288. doi: 10.2147/IJGM.S514972. eCollection 2025.
This study aimed to develop and validate a machine learning model to predict the risk of in-hospital death among patients of advanced age with heart failure (HF).
A total of 4580 patients of advanced age who were admitted to the hospital and diagnosed with HF from May 2012 to September 2023 were included in this study, among whom 552 cases (12.5%) died. Least absolute shrinkage and selection operator (LASSO) regression and Boruta feature selection were used to screen the baseline variables and identify those significantly associated with death. Seven different machine learning models were then constructed and their predictive performance was evaluated. SHapley Additive exPlanations (SHAP) values were used to analyze the impact of key variables on the model's predictions.
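The selection-then-modeling workflow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (generated with `make_classification`), an L1-penalized logistic regression stands in for the LASSO step, the Boruta step and the SHAP analysis are omitted, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGB. The penalty strength `C=0.1`, the 70/30 split, and all variable names are hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption): 20 candidate variables,
# 7 of them informative, roughly 12% positive (death) class.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=7, n_redundant=0,
    weights=[0.88, 0.12], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize so that the L1 penalty treats all variables on the same scale.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO-style screening: keep variables whose L1-penalized coefficient is non-zero.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1, random_state=0)
selector = SelectFromModel(lasso).fit(X_train_s, y_train)
selected = np.flatnonzero(selector.get_support())

# Gradient-boosted trees as a stand-in for XGB, trained on the selected variables only.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train_s[:, selected], y_train)

# Discrimination on the held-out validation split.
val_prob = model.predict_proba(X_val_s[:, selected])[:, 1]
auc = roc_auc_score(y_val, val_prob)
print(f"{selected.size} variables kept, validation AUC = {auc:.3f}")
```

The numbers this prints describe the synthetic data only and say nothing about the study cohort.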
LASSO regression and Boruta feature selection identified seven variables significantly associated with death: white blood cell count (WBC), neutrophil percentage (Neut%), C-reactive protein (CRP), D-dimer, glycated serum protein (GSP), N-terminal pro-B-type natriuretic peptide (NT-proBNP), and body mass index (BMI). Among all the models, the extreme gradient boosting (XGB) model performed best, with an area under the curve (AUC) of 0.933, a sensitivity (recall) of 0.79, a specificity of 0.89, and an F1 score of 0.59 on the validation set. The SHAP analysis showed that CRP, BMI, NT-proBNP, D-dimer, and GSP were the main contributors to the predicted risk of death.
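To see how a sensitivity of 0.79 and a specificity of 0.89 relate to an F1 score near 0.6, the short calculation below derives the precision and F1 implied by those two rates at a given event rate. It is a sketch under one explicit assumption: that the validation set has the same mortality rate as the full cohort (552 of 4580), which the abstract does not state. The function name is hypothetical.

```python
def implied_precision_and_f1(sensitivity, specificity, prevalence):
    """Precision and F1 implied by sensitivity, specificity, and event rate."""
    true_pos = sensitivity * prevalence                   # deaths correctly flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # survivors wrongly flagged
    precision = true_pos / (true_pos + false_pos)
    f1 = 2.0 * precision * sensitivity / (precision + sensitivity)
    return precision, f1


# Assumption: validation-set mortality equals the cohort-wide rate, 552 / 4580.
prevalence = 552 / 4580
precision, f1 = implied_precision_and_f1(0.79, 0.89, prevalence)
print(f"implied precision = {precision:.2f}, implied F1 = {f1:.2f}")
# prints: implied precision = 0.50, implied F1 = 0.61
```

Under that assumption the implied F1 is about 0.61, close to the reported 0.59. With only about one patient in eight dying, roughly half of the flagged patients would be false positives, which limits F1 even when the AUC is high.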
This study constructed a prediction model for the risk of in-hospital death in patients of advanced age with HF, and the XGB model showed excellent predictive performance. The model can be used for early clinical identification of high-risk patients and can thereby support individualized treatment strategies.