Wang Yanfeng, Miao Xisha, Xiao Gang, Huang Chun, Sun Junwei, Wang Ying, Li Panlong, You Xu
The School of Electrical and Information Engineering, Zhengzhou University of Light Industry, Zhengzhou, China.
Department of Clinical Laboratory, The Third Affiliated Hospital, Southern Medical University, Guangzhou, China.
Front Genet. 2022 Apr 26;13:889378. doi: 10.3389/fgene.2022.889378. eCollection 2022.
Heart failure (HF) is the main cause of mortality in hemodialysis (HD) patients, yet predicting HF in this population remains challenging. We therefore aimed to establish and validate a model to predict HF events in HD patients. A total of 355 maintenance HD patients from two hospitals were included in this retrospective study. Twenty-one variables were used, covering demographic characteristics, medical history, and blood biochemical indicators. Two classification models were built, one based on the extreme gradient boosting (XGBoost) algorithm and one on traditional linear logistic regression. Model performance was evaluated using calibration curves and the area under the receiver operating characteristic curve (AUC). Feature importance and SHapley Additive exPlanations (SHAP) were used to identify risk factors among the variables. A Kaplan-Meier curve was constructed for each risk factor, and the curves were compared using the log-rank test. Compared with traditional linear logistic regression, the XGBoost model performed better in accuracy (78.5% vs. 74.8%), sensitivity (79.6% vs. 75.6%), specificity (78.1% vs. 74.4%), and AUC (0.814 vs. 0.722). The feature importance and SHAP values of the XGBoost model indicated that age, hypertension, platelet count (PLT), C-reactive protein (CRP), and white blood cell count (WBC) were risk factors for HF. These results were further confirmed by the Kaplan-Meier curves. The XGBoost-based model showed satisfactory performance in predicting HF events and could prove to be a useful tool for the early prediction of HF in HD patients.
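The abstract compares the two models on accuracy, sensitivity, specificity, and AUC. As a minimal sketch of how such metrics are commonly computed from predicted probabilities, the dependency-free Python below may help. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the study data.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, where 1 marks an HF event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold.

    The 0.5 default is an assumption; the abstract does not state a cutoff.
    Both classes must be present, otherwise a division by zero occurs.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and predicted probabilities, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    print(classification_metrics(labels, scores))
    print("AUC:", roc_auc(labels, scores))
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be preferable for larger datasets.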