Department of Nephrology, Daping Hospital, Army Medical University, Chongqing, 400042, China.
Teaching Office, Medical Research Department, Army Special Medical Center, Chongqing, China.
BMC Med Inform Decis Mak. 2024 Jan 2;24(1):8. doi: 10.1186/s12911-023-02412-z.
An appropriate prediction model for adverse prognosis before peritoneal dialysis (PD) is lacking. Thus, we retrospectively analysed patients who underwent PD to construct a predictive model for adverse prognoses using machine learning (ML).
A retrospective analysis was conducted on 873 patients who underwent PD from August 2007 to December 2020, of whom 824 met the inclusion criteria and were included in the analysis. Five commonly used ML algorithms were used for the initial model training, and model performance was assessed using the area under the curve (AUC) and accuracy (ACC). The indicators with the highest impact were ranked and displayed using Shapley additive explanations (SHAP, version 0.41.0). The top 20 indicators were selected to build a compact model that is conducive to clinical application. All model-building steps were implemented in Python 3.8.3.
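The selection of the top-ranked indicators can be illustrated with a minimal sketch. It assumes that a matrix of per-patient SHAP values is already available and that features are ranked by their mean absolute SHAP value, which is a common convention but is not confirmed by the abstract. The function name `top_k_features`, the indicator names, and the toy values are hypothetical.

```python
from statistics import fmean


def top_k_features(shap_values, feature_names, k=20):
    """Rank features by mean absolute SHAP value and return the k highest.

    shap_values: one row per patient, one column per feature.
    Assumption: mean |SHAP| is used as the global importance score.
    """
    importance = []
    for j in range(len(feature_names)):
        # Global importance of feature j: average magnitude over all patients.
        importance.append(fmean(abs(row[j]) for row in shap_values))
    ranked = sorted(zip(feature_names, importance),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy attribution matrix: 3 patients x 4 hypothetical indicators.
toy_shap = [
    [0.5, -0.25, 0.0, 1.0],
    [-0.5, 0.25, 0.0, -2.0],
    [0.5, 0.25, 0.0, 0.0],
]
toy_names = ["age", "albumin", "sex", "urea"]

selected = top_k_features(toy_shap, toy_names, k=2)
print(selected)  # [('urea', 1.0), ('age', 0.5)]
```

In the study's setting, `k` would be 20 and the compact model would be retrained on the selected columns only.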
At the end of follow-up, 353 patients had withdrawn from PD (converted to haemodialysis or died), and 471 patients continued receiving PD. Among the complete models, the categorical boosting classifier (CatBoost) exhibited the strongest performance (AUC = 0.80, 95% confidence interval [CI] = 0.76-0.83; ACC = 0.78, 95% CI = 0.72-0.83) and was selected for subsequent analysis. We then built a compact model from the 20 key features ranked by SHAP values, and CatBoost again showed the strongest performance (AUC = 0.79, ACC = 0.74).
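For readers unfamiliar with the reported metrics, the following sketch shows one way to compute AUC, ACC, and a 95% CI using only the Python standard library. The abstract does not state how the intervals were obtained. The percentile bootstrap used here, the 0.5 classification threshold, and the toy labels and scores are all assumptions for illustration.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def bootstrap_ci(metric, y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined unless both classes are present
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = withdrew from PD, 0 = continued PD (hypothetical values).
y = [0, 0, 1, 1, 0, 1, 1, 0]
p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]

auc_value = auc(y, p)        # 0.875
acc_value = accuracy(y, p)   # 0.75
auc_lo, auc_hi = bootstrap_ci(auc, y, p)
print(auc_value, acc_value, (auc_lo, auc_hi))
```

With only eight toy cases the bootstrap interval is very wide; on a cohort of several hundred patients it would narrow considerably.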
The CatBoost model, built using ML, demonstrated the best predictive performance among the algorithms tested. Our prediction model therefore has potential value for patient screening before PD and for stratified management after PD.