Department of Biomedical Engineering & Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates.
Sci Rep. 2024 Feb 14;14(1):3687. doi: 10.1038/s41598-024-54375-4.
Chronic kidney disease (CKD) is a major global health problem that affects a large proportion of the world's population and increases morbidity and mortality. Early-stage CKD often presents without visible symptoms, so patients may be unaware of the disease. Early detection and treatment are critical for reducing complications and improving quality of life for affected patients. In this work, we investigate an explainable artificial intelligence (XAI)-based strategy that uses clinical characteristics to predict CKD. Clinical data were collected from 491 patients, 56 with CKD and 435 without, covering clinical, laboratory, and demographic variables. Five machine learning (ML) methods were used to develop the predictive model: logistic regression (LR), random forest (RF), decision tree (DT), Naïve Bayes (NB), and extreme gradient boosting (XGBoost). The optimal model was selected based on accuracy and area under the curve (AUC). The SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) algorithms were then used to show how individual features influence the optimal model. Of the five models, XGBoost performed best, with an AUC of 0.9689 and an accuracy of 93.29%. Feature importance analysis identified creatinine, glycosylated hemoglobin type A1C (HgbA1C), and age as the three most influential features in the XGBoost model. SHAP force analysis was used to visualize individualized CKD predictions, and LIME provided further insight into individual predictions. This study presents an interpretable ML-based approach for the early prediction of CKD. The SHAP and LIME methods enhance the interpretability of ML models and help clinicians understand the rationale behind the predicted outcomes.
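The model comparison described in the abstract (five classifiers, ranked by accuracy and AUC) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated with `make_classification` to roughly mimic the 56/435 class imbalance), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the 70/30 stratified split, the hyperparameters, and the rule "rank by AUC, break ties by accuracy" are all assumptions. The helper name `compare_models` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit five classifiers and return accuracy and AUC on a held-out split."""
    # Assumption: a stratified 70/30 split; the abstract does not state the split.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(random_state=seed),
        "DT": DecisionTreeClassifier(random_state=seed),
        "NB": GaussianNB(),
        # Stand-in for XGBoost so the sketch needs only scikit-learn.
        "GB": GradientBoostingClassifier(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        proba = model.predict_proba(X_te)[:, 1]  # probability of the positive class
        results[name] = {
            "accuracy": accuracy_score(y_te, model.predict(X_te)),
            "auc": roc_auc_score(y_te, proba),
        }
    return results


# Synthetic stand-in data: 491 samples, about 89% negatives (435/491).
X, y = make_classification(
    n_samples=491, n_features=10, n_informative=5,
    weights=[435 / 491], random_state=0,
)
results = compare_models(X, y)

# Assumed selection rule: highest AUC, ties broken by accuracy.
best = max(results, key=lambda n: (results[n]["auc"], results[n]["accuracy"]))
```

With heavily imbalanced classes such as these, accuracy alone can look high for a model that rarely predicts the minority class, which is one reason to report AUC alongside it.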