Kumar Abhishek, Mondal Sanchita, Khatua Debnarayan, Guha Debashree, Mukherjee Budhaditya, Lahiri Arista, Prasad Dilip K, Sekh Arif Ahmed
School of Medical Science and Technology, IIT Kharagpur, Kharagpur, West Bengal, 721302, India.
Department of Mathematics and Statistics, Vignan's Foundation for Science, Technology and Research, Andhra Pradesh, 522213, India.
Sci Rep. 2025 Jul 1;15(1):21067. doi: 10.1038/s41598-025-07406-7.
Visceral Leishmaniasis (VL), also known as Kala-Azar, is a neglected disease and a significant global public health challenge, with relapses and treatment failures leading to increased morbidity and mortality. This study introduces an explainable machine learning approach to predict VL relapse and identify critical risk factors, thereby aiding patient monitoring and treatment strategies. Using data from a follow-up study of 571 patients, survival machine learning models, including Random Survival Forest (RSF), Survival Support Vector Machine (SSVM), and eXtreme Gradient Boosting (XGBoost), were applied to predict relapse. RSF, with a C-index of 0.85, outperformed the conventional Cox Proportional Hazards (CPH) model (C-index 0.8), offering improved prediction by capturing non-linear relationships and variable interactions. To address the lack of transparency of Machine Learning (ML) models with respect to feature importance, the SHapley Additive exPlanations (SHAP) method was employed, which enhances model interpretability through visual insights. SHAP dependence plots allow healthcare professionals to evaluate which factors contribute to relapse. A statistically significant association between HIV co-infection and VL relapse (HR=3.92, 95% CI=2.03-7.58) was identified through -2 log-likelihood ratio and chi-square tests. These results indicate the promise of explainable artificial intelligence (XAI) for supporting clinical decisions and addressing recurrences in VL.
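The C-index used above to compare RSF and CPH is the fraction of comparable patient pairs in which the model assigns the higher risk to the patient who relapses earlier. The following is a minimal sketch of Harrell's version of that statistic, not the study's code: the function name `concordance_index`, the toy follow-up times, event flags, and risk scores are all illustrative assumptions, and pairs with tied times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       : follow-up time per patient
    events      : 1 if relapse was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed relapse
        for j in range(n):
            # Patient j is comparable if still relapse-free after time i.
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier relapse ranked as riskier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): five patients.
toy_times = [5, 8, 12, 20, 24]
toy_events = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.4, 0.2, 0.7, 0.3]

if __name__ == "__main__":
    print(concordance_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.85 (RSF) and 0.8 (CPH) should be read.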