Ning Caihong, Ouyang Hui, Xiao Jie, Wu Di, Sun Zefang, Liu Baiqi, Shen Dingcheng, Hong Xiaoyue, Lin Chiayan, Li Jiarong, Chen Lu, Zhu Shuai, Li Xinying, Xia Fada, Huang Gengwen
Department of General Surgery, Xiangya Hospital, Central South University, Changsha, Hunan Province 410008, China.
National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan Province 410008, China.
EClinicalMedicine. 2025 Jan 22;80:103074. doi: 10.1016/j.eclinm.2025.103074. eCollection 2025 Feb.
Infected pancreatic necrosis (IPN) is a severe complication of acute pancreatitis, with reported mortality rates of 15% to 35%. However, existing mortality prediction tools for IPN are few and lack sufficient sensitivity and specificity. This study aimed to develop and validate an explainable machine learning (ML) model for predicting death among patients with IPN.
We performed a prospective cohort study of 344 patients with IPN consecutively enrolled at a large Chinese tertiary hospital from January 2011 to January 2023. Ten ML models were developed to predict 90-day mortality in these patients. A benchmarking test, involving nested resampling, automatic hyperparameter tuning, and random search, was conducted to select the ML model. A sequential forward selection method was used to choose the optimal feature subset from 31 candidate subsets, simplifying the model while maximizing predictive performance. The final model was internally validated with 1000 bootstrap resamples and externally validated in an independent cohort of 132 patients with IPN retrospectively collected from another Chinese tertiary hospital from January 2018 to January 2023. The SHapley Additive exPlanations (SHAP) method was used to interpret the model in terms of feature importance and feature effects. The final model, built on the optimal feature subset, was deployed as an interactive web-based Shiny app.
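The sequential forward selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the feature names, and the toy scoring function are hypothetical, and in a real analysis the score would be a resampled performance estimate (for example a cross-validated C-index) rather than the hand-written gains used here.

```python
from __future__ import annotations

from typing import Callable, Sequence


def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[tuple[str, ...]], float],
) -> tuple[tuple[str, ...], float]:
    """Greedy forward selection: grow the subset one feature at a time.

    Returns the best-scoring subset seen along the greedy path and its score.
    """
    selected: list[str] = []
    remaining = list(features)
    best_subset: tuple[str, ...] = ()
    best_score = float("-inf")
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(tuple(selected + [f])), f) for f in remaining]
        step_score, step_feature = max(candidates, key=lambda pair: pair[0])
        selected.append(step_feature)
        remaining.remove(step_feature)
        # Strict '>' keeps the smaller subset when two sizes tie.
        if step_score > best_score:
            best_score = step_score
            best_subset = tuple(selected)
    return best_subset, best_score


# Hypothetical per-feature gains, with a penalty per added feature so that
# weak features make the score worse. These numbers are illustrative only.
TOY_GAIN = {"organ_failure": 0.30, "bloodstream_infection": 0.20, "noise": 0.01}


def toy_score(subset: tuple[str, ...]) -> float:
    return sum(TOY_GAIN[f] for f in subset) - 0.05 * len(subset)


toy_subset, toy_best = sequential_forward_selection(list(TOY_GAIN), toy_score)
print(toy_subset, round(toy_best, 2))
```

On this toy input the search keeps the two informative features and drops the third, because adding it lowers the penalized score. Forward selection is greedy, so it evaluates only one path through the subsets and is not guaranteed to find the global optimum.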
The random survival forest (RSF) model outperformed the other 9 ML models (internal validation, C-index = 0.863 [95% CI: 0.854-0.875]; external validation, C-index = 0.857 [95% CI: 0.850-0.865]). Multiple organ failure, Acute Physiology and Chronic Health Evaluation II (APACHE II) score ≥20, duration of organ failure ≥21 days, bloodstream infection, time from onset to first intervention <30 days, Bedside Index of Severity in Acute Pancreatitis score ≥3, critical acute pancreatitis, age ≥50 years, and hemorrhage were the 9 most important features associated with mortality. Furthermore, the SHAP algorithm revealed nonlinear interactions between important predictors and mortality, identifying 9 feature pairs with high SHAP interaction values and clinical significance. Two interactive web-based Shiny apps were developed to enhance clinical practicability: https://rsfmodels.shinyapps.io/IPN_app/ for cases where the APACHE II score was available and https://rsfmodels.shinyapps.io/IPNeasy/ for cases where it was not.
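For readers unfamiliar with the C-index reported above, the following sketch computes Harrell's concordance index on right-censored data and a percentile bootstrap interval. It is an illustration under stated assumptions: the abstract does not say how the confidence intervals were obtained, so the percentile method is assumed here, the function names are hypothetical, the toy data are invented, and tied event times are simply treated as not comparable.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had the event and did so
    strictly before subject j's observed time. Tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risks[k] for k in idx],
            ))
        except ValueError:
            continue  # skip resamples without any comparable pair
    if not stats:
        raise ValueError("all bootstrap resamples were degenerate")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[max(int((1 - alpha / 2) * len(stats)) - 1, 0)]
    return lower, upper


# Invented toy data: follow-up in days, event flag (1 = died), predicted risk.
toy_times = [5, 10, 20, 30, 40, 60]
toy_events = [1, 1, 0, 1, 0, 0]
toy_risks = [0.9, 0.4, 0.3, 0.6, 0.2, 0.1]

toy_c = concordance_index(toy_times, toy_events, toy_risks)
toy_lower, toy_upper = bootstrap_ci(toy_times, toy_events, toy_risks)
print(round(toy_c, 3))
```

In the toy data, 10 of the 11 comparable pairs are ordered correctly, giving a C-index of about 0.909. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.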
An explainable ML model for predicting death among patients with IPN is feasible and effective, suggesting strong potential for guiding clinical management and improving patient outcomes. Two publicly accessible web tools built on the optimized model facilitate its use in clinical settings.
The Natural Science Foundation of Hunan Province (2023JJ30885), the Postdoctoral Fellowship Program of CPSF (GZB20230872), the Youth Science Foundation of Xiangya Hospital (2023Q13), and the Project Program of National Clinical Research Center for Geriatric Disorders of Xiangya Hospital (2021LNJJ19).