Explainable Machine Learning Model for Predicting Persistent Sepsis-Associated Acute Kidney Injury: Development and Validation Study.

Author information

Jiang Wei, Zhang Yaosheng, Weng Jiayi, Song Lin, Liu Siqi, Li Xianghui, Xu Shiqi, Shi Keran, Li Luanluan, Zhang Chuanqing, Wang Jing, Yuan Quan, Zhang Yongwei, Shao Jun, Yu Jiangquan, Zheng Ruiqiang

Affiliations

Department of Critical Care Medicine, Northern Jiangsu People's Hospital Affiliated to Yangzhou University, Yangzhou, China.

School of Clinical and Basic Medicine, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China.

Publication information

J Med Internet Res. 2025 Apr 28;27:e62932. doi: 10.2196/62932.

Abstract

BACKGROUND

Persistent sepsis-associated acute kidney injury (SA-AKI) is associated with poor clinical outcomes and remains a therapeutic challenge for clinicians. Early identification and prediction of persistent SA-AKI are crucial.

OBJECTIVE

The aim of this study was to develop and validate an interpretable machine learning (ML) model that predicts persistent SA-AKI and to compare its diagnostic performance with that of C-C motif chemokine ligand 14 (CCL14) in a prospective cohort.

METHODS

The study used 4 retrospective cohorts and 1 prospective cohort for model derivation and validation. The derivation cohort used the MIMIC-IV database, which was randomly split into 2 parts (80% for model construction and 20% for internal validation). External validation was conducted using subsets of the MIMIC-III dataset and e-ICU dataset, and retrospective cohorts from the intensive care unit (ICU) of Northern Jiangsu People's Hospital. Prospective data from the same ICU were used for validation and comparison with urinary CCL14 biomarker measurements. Acute kidney injury (AKI) was defined based on serum creatinine and urine output, using the Kidney Disease: Improving Global Outcomes (KDIGO) criteria. Routine clinical data within the first 24 hours of ICU admission were collected, and 8 ML algorithms were used to construct the prediction model. Multiple evaluation metrics, including area under the receiver operating characteristic curve (AUC), were used to compare predictive performance. Feature importance was ranked using Shapley Additive Explanations (SHAP), and the final model was explained accordingly. In addition, the model was developed into a web-based application using the Streamlit framework to facilitate its clinical application.
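The 80%/20% random split and the AUC metric described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the data are synthetic, and the helper names `train_test_split` and `auc`, the seeds, and the toy score distribution are all assumptions made for illustration. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy cohort (an assumption, not study data): roughly 45% positives,
# with a noisy risk score that is higher on average for positive cases.
rng = random.Random(42)
patients = []
for _ in range(1000):
    label = int(rng.random() < 0.45)
    score = rng.gauss(0.65 if label else 0.35, 0.2)
    patients.append((label, score))

# 80% for model construction, 20% held out for internal validation.
train, test = train_test_split(patients, test_fraction=0.2, seed=1)
test_auc = auc([y for y, _ in test], [s for _, s in test])
print(len(train), len(test), round(test_auc, 3))
```

The pairwise definition is quadratic in cohort size, which is acceptable for a toy example; a rank-based formulation gives the same value more efficiently on large cohorts.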

RESULTS

A total of 46,097 patients with sepsis from multiple cohorts were enrolled for analysis. Among 17,928 patients with sepsis in the derivation cohort, 8081 patients (45.1%) showed progression to persistent SA-AKI. Among the 8 ML models, the gradient boosting machine (GBM) model demonstrated superior discriminative ability. Following feature importance ranking, a final interpretable GBM model comprising 12 features (AKI stage, ΔCreatinine, urine output, furosemide dose, BMI, Sequential Organ Failure Assessment score, kidney replacement therapy, mechanical ventilation, lactate, blood urea nitrogen, prothrombin time, and age) was established. The final model accurately predicted the occurrence of persistent SA-AKI in both internal (AUC=0.870) and external validation cohorts (MIMIC-III subset: AUC=0.891; e-ICU dataset: AUC=0.932; Northern Jiangsu People's Hospital retrospective cohort: AUC=0.983). In the prospective cohort, the GBM model outperformed urinary CCL14 in predicting persistent SA-AKI (GBM AUC=0.852 vs CCL14 AUC=0.821). The model has been transformed into an online clinical tool to facilitate its application in clinical settings.

CONCLUSIONS

The interpretable GBM model predicted persistent SA-AKI with good discrimination in both internal and external validation cohorts, and it outperformed the biomarker CCL14 in the prospective validation cohort.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4516/12070005/870ee2d994f1/jmir_v27i1e62932_fig1.jpg
