Supervised Machine Learning Models for Predicting Sepsis-Associated Liver Injury in Patients With Sepsis: Development and Validation Study Based on a Multicenter Cohort Study.

Author information

Lei Jingchao, Zhai Jia, Zhang Yao, Qi Jing, Sun Chuanzheng

Affiliations

Third Xiangya Hospital of Central South University, Changsha, China.

Publication information

J Med Internet Res. 2025 May 26;27:e66733. doi: 10.2196/66733.

DOI:10.2196/66733

PMID:40418571

Abstract

BACKGROUND

Sepsis-associated liver injury (SALI) is a severe complication of sepsis that contributes to increased mortality and morbidity. Early identification of SALI can improve patient outcomes; however, sepsis heterogeneity makes timely diagnosis challenging. Traditional diagnostic tools are often limited, and machine learning techniques offer promising solutions for predicting adverse outcomes in patients with sepsis.

OBJECTIVE

This study aims to develop an explainable machine learning model, incorporating stacking techniques, to predict the occurrence of liver injury in patients with sepsis and provide decision support for early intervention and personalized treatment strategies.

METHODS

This retrospective multicenter cohort study adhered to the TRIPOD+AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis, Extended for Artificial Intelligence) guidelines. Data from 8834 patients with sepsis in the Medical Information Mart for Intensive Care IV (MIMIC-IV) database were used for training and internal validation, while data from 4236 patients in the eICU Collaborative Research Database (eICU-CRD) were used for external validation. SALI was defined as an international normalized ratio >1.5 and total bilirubin >2 mg/dL within 1 week of intensive care unit admission. Nine machine learning models were trained: decision tree, random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), support vector machine, elastic net, logistic regression, multilayer perceptron, and k-nearest neighbors. A stacking ensemble model, using LightGBM, XGBoost, and RF as base learners and Lasso regression as the meta-model, was optimized via 10-fold cross-validation. Hyperparameters were tuned using grid search and Bayesian optimization. Model performance was evaluated using accuracy, balanced accuracy, Brier score, detection prevalence, F1-score, Jaccard index, κ coefficient, Matthews correlation coefficient, negative predictive value, positive predictive value, precision, recall, area under the receiver operating characteristic curve (ROC-AUC), precision-recall AUC, and decision curve analysis. Shapley additive explanations (SHAP) values were used to quantify feature importance.
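
The outcome definition above (international normalized ratio >1.5 and total bilirubin >2 mg/dL within 1 week of intensive care unit admission) can be sketched as a labeling rule. This is a minimal illustration, not the authors' extraction code: the `LabResult` record, the lab names `"inr"` and `"bilirubin_total"`, and the function `has_sali` are hypothetical and do not correspond to the MIMIC-IV or eICU-CRD schemas. The sketch also assumes that the two abnormal values need not come from the same blood draw and that only results at or after admission count, neither of which the abstract specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable

INR_THRESHOLD = 1.5          # abstract: international normalized ratio > 1.5
BILIRUBIN_THRESHOLD = 2.0    # abstract: total bilirubin > 2 mg/dL
WINDOW = timedelta(days=7)   # abstract: within 1 week of ICU admission


@dataclass(frozen=True)
class LabResult:
    """Hypothetical lab record; not a MIMIC-IV or eICU-CRD table layout."""
    name: str            # assumed names: "inr" or "bilirubin_total"
    value: float         # INR is unitless; bilirubin is in mg/dL
    taken_at: datetime


def has_sali(icu_admit: datetime, labs: Iterable[LabResult]) -> bool:
    """Return True when both thresholds are exceeded inside the window.

    Assumption: the two abnormal results may come from different draws,
    as long as each falls between admission and admission + 7 days.
    """
    window_end = icu_admit + WINDOW
    inr_high = False
    bilirubin_high = False
    for lab in labs:
        if not (icu_admit <= lab.taken_at <= window_end):
            continue  # ignore results outside the 1-week window
        if lab.name == "inr" and lab.value > INR_THRESHOLD:
            inr_high = True
        elif lab.name == "bilirubin_total" and lab.value > BILIRUBIN_THRESHOLD:
            bilirubin_high = True
    return inr_high and bilirubin_high
```

Both comparisons are strict, so a patient with an INR of exactly 1.5 or a bilirubin of exactly 2 mg/dL is not labeled as having SALI under this reading of the definition.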

RESULTS

In the training set, LightGBM, XGBoost, and RF demonstrated the best performance among all models, with ROC-AUCs of 0.9977, 0.9311, and 0.9847, respectively. These models exhibited minimal variance in cross-validation, with tightly clustered ROC-AUC and precision-recall AUC distributions. In the internal validation set, LightGBM (ROC-AUC 0.8401) and XGBoost (ROC-AUC 0.8403) outperformed all other models, while RF achieved an ROC-AUC of 0.8193. In the external validation set, LightGBM (ROC-AUC 0.7077), XGBoost (ROC-AUC 0.7169), and RF (ROC-AUC 0.7081) maintained strong performance, although ROC-AUC was lower than in the training set. The stacking model achieved ROC-AUCs of 0.995, 0.838, and 0.721 in the training, internal validation, and external validation sets, respectively. Key predictors (total bilirubin, lactate, prothrombin time, and mechanical ventilation status) were consistently identified across models, with SHAP analysis highlighting their significant contributions to the model's predictions.
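
To help read the ROC-AUC values reported above, the following sketch computes ROC-AUC from its rank interpretation: the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. The function name `roc_auc` and the toy inputs are illustrative assumptions; this is not the evaluation code used in the study, and the pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as P(score of a positive > score of a negative), ties = 0.5.

    labels: 1 for patients with the outcome, 0 for patients without it.
    scores: predicted risk, higher meaning more likely to have the outcome.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 3 of the 4 positive-negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for comparing the training, internal validation, and external validation figures.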

CONCLUSIONS

The stacking ensemble model developed in this study yields accurate and robust predictions of SALI in patients with sepsis, demonstrating potential clinical utility for early intervention and personalized treatment strategies.
