Random forest-driven mortality prediction in critical IBD care: a dual-database model integrating comorbidity patterns and real-time physiometrics.

Author information

Zhang Zhenze, Zhao Caiqing, Zhou Yijun, Yao Ling, Liu Peng, Fang Ziling, Fang Nian

Affiliations

Clinical Graduate School, Jiangxi Medical College, Nanchang University, Nanchang, China.

The 1st Affiliated Hospital, Nanchang University, Nanchang, China.

Publication information

Front Med (Lausanne). 2025 Aug 8;12:1624899. doi: 10.3389/fmed.2025.1624899. eCollection 2025.

DOI:10.3389/fmed.2025.1624899

PMID:40861216

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12370684/

Abstract

BACKGROUND

Inflammatory bowel disease (IBD) poses significant mortality risks for critically ill patients requiring intensive care unit (ICU) admission, driven by complications such as malnutrition, thromboembolism, and multi-organ dysfunction. Current prognostic tools for mortality prediction in this population remain limited. Machine learning (ML) offers advantages in handling complex clinical data but has not been systematically applied to this high-risk cohort. This multicenter study aimed to develop and validate ML-based models for mortality risk stratification in critically ill IBD patients using large-scale ICU databases.

METHODS

Data from 551 IBD patients in the MIMIC-IV database (2008-2019) were analyzed, with external validation in the eICU dataset. Nine ML algorithms (XGBoost, logistic regression, LightGBM, random forest, decision tree, elastic net, multilayer perceptron [MLP], k-nearest neighbors [KNN], and RSVM) were trained to predict 1-year mortality. Predictors included demographics, comorbidities, laboratory parameters, vital signs, and disease severity scores. Missing data (<30%) were imputed using random forest. The cohort was split into training (75%) and internal testing (25%) sets, and hyperparameters were optimized with 5-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and calibration curves. SHAP values were used to quantify each feature's contribution to predicted mortality risk and to identify the key determinants. A nomogram was constructed from the key predictors identified through logistic regression.
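The discrimination metrics named above can be illustrated with a minimal sketch. It assumes binary outcome labels (1 = died, 0 = survived) and model-predicted probabilities; the function names, the 0.5 threshold, and the toy data are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity (true positive rate) and specificity (true negative rate)
    when scores at or above `threshold` are classified as positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if y == 1:
            tp += predicted
            fn += 1 - predicted
        else:
            fp += predicted
            tn += 1 - predicted
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only; NOT patient data from MIMIC-IV or eICU.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(roc_auc(y_true, y_score))                  # 15 of 16 pairs ranked correctly
print(sensitivity_specificity(y_true, y_score))  # at the 0.5 threshold
```

The pairwise loop is quadratic in cohort size, which is acceptable for a few hundred patients; a rank-based formulation would be the usual choice for larger cohorts. Calibration curves need binned predicted versus observed risk and are not covered by this sketch.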

RESULTS

The random forest model achieved superior discrimination in internal validation (AUC > 0.8). Nine predictors were identified: malignancy history, Charlson Comorbidity Index (CCI), red cell distribution width (RDW), Glasgow Coma Scale (GCS) score, Sequential Organ Failure Assessment (SOFA) score, age, heart rate, weight, and gender. The nomogram demonstrated robust external validation performance in the eICU cohort (AUC > 0.8).

CONCLUSION

We developed and validated a machine learning-based nomogram to predict mortality in critically ill IBD patients, integrating interpretable predictors from multicenter ICU data.
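As a rough illustration of how a nomogram relates to a logistic regression model, the sketch below converts coefficients into a 0-100 point scale and maps total points back to a predicted probability. It follows the common convention in which the predictor with the largest effect over its range spans 100 points. The intercept, coefficients, and axis ranges are made-up placeholders, not the study's fitted values, and only three of the nine reported predictors are used to keep the example short.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Predictor:
    coef: float  # logistic regression coefficient (log-odds per unit)
    low: float   # lower end of the nomogram axis
    high: float  # upper end of the nomogram axis


# HYPOTHETICAL values for illustration only; NOT the published model.
INTERCEPT = -4.0
PREDICTORS: Dict[str, Predictor] = {
    "age": Predictor(coef=0.03, low=20.0, high=90.0),
    "sofa": Predictor(coef=0.20, low=0.0, high=20.0),
    "gcs": Predictor(coef=-0.15, low=3.0, high=15.0),
}


def _zero_point_value(p: Predictor) -> float:
    """Axis value that scores 0 points (the lowest-risk end of the range)."""
    return p.low if p.coef >= 0 else p.high


# Largest change in log-odds any single predictor can produce over its axis.
MAX_EFFECT = max(abs(p.coef) * (p.high - p.low) for p in PREDICTORS.values())


def points(name: str, value: float) -> float:
    """Points for one predictor, scaled so the strongest predictor spans 0-100."""
    p = PREDICTORS[name]
    return 100.0 * p.coef * (value - _zero_point_value(p)) / MAX_EFFECT


def total_points(patient: Dict[str, float]) -> float:
    """Sum of points; `patient` must supply a value for every predictor."""
    return sum(points(name, patient[name]) for name in PREDICTORS)


def probability(patient: Dict[str, float]) -> float:
    """Map total points back to a predicted probability via the logistic link."""
    baseline = INTERCEPT + sum(
        p.coef * _zero_point_value(p) for p in PREDICTORS.values()
    )
    log_odds = baseline + total_points(patient) * MAX_EFFECT / 100.0
    return 1.0 / (1.0 + math.exp(-log_odds))


patient = {"age": 65.0, "sofa": 8.0, "gcs": 10.0}  # hypothetical patient
print(round(total_points(patient), 1), round(probability(patient), 3))
```

Because the point scale is a linear rescaling of the log-odds, the probability read from the nomogram equals the one given directly by the logistic regression equation; the nomogram only changes how the calculation is presented.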
