Zhuangli Li, Xingcheng Zhang, Xiaoli Zhang, Zhonghua Lu, Yun Sun
The First Department of Critical Care Medicine, The Second Affiliated Hospital of Anhui Medical University, Hefei, China.
Department of Critical Care Medicine, The 901 Hospital of the Joint Logistic Support Force of the Chinese People's Liberation Army, Clinic College, Anhui Medical University, Hefei, China.
Front Med (Lausanne). 2025 Jul 21;12:1592051. doi: 10.3389/fmed.2025.1592051. eCollection 2025.
Acute pancreatitis (AP) in the intensive care unit (ICU) is linked to elevated in-hospital mortality rates. Timely identification of high-risk patients remains challenging. This study aimed to develop an interpretable machine learning model for predicting in-hospital mortality in ICU patients with AP and to identify key contributing factors.
A retrospective analysis was performed on 306 ICU patients diagnosed with AP. After data preprocessing and feature selection via the Least Absolute Shrinkage and Selection Operator (LASSO), seven machine learning models were developed: decision tree, random forest, XGBoost, support vector machine (SVM), multilayer perceptron, k-nearest neighbors (KNN), and logistic regression. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), Brier score, calibration plots, and decision curve analysis (DCA). The SHapley Additive exPlanations (SHAP) framework was utilized to interpret model predictions and assess feature importance rankings.
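The two numeric metrics named above, AUC and Brier score, can be illustrated with a minimal sketch. The labels and predicted probabilities below are hypothetical toy values, not study data, and the function names `auc` and `brier_score` are introduced here for illustration only; the abstract does not state which software the authors used.

```python
def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher predicted risk than a randomly chosen negative
    case; ties count as half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy example: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.1]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
print(f"AUC = {toy_auc:.3f}, Brier = {toy_brier:.3f}")
```

AUC measures discrimination (how well the model ranks deaths above survivors), while the Brier score also penalizes poorly calibrated probabilities, which is why the two are typically reported together with calibration plots.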
Multivariate logistic regression analysis identified the following independent risk factors for in-hospital mortality in ICU patients with AP: Acute Physiology and Chronic Health Evaluation II (APACHE II) score, activated partial thromboplastin time (APTT), albumin (Alb), blood urea nitrogen (BUN), creatinine (Cr), use of vasoactive agents, and ICU length of stay. The AUC values for the seven machine learning models in the training set were: decision tree (DT), 0.947; random forest (RF), 0.900; XGBoost, 0.887; SVM, 0.901; multilayer perceptron (MLP), 0.837; KNN, 0.983; and logistic regression (LR), 0.876. In the validation set, the corresponding AUC values were: DT, 0.698; RF, 0.850; XGBoost, 0.878; SVM, 0.892; MLP, 0.822; KNN, 0.755; and LR, 0.858. Although DT and KNN demonstrated high sensitivity and specificity in the training set, their validation AUCs fell to 0.698 and 0.755, respectively, a drop that suggests overfitting. SVM achieved the highest validation AUC (0.892). SHAP analysis ranked APACHE II score as the most influential predictor of mortality.
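The contrast between training and validation performance can be made explicit with a short calculation. The sketch below uses only the AUC values reported in the abstract; the helper name `auc_gap` and the idea of ranking models by the training-minus-validation gap are illustrative assumptions, not part of the authors' stated method.

```python
# AUC values as reported in the abstract: (training, validation).
reported_auc = {
    "DT": (0.947, 0.698),
    "RF": (0.900, 0.850),
    "XGBoost": (0.887, 0.878),
    "SVM": (0.901, 0.892),
    "MLP": (0.837, 0.822),
    "KNN": (0.983, 0.755),
    "LR": (0.876, 0.858),
}


def auc_gap(train, valid):
    """Training AUC minus validation AUC, rounded to 3 decimals."""
    return round(train - valid, 3)


gaps = {name: auc_gap(*pair) for name, pair in reported_auc.items()}

# Models with the largest drop from training to validation come first.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)

# Model with the highest validation AUC.
best_validation = max(reported_auc, key=lambda name: reported_auc[name][1])

for name, gap in ranked:
    print(f"{name:8s} gap = {gap:.3f}")
print("highest validation AUC:", best_validation)
```

A large gap is one simple signal that a model has fitted the training data too closely; by that measure DT and KNN stand out, while SVM combines a small gap with the highest validation AUC.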
An interpretable SVM model incorporating routinely available clinical variables effectively predicts in-hospital mortality in ICU patients with AP. SHAP-enhanced interpretation highlights key predictors and enhances model transparency, supporting clinical decision-making.