Du Zhaohui, Ying Qiaoling, Yang Yisen, Ma Huicong, Zhao Hongchang, Yang Jie, Wang Zhenjie, Zheng Chuanming, Wang Shurui, Tang Qiang
Department of Emergency Surgery, The First Affiliated Hospital of Bengbu Medical University, Bengbu, Anhui, China.
Department of Radiation Oncology, The First Affiliated Hospital of Bengbu Medical University, Bengbu, Anhui, China.
Front Med (Lausanne). 2025 Aug 12;12:1638097. doi: 10.3389/fmed.2025.1638097. eCollection 2025.
Acute pancreatitis-associated lung injury (APALI) is one of the most severe and life-threatening systemic complications of acute pancreatitis, with high morbidity and mortality. This study aimed to develop a machine learning-based prediction model for the diagnosis of APALI.
This study included data from the First Affiliated Hospital of Bengbu Medical College (July 2012 to June 2022), which were randomly divided into a training set and a testing set. Data from the Second Affiliated Hospital of Zhejiang University (January 2018 to April 2023) served as the external validation set. LASSO regression was applied to eliminate irrelevant or highly collinear independent variables. Six machine learning models were constructed and evaluated using the area under the curve (AUC), accuracy, sensitivity, specificity, F1 score, and recall. The impact of model features was analyzed using SHapley Additive exPlanations (SHAP).
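The evaluation metrics listed above can be illustrated with a minimal Python sketch. It is not the authors' code: the labels, predicted probabilities, and the 0.5 decision threshold are invented for illustration, and the function names are hypothetical. Note that recall is the same quantity as sensitivity, so the two always coincide.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Threshold-based metrics for binary labels (1 = APALI, 0 = no APALI)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # identical to recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "recall": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data, NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics)
print("AUC:", auc)
```

AUC is computed from the ranking of predicted probabilities and does not depend on the threshold, whereas accuracy, sensitivity, specificity, and F1 score all change when the threshold moves.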
A total of 1,975 patients with acute pancreatitis were randomly assigned to a training set (1,480 patients) and a testing set (495 patients). In the training set, 480 patients (32.43%) were diagnosed with APALI. The eXtreme Gradient Boosting (XGBoost) and Random Forest (RF) models demonstrated the best predictive performance in the testing set, achieving the highest AUC (0.92 and 0.914, respectively) along with higher accuracy, F1 score, and recall. Six particularly influential factors were identified and ranked as follows: C-reactive protein (CRP), body mass index (BMI), neutrophil, calcium, lactate, and neutrophil-to-albumin ratio (NAR). The global interpretability of the XGBoost and RF models, along with these six features, is shown in the SHAP summary plot. These two models were selected as the optimal models for developing an online calculator for clinical application and risk stratification.
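The idea behind the SHAP feature attributions can be sketched with a brute-force Shapley value computation. This is an illustration only: the SHAP library uses faster algorithms for tree models such as XGBoost and RF, and the toy linear risk score, its weights, and the patient and baseline values below are all invented, not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Features absent from a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of them.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy score over three features (CRP, BMI, calcium).
# The weights are invented for illustration.
def toy_model(z):
    crp, bmi, calcium = z
    return 0.02 * crp + 0.05 * bmi - 0.8 * calcium


patient = [150.0, 30.0, 1.9]   # invented values
baseline = [50.0, 24.0, 2.3]   # invented reference values

phi = shapley_values(toy_model, patient, baseline)
for name, contribution in zip(["CRP", "BMI", "calcium"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score. A SHAP summary plot aggregates such per-patient contributions across the dataset to rank features globally.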
We developed and internally validated a machine learning model to predict APALI, showing strong performance in our study population. To support further research and clinical use, we created an open-access web-based risk calculator. Prospective multicenter validation is needed to confirm generalizability. If successful, the tool may support early risk identification and guide interventions to prevent APALI.