Zhang Wanyue, Chang Yongjian, Ding Yuan, Zhu Yinnan, Zhao Yawen, Shi Ruihua
Department of Medical School, Southeast University, Nanjing 210009, China.
Department of Gastroenterology, Southeast University Affiliated Zhongda Hospital, No. 87 Dingjiaqiao, Nanjing 210009, China.
J Clin Med. 2023 Feb 21;12(5):1718. doi: 10.3390/jcm12051718.
To develop binary and four-class prediction models for patients with severe acute pancreatitis (SAP) using machine learning, so that clinicians can assess a patient's risk of acute respiratory distress syndrome (ARDS) and of severe ARDS at an early stage.
A retrospective study was conducted on SAP patients hospitalized in our hospital from August 2017 to August 2022. Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Decision Tree (DT), and eXtreme Gradient Boosting (XGB) were used to build binary classification models predicting ARDS. SHapley Additive exPlanations (SHAP) values were used to interpret the machine learning model, and the model was optimized according to the SHAP interpretability results. Using the optimized feature variables, four-class models (RF, SVM, DT, XGB, and Artificial Neural Network (ANN)) were constructed to predict mild, moderate, and severe ARDS, and the predictive performance of the models was compared.
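The binary comparison described above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the data are synthetic (`make_classification`), the function name `compare_binary_models` and all hyperparameters are assumptions, and scikit-learn's `GradientBoostingClassifier` stands in for XGB so that the example needs only scikit-learn. The SHAP interpretation step is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_binary_models(X, y, seed=0):
    """Fit five classifiers on one train split and return test-set AUC per model."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        # LR and SVM are scale-sensitive, so they get a standardization step.
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "RF": RandomForestClassifier(n_estimators=200, random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=seed)),
        "DT": DecisionTreeClassifier(max_depth=4, random_state=seed),
        # Stand-in for XGB (assumption): gradient-boosted trees from scikit-learn.
        "GB": GradientBoostingClassifier(random_state=seed),
    }
    aucs = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        # AUC is computed from the predicted probability of the positive class.
        aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    return aucs


# Synthetic stand-in for the patient table: 400 rows, 10 features, binary label.
X, y = make_classification(n_samples=400, n_features=10, n_informative=5, random_state=0)
aucs = compare_binary_models(X, y)
for name, auc in sorted(aucs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {auc:.2f}")
```

Ranking the models by held-out AUC mirrors the comparison the abstract reports; with real data one would typically use cross-validation rather than a single split.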
The XGB model performed best in binary classification (ARDS vs. non-ARDS), with an AUC of 0.84. Guided by the SHAP values, the ARDS severity prediction model was built from four feature variables (PaO2/FiO2, APACHE II, SOFA, AMY). Among the four-class models, ANN achieved the best overall prediction accuracy, 86%.
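For readers unfamiliar with the metric, overall accuracy in a four-class setting is the share of correctly classified cases, i.e. the diagonal of the confusion matrix divided by the total count. The sketch below shows that arithmetic. The class names, the function names, and every count in the matrix are made up for illustration and are not results from the study.

```python
# Hypothetical class labels (assumption); rows = true class, columns = predicted class.
CLASSES = ["non-ARDS", "mild", "moderate", "severe"]

# Illustrative counts only, not data from the paper.
confusion = [
    [40, 5, 3, 2],
    [4, 30, 5, 1],
    [2, 6, 25, 2],
    [1, 1, 3, 20],
]


def overall_accuracy(cm):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_recall(cm):
    """For each true class, the fraction of its cases predicted correctly."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


print(f"overall accuracy = {overall_accuracy(confusion):.3f}")
for name, recall in zip(CLASSES, per_class_recall(confusion)):
    print(f"{name}: recall = {recall:.3f}")
```

Overall accuracy can hide weak performance on a small class, which is why the per-class recall is printed alongside it.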
Machine learning performed well in predicting the occurrence and severity of ARDS in SAP patients and can provide a valuable tool to support clinical decision-making.