Department of Emergency Medicine, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin, 300052, P.R. China.
Department of Anesthesiology, The First Affiliated Hospital of Hebei North University, Zhangjiakou, Hebei, 075000, P.R. China.
BMC Surg. 2023 Sep 1;23(1):267. doi: 10.1186/s12893-023-02151-y.
This study aimed to construct predictive models for the risk of sepsis in patients with acute pancreatitis (AP) using machine learning methods, and to compare the optimal model with the logistic regression (LR) model and established scoring systems.
In this retrospective cohort study, data were collected from the Medical Information Mart for Intensive Care III (MIMIC III) database between 2001 and 2012 and the MIMIC IV database between 2008 and 2019. Patients were randomly divided into training and test sets (8:2). Least absolute shrinkage and selection operator (LASSO) regression with 5-fold cross-validation was used to screen and confirm the predictive factors. Based on the selected predictive factors, 6 machine learning models were constructed: support vector machine (SVM), K-nearest neighbour (KNN), multi-layer perceptron (MLP), LR, gradient boosting decision tree (GBDT), and adaptive boosting (AdaBoost). The models and scoring systems were evaluated and compared using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and the area under the curve (AUC).
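The workflow described above (8:2 split, LASSO screening with 5-fold cross-validation, a GBDT classifier, and the six reported metrics) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The data are synthetic (generated with `make_classification`, not MIMIC), the LASSO step is assumed to be an L1-penalized logistic regression, the 0.5 decision threshold is an assumption, and the names `confusion_metrics` and `run_sketch` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV and accuracy from binary labels."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def run_sketch(seed=0):
    # Synthetic stand-in for the cohort: about 20% positives, 30 candidate features.
    X, y = make_classification(
        n_samples=1672, n_features=30, n_informative=8,
        weights=[0.8, 0.2], random_state=seed,
    )

    # Random 8:2 split into training and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Standardize using training-set statistics only, so the L1 penalty
    # treats all features on the same scale.
    scaler = StandardScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

    # Assumed form of the LASSO step: L1-penalized logistic regression with
    # the penalty strength chosen by 5-fold cross-validation.
    lasso = LogisticRegressionCV(
        Cs=10, cv=5, penalty="l1", solver="liblinear",
        scoring="roc_auc", random_state=seed,
    ).fit(X_train_s, y_train)

    # Keep features with non-zero coefficients; fall back to all features
    # if the penalty happens to remove every one of them.
    selected = np.flatnonzero(lasso.coef_.ravel())
    if selected.size == 0:
        selected = np.arange(X.shape[1])

    # Fit the GBDT on the selected features and score the held-out test set.
    gbdt = GradientBoostingClassifier(random_state=seed)
    gbdt.fit(X_train_s[:, selected], y_train)
    proba = gbdt.predict_proba(X_test_s[:, selected])[:, 1]

    auc = roc_auc_score(y_test, proba)
    metrics = confusion_metrics(y_test, (proba >= 0.5).astype(int))
    return selected, auc, metrics


if __name__ == "__main__":
    selected, auc, metrics = run_sketch()
    print("selected feature indices:", selected.tolist())
    print("test AUC:", round(auc, 3))
    print({k: round(float(v), 3) for k, v in metrics.items()})
```

The other five classifiers named above could be swapped in for `GradientBoostingClassifier` under the same split and selected features, so that all models are compared on an identical test set.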
A total of 1,672 patients were eligible for participation. In the training set, 261 AP patients (19.51%) were diagnosed with sepsis. The predictive factors for the risk of sepsis in AP patients included age, insurance, vasopressors, mechanical ventilation, Glasgow Coma Scale (GCS), heart rate, respiratory rate, temperature, SpO2, platelet count, red blood cell distribution width (RDW), International Normalized Ratio (INR), and blood urea nitrogen (BUN). The AUC of the GBDT model for sepsis prediction in AP patients in the test set was 0.985. The GBDT model showed better performance in sepsis prediction than the LR model, the systemic inflammatory response syndrome (SIRS) score, the bedside index for severity in acute pancreatitis (BISAP) score, the sequential organ failure assessment (SOFA) score, the quick-SOFA (qSOFA) score, and the simplified acute physiology score II (SAPS II).
The present findings suggest that, compared with the classical LR model and the SOFA, qSOFA, SAPS II, SIRS, and BISAP scores, the GBDT machine learning model performed better in predicting sepsis in AP patients, making it a useful tool for early identification of high-risk patients and timely clinical intervention.