Machine learning predicts acute respiratory failure in pancreatitis patients: A retrospective study.

Author information

Department of Hepatobiliary Surgery, Northern Jiangsu People's Hospital Affiliated to Yangzhou University, Yangzhou, Jiangsu 225001, China.

Department of General Surgery, Liangzhou District Hospital of Integrated Traditional Chinese and Western Medicine, Wuwei, Gansu 733000, China.

Publication information

Int J Med Inform. 2024 Dec;192:105629. doi: 10.1016/j.ijmedinf.2024.105629. Epub 2024 Sep 14.

DOI:10.1016/j.ijmedinf.2024.105629

PMID:39321493

Abstract

PURPOSE

This study aimed to develop an algorithm for predicting acute respiratory failure (ARF) in patients with acute pancreatitis (AP).

METHODS

We collected data on patients with AP from the Medical Information Mart for Intensive Care IV database. The enrolled observations were randomly split into a training cohort (70%) and a validation cohort (30%), and the observations in the training cohort were divided into ARF and non-ARF groups. Feature engineering was performed in the training cohort using random forest (RF) and least absolute shrinkage and selection operator (LASSO) methods. The candidate models were logistic regression (LR), decision tree (DT), k-nearest neighbours (KNN), naive Bayes (NB) and extreme gradient boosting (XGBoost). Model performance was evaluated using receiver operating characteristic (ROC) curves, precision-recall curves (PRC), calibration curves, positive predictive value (PPV), negative predictive value (NPV), true positive rate (TPR), true negative rate (TNR), accuracy (ACC) and F1 score.
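The threshold-based metrics named above all derive from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The `ConfusionCounts` class, the `classification_metrics` function and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (ARF = positive class)."""
    tp: int  # ARF patients predicted as ARF
    fp: int  # non-ARF patients predicted as ARF
    tn: int  # non-ARF patients predicted as non-ARF
    fn: int  # ARF patients predicted as non-ARF


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute TPR, TNR, PPV, NPV, ACC and F1 from confusion counts.

    Assumes every denominator is non-zero, i.e. both classes occur
    and the model predicts both classes at least once.
    """
    tpr = c.tp / (c.tp + c.fn)            # sensitivity / recall
    tnr = c.tn / (c.tn + c.fp)            # specificity
    ppv = c.tp / (c.tp + c.fp)            # precision
    npv = c.tn / (c.tn + c.fn)
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * tpr / (ppv + tpr)      # harmonic mean of PPV and TPR
    return {"TPR": tpr, "TNR": tnr, "PPV": ppv,
            "NPV": npv, "ACC": acc, "F1": f1}


# Hypothetical counts for illustration only (roughly 10% positives).
example = ConfusionCounts(tp=60, fp=90, tn=810, fn=40)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a positive class of about 10%, a model can reach high NPV and ACC while PPV stays modest, which is why PPV and F1 are worth reading alongside accuracy.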

RESULTS

Among 4527 patients, 445 (9.8%) experienced ARF. Ca, ALB, GLR, WBC, AG and BUN were included in the prediction model as features for predicting ARF. The AUC of XGBoost was 0.86 (95% CI 0.84-0.88) in the training cohort and 0.87 (95% CI 0.84-0.90) in the validation cohort. In the training cohort, XGBoost achieved a TPR of 0.662, a TNR of 0.884, a PPV of 0.380, an NPV of 0.960, an ACC of 0.862 and an F1 score of 0.483. In the validation cohort, XGBoost achieved a TPR of 0.620, a TNR of 0.895, a PPV of 0.399, an NPV of 0.955, an ACC of 0.867 and an F1 score of 0.486.
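As a quick consistency check, the F1 score is the harmonic mean of PPV and TPR, so it can be recomputed from the reported values. The sketch below uses only figures stated in the abstract. The helper name `f1_from_ppv_tpr` and the dictionary layout are illustrative assumptions.

```python
def f1_from_ppv_tpr(ppv: float, tpr: float) -> float:
    """F1 score as the harmonic mean of PPV (precision) and TPR (recall)."""
    return 2 * ppv * tpr / (ppv + tpr)


# Values reported in the abstract for the XGBoost model.
reported = {
    "training": {"PPV": 0.380, "TPR": 0.662, "F1": 0.483},
    "validation": {"PPV": 0.399, "TPR": 0.620, "F1": 0.486},
}

# Recompute F1 for each cohort and round to the reported precision.
recomputed = {
    cohort: round(f1_from_ppv_tpr(v["PPV"], v["TPR"]), 3)
    for cohort, v in reported.items()
}

# ARF incidence from the reported counts, as a percentage.
arf_rate_percent = round(445 / 4527 * 100, 1)

for cohort, f1 in recomputed.items():
    print(f"{cohort}: recomputed F1 = {f1}, reported F1 = {reported[cohort]['F1']}")
print(f"ARF incidence: {arf_rate_percent}%")
```

Both recomputed F1 values agree with the reported ones to three decimal places, and the incidence matches the stated 9.8%.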

CONCLUSION

The XGBoost model demonstrates good discriminatory ability and can help clinicians estimate the probability that a patient with AP will develop ARF.
