Chao Hsiao-Yun, Wu Chin-Chieh, Singh Avichandra, Shedd Andrew, Wolfshohl Jon, Chou Eric H, Huang Yhu-Chering, Chen Kuan-Fu
Department of Emergency Medicine, Linkou Chang Gung Memorial Hospital, No. 5, Fu-Shin Street, Gueishan Village, Taoyuan 333423, Taiwan.
Clinical Informatics and Medical Statistics Research Center, Chang Gung University, Taoyuan 33302, Taiwan.
Biomedicines. 2022 Mar 29;10(4):802. doi: 10.3390/biomedicines10040802.
Early recognition of sepsis and the prediction of mortality in patients with infection are important. This multi-center, emergency department (ED)-based study aimed to develop and validate a 28-day mortality prediction model for patients with infection using various machine learning (ML) algorithms.
Patients with acute infection requiring intravenous antibiotic treatment during the first 24 h of admission were prospectively recruited. Patient demographics, comorbidities, clinical signs and symptoms, laboratory test data, selected sepsis-related novel biomarkers, and 28-day mortality were collected and divided into training (70%) and testing (30%) datasets. Logistic regression and seven ML algorithms were used to develop the prediction models. The area under the receiver operating characteristic curve (AUROC) was used to compare different models.
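The split-and-evaluate step described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `stratified_split` and `auroc`, the stratification by outcome, and the toy labels are all assumptions made for illustration, and only the Python standard library is used.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx), keeping the outcome rate similar in both parts.

    Stratifying by outcome is an assumption; the abstract only states a 70/30 split.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)

def auroc(labels, scores):
    """AUROC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy cohort: 100 patients with 8 deaths (1 = died within 28 days).
toy_labels = [1] * 8 + [0] * 92
train_idx, test_idx = stratified_split(toy_labels)
```

Any model's predicted risks on the test indices can then be passed to `auroc` together with the observed outcomes, which puts logistic regression and the ML models on the same scale for comparison.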
A total of 555 patients were recruited with a full panel of biomarker tests. Among them, 18% fulfilled Sepsis-3 criteria, with a 28-day mortality rate of 8%. The wrapper algorithm selected 30 features, including disease severity scores, biochemical parameters, conventional biomarkers, and a few sepsis-related biomarkers. Random forest outperformed other ML models (AUROC: 0.96; 95% confidence interval [CI]: 0.93-0.98) and SOFA and early warning scores (AUROC: 0.64-0.84) in the prediction of 28-day mortality in patients with infection. Additionally, random forest remained the best-performing model, with an AUROC of 0.95 (95% CI: 0.91-0.98, p = 0.725) after removing five sepsis-related novel biomarkers.
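The abstract does not say how the 95% confidence intervals for the AUROC were obtained. One common option is a percentile bootstrap over the test set, sketched below under that assumption; `bootstrap_auroc_ci`, its parameters, and the toy data are hypothetical and are not taken from the study.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC; labels must be coded 0/1."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy test set: 5 deaths among 30 patients, with made-up risk scores.
toy_labels = [1] * 5 + [0] * 25
toy_scores = [0.9, 0.8, 0.7, 0.35, 0.6] + [0.05 + 0.02 * i for i in range(25)]
ci_low, ci_high = bootstrap_auroc_ci(toy_labels, toy_scores, n_boot=500)
```

With few deaths in the test set, as in a cohort with single-digit mortality, bootstrap intervals of this kind can be wide, which is worth keeping in mind when comparing models whose intervals overlap.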
Our results demonstrated that ML models predict 28-day mortality more accurately than the logistic regression model and are better able to handle multi-dimensional data.