Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Toos Institute of Higher Education, Mashhad, Iran.
Sci Rep. 2024 Feb 10;14(1):3406. doi: 10.1038/s41598-024-54038-4.
This study addresses the challenges of emergency department (ED) overcrowding and the need for efficient risk stratification tools that identify high-risk patients for early intervention. Several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate the severity of patient illness; this study compares the predictive performance of ensemble learning (EL) models with that of LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with Emergency Severity Index levels 1 to 3. EL models using Bagging, AdaBoost, random forests (RF), Stacking, and extreme gradient boosting (XGB) algorithms were constructed, along with an LR model. ED visits were randomly divided into training (80%) and validation (20%) sets. After the models were trained using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and the Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with an overall in-hospital mortality of approximately 19%. In the training and validation groups, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI 0.802-0.875) and an AUCPR of 0.64, comparable in discrimination to LR (AUROC 0.826, CI 0.787-0.864; AUCPR 0.61).
XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and MCC (0.48). On the unbalanced dataset, RF was the most accurate model, with the lowest BS (0.128). Although all studied models overestimated mortality risk and showed insufficient calibration (P > 0.05), Stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
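For readers less familiar with the metrics named above, the following is a minimal sketch of how the Brier score, AUROC, and MCC can be computed from predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy patients are all assumptions made for illustration and do not come from the study data.

```python
import math

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the 2x2 confusion matrix."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy data: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.4]

# Assumed decision threshold of 0.5 for the class-label metrics.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

bs_value = brier_score(y_true, y_prob)
auroc_value = auroc(y_true, y_prob)
mcc_value = mcc(y_true, y_pred)
print(f"BS={bs_value:.3f} AUROC={auroc_value:.3f} MCC={mcc_value:.3f}")
```

A lower Brier score indicates better-calibrated and sharper probabilities, whereas AUROC depends only on how patients are ranked. This is why one model can lead on discrimination while another leads on BS, as reported above for Bagging and RF.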