De Hond Anne, Raven Wouter, Schinkelshoek Laurens, Gaakeer Menno, Ter Avest Ewoud, Sir Ozcan, Lameijer Heleen, Hessels Roger Apa, Reijnen Resi, De Jonge Evert, Steyerberg Ewout, Nickel Christian H, De Groot Bas
Department of Information Technology and Digital Innovation, Leiden University Medical Centre, Albinusdreef 2, 2300 RC, Leiden, the Netherlands; Clinical AI Implementation and Research Lab, Leiden University Medical Centre, Albinusdreef 2, 2300 RC, Leiden, the Netherlands; Department of Biomedical Data Sciences, Leiden University Medical Centre, Albinusdreef 2, 2300 RC, Leiden, the Netherlands.
Department of Emergency Medicine, Leiden University Medical Centre, Albinusdreef 2, 2300 RC, Leiden, the Netherlands.
Int J Med Inform. 2021 Aug;152:104496. doi: 10.1016/j.ijmedinf.2021.104496. Epub 2021 May 15.
Early identification of emergency department (ED) patients who need hospitalization is essential for quality of care and patient safety. We aimed to compare machine learning (ML) models with conventional regression techniques for predicting the hospitalization of ED patients at three points in time after ED registration.
We analyzed consecutive ED patients of three hospitals using the Netherlands Emergency Department Evaluation Database (NEED). We developed prediction models for hospitalization using an increasing amount of data, as available at triage, at ∼30 min (including vital signs) and at ∼2 h (including laboratory tests) after ED registration. The models were built with ML (random forest, gradient boosted decision trees, deep neural networks) and with multivariable logistic regression analysis (including spline transformations for continuous predictors). Demographics, urgency, presenting complaints, disease severity, and proxies for comorbidity and complexity were used as covariates. We compared performance using the area under the ROC curve (AUC) in independent validation sets from each hospital.
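The performance measure named above, the area under the ROC curve, can be illustrated with a minimal sketch. It uses the rank-sum (Mann-Whitney) formulation, in which the AUC equals the probability that a randomly chosen hospitalized patient receives a higher predicted risk than a randomly chosen discharged patient, with ties counted as one half. The function name `auc` and the toy scores and labels are hypothetical and serve only as illustration; they are not NEED data, and this is not necessarily how the study computed its AUCs.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    scores: predicted risks (higher means more likely hospitalized)
    labels: 1 = hospitalized, 0 = not hospitalized
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy example (hypothetical predicted risks, not study data).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auc(toy_scores, toy_labels))  # 0.75
```

In the toy example, three of the four hospitalized/non-hospitalized pairs are ordered correctly, which gives an AUC of 0.75. Applying the same function to each model's predictions on a shared validation set is one simple way to put models side by side.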
We included 172,104 ED patients, of whom 66,782 (39%) were hospitalized. The AUC of the multivariable logistic regression model was 0.82 (0.78-0.86) at triage, 0.84 (0.81-0.86) at ∼30 min and 0.83 (0.75-0.92) after ∼2 h. The best-performing ML model across the three time points was the gradient boosted decision trees model, with an AUC of 0.84 (0.77-0.88) at triage, 0.86 (0.82-0.89) at ∼30 min and 0.86 (0.74-0.93) after ∼2 h.
Our study showed that machine learning models had excellent predictive performance for hospital admission, similar to that of the logistic regression model. The 2-h model did not show a performance improvement over the 30-min model. After further validation, these prediction models could support management decisions by providing real-time feedback to medical personnel.