Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA.
BerbeeWalsh Department of Emergency Medicine, University of Wisconsin-Madison, Madison, Wisconsin, USA.
J Am Med Inform Assoc. 2023 Jan 18;30(2):292-300. doi: 10.1093/jamia/ocac214.
To develop a machine learning framework to forecast emergency department (ED) crowding and to evaluate model performance under spatial and temporal data drift.
We obtained 4 datasets, each identified by location (1: large academic hospital; 2: rural hospital) and time period (pre-coronavirus disease [COVID]: January 1, 2019-February 1, 2020; COVID-era: May 15, 2020-February 1, 2021). Our primary target was a binary outcome equal to 1 if the number of patients with acute respiratory illness who had been boarding in the ED for more than 4 h was above a prescribed historical percentile. We trained a random forest and used the area under the curve (AUC) to evaluate out-of-sample performance in 2 experiments: (1) we evaluated the impact of sudden temporal drift by training models on pre-COVID data and testing them on COVID-era data; and (2) we evaluated the impact of spatial drift by testing models trained at location 1 on data from location 2, and vice versa.
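As a rough illustration of the target definition and the evaluation metric described above, the following minimal sketch builds a binary crowding label from a historical percentile and scores a set of predictions with a rank-based AUC. The boarding counts, the 75th-percentile cutoff, and the predicted scores are all hypothetical (the abstract does not state the percentile used), and the scores merely stand in for the output of a trained model such as a random forest.

```python
from statistics import quantiles


def percentile_threshold(history, pct):
    """Return the pct-th percentile (1-99) of historical boarding counts."""
    # quantiles(n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(history, n=100, method="inclusive")
    return cuts[pct - 1]


def crowding_labels(counts, threshold):
    """Label an interval 1 when its boarding count exceeds the threshold."""
    return [1 if c > threshold else 0 for c in counts]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC requires at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical historical counts of patients boarding for more than 4 h.
history = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
threshold = percentile_threshold(history, 75)  # assumed cutoff, not from the paper

# Hypothetical out-of-sample counts and model scores for the same intervals.
test_counts = [1, 4, 2, 6, 3, 5]
test_scores = [0.2, 0.7, 0.4, 0.9, 0.6, 0.5]

labels = crowding_labels(test_counts, threshold)
print("threshold:", threshold)
print("labels:", labels)
print("AUC:", round(auc(labels, test_scores), 3))
```

Under this setup, the temporal and spatial drift experiments differ only in which dataset supplies `history` and the training data and which supplies `test_counts`; the metric itself is unchanged.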
The baseline AUC values for ED boarding ranged from 0.54 (pre-COVID at location 2) to 0.81 (COVID-era at location 1). Models trained with pre-COVID data performed similarly to COVID-era models (0.82 vs 0.78 at location 1). Models that were transferred from location 2 to location 1 performed worse than models trained at location 1 (0.51 vs 0.78).
Our results demonstrate that ED boarding is a predictable metric for ED crowding, that models were not significantly impacted by temporal data drift, and that any attempt at implementation must consider spatial data drift.