Centre for Pattern Recognition and Data Analytics, Deakin University, Geelong Waurn Ponds, Australia.
JMIR Med Inform. 2016 Jul 21;4(3):e25. doi: 10.2196/medinform.5650.
Our study investigates different models to forecast the total number of next-day discharges from an open ward for which no real-time clinical data are available.
We compared 5 popular regression algorithms for modeling total next-day discharges: (1) autoregressive integrated moving average (ARIMA), (2) autoregressive moving average with exogenous variables (ARMAX), (3) k-nearest neighbor regression, (4) random forest regression, and (5) support vector regression. The ARIMA model relied on discharges from the past 3 months, whereas nearest neighbor forecasting used the median of similar past discharges to estimate next-day discharges. The ARMAX model additionally used the day of the week and the number of patients currently in the ward as exogenous variables. For the random forest and support vector regression models, we designed a predictor set of 20 patient-level features and 88 ward-level features.
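The nearest neighbor idea and a moving average baseline can be illustrated with a minimal sketch. The abstract does not specify how "similar discharges" were matched, so the window length, the Euclidean distance, the value of k, and the function names below are all assumptions made for illustration, not the study's implementation.

```python
from statistics import median


def moving_average_forecast(history, window=7):
    """Baseline: forecast tomorrow as the mean of the last `window` days.

    The window length is an assumption; the abstract does not state it.
    """
    if len(history) < window:
        raise ValueError("history is shorter than the averaging window")
    return sum(history[-window:]) / window


def knn_median_forecast(history, window=7, k=5):
    """Forecast tomorrow as the median of what followed the k most similar
    past windows of daily discharge counts.

    Similarity here is Euclidean distance between windows (an assumption).
    """
    if len(history) <= window:
        raise ValueError("history must be longer than the matching window")
    query = history[-window:]
    candidates = []
    # Consider only past windows that are followed by a known next-day value.
    for start in range(len(history) - window):
        past = history[start:start + window]
        distance = sum((p - q) ** 2 for p, q in zip(past, query)) ** 0.5
        candidates.append((distance, history[start + window]))
    # Keep the k closest windows and take the median of their next-day values.
    candidates.sort(key=lambda item: item[0])
    return median(value for _, value in candidates[:k])
```

For example, `knn_median_forecast(daily_discharges, window=7, k=5)` would match the most recent week against every earlier week in a hypothetical list `daily_discharges` and return the median of the days that followed the five closest matches.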
Our data consisted of 12,141 patient visits over 1826 days. Forecasting quality was measured using mean forecast error, mean absolute error, symmetric mean absolute percentage error, and root mean square error. Evaluated on all days in 2014, all 5 models outperformed a moving average prediction model, with random forest achieving a 22.7% improvement in mean absolute error.
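The four error measures named above can be computed as in the following sketch. The abstract does not give the formulas, so common textbook definitions are assumed here. In particular, the sign convention for mean forecast error (actual minus forecast) and the form of symmetric mean absolute percentage error (absolute error divided by the mean of absolute actual and forecast values) vary between sources. The helper `relative_improvement` assumes that "improvement" means the percentage reduction in error relative to the baseline.

```python
import math


def forecast_errors(actual, forecast):
    """Return MFE, MAE, sMAPE (in percent), and RMSE for paired daily values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal length")
    n = len(actual)
    diffs = [a - f for a, f in zip(actual, forecast)]
    # Mean forecast error keeps the sign, so it reveals systematic bias.
    mfe = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Assumed sMAPE form; a day with zero actual and zero forecast counts as 0.
    smape_terms = [
        0.0 if a == 0 and f == 0 else abs(a - f) / ((abs(a) + abs(f)) / 2)
        for a, f in zip(actual, forecast)
    ]
    smape = 100 * sum(smape_terms) / n
    return {"mfe": mfe, "mae": mae, "smape": smape, "rmse": rmse}


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to a baseline (assumed meaning)."""
    return 100 * (baseline_error - model_error) / baseline_error
```

Under these assumptions, the reported figure would correspond to `relative_improvement(mae_moving_average, mae_random_forest)`, where both arguments are hypothetical names for mean absolute errors computed over the same test days.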
In the absence of clinical information, our study recommends using patient-level and ward-level data in predicting next-day discharges. Random forest and support vector regression models are able to use all available features from such data, resulting in superior performance over traditional autoregressive methods. An intelligent estimate of available beds in wards plays a crucial role in relieving access block in emergency departments.