Chowell G, Luo R, Sun K, Roosa K, Tariq A, Viboud C
Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA; Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, MD, USA.
Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA.
Epidemics. 2019 Dec 21;30:100379. doi: 10.1016/j.epidem.2019.100379.
Forecasting the trajectory of social dynamic processes, such as the spread of infectious diseases, poses significant challenges that call for methods that account for both data and model uncertainty. Here we introduce an ensemble model for sequential forecasting that weights a set of plausible models and uses a frequentist computational bootstrap approach to evaluate its uncertainty. We demonstrate the feasibility of our approach using simple dynamic differential-equation models and the outbreak trajectories of the Ebola Forecasting Challenge scenarios. Specifically, we generate sequential short-term forecasts of epidemic outbreaks by combining two phenomenological models that incorporate flexible epidemic growth scaling: the Generalized-Growth Model (GGM) and the Generalized Logistic Model (GLM). We use the root-mean-square error (RMSE) of each model's fit during the calibration period to weight its contribution to the ensemble model, and we evaluate forecasting performance using the RMSE of the forecasts. For a given forecasting horizon (1-4 weeks), we report the performance of each model as the percentage of times it outperforms the other models. The overall mean RMSE performance of the GLM and of the GGM-GLM ensemble model exceeded that of the participant models of the Ebola Forecasting Challenge. We also found that the ensemble model produced the most accurate forecast more often than either the GGM or the GLM, but its performance varied across forecasting horizons. For instance, across all of the Ebola Challenge scenarios, the ensemble model outperformed the other models at horizons of 2 and 3 weeks, while the GLM outperformed the other models at horizons of 1 and 4 weeks.
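The weighting idea described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used forms of the two models, dC/dt = r C^p for the GGM and dC/dt = r C^p (1 - C/K) for the GLM, and it assumes weights proportional to the inverse of each model's calibration RMSE; the abstract does not state the exact weighting formula, so this choice is an assumption. All function names and the Euler integration scheme are hypothetical, and the sketch omits parameter fitting and the bootstrap uncertainty step.

```python
import math


def ggm_rate(c, r, p):
    """Generalized-growth model (assumed form): dC/dt = r * C**p."""
    return r * c ** p


def glm_rate(c, r, p, k):
    """Generalized logistic model (assumed form): dC/dt = r * C**p * (1 - C/K)."""
    return r * c ** p * (1.0 - c / k)


def simulate(rate, c0, n_weeks, substeps=100):
    """Forward-Euler solution of dC/dt = rate(C), reported at weekly time points."""
    dt = 1.0 / substeps
    c = c0
    trajectory = [c0]
    for _ in range(n_weeks):
        for _ in range(substeps):
            c += dt * rate(c)
        trajectory.append(c)
    return trajectory


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def inverse_rmse_weights(errors):
    """Assumed weighting rule: w_i proportional to 1 / RMSE_i, normalized to sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("each calibration RMSE must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble(forecasts, weights):
    """Weighted average of the model forecasts at each time point."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]
```

Under these assumptions, a model whose calibration RMSE is one third of the other's receives a weight of 0.75 against 0.25, so the better-fitting model dominates the ensemble forecast without fully excluding the other.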