Brooks Logan C, Farrow David C, Hyun Sangwon, Tibshirani Ryan J, Rosenfeld Roni
School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America.
Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America.
PLoS Comput Biol. 2015 Aug 28;11(8):e1004382. doi: 10.1371/journal.pcbi.1004382. eCollection 2015 Aug.
Seasonal influenza epidemics cause consistent, considerable, widespread loss annually in terms of economic burden, morbidity, and mortality. With access to accurate and reliable forecasts of a current or upcoming influenza epidemic's behavior, policy makers can design and implement more effective countermeasures. This past year, the Centers for Disease Control and Prevention hosted the "Predict the Influenza Season Challenge", with the task of predicting key epidemiological measures for the 2013-2014 U.S. influenza season with the help of digital surveillance data. We developed a semiparametric Empirical Bayes framework for in-season forecasts of epidemics, and applied it to predict the weekly percentage of outpatient doctor visits for influenza-like illness, as well as the season onset, duration, peak time, and peak height, with and without using Google Flu Trends data. Previous work on epidemic modeling has focused on developing mechanistic models of disease behavior and applying time series tools to explain historical data. However, tailoring these models to certain types of surveillance data can be challenging, and overly complex models with many parameters can compromise forecasting ability. Our approach instead generates candidate epidemic curves for the season of interest from modified versions of data from previous seasons, allowing for reasonable variations in the timing, pace, and intensity of the seasonal epidemics, as well as noise in observations. Since the framework does not make strict domain-specific assumptions, it can easily be applied to some other diseases with seasonal epidemics. This method produces a complete posterior distribution over epidemic curves, rather than, for example, solely point predictions of forecasting targets. We report prospective influenza-like-illness forecasts made for the 2013-2014 U.S. influenza season, and compare the framework's cross-validated prediction error on historical data to that of a variety of simpler baseline predictors.
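The general idea of building candidate curves from transformed past seasons and weighting them by agreement with the current season's observations can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, the uniform ranges for shift, pace, and scale, the Gaussian noise model, and the use of simple importance weighting are all assumptions made for illustration only.

```python
import math
import random


def transform(curve, shift, pace, scale):
    """Shift a past curve in time, stretch its pace, and scale its height."""
    n = len(curve)
    out = []
    for t in range(n):
        # Map week t of the new curve back to a fractional week of the old one.
        s = (t - shift) / pace
        s = min(max(s, 0.0), n - 1.0)  # clamp to the available weeks
        lo = int(math.floor(s))
        hi = min(lo + 1, n - 1)
        frac = s - lo
        # Linear interpolation between neighbouring weeks, then height scaling.
        out.append(scale * ((1.0 - frac) * curve[lo] + frac * curve[hi]))
    return out


def log_likelihood(candidate, observed, noise_sd):
    """Gaussian log-likelihood (up to a constant) of the weeks seen so far."""
    return sum(-0.5 * ((obs - cand) / noise_sd) ** 2
               for obs, cand in zip(observed, candidate))


def forecast(past_curves, observed, n_samples=2000, noise_sd=0.3, seed=0):
    """Sample transformed past curves and weight them by fit to `observed`."""
    rng = random.Random(seed)
    candidates, log_weights = [], []
    for _ in range(n_samples):
        base = rng.choice(past_curves)
        cand = transform(base,
                         shift=rng.uniform(-3.0, 3.0),   # assumed range, weeks
                         pace=rng.uniform(0.75, 1.25),   # assumed range
                         scale=rng.uniform(0.5, 1.5))    # assumed range
        candidates.append(cand)
        log_weights.append(log_likelihood(cand, observed, noise_sd))
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return candidates, [w / total for w in weights]


def posterior_mean_peak(candidates, weights):
    """Weighted mean of peak height and peak week over all candidates."""
    height = sum(w * max(c) for c, w in zip(candidates, weights))
    week = sum(w * c.index(max(c)) for c, w in zip(candidates, weights))
    return height, week


# Toy, made-up weekly values standing in for past seasons and a partial season.
past_curves = [
    [1.0, 1.2, 2.0, 3.5, 5.0, 4.0, 2.5, 1.5, 1.1, 1.0],
    [1.1, 1.3, 1.8, 2.6, 3.4, 4.2, 3.0, 2.0, 1.4, 1.1],
]
observed = [1.0, 1.3, 2.1, 3.2]

candidates, weights = forecast(past_curves, observed)
peak_height, peak_week = posterior_mean_peak(candidates, weights)
print(f"posterior mean peak height: {peak_height:.2f}, peak week: {peak_week:.1f}")
```

Because every candidate carries a weight, the same weighted set can be summarized for any target (onset, duration, peak time, peak height) rather than yielding only a single point prediction.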