Ray Evan L, Sakrejda Krzysztof, Lauer Stephen A, Johansson Michael A, Reich Nicholas G
Department of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA 01003, USA.
Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA 01075, USA.
Stat Med. 2017 Dec 30;36(30):4908-4929. doi: 10.1002/sim.7488. Epub 2017 Sep 14.
Creating statistical models that generate accurate predictions of infectious disease incidence is a challenging problem whose solution could benefit public health decision makers. We develop a new approach to this problem using kernel conditional density estimation (KCDE) and copulas. We obtain predictive distributions for incidence in individual weeks using KCDE and tie those distributions together into joint distributions using copulas. This strategy enables us to predict both the timing of the season's peak week and the incidence in that week. Our implementation of KCDE incorporates two novel kernel components: a periodic component that captures seasonality in disease incidence and a component that allows for a full parameterization of the bandwidth matrix with discrete variables. We demonstrate via simulation that a fully parameterized bandwidth matrix can be beneficial for estimating conditional densities. We apply the method to predicting dengue fever and influenza and compare it with a seasonal autoregressive integrated moving average model and HHH4, a previously published extension to the generalized linear model framework developed for infectious disease incidence. KCDE outperforms the baseline methods for predictions of dengue incidence in individual weeks. KCDE also offers more consistent performance than the baseline models for predictions of incidence in the peak week and is comparable to the baseline models on the other prediction targets. Using the periodic kernel function led to better predictions of incidence. Our approach and extensions of it could yield improved predictions for public health decision makers, particularly in diseases with heterogeneous seasonal dynamics such as dengue fever.
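To make the idea of KCDE with a periodic kernel component concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single lagged-incidence covariate, a diagonal bandwidth (the abstract describes a fully parameterized bandwidth matrix), Gaussian kernels for incidence, and the standard periodic kernel form exp(-2 sin^2(pi d / p) / h^2), which may differ from the parameterization used in the paper. All function names, the 52-week period, and the bandwidth values are hypothetical, and the copula step that ties weekly predictions into a joint distribution is not shown.

```python
import numpy as np


def periodic_kernel(t, t_train, period, bandwidth):
    """Standard periodic kernel (assumed form): weeks one full period apart get weight 1."""
    return np.exp(-2.0 * np.sin(np.pi * (t - t_train) / period) ** 2 / bandwidth ** 2)


def gaussian_kernel(x, x_train, bandwidth):
    """Unnormalized Gaussian kernel on a continuous covariate."""
    return np.exp(-0.5 * ((x - x_train) / bandwidth) ** 2)


def kcde_weights(t, x, t_train, x_train, period=52.0, h_t=0.5, h_x=1.0):
    """Weight each training case by its similarity to the conditioning point (t, x).

    The product of kernels corresponds to a diagonal bandwidth, a simplification
    of the full bandwidth matrix described in the abstract.
    """
    w = periodic_kernel(t, t_train, period, h_t) * gaussian_kernel(x, x_train, h_x)
    return w / w.sum()


def kcde_density(y, weights, y_train, h_y=1.0):
    """Conditional density estimate at y: a weighted mixture of Gaussians centred on y_train."""
    comps = np.exp(-0.5 * ((y - y_train) / h_y) ** 2) / (h_y * np.sqrt(2.0 * np.pi))
    return float(np.sum(weights * comps))


if __name__ == "__main__":
    # Synthetic two-season weekly series (illustrative data only, not from the paper).
    rng = np.random.default_rng(0)
    weeks = np.arange(104.0)
    incidence = 10.0 + 5.0 * np.sin(2.0 * np.pi * weeks / 52.0) + rng.normal(0.0, 0.5, 104)

    # Training pairs: (week, incidence this week) -> incidence next week.
    t_train, x_train, y_train = weeks[:-1], incidence[:-1], incidence[1:]

    # Predictive density for next week's incidence, given week 10 and current incidence 12.
    w = kcde_weights(10.0, 12.0, t_train, x_train)
    grid = np.linspace(0.0, 25.0, 6)
    for y in grid:
        print(f"y = {y:5.1f}  density = {kcde_density(y, w, y_train):.4f}")
```

In this sketch the periodic component gives high weight to training weeks that fall at a similar point in the season, regardless of the year, which is the role the abstract attributes to the periodic kernel. In practice the bandwidths would be estimated from data rather than fixed by hand.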