Huang Rongjie, McMahan Christopher, Herrin Brian, McLain Alexander, Cai Bo, Self Stella
Department of Epidemiology and Biostatistics, University of South Carolina, South Carolina, USA.
School of Mathematical and Statistical Sciences, Clemson University, South Carolina, USA.
Infect Dis Model. 2024 Oct 4;10(1):189-200. doi: 10.1016/j.idm.2024.09.008. eCollection 2025 Mar.
Disease forecasting and surveillance often involve fitting models to a tremendous volume of historical testing data collected over space and time. Bayesian spatio-temporal regression models fit with Markov chain Monte Carlo (MCMC) methods are commonly used for such data. When the spatio-temporal support of the model is large, implementing an MCMC algorithm becomes a significant computational burden. This research proposes a computationally efficient gradient boosting algorithm for fitting a Bayesian spatio-temporal mixed effects binomial regression model. We demonstrate our method on a disease forecasting model and compare it to a computationally optimized MCMC approach. Both methods are used to produce monthly forecasts for Lyme disease, anaplasmosis, ehrlichiosis, and heartworm disease in domestic dogs for the contiguous United States. The data have a spatial support of 3108 counties and a temporal support of 108-138 months with 71-135 million test results. The proposed estimation approach is several orders of magnitude faster than the optimized MCMC algorithm, with a similar mean absolute prediction error.
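The abstract does not spell out the boosting algorithm, so the following is only a minimal sketch of the general idea: generic component-wise gradient boosting for a binomial logit model with fixed effects only. It is not the authors' method; it omits the spatio-temporal random effects and the Bayesian prior structure entirely. The function name `boost_binomial_logit`, its parameters, and the default step size are hypothetical choices made for illustration.

```python
import numpy as np


def boost_binomial_logit(X, y, n, n_iter=500, step=0.1):
    """Component-wise gradient boosting for a binomial logit model (sketch).

    X: (m, p) covariates; y: positive tests; n: total tests (all > 0).
    Returns a length p + 1 vector: intercept first, then the slopes.
    Assumes no covariate column is identically zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = np.asarray(n, dtype=float)

    # Column 0 is an intercept base learner; the rest are the covariates.
    Z = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Z.shape[1])

    # Weighted sum of squares of each column, with weights n (tests per row).
    weighted_ss = (n[:, None] * Z ** 2).sum(axis=0)

    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(Z @ beta)))
        # Per-test negative gradient of the binomial negative log-likelihood.
        resid = y / n - prob
        # Weighted least-squares slope of the gradient on each single column.
        slopes = (Z * (n * resid)[:, None]).sum(axis=0) / weighted_ss
        # Reduction in weighted squared error achieved by each base learner.
        gain = slopes ** 2 * weighted_ss
        # Update only the best-fitting component, shrunk by the step size.
        j = int(np.argmax(gain))
        beta[j] += step * slopes[j]

    return beta
```

Each iteration costs one pass over the data and updates a single coefficient, which hints at why boosting-style fitting can be much cheaper than MCMC sampling on tens of millions of test results. In practice the number of iterations would normally be chosen by some stopping rule rather than fixed in advance.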