Bayer Damon, Goldstein Isaac H, Fintzi Jonathan, Lumbard Keith, Ricotta Emily, Warner Sarah, Busch Lindsay M, Strich Jeffrey R, Chertow Daniel S, Parker Daniel M, Boden-Albala Bernadette, Dratch Alissa, Chhuon Richard, Quick Nichole, Zahn Matthew, Minin Volodymyr M
Department of Statistics, University of California, Irvine, California, U.S.A.
Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, Maryland, U.S.A.
ArXiv. 2023 Mar 11:arXiv:2009.02654v3.
Mechanistic models fit to streaming surveillance data are critical to understanding the transmission dynamics of an outbreak as it unfolds in real time. However, transmission model parameter estimation can be imprecise, and sometimes even impossible, because surveillance data are noisy and not informative about all aspects of the mechanistic model. To partially overcome this obstacle, Bayesian models have been proposed to integrate multiple surveillance data streams. We devised a modeling framework for integrating SARS-CoV-2 diagnostic test and mortality time series data, as well as seroprevalence data from cross-sectional studies, and tested the importance of individual data streams for both inference and forecasting. Importantly, our model for incidence data accounts for changes in the total number of tests performed. We model three quantities as time-varying and estimate their changes nonparametrically: the transmission rate, the infection-to-fatality ratio, and a parameter controlling a functional relationship between the true case incidence and the fraction of positive tests. We compare our base model against modified versions that do not use diagnostic test counts or seroprevalence data to demonstrate the utility of including these often unused data streams. We apply our Bayesian data integration method to COVID-19 surveillance data collected in Orange County, California between March 2020 and February 2021 and find that 32-72% of Orange County residents experienced SARS-CoV-2 infection by mid-January 2021. Despite this high number of infections, our results suggest that the abrupt end of the winter surge in January 2021 was due to both behavioral changes and a high level of accumulated natural immunity.
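As a rough illustration of the idea that the incidence model accounts for the number of tests performed, the following Python sketch simulates a simple deterministic compartmental model with a weekly varying transmission rate and converts latent incidence into an expected count of positive tests. Everything specific here is an assumption made for illustration and is not taken from the paper: the SEIR structure, the daily Euler steps, the parameter values, the function name `simulate`, and in particular the logit-linear link `positivity = expit(alpha + logit(incidence_fraction))`, which is only one hypothetical form of a "functional relationship between the true case incidence and the fraction of positive tests". No Bayesian inference is performed.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def simulate(betas, tests, alphas, pop=1_000_000.0, e0=100.0, i0=100.0,
             latent_days=3.0, infectious_days=7.0):
    """Forward-simulate weekly incidence and expected positive tests.

    betas  : weekly transmission rates (per day), time-varying
    tests  : weekly numbers of diagnostic tests performed
    alphas : weekly values of the (assumed) test-positivity parameter
    All default values are illustrative placeholders.
    """
    s, e, i, r = pop - e0 - i0, e0, i0, 0.0
    weeks = []
    for beta, n_tests, alpha in zip(betas, tests, alphas):
        new_cases = 0.0
        for _ in range(7):  # daily Euler steps within the week
            s_to_e = beta * s * i / pop
            e_to_i = e / latent_days
            i_to_r = i / infectious_days
            s -= s_to_e
            e += s_to_e - e_to_i
            i += e_to_i - i_to_r
            r += i_to_r
            new_cases += e_to_i
        # Fraction of the population newly infectious this week,
        # clamped away from 0 and 1 so that logit is defined.
        frac = min(max(new_cases / pop, 1e-12), 1.0 - 1e-12)
        # Assumed link: positivity odds are proportional to incidence odds.
        positivity = expit(alpha + logit(frac))
        weeks.append({
            "new_cases": new_cases,
            "positivity": positivity,
            "expected_positives": n_tests * positivity,
            "state": (s, e, i, r),
        })
    return weeks


# Same epidemic, two testing regimes: the expected positive count follows
# testing volume even though the latent incidence is identical.
low = simulate([0.4] * 10, [10_000] * 10, [1.0] * 10)
high = simulate([0.4] * 10, [20_000] * 10, [1.0] * 10)
for week, (a, b) in enumerate(zip(low, high), start=1):
    print(week, round(a["new_cases"]), round(a["expected_positives"]),
          round(b["expected_positives"]))
```

Under this assumed link, doubling the number of tests doubles the expected number of positives while leaving the underlying incidence unchanged, which is why raw case counts alone can mislead when testing volume changes over time.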