Groendyke Chris, Welch David, Hunter David R
Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
Biometrics. 2012 Sep;68(3):755-65. doi: 10.1111/j.1541-0420.2012.01748.x. Epub 2012 Feb 24.
In this article, we demonstrate a statistical method for fitting the parameters of a sophisticated network and epidemic model to disease data. The pattern of contacts between hosts is described by a class of dyadic independence exponential-family random graph models (ERGMs), whereas the transmission process that runs over the network is modeled as a stochastic susceptible-exposed-infectious-removed (SEIR) epidemic. We fit these models to highly detailed data from the 1861 measles outbreak in Hagelloch, Germany. The network models include parameters for all recorded host covariates, including age, sex, household and classroom membership, and household location, whereas the SEIR epidemic model has exponentially distributed transmission times with gamma-distributed latent and infective periods. This approach allows us to make meaningful statements about the structure of the population, separately from the transmission process, and to provide estimates of various biological quantities of interest, such as the effective reproductive number, R. Using reversible jump Markov chain Monte Carlo, we produce samples from the joint posterior distribution of all the parameters of this model (the network, transmission tree, network parameters, and SEIR parameters) and perform Bayesian model selection to find the best-fitting network model. We compare our results with those of previous analyses and show that the ERGM network model fits the data better than a Bernoulli network model previously used. We also provide a software package, written in R, that performs this type of analysis.
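To make the model structure concrete, the following is a minimal forward-simulation sketch in Python of an SEIR epidemic on a contact network, with exponentially distributed transmission times and gamma-distributed latent and infective periods as described above. It is not the inference method of the article (reversible jump MCMC) and is not the accompanying R package. It assumes the simplest dyadic independence network, a Bernoulli graph with a single edge probability and no covariates. The function name, argument names, and parameter values are hypothetical and chosen only for illustration.

```python
import heapq
import random


def simulate_network_seir(n, edge_prob, beta,
                          latent_shape, latent_scale,
                          infective_shape, infective_scale, seed=0):
    """Simulate one SEIR epidemic on a Bernoulli random graph (illustrative).

    Returns (times, infector): times maps each infected node to its
    (exposure, infectious, removal) times; infector maps each infected
    node to the node that infected it (-1 for the initial case).
    """
    rng = random.Random(seed)

    # Assumed network model: every pair of hosts is linked independently
    # with the same probability (no covariates, unlike the ERGMs above).
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_prob:
                neighbors[i].add(j)
                neighbors[j].add(i)

    times = {}
    infector = {}
    # Priority queue of candidate exposures: (time, node, source).
    queue = [(0.0, 0, -1)]  # node 0 is the initial case, exposed at time 0
    while queue:
        t_exposed, node, source = heapq.heappop(queue)
        if node in times:
            continue  # this host was already infected by an earlier contact
        # Gamma-distributed latent and infective periods.
        t_infectious = t_exposed + rng.gammavariate(latent_shape, latent_scale)
        t_removed = t_infectious + rng.gammavariate(infective_shape,
                                                    infective_scale)
        times[node] = (t_exposed, t_infectious, t_removed)
        infector[node] = source
        for other in neighbors[node]:
            if other in times:
                continue
            # Exponential transmission time with rate beta along each edge;
            # transmission succeeds only if it occurs before removal.
            delay = rng.expovariate(beta)
            if t_infectious + delay < t_removed:
                heapq.heappush(queue, (t_infectious + delay, other, node))
    return times, infector


if __name__ == "__main__":
    times, infector = simulate_network_seir(
        n=50, edge_prob=0.1, beta=0.3,
        latent_shape=2.0, latent_scale=4.0,
        infective_shape=2.0, infective_scale=3.0, seed=1)
    print(f"{len(times)} of 50 hosts were infected")
```

The `infector` mapping returned here corresponds to what the abstract calls the transmission tree. In the article that tree and the network are unobserved quantities sampled jointly with the model parameters, whereas this sketch only generates them forward from fixed parameter values.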