Boskova Veronika, Bonhoeffer Sebastian, Stadler Tanja
Department of Biosystems Science & Engineering (D-BSSE), Eidgenössische Technische Hochschule (ETH) Zürich, Basel, Switzerland.
Institute of Integrative Biology, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland.
PLoS Comput Biol. 2014 Nov 6;10(11):e1003913. doi: 10.1371/journal.pcbi.1003913. eCollection 2014 Nov.
Quantifying epidemiological dynamics is crucial for understanding and forecasting the spread of an epidemic. The coalescent and the birth-death model are used interchangeably to infer epidemiological parameters from the genealogical relationships of the pathogen population under study, which in turn are inferred from the pathogen genetic sequencing data. To compare the performance of these widely applied models, we performed a simulation study. We simulated phylogenetic trees under the constant-rate birth-death model and under the coalescent model with a deterministic exponentially growing infected population. For each tree, we re-estimated the epidemiological parameters using both a birth-death-based and a coalescent-based method, each implemented as an MCMC procedure in BEAST v2.0. In our analyses that estimate the growth rate of an epidemic based on simulated birth-death trees, the point estimates, such as the maximum a posteriori/maximum likelihood estimates, are not very different. However, the estimates of uncertainty are very different. The birth-death model had a higher coverage than the coalescent model, i.e., its highest posterior density (HPD) interval contained the true value more often (error rates of 2-13% for the birth-death model vs. 31-75% for the coalescent). The coverage of the coalescent decreases with decreasing basic reproductive ratio and with increasing sampling probability of infected individuals. We hypothesize that the biases in the coalescent are due to the assumption of deterministic rather than stochastic population size changes. Both methods performed reasonably well when analyzing trees simulated under the coalescent. The methods can also identify other key epidemiological parameters as long as one of the parameters is fixed to its true value.
In summary, when using genetic data to estimate epidemic dynamics, our results suggest that the birth-death method will be less sensitive to population fluctuations of early outbreaks than the coalescent method, which assumes a deterministic exponentially growing infected population.
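The contrast between deterministic and stochastic population size changes, which the abstract puts forward as a hypothesis for the coalescent's lower coverage, can be illustrated with a minimal sketch. This is not the authors' simulation code and does not use BEAST. The function names, the Gillespie-style simulation, and the rates (birth 1.5, death 1.0, observation time 5) are all assumptions chosen for illustration. The sketch compares the deterministic curve N(t) = N0 exp((birth - death) t) with replicate runs of a constant-rate birth-death process started from a single infected individual.

```python
import math
import random


def deterministic_size(birth_rate, death_rate, t, n0=1):
    """Deterministic exponential growth: N(t) = n0 * exp((birth - death) * t)."""
    return n0 * math.exp((birth_rate - death_rate) * t)


def simulate_birth_death(birth_rate, death_rate, t_end, n0=1, rng=None):
    """Gillespie-style simulation of a constant-rate birth-death process.

    Returns the number of infected individuals at time t_end (0 if extinct).
    """
    if rng is None:
        rng = random.Random()
    n, t = n0, 0.0
    per_capita_rate = birth_rate + death_rate
    while n > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * per_capita_rate)
        if t > t_end:
            break
        # The event is a birth with probability birth / (birth + death).
        if rng.random() < birth_rate / per_capita_rate:
            n += 1
        else:
            n -= 1
    return n


def summarize(birth_rate, death_rate, t_end, replicates=1000, seed=1):
    """Compare the deterministic size with replicate stochastic outcomes."""
    rng = random.Random(seed)
    sizes = [
        simulate_birth_death(birth_rate, death_rate, t_end, rng=rng)
        for _ in range(replicates)
    ]
    surviving = [s for s in sizes if s > 0]
    return {
        "deterministic": deterministic_size(birth_rate, death_rate, t_end),
        "mean_all": sum(sizes) / len(sizes),
        "extinct_fraction": 1 - len(surviving) / len(sizes),
        "min_surviving": min(surviving) if surviving else 0,
        "max_surviving": max(surviving) if surviving else 0,
    }


if __name__ == "__main__":
    # Illustrative rates only; they are not taken from the study.
    for key, value in summarize(1.5, 1.0, 5.0).items():
        print(f"{key}: {value}")
```

Under these assumed rates, many replicates go extinct early and the surviving ones spread widely around the deterministic value. A model that treats the infected population as growing exactly exponentially ignores this kind of early-outbreak fluctuation.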