Department of Biology, University of Oregon, Eugene, OR, USA.
Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA.
Mol Biol Evol. 2021 Aug 23;38(9):4010-4024. doi: 10.1093/molbev/msab149.
Viral phylogenies provide crucial information on the spread of infectious diseases, and many studies fit mathematical models to phylogenetic data to estimate epidemiological parameters such as the effective reproduction ratio (Re) over time. Such phylodynamic inferences often complement or even substitute for conventional surveillance data, particularly when sampling is poor or delayed. It remains generally unknown, however, how robust phylodynamic epidemiological inferences are, especially when there is uncertainty regarding pathogen prevalence and sampling intensity. Here, we use recently developed mathematical techniques to fully characterize the information that can possibly be extracted from serially collected viral phylogenetic data, in the context of the commonly used birth-death-sampling model. We show that for any candidate epidemiological scenario, there exists a myriad of alternative, markedly different, and yet plausible "congruent" scenarios that cannot be distinguished using phylogenetic data alone, no matter how large the data set. In the absence of strong constraints or rate priors across the entire study period, neither maximum-likelihood fitting nor Bayesian inference can reliably reconstruct the true epidemiological dynamics from phylogenetic data alone; rather, estimators can only converge to the "congruence class" of the true dynamics. We propose concrete and feasible strategies for making more robust epidemiological inferences from viral phylogenetic data.
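For orientation, the quantity being estimated can be written out explicitly. The sketch below assumes the usual time-dependent parameterization of the birth-death-sampling model, with a transmission rate λ(t), a removal rate μ(t) (recovery or death without sampling), and a sampling rate ψ(t), and further assumes that sampled lineages are removed from the infectious pool. These symbols do not appear in the abstract and are introduced here only for illustration.

```latex
% Assumed notation (not given in the abstract):
%   \lambda(t) = transmission (birth) rate
%   \mu(t)     = removal rate without sampling (death)
%   \psi(t)    = sampling rate, with removal upon sampling
R_e(t) = \frac{\lambda(t)}{\mu(t) + \psi(t)}
```

Under these assumptions, R_e(t) > 1 corresponds to a growing number of infections and R_e(t) < 1 to a shrinking one. The abstract's claim is that markedly different combinations of these rates, and hence different R_e trajectories, can be equally consistent with the same phylogenetic data.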