Romero-Severson Ethan O, Bulla Ingo, Leitner Thomas
Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545.
Proc Natl Acad Sci U S A. 2016 Mar 8;113(10):2690-5. doi: 10.1073/pnas.1522930113. Epub 2016 Feb 22.
Although the use of phylogenetic trees in epidemiological investigations has become commonplace, their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critiques of phylogenetic reconstruction have claimed that the direction of transmission is difficult to infer and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and the presence or absence of unsampled intermediary links or common sources make very different predictions about expected phylogenetic relationships: (i) direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the HIV populations of both sampled individuals are monophyletic, a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology depends on the number of transmitted lineages, the sample size, the time of sampling relative to transmission, and how fast diversity increases after infection. Typically, 20 or more sequences per subject give robust results. We confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results.