Gill Mandev S, Tung Ho Lam Si, Baele Guy, Lemey Philippe, Suchard Marc A
Department of Statistics, Columbia University, New York, NY 10027, USA.
Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, Los Angeles, CA 90095, USA.
Syst Biol. 2017 May 1;66(3):299-319. doi: 10.1093/sysbio/syw093.
Understanding the processes that give rise to quantitative measurements associated with molecular sequence data remains an important issue in statistical phylogenetics. Examples of such measurements include geographic coordinates in the context of phylogeography and phenotypic traits in the context of comparative studies. A popular approach is to model the evolution of continuously varying traits as a Brownian diffusion process acting on a phylogenetic tree. However, standard Brownian diffusion is quite restrictive and may not accurately characterize certain trait evolutionary processes. Here, we relax one of the major restrictions of standard Brownian diffusion by incorporating a nontrivial estimable mean into the process. We introduce a relaxed directional random walk (RDRW) model for the evolution of multivariate continuously varying traits along a phylogenetic tree. Notably, the RDRW model accommodates branch-specific variation of directional trends while preserving model identifiability. Furthermore, our development of a computationally efficient dynamic programming approach to compute the data likelihood enables scaling of our method to large data sets frequently encountered in phylogenetic comparative studies and viral evolution. We implement the RDRW model in a Bayesian inference framework to simultaneously reconstruct the evolutionary histories of molecular sequence data and associated multivariate continuous trait data, and provide tools to visualize evolutionary reconstructions. We demonstrate the performance of our model on synthetic data, and we illustrate its utility in two viral examples. First, we examine the spatiotemporal spread of HIV-1 in central Africa and show that the RDRW model uncovers a clearer, more detailed picture of the dynamics of viral dispersal than standard Brownian diffusion. Second, we study antigenic evolution in the context of HIV-1 resistance to three broadly neutralizing antibodies. 
Our analysis reveals evidence of a continuous drift at the HIV-1 population level towards enhanced resistance to neutralization by the VRC01 monoclonal antibody over the course of the epidemic. [Brownian Motion; Diffusion Processes; Phylodynamics; Phylogenetics; Phylogeography; Trait Evolution.]
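The core idea of the abstract, Brownian diffusion with a nonzero mean that may differ from branch to branch, can be illustrated with a small forward simulation. This is only a sketch under simplifying assumptions and is not the authors' implementation: it uses the standard property that, under Brownian motion with drift, the trait at the end of a branch of length t is normally distributed around the parent trait plus drift times t, with variance proportional to t. The trait dimensions are assumed independent (a diagonal covariance), and the tree, the drift vectors, the variances, and the function names `simulate_branch` and `simulate_tree` are all hypothetical values chosen for illustration.

```python
import random


def simulate_branch(parent_trait, drift, variances, branch_length, rng):
    """Draw a child trait, one dimension at a time.

    Each dimension is Normal(parent + drift * t, variance * t), the
    increment distribution of Brownian motion with drift over time t.
    Dimensions are treated as independent (an assumption of this sketch).
    """
    return [
        rng.gauss(x + mu * branch_length, (var * branch_length) ** 0.5)
        for x, mu, var in zip(parent_trait, drift, variances)
    ]


def simulate_tree(tree, root_trait, variances, rng):
    """Simulate traits at every node of a rooted tree.

    `tree` maps a node name to a list of (child, branch_length, drift)
    tuples, so every branch carries its own drift vector.
    """
    traits = {"root": list(root_trait)}
    stack = ["root"]
    while stack:
        node = stack.pop()
        for child, length, drift in tree.get(node, []):
            traits[child] = simulate_branch(
                traits[node], drift, variances, length, rng
            )
            stack.append(child)
    return traits


# Hypothetical four-tip tree with a two-dimensional trait
# (for example, two spatial coordinates). All numbers are made up.
tree = {
    "root": [("n1", 1.0, (0.5, 0.0)), ("n2", 1.0, (-0.5, 0.2))],
    "n1": [("A", 2.0, (0.5, 0.0)), ("B", 2.0, (1.0, 0.0))],
    "n2": [("C", 2.0, (-0.5, 0.2)), ("D", 2.0, (0.0, 0.0))],
}

rng = random.Random(1)
traits = simulate_tree(tree, (0.0, 0.0), (0.1, 0.1), rng)
for name in ("A", "B", "C", "D"):
    print(name, [round(v, 3) for v in traits[name]])
```

Setting every drift vector to zero recovers standard Brownian diffusion, which is the restriction the abstract describes relaxing. Inference runs in the opposite direction from this simulation: the drifts, variances, and tree are estimated from the observed tip traits and sequences.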