Magee Daniel, Suchard Marc A, Scotch Matthew
Department of Biomedical Informatics, Arizona State University, Tempe, Arizona, United States of America.
Biodesign Center for Environmental Security, Arizona State University, Tempe, Arizona, United States of America.
PLoS Comput Biol. 2017 Feb 7;13(2):e1005389. doi: 10.1371/journal.pcbi.1005389. eCollection 2017 Feb.
Ancestral state reconstructions in Bayesian phylogeography of virus pandemics have been improved by utilizing a Bayesian stochastic search variable selection (BSSVS) framework. Recently, this framework has been extended to model the transition rate matrix between discrete states as a generalized linear model (GLM) of genetic, geographic, demographic, and environmental predictors relevant to the virus, with BSSVS used to estimate the posterior inclusion probability of each predictor. Although the latter appears to enhance the biological validity of ancestral state reconstruction, there has yet to be a comparison of phylogenies created by the two methods. In this paper, we compare these two methods, along with a primitive method that omits BSSVS, and highlight the differences in the phylogenies created by each. We test six coalescent priors and six random sequence samples of H3N2 influenza during the 2014-15 flu season in the U.S. We show that the GLMs yield significantly greater root state posterior probabilities than the two alternative methods under five of the six priors, and significantly greater Kullback-Leibler divergence values than the two alternative methods under all priors. Furthermore, the GLMs strongly implicate temperature and precipitation as driving forces of this flu season and nearly unanimously identify a single root state, which exhibits the most tropical climate during a typical flu season in the U.S. The GLM, however, appears to be highly susceptible to sampling bias compared with the other methods, which casts doubt on whether its reconstructions should be favored over those created by alternative methods. We report that a BSSVS approach with a Poisson prior demonstrates less bias toward sample size under certain conditions than the GLMs or primitive models, and we believe that the connection between reconstruction method and sampling bias warrants further investigation.
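As an illustration of the Kullback-Leibler divergence statistic mentioned above, the following minimal Python sketch computes the divergence of a root state posterior distribution from a uniform prior over the discrete states. The abstract does not spell out how the statistic was computed, so the uniform reference distribution, the function name `kl_from_uniform`, and the example probabilities are assumptions made here for illustration only; they are not the authors' code or results.

```python
import math


def kl_from_uniform(posterior):
    """Return KL(posterior || uniform) in nats over K discrete states.

    Assumption: the reference (prior) distribution is uniform, i.e. each
    of the K states has probability 1/K. The input is renormalized so
    small rounding errors in the posterior do not bias the result.
    """
    k = len(posterior)
    total = sum(posterior)
    if k == 0 or total <= 0:
        raise ValueError("posterior must contain positive probability mass")
    if any(p < 0 for p in posterior):
        raise ValueError("probabilities must be non-negative")

    kl = 0.0
    for p in posterior:
        p = p / total
        if p > 0:  # terms with p == 0 contribute nothing (0 * log 0 := 0)
            # p * log(p / (1/K)) == p * log(p * K)
            kl += p * math.log(p * k)
    return kl


if __name__ == "__main__":
    # Hypothetical root state posteriors over four locations (not from the paper).
    diffuse = [0.25, 0.25, 0.25, 0.25]   # no information gained over the prior
    peaked = [0.91, 0.03, 0.03, 0.03]    # one location strongly favored
    print(kl_from_uniform(diffuse))
    print(kl_from_uniform(peaked))
```

Under this assumption the divergence is zero when the posterior equals the uniform prior and reaches its maximum, log K, when all posterior mass sits on a single root state, so larger values indicate a reconstruction that departs more strongly from the prior.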