Department of Statistics, University of Washington, Seattle, Washington, U.S.A.
National School of Public Health, Oswaldo Cruz Foundation, Brazil.
PLoS Comput Biol. 2020 Oct 12;16(10):e1007774. doi: 10.1371/journal.pcbi.1007774. eCollection 2020 Oct.
Coalescent theory combined with statistical modeling allows us to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. When sequences are sampled serially through time and the distribution of the sampling times depends on the effective population size, explicit statistical modeling of sampling times improves population size estimation. Previous work assumed that the genealogy relating sampled sequences is known and modeled sampling times as an inhomogeneous Poisson process with log-intensity equal to a linear function of the log-transformed effective population size. We improve this approach in two ways. First, we extend the method to allow for joint Bayesian estimation of the genealogy, effective population size trajectory, and other model parameters. Next, we improve the sampling time model by incorporating additional sources of information in the form of time-varying covariates. We validate our new modeling framework using a simulation study and apply our new methodology to analyses of population dynamics of seasonal influenza and to the recent Ebola virus outbreak in West Africa.
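The sampling-time model described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it draws sampling times from an inhomogeneous Poisson process whose log-intensity is a linear function of the log effective population size, with an optional additive time-varying covariate term. The additive covariate form, the names `log_intensity`, `simulate_sampling_times`, `example_ne` and `seasonal`, and all parameter values are assumptions chosen for illustration. The simulation uses the standard thinning method, which requires an upper bound `lam_max` on the intensity.

```python
import math
import random


def log_intensity(t, beta0, beta1, ne, covariates=()):
    """log lambda(t) = beta0 + beta1 * log Ne(t) + sum_k coef_k * x_k(t).

    The additive covariate term is an assumed form, used here for illustration.
    """
    value = beta0 + beta1 * math.log(ne(t))
    for coef, x in covariates:
        value += coef * x(t)
    return value


def simulate_sampling_times(intensity, t_max, lam_max, rng):
    """Simulate an inhomogeneous Poisson process on [0, t_max] by thinning.

    Candidate points come from a homogeneous process with rate lam_max and
    are kept with probability intensity(t) / lam_max.
    """
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(lam_max)
        if t > t_max:
            return times
        lam = intensity(t)
        if lam > lam_max * (1 + 1e-12):
            raise ValueError("lam_max does not bound the intensity")
        if rng.random() * lam_max < lam:
            times.append(t)


def example_ne(t):
    # Hypothetical effective population size trajectory (exponential decline).
    return 100.0 * math.exp(-0.5 * t)


def seasonal(t):
    # Hypothetical time-varying covariate with period 1.
    return math.cos(2.0 * math.pi * t)


def intensity(t):
    return math.exp(
        log_intensity(t, beta0=-2.0, beta1=1.0, ne=example_ne,
                      covariates=[(0.3, seasonal)])
    )


# Both Ne(t) and the covariate peak at t = 0, so intensity(0) bounds the rate.
lam_max = intensity(0.0)
times = simulate_sampling_times(intensity, t_max=5.0, lam_max=lam_max,
                                rng=random.Random(1))
```

With `beta1 > 0`, sampling is denser when the effective population size is large, which is the dependence that explicit modeling of sampling times is meant to exploit.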