Drummond Alexei J, Nicholls Geoff K, Rodrigo Allen G, Solomon Wiremu
School of Biological Sciences, University of Auckland 1001, Auckland, New Zealand.
Genetics. 2002 Jul;161(3):1307-20. doi: 10.1093/genetics/161.3.1307.
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
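As an illustration of how a coalescent model assigns a probability density to the time structure of a tree built from temporally spaced samples, the following is a minimal sketch and not the authors' implementation. It assumes a constant population size, times measured backward from the most recent sample, and a scaled population-size parameter `theta` chosen so that each pair of lineages coalesces at rate `1/theta`. The function name, argument names, and this parameterization are assumptions made for the example.

```python
import math


def serial_coalescent_log_density(sample_times, coalescent_times, theta):
    """Log density of a genealogy's node times under a constant-size coalescent.

    Times are measured backward from the most recent sample (time 0).
    theta is the population size scaled to the same time units, so each
    pair of lineages coalesces at rate 1/theta (an assumed parameterization).
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if len(coalescent_times) != len(sample_times) - 1:
        raise ValueError("need exactly n - 1 coalescent events for n samples")

    # Merge both event types into one time-ordered list.
    # At ties, sampling events (kind 0) sort before coalescences (kind 1).
    events = sorted([(t, 0) for t in sample_times] +
                    [(t, 1) for t in coalescent_times])

    log_density = 0.0
    lineages = 0
    previous_time = events[0][0]
    for time, kind in events:
        # Probability of no coalescence over the interval that just ended,
        # given the number of lineages present during it.
        pairs = lineages * (lineages - 1) / 2
        log_density -= pairs * (time - previous_time) / theta
        if kind == 0:
            lineages += 1  # a newly sampled sequence joins the tree
        else:
            if lineages < 2:
                raise ValueError("coalescence with fewer than two lineages")
            log_density -= math.log(theta)  # rate 1/theta for the merging pair
            lineages -= 1
        previous_time = time
    return log_density
```

For example, with three sequences sampled at times 0, 0, and 1 and coalescent events at times 0.5 and 2.0, only the intervals in which two lineages coexist contribute to the exponent. In an MCMC scheme of the kind the abstract describes, a term like this would be combined with a sequence likelihood and priors; those parts are not sketched here.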