Department of Physics, Bar-Ilan University, Ramat Gan, Israel.
Mol Biol Evol. 2011 May;28(5):1617-31. doi: 10.1093/molbev/msq331. Epub 2010 Dec 16.
We show that the number of lineages ancestral to a sample, viewed as a function of time back into the past and referred to here as the number of lineages as a function of time (NLFT), is a nearly deterministic property of large-sample gene genealogies. We obtain analytic expressions for the NLFT for both constant-sized and exponentially growing populations. The low level of stochastic variation associated with the NLFT of a large sample suggests using the NLFT to estimate population parameters. Based on this, we develop a new computational method for inferring the size and growth rate of a population from a large sample of DNA sequences at a single locus. We first apply our method to a sample of 1,212 mitochondrial DNA (mtDNA) sequences from China, confirming a pattern of recent population growth previously identified using other techniques, but with much smaller confidence intervals for past population sizes owing to the low variation of the NLFT. We further analyze a set of 63 mtDNA sequences from blue whales (BWs), concluding that the population grew in the past. This calls for reevaluation of previous studies that were based on the assumption that the BW population size was constant.
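The near-deterministic behavior of the NLFT in a constant-sized population can be illustrated with a minimal sketch. It is not the authors' method or their published expression: it assumes the standard Kingman coalescent, with time measured in coalescent units in which each pair of lineages merges at rate 1, and compares one simulated genealogy against the solution of the mean-field equation dn/dt = -n(n-1)/2. The function names `nlft_deterministic` and `simulate_nlft` are hypothetical and chosen only for this illustration.

```python
import math
import random


def nlft_deterministic(n0, t):
    """Solution of dn/dt = -n(n-1)/2 with n(0) = n0.

    Assumption: constant population size, time t in coalescent units.
    """
    return n0 / (n0 - (n0 - 1) * math.exp(-t / 2.0))


def simulate_nlft(n0, times, rng):
    """Simulate one Kingman genealogy of n0 samples.

    Returns the number of ancestral lineages at each query time.
    `times` must be sorted in increasing order.
    """
    counts = []
    n = n0
    # With n lineages, the waiting time to the next coalescence is
    # exponential with rate n(n-1)/2.
    next_event = rng.expovariate(n * (n - 1) / 2.0) if n > 1 else math.inf
    for query in times:
        while next_event <= query:
            n -= 1  # one pair of lineages merges
            if n > 1:
                next_event += rng.expovariate(n * (n - 1) / 2.0)
            else:
                next_event = math.inf
        counts.append(n)
    return counts


if __name__ == "__main__":
    rng = random.Random(1)
    query_times = [0.001, 0.01, 0.1, 1.0]
    simulated = simulate_nlft(1000, query_times, rng)
    for t, k in zip(query_times, simulated):
        print(t, k, round(nlft_deterministic(1000, t), 1))
```

For a sample of 1,000 lineages, a single simulated genealogy typically stays within a few percent of the deterministic curve at early times, which is the property that makes the NLFT attractive for parameter estimation. The exponentially growing case would need a time rescaling that is not shown here.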