Blath Jochen, González Casanova Adrián, Eldon Bjarki, Kurt Noemi, Wilke-Berenguer Maite
TU Berlin, Institut für Mathematik, 10623 Berlin, Germany.
Genetics. 2015 Jul;200(3):921-34. doi: 10.1534/genetics.115.176818. Epub 2015 May 7.
We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially, at a reduced rate, also on the dormant lineages. The presence of "dormant" lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this, we provide a Wright-Fisher model with a seedbank component and mutation, motivated by recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to the most recent common ancestor, the number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.
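To make the effect of dormant lineages on the time to the most recent common ancestor concrete, the following is a minimal simulation sketch, not the authors' code. The abstract does not state the transition rates, so the parametrization here is an assumption: each pair of active lineages coalesces at rate 1, each active lineage becomes dormant at rate `c`, each dormant lineage becomes active again at rate `c * K`, and dormant lineages never coalesce. The function name `simulate_tmrca` and its parameters are hypothetical.

```python
import random


def simulate_tmrca(n_active, n_dormant=0, c=1.0, K=1.0, rng=None):
    """Draw one time to the most recent common ancestor (TMRCA).

    Assumed rates (not given in the abstract):
      - each pair of active lineages coalesces at rate 1,
      - each active lineage falls dormant at rate c,
      - each dormant lineage wakes up at rate c * K,
      - dormant lineages cannot coalesce.
    With c = 0 and no dormant lineages this reduces to Kingman's coalescent.
    """
    rng = rng or random.Random()
    n, m = n_active, n_dormant  # current active / dormant lineage counts
    t = 0.0
    while n + m > 1:
        coal = n * (n - 1) / 2.0  # total pairwise coalescence rate
        to_dormant = c * n        # total rate of active -> dormant moves
        to_active = c * K * m     # total rate of dormant -> active moves
        total = coal + to_dormant + to_active
        if total <= 0.0:
            raise ValueError("no event possible; need c > 0 and K > 0 "
                             "when dormant lineages are present")
        t += rng.expovariate(total)  # waiting time to the next event
        u = rng.random() * total     # choose which event happens
        if u < coal:
            n -= 1                   # two active lineages merge
        elif u < coal + to_dormant:
            n, m = n - 1, m + 1      # one lineage enters the seedbank
        else:
            n, m = n + 1, m - 1      # one lineage leaves the seedbank
    return t


def mean_tmrca(runs, seed=0, **kwargs):
    """Average TMRCA over independent runs with a fixed seed."""
    rng = random.Random(seed)
    return sum(simulate_tmrca(rng=rng, **kwargs) for _ in range(runs)) / runs


if __name__ == "__main__":
    print("no seedbank:  ", mean_tmrca(5000, n_active=10, c=0.0))
    print("with seedbank:", mean_tmrca(5000, n_active=10, c=1.0, K=1.0))
```

Without a seedbank (`c=0`), the average should be close to the classical Kingman value 2(1 - 1/n). Under the assumed rates, switching on the seedbank lengthens the average noticeably, which is the qualitative effect the abstract describes.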