Städler Thomas, Haubold Bernhard, Merino Carlos, Stephan Wolfgang, Pfaffelhuber Peter
Faculty of Mathematics and Physics, University of Freiburg, D-79104 Freiburg, Germany.
Genetics. 2009 May;182(1):205-16. doi: 10.1534/genetics.108.094904. Epub 2009 Feb 23.
Using coalescent simulations, we study the impact of three different sampling schemes on patterns of neutral diversity in structured populations. Specifically, we are interested in two summary statistics based on the site frequency spectrum as a function of migration rate, demographic history of the entire substructured population (including timing and magnitude of specieswide expansions), and the sampling scheme. Using simulations implementing both finite-island and two-dimensional stepping-stone spatial structure, we demonstrate strong effects of the sampling scheme on Tajima's D (D(T)) and Fu and Li's D (D(FL)) statistics, particularly under specieswide (range) expansions. Pooled samples yield average D(T) and D(FL) values that are generally intermediate between those of local and scattered samples. Local samples (and to a lesser extent, pooled samples) are influenced by local, rapid coalescence events in the underlying coalescent process. These processes result in lower proportions of external branch lengths and hence lower proportions of singletons, explaining our finding that the sampling scheme affects D(FL) more than it does D(T). Under specieswide expansion scenarios, these effects of spatial sampling may persist up to very high levels of gene flow (Nm > 25), implying that local samples cannot be regarded as being drawn from a panmictic population. Importantly, many data sets on humans, Drosophila, and plants contain signatures of specieswide expansions and effects of sampling scheme that are predicted by our simulation results. This suggests that validating the assumption of panmixia is crucial if robust demographic inferences are to be made from local or pooled samples. However, future studies should consider adopting a framework that explicitly accounts for the genealogical effects of population subdivision and empirical sampling schemes.
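For readers unfamiliar with the two summary statistics named in the abstract, the following is a minimal sketch of how Tajima's D and Fu and Li's D can be computed from an unfolded site frequency spectrum. It is not the authors' simulation code. The function names and the list-based spectrum layout (entry i-1 holds the number of sites whose derived allele appears in i of n sampled sequences) are assumptions made for illustration, and the constants follow the standard published definitions of the two statistics, with Fu and Li's D in its outgroup-based form.

```python
import math


def _harmonic_sums(n):
    """Return a_n = sum(1/i) and b_n = sum(1/i^2) for i = 1..n-1."""
    a_n = sum(1.0 / i for i in range(1, n))
    b_n = sum(1.0 / (i * i) for i in range(1, n))
    return a_n, b_n


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS (hypothetical layout: sfs[i-1] = sites with i derived copies)."""
    n = len(sfs) + 1              # sample size implied by the spectrum length
    s = sum(sfs)                  # number of segregating sites
    if n < 3 or s == 0:
        return float("nan")
    a1, a2 = _harmonic_sums(n)
    # Mean pairwise differences: a site with i derived copies separates i*(n-i) pairs.
    pairs = n * (n - 1) / 2.0
    pi = sum(count * i * (n - i) for i, count in enumerate(sfs, start=1)) / pairs
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def fu_li_d(sfs):
    """Fu and Li's D (outgroup version) from the same unfolded SFS layout."""
    n = len(sfs) + 1
    s = sum(sfs)
    if n < 3 or s == 0:
        return float("nan")
    a_n, b_n = _harmonic_sums(n)
    eta_e = sfs[0]                # derived singletons, i.e. mutations on external branches
    c_n = 2.0 * (n * a_n - 2.0 * (n - 1)) / ((n - 1) * (n - 2))
    v_d = 1.0 + (a_n * a_n / (b_n + a_n * a_n)) * (c_n - (n + 1) / (n - 1.0))
    u_d = a_n - 1.0 - v_d
    return (s - a_n * eta_e) / math.sqrt(u_d * s + v_d * s * s)


# Usage: the neutral expectation (counts proportional to 1/i) gives values near zero,
# while a spectrum dominated by singletons pushes both statistics negative.
n = 10
neutral_sfs = [10.0 / i for i in range(1, n)]
singleton_heavy_sfs = [5] + [0] * (n - 2)
print(tajimas_d(neutral_sfs), fu_li_d(neutral_sfs))
print(tajimas_d(singleton_heavy_sfs), fu_li_d(singleton_heavy_sfs))
```

Because Fu and Li's D depends directly on the singleton count, any sampling scheme that lowers the proportion of external branch length shifts it more strongly than Tajima's D, which is the contrast the abstract describes for local versus scattered samples.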