King Léandra, Wakeley John, Carmi Shai
Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA.
Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Israel.
Theor Popul Biol. 2018 Jul;122:22-29. doi: 10.1016/j.tpb.2017.03.002. Epub 2017 Mar 21.
The population-scaled mutation rate, θ, is informative about the effective population size and is thus widely used in population genetics. We show that for two sequences and n unlinked loci, the variance of Tajima's estimator (θ̂), which is the average number of pairwise differences, does not vanish even as n→∞. The non-zero variance of θ̂ results from a (weak) correlation between coalescence times even at unlinked loci, which, in turn, is due to the underlying fixed pedigree shared by gene genealogies at all loci. We derive the correlation coefficient under a diploid, discrete-time, Wright-Fisher model, and we also derive a simple, closed-form lower bound. We also obtain empirical estimates of the correlation of coalescence times under demographic models inspired by large-scale human genealogies. While the effect we describe is small (Var[θ̂]/θ² ≈ O(1/N), with N the population size), it is important to recognize this feature of statistical population genetics, which runs counter to commonly held notions about unlinked loci.
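The limiting behaviour described in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's derivation. It assumes Poisson-distributed mutations given the coalescence time, a coalescence time at each locus with mean 1 and variance 1 (the exponential approximation of the standard coalescent), and a single common correlation coefficient `rho` between coalescence times at any two loci. The function name `tajima_variance` and the parameter `rho` are illustrative and do not come from the paper. Under these assumptions the law of total variance gives Var[θ̂] = (θ + θ²)/n + (1 − 1/n)·θ²·rho, which tends to θ²·rho rather than to zero as n grows.

```python
def tajima_variance(theta: float, n_loci: int, rho: float) -> float:
    """Variance of the mean number of pairwise differences over n_loci loci.

    Assumptions (illustrative, not taken from the paper):
      * differences at a locus are Poisson(theta * T) given coalescence time T,
      * T has mean 1 and variance 1 (exponential approximation),
      * every pair of loci has the same correlation rho between their T values,
      * mutations at different loci are independent given the T values.
    """
    # Single-locus variance: E[Var(S|T)] + Var(E[S|T]) = theta + theta**2.
    single_locus = theta + theta ** 2
    # Covariance between two loci: theta**2 * Cov(T_i, T_j) = theta**2 * rho.
    between_loci = theta ** 2 * rho
    # Variance of the average of n_loci equicorrelated terms.
    return single_locus / n_loci + (1.0 - 1.0 / n_loci) * between_loci


if __name__ == "__main__":
    theta, rho = 1.0, 1e-4  # hypothetical values chosen only for illustration
    for n in (1, 100, 10_000, 1_000_000):
        print(n, tajima_variance(theta, n, rho))
```

With `rho = 0` the variance decays as 1/n, which is the commonly held expectation for unlinked loci. With any positive `rho` the second term sets a floor of θ²·rho, so dividing by θ² leaves a quantity of the same order as the correlation itself.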