Department of Life Sciences, Imperial College London, Ascot, Berkshire, United Kingdom.
The Francis Crick Institute, London, United Kingdom.
Mol Biol Evol. 2019 Sep 1;36(9):2040-2052. doi: 10.1093/molbev/msz081.
Estimating recent effective population size is of great importance in characterizing and predicting the evolution of natural populations. Methods based on nucleotide diversity may underestimate current-day effective population sizes because of historical bottlenecks, whereas methods that reconstruct demographic history typically detect only long-term variations. However, soft selective sweeps, which leave a fingerprint of mutational history through recurrent mutations on independent haplotype backgrounds, hold the promise of an estimate more representative of recent population history. Here, we present a simple and robust method of estimation based only on knowledge of the number of independent recurrent origins and the current frequency of the beneficial allele in a population sample, independent of the strength of selection and the age of the mutation. Using a forward-time theoretical framework, we show that the mean number of origins is a function of θ=2Nμ and the current allele frequency, through a simple equation, and that the distribution is approximately Poisson. This estimate is robust to whether mutants preexisted before selection arose and is equally accurate for diploid populations with incomplete dominance. For demographic changes that are fast (e.g., seasonal) compared with the time scale for fixation of the mutant allele, and for moderate peak-to-trough ratios, we show that our constant population size estimate can be used to bound the maximum and minimum population size. Applied to the Vgsc gene of Anopheles gambiae, we estimate an effective population size of roughly 6×10⁷ and, when seasonal demographic oscillations are included, a minimum effective population size >3×10⁷ and a maximum <6×10⁹, suggesting a mean ∼10⁹.
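The abstract defines θ=2Nμ, so an estimate of θ converts to an effective population size only once a mutation rate μ is assumed. The sketch below shows just that final conversion step. It does not reproduce the equation linking the number of origins and the allele frequency to θ, which the abstract does not state. The function name `theta_to_ne` and the numerical inputs are hypothetical and chosen purely for illustration; they are not values from the paper.

```python
def theta_to_ne(theta: float, mu: float) -> float:
    """Convert theta = 2*N*mu into N, given an assumed mutation rate mu.

    theta: population-scaled mutation rate for the beneficial mutation
    mu:    assumed per-generation mutation rate to the beneficial allele
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if theta < 0:
        raise ValueError("theta must be non-negative")
    # Rearranging theta = 2*N*mu gives N = theta / (2*mu).
    return theta / (2.0 * mu)


if __name__ == "__main__":
    # Illustrative inputs only (not taken from the paper).
    example_theta = 0.2
    example_mu = 1e-9
    print(f"N_e = {theta_to_ne(example_theta, example_mu):.3g}")
```

Because N scales as 1/μ, any uncertainty in the assumed mutation rate carries over proportionally to the population size estimate.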