School of Life Sciences, Arizona State University, Tempe, AZ, USA.
Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
Mol Biol Evol. 2021 Jun 25;38(7):2986-3003. doi: 10.1093/molbev/msab050.
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. Here we examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
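The abstract points to two signals of background selection at linked neutral sites: a distorted site frequency spectrum (SFS) and reduced nucleotide diversity. The minimal sketch below shows how those quantities are commonly summarized from an unfolded SFS, using the standard estimators of nucleotide diversity (mean pairwise differences) and Watterson's theta. It is not the authors' pipeline and does not reproduce MSMC or fastsimcoal2. The function names, the sample size of 10, the value theta = 20, and the doubling of singletons are all illustrative assumptions.

```python
from typing import List, Sequence


def pi_from_sfs(sfs: Sequence[float]) -> float:
    """Nucleotide diversity (mean pairwise differences) from an unfolded SFS.

    sfs[i - 1] is the number of sites at which the derived allele is
    carried by i of the n sampled chromosomes, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    pairs = n * (n - 1) / 2  # number of distinct chromosome pairs
    # A site with i derived copies differs in i * (n - i) of those pairs.
    return sum(c * i * (n - i) for i, c in enumerate(sfs, start=1)) / pairs


def watterson_theta(sfs: Sequence[float]) -> float:
    """Watterson's estimator: segregating sites divided by the harmonic number a_n."""
    n = len(sfs) + 1
    a_n = sum(1.0 / i for i in range(1, n))
    return sum(sfs) / a_n


def neutral_sfs(theta: float, n: int) -> List[float]:
    """Expected unfolded SFS at neutrality and constant size: theta / i."""
    return [theta / i for i in range(1, n)]


# Illustrative values only (assumptions, not taken from the paper).
neutral = neutral_sfs(theta=20.0, n=10)

# Mimic an excess of rare variants by doubling the singleton class.
skewed = [neutral[0] * 2] + neutral[1:]

for label, sfs in (("neutral", neutral), ("rare-variant excess", skewed)):
    pi = pi_from_sfs(sfs)
    theta_w = watterson_theta(sfs)
    # pi - theta_w is the numerator of Tajima's D; negative values indicate
    # an excess of rare variants relative to the neutral expectation.
    print(f"{label}: pi={pi:.2f} theta_W={theta_w:.2f} diff={pi - theta_w:.2f}")
```

Under the neutral expectation the two estimators agree, whereas an excess of rare variants pushes nucleotide diversity below Watterson's theta. Population growth produces the same kind of skew, which is one way to see why a method that assumes neutrality can read background selection as growth.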