Cousins Trevor, Tabin Daniel, Patterson Nick, Reich David, Durvasula Arun
Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
bioRxiv. 2024 Jan 20:2024.01.18.576291. doi: 10.1101/2024.01.18.576291.
All published methods for learning about demographic history make the simplifying assumption that the genome evolves neutrally, and do not seek to account for the effects of natural selection on patterns of variation. This is a major concern, as ample work has demonstrated the pervasive effects of natural selection and in particular background selection (BGS) on patterns of genetic variation in diverse species. Simulations and theoretical work have shown that methods to infer changes in effective population size over time (N_e(t)) become increasingly inaccurate as the strength of linked selection increases. Here, we introduce an extension to the Pairwise Sequentially Markovian Coalescent (PSMC) algorithm, PSMC+, which explicitly co-models demographic history and natural selection. We benchmark our method using forward-in-time simulations with BGS and find that our approach improves the accuracy of effective population size inference. Leveraging a high-resolution map of BGS in humans, we infer considerable changes in the magnitude of inferred effective population size relative to previous reports. Finally, we separately infer N_e(t) on the X chromosome and on the autosomes in diverse great apes without making a correction for selection, and find that the inferred ratio fluctuates substantially through time in a way that differs across species, showing that uncorrected selection may be an important driver of signals of genetic difference on the X chromosome and autosomes.