Soni Vivak, Jensen Jeffrey D
School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ, US.
bioRxiv. 2024 Nov 21:2024.09.19.613979. doi: 10.1101/2024.09.19.613979.
The demographic history of a population and the distribution of fitness effects (DFE) of newly arising mutations in functional genomic regions are fundamental factors dictating both genetic variation and evolutionary trajectories. Although both demographic and DFE inference have been performed extensively in humans, these approaches have generally either been limited to simple demographic models involving a single population or, where a complex population history has been inferred, have not accounted for the potentially confounding effects of selection at linked sites. Taking advantage of the coding-sparse nature of the human genome, we propose a two-step approach. In the first step, coalescent simulations are used to infer a complex multi-population demographic model from large non-functional regions that are likely free from the effects of background selection. In the second step, forward-in-time simulations are used to perform DFE inference in functional regions, conditional on the inferred demography and incorporating expected background selection effects in the estimation procedure. Throughout, recombination and mutation rate maps are used to account for the underlying empirical rate heterogeneity across the human genome. Importantly, this framework makes it possible to utilize and fit multiple aspects of the data, and the inference scheme represents a generalized approach for such large-scale inference in species with coding-sparse genomes.