Barlukova Ayuna, Rouzine Igor M
Sorbonne Université, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative, Paris, France.
PLoS Comput Biol. 2021 Mar 8;17(3):e1008822. doi: 10.1371/journal.pcbi.1008822. eCollection 2021 Mar.
An intriguing observation that has long defied explanation is that the fitness effects of beneficial mutations follow a universal exponential distribution in different microorganisms. To explain this effect, we use a population model including mutation, directional selection, linkage, and genetic drift. The multiple-mutation regime of adaptation at large population sizes (traveling wave regime) is considered. We demonstrate analytically and by simulation that, regardless of the inherent distribution of mutation fitness effects across genomic sites, an exponential distribution of fitness effects emerges in the long term. This result follows from the exponential statistics of the frequency of the less-fit alleles, f, that we predict to evolve, in the long term, for both polymorphic and monomorphic sites. We map the logarithmic slope of the distribution onto the previously derived fixation probability and demonstrate that it increases linearly in time. Our results demonstrate a striking difference between the distribution of fitness effects observed experimentally for naturally occurring mutations and the "inherent" distribution obtained in a directed-mutagenesis experiment, which can have any shape depending on the organism. Based on these results, we develop a new method to measure the fitness effect of mutations for each variable residue using DNA sequences sampled from adapting populations. This new method is not sensitive to linkage effects and does not require the one-site model assumptions.
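To make the term "logarithmic slope" concrete, the following minimal sketch fits an exponential density p(s) = k·exp(−k·s) to a set of positive fitness effects s and returns the rate k; the slope of log p(s) against s is then −k. This is a generic maximum-likelihood fit, not the sequence-based method described in the abstract. The function name `exponential_rate_mle`, the synthetic sample, and the rate value 20.0 are assumptions chosen for illustration only.

```python
import random


def exponential_rate_mle(effects):
    """Maximum-likelihood rate k of an exponential fit, k = n / sum(s).

    Hypothetical helper: `effects` is any non-empty sequence of positive
    fitness effects (selection coefficients).
    """
    if not effects:
        raise ValueError("need at least one fitness effect")
    if any(s <= 0 for s in effects):
        raise ValueError("beneficial fitness effects must be positive")
    return len(effects) / sum(effects)


if __name__ == "__main__":
    # Synthetic data only: draw effects from an exponential with an assumed rate.
    rng = random.Random(0)
    assumed_rate = 20.0
    sample = [rng.expovariate(assumed_rate) for _ in range(20000)]
    k = exponential_rate_mle(sample)
    print(f"estimated rate k = {k:.2f}; slope of log p(s) = {-k:.2f}")
```

With enough samples the estimate should fall close to the assumed rate. Under the abstract's result that the slope increases linearly in time, repeating such a fit on effects measured at successive time points would be expected to give a growing k.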