Flynn Jullien M, Caldas Ian, Cristescu Melania E, Clark Andrew G
Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850.
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850.
Genetics. 2017 Oct;207(2):697-710. doi: 10.1534/genetics.117.300146. Epub 2017 Aug 15.
A long-standing evolutionary puzzle is that all eukaryotic genomes contain large amounts of tandemly repeated DNA whose sequence motifs and abundance vary greatly among even closely related species. To elucidate the evolutionary forces governing tandem repeat dynamics, quantification of the rates and patterns of mutations in repeat copy number and tests of its selective neutrality are necessary. Here, we used whole-genome sequences of 28 mutation accumulation (MA) lines, in addition to six isolates from a non-MA population originating from the same progenitor, to both estimate mutation rates of abundances of repeat sequences and evaluate the selective regime acting upon them. We found that mutation rates of individual repeats were both high and highly variable, ranging from additions/deletions of 0.29-105 copies per generation (reflecting changes of 0.12-0.80% per generation). Our results also provide evidence that new repeat sequences are often formed from existing ones. The non-MA population isolates showed a signal of either purifying or stabilizing selection, with 33% lower variation in repeat copy number on average than the MA lines, although the level of selective constraint was not evenly distributed across all repeats. The changes between many pairs of repeats were correlated, and the pattern of correlations was significantly different between the MA lines and the non-MA population. Our study demonstrates that tandem repeats can experience extremely rapid evolution in copy number, which can lead to high levels of divergence in genome-wide repeat composition between closely related species.
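To make the two units reported above concrete (copies per generation and percent per generation), the following is a minimal illustrative sketch of how a per-generation rate of copy-number change could be computed for one repeat across MA lines. It is not the authors' estimator: the function name, the use of the mean absolute deviation from the ancestral copy number, and all input values are assumptions chosen only for illustration.

```python
from statistics import mean


def copy_number_change_rate(ancestral_copies, line_copies, generations):
    """Hypothetical helper: mean absolute copy-number change per generation.

    ancestral_copies: copy number of one repeat in the progenitor (assumed known)
    line_copies:      copy number of the same repeat in each MA line
    generations:      generations of mutation accumulation (assumed equal for all lines)

    Returns (copies per generation, percent of ancestral copies per generation).
    """
    if generations <= 0:
        raise ValueError("generations must be positive")
    if ancestral_copies <= 0:
        raise ValueError("ancestral_copies must be positive")
    if not line_copies:
        raise ValueError("line_copies must not be empty")

    # Absolute change in each line, scaled by the number of generations.
    per_line = [abs(c - ancestral_copies) / generations for c in line_copies]
    rate = mean(per_line)

    # Express the same rate relative to the ancestral abundance of the repeat.
    percent = 100.0 * rate / ancestral_copies
    return rate, percent


# Made-up numbers, not data from the study.
rate, percent = copy_number_change_rate(1000, [1100, 900, 1050, 950], 50)
print(f"{rate:.2f} copies/generation, {percent:.2f}% per generation")
```

Under these assumptions, a repeat that is abundant in the progenitor can show a large absolute rate while its percent rate stays small, which is why the abstract reports both scales.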