Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA.
Department of Biology, University of Rochester, Rochester, New York 14627, USA.
Genome Res. 2021 Mar;31(3):380-396. doi: 10.1101/gr.263442.120. Epub 2021 Feb 9.
The rapid evolution of repetitive DNA sequences, including satellite DNA, tandem duplications, and transposable elements, underlies phenotypic evolution and contributes to hybrid incompatibilities between species. However, repetitive genomic regions are fragmented and misassembled in most contemporary genome assemblies. We generated highly contiguous de novo reference genomes for the Drosophila simulans species complex (D. simulans, D. mauritiana, and D. sechellia), which speciated ∼250,000 yr ago. Our assemblies are comparable in contiguity and accuracy to the current D. melanogaster genome, allowing us to directly compare repetitive sequences between these four species. We find that at least 15% of the D. simulans complex species genomes fail to align uniquely to D. melanogaster owing to structural divergence, twice the number of single-nucleotide substitutions. We also find rapid turnover of satellite DNA and extensive structural divergence in heterochromatic regions, whereas the euchromatic gene content is mostly conserved. Despite the overall preservation of gene synteny, euchromatin in each species has been shaped by clade- and species-specific inversions, transposable elements, expansions and contractions of satellite and tRNA tandem arrays, and gene duplications. We also find rapid divergence among Y-linked genes, including copy number variation and recent gene duplications from autosomes. Our assemblies provide a valuable resource for studying genome evolution and its consequences for phenotypic evolution in these genetic model species.