López-Cortegano Eugenio, Chebib Jobran, Jonas Anika, Vock Anastasia, Künzel Sven, Keightley Peter D, Tautz Diethard
Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom.
Genome Res. 2025 Jan 22;35(1):43-54. doi: 10.1101/gr.279982.124.
All forms of genetic variation originate from new mutations, making it crucial to understand their rates and mechanisms. Here, we use long-read sequencing from Pacific Biosciences (PacBio) to investigate de novo mutations that accumulated in 12 inbred mouse lines derived from three commonly used inbred strains (C3H, C57BL/6, and FVB) maintained for 8 to 15 generations in a mutation accumulation (MA) experiment. We built chromosome-level genome assemblies based on the MA line founders' genomes and then employed a combination of read and assembly-based methods to call the complete spectrum of new mutations. On average, there are about 45 mutations per haploid genome per generation, about half of which (54%) are insertions and deletions shorter than 50 bp (indels). The remainder are single-nucleotide mutations (SNMs; 44%) and large structural mutations (SMs; 2%). We found that the degree of DNA repetitiveness is positively correlated with SNM and indel rates and that a substantial fraction of SMs can be explained by homology-dependent mechanisms associated with repeat sequences. Most (90%) indels can be attributed to microsatellite contractions and expansions, and there is a marked bias toward 4 bp indels. Among the different types of SMs, tandem repeat mutations have the highest mutation rate, followed by insertions of transposable elements (TEs). We uncover a rich landscape of active TEs, notable differences in their spectrum among MA lines and strains, and a high rate of gene retroposition. Our study offers novel insights into mammalian genome evolution and highlights the importance of repetitive elements in shaping genomic diversity.
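As a rough illustration of the breakdown quoted above, the sketch below splits the approximate total of 45 mutations per haploid genome per generation across the three classes, using the rounded percentages from the abstract. The function and variable names are hypothetical and chosen only for this example; because the inputs are rounded, the per-class counts are approximate and are not figures reported by the authors.

```python
# Rounded figures quoted in the abstract; treat the results as approximate.
TOTAL_PER_HAPLOID_GENOME_PER_GENERATION = 45
CLASS_SHARES = {"indel": 0.54, "SNM": 0.44, "SM": 0.02}


def mutations_per_class(total, shares):
    """Split a total mutation count across classes by fractional share.

    Hypothetical helper for illustration only, not part of the study's pipeline.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    return {name: total * share for name, share in shares.items()}


per_class = mutations_per_class(
    TOTAL_PER_HAPLOID_GENOME_PER_GENERATION, CLASS_SHARES
)
for name, count in per_class.items():
    print(f"{name}: ~{count:.1f} per haploid genome per generation")
```

Under these rounded inputs, the split comes to roughly 24 indels, 20 SNMs, and about one SM per haploid genome per generation.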