Ferraj Ardian, Audano Peter A, Balachandran Parithi, Czechanski Anne, Flores Jacob I, Radecki Alexander A, Mosur Varun, Gordon David S, Walawalkar Isha A, Eichler Evan E, Reinholdt Laura G, Beck Christine R
Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06032, USA.
The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA.
Cell Genom. 2023 Apr 5;3(5):100291. doi: 10.1016/j.xgen.2023.100291. eCollection 2023 May 10.
Diverse inbred mouse strains are important biomedical research models, yet genome characterization of many strains is fundamentally lacking in comparison with humans. In particular, catalogs of structural variants (SVs; variants ≥ 50 bp) are incomplete, limiting the discovery of causative alleles for phenotypic variation. Here, we resolve genome-wide SVs in 20 genetically distinct inbred mice with long-read sequencing. We report 413,758 site-specific SVs affecting 13% (356 Mbp) of the mouse reference assembly, including 510 previously unannotated coding variants. We substantially improve the transposable element (TE) callset, and we find that TEs comprise 39% of SVs and account for 75% of altered bases. We further utilize this callset to investigate how TE heterogeneity affects mouse embryonic stem cells and find multiple TE classes that influence chromatin accessibility. Our work provides a comprehensive analysis of SVs found in diverse mouse genomes and illustrates the role of TEs in epigenetic differences.