Division of Molecular Biology, Ruđer Bošković Institute, Bijenička 54, 10 000, Zagreb, Croatia.
Sci Rep. 2020 Sep 15;10(1):15107. doi: 10.1038/s41598-020-71886-y.
Satellite DNAs (satDNAs) are long arrays of tandem repeats that are typically located in heterochromatin and span the centromeres of eukaryotic chromosomes. Despite the wealth of knowledge about satDNAs, little is known about a fraction of short, satDNA-like arrays dispersed throughout the genome. Our survey of the sequenced genome of the Pacific oyster Crassostrea gigas revealed a genome assembly replete with satDNA-like tandem repeats. We focused on the most abundant arrays, which we grouped into 13 clusters according to sequence similarity, and explored their flanking sequences. Structural analysis showed that arrays of all 13 clusters represent central repeats of 11 non-autonomous elements named Cg_HINE, which are classified into the Helentron superfamily of DNA transposons. Each of the described elements is formed by a unique combination of flanking sequences and satDNA-like central repeats, which come from one or, exceptionally, two clusters in consecutive order. While some of the detected Cg_HINE elements are related according to sequence similarities in their flanking and repetitive modules, others evidently arose in independent events. In addition, some of the Cg_HINE central repeats are related to the classical C. gigas satDNA, linking mobile elements and satDNAs. The genome-wide distribution of Cg_HINE implies that non-autonomous Helentrons form a dynamic system prone to efficiently propagating tandem repeats in the C. gigas genome.