Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil.
Genes (Basel). 2023 Jan 23;14(2):300. doi: 10.3390/genes14020300.
Satellite DNA (satDNA) is a class of tandemly repeated, non-protein-coding DNA sequences that are abundant in eukaryotic genomes. SatDNAs can be functional and can affect genome architecture in many ways, and their rapid evolution has consequences for species diversification. We took advantage of the recent availability of sequenced genomes from 23 species from the group to study their satDNA landscape. For this purpose, we used publicly available whole-genome sequencing Illumina reads and the TAREAN (tandem repeat analyzer) pipeline. We characterize 101 non-homologous satDNA families in this group, 93 of which are described here for the first time. Their repeat units range in size from 4 bp to 1897 bp, but most satDNAs have repeat units < 100 bp long and, among these, repeats ≤ 10 bp are the most frequent. The genomic contribution of the satDNAs ranges from ~1.4% to 21.6%. There is no significant correlation between satDNA content and genome size across the 23 species. We also found that at least one satDNA originated from an expansion of the central tandem repeats (CTRs) present inside a Helitron transposon. Finally, some satDNAs may be useful as taxonomic markers for identifying species or subgroups within the group.
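The abstract does not say which statistical test was used to compare satDNA content with genome size. As a minimal sketch only, the following assumes a Spearman rank correlation with a permutation-based p-value. The function names and the example values are hypothetical placeholders for illustration and are not data from the study.

```python
import random
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value for rho, estimated by shuffling y (assumed approach)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT measurements from the paper.
    genome_size_mb = [180.0, 195.0, 210.0, 170.0, 200.0]
    satdna_percent = [5.0, 2.0, 12.0, 8.0, 3.0]
    rho = spearman_rho(genome_size_mb, satdna_percent)
    p = permutation_p_value(genome_size_mb, satdna_percent)
    print(f"rho = {rho:.3f}, permutation p = {p:.3f}")
```

A rank-based test is used here because it does not assume a linear relationship or normally distributed values. With one value pair per species, the same functions could be applied to all 23 species.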