Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand.
Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand.
Cells. 2020 Dec 18;9(12):2714. doi: 10.3390/cells9122714.
A substantial portion of the primate genome is composed of non-coding regions, so-called "dark matter", which include an abundance of tandemly repeated sequences called satellite DNA. Collectively known as the satellitome, this genomic component offers evolutionary insights into primate genome biology that raise new questions and challenge existing paradigms. A complete human reference genome was recently reported with a telomere-to-telomere assembly of the human X chromosome that resolved hundreds of dark regions, encompassing a 3.1 Mb centromeric satellite array that had not been identified previously. With the recent exponential increase in the availability of primate genomes, and the development of modern genomic and bioinformatics tools, extensive growth in our knowledge of the structure, function, and evolution of satellite elements is expected. The current state of knowledge on this topic is summarized, highlighting the various types of primate-specific satellite repeats and comparing their proportions across diverse lineages. Inter- and intraspecific variation of satellite repeats in the primate genome is reviewed. The functional significance of these sequences is discussed by describing how the transcriptional activity of satellite repeats can affect gene expression during different cellular processes. Sex-linked satellites are outlined, together with their respective genomic organization. Mechanisms are proposed whereby satellite repeats might have emerged as novel sequences during different evolutionary phases. Finally, the main challenges that hinder the detection of satellite DNA are outlined, and an overview of the latest methodologies for addressing these technological limitations is presented.