Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA.
Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA.
Mol Biol Evol. 2021 Aug 23;38(9):3972-3992. doi: 10.1093/molbev/msab148.
Centromeres are functionally conserved chromosomal loci essential for proper chromosome segregation during cell division, yet they show high sequence diversity across species. Despite their variation, a near-universal feature of centromeres is the presence of repetitive sequences, such as DNA satellites and transposable elements (TEs). Because of their rapidly evolving karyotypes, gibbons represent a compelling model to investigate divergence of functional centromere sequences across short evolutionary timescales. In this study, we use ChIP-seq, RNA-seq, and fluorescence in situ hybridization to comprehensively investigate the centromeric repeat content of the four extant gibbon genera (Hoolock, Hylobates, Nomascus, and Siamang). In all gibbon genera, we find that CENP-A nucleosomes and the DNA-proteins that interface with the inner kinetochore preferentially bind retroelements of broad classes rather than satellite DNA. A previously identified gibbon-specific composite retrotransposon, LAVA, known to be expanded within the centromere regions of one gibbon genus (Hoolock), displays centromere- and species-specific sequence differences, potentially as a result of its co-option to a centromeric function. When dissecting centromere satellite composition, we discovered the presence of the retroelement-derived macrosatellite SST1 in multiple centromeres of Hoolock, whereas alpha-satellites represent the predominant satellite in the other genera, further suggesting an independent evolutionary trajectory for Hoolock centromeres. Finally, using de novo assembly of centromere sequences, we determined that transcripts originating from gibbon centromeres recapitulate the species-specific TE composition. Combined, our data reveal dynamic shifts in the repeat content that define gibbon centromeres and coincide with the extensive karyotypic diversity within this lineage.