National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
Plant Commun. 2024 Feb 12;5(2):100722. doi: 10.1016/j.xplc.2023.100722. Epub 2023 Sep 22.
Centromere positioning and organization are crucial for genome evolution; however, research on centromere biology is largely influenced by the quality of available genome assemblies. Here, we combined Oxford Nanopore and Pacific Biosciences technologies to de novo assemble two high-quality reference genomes for Gossypium hirsutum (TM-1) and Gossypium barbadense (3-79). Compared with previously published reference genomes, our assemblies show substantial improvements, with the contig N50 improved by 4.6-fold and 5.6-fold, respectively, and thus represent the most complete cotton genomes to date. These high-quality reference genomes enable us to characterize 14 and 5 complete centromeric regions for G. hirsutum and G. barbadense, respectively. Our data revealed that the centromeres of allotetraploid cotton are occupied by members of the centromeric repeat for maize (CRM) and Tekay long terminal repeat families, and the CRM family reshapes the centromere structure of the A subgenome after polyploidization. These two intertwined families have driven the convergent evolution of centromeres between the two subgenomes, ensuring centromere function and genome stability. In addition, the repositioning and high sequence divergence of centromeres between G. hirsutum and G. barbadense have contributed to speciation and centromere diversity. This study sheds light on centromere evolution in a significant crop and provides an alternative approach for exploring the evolution of polyploid plants.
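The contig N50 cited above is a standard contiguity metric: with contigs sorted from longest to shortest, it is the length of the contig at which the running total first reaches half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `contig_n50` and the toy contig lengths are hypothetical and are not taken from the paper.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Toy values for illustration only (NOT data from the study):
# two assemblies of the same total length but different contiguity.
fragmented = [2, 2, 2, 3, 3, 4, 8]
contiguous = [10, 8, 6]

fold_improvement = contig_n50(contiguous) / contig_n50(fragmented)
print(contig_n50(fragmented), contig_n50(contiguous), fold_improvement)
```

A fold improvement such as the 4.6-fold and 5.6-fold figures reported in the abstract is the ratio of the new assembly's N50 to that of the earlier reference, computed as in the last lines of the sketch.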