Yuan Daojun, Tang Zhonghui, Wang Maojun, Gao Wenhui, Tu Lili, Jin Xin, Chen Lingling, He Yonghui, Zhang Lin, Zhu Longfu, Li Yang, Liang Qiqi, Lin Zhongxu, Yang Xiyan, Liu Nian, Jin Shuangxia, Lei Yang, Ding Yuanhao, Li Guoliang, Ruan Xiaoan, Ruan Yijun, Zhang Xianlong
National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Shizishan Street, Wuhan, Hubei 430070, China.
The Jackson Laboratory for Genome Medicine, 10 Discovery Drive, Farmington, CT 06032, USA.
Sci Rep. 2015 Dec 4;5:17662. doi: 10.1038/srep17662.
Gossypium hirsutum accounts for most cotton fibre production, but G. barbadense is valued for its better overall resistance and superior fibre properties. However, the allotetraploid genome of G. barbadense has not been comprehensively analysed. Here we present a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The A (or At) subgenome (1.50 Gb) is nearly twice the size of the D (or Dt) subgenome (853 Mb), primarily because of the expansion of Gypsy elements, including the Peabody and Retrosat2 subclades in the Del clade and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed, and many homoeologous gene pairs with biased expression patterns were identified, suggesting that abundant gene sub-functionalization occurred through allopolyploidization. More specifically, the cellulose synthase (CesA) gene family has adopted differential temporal expression patterns, suggesting an integrated regulatory mechanism in which CesA genes from the At and Dt subgenomes control primary and secondary cellulose biosynthesis in cotton fibre in a "relay race"-like fashion. We anticipate that the G. barbadense genome sequence will advance our understanding of the mechanism of genome polyploidization and underpin genome-wide comparative research in this genus.