Genome sequence of the cultivated cotton Gossypium arboreum.

Author information

State Key Laboratory of Cotton Biology, Institute of Cotton Research of the Chinese Academy of Agricultural Sciences, Anyang, China.

BGI-Shenzhen, Shenzhen, China.

Publication information

Nat Genet. 2014 Jun;46(6):567-72. doi: 10.1038/ng.2987. Epub 2014 May 18.

Abstract

The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
