Department of Soil and Crop Sciences, 2474 TAMU, Texas A&M University, College Station, TX 77843-2474, USA.
BMC Genomics. 2013 Mar 28;14:208. doi: 10.1186/1471-2164-14-208.
Cotton, one of the world's leading crops, is important to the world's textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr.
We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species via either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, giving a probability of approximately 99% of obtaining at least one positive clone from the library with a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. To gain insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC end sequences (BESs) from clones randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton has diverged from G. raimondii at the genomic sequence level far more than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. raimondii genome, even though G. raimondii contains a D genome (D5).
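The roughly 99% figure can be checked with the standard Clarke-Carbon library-coverage formula, P = 1 - (1 - I/G)^N, where N is the number of clones, I the average insert size and G the haploid genome size. The sketch below is illustrative only: the abstract does not state the genome size, so G is assumed here to be about 2,500 Mb, inferred from "nearly 10,000 BESs" at "one BES for every 250 kb". The function names are hypothetical, and the authors' own calculation may have used a different genome-size estimate.

```python
def genome_equivalents(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Total cloned DNA expressed as multiples of the haploid genome size."""
    return n_clones * insert_kb / genome_kb


def probability_of_hit(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Clarke-Carbon probability that at least one clone in a random
    library contains a given single-copy locus."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


# Values stated in the abstract.
N_CLONES = 76_800
INSERT_KB = 135.0

# ASSUMPTION: genome size is not given in the abstract. It is inferred
# here from ~10,000 BESs spaced at roughly one per 250 kb.
GENOME_KB = 10_000 * 250.0

coverage = genome_equivalents(N_CLONES, INSERT_KB, GENOME_KB)
p_hit = probability_of_hit(N_CLONES, INSERT_KB, GENOME_KB)

print(f"Genome equivalents: {coverage:.2f}x")
print(f"P(at least one positive clone): {p_hit:.3f}")
```

Under this assumed genome size the library holds a little over four genome equivalents and the hit probability comes out just above 98%, in line with the approximate 99% reported. A somewhat smaller genome-size estimate would push the value closer to 99%.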
The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii.