Centro de Recursos Genéticos Vegetais, Instituto Agronômico de Campinas, CP 28, 13001-970, Campinas-SP, Brazil.
BMC Plant Biol. 2011 Feb 8;11:30. doi: 10.1186/1471-2229-11-30.
Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora account for 70% and 30% of commercial production, respectively. C. arabica is an allotetraploid that arose from a recent hybridization between the diploid species C. canephora and C. eugenioides. C. arabica has lower genetic diversity than C. canephora and yields a higher quality beverage. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency.
Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories.
We present the first comprehensive genome-wide transcript profile study of C. arabica and C. canephora, which can be freely accessed by the scientific community at http://www.lge.ibi.unicamp.br/coffea. Our data reveal the presence of species-specific/prevalent genes in coffee that may help to explain particular characteristics of these two crops. The identification of differentially expressed transcripts offers a starting point for correlating gene expression profiles with Coffea spp. developmental traits, providing valuable insights for coffee breeding and biotechnology, especially concerning sugar metabolism and stress tolerance.
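For readers unfamiliar with the GC3 measure mentioned in the results, the sketch below shows one common way to compute it: the fraction of G or C bases at the third position of each codon in an in-frame coding sequence. It is only an illustration under stated assumptions and is not the pipeline used in this study; the function name `gc3`, the handling of ambiguous bases, and the example sequence are all hypothetical.

```python
def gc3(cds: str) -> float:
    """Return the fraction of G/C at third codon positions.

    Assumes `cds` is an in-frame coding sequence starting at codon
    position 1. Trailing bases that do not complete a codon are ignored,
    and third-position bases other than A, C, G, T (for example N) are
    skipped. These choices are assumptions made for this sketch.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # drop an incomplete final codon
    third = [seq[i] for i in range(2, usable, 3)]
    counted = [base for base in third if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous complete codons")
    return sum(base in "GC" for base in counted) / len(counted)


if __name__ == "__main__":
    # Made-up sequence with codons ATG, GCC, AAA: third bases are G, C, A.
    print(gc3("ATGGCCAAA"))
```

Computing this value for every cluster in each species gives a distribution per species, which is the kind of profile that can then be compared between C. arabica and C. canephora.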