Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds.

Affiliation

Key Laboratory of Tea Biochemistry and Biotechnology, Ministry of Education, Ministry of Agriculture, Anhui Agricultural University, Hefei, 230036, PR China.

Publication information

BMC Genomics. 2011 Feb 28;12:131. doi: 10.1186/1471-2164-12-131.

Abstract

BACKGROUND

Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro and to transform, and it has a large genome, so little genomic information is available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generating large expression datasets for functional genomic analysis, one that is especially suitable for non-model species with unsequenced genomes.

RESULTS

Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes (788 contig clusters and 126,306 singletons), with an average length of 355 bp and an N50 of 506 bp. This number of unigenes was 10-fold higher than the number of existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (UniProt, NR and COGs at NCBI, Pfam, InterPro, and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and with natural product pathways that are important to tea quality, such as the flavonoid, theanine, and caffeine biosynthesis pathways. Novel candidate genes of these secondary metabolic pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated, and their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real-time PCR (qRT-PCR).
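For readers unfamiliar with the assembly statistics quoted above, the following is a minimal Python sketch of how an N50 and a mean length are computed from a list of sequence lengths. It is not the authors' pipeline: the function name `n50` and the toy lengths are hypothetical, chosen only for illustration. The sketch also checks that the reported contig-cluster and singleton counts add up to the reported unigene total, and estimates an average read length under the assumption that the 2.59 gigabase pairs correspond to the roughly 34.5 million reads.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (bp), NOT data from the study.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)

# Counts reported in the abstract.
contig_clusters = 788
singletons = 126_306
unigenes = contig_clusters + singletons

# Assumption: the 2.59 Gbp depth is spread over the ~34.5 million reads.
approx_read_length = 2.59e9 / 34.5e6

print(toy_n50, toy_mean, unigenes, round(approx_read_length))
```

Because N50 weights long sequences more heavily than a simple average does, it is typically larger than the mean length, which is consistent with the reported N50 of 506 bp against a mean of 355 bp.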

CONCLUSIONS

An extensive transcriptome dataset has been obtained from deep sequencing of the tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomics studies in C. sinensis.
