Comprehensive analysis of RNA-seq data reveals the complexity of the transcriptome in Brassica rapa.
Author information
Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, P.R. China, Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Wuhan 430062, China.
Publication information
BMC Genomics. 2013 Oct 7;14:689. doi: 10.1186/1471-2164-14-689.
BACKGROUND
The species Brassica rapa (2n=20, AA) is an important vegetable and oilseed crop, and serves as an excellent model for genomic and evolutionary research in Brassica species. With the whole-genome sequence of B. rapa now available, it is essential to determine the activity of all functional elements of the B. rapa genome and to explore the transcriptome on a genome-wide scale. Here, RNA-seq data were used to provide a genome-wide transcriptional landscape and to characterize annotated transcripts, novel transcripts and alternative splicing events across tissues.
RESULTS
RNA-seq reads were generated using the Illumina platform from six different tissues (root, stem, leaf, flower, silique and callus) of the B. rapa accession Chiifu-401-42, the same line used for whole-genome sequencing. First, these data revealed widespread transcription of the B. rapa genome, leading to the identification of numerous novel transcripts and the definition of the 5'/3' UTRs of known genes. Second, 78.8% of the total annotated genes were detected as expressed and 45.8% were constitutively expressed across all tissues. We further defined several groups of genes: housekeeping genes, tissue-specific expressed genes and genes co-expressed across tissues, which will serve as a valuable repository for future crop functional genomics research. Third, alternative splicing (AS) is estimated to occur in more than 29.4% of intron-containing B. rapa genes, and 65% of them were commonly detected in more than two tissues. Interestingly, genes with a high rate of AS were over-represented in GO categories relating to transcriptional regulation and signal transduction, suggesting that AS may play an important regulatory role in these genes. Further, we observed that intron retention (IR) is the predominant type of AS event and seems to occur preferentially in genes with short introns.
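The grouping of genes into expressed, constitutively expressed and tissue-specific classes can be illustrated with a minimal sketch. The abstract does not state the expression cutoff or the units used, so the `threshold` value, the function names and the toy data below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# The six tissues sampled in the study.
TISSUES = ("root", "stem", "leaf", "flower", "silique", "callus")


def classify_gene(expression, threshold=1.0):
    """Classify one gene from its per-tissue expression values.

    `expression` maps tissue name -> expression value. The cutoff
    `threshold` is a hypothetical value; the abstract does not give one.
    """
    expressed_in = [t for t in TISSUES if expression.get(t, 0.0) >= threshold]
    if not expressed_in:
        return "not_detected"
    if len(expressed_in) == len(TISSUES):
        return "constitutive"
    if len(expressed_in) == 1:
        return "tissue_specific:" + expressed_in[0]
    return "expressed"


def summarize(genes, threshold=1.0):
    """Return the fraction of genes expressed anywhere and in all tissues."""
    counts = Counter(classify_gene(expr, threshold) for expr in genes.values())
    total = len(genes)
    expressed = total - counts["not_detected"]
    return {
        "expressed": expressed / total,
        "constitutive": counts["constitutive"] / total,
    }


# Toy data (hypothetical gene names and values).
toy_genes = {
    "geneA": {t: 5.0 for t in TISSUES},   # above cutoff in every tissue
    "geneB": {"root": 3.0},               # above cutoff in root only
    "geneC": {},                          # never above cutoff
    "geneD": {"leaf": 2.0, "flower": 4.0},
}
summary = summarize(toy_genes)
```

Applied to a real expression matrix, the two fractions returned by `summarize` correspond to the kind of figures reported above (78.8% expressed, 45.8% constitutively expressed), although the exact values depend on the cutoff chosen.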
CONCLUSIONS
The high-resolution RNA-seq analysis provides a global transcriptional landscape as a complement to the B. rapa genome sequence, which will advance our understanding of the dynamics and complexity of the B. rapa transcriptome. The atlas of gene expression in different tissues will be useful for accelerating research on functional genomics and genome evolution in Brassica species.