Zhao Zhixin, Wu Xiaohui, Kumar Praveen Kumar Raj, Dong Min, Ji Guoli, Li Qingshun Quinn, Liang Chun
Department of Biology, Miami University, Oxford, Ohio 45056.
Department of Biology, Miami University, Oxford, Ohio 45056; Department of Automation, Xiamen University, Xiamen, 361005, China.
G3 (Bethesda). 2014 Mar 13;4(5):871-83. doi: 10.1534/g3.114.010249.
Messenger RNA 3'-end formation is an essential posttranscriptional processing step for most eukaryotic genes. Unlike plants and animals, where AAUAAA and its variants are routinely found as the main poly(A) signal, Chlamydomonas reinhardtii uses UGUAA as its major poly(A) signal. Advances in sequencing technology provide an enormous amount of data with which to explore variation in poly(A) signals, alternative polyadenylation (APA), and its relationship with splicing in this algal species. Through genome-wide analysis of poly(A) sites in C. reinhardtii, we identified a large number of poly(A) sites: 21,041 from Sanger expressed sequence tags, 88,184 from 454 reads, and 195,266 from Illumina reads. Compared with previous collections, deep sequencing reveals more new poly(A) sites in coding sequences, introns, and intergenic regions. Interestingly, G-rich signals are particularly abundant in introns and intergenic regions. The different poly(A) signals prevalent in coding sequences and in 3'-untranslated regions imply potentially different polyadenylation mechanisms. Our data suggest that APA occurs in about 68% of C. reinhardtii genes. Using Gene Ontology analysis, we found that most of the APA genes are involved in RNA regulation and metabolic processes, protein synthesis, and hydrolase and ligase activities. Moreover, intronic poly(A) sites are more abundant in constitutively spliced introns than in retained introns, suggesting an interplay between polyadenylation and splicing. Our results support the view that APA, as in higher eukaryotes, may play significant roles in increasing transcriptome diversity and regulating gene expression in this algal species. Our datasets also provide useful information for the accurate annotation of transcript ends in C. reinhardtii.