Shen Yingjia, Ji Guoli, Haas Brian J, Wu Xiaohui, Zheng Jianti, Reese Greg J, Li Qingshun Quinn
Department of Botany, Miami University, Oxford, OH 45056, USA.
Nucleic Acids Res. 2008 May;36(9):3150-61. doi: 10.1093/nar/gkn158. Epub 2008 Apr 13.
The position of a poly(A) site in eukaryotic mRNA is determined by sequence signals in the pre-mRNA and a group of polyadenylation factors. To reveal rice poly(A) signals at the genome level, we constructed a dataset of 55,742 authenticated poly(A) sites and characterized their poly(A) signals. This analysis identified the typical tripartite cis-elements, the far upstream element (FUE), near upstream element (NUE) and cleavage element (CE), as previously observed in Arabidopsis. The average size of the 3'-UTR was 289 nucleotides. When mapped to the genome, however, 15% of these poly(A) sites were located in currently annotated intergenic regions. Moreover, alternative polyadenylation was extensive: 50% of the genes analyzed had more than one unique poly(A) site (excluding microheterogeneity sites), and 13% had four or more poly(A) sites. About 4% of the analyzed genes possessed alternative poly(A) sites in their introns, 5'-UTRs, or protein-coding regions. The authenticity of these alternative poly(A) sites was partially confirmed using massively parallel signature sequencing (MPSS) data. Analysis of nucleotide profiles and signal patterns indicated that poly(A) sites found in coding regions may use a different set of poly(A) signals. Based on the features of rice poly(A) signals, an updated algorithm termed PASS-Rice was designed to predict poly(A) sites.
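As a rough illustration of how per-gene alternative polyadenylation fractions such as those reported above could be tallied, the following minimal Python sketch collapses nearby sites into one cluster before counting. It is not the authors' pipeline: the function names, the input layout of (gene, position) pairs, and the 30-nucleotide window used here to define microheterogeneity are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_sites(positions, window=30):
    """Group sorted positions into clusters; a position joins the current
    cluster when it lies within `window` nt of that cluster's last member.
    The 30-nt default is an assumed value, not one taken from the abstract."""
    clusters = []
    for pos in sorted(set(positions)):
        if clusters and pos - clusters[-1][-1] <= window:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def apa_summary(sites, window=30):
    """Summarise alternative polyadenylation from (gene_id, position) pairs.

    Each cluster counts as one unique poly(A) site, so positions that
    differ only by microheterogeneity are not counted twice.
    """
    by_gene = defaultdict(list)
    for gene, pos in sites:
        by_gene[gene].append(pos)

    counts = {g: len(cluster_sites(p, window)) for g, p in by_gene.items()}
    n = len(counts)
    if n == 0:
        return {"genes": 0, "multi_site_fraction": 0.0, "four_plus_fraction": 0.0}
    return {
        "genes": n,
        "multi_site_fraction": sum(c > 1 for c in counts.values()) / n,
        "four_plus_fraction": sum(c >= 4 for c in counts.values()) / n,
    }


# Hypothetical toy input: gene identifiers and coordinates are made up.
toy_sites = [
    ("g1", 100), ("g1", 105), ("g1", 400),
    ("g2", 50), ("g2", 60),
    ("g3", 10), ("g3", 100), ("g3", 200), ("g3", 300),
]
summary = apa_summary(toy_sites)
```

Because each position is compared with the last member of the current cluster, a long run of closely spaced sites chains into a single cluster. A stricter definition, for example a fixed distance from the most frequently used site, would give different counts, so the window and linkage rule should be chosen to match the study being reproduced.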