Predictive modeling of plant messenger RNA polyadenylation sites.

Author information

Ji Guoli, Zheng Jianti, Shen Yingjia, Wu Xiaohui, Jiang Ronghan, Lin Yun, Loke Johnny C, Davis Kimberly M, Reese Greg J, Li Qingshun Quinn

Affiliation

Department of Automation, Xiamen University, Xiamen, Fujian, 361005, PR China.

Publication information

BMC Bioinformatics. 2007 Feb 7;8:43. doi: 10.1186/1471-2105-8-43.

DOI: 10.1186/1471-2105-8-43

PMID: 17286857

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1805453/

Abstract

BACKGROUND

One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) tract protects mRNA from unregulated degradation and signals the integrity of the mRNA through its recognition by the mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tripartite sequence patterns around the cleavage site, which becomes the poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem.

RESULTS

Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model-based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets; at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes in poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was further demonstrated by predicting poly(A) sites within long genomic sequences.
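
The sensitivity and specificity figures quoted above can be made concrete with a small sketch. It uses the standard textbook definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the abstract does not spell out the evaluation protocol, so the function name `sensitivity_specificity`, the boolean encoding, and the toy data below are illustrative assumptions and not part of PASS.

```python
import math


def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for paired boolean sequences.

    labels[i]      -- True if sequence i truly contains a poly(A) site
    predictions[i] -- True if the predictor called a poly(A) site in it
    This encoding is a hypothetical convention chosen for illustration.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # true positives
    fn = sum(1 for y, p in pairs if y and not p)      # false negatives
    tn = sum(1 for y, p in pairs if not y and not p)  # true negatives
    fp = sum(1 for y, p in pairs if not y and p)      # false positives

    # Guard against empty classes: the metric is undefined there.
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec


# Toy data (made up): four true poly(A) sequences, four control sequences.
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, False, False, False, True]
sn, sp = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

In practice the two metrics trade off against each other as a score threshold moves, which is presumably why the abstract reports them "at the best combinations".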

CONCLUSION

Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9633/1805453/53accb72a173/1471-2105-8-43-1.jpg

Similar articles

Prediction of plant mRNA polyadenylation sites.

Methods Mol Biol. 2015;1255:13-23. doi: 10.1007/978-1-4939-2175-1_2.

A classification-based prediction model of messenger RNA polyadenylation sites.

J Theor Biol. 2010 Aug 7;265(3):287-96. doi: 10.1016/j.jtbi.2010.05.015. Epub 2010 May 26.

Arabidopsis mRNA polyadenylation machinery: comprehensive analysis of protein-protein interactions and gene expression profiling.

BMC Genomics. 2008 May 14;9:220. doi: 10.1186/1471-2164-9-220.

Recognition of polyadenylation sites from Arabidopsis genomic sequences.

Genome Inform. 2007;19:73-82.

Alternative polyadenylation and gene expression regulation in plants.

Wiley Interdiscip Rev RNA. 2011 May-Jun;2(3):445-58. doi: 10.1002/wrna.59. Epub 2010 Nov 9.

Computational analysis of plant polyadenylation signals.

Methods Mol Biol. 2015;1255:3-11. doi: 10.1007/978-1-4939-2175-1_1.

Genome level analysis of rice mRNA 3'-end processing signals and alternative polyadenylation.

Nucleic Acids Res. 2008 May;36(9):3150-61. doi: 10.1093/nar/gkn158. Epub 2008 Apr 13.

A history of poly A sequences: from formation to factors to function.

Prog Nucleic Acid Res Mol Biol. 2002;71:285-389. doi: 10.1016/s0079-6603(02)71046-5.

In silico prediction of mRNA poly(A) sites in Chlamydomonas reinhardtii.

Mol Genet Genomics. 2012 Dec;287(11-12):895-907. doi: 10.1007/s00438-012-0725-5. Epub 2012 Oct 30.

Cited by

Multifactorial analysis of terminator performance on heterologous gene expression in Physcomitrella.

Plant Cell Rep. 2024 Jan 22;43(2):43. doi: 10.1007/s00299-023-03088-5.

Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases.

Genes (Basel). 2023 Nov 8;14(11):2051. doi: 10.3390/genes14112051.

A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq.

Genomics Proteomics Bioinformatics. 2023 Feb;21(1):67-83. doi: 10.1016/j.gpb.2022.09.005. Epub 2022 Sep 24.

Poly(A)-DG: A deep-learning-based domain generalization method to identify cross-species Poly(A) signal without prior knowledge from target species.

PLoS Comput Biol. 2020 Nov 5;16(11):e1008297. doi: 10.1371/journal.pcbi.1008297. eCollection 2020 Nov.

Use of synthetic biology tools to optimize the production of active nitrogenase Fe protein in chloroplasts of tobacco leaf cells.

Plant Biotechnol J. 2020 Sep;18(9):1882-1896. doi: 10.1111/pbi.13347. Epub 2020 Apr 7.

Modeling of Genome-Wide Polyadenylation Signals in .

Front Genet. 2019 Jul 3;10:647. doi: 10.3389/fgene.2019.00647. eCollection 2019.

An intronless form of the tobacco extensin gene terminator strongly enhances transient gene expression in plant leaves.

Plant Mol Biol. 2018 Mar;96(4-5):429-443. doi: 10.1007/s11103-018-0708-y. Epub 2018 Feb 10.

Isolation of Alcohol Dehydrogenase cDNA and Basal Regulatory Region from Metroxylon sagu.

ISRN Mol Biol. 2012 Aug 26;2012:839427. doi: 10.5402/2012/839427. eCollection 2012.

A Genome-wide Study of "Non-3UTR" Polyadenylation Sites in Arabidopsis thaliana.

Sci Rep. 2016 Jun 15;6:28060. doi: 10.1038/srep28060.

Genome-wide characterization of intergenic polyadenylation sites redefines gene spaces in Arabidopsis thaliana.

BMC Genomics. 2015 Jul 9;16(1):511. doi: 10.1186/s12864-015-1691-1.
