A classification-based prediction model of messenger RNA polyadenylation sites.

Authors

Ji Guoli, Wu Xiaohui, Shen Yingjia, Huang Jiangyin, Quinn Li Qingshun

Affiliation

Department of Automation, Xiamen University, Xiamen 361000, China.

Publication information

J Theor Biol. 2010 Aug 7;265(3):287-96. doi: 10.1016/j.jtbi.2010.05.015. Epub 2010 May 26.

DOI: 10.1016/j.jtbi.2010.05.015
PMID: 20546757
Abstract

Messenger RNA polyadenylation is one of the essential processing steps during eukaryotic gene expression. The site of polyadenylation [poly(A) site] marks the end of a transcript, which is also the end of a gene. A computational program able to recognize poly(A) sites would prove useful not only for genome annotation in finding gene ends, but also for predicting alternative poly(A) sites. Features that define the poly(A) sites can now be extracted from poly(A) site datasets to build such predictive models. Using methods including K-gram pattern, Z-curve, position-specific scoring matrix, and a first-order inhomogeneous Markov sub-model, numerous features were generated and placed in an original feature space. To select the most useful features, attribute selection algorithms, such as information gain and entropy, were employed. A training model was then built based on the Bayesian network to determine a subset of the optimal features. Test models corresponding to the training models were built to predict poly(A) sites in Arabidopsis and rice. Thus, a prediction model, termed Poly(A) site classifier, or PAC, was constructed. The uniqueness of the model lies in its structure: each sub-model can be replaced or expanded, while feature generation, selection, and classification are all independent processes. Its modular design makes it easily adaptable to different species or datasets. The algorithm's high specificity and sensitivity were demonstrated by testing several datasets and, at the best combinations, they both reached 95%. The software package may be used for genome annotation and optimizing transgene structure.
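
To make the feature-generation and feature-selection steps named in the abstract more concrete, here is a minimal Python sketch of three of them: K-gram counting, Z-curve coordinates, and information gain. It is not the PAC implementation. The function names, the choice of whole-sequence (rather than windowed) features, and the toy sequences and labels are all assumptions made for illustration, and the abstract does not specify how PAC parameterizes any of these steps.

```python
from collections import Counter
from itertools import product
from math import log2


def kgram_counts(seq, k=2):
    """Count overlapping k-grams, reported in a fixed order over the ACGT alphabet."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


def z_curve(seq):
    """Return the cumulative Z-curve coordinates (x, y, z) for a whole sequence.

    x: purines (A, G) minus pyrimidines (C, T)
    y: amino bases (A, C) minus keto bases (G, T)
    z: weak hydrogen bonds (A, T) minus strong hydrogen bonds (G, C)
    """
    c = Counter(seq)
    a, g, cy, t = c["A"], c["G"], c["C"], c["T"]
    return (a + g - cy - t, a + cy - g - t, a + t - g - cy)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((m / n) * log2(m / n) for m in Counter(labels).values())


def information_gain(feature_values, labels):
    """Information gain of one discrete feature with respect to the class labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Hypothetical toy data: 1 = poly(A) site, 0 = non-site. Not from the paper.
    sequences = ["AATAAA", "TTTGTA", "GCGCGC", "CCGGCC"]
    labels = [1, 1, 0, 0]

    # Rank the three Z-curve coordinates by information gain (sign only, to keep them discrete).
    for axis, name in enumerate("xyz"):
        signs = [z_curve(s)[axis] > 0 for s in sequences]
        print(name, information_gain(signs, labels))
```

In a pipeline of the kind the abstract describes, features ranked this way would then be passed to a separate classifier (a Bayesian network in the paper), which keeps generation, selection, and classification independent of one another.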

Similar articles

1
A classification-based prediction model of messenger RNA polyadenylation sites.
J Theor Biol. 2010 Aug 7;265(3):287-96. doi: 10.1016/j.jtbi.2010.05.015. Epub 2010 May 26.
2
Prediction of plant mRNA polyadenylation sites.
Methods Mol Biol. 2015;1255:13-23. doi: 10.1007/978-1-4939-2175-1_2.
3
Recognition of polyadenylation sites from Arabidopsis genomic sequences.
Genome Inform. 2007;19:73-82.
4
Characterization and prediction of mRNA alternative polyadenylation sites in rice genes.
Biomed Mater Eng. 2014;24(6):3779-85. doi: 10.3233/BME-141207.
5
Prediction of mRNA polyadenylation sites by support vector machine.
Bioinformatics. 2006 Oct 1;22(19):2320-5. doi: 10.1093/bioinformatics/btl394. Epub 2006 Jul 26.
6
Predictive modeling of plant messenger RNA polyadenylation sites.
BMC Bioinformatics. 2007 Feb 7;8:43. doi: 10.1186/1471-2105-8-43.
7
Genome-wide identification and predictive modeling of polyadenylation sites in eukaryotes.
Brief Bioinform. 2015 Mar;16(2):304-13. doi: 10.1093/bib/bbu011. Epub 2014 Apr 1.
8
SpliceIT: a hybrid method for splice signal identification based on probabilistic and biological inference.
J Biomed Inform. 2010 Apr;43(2):208-17. doi: 10.1016/j.jbi.2009.09.004. Epub 2009 Sep 30.
9
In silico analysis of EST and genomic sequences allowed the prediction of cis-regulatory elements for Entamoeba histolytica mRNA polyadenylation.
Comput Biol Chem. 2008 Aug;32(4):256-63. doi: 10.1016/j.compbiolchem.2008.03.019. Epub 2008 Apr 12.
10
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.

Cited by

1
A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq.
Genomics Proteomics Bioinformatics. 2023 Feb;21(1):67-83. doi: 10.1016/j.gpb.2022.09.005. Epub 2022 Sep 24.
2
Advances in the Bioinformatics Knowledge of mRNA Polyadenylation in Baculovirus Genes.
Viruses. 2020 Dec 6;12(12):1395. doi: 10.3390/v12121395.
3
Experimental Verification and Evolutionary Origin of 5'-UTR Polyadenylation Sites in .
Front Plant Sci. 2018 Jul 5;9:969. doi: 10.3389/fpls.2018.00969. eCollection 2018.
4
Predict and Analyze Protein Glycation Sites with the mRMR and IFS Methods.
Biomed Res Int. 2015;2015:561547. doi: 10.1155/2015/561547. Epub 2015 Apr 15.
5
Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals.
BMC Evol Biol. 2014 Jul 23;14:162. doi: 10.1186/s12862-014-0162-7.
6
RNA polyadenylation sites on the genomes of microorganisms, animals, and plants.
PLoS One. 2013 Nov 18;8(11):e79511. doi: 10.1371/journal.pone.0079511. eCollection 2013.
7
Poly(A) motif prediction using spectral latent features from human DNA sequences.
Bioinformatics. 2013 Jul 1;29(13):i316-25. doi: 10.1093/bioinformatics/btt218.
8
A multispecies polyadenylation site model.
BMC Bioinformatics. 2013;14 Suppl 2(Suppl 2):S9. doi: 10.1186/1471-2105-14-S2-S9. Epub 2013 Jan 21.
9
In silico prediction of mRNA poly(A) sites in Chlamydomonas reinhardtii.
Mol Genet Genomics. 2012 Dec;287(11-12):895-907. doi: 10.1007/s00438-012-0725-5. Epub 2012 Oct 30.
10
Dragon PolyA Spotter: predictor of poly(A) motifs within human genomic DNA sequences.
Bioinformatics. 2012 Jan 1;28(1):127-9. doi: 10.1093/bioinformatics/btr602. Epub 2011 Nov 15.