An improved poly(A) motifs recognition method based on decision level fusion.

Authors

Zhang Shanxin, Han Jiuqiang, Liu Jun, Zheng Jiguang, Liu Ruiling

Affiliation

School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, PR China.

Publication information

Comput Biol Chem. 2015 Feb;54:49-56. doi: 10.1016/j.compbiolchem.2014.12.001. Epub 2014 Dec 30.

DOI:10.1016/j.compbiolchem.2014.12.001
PMID:25594576
Abstract

Polyadenylation is the process of adding a poly(A) tail to the 3' end of an mRNA. Identifying the motifs that control polyadenylation plays an essential role in improving genome annotation accuracy and in better understanding the mechanisms governing gene regulation. Bioinformatics methods for poly(A) motif recognition have demonstrated that information extracted from the sequences surrounding candidate motifs can substantially help distinguish true motifs from false ones. However, these methods depend on either domain features or string kernels, and to date no method has combined information from different sources. Here, we proposed an improved poly(A) motif recognition method that combines different sources based on decision-level fusion. First, two novel prediction methods based on support vector machines (SVM) were proposed: one uses domain-specific features together with principal component analysis (PCA) to eliminate redundancy (PCA-SVM); the other is based on the Oligo string kernel (Oligo-SVM). Then we proposed a novel machine-learning method for poly(A) motif prediction that combines four poly(A) motif recognition methods: two state-of-the-art methods (Random Forest (RF) and HMM-SVM) and the two newly proposed methods (PCA-SVM and Oligo-SVM). A decision-level information fusion method combines the decision values of the different classifiers by applying Dempster-Shafer (DS) evidence theory. We evaluated our method on a comprehensive poly(A) dataset consisting of 14,740 samples covering 12 variants of poly(A) motifs and 2,750 samples containing none of these motifs. Our method achieved an accuracy of up to 86.13%. Compared with the four classifiers, our evidence-theory-based method reduces the average error rate by about 30%, 27%, 26% and 16%, respectively. The experimental results suggest that the proposed method is more effective for poly(A) motif recognition.
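To make the decision-level fusion step more concrete, the following is a minimal Python sketch of Dempster's rule of combination applied to the outputs of several binary classifiers. It is an illustration only, not the authors' implementation: the abstract does not say how decision values are turned into mass functions, so the `to_mass` mapping, the `reliability` discount, and all function names here are hypothetical assumptions.

```python
from functools import reduce

# Frame of discernment: "T" (true motif) and "F" (false motif).
# "TF" denotes the whole frame, i.e. mass left uncommitted (uncertainty).


def to_mass(p_true, reliability):
    """Turn a probability-like score into a mass function.

    ASSUMPTION: this discounting scheme is illustrative; the abstract does
    not specify how classifier decision values are mapped to masses.
    """
    return {
        "T": reliability * p_true,
        "F": reliability * (1.0 - p_true),
        "TF": 1.0 - reliability,
    }


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions on {T, F, TF}."""
    # Conflict: mass assigned to contradictory pairs (T with F, F with T).
    conflict = m1["T"] * m2["F"] + m1["F"] * m2["T"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {
        "T": (m1["T"] * m2["T"] + m1["T"] * m2["TF"] + m1["TF"] * m2["T"]) / norm,
        "F": (m1["F"] * m2["F"] + m1["F"] * m2["TF"] + m1["TF"] * m2["F"]) / norm,
        "TF": (m1["TF"] * m2["TF"]) / norm,
    }


def fuse(scores, reliabilities):
    """Fuse several classifier scores by repeatedly applying Dempster's rule."""
    masses = [to_mass(p, r) for p, r in zip(scores, reliabilities)]
    return reduce(combine, masses)


def decide(mass):
    """Label a candidate as a true motif if fused belief in T exceeds F."""
    return mass["T"] > mass["F"]


if __name__ == "__main__":
    # Hypothetical scores from four classifiers (e.g. RF, HMM-SVM,
    # PCA-SVM, Oligo-SVM) for one candidate motif; values are made up.
    fused = fuse([0.9, 0.4, 0.7, 0.8], [0.8, 0.8, 0.8, 0.8])
    print(fused, decide(fused))
```

Because the rule renormalizes by the non-conflicting mass, classifiers that agree reinforce each other while an uncertain classifier (low `reliability`) has little influence on the fused decision. How reliabilities would be chosen in practice is left open here.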


Cited by

1
From shallow to deep: some lessons learned from application of machine learning for recognition of functional genomic elements in human genome.
Hum Genomics. 2022 Feb 18;16(1):7. doi: 10.1186/s40246-022-00376-1.