Markovian encoding models in human splice site recognition using SVM.

Author information

Pashaei Elham, Aydin Nizamettin

Affiliation

Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey.

Publication information

Comput Biol Chem. 2018 Apr;73:159-170. doi: 10.1016/j.compbiolchem.2018.02.005. Epub 2018 Feb 14.

DOI:10.1016/j.compbiolchem.2018.02.005
PMID:29486390
Abstract

Splice site recognition is among the most significant and challenging tasks in bioinformatics because of its key role in gene annotation. Effective splice site prediction requires nucleotide encoding methods that expose the characteristics of DNA sequences and supply suitable features as input to machine learning classifiers. Markovian models are among the most influential encoding methods and are widely used for pattern recognition in biological data. However, a direct performance comparison of these methods in the splice site domain has not yet been carried out. This study addresses that gap: it compares various Markovian encoding models for splice site prediction using the support vector machine (SVM), described as the most outstanding learning method in the domain, and provides a new, precise evaluation of the Markovian approaches. Moreover, a novel sequence encoding approach based on a third-order Markov model (MM3) is proposed. The experimental results show that the proposed method, MM3-SVM, performs significantly better than thirteen of the best-known state-of-the-art algorithms when tested on the HS3D dataset across several performance criteria. Further, it achieved higher prediction accuracy than several well-known tools, such as NNsplice, MEM, MM1, WMM, and GeneID, on an independent test set of 50 genes. We also developed MMSVM, a web tool that predicts splice sites in any human sequence using the proposed approach. The MMSVM web server can be accessed at https://pashaei.shinyapps.io/mmsvm.
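The abstract does not spell out how the Markov encoding is computed, so the following is only a minimal sketch of one common way a k-th order Markov model can turn aligned DNA windows into numeric features. It is not the authors' MM3-SVM implementation. The function names (`fit_markov`, `cond_prob`, `encode`), the position-specific formulation, and the pseudocount smoothing are all assumptions made for illustration.

```python
from collections import defaultdict

ALPHABET = "ACGT"

def fit_markov(seqs, order=3, pseudocount=1.0):
    """Count, per position, how often each base follows each context.

    Assumes all sequences are aligned windows of equal length
    (hypothetical setup; the source does not describe its training data layout).
    """
    length = len(seqs[0])
    # counts[i][context][base] = number of training sequences with `base`
    # at position i preceded by `context` (up to `order` previous bases).
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(length)]
    for seq in seqs:
        for i in range(length):
            context = seq[max(0, i - order):i]
            counts[i][context][seq[i]] += 1.0
    return {"order": order, "pseudocount": pseudocount, "counts": counts}

def cond_prob(model, i, context, base):
    """Smoothed estimate of P(base at position i | context)."""
    row = model["counts"][i].get(context, {})
    total = sum(row.values()) + model["pseudocount"] * len(ALPHABET)
    return (row.get(base, 0.0) + model["pseudocount"]) / total

def encode(model, seq):
    """Map a sequence to one conditional probability per position."""
    k = model["order"]
    return [cond_prob(model, i, seq[max(0, i - k):i], seq[i])
            for i in range(len(seq))]

# Toy example with a first-order model on three aligned windows.
positives = ["ACGT", "ACGT", "ACGA"]
model = fit_markov(positives, order=1)
features = encode(model, "ACGT")
```

In a setup like this, one model could be fitted on true splice sites and another on decoys, and the two feature vectors concatenated before being passed to an SVM classifier. That pairing is likewise an assumption, not something the abstract states.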


Similar articles

1
Markovian encoding models in human splice site recognition using SVM.
Comput Biol Chem. 2018 Apr;73:159-170. doi: 10.1016/j.compbiolchem.2018.02.005. Epub 2018 Feb 14.
2
A computational approach for prediction of donor splice sites with improved accuracy.
J Theor Biol. 2016 Sep 7;404:285-294. doi: 10.1016/j.jtbi.2016.06.013. Epub 2016 Jun 11.
3
Evaluating the performance of sequence encoding schemes and machine learning methods for splice sites recognition.
Gene. 2019 Jul 15;705:113-126. doi: 10.1016/j.gene.2019.04.047. Epub 2019 Apr 19.
4
A novel method for splice sites prediction using sequence component and hidden Markov model.
Annu Int Conf IEEE Eng Med Biol Soc. 2016 Aug;2016:3076-3079. doi: 10.1109/EMBC.2016.7591379.
5
Prediction of donor splice sites using random forest with a new sequence encoding approach.
BioData Min. 2016 Jan 22;9:4. doi: 10.1186/s13040-016-0086-4. eCollection 2016.
6
A statistical approach for 5' splice site prediction using short sequence motifs and without encoding sequence data.
BMC Bioinformatics. 2014 Nov 25;15:362. doi: 10.1186/s12859-014-0362-6.
7
Splice site identification using probabilistic parameters and SVM classification.
BMC Bioinformatics. 2006 Dec 18;7 Suppl 5(Suppl 5):S15. doi: 10.1186/1471-2105-7-S5-S15.
8
Fast splice site detection using information content and feature reduction.
BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S8. doi: 10.1186/1471-2105-9-S12-S8.
9
An approach of encoding for prediction of splice sites using SVM.
Biochimie. 2006 Jul;88(7):923-9. doi: 10.1016/j.biochi.2006.03.006. Epub 2006 Apr 3.
10
Identification of donor splice sites using support vector machine: a computational approach based on positional, compositional and dependency features.
Algorithms Mol Biol. 2016 Jun 1;11:16. doi: 10.1186/s13015-016-0078-4. eCollection 2016.

Cited by

1
N6-methyladenine identification using deep learning and discriminative feature integration.
BMC Med Genomics. 2025 Mar 29;18(1):58. doi: 10.1186/s12920-025-02131-6.
2
CNNSplice: Robust models for splice site prediction using convolutional neural networks.
Comput Struct Biotechnol J. 2023 May 30;21:3210-3223. doi: 10.1016/j.csbj.2023.05.031. eCollection 2023.
3
Hybrid Hypercube Optimization Search Algorithm and Multilayer Perceptron Neural Network for Medical Data Classification.
Comput Intell Neurosci. 2022 Mar 25;2022:1612468. doi: 10.1155/2022/1612468. eCollection 2022.
4
DASSI: differential architecture search for splice identification from DNA sequences.
BioData Min. 2021 Feb 15;14(1):15. doi: 10.1186/s13040-021-00237-y.