An approach of encoding for prediction of splice sites using SVM.

Authors

Huang J, Li T, Chen K, Wu J

Affiliation

Department of Chemistry, Tongji University, Shanghai, China.

Publication information

Biochimie. 2006 Jul;88(7):923-9. doi: 10.1016/j.biochi.2006.03.006. Epub 2006 Apr 3.

DOI:10.1016/j.biochi.2006.03.006
PMID:16626852
Abstract

In splice site prediction, accuracy remains below 90% even though the sequences adjacent to splice sites are highly conserved. Efforts to improve accuracy have concentrated on the performance of the prediction algorithms, and few have addressed the more fundamental issue of nucleotide encoding. In this paper, a predictor based on support vector machines (SVM) is constructed to distinguish true from false splice sites in higher eukaryotes. Four types of encoding were used to generate the SVM input: mono-nucleotide (MN) encoding, MN encoding with the frequency difference between true sites and false sites (FDTF), pair-wise nucleotide (PN) encoding, and PN encoding with FDTF. PN encoding with FDTF gave the most reliable recognition of splice sites: prediction accuracy was 96.3% for true donor sites and 93.7% for false sites, and 94.0% for true acceptor sites and 93.2% for false sites.
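
The abstract names the four encodings but does not define them, so the following is only a minimal sketch of one plausible reading. It assumes MN encoding is a 4-value indicator per position, PN encoding is a 16-value indicator per adjacent nucleotide pair, and FDTF scales each indicator by the difference between its frequency in true sites and in false sites. All function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]  # 16 dinucleotides


def mn_encode(seq):
    """Mono-nucleotide (MN) encoding: 4 indicator values per position."""
    return [1.0 if base == n else 0.0 for base in seq for n in NUCLEOTIDES]


def pn_encode(seq):
    """Pair-wise nucleotide (PN) encoding: 16 indicator values per adjacent pair."""
    return [1.0 if seq[i:i + 2] == p else 0.0
            for i in range(len(seq) - 1) for p in PAIRS]


def feature_frequencies(encoded):
    """Mean of each feature over a set of already-encoded sequences."""
    n = len(encoded)
    return [sum(col) / n for col in zip(*encoded)]


def fdtf_weights(true_seqs, false_seqs, encode):
    """Assumed FDTF: per-feature frequency in true sites minus false sites."""
    f_true = feature_frequencies([encode(s) for s in true_seqs])
    f_false = feature_frequencies([encode(s) for s in false_seqs])
    return [t - f for t, f in zip(f_true, f_false)]


def encode_with_fdtf(seq, weights, encode):
    """Scale each indicator by its frequency difference (an assumption)."""
    return [x * w for x, w in zip(encode(seq), weights)]


# Toy windows invented for illustration only (not data from the paper).
true_sites = ["AGGTAA", "AGGTGA", "CGGTAA"]
false_sites = ["ACGTCC", "TTGTAC", "AGGTCC"]

w = fdtf_weights(true_sites, false_sites, pn_encode)
x = encode_with_fdtf("AGGTAA", w, pn_encode)
print(len(mn_encode("AGGTAA")), len(x))  # 24 80
```

Vectors such as `x` would then serve as input to an SVM classifier. The kernel, window length, and training data used by the authors are not given in the abstract, so they are left out here.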

