Accurate splice site prediction using support vector machines.

Authors

Sonnenburg Sören, Schweikert Gabriele, Philips Petra, Behr Jonas, Rätsch Gunnar

Affiliation

Fraunhofer Institute FIRST, Kekuléstr. 7, 12489 Berlin, Germany.

Publication

BMC Bioinformatics. 2007;8 Suppl 10(Suppl 10):S7. doi: 10.1186/1471-2105-8-S10-S7.

DOI: 10.1186/1471-2105-8-S10-S7
PMID: 18269701
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2230508/
Abstract

BACKGROUND

For splice site recognition, one has to solve two classification problems: discriminating true from decoy splice sites for both acceptor and donor sites. Gene finding systems typically rely on Markov Chains to solve these tasks.
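
To make the two classification problems concrete, the sketch below enumerates candidate positions that a classifier would then label as true or decoy sites. It assumes candidates are defined by the canonical GT (donor) and AG (acceptor) dinucleotides; the abstract does not say how decoys are selected, and the function name `candidate_sites` is hypothetical.

```python
def candidate_sites(seq: str, consensus: str) -> list[int]:
    """Return 0-based start positions where `consensus` occurs in `seq`.

    Assumption: every occurrence of the consensus dinucleotide is a
    candidate site; a classifier then separates true sites from decoys.
    """
    seq = seq.upper()
    k = len(consensus)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == consensus]


# Toy sequence, for illustration only (not data from the paper).
toy = "AAGGTAAGTCAGG"
donor_candidates = candidate_sites(toy, "GT")     # candidate donor sites
acceptor_candidates = candidate_sites(toy, "AG")  # candidate acceptor sites
```

Each candidate list feeds its own binary classifier, one for donor sites and one for acceptor sites, which is why the task splits into two problems.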

RESULTS

In this work we consider Support Vector Machines for splice site recognition. We employ the so-called weighted degree kernel which turns out well suited for this task, as we will illustrate in several experiments where we compare its prediction accuracy with that of recently proposed systems. We apply our method to the genome-wide recognition of splice sites in Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, Danio rerio, and Homo sapiens. Our performance estimates indicate that splice sites can be recognized very accurately in these genomes and that our method outperforms many other methods including Markov Chains, GeneSplicer and SpliceMachine. We provide genome-wide predictions of splice sites and a stand-alone prediction tool ready to be used for incorporation in a gene finder.
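
The weighted degree kernel mentioned above compares two equal-length sequences by counting substrings of length 1 to D that match at the same position. The sketch below follows the definition commonly given in the kernel literature; the weights `beta_d = 2(D - d + 1) / (D(D + 1))` and the default degree are assumptions, since the abstract states neither, and this is not the authors' implementation.

```python
def wd_kernel(x: str, y: str, degree: int = 3) -> float:
    """Weighted degree kernel between two equal-length sequences.

    k(x, y) = sum_{d=1..D} beta_d * #{positions l : x[l:l+d] == y[l:l+d]}
    with beta_d = 2 * (D - d + 1) / (D * (D + 1))  (assumed weighting).
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    if degree < 1:
        raise ValueError("degree must be at least 1")
    length = len(x)
    total = 0.0
    for d in range(1, degree + 1):
        beta = 2.0 * (degree - d + 1) / (degree * (degree + 1))
        # Count substrings of length d that agree at the same offset.
        matches = sum(
            1 for l in range(length - d + 1) if x[l:l + d] == y[l:l + d]
        )
        total += beta * matches
    return total


# Toy windows around a hypothetical candidate site (illustration only).
same = wd_kernel("ACGT", "ACGT", degree=2)
near = wd_kernel("ACGT", "ACGA", degree=2)
```

Because longer matching substrings add to the score only when they line up at the same offset, the kernel rewards position-specific motifs. A Gram matrix built from such values could be passed to any SVM solver that accepts a precomputed kernel.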

AVAILABILITY

Data, splits, additional information on the model selection, the whole genome predictions, as well as the stand-alone prediction tool are available for download at http://www.fml.mpg.de/raetsch/projects/splice.
