Floating Search Methodology for Combining Classification Models for Site Recognition in DNA Sequences.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2471-2482. doi: 10.1109/TCBB.2020.2974221. Epub 2021 Dec 8.

DOI:10.1109/TCBB.2020.2974221
PMID:32078558
Abstract

Recognition of the functional sites of genes, such as translation initiation sites, donor and acceptor splice sites, and stop codons, is a relevant part of many current problems in bioinformatics. The best approaches use sophisticated classifiers, such as support vector machines. However, with the rapid accumulation of sequence data, methods for combining many sources of evidence are necessary, as it is unlikely that a single classifier can solve this problem with the best possible performance. A major issue is that the number of possible models to combine is large, and using all of them is impractical. In this paper we present a methodology for combining many sources of information to recognize any functional site using "floating search", a powerful heuristic applicable when the cost of evaluating each solution is high. We present experiments on four functional sites in the human genome, which is used as the target genome, and use another 20 species as sources of evidence. The proposed methodology shows significant improvement over state-of-the-art methods. The results also challenge the standard assumption that only genomes neither very close to nor very far from the human genome should be used to improve the recognition of functional sites.
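
The abstract does not spell out the search procedure, so the following is only a minimal sketch of a generic sequential forward floating search over a pool of candidate models, not the authors' implementation. The function `floating_search`, the `score` callback (standing in for an expensive evaluation such as validation performance of a combined classifier), the model names `A`, `B`, `C`, and the toy score table are all hypothetical. Scores are cached because the abstract stresses that each evaluation is costly.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Subset = FrozenSet[str]


def floating_search(candidates: Iterable[str],
                    score: Callable[[Subset], float],
                    max_size: int) -> Tuple[Subset, float]:
    """Generic sequential forward floating search (illustrative sketch)."""
    pool = list(candidates)
    cache: Dict[Subset, float] = {}

    def cached(subset: Subset) -> float:
        # Each subset is evaluated once; evaluations are assumed expensive.
        if subset not in cache:
            cache[subset] = score(subset)
        return cache[subset]

    best: Dict[int, Tuple[float, Subset]] = {}  # best (score, subset) per size
    current: Subset = frozenset()
    while len(current) < min(max_size, len(pool)):
        # Forward step: add the candidate that gives the best new subset.
        current = max((current | {c} for c in pool if c not in current),
                      key=cached)
        k = len(current)
        if k not in best or cached(current) > best[k][0]:
            best[k] = (cached(current), current)
        # Floating step: drop members while that beats the best smaller subset.
        while len(current) > 2:
            reduced = max((current - {c} for c in current), key=cached)
            if cached(reduced) > best[len(reduced)][0]:
                current = reduced
                best[len(reduced)] = (cached(reduced), reduced)
            else:
                break
    if not best:
        empty: Subset = frozenset()
        return empty, cached(empty)
    top_score, top_subset = max(best.values(), key=lambda pair: pair[0])
    return top_subset, top_score


# Toy scores (made-up numbers): A is the best single model, but B and C
# together beat any pair that contains A.
TOY_SCORES = {
    frozenset("A"): 0.50, frozenset("B"): 0.40, frozenset("C"): 0.40,
    frozenset("AB"): 0.60, frozenset("AC"): 0.55, frozenset("BC"): 0.65,
    frozenset("ABC"): 0.62,
}


def toy_score(subset: Subset) -> float:
    return TOY_SCORES.get(subset, 0.0)


if __name__ == "__main__":
    subset, value = floating_search(["A", "B", "C"], toy_score, max_size=3)
    print(sorted(subset), value)  # prints ['B', 'C'] 0.65
```

In this toy case a purely greedy forward search would commit to `A` and never evaluate the pair `B`, `C` as a final answer; the backward (floating) step is what lets the search revisit that earlier choice.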


Similar articles

1. Floating Search Methodology for Combining Classification Models for Site Recognition in DNA Sequences.
   IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2471-2482. doi: 10.1109/TCBB.2020.2974221. Epub 2021 Dec 8.
2. Stepwise approach for combining many sources of evidence for site-recognition in genomic sequences.
   BMC Bioinformatics. 2016 Mar 5;17:117. doi: 10.1186/s12859-016-0968-y.
3. Improving translation initiation site and stop codon recognition by using more than two classes.
   Bioinformatics. 2014 Oct;30(19):2702-8. doi: 10.1093/bioinformatics/btu369. Epub 2014 Jun 4.
4. FunSiP: a modular and extensible classifier for the prediction of functional sites in DNA.
   Bioinformatics. 2008 Jul 1;24(13):1532-3. doi: 10.1093/bioinformatics/btn225. Epub 2008 May 12.
5. An evolutionary algorithm approach for feature generation from sequence data and its application to DNA splice site prediction.
   IEEE/ACM Trans Comput Biol Bioinform. 2012 Sep-Oct;9(5):1387-98. doi: 10.1109/TCBB.2012.53.
6. Effective automated feature construction and selection for classification of biological sequences.
   PLoS One. 2014 Jul 17;9(7):e99982. doi: 10.1371/journal.pone.0099982. eCollection 2014.
7. Fast model-based protein homology detection without alignment.
   Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
8. DNA splice site detection: a comparison of specific and general methods.
   Proc AMIA Symp. 2002:390-4.
9. Ensemble approach combining multiple methods improves human transcription start site prediction.
   BMC Genomics. 2010 Nov 30;11:677. doi: 10.1186/1471-2164-11-677.
10. Engineering support vector machine kernels that recognize translation initiation sites.
    Bioinformatics. 2000 Sep;16(9):799-807. doi: 10.1093/bioinformatics/16.9.799.

Cited by

1. Deep learning and support vector machines for transcription start site identification.
   PeerJ Comput Sci. 2023 Apr 17;9:e1340. doi: 10.7717/peerj-cs.1340. eCollection 2023.
2. Nonlinear physics opens a new paradigm for accurate transcription start site prediction.
   BMC Bioinformatics. 2022 Dec 30;23(1):565. doi: 10.1186/s12859-022-05129-4.