
Biological sequence classification with multivariate string kernels.

Author information

Kuksa Pavel P

Affiliation

NEC Laboratories America Inc, Princeton.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2013 Sep-Oct;10(5):1201-10. doi: 10.1109/TCBB.2013.15.

DOI: 10.1109/TCBB.2013.15
PMID: 24384708

Abstract

String kernel-based machine learning methods have yielded great success in the analysis of structured and sequential data. They often exhibit state-of-the-art performance on sequence analysis tasks such as biological sequence classification, remote homology detection, and protein superfamily and fold prediction. However, typical string kernel methods rely on the analysis of discrete 1D string data (e.g., DNA or amino acid sequences). In this paper, we address multiclass biological sequence classification problems using multivariate representations in the form of sequences of feature vectors (as in biological sequence profiles, or sequences of individual amino acid physicochemical descriptors) and a class of multivariate string kernels that exploit these representations. On three protein sequence classification tasks, the proposed multivariate representations and kernels show significant improvements of 15-20 percent over existing state-of-the-art sequence classification methods.
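
For orientation, the kind of discrete 1D string kernel that the abstract contrasts with can be sketched as below. This is a minimal illustration of the well-known k-mer spectrum kernel, not the multivariate kernel proposed in the paper. The function names `kmer_counts` and `spectrum_kernel` and the default `k = 3` are assumptions made for this example.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every contiguous length-k substring (k-mer) of seq."""
    # range() is empty when len(seq) < k, so short sequences give an empty table.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x: str, y: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of x and y (illustrative only)."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Iterate over the smaller table; a k-mer missing from either side contributes 0.
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(n * cy[kmer] for kmer, n in cx.items())
```

A kernel of this form compares sequences only through exact matches of discrete symbols. The abstract's proposal replaces each symbol with a feature vector, which this sketch does not attempt to reproduce.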


Similar articles

1
Biological sequence classification with multivariate string kernels.
IEEE/ACM Trans Comput Biol Bioinform. 2013 Sep-Oct;10(5):1201-10. doi: 10.1109/TCBB.2013.15.
2
Fast and accurate multi-class protein fold recognition with spatial sample kernels.
Comput Syst Bioinformatics Conf. 2008;7:133-43.
3
Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.
Bioinformatics. 2008 May 15;24(10):1264-70. doi: 10.1093/bioinformatics/btn112. Epub 2008 Mar 31.
4
Mismatch string kernels for discriminative protein classification.
Bioinformatics. 2004 Mar 1;20(4):467-76. doi: 10.1093/bioinformatics/btg431. Epub 2004 Jan 22.
5
Profile-based string kernels for remote homology detection and motif extraction.
J Bioinform Comput Biol. 2005 Jun;3(3):527-50. doi: 10.1142/s021972000500120x.
6
Protein homology detection using string alignment kernels.
Bioinformatics. 2004 Jul 22;20(11):1682-9. doi: 10.1093/bioinformatics/bth141. Epub 2004 Feb 26.
7
Application of string kernels in protein sequence classification.
Appl Bioinformatics. 2005;4(1):45-52. doi: 10.2165/00822942-200504010-00005.
8
Learned random-walk kernels and empirical-map kernels for protein sequence classification.
J Comput Biol. 2009 Mar;16(3):457-74. doi: 10.1089/cmb.2008.0031.
9
Application of latent semantic analysis to protein remote homology detection.
Bioinformatics. 2006 Feb 1;22(3):285-90. doi: 10.1093/bioinformatics/bti801. Epub 2005 Nov 29.
10
Profile-based string kernels for remote homology detection and motif extraction.
Proc IEEE Comput Syst Bioinform Conf. 2004:152-60. doi: 10.1109/csb.2004.1332428.

Cited by

1
Maximum margin classifier working in a set of strings.
Proc Math Phys Eng Sci. 2016 Mar;472(2187):20150551. doi: 10.1098/rspa.2015.0551.