Fast and accurate multi-class protein fold recognition with spatial sample kernels.

Author information

Kuksa Pavel, Huang Pai-Hsi, Pavlovic Vladimir

Affiliation

Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA.

Publication information

Comput Syst Bioinformatics Conf. 2008;7:133-43.

PMID:19642275

Abstract

Establishing structural or functional relationships between sequences, for instance to infer the structural class of an unannotated protein, is a key task in biological sequence analysis. Recent computational methods such as profile and neighborhood mismatch kernels have shown very promising results for protein sequence classification, but at the cost of high computational complexity. In this study we address multi-class sequence classification problems using a class of string-based kernels, the sparse spatial sample kernels (SSSK), that are both biologically motivated and efficient to compute. The proposed methods can work with very large databases of protein sequences and show substantial improvements in computing time over existing methods. Applying the SSSK to multi-class protein prediction problems (fold recognition and remote homology detection) yields significantly better performance than existing state-of-the-art algorithms.

Similar articles

Fast and accurate multi-class protein fold recognition with spatial sample kernels.

Comput Syst Bioinformatics Conf. 2008;7:133-43.

Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.

Bioinformatics. 2008 May 15;24(10):1264-70. doi: 10.1093/bioinformatics/btn112. Epub 2008 Mar 31.

SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.

BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.

Mismatch string kernels for discriminative protein classification.

Bioinformatics. 2004 Mar 1;20(4):467-76. doi: 10.1093/bioinformatics/btg431. Epub 2004 Jan 22.

Profile-based direct kernels for remote homology detection and fold recognition.

Bioinformatics. 2005 Dec 1;21(23):4239-47. doi: 10.1093/bioinformatics/bti687. Epub 2005 Sep 27.

Prediction of protein structure classes with flexible neural tree.

Biomed Mater Eng. 2014;24(6):3797-806. doi: 10.3233/BME-141209.

Biological sequence classification with multivariate string kernels.

IEEE/ACM Trans Comput Biol Bioinform. 2013 Sep-Oct;10(5):1201-10. doi: 10.1109/TCBB.2013.15.

Profile-based string kernels for remote homology detection and motif extraction.

J Bioinform Comput Biol. 2005 Jun;3(3):527-50. doi: 10.1142/s021972000500120x.

Protein homology detection using string alignment kernels.

Bioinformatics. 2004 Jul 22;20(11):1682-9. doi: 10.1093/bioinformatics/bth141. Epub 2004 Feb 26.

Protein homology detection with biologically inspired features and interpretable statistical models.

Int J Data Min Bioinform. 2008;2(2):157-75. doi: 10.1504/ijdmb.2008.019096.

Cited by

A computational method for designing diverse linear epitopes including citrullinated peptides with desired binding affinities to intravenous immunoglobulin.

BMC Bioinformatics. 2016 Apr 8;17:155. doi: 10.1186/s12859-016-1008-7.

Efficient use of unlabeled data for protein sequence classification: a comparative study.

BMC Bioinformatics. 2009 Apr 29;10 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-10-S4-S2.
