Protein remote homology detection based on auto-cross covariance transformation.

Affiliation

College of Engineering, Shanghai Ocean University, Shanghai 201303, China.

Publication information

Comput Biol Med. 2011 Aug;41(8):640-7. doi: 10.1016/j.compbiomed.2011.05.015. Epub 2011 Jun 12.

Abstract

Protein remote homology detection is a critical step toward annotating a protein's structure and function. Supervised learning algorithms such as support vector machines (SVMs) are currently the most accurate methods. Position-specific score matrices (PSSMs) contain rich information about the evolutionary relationships of proteins. However, PSSMs often have different lengths, which makes them difficult to use directly in machine-learning methods. In this study, a simple, fast and powerful method is presented for protein remote homology detection, which combines a support vector machine with auto-cross covariance transformation. The PSSMs are converted into a series of fixed-length vectors by auto-cross covariance transformation, and these vectors are then input to a support vector machine classifier for remote homology detection. This scheme effectively captures sequence-order effects. Experiments are performed on well-established datasets, with remote homology simulated at the superfamily level and at the fold level. The results show that the proposed method, referred to as ACCRe, is comparable to or even better than state-of-the-art methods in terms of detection performance, and its time complexity is superior to that of other profile-based SVM methods. The auto-cross covariance transformation provides a novel way of using evolutionary information and can be widely applied in protein-level studies.
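The conversion of a variable-length PSSM into a fixed-length vector can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes the commonly used definition of auto-cross covariance: for each lag, the covariance between one PSSM column and another (or the same) column shifted by that lag, averaged over sequence positions. The function name `acc_transform`, the parameter `max_lag`, and the random stand-in matrices are hypothetical and are not taken from the paper.

```python
import numpy as np


def acc_transform(pssm, max_lag):
    """Map an (L, n_cols) PSSM to a vector of n_cols**2 * max_lag values.

    Assumed definition (not confirmed by the abstract): for each lag, the
    covariance between column a at position j and column b at position
    j + lag, averaged over the L - lag valid positions.
    """
    pssm = np.asarray(pssm, dtype=float)
    length, n_cols = pssm.shape
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")

    # Center every column on its mean over the whole sequence.
    centered = pssm - pssm.mean(axis=0)

    features = []
    for lag in range(1, max_lag + 1):
        # cov[a, b] = mean_j centered[j, a] * centered[j + lag, b]
        # Diagonal entries are auto covariances, off-diagonal entries are
        # cross covariances.
        cov = centered[:-lag].T @ centered[lag:] / (length - lag)
        features.append(cov.ravel())
    return np.concatenate(features)


# Random matrices stand in for real PSSMs of two proteins of different length.
rng = np.random.default_rng(0)
short_pssm = rng.normal(size=(50, 20))
long_pssm = rng.normal(size=(180, 20))

# Both proteins yield vectors of the same length: 20 * 20 * 5 = 2000.
print(acc_transform(short_pssm, max_lag=5).shape)  # (2000,)
print(acc_transform(long_pssm, max_lag=5).shape)   # (2000,)
```

Because the output length depends only on the number of PSSM columns and the maximum lag, vectors from proteins of any length can be stacked into one feature matrix and passed to a standard SVM implementation. The lagged products are also what let the representation retain some sequence-order information.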
