A highly accurate protein structural class prediction approach using auto cross covariance transformation and recursive feature elimination.

Author information

Li Xiaowei, Liu Taigang, Tao Peiying, Wang Chunhua, Chen Lanming

Affiliations

College of Food Science & Technology, Shanghai Ocean University, Shanghai 201306, China.

College of Information Technology, Shanghai Ocean University, Shanghai 201306, China.

Publication information

Comput Biol Chem. 2015 Dec;59 Pt A:95-100. doi: 10.1016/j.compbiolchem.2015.08.012. Epub 2015 Sep 2.

Abstract

Structural class characterizes the overall folding type of a protein or its domain. Many methods have been proposed in recent years to improve the accuracy of protein structural class prediction, but low-similarity sequences remain a challenge. In this study, we introduce a feature extraction technique that represents a protein sequence by the auto cross covariance (ACC) transformation of its position-specific score matrix (PSSM). Support vector machine-recursive feature elimination (SVM-RFE) is then used to select the top K features by importance, and these features are fed to a support vector machine (SVM) for prediction. The method is evaluated with the jackknife test on three low-similarity datasets: D640, 1189, and 25PDB. It achieves overall accuracies of 97.2%, 96.2%, and 93.3% on these datasets, respectively, which are higher than those of most existing methods. This suggests that the proposed method could serve as a cost-effective tool for predicting protein structural class, especially for low-similarity datasets.
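The abstract does not spell out the ACC formula, so the following is only a minimal sketch of the feature extraction step under the commonly used definition: for each lag, the lagged covariance between every ordered pair of PSSM columns, averaged over the sequence. Same-column pairs give the auto covariance terms and different-column pairs give the cross covariance terms. The function name `acc_transform`, the parameter `max_lag`, and the feature ordering are illustrative assumptions, not taken from the paper. The SVM-RFE selection and SVM classification steps are not shown.

```python
from typing import List


def acc_transform(pssm: List[List[float]], max_lag: int) -> List[float]:
    """Turn an L x N score matrix into a fixed-length ACC feature vector.

    Assumed definition (not confirmed by the abstract): for lag g and
    columns a, b, the feature is the mean over positions j of
    (S[j][a] - mean_a) * (S[j + g][b] - mean_b).
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")

    # Mean of each column over the whole sequence.
    means = [sum(row[c] for row in pssm) / length for c in range(n_cols)]
    # Center every column once so the inner loop stays simple.
    centered = [[row[c] - means[c] for c in range(n_cols)] for row in pssm]

    features: List[float] = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of position pairs separated by this lag
        for a in range(n_cols):
            for b in range(n_cols):
                # a == b is an auto covariance term, a != b a cross covariance term.
                total = sum(
                    centered[j][a] * centered[j + lag][b] for j in range(span)
                )
                features.append(total / span)
    return features


# Toy 4 x 2 matrix standing in for a real L x 20 PSSM (hypothetical values).
toy_pssm = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
toy_features = acc_transform(toy_pssm, max_lag=2)
print(len(toy_features))
```

Under this definition a real PSSM with 20 columns yields 400 features per lag, so the vector length depends only on `max_lag` and not on the sequence length. That fixed length is what allows sequences of different lengths to be passed to a feature selector and classifier.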
