A balanced secondary structure predictor.

Author information

Nasrul Islam Md, Iqbal Sumaiya, Katebi Ataur R, Tamjidul Hoque Md

Affiliations

Computer Science, University of New Orleans, Louisiana 70148, USA.

National Cancer Institute, National Institutes of Health (NIH), USA.

Publication information

J Theor Biol. 2016 Jan 21;389:60-71. doi: 10.1016/j.jtbi.2015.10.015. Epub 2015 Nov 5.

Abstract

Secondary structure (SS) refers to the local spatial organization of the polypeptide backbone atoms of a protein. Accurate SS prediction provides crucial features for accurately building the next level of structure, the 3D structure of the protein. SS has three major components: helix (H), beta (E), and coil (C). Most SS predictors show imbalanced accuracies: they report high performance in predicting H and C but low accuracy in predicting E. Because E residues are comparatively rare, a predictor can show very good overall performance by over-predicting H and C and under-predicting E, which can make it biologically inapplicable. In this work, we develop a balanced SS predictor by incorporating 33 physicochemical properties into 15-tuple peptides via Chou's general PseAAC, which yields higher accuracies in predicting all three SS components. Our approach trains three support vector machines (SVMs) for binary classification of the major classes, E versus non-E (E/¬E), C/¬C, and H/¬H, and then combines them into an optimized multiclass predictor using a genetic algorithm (GA). This GA-optimized three-class predictor, called cSVM, is further combined with SPINE X to form the proposed final balanced predictor, called MetaSSPred. This novel paradigm helps us optimize precision and recall. We prepared two independent test datasets (CB471 and N295) to compare the performance of our predictors with SPINE X. MetaSSPred significantly increases beta accuracy (QE) on both datasets: its QE scores on CB471 and N295 were 71.7% and 74.4%, respectively, improvements of 20.9% and 19.0% over the QE scores of SPINE X alone. The standard deviations of the accuracies across the three SS classes for MetaSSPred on CB471 and N295 were 4.2% and 2.3%, respectively, whereas for SPINE X they were 12.9% and 10.9%. These findings suggest that MetaSSPred is a well-balanced SS predictor compared with the state-of-the-art SPINE X predictor.
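The balance measure quoted in the abstract, the standard deviation of the per-class accuracies, can be illustrated with a minimal Python sketch. It assumes the common per-class definition (Q_X is the number of correctly predicted residues of class X divided by the number of residues truly in class X) and uses the population standard deviation; the abstract does not state which variant the authors used. The function names and the toy strings are hypothetical and are not data from the paper.

```python
from statistics import pstdev


def per_class_accuracy(true_ss, pred_ss, classes="HEC"):
    """Return {class: fraction of residues truly in that class that were predicted correctly}."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted strings must have equal length")
    acc = {}
    for c in classes:
        total = sum(1 for t in true_ss if t == c)
        correct = sum(1 for t, p in zip(true_ss, pred_ss) if t == c and p == c)
        # A class absent from the reference has no defined accuracy.
        acc[c] = correct / total if total else float("nan")
    return acc


def overall_accuracy(true_ss, pred_ss):
    """Fraction of all residues predicted correctly (often written Q3)."""
    return sum(t == p for t, p in zip(true_ss, pred_ss)) / len(true_ss)


def balance(acc):
    """Population standard deviation of the per-class accuracies (assumed variant)."""
    return pstdev(list(acc.values()))


# Toy example (hypothetical): E is under-predicted in favor of C.
true_ss = "HHHHEEEECCCC"
pred_ss = "HHHHCCEECCCH"

acc = per_class_accuracy(true_ss, pred_ss)
q3 = overall_accuracy(true_ss, pred_ss)
spread = balance(acc)
print(acc, q3, round(spread, 4))
```

On this toy input the overall accuracy is 75%, yet the per-class accuracies are 100% (H), 50% (E), and 75% (C); the spread across classes exposes an imbalance that the overall score hides.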
