Secondary structure-based assignment of the protein structural classes.

Authors

Kurgan Lukasz A, Zhang Tuo, Zhang Hua, Shen Shiyi, Ruan Jishou

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada.

Publication information

Amino Acids. 2008 Oct;35(3):551-64. doi: 10.1007/s00726-008-0080-3. Epub 2008 Apr 22.

DOI:10.1007/s00726-008-0080-3

PMID:18427716

Abstract

Structural classes categorize proteins based on the amount and arrangement of their constituent secondary structures. Knowledge of structural classes is applied in numerous important predictive tasks that address structural and functional features of proteins. We propose novel structural class assignment methods that use one-dimensional (1D) secondary structure as the input. The methods are designed based on a large set of low-identity sequences for which secondary structure is either predicted from the sequence (PSSA(sc) model) or assigned based on the tertiary structure (SSA(sc) model). The secondary structure is encoded using a comprehensive set of features describing the count, content, and size of secondary structure segments, which are fed into a small decision tree that uses ten features to perform the assignment. The proposed models were compared against seven secondary structure-based and ten sequence-based structural class predictors. Using the 1D secondary structure, SSA(sc) and PSSA(sc) can assign proteins to the four main structural classes, while the existing secondary structure-based assignment methods can predict only three classes. Empirical evaluation shows that the proposed models are promising. Using the structure-based assignment performed in SCOP (Structural Classification of Proteins) as the gold standard, the accuracy of SSA(sc) and PSSA(sc) is 76% and 75%, respectively. We show that using secondary structure predicted from the sequence as the input does not have a detrimental effect on the quality of structural class assignment when compared with using secondary structure derived from tertiary structure. Therefore, PSSA(sc) can be used to perform automated assignment of structural classes based on sequences.
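
The encoding step described in the abstract (count, content, and size of secondary structure segments) can be illustrated with a minimal Python sketch. It assumes a three-state secondary structure string using H (helix), E (strand), and C (coil). The function name and the feature names are hypothetical and chosen for illustration only; the abstract does not list the ten features or the decision tree used by SSA(sc) and PSSA(sc), so this is not the authors' feature set.

```python
from itertools import groupby


def segment_features(ss: str) -> dict:
    """Summarize a 3-state secondary structure string (H, E, C).

    Illustrative only: these features mirror the kinds named in the
    abstract (count, content, size of segments), not the paper's
    actual ten features.
    """
    if not ss:
        raise ValueError("empty secondary structure string")

    # Collapse the string into runs, e.g. "HHHCC" -> [("H", 3), ("C", 2)].
    segments = [(state, len(list(run))) for state, run in groupby(ss)]

    features = {}
    for state in "HEC":
        lengths = [n for s, n in segments if s == state]
        # Content: fraction of residues in this state.
        features[f"content_{state}"] = ss.count(state) / len(ss)
        # Count: number of contiguous segments of this state.
        features[f"count_{state}"] = len(lengths)
        # Size: longest and mean segment length for this state.
        features[f"max_len_{state}"] = max(lengths, default=0)
        features[f"avg_len_{state}"] = sum(lengths) / len(lengths) if lengths else 0.0
    return features


if __name__ == "__main__":
    # Made-up example string, not taken from the paper's dataset.
    for name, value in segment_features("CCHHHHHHCCEEEECCCHHHHC").items():
        print(name, value)
```

A feature vector of this kind could then be passed to any decision tree learner; which features and thresholds the published models use is stated only in the full paper.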

Similar articles

Beyond the Twilight Zone: automated prediction of structural properties of proteins by recursive neural networks and remote homology information.

Proteins. 2009 Oct;77(1):181-90. doi: 10.1002/prot.22429.

High-accuracy prediction of protein structural class for low-similarity sequences based on predicted secondary structure.

Biochimie. 2011 Apr;93(4):710-4. doi: 10.1016/j.biochi.2011.01.001. Epub 2011 Jan 13.

PFRES: protein fold classification by using evolutionary information and predicted secondary structure.

Bioinformatics. 2007 Nov 1;23(21):2843-50. doi: 10.1093/bioinformatics/btm475. Epub 2007 Oct 17.

A simple method for protein structural classification.

J Mol Graph Model. 2007 Mar;25(6):852-5. doi: 10.1016/j.jmgm.2006.08.006. Epub 2006 Aug 30.

Defining linear segments in protein structure.

J Mol Biol. 2001 Jul 27;310(5):1135-50. doi: 10.1006/jmbi.2001.4817.

Sequence-based protein structure prediction using a reduced state-space hidden Markov model.

Comput Biol Med. 2007 Sep;37(9):1211-24. doi: 10.1016/j.compbiomed.2006.10.014. Epub 2006 Dec 11.

New method for protein secondary structure assignment based on a simple topological descriptor.

Proteins. 2005 Aug 15;60(3):513-24. doi: 10.1002/prot.20471.

A high-accuracy protein structural class prediction algorithm using predicted secondary structural information.

J Theor Biol. 2010 Dec 7;267(3):272-5. doi: 10.1016/j.jtbi.2010.09.007. Epub 2010 Sep 8.

Enhanced protein fold recognition using a structural alphabet.

Proteins. 2009 Jul;76(1):129-37. doi: 10.1002/prot.22324.

Cited by

CIPPN: computational identification of protein pupylation sites by using neural network.

Oncotarget. 2017 Nov 6;8(65):108867-108879. doi: 10.18632/oncotarget.22335. eCollection 2017 Dec 12.

Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach.

Int J Mol Sci. 2015 Dec 24;17(1):15. doi: 10.3390/ijms17010015.

Customised fragments libraries for protein structure prediction based on structural class annotations.

BMC Bioinformatics. 2015 Apr 29;16(1):136. doi: 10.1186/s12859-015-0576-2.

Improving protein fold recognition using the amalgamation of evolutionary-based and structural based information.

BMC Bioinformatics. 2014;15 Suppl 16(Suppl 16):S12. doi: 10.1186/1471-2105-15-S16-S12. Epub 2014 Dec 8.

Characteristics of protein residue-residue contacts and their application in contact prediction.

J Mol Model. 2014 Nov;20(11):2497. doi: 10.1007/s00894-014-2497-9. Epub 2014 Nov 6.

Proposing a highly accurate protein structural class predictor using segmentation-based features.

BMC Genomics. 2014;15 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2164-15-S1-S2. Epub 2014 Jan 24.

Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences.

BMC Bioinformatics. 2009 Dec 13;10:414. doi: 10.1186/1471-2105-10-414.

Prodepth: predict residue depth by support vector regression approach from protein sequences only.

PLoS One. 2009 Sep 17;4(9):e7072. doi: 10.1371/journal.pone.0007072.

Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis.

BMC Struct Biol. 2009 Apr 17;9:23. doi: 10.1186/1472-6807-9-23.

Protein-segment universe exhibiting transitions at intermediate segment length in conformational subspaces.

BMC Struct Biol. 2008 Aug 13;8:37. doi: 10.1186/1472-6807-8-37.
