Predicting enzyme subclass by functional domain composition and pseudo amino acid composition.

Authors

Cai Yu-Dong, Chou Kuo-Chen

Affiliation

Bioinformatics Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.

Publication information

J Proteome Res. 2005 May-Jun;4(3):967-71. doi: 10.1021/pr0500399.

DOI:10.1021/pr0500399

PMID:15952744

Abstract

As a continuous effort to use the sequence approach to identify enzymatic function at a deeper level, investigations are extended from the main enzyme classes (Protein Sci. 2004, 13, 2857-2863) to their subclasses. This is indispensable if we wish to understand the molecular mechanism of an enzyme at a deeper level. For each of the 6 main enzyme classes (i.e., oxidoreductase, transferase, hydrolase, lyase, isomerase, and ligase), a subclass training dataset is constructed. To reduce homologous bias, a stringent cutoff was imposed that all the entries included in the datasets have less than 40% sequence identity to each other. To catch the core feature that is intimately related to the biological function, the sample of a protein is represented by hybridizing the functional domain composition and pseudo amino acid composition. On the basis of such a hybridization representation, the FunD-PseAA predictor is established. It is demonstrated by the jackknife cross-validation tests that the overall success rate in identifying the 21 subclasses of oxidoreductases is above 86%, and the corresponding rates in identifying the subclasses of the other 5 main enzyme classes are 94-97%. The high success rates imply that the FunD-PseAA predictor may become a useful tool in bioinformatics and proteomics of the post-genomic era.
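The hybrid representation and the jackknife test described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the pseudo amino acid composition follows the common form of 20 composition terms plus `lam` sequence-order correlation terms built from a single property (Kyte-Doolittle hydropathy), the functional domain composition is a binary presence vector over a hypothetical domain vocabulary (`DOM_A`, `DOM_B`), and the classifier is a plain 1-nearest-neighbor rule. The sequences and subclass labels are made-up toy data, and this is not the FunD-PseAA implementation itself.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy, used only as an illustrative property (assumption).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    std = math.sqrt(sum((v - mean) ** 2 for v in scale.values()) / len(scale))
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = _standardize(HYDROPATHY)


def pseudo_aa_composition(seq, lam=2, weight=0.05):
    """Return 20 composition terms followed by `lam` correlation terms."""
    residues = [aa for aa in seq.upper() if aa in H]
    if len(residues) <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Average squared property difference between residues k positions apart.
        pairs = zip(residues, residues[k:])
        total = sum((H[a] - H[b]) ** 2 for a, b in pairs)
        thetas.append(total / (len(residues) - k))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def functional_domain_composition(domains, vocabulary):
    """Binary vector: 1.0 where the protein has a hit for that domain entry."""
    hits = set(domains)
    return [1.0 if d in hits else 0.0 for d in vocabulary]


def hybrid_vector(seq, domains, vocabulary, lam=2):
    """Concatenate the domain vector and the pseudo amino acid composition."""
    return functional_domain_composition(domains, vocabulary) + pseudo_aa_composition(seq, lam)


def nearest_neighbor_label(query, samples):
    """Label of the training sample closest to `query` (Euclidean distance)."""
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]


def jackknife_success_rate(samples):
    """Leave each sample out in turn and predict it from all the others."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        correct += nearest_neighbor_label(vec, rest) == label
    return correct / len(samples)


# Toy data: hypothetical domain vocabulary, sequences, and subclass labels.
VOCABULARY = ["DOM_A", "DOM_B"]
toy_proteins = [
    ("MKVLAAGIVGLLLA", ["DOM_A"], "subclass_1"),
    ("MKILAAGLVGALLA", ["DOM_A"], "subclass_1"),
    ("MDEKRRDEEKKRDE", ["DOM_B"], "subclass_2"),
    ("MDDKRKDEEKRRDE", ["DOM_B"], "subclass_2"),
]
samples = [(hybrid_vector(seq, doms, VOCABULARY), label)
           for seq, doms, label in toy_proteins]
print(jackknife_success_rate(samples))
```

In a jackknife test each protein is held out once and predicted from the remaining ones, so the reported success rate is the fraction of held-out proteins assigned to their correct subclass. A real study would also need the redundancy filter mentioned in the abstract (less than 40% pairwise sequence identity), which this sketch omits.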

Similar articles

Predicting enzyme subclass by functional domain composition and pseudo amino acid composition.

J Proteome Res. 2005 May-Jun;4(3):967-71. doi: 10.1021/pr0500399.

Predicting enzyme family classes by hybridizing gene product composition and pseudo-amino acid composition.

J Theor Biol. 2005 May 7;234(1):145-9. doi: 10.1016/j.jtbi.2004.11.017. Epub 2005 Jan 26.

Predicting membrane protein type by functional domain composition and pseudo-amino acid composition.

J Theor Biol. 2006 Jan 21;238(2):395-400. doi: 10.1016/j.jtbi.2005.05.035. Epub 2005 Jul 25.

ECS: an automatic enzyme classifier based on functional domain composition.

Comput Biol Chem. 2007 Jun;31(3):226-32. doi: 10.1016/j.compbiolchem.2007.03.008. Epub 2007 Mar 30.

Using GO-PseAA predictor to predict enzyme sub-class.

Biochem Biophys Res Commun. 2004 Dec 10;325(2):506-9. doi: 10.1016/j.bbrc.2004.10.058.

Predicting enzyme family class in a hybridization space.

Protein Sci. 2004 Nov;13(11):2857-63. doi: 10.1110/ps.04981104.

Predicting protease types by hybridizing gene ontology and pseudo amino acid composition.

Proteins. 2006 May 15;63(3):681-4. doi: 10.1002/prot.20898.

Predicting subcellular localization of proteins by hybridizing functional domain composition and pseudo-amino acid composition.

J Cell Biochem. 2004 Apr 15;91(6):1197-203. doi: 10.1002/jcb.10790.

EzyPred: a top-down approach for predicting enzyme functional classes and subclasses.

Biochem Biophys Res Commun. 2007 Dec 7;364(1):53-9. doi: 10.1016/j.bbrc.2007.09.098. Epub 2007 Oct 2.

Using GO-PseAA predictor to identify membrane proteins and their types.

Biochem Biophys Res Commun. 2005 Feb 18;327(3):845-7. doi: 10.1016/j.bbrc.2004.12.069.

Cited by

PredictEFC: a fast and efficient multi-label classifier for predicting enzyme family classes.

BMC Bioinformatics. 2024 Jan 30;25(1):50. doi: 10.1186/s12859-024-05665-1.

Improving automatic GO annotation with semantic similarity.

BMC Bioinformatics. 2022 Dec 12;23(Suppl 2):433. doi: 10.1186/s12859-022-04958-7.

Yeasts isolated from a lotic continental environment in Brazil show potential to produce amylase, cellulase and protease.

Biotechnol Rep (Amst). 2021 May 29;30:e00630. doi: 10.1016/j.btre.2021.e00630. eCollection 2021 Jun.

Identification of Human Enzymes Using Amino Acid Composition and the Composition of k-Spaced Amino Acid Pairs.

Biomed Res Int. 2020 May 22;2020:9235920. doi: 10.1155/2020/9235920. eCollection 2020.

Pathogenicity-associated protein domains: The fiercely-conserved evolutionary signatures.

Gene Rep. 2017 Jun;7:127-141. doi: 10.1016/j.genrep.2017.04.004. Epub 2017 Apr 8.

GrAPFI: predicting enzymatic function of proteins from domain similarity graphs.

BMC Bioinformatics. 2020 Apr 29;21(1):168. doi: 10.1186/s12859-020-3460-7.

DEEPre: sequence-based enzyme EC number prediction by deep learning.

Bioinformatics. 2018 Mar 1;34(5):760-769. doi: 10.1093/bioinformatics/btx680.

Using feature optimization-based support vector machine method to recognize the β-hairpin motifs in enzymes.

Saudi J Biol Sci. 2017 Sep;24(6):1361-1369. doi: 10.1016/j.sjbs.2016.11.014. Epub 2016 Nov 28.

Fishy business: effect of omega-3 fatty acids on zinc transporters and free zinc availability in human neuronal cells.

Nutrients. 2014 Aug 15;6(8):3245-58. doi: 10.3390/nu6083245.

Computational Approaches for Automated Classification of Enzyme Sequences.

J Proteomics Bioinform. 2011 Aug 23;4:147-152. doi: 10.4172/jpb.1000183.
