Mohammed Akram, Guda Chittibabu
Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, NE, USA.
J Proteomics Bioinform. 2011 Aug 23;4:147-152. doi: 10.4172/jpb.1000183.
Determining the functional role(s) of enzymes is very important to build the metabolic blueprint of an organism and to identify the potential roles enzymes may play in metabolic and disease pathways. With exponential growth in gene and protein sequence data, it is not feasible to experimentally characterize the function(s) of all enzymes. Alternatively, computational methods can be used to annotate the enormous number of unannotated enzyme sequences. For function prediction and classification of enzymes, features based on amino acid composition, sequence and structural properties, domain composition and specific peptide information have been widely used by different computational approaches. Each feature space has its own merits and limitations with respect to overall prediction accuracy. Prediction accuracy improves when machine-learning methods are used to classify enzymes. Given the incomplete and unbalanced nature of annotations in biological databases, ensemble methods or methods that rely on a combination of orthogonal features are more desirable for achieving higher accuracy and coverage in enzyme classification. In this review article, we systematically describe all the features and methods used thus far for enzyme class prediction. To the authors' knowledge, this review represents the most exhaustive description of methods used for computational prediction of enzyme classes.
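As an illustration of the simplest feature space named in the abstract, amino acid composition, the following minimal sketch computes a 20-dimensional composition vector for a protein sequence. The function name `amino_acid_composition` and the toy sequence are assumptions made for this example and are not taken from the review; the resulting vector is the kind of input a machine-learning classifier could be trained on.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard symbols (e.g. X, B, Z, gaps) are ignored.
    Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence, not a real enzyme.
features = amino_acid_composition("MKTAYIAKQR")
```

Composition vectors discard residue order entirely, which is one reason the abstract argues for combining them with orthogonal features such as domain composition or specific peptides.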