Lu Lingyi, Qian Ziliang, Cai Yu-Dong, Li Yixue
Bioinformatics Center, Key Lab of Molecular Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China.
Comput Biol Chem. 2007 Jun;31(3):226-32. doi: 10.1016/j.compbiolchem.2007.03.008. Epub 2007 Mar 30.
Classification of enzymes is a prerequisite for understanding their function. Here, an automatic enzyme identifier based on a support vector machine (SVM), with feature vectors derived from protein functional domain composition, was built to identify enzymes, together with a classifier that assigns enzymes to six classes: oxidoreductase, transferase, hydrolase, lyase, isomerase and ligase. The jackknife cross-validation test was adopted to evaluate the performance of our classifier. Success rates of 86.03% for enzyme/non-enzyme identification and 91.32% for enzyme classification were achieved, which are much better than those of the BLAST- and PSI-BLAST-based methods and also outperform several existing works. The results indicate that protein functional domain composition is able to capture the major features that facilitate the identification and classification of proteins, thus demonstrating that our predictor could be a more effective and promising high-throughput method in enzyme research. Moreover, a web-based software tool, the Enzyme Classification System (ECS), for identification as well as classification of enzymes can be accessed at: http://pcal.biosino.org/.
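The two ideas the abstract relies on, encoding a protein by its functional domain composition and scoring a classifier with the jackknife (leave-one-out) test, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the domain identifiers (DOM_A to DOM_D), the four toy proteins, and the function names are hypothetical, and a simple nearest-neighbour rule stands in for the SVM so that the example runs with the standard library alone. It is not the authors' implementation.

```python
from typing import Callable, List, Sequence, Set

Vector = List[int]
Predictor = Callable[[List[Vector], List[str], Vector], str]


def domain_composition(domains: Set[str], vocabulary: Sequence[str]) -> Vector:
    """Binary feature vector: 1 if the protein contains the i-th domain, else 0."""
    return [1 if d in domains else 0 for d in vocabulary]


def nearest_neighbor_predict(train_x: List[Vector], train_y: List[str],
                             query: Vector) -> str:
    """Stand-in classifier (NOT an SVM): label of the closest training vector
    by Hamming distance; ties go to the earliest training example."""
    best_label, best_dist = train_y[0], None
    for x, y in zip(train_x, train_y):
        dist = sum(a != b for a, b in zip(x, query))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def jackknife_accuracy(xs: List[Vector], ys: List[str],
                       predict: Predictor) -> float:
    """Leave each sample out in turn, train on the rest, and test on it."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical domain vocabulary and toy proteins (not real database entries).
vocabulary = ["DOM_A", "DOM_B", "DOM_C", "DOM_D"]
proteins = [
    ({"DOM_A"}, "enzyme"),
    ({"DOM_A", "DOM_B"}, "enzyme"),
    ({"DOM_C"}, "non-enzyme"),
    ({"DOM_C", "DOM_D"}, "non-enzyme"),
]

features = [domain_composition(domains, vocabulary) for domains, _ in proteins]
labels = [label for _, label in proteins]
accuracy = jackknife_accuracy(features, labels, nearest_neighbor_predict)
print(f"jackknife success rate: {accuracy:.2%}")
```

Because jackknife_accuracy takes the predictor as an argument, an actual SVM could be substituted for the nearest-neighbour rule without changing the evaluation loop; the same loop applies to both the enzyme/non-enzyme step and the six-class step.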