Muthukrishnan S, Garg Aarti, Raghava G P S
Institute of Microbial Technology, Sector 39-A, Chandigarh 160036, India.
Genomics Proteomics Bioinformatics. 2007 Dec;5(3-4):250-2. doi: 10.1016/S1672-0229(08)60012-1.
This study describes a method for predicting and classifying oxygen-binding proteins. First, support vector machine (SVM) modules were developed to predict oxygen-binding proteins from amino acid composition and from dipeptide composition, achieving maximum accuracies of 85.5% and 87.8%, respectively. Second, an SVM module based on amino acid composition was developed to classify the predicted oxygen-binding proteins into six classes, with accuracies of 95.8%, 97.5%, 97.5%, 96.9%, 99.4%, and 96.0% for erythrocruorin, hemerythrin, hemocyanin, hemoglobin, leghemoglobin, and myoglobin, respectively. Finally, an SVM module based on dipeptide composition was developed for the same classification task, achieving maximum accuracies of 96.1%, 98.7%, 98.7%, 85.6%, 99.6%, and 93.3% for the same six classes, respectively. All modules were trained and tested using five-fold cross-validation. Based on this approach, a web server, Oxypred, was developed for predicting and classifying oxygen-binding proteins (available at http://www.imtech.res.in/raghava/oxypred/).
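The two feature types named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The function names, the percentage scaling, and the choice to drop non-standard residues before counting are assumptions made for illustration only. Amino acid composition gives a 20-dimensional vector, and dipeptide composition gives a 400-dimensional vector of overlapping residue pairs. Either vector could then be passed to an SVM.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def _clean(sequence):
    """Upper-case the sequence and drop non-standard residues (an assumption)."""
    return "".join(r for r in sequence.upper() if r in AMINO_ACIDS)


def amino_acid_composition(sequence):
    """Return a 20-element list: percentage of each amino acid in the sequence."""
    seq = _clean(sequence)
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-element list: percentage of each overlapping dipeptide."""
    seq = _clean(sequence)
    total = len(seq) - 1  # number of overlapping residue pairs
    if total <= 0:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        counts[seq[i:i + 2]] += 1
    return [100.0 * counts[dp] / total for dp in DIPEPTIDES]


if __name__ == "__main__":
    example = "MVLSPADKTNVKAAW"  # arbitrary illustrative fragment
    print(len(amino_acid_composition(example)))
    print(len(dipeptide_composition(example)))
```

Fixing the residue order up front keeps each feature at the same index for every protein, which is what a fixed-length SVM input requires regardless of sequence length.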