Bhasin Manoj, Raghava G P S
Institute of Microbial Technology, Sector 39-A, Chandigarh, 160036, India.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W383-9. doi: 10.1093/nar/gkh416.
G-protein-coupled receptors (GPCRs) form one of the largest superfamilies of membrane proteins and are important targets for drug design. In this study, a support vector machine (SVM)-based method, GPCRpred, was developed to predict the families and subfamilies of GPCRs from the dipeptide composition of proteins. The dataset used for training and testing was obtained from http://www.soe.ucsc.edu/research/compbio/gpcr/. The method distinguished GPCRs from non-GPCRs with an accuracy of 99.5% when evaluated using 5-fold cross-validation. It also predicted the five major classes (families) of GPCRs with an overall Matthews correlation coefficient (MCC) of 0.81 and an accuracy of 97.5%. In recognizing the subfamilies of the rhodopsin-like family, the method achieved an average MCC of 0.97 and an accuracy of 97.3%. When evaluated on an independent (blind) dataset of 650 GPCRs, the method achieved overall accuracies of 91.3% and 96.4% at the family and subfamily levels, respectively. A server for recognizing and classifying GPCRs based on multiclass SVMs has been set up at http://www.imtech.res.in/raghava/gpcrpred/. We also suggest subfamilies for 42 sequences that were previously identified as unclassified Class A GPCRs. Supplementary information is available at http://www.imtech.res.in/raghava/gpcrpred/info.html.
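The dipeptide composition mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the GPCRpred code: the function name `dipeptide_composition`, the restriction to the 20 standard amino acids, and the choice to skip dipeptides containing any other character are assumptions made for illustration. Each sequence is mapped to a fixed-length vector of 400 fractions (one per ordered amino-acid pair), which is the kind of input an SVM classifier can take regardless of sequence length.

```python
from itertools import product

# The 20 standard amino acids (assumption: non-standard residues are ignored).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered pairs, in a fixed order so every vector is comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list with the fraction of each dipeptide.

    Overlapping dipeptides are counted at every position, and each count
    is divided by the number of valid dipeptides found in the sequence.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:  # skip pairs with non-standard characters
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or with no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]
```

For example, the toy sequence `"ACAC"` contains the overlapping dipeptides AC, CA and AC, so the entry for AC is 2/3, the entry for CA is 1/3, and all other entries are zero.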