Guo Yanzhi, Li Menglong, Lu Minchun, Wen Zhining, Huang Zhongtian
College of Chemistry, Sichuan University, Chengdu, People's Republic of China.
Proteins. 2006 Oct 1;65(1):55-60. doi: 10.1002/prot.21097.
Determining the coupling specificity of G-protein-coupled receptors (GPCRs) is important for understanding receptor function, and a successful method in this area would benefit both basic research and drug discovery. Previously published methods rely on transmembrane topology prediction at the training step, and in some cases even at the prediction step. However, even the best algorithms do not predict transmembrane topology with high accuracy. In this study, we developed a new method, a support vector machine (SVM) based on the auto-cross covariance (ACC) transform, to predict the coupling specificity between GPCRs and G-proteins. The primary amino acid sequences are translated into vectors based on the principal physicochemical properties of the amino acids, and the ACC transform then converts these data into a uniform matrix. SVMs for nonpromiscuous coupled GPCRs and promiscuous coupled GPCRs were trained and validated by the jackknife test, with very promising results. All classifiers also performed well on the test datasets. Besides its high prediction accuracy, the most important feature of this method is that it requires only the primary sequences of proteins, with no transmembrane topology prediction at either the training or the prediction step. The results indicate that this relatively simple method is applicable. Academic users can freely download the prediction program at http://www.scucic.net/group/database/Service.asp.
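The abstract does not give the ACC formula, so the following is only a minimal sketch of one common form of the auto-cross covariance transform, showing how sequences of different lengths can be mapped to feature vectors of equal length. The descriptor table `TOY_DESCRIPTORS`, the function names `encode` and `acc_transform`, the `max_lag` parameter, and the normalization by `n - lag` are all assumptions made for illustration. They are not taken from the paper, and the toy numbers are placeholders rather than published physicochemical scales.

```python
from typing import Dict, List

# Hypothetical two-property descriptor table (placeholder values only,
# NOT real physicochemical scales and NOT the descriptors used in the paper).
TOY_DESCRIPTORS: Dict[str, List[float]] = {
    "A": [1.0, 0.0],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.5],
}


def encode(sequence: str, table: Dict[str, List[float]] = TOY_DESCRIPTORS) -> List[List[float]]:
    """Translate a residue string into one descriptor row per residue."""
    return [table[residue] for residue in sequence]


def acc_transform(rows: List[List[float]], max_lag: int) -> List[float]:
    """Compute auto- and cross-covariance terms for lags 1..max_lag.

    For descriptors j and k at a given lag, the term is the mean of
    rows[i][j] * rows[i + lag][k] over all valid positions i.
    j == k gives auto-covariance; j != k gives cross-covariance.
    The output length is p * p * max_lag, independent of sequence length.
    """
    n = len(rows)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    p = len(rows[0])
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        for j in range(p):
            for k in range(p):
                total = sum(rows[i][j] * rows[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    short = acc_transform(encode("ACDA"), max_lag=2)
    longer = acc_transform(encode("ACDACDAC"), max_lag=2)
    # Both vectors have the same length despite different sequence lengths.
    print(len(short), len(longer))
```

Because every sequence yields a vector of the same length, vectors like these could be passed directly to a fixed-input classifier such as an SVM. The choice of descriptors and maximum lag would need to follow the full paper.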