GPCR-2L: predicting G protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions.

Authors

Xiao Xuan, Wang Pu, Chou Kuo-Chen

Affiliation

Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, China.

Publication

Mol Biosyst. 2011 Mar;7(3):911-9. doi: 10.1039/c0mb00170h. Epub 2010 Dec 23.

Abstract

G protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. With the avalanche of newly generated protein sequences in the post-genomic age, it is highly desirable to develop an automated method that rapidly identifies GPCRs and their types in order to expedite drug discovery. A new predictor was developed by hybridizing two different modes of pseudo-amino acid composition (PseAAC): the functional domain PseAAC and the low-frequency Fourier spectrum PseAAC. The new predictor is called GPCR-2L, where "2L" means that it is a two-layer predictor: the first-layer prediction engine identifies a query protein as a GPCR or a non-GPCR; if it is a GPCR, the prediction automatically continues to identify it as belonging to one of the following six types: (1) rhodopsin-like (Class A), (2) secretin-like (Class B), (3) metabotropic glutamate/pheromone (Class C), (4) fungal pheromone (Class D), (5) cAMP receptor (Class E), or (6) frizzled/smoothened family (Class F). The overall success rate of GPCR-2L in identifying proteins as GPCRs or non-GPCRs is over 97.2%, and its success rate in identifying GPCRs among their six types is over 97.8%. These success rates were derived by rigorous jackknife cross-validation on a stringent benchmark dataset in which none of the included proteins had ≥40% pairwise sequence identity to any other protein in the same subset. As a user-friendly web server, GPCR-2L is freely accessible to the public at http://icpr.jci.edu.cn/, where the two-level results for a query protein sequence of 500 amino acids can be obtained in about 20 s. Longer sequences generally take more time. The high success rates reported here indicate that combining functional domain information with low-frequency Fourier spectrum analysis is quite an effective approach to identifying GPCRs and their types. It is anticipated that GPCR-2L may become a useful tool for both basic research and drug development in the areas related to GPCRs.
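
The two-layer scheme and the jackknife test described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not name the classifier, so a 1-nearest-neighbour rule stands in for it. Plain amino acid composition stands in for the functional domain PseAAC, and the Kyte-Doolittle hydropathy scale is used as an example numeric encoding for the low-frequency Fourier part. All function names and the toy sequences are hypothetical.

```python
import cmath
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used only as an example encoding (assumption).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def features(seq, n_freq=3):
    """Amino acid composition plus the first n_freq Fourier magnitudes."""
    n = len(seq)
    composition = [seq.count(a) / n for a in AMINO_ACIDS]
    signal = [HYDROPATHY.get(r, 0.0) for r in seq]
    spectrum = []
    for k in range(1, n_freq + 1):  # low-frequency components only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) / n)
    return composition + spectrum


def nearest_label(query_vec, labelled_vecs):
    """Label of the training vector closest to query_vec (1-NN stand-in)."""
    return min(labelled_vecs, key=lambda item: math.dist(query_vec, item[0]))[1]


def predict_two_layer(seq, training):
    """Layer 1: GPCR or not. Layer 2: GPCR type, only if layer 1 says GPCR."""
    q = features(seq)
    is_gpcr = nearest_label(q, [(features(s), g) for s, g, _ in training])
    if not is_gpcr:
        return ("non-GPCR", None)
    gpcrs = [(features(s), t) for s, g, t in training if g]
    if not gpcrs:  # no GPCR examples to type against
        return ("GPCR", None)
    return ("GPCR", nearest_label(q, gpcrs))


def jackknife_layer1_accuracy(dataset):
    """Leave each protein out in turn and predict it from the rest."""
    correct = 0
    for i, (seq, is_gpcr, _) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        predicted, _ = predict_two_layer(seq, rest)
        correct += (predicted == "GPCR") == is_gpcr
    return correct / len(dataset)


# Toy records: (sequence, is_gpcr, type). Made-up sequences, not real proteins.
TOY_DATASET = [
    ("LLIVFAVLLIVGALLFAVIL", True, "A"),
    ("LIVFAVLLIVGALLFAVILL", True, "A"),
    ("MWWCCLIVFAVCCWWLIVMM", True, "B"),
    ("MWCCLIVFAVCCWWLIVMMW", True, "B"),
    ("DEKRDEKRNQDEKRSTDEKR", False, None),
    ("EKRDEKRNQDEKRSTDEKRD", False, None),
]

if __name__ == "__main__":
    print(predict_two_layer("LLIVFAVLLIVGALLFAVIL", TOY_DATASET))
    print(jackknife_layer1_accuracy(TOY_DATASET))
```

The jackknife loop retrains on all remaining proteins for every held-out query, which is why the benchmark's 40% identity cutoff matters: without it, a near-identical neighbour left in the training set would inflate the measured success rate.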
