GPCR-2L: predicting G protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions.

Authors

Xiao Xuan, Wang Pu, Chou Kuo-Chen

Affiliation

Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, China.

Publication information

Mol Biosyst. 2011 Mar;7(3):911-9. doi: 10.1039/c0mb00170h. Epub 2010 Dec 23.

DOI:10.1039/c0mb00170h

PMID:21180772

Abstract

G protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. With the avalanche of newly generated protein sequences in the post-genomic age, an automated method that rapidly identifies GPCRs and their types is highly desirable for expediting drug discovery. A new predictor was developed by hybridizing two different modes of pseudo-amino acid composition (PseAAC): the functional domain PseAAC and the low-frequency Fourier spectrum PseAAC. The new predictor is called GPCR-2L, where "2L" means that it is a two-layer predictor: the first layer identifies a query protein as a GPCR or a non-GPCR; if it is a GPCR, the prediction automatically continues to identify it as belonging to one of the following six types: (1) rhodopsin-like (Class A), (2) secretin-like (Class B), (3) metabotropic glutamate/pheromone (Class C), (4) fungal pheromone (Class D), (5) cAMP receptor (Class E), or (6) frizzled/smoothened family (Class F). The overall success rate of GPCR-2L in identifying proteins as GPCRs or non-GPCRs is over 97.2%, and its success rate in assigning GPCRs to one of the six types is over 97.8%. These success rates were derived by rigorous jackknife cross-validation on a stringent benchmark dataset in which no protein had ≥40% pairwise sequence identity to any other protein in the same subset. GPCR-2L is freely accessible to the public as a user-friendly web server at http://icpr.jci.edu.cn/, which returns the two-level results in about 20 s for a query protein sequence of 500 amino acids; longer sequences generally take more time. The high success rates reported here indicate that combining functional domain information with low-frequency Fourier spectrum analysis is quite an effective approach to identifying GPCRs and their types. It is anticipated that GPCR-2L may become a useful tool for both basic research and drug development in the areas related to GPCRs.
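The two-layer flow and the jackknife test described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the classifier, the numeric encoding of residues, or the number of Fourier components, so the 1-nearest-neighbour classifier, the Kyte-Doolittle hydrophobicity encoding, and `n_coeffs=4` below are all assumptions chosen for illustration. Plain amino acid composition stands in for the functional domain PseAAC, which requires a domain database, and every function name is hypothetical.

```python
import cmath
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: Kyte-Doolittle hydrophobicity as the numeric encoding of residues.
KD = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
      1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3]
HYDROPHOBICITY = dict(zip(AMINO_ACIDS, KD))


def low_freq_spectrum(seq, n_coeffs=4):
    """Amplitudes of DFT components k = 1..n_coeffs of the encoded sequence."""
    signal = [HYDROPHOBICITY[aa] for aa in seq]
    n = len(signal)
    amps = []
    for k in range(1, n_coeffs + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        amps.append(abs(coeff) / n)
    return amps


def features(seq, n_coeffs=4):
    """20 amino acid frequencies followed by the low-frequency amplitudes."""
    comp = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    return comp + low_freq_spectrum(seq, n_coeffs)


def nearest_label(query, training):
    """Label of the training vector closest to query (1-nearest neighbour)."""
    return min(training, key=lambda item: math.dist(query, item[0]))[1]


def predict_two_layer(seq, layer1, layer2):
    """Layer 1: GPCR or non-GPCR. Layer 2 (GPCRs only): one of the types."""
    x = features(seq)
    if nearest_label(x, layer1) != "GPCR":
        return ("non-GPCR", None)
    return ("GPCR", nearest_label(x, layer2))


def jackknife_accuracy(samples):
    """Leave-one-out test: predict each sample from all the other samples."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += nearest_label(x, rest) == y
    return hits / len(samples)
```

Here `layer1` and `layer2` are lists of `(feature_vector, label)` pairs, with labels `"GPCR"`/`"non-GPCR"` in the first layer and the six type names in the second. Because each jackknife round holds out exactly one sample, the estimate involves no random split and is reproducible for a given dataset.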

Similar articles

GPCR-CA: A cellular automaton image approach for predicting G-protein-coupled receptor functional classes.

J Comput Chem. 2009 Jul 15;30(9):1414-23. doi: 10.1002/jcc.21163.

GPCR-GIA: a web-server for identifying G-protein coupled receptors and their families with grey incidence analysis.

Protein Eng Des Sel. 2009 Nov;22(11):699-705. doi: 10.1093/protein/gzp057. Epub 2009 Sep 22.

MemType-2L: a web server for predicting membrane proteins and their types by incorporating evolution information through Pse-PSSM.

Biochem Biophys Res Commun. 2007 Aug 24;360(2):339-45. doi: 10.1016/j.bbrc.2007.06.027. Epub 2007 Jun 15.

Prediction of G-protein-coupled receptor classes.

J Proteome Res. 2005 Jul-Aug;4(4):1413-8. doi: 10.1021/pr050087t.

GPCR-MPredictor: multi-level prediction of G protein-coupled receptors using genetic ensemble.

Amino Acids. 2012 May;42(5):1809-23. doi: 10.1007/s00726-011-0902-6. Epub 2011 Apr 20.

QuatIdent: a web server for identifying protein quaternary structural attribute by fusing functional domain and sequential evolution information.

J Proteome Res. 2009 Mar;8(3):1577-84. doi: 10.1021/pr800957q.

Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types.

Biochem Biophys Res Commun. 2005 Aug 19;334(1):288-92. doi: 10.1016/j.bbrc.2005.06.087.

EzyPred: a top-down approach for predicting enzyme functional classes and subclasses.

Biochem Biophys Res Commun. 2007 Dec 7;364(1):53-9. doi: 10.1016/j.bbrc.2007.09.098. Epub 2007 Oct 2.

iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition.

Anal Biochem. 2013 Nov 1;442(1):118-25. doi: 10.1016/j.ab.2013.05.024. Epub 2013 Jun 10.

Cited by

Plant protection product dose rate estimation in apple orchards using a fuzzy logic system.

PLoS One. 2019 Apr 24;14(4):e0214315. doi: 10.1371/journal.pone.0214315. eCollection 2019.

HRGPred: Prediction of herbicide resistant genes with k-mer nucleotide compositional features and support vector machine.

Sci Rep. 2019 Jan 28;9(1):778. doi: 10.1038/s41598-018-37309-9.

A novel feature ranking method for prediction of cancer stages using proteomics data.

PLoS One. 2017 Sep 21;12(9):e0184203. doi: 10.1371/journal.pone.0184203. eCollection 2017.

2L-piRNA: A Two-Layer Ensemble Classifier for Identifying Piwi-Interacting RNAs and Their Function.

Mol Ther Nucleic Acids. 2017 Jun 16;7:267-277. doi: 10.1016/j.omtn.2017.04.008. Epub 2017 Apr 13.

iACP: a sequence-based tool for identifying anticancer peptides.

Oncotarget. 2016 Mar 29;7(13):16895-909. doi: 10.18632/oncotarget.7815.

Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble.

BMC Bioinformatics. 2015;16 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-16-S12-S1. Epub 2015 Aug 25.

Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data.

Curr Genomics. 2015 Feb;16(1):3-22. doi: 10.2174/1389202915666141110210634.

A high performance prediction of HPV genotypes by Chaos game representation and singular value decomposition.

BMC Bioinformatics. 2015 Mar 5;16:71. doi: 10.1186/s12859-015-0493-4.

iCTX-type: a sequence-based predictor for identifying the types of conotoxins in targeting ion channels.

Biomed Res Int. 2014;2014:286419. doi: 10.1155/2014/286419. Epub 2014 Jun 1.

iMethyl-PseAAC: identification of protein methylation sites via a pseudo amino acid composition approach.

Biomed Res Int. 2014;2014:947416. doi: 10.1155/2014/947416. Epub 2014 May 22.
