CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure.

Affiliation

Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany.

Publication information

BMC Bioinformatics. 2012 Apr 5;13:55. doi: 10.1186/1471-2105-13-55.

DOI:10.1186/1471-2105-13-55

PMID:22480135

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3391178/

Abstract

BACKGROUND

One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, on the basis of sequence information alone, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites must be expected. We therefore set out to design a new classifier that differentiates between functionally and structurally important sites and to assess its performance on representative datasets.

RESULTS

We have implemented CLIPS-1D, which predicts a role in catalysis, ligand-binding, or protein structure for residue-positions in a mutually exclusive manner. By analyzing a multiple sequence alignment, the algorithm scores the conservation and abundance of residues at individual sites and in their local neighborhood, and classifies each site by means of a multiclass support vector machine. A cross-validation confirmed that residue-positions involved in catalysis were identified with state-of-the-art quality; the mean MCC-value was 0.34. For structurally important sites, prediction quality was considerably higher (mean MCC = 0.67). For ligand-binding sites, prediction quality was lower (mean MCC = 0.12), because binding sites and structurally important residue-positions have similar conservation and abundance values, which makes their separation difficult. We show that classification success varies among residues in a class-specific manner. For this reason, our algorithm computes residue-specific p-values, which allow for the statistical assessment of each individual prediction. CLIPS-1D is available as a Web service at http://www-bioinf.uni-regensburg.de/.
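To make the quantities above more concrete, the following is a minimal Python sketch of two ideas: a per-column conservation and abundance feature computed from a multiple sequence alignment, and the Matthews correlation coefficient (MCC) used to report prediction quality. The MCC formula is the standard one. The feature definitions are assumptions made purely for illustration (conservation as one minus normalized Shannon entropy, abundance as the relative frequency of the most common residue); they are not the scores CLIPS-1D actually uses, and the names `column_features`, `mcc`, and the toy alignment `msa` are hypothetical.

```python
import math
from collections import Counter


def column_features(msa, col):
    """Return (conservation, abundance) for one alignment column.

    Assumption: conservation = 1 - normalized Shannon entropy and
    abundance = relative frequency of the most common residue.
    These are illustrative stand-ins, not the CLIPS-1D scores.
    """
    residues = [seq[col] for seq in msa if seq[col] != "-"]  # skip gaps
    if not residues:
        return 0.0, 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    conservation = 1.0 - entropy / math.log2(20)  # 20 standard amino acids
    abundance = counts.most_common(1)[0][1] / total
    return conservation, abundance


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal sum is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy alignment: four sequences, four columns.
msa = ["ACDK", "ACEK", "ACDR", "A-DK"]
features = [column_features(msa, c) for c in range(len(msa[0]))]
```

An MCC of 1 corresponds to perfect prediction, 0 to a prediction no better than chance, and -1 to complete disagreement, which is the scale on which the reported means of 0.34, 0.67, and 0.12 should be read. In a full pipeline, per-site feature vectors of this kind (together with values from neighboring columns) would be passed to a multiclass classifier; that step is omitted here.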

CONCLUSIONS

CLIPS-1D is a classifier whose prediction quality has been determined separately for catalytic sites, ligand-binding sites, and structurally important sites. It generates hypotheses about residue-positions important for a set of homologous proteins and focuses on conservation and abundance signals. Thus, the algorithm can be applied in cases where function cannot be transferred from well-characterized proteins by means of sequence comparison.
