Disentangling evolutionary signals: conservation, specificity determining positions and coevolution. Implication for catalytic residue prediction.

Affiliation

Fundación Instituto Leloir, Avda. Patricias Argentinas 435, CABA, C1405BWE, Argentina.

Publication information

BMC Bioinformatics. 2012 Sep 14;13:235. doi: 10.1186/1471-2105-13-235.

Abstract

BACKGROUND

A large panel of methods aims to identify residues with a critical impact on protein function based on evolutionary signals, sequence and structure information. However, it is not clear to what extent these methods overlap, or whether any of them has higher predictive potential than the others, in particular for the identification of catalytic residues (CR) in proteins. Using a large set of enzymatic protein families and measures based on different evolutionary signals, we sought to separate the different components of the information content within a multiple sequence alignment and to investigate their predictive potential and degree of overlap.

RESULTS

Our results show that the methods included in the benchmark can generally be divided into three groups with limited mutual overlap: one containing real-value Evolutionary Trace (rvET) methods and conservation; another containing mutual information (MI) methods; and a third containing methods designed explicitly to identify specificity determining positions (SDPs), namely integer-value Evolutionary Trace (ivET), SDPfox, and XDET. For CR prediction, using a proximity score that integrates structural information (the sum of the scores of residues located within a given distance of the residue in question), we find that only the methods from the first two groups perform reliably. Next, we investigated to what degree proximity scores for conservation, rvET and cumulative MI (cMI) provide complementary information capable of improving CR identification. Integrating conservation with proximity scores for rvET and cMI achieved the highest performance. The proximity conservation score contained no complementary information when integrated with proximity rvET. Moreover, the rvET signal provided only a limited gain in predictive performance when integrated with the MI and conservation proximity scores. Combined, these observations demonstrate that the rvET and cMI scores add complementary information to the prediction system.
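The proximity score described above (the sum of per-residue scores over all residues within a given distance of the residue in question) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `proximity_scores`, the use of one 3D coordinate per residue, the distance cutoff value, and the `include_self` switch are all assumptions made for illustration; the abstract does not state which atoms define the distance or whether the residue's own score is counted.

```python
import math


def proximity_scores(coords, scores, radius, include_self=True):
    """Return, for each residue, the sum of `scores` over residues within `radius`.

    coords: list of (x, y, z) tuples, one representative point per residue
            (assumption: e.g. a C-alpha or C-beta position).
    scores: per-residue values of some measure (conservation, rvET, cMI, ...).
    radius: distance cutoff in the same units as `coords` (hypothetical value).
    include_self: whether a residue's own score is added (assumption).
    """
    if len(coords) != len(scores):
        raise ValueError("coords and scores must have the same length")

    n = len(coords)
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j and not include_self:
                continue
            # Euclidean distance between the two representative points
            if math.dist(coords[i], coords[j]) <= radius:
                total += scores[j]
        result.append(total)
    return result


# Toy example: three residues on a line, cutoff of 5 distance units.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
scores = [1.0, 2.0, 4.0]
print(proximity_scores(coords, scores, radius=5.0))
# → [3.0, 3.0, 4.0]
```

The quadratic loop is kept for readability. For real structures a spatial index would be the usual choice, but that is outside what the abstract describes.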

CONCLUSIONS

This work contributes to the understanding of the different evolutionary signals and shows that the detection of catalytic residues can be improved by integrating structural and higher-order sequence evolutionary information with sequence conservation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c23/3515339/a1c9b344eea9/1471-2105-13-235-1.jpg
