Predicting functionally important residues from sequence conservation.

Authors

Capra John A, Singh Mona

Affiliation

Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.

Publication

Bioinformatics. 2007 Aug 1;23(15):1875-82. doi: 10.1093/bioinformatics/btm270. Epub 2007 May 22.

Abstract

MOTIVATION

Not all residues in a protein are equally important. Some are essential for the proper structure and function of the protein, whereas others can be readily replaced. Conservation analysis is one of the most widely used methods for predicting these functionally important residues in protein sequences.

RESULTS

We introduce an information-theoretic approach for estimating sequence conservation based on Jensen-Shannon divergence. We also develop a general heuristic that considers the estimated conservation of sequentially neighboring sites. In large-scale testing, we demonstrate that our combined approach outperforms previous conservation-based measures in identifying functionally important residues; in particular, it is significantly better than the commonly used Shannon entropy measure. We find that considering conservation at sequential neighbors improves the performance of all methods tested. Our analysis also reveals that many existing methods that attempt to incorporate the relationships between amino acids do not lead to better identification of functionally important sites. Finally, we find that while conservation is highly predictive in identifying catalytic sites and residues near bound ligands, it is much less effective in identifying residues in protein-protein interfaces.
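
The two ideas named above, scoring a column by its Jensen-Shannon divergence from a background distribution and then blending each score with those of its sequence neighbors, can be illustrated with a minimal sketch. This is not the authors' released code: the uniform background, the pseudocount, the window size, the mixing weight, and all function names are assumptions chosen for illustration, since the abstract does not specify them. A Shannon-entropy score is included only for comparison.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Amino-acid frequencies for one alignment column (gaps ignored).

    The pseudocount value is an illustrative assumption.
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[a] + pseudocount) / total for a in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; zero-probability terms are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def shannon_conservation(p):
    """Entropy-based score for comparison: 1 minus normalized Shannon entropy."""
    h = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    return 1 - h / math.log2(len(p))


def score_alignment(alignment, background=None):
    """Score every column of an alignment (list of equal-length strings).

    A uniform background is assumed here purely for illustration.
    """
    if background is None:
        background = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return [js_divergence(column_distribution(col), background)
            for col in zip(*alignment)]


def smooth_with_neighbors(scores, window=3, weight=0.5):
    """Blend each score with the mean score of its sequence neighbors.

    `window` (positions on each side) and `weight` are hypothetical parameters.
    """
    smoothed = []
    for i, s in enumerate(scores):
        lo, hi = max(0, i - window), min(len(scores), i + window + 1)
        neighbors = [scores[j] for j in range(lo, hi) if j != i]
        if neighbors:
            mean = sum(neighbors) / len(neighbors)
            smoothed.append(weight * s + (1 - weight) * mean)
        else:
            smoothed.append(s)
    return smoothed


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "ACF"]  # made-up toy input
    raw = score_alignment(toy_alignment)
    print(raw)
    print(smooth_with_neighbors(raw, window=1))
```

In this toy alignment the first two columns are fully conserved and the third is variable, so the first two receive higher divergence scores. Using a background other than uniform (for example, empirical amino-acid frequencies) changes only the `background` argument.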

AVAILABILITY

Data sets and code for all conservation measures evaluated are available at http://compbio.cs.princeton.edu/conservation/
