Gold Nicola D, Jackson Richard M
Institute of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK.
J Mol Biol. 2006 Feb 3;355(5):1112-24. doi: 10.1016/j.jmb.2005.11.044. Epub 2005 Dec 1.
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure, independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structural and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition, including structure-based ligand design and understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.