Binkowski T Andrew, Adamian Larisa, Liang Jie
Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607-7052, USA.
J Mol Biol. 2003 Sep 12;332(2):505-26. doi: 10.1016/s0022-2836(03)00882-9.
We describe a novel approach for inferring functional relationships between proteins by detecting sequence and spatial patterns on protein surfaces. Well-formed concave surface regions in the form of pockets and voids are examined to identify similarity relationships that might be directly related to protein function. We first exhaustively identify and analytically measure all 910,379 surface pockets and interior voids on 12,177 protein structures from the Protein Data Bank. The similarity of the patterns of residues forming pockets and voids is then assessed in sequence, in spatial arrangement, and in orientational arrangement. Statistical significance, in the form of E-values and p-values, is then estimated for each of the three types of similarity measurement. Our method is fully automated, requires no human intervention, and can be used without input of query patterns. It does not assume any prior knowledge of the functional residues of a protein, and can detect similarity based on both small and large surface patterns. It also tolerates, to some extent, conformational flexibility of functional sites. We show with examples that this method can detect functional relationships with specificity for members of the same protein family and superfamily, as well as remotely related functional surfaces from proteins of different folds. We envision that this method can be used for discovering novel functional relationships among protein surfaces, for functional annotation of protein structures with unknown biological roles, and for further inquiry into the evolutionary origins of structural elements important for protein function.
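The abstract does not say how the E-values and p-values are computed. As a rough illustration only, the sketch below assumes an empirical null distribution of similarity scores (for example, from comparisons of randomly chosen surface patterns) and the common convention that an E-value is the p-value multiplied by the number of comparisons performed. The function names `empirical_p_value` and `e_value`, and the sample scores, are hypothetical and do not come from the paper.

```python
def empirical_p_value(observed, null_scores):
    """Estimate P(score >= observed) from a sample of null scores.

    Assumption: higher scores mean greater similarity. The +1 terms
    keep the estimate from ever being exactly zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


def e_value(p_value, n_comparisons):
    """Expected number of chance hits at this score level or better.

    Assumption: E = p * N, where N is the number of patterns the
    query was compared against.
    """
    return p_value * n_comparisons


# Hypothetical null scores from random pattern comparisons.
null_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p = empirical_p_value(9, null_scores)
e = e_value(p, 100)
```

Under these assumptions, a small p-value can still correspond to a large E-value when the query is compared against a very large collection of pockets and voids, which is why both quantities are useful to report.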