Department of Biological Sciences, College of Science, Purdue University, West Lafayette, Indiana 47907, USA.
Proteins. 2012 Jan;80(1):126-41. doi: 10.1002/prot.23169. Epub 2011 Oct 12.
Protein-protein binding events mediate many critical biological functions in the cell. Typically, functionally important sites in proteins can be identified well by considering sequence conservation. However, protein-protein interaction sites exhibit higher sequence variation than other functional regions, such as the catalytic sites of enzymes. This weak sequence conservation poses a significant challenge to protein-protein interaction site prediction. Here, we present a phylogenetic framework to capture critical sequence variations that favor the selection of residues essential for protein-protein binding. Through a comprehensive analysis of diverse protein families, we show that protein binding interfaces exhibit amino acid substitution patterns distinct from those of other surface residues. On the basis of this analysis, we have developed a novel method, BindML, which uses these substitution models to predict protein-protein binding sites of proteins with unknown interacting partners. BindML estimates the likelihood that a phylogenetic tree of a local surface region in a query protein structure follows the substitution patterns of protein binding interfaces versus those of nonbinding surfaces. BindML performs well compared with alternative methods for protein binding interface prediction. The methodology developed in this study is versatile in that it can be applied generally to predict other types of functional sites in proteins, such as DNA-, RNA-, and membrane-binding sites.