Bradford James R, Needham Chris J, Bulpitt Andrew J, Westhead David R
Institute of Molecular and Cellular Biology, University of Leeds, Leeds, LS2 9JT, UK.
J Mol Biol. 2006 Sep 15;362(2):365-86. doi: 10.1016/j.jmb.2006.07.028. Epub 2006 Jul 21.
Identifying the interface between two interacting proteins provides important clues to the function of a protein, and is becoming increasingly relevant to drug discovery. Here, surface patch analysis was combined with a Bayesian network to predict protein-protein binding sites with a success rate of 82% on a benchmark dataset of 180 proteins, improving by 6% on previous work and well above the 36% that would be achieved by a random method. A comparable success rate was achieved even when evolutionary information was missing, a further improvement on our previous method, which was unable to handle incomplete data automatically. In a case study of the Mog1p family, we showed that our Bayesian network method can aid the prediction of previously uncharacterised binding sites and provide important clues to protein function. On Mog1p itself a putative binding site involved in the SLN1-SKN7 signal transduction pathway was detected, as was a Ran binding site, previously characterised solely by conservation studies, even though our automated method operated without using homologous proteins. On the remaining members of the family (two structural genomics targets, and a protein involved in the photosystem II complex in higher plants) we identified novel binding sites with little correspondence to those on Mog1p. These results suggest that members of the Mog1p family bind to different proteins and probably have different functions despite sharing the same overall fold. We also demonstrated the applicability of our method to drug discovery efforts by successfully locating a number of binding sites involved in the protein-protein interaction network of papilloma virus infection. In a separate study, we attempted to distinguish between the two types of binding site, obligate and non-obligate, within our dataset using a second Bayesian network. This proved difficult, although some separation was achieved on the basis of patch size, electrostatic potential and conservation. Such was the similarity between the two interacting patch types that we were able to use obligate binding site properties to predict the location of non-obligate binding sites and vice versa.
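The abstract states that the Bayesian network copes with missing evolutionary information but does not give the network itself. The following is only a minimal sketch of the general idea, assuming the simplest possible network structure (a naive Bayes model, in which each patch property depends only on the class). The feature names, the discretisation into "high" and "low", and every probability value are hypothetical and are not taken from the paper. In such a model, an unobserved property is marginalised out simply by leaving its term out of the product.

```python
from math import prod

# Hypothetical prior and conditional probability tables P(value | class).
# All numbers are invented for illustration; none come from the paper.
PRIOR = {"interface": 0.3, "non_interface": 0.7}
CPT = {
    "conservation": {
        "interface": {"high": 0.7, "low": 0.3},
        "non_interface": {"high": 0.4, "low": 0.6},
    },
    "hydrophobicity": {
        "interface": {"high": 0.6, "low": 0.4},
        "non_interface": {"high": 0.3, "low": 0.7},
    },
}


def posterior_interface(patch):
    """Return P(interface | observed patch properties).

    `patch` maps feature names to "high", "low", or None (not observed).
    Unobserved features are skipped, which is equivalent to summing
    over their possible values in a naive Bayes model.
    """
    scores = {}
    for cls, prior in PRIOR.items():
        likelihood = prod(
            CPT[feature][cls][value]
            for feature, value in patch.items()
            if value is not None
        )
        scores[cls] = prior * likelihood
    return scores["interface"] / sum(scores.values())


# Conservation is unavailable (e.g. no homologues), yet a score is still produced.
print(posterior_interface({"conservation": None, "hydrophobicity": "high"}))
print(posterior_interface({"conservation": "high", "hydrophobicity": "high"}))
```

With these made-up tables, a patch with no observed properties falls back to the prior, and each observed property shifts the posterior. A network with dependencies between properties would need proper inference rather than a simple product.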