Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, Paris, 75005, France.
Laboratoire de Biologie Computationnelle et Quantitative, CNRS UMR7238, UPMC Univ-Paris 6, Sorbonne Université, 4 place Jussieu, Paris, 75005, France.
Proteins. 2018 Jul;86(7):723-737. doi: 10.1002/prot.25506. Epub 2018 Apr 24.
Protein-protein interactions control a wide range of biological processes, and identifying them is essential to understanding the underlying biological mechanisms. To complement experimental approaches, in silico methods are available to investigate protein-protein interactions. Cross-docking methods, in particular, can be used to predict protein binding sites. However, proteins can interact with numerous partners and can present multiple binding sites on their surface, which may affect binding site prediction quality. We evaluate the binding site predictions obtained from complete cross-docking simulations of 358 proteins, using 2 different scoring schemes that account for multiple binding sites. Despite good overall binding site prediction performance, 68 cases were still associated with very low prediction quality, with individual area under the specificity-sensitivity ROC curve (AUC) values below the random AUC threshold of 0.5. This occurs because cross-docking calculations can identify alternate protein binding sites that differ from the reference experimental sites. For the large majority of these proteins, we show that the predicted alternate binding sites correspond to interaction sites with hidden partners, that is, partners not included in the original cross-docking dataset. These new partners include proteins, but also nucleic acid molecules. Finally, for proteins with multiple binding sites on their surface, we investigated the structural determinants associated with the binding sites most frequently targeted by the docking partners.
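To illustrate how a prediction can fall below the random AUC threshold of 0.5 against the reference site and still be informative, here is a minimal Python sketch. It is not the authors' pipeline. The function name `roc_auc`, the per-residue scores, and the two label sets are hypothetical toy values chosen only for illustration. AUC is computed as the probability that a randomly chosen interface residue receives a higher score than a randomly chosen non-interface residue.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: per-residue predicted binding propensities (higher = more likely interface).
    labels: 1 if the residue belongs to the binding site being evaluated, else 0.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one interface and one non-interface residue")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy protein with four surface residues (assumed values).
scores = [0.9, 0.8, 0.2, 0.1]      # docking-derived propensities
reference_site = [0, 0, 1, 1]      # residues in the reference experimental site
alternate_site = [1, 1, 0, 0]      # residues contacting a hypothetical hidden partner

auc_reference = roc_auc(scores, reference_site)
auc_alternate = roc_auc(scores, alternate_site)
print(auc_reference, auc_alternate)
```

In this toy case the same scores rank the reference site worst and the alternate site best, which mirrors the situation described in the abstract: a low AUC against the reference site can mean the docking partners target a different, real interface.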