Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS-Université de Strasbourg, 67400 Illkirch, France.
J Chem Inf Model. 2015 Sep 28;55(9):2005-14. doi: 10.1021/acs.jcim.5b00190. Epub 2015 Sep 14.
Protein-protein interactions are becoming a major focus of academic and pharmaceutical research to identify low molecular weight compounds able to modulate oligomeric signaling complexes. As the number of protein complexes of known three-dimensional structure is constantly increasing, there is a need to discard biologically irrelevant interfaces and prioritize those of high value for potential druggability assessment. A Random Forest model has been trained on a set of 300 protein-protein interfaces using 45 molecular interaction descriptors as input. It is able to predict the nature of external test interfaces (crystallographic vs biological) with accuracy at least equal to that of the best state-of-the-art methods. However, our method presents unique advantages in the early prioritization of potentially ligandable protein-protein interfaces: (i) it is equally robust in predicting either crystallographic or biological contacts and (ii) it can be applied to a wide array of oligomeric complexes ranging from small-sized biological interfaces to large crystallographic contacts.
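Claim (i), that the model is equally robust on crystallographic and biological contacts, is the kind of statement usually checked by comparing per-class recall rather than overall accuracy alone. The following is a minimal sketch of that check, not the authors' evaluation code: the function names, the labels "bio" and "xtal", and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return, for each true label, the fraction of its members predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls; insensitive to class imbalance."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy labels (NOT data from the paper):
# "bio" = biological interface, "xtal" = crystallographic contact.
y_true = ["bio", "bio", "bio", "bio", "xtal", "xtal", "xtal", "xtal"]
y_pred = ["bio", "bio", "bio", "xtal", "xtal", "xtal", "xtal", "bio"]

recalls = per_class_recall(y_true, y_pred)
print(recalls)
print(balanced_accuracy(y_true, y_pred))
```

A classifier that is "equally robust" in this sense shows similar recall for both classes. A model that labels nearly everything as biological could still reach high overall accuracy on an imbalanced test set while having near-zero recall on crystallographic contacts.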