Bioinformatics Centre, Indian Institute of Science, Bangalore, Karnataka, India.
J Comput Chem. 2011 Apr 15;32(5):787-96. doi: 10.1002/jcc.21657. Epub 2010 Oct 12.
A successful protein-protein docking study culminates in the identification of decoys with near-native quaternary structures at the top ranks. However, this task remains difficult because no generalized scoring function exists that effectively ranks decoys according to their similarity to near-native quaternary structures. Difficulties arise from the highly irregular nature of the protein surface and from the significant variation of the nonbonding and solvation energies with the chemical composition of the protein-protein interface. In this work, we describe a novel method that combines an interface-size filter, a regression model for geometric compatibility (based on two correlated surface and packing parameters), and a normalized interaction energy (calculated from correlated nonbonded and solvation energies) to effectively rank decoys from a set of 10,000 decoys. Tests on 30 unbound binary protein-protein complexes show that in 16 cases we can identify at least one decoy in the top three ranks with a backbone root mean square deviation of ≤10 Å from the true binding geometry. Comparisons with other state-of-the-art methods confirm the improved ranking power of our method without the use of any experiment-guided restraints, evolutionary information, statistical propensities, or modified interaction energy equations. Tests on 118 less-difficult bound binary protein-protein complexes with ≤35% sequence redundancy at the interface showed that in 77% of cases, at least one of the 10,000 decoys with a backbone root mean square deviation of ≤5 Å from the true geometry was identified at the first rank. The work will promote the use of new concepts in which correlations among parameters provide more robust scoring models. It will facilitate studies involving molecular interactions, including modeling of large macromolecular assemblies and protein structure prediction.
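The three-stage ranking described above (interface-size filter, geometric compatibility score, normalized interaction energy) can be illustrated with a minimal Python sketch. The abstract does not give the filter threshold, the regression model, the normalization, or how the terms are combined, so everything specific here is an assumption made for illustration: the `Decoy` fields, the `min_interface_area` cutoff, dividing the energy by interface area, and summing z-scores are hypothetical choices, not the authors' published formulation. The `has_hit` helper mirrors the success criterion used in the tests (at least one decoy within an RMSD cutoff among the top N ranks).

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Sequence


@dataclass
class Decoy:
    """One docking decoy; all fields are hypothetical, precomputed inputs."""
    name: str
    interface_area: float      # interface size, e.g. buried area in Å^2
    geometric_score: float     # output of a geometric model; higher = better (assumed)
    interaction_energy: float  # nonbonded + solvation energy; lower = better
    rmsd: float                # backbone RMSD to the native complex, in Å


def zscores(values: Sequence[float]) -> List[float]:
    """Standardize values; return zeros if they are all identical."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def rank_decoys(decoys: Sequence[Decoy], min_interface_area: float) -> List[Decoy]:
    """Filter by interface size, then rank by a combined score (best first)."""
    if min_interface_area <= 0:
        raise ValueError("min_interface_area must be positive")
    # Stage 1: interface-size filter (the cutoff value is an assumption).
    kept = [d for d in decoys if d.interface_area >= min_interface_area]
    if not kept:
        return []
    # Stage 2: geometric compatibility, standardized across surviving decoys.
    geo_z = zscores([d.geometric_score for d in kept])
    # Stage 3: energy normalized by interface area (assumed normalization).
    energy_z = zscores([d.interaction_energy / d.interface_area for d in kept])
    # Assumed combination: high geometric score and low energy are both good.
    combined = [g - e for g, e in zip(geo_z, energy_z)]
    order = sorted(range(len(kept)), key=lambda i: combined[i], reverse=True)
    return [kept[i] for i in order]


def has_hit(ranked: Sequence[Decoy], top_n: int, rmsd_cutoff: float) -> bool:
    """True if any of the top_n ranked decoys lies within rmsd_cutoff of native."""
    return any(d.rmsd <= rmsd_cutoff for d in ranked[:top_n])
```

With a ranked list in hand, `has_hit(ranked, top_n=3, rmsd_cutoff=10.0)` corresponds to the unbound-complex criterion quoted in the abstract, and `has_hit(ranked, top_n=1, rmsd_cutoff=5.0)` to the bound-complex criterion. Standardizing both terms before combining them is only one way to put a score and an energy on a common scale.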