Geng Cunliang, Jung Yong, Renaud Nicolas, Honavar Vasant, Bonvin Alexandre M J J, Xue Li C
Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands.
Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA.
Bioinformatics. 2020 Jan 1;36(1):112-121. doi: 10.1093/bioinformatics/btz496.
Protein complexes play critical roles in many aspects of biological function. Three-dimensional (3D) structures of protein complexes are essential for gaining insight into the structural basis of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determination of 3D protein complex structures, computational docking has evolved into a valuable tool for predicting 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge.
Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein-protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful application of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes.
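The idea of combining a graph-based interface score with energy terms to rank docked models can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the iScore code: the `DockedModel` fields, the min-max normalization, the weights in `HYPOTHETICAL_WEIGHTS`, and the convention that lower values are better. The field `graph_score` stands in for the graph- and conservation-based score, which is treated as already computed.

```python
from dataclasses import dataclass


@dataclass
class DockedModel:
    """One candidate conformation with precomputed terms (all hypothetical)."""
    name: str
    graph_score: float  # stand-in for the graph/conservation-based score
    e_vdw: float        # van der Waals energy
    e_elec: float       # electrostatic energy
    e_desolv: float     # desolvation energy


# Illustrative weights only; these are NOT the published iScore weights.
HYPOTHETICAL_WEIGHTS = {
    "graph_score": 1.0,
    "e_vdw": 0.5,
    "e_elec": 0.2,
    "e_desolv": 0.5,
}


def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_models(models, weights=HYPOTHETICAL_WEIGHTS):
    """Return (name, combined_score) pairs sorted best first.

    Each term is normalized across the model set so that terms on
    different scales are comparable, then combined as a weighted sum.
    Lower combined scores are assumed to indicate more native-like models.
    """
    normalized = {
        term: min_max_normalize([getattr(m, term) for m in models])
        for term in weights
    }
    combined = [
        sum(weights[term] * normalized[term][i] for term in weights)
        for i in range(len(models))
    ]
    order = sorted(range(len(models)), key=lambda i: combined[i])
    return [(models[i].name, combined[i]) for i in order]


if __name__ == "__main__":
    candidates = [
        DockedModel("model_C", 1.0, -10.0, -50.0, 5.0),
        DockedModel("model_A", -1.0, -50.0, -300.0, -10.0),
        DockedModel("model_B", 0.0, -30.0, -175.0, -2.5),
    ]
    for name, score in rank_models(candidates):
        print(f"{name}\t{score:.3f}")
```

Normalizing each term across the candidate set before weighting keeps a large-magnitude term, such as the electrostatic energy, from dominating the sum purely because of its scale.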
The iScore code is freely available from GitHub: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid: https://data.sbgrid.org/dataset/684.
Supplementary data are available at Bioinformatics online.