Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China.
J Chem Inf Model. 2019 Jun 24;59(6):3080-3090. doi: 10.1021/acs.jcim.9b00057. Epub 2019 May 13.
An accurate energy scoring function is crucial for protein structure prediction. Given the increasing number of experimentally determined structures, knowledge-based approaches have been widely used over the past three decades to develop scoring functions for protein structure prediction. However, current scoring functions often consider only nonbonded interactions and neglect bonded potentials, such as those for covalent bonds and angles, for the sake of speed and simplicity. Although such scoring functions may succeed on fully relaxed conformations, they have difficulty ranking decoys with distorted bonds or angles, especially when used for conformational sampling in structure prediction. As a result, such a scoring function may perform well on one or several decoy sets but often has limited accuracy on large, diverse sets. To address this limitation, we have developed a composite knowledge-based scoring function, named ITCPS, that integrates bonded and nonbonded potentials as well as orientation-dependent and hydrophobic interactions. ITCPS was extensively evaluated on 18 decoy sets comprising 927 proteins (three 3DRobot sets, the AMBER benchmarking set, HR, CASP5-8, CASP9-13, eight Decoy 'R' Us sets, MOULDER, ROSETTA, and the I-TASSER set) and compared with 51 other scoring functions. Overall, ITCPS performed the best among the 52 scoring functions and performed well on all the test sets. Of the 927 proteins, ITCPS recognized the native structure for 842, giving a success rate of 90.8% and an average Z-score of 3.36. Moreover, ITCPS showed a strong ability to identify the best near-native structure among decoys, performing significantly better than the other tested scoring functions. The present model is expected to be beneficial for the development of scoring functions for other interactions.
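The two headline metrics, native-recognition success rate and average Z-score, can be illustrated with a minimal sketch. The abstract does not define either metric, so this sketch assumes the conventions common in decoy-discrimination studies: lower energy is better, the native is "recognized" when it scores strictly lower than every decoy, and the Z-score is the decoy mean minus the native energy, divided by the decoy standard deviation. The function names and the toy energies are hypothetical and are not taken from ITCPS.

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native relative to its decoys (assumed convention).

    Assumes lower energy is better, so a larger positive value means the
    native is separated more clearly from the decoy distribution.
    """
    sigma = pstdev(decoy_energies)  # population standard deviation of decoys
    if sigma == 0:
        raise ValueError("decoy energies have zero spread")
    return (mean(decoy_energies) - native_energy) / sigma


def native_is_recognized(native_energy, decoy_energies):
    """True when the native scores strictly lower than every decoy."""
    return native_energy < min(decoy_energies)


def summarize(targets):
    """Return (success_rate, average_z) over (native, decoys) pairs."""
    hits = sum(native_is_recognized(n, d) for n, d in targets)
    avg_z = mean(native_z_score(n, d) for n, d in targets)
    return hits / len(targets), avg_z


# Hypothetical toy energies for two targets (not data from the paper).
toy_targets = [
    (-10.0, [-4.0, -6.0, -8.0]),  # native ranks first
    (-5.0, [-6.0, -2.0, -4.0]),   # one decoy scores lower than the native
]
success_rate, average_z = summarize(toy_targets)

# The success rate reported in the abstract follows from its own counts.
reported_rate_percent = round(100 * 842 / 927, 1)
```

Under these assumptions the success rate is a per-target count, while the Z-score measures how far the native sits from the decoy distribution, which is why the two are reported together.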