Department of Computer Science, Modern Sciences and Arts University, Giza, Egypt.
Central Lab for Agricultural Experts Systems, Ministry of Agriculture and Land Reclamation, Giza, Egypt.
Chem Biol Drug Des. 2018 Aug;92(2):1429-1434. doi: 10.1111/cbdd.13206. Epub 2018 Apr 27.
Despite recent efforts to improve scoring functions, accurately predicting binding affinity remains a challenging task. Two approaches were therefore tried to improve the prediction performance of four scoring functions (x-score, vina, autodock, and rf-score): replacing the linear regression model of each classical scoring function with a random forest, to examine how much performance improves when an additive functional form is not imposed, and combining different scoring functions into hybrid ones. The datasets were derived from the PDBbind-CN database, version 2016. When the original scoring functions were evaluated on the generic dataset, rf-score outperformed the classical scoring functions, which shows the superiority of descriptor-based scoring functions. Replacing linear regression (a linear model) with random forest (a nonlinear model) largely improved the scoring performance of autodock and vina, whereas x-score showed only a slight increase. Each hybrid scoring function improved only slightly, if at all, on the two scoring functions it combined, which does not justify the slower calculation time.
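The substitution described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' pipeline: the data are synthetic, the arrays `terms_a` and `terms_b` are hypothetical stand-ins for the energy terms of two scoring functions, and the toy affinity is deliberately built with an interaction between two terms so that an additive linear model cannot capture it. The "hybrid" here simply concatenates the two term sets before fitting a random forest, which is one plausible reading of combining scoring functions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-complex energy terms (NOT real PDBbind data).
n_complexes = 600
terms_a = rng.uniform(-1.0, 1.0, size=(n_complexes, 3))  # hypothetical terms of function A
terms_b = rng.uniform(-1.0, 1.0, size=(n_complexes, 2))  # hypothetical terms of function B

# Toy affinity: the product of two terms makes the target non-additive.
affinity = (
    6.0
    + 2.0 * terms_a[:, 0] * terms_a[:, 1]
    + terms_a[:, 2]
    + rng.normal(0.0, 0.1, n_complexes)
)


def held_out_rmse(model, features, target):
    """Fit on 75% of the complexes and return RMSE on the remaining 25%."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, target, test_size=0.25, random_state=0
    )
    model.fit(x_train, y_train)
    return float(np.sqrt(mean_squared_error(y_test, model.predict(x_test))))


# Classical form: a weighted sum of the terms (additive, linear).
lin_rmse = held_out_rmse(LinearRegression(), terms_a, affinity)

# Same terms, but the regression model is swapped for a random forest.
rf_rmse = held_out_rmse(
    RandomForestRegressor(n_estimators=200, random_state=0), terms_a, affinity
)

# Hybrid sketch: concatenate the terms of both functions, then fit a forest.
hybrid_features = np.hstack([terms_a, terms_b])
hybrid_rmse = held_out_rmse(
    RandomForestRegressor(n_estimators=200, random_state=0), hybrid_features, affinity
)

print(f"linear RMSE: {lin_rmse:.3f}")
print(f"random forest RMSE: {rf_rmse:.3f}")
print(f"hybrid random forest RMSE: {hybrid_rmse:.3f}")
```

On this toy target the forest should reach a lower held-out error than the linear fit, because the interaction term is invisible to a weighted sum. The extra terms in the hybrid are uninformative by construction, so the sketch says nothing about whether real hybrid scoring functions help.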