Ericksen Spencer S, Wu Haozhen, Zhang Huikun, Michael Lauren A, Newton Michael A, Hoffmann F Michael, Wildman Scott A
Center for High Throughput Computing, Department of Computer Sciences, University of Wisconsin-Madison, 1210 W. Dayton St., Madison, Wisconsin 53706, United States.
J Chem Inf Model. 2017 Jul 24;57(7):1579-1590. doi: 10.1021/acs.jcim.7b00153. Epub 2017 Jul 12.
In structure-based virtual screening, ranking compounds by a consensus of scores from several docking programs or scoring functions, rather than by scores from a single program, provides better predictive performance and reduces the variability in performance across targets. Here we compare traditional consensus scoring methods with a novel, unsupervised gradient boosting approach. We also observed increased score variation among active ligands and developed a statistical mixture model consensus score based on combining score means and variances. To evaluate performance, we used the common performance metrics ROCAUC and EF1 on 21 benchmark targets from DUD-E. Traditional consensus methods, such as taking the mean of quantile normalized docking scores, outperformed individual docking methods and were more robust to target variation. The mixture model and gradient boosting provided further improvements over the traditional consensus methods. These methods are readily applicable to new targets in academic research and overcome the potentially poor performance of using a single docking method on a new target.
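To make the traditional consensus baseline and the two metrics concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a score matrix with one row per compound and one column per docking program, and that lower scores indicate better predicted binding. The function names are hypothetical, and the enrichment factor uses one common definition (active rate in the top-ranked fraction divided by the active rate in the whole library); the paper's exact cutoff handling may differ.

```python
import numpy as np


def quantile_normalize(scores):
    """Give every column (docking program) the same score distribution."""
    scores = np.asarray(scores, dtype=float)
    # Rank of each compound within its column (0 = lowest score).
    order = np.argsort(scores, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the k-th smallest score across columns.
    reference = np.sort(scores, axis=0).mean(axis=1)
    return reference[ranks]


def mean_consensus(scores):
    """Consensus score: mean of quantile-normalized scores per compound."""
    return quantile_normalize(scores).mean(axis=1)


def roc_auc(consensus, is_active):
    """Probability that a random active outranks a random decoy.

    Assumes lower consensus = better; ties count as one half.
    """
    consensus = np.asarray(consensus, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    diff = consensus[is_active][:, None] - consensus[~is_active][None, :]
    return ((diff < 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def enrichment_factor(consensus, is_active, fraction=0.01):
    """Active rate in the top-ranked fraction relative to the overall rate."""
    consensus = np.asarray(consensus, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(np.ceil(fraction * len(consensus))))
    top = np.argsort(consensus, kind="stable")[:n_top]
    return is_active[top].mean() / is_active.mean()


if __name__ == "__main__":
    # Toy data (made up for illustration): 4 compounds, 2 docking programs
    # whose scores live on very different scales.
    scores = [[-9.0, -50.0], [-7.0, -30.0], [-8.0, -40.0], [-5.0, -10.0]]
    is_active = [True, False, True, False]
    consensus = mean_consensus(scores)
    print(consensus)
    print(roc_auc(consensus, is_active))
    print(enrichment_factor(consensus, is_active, fraction=0.01))
```

Quantile normalization is what lets the programs be averaged at all: without it, the program with the widest score range would dominate the mean. Ties within a column are broken by input order here, which is a simplification.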