Good Andrew C, Oprea Tudor I
J Comput Aided Mol Des. 2008 Mar-Apr;22(3-4):169-78. doi: 10.1007/s10822-007-9167-2. Epub 2008 Jan 9.
Over the last few years many articles have been published in an attempt to provide performance benchmarks for virtual screening tools. While this research has imparted useful insights, the myriad variables controlling these studies place significant limits on the interpretability of their results. Here we investigate the effects of these variables, including analysis of calculation setup variation, the effect of target choice, active/decoy set selection (with particular emphasis on the effect of analogue bias), and enrichment data interpretation. In addition, the optimization of the publicly available DUD benchmark sets through analogue bias removal is discussed, as is their augmentation through the addition of large, diverse data sets collated using WOMBAT.
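The enrichment data interpretation mentioned in the abstract typically centers on metrics such as the enrichment factor (EF), which measures how many more actives a screening tool retrieves in the top fraction of a ranked database than random selection would. The sketch below is illustrative only; the function name and toy data are assumptions, not taken from the paper.

```python
# Minimal sketch of an enrichment-factor calculation, one common metric
# in virtual screening benchmarks. The toy ranking below is illustrative.

def enrichment_factor(ranked_labels, fraction):
    """EF at a given fraction of the ranked database.

    ranked_labels: list of 1 (active) / 0 (decoy), best-scored first.
    fraction: top fraction of the database to examine, e.g. 0.01 for EF1%.
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_top = max(1, int(round(n_total * fraction)))
    actives_in_top = sum(ranked_labels[:n_top])
    # EF = hit rate in the top slice / hit rate expected at random
    return (actives_in_top / n_top) / (n_actives / n_total)

# Toy ranked list: 3 actives among 10 compounds, two scored near the top.
ranking = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(enrichment_factor(ranking, 0.2))  # EF over the top 2 compounds
```

Note that EF at small fractions is exactly where analogue bias inflates apparent performance: a cluster of near-identical actives retrieved together at the top of the list yields a high EF without demonstrating any ability to find diverse chemotypes.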