Tiikkainen Pekka, Markt Patrick, Wolber Gerhard, Kirchmair Johannes, Distinto Simona, Poso Antti, Kallioniemi Olli
University of Turku and VTT Medical Biotechnology, Itäinen Pitkäkatu 4 C, FI-20521 Turku, Finland.
J Chem Inf Model. 2009 Oct;49(10):2168-78. doi: 10.1021/ci900249b.
In this work, we measure the performance of seven ligand-based virtual screening tools (five similarity search methods and two pharmacophore elucidators) against the MUV data set. For the similarity search tools, single active molecules as well as sets of active compounds clustered by chemical diversity were used as templates. Each template was scored against all active and inactive compounds in its target class. The scores were then used to calculate performance metrics, including enrichment factors and AUC values. We also studied the effect of data fusion on the results. To measure the performance of the pharmacophore tools, a set of active molecules was selected from each target class, either at random or on the basis of chemical diversity, to build a pharmacophore model, which was then used to screen the remaining compounds in the set. Our results indicate that template sets selected by chemical diversity are the best choice for similarity search tools, whereas the optimal training sets for pharmacophore elucidators are based on random selection, underscoring that pharmacophore modeling cannot be easily automated. We also suggest a number of improvements for future benchmark sets and discuss activity cliffs as a potential problem in ligand-based virtual screening.
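The evaluation steps named in the abstract (scoring compounds against templates, fusing scores across templates, then computing enrichment factors and AUC values) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the choice of maximum-score fusion are assumptions made for illustration, and the abstract does not state which fusion rule or enrichment cutoff was used.

```python
from typing import List, Sequence


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.01) -> float:
    """Enrichment factor in the top `fraction` of the score-ranked list.

    labels: 1 for an active compound, 0 for an inactive one.
    """
    n = len(scores)
    n_actives = sum(labels)
    if n == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n * fraction))
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    # Hit rate in the selected top slice divided by the overall hit rate.
    return (hits / n_top) / (n_actives / n)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a random active outscores a random
    inactive, with ties counted as one half."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    inactives = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def max_fusion(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-template score lists by keeping each compound's best score.

    Assumption: maximum-score fusion is one common rule, chosen here only
    as an example.
    """
    return [max(column) for column in zip(*score_lists)]


# Hypothetical similarity scores of five compounds against two templates.
template_a = [0.9, 0.2, 0.4, 0.1, 0.3]
template_b = [0.3, 0.8, 0.5, 0.2, 0.1]
labels = [1, 1, 0, 0, 0]  # the first two compounds are the actives

fused = max_fusion([template_a, template_b])
print("AUC, template A only:", roc_auc(template_a, labels))
print("AUC, fused scores:", roc_auc(fused, labels))
print("EF at top 40%, fused scores:", enrichment_factor(fused, labels, 0.4))
```

In this toy case each template ranks only one of the two actives highly, so fusing the scores lifts both actives to the top of the list. The pairwise AUC loop is quadratic in the number of compounds and is written for clarity; a rank-based formula or an established library routine would be preferable for full benchmark sets.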