Kirchmair Johannes, Markt Patrick, Distinto Simona, Wolber Gerhard, Langer Thierry
Inte:Ligand Software-Entwicklungs- und Consulting GmbH, Clemens Maria Hofbauer-Gasse 6, 2344, Maria Enzersdorf, Austria.
J Comput Aided Mol Des. 2008 Mar-Apr;22(3-4):213-28. doi: 10.1007/s10822-007-9163-6. Epub 2008 Jan 15.
Within the last few years, a considerable number of evaluative studies have been published that investigate the performance of 3D virtual screening approaches. Assessments of protein-ligand docking in particular have attracted remarkable interest in the scientific community. However, comparing virtual screening approaches is a non-trivial task. Several publications, especially in the field of molecular docking, suffer from shortcomings that are likely to affect the significance of the results considerably. These quality issues often arise from poor study design, from bias, from the use of improper or inexpressive enrichment descriptors, and from errors in interpreting the output data. In this review, we analyze recent literature evaluating 3D virtual screening methods, with a focus on molecular docking. We highlight problematic issues and provide guidelines on how to improve the quality of computational studies. Since 3D virtual screening protocols are generally assessed by their ability to discriminate between active and inactive compounds, we summarize the impact of the composition and preparation of test sets on the outcome of evaluations. Moreover, we investigate the significance of both classic enrichment parameters and advanced descriptors for assessing the performance of 3D virtual screening methods. Finally, we review the significance and suitability of RMSD as a measure of the accuracy of protein-ligand docking algorithms and of conformational space subsampling algorithms.
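The abstract refers to classic enrichment parameters, to more advanced descriptors, and to RMSD without defining them. As a rough illustration only, the sketch below computes three commonly used quantities: the enrichment factor at a chosen top fraction of a ranked list, the ROC AUC of that ranking, and the RMSD between two poses. The function names and the toy data are hypothetical and are not taken from the review. The sketch assumes that each compound is labeled 1 (active) or 0 (decoy), that the list is already sorted from best to worst score without ties, that it contains at least one active and one decoy, and that the two poses list the same atoms in the same order with no superposition or symmetry correction applied.

```python
import math


def enrichment_factor(labels_ranked, fraction):
    """Enrichment factor at the top `fraction` of a ranked list.

    labels_ranked: 1 for active, 0 for decoy, best-scored compound first.
    """
    n_total = len(labels_ranked)
    n_actives = sum(labels_ranked)
    # Number of compounds in the selected top slice (at least one).
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(labels_ranked[:n_top])
    # Hit rate in the top slice divided by the hit rate of the whole set.
    return (hits_top / n_top) / (n_actives / n_total)


def roc_auc(labels_ranked):
    """ROC AUC: probability that a random active outranks a random decoy."""
    n_actives = sum(labels_ranked)
    n_decoys = len(labels_ranked) - n_actives
    decoys_seen = 0
    misordered_pairs = 0  # (decoy, active) pairs with the decoy ranked higher
    for label in labels_ranked:
        if label:
            misordered_pairs += decoys_seen
        else:
            decoys_seen += 1
    return 1.0 - misordered_pairs / (n_actives * n_decoys)


def rmsd(coords_a, coords_b):
    """Plain RMSD between two poses given as lists of (x, y, z) tuples."""
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy ranking (hypothetical data): 3 actives among 10 compounds.
ranked = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_20 = enrichment_factor(ranked, 0.2)
auc = roc_auc(ranked)

# Toy poses (hypothetical data): every atom shifted by 1 unit along z.
pose_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
pose_rmsd = rmsd(pose_ref, pose_docked)
```

Note that the enrichment factor depends on the chosen cutoff and on the ratio of actives to decoys in the test set, whereas the ROC AUC summarizes the whole ranking. This is one reason why values obtained on differently composed test sets are hard to compare directly.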