INRIA Nancy Grand Est, LORIA, 54506, Vandoeuvre-lès-Nancy, France.
J Chem Inf Model. 2010 Dec 27;50(12):2079-93. doi: 10.1021/ci100263p. Epub 2010 Nov 23.
In recent years, many virtual screening (VS) tools have been developed that employ different molecular representations and have different speed and accuracy characteristics. In this paper, we compare ten popular ligand-based VS tools using the publicly available Directory of Useful Decoys (DUD) data set comprising over 100 000 compounds distributed across 40 protein targets. The DUD was developed initially to evaluate docking algorithms, but our results from an operational correlation analysis show that it is also well suited for comparing ligand-based VS tools. Although it is conventional wisdom that 3D molecular shape is an important determinant of biological activity, our results based on permutational significance tests of several commonly used VS metrics show that the 2D fingerprint-based methods generally give better VS performance than the 3D shape-based approaches for surprisingly many of the DUD targets. To help understand this finding, we have analyzed the nature of the scoring functions used and the composition of the DUD data set itself. We propose that to improve the VS performance of current 3D methods, it will be necessary to devise screening queries that can represent multiple possible conformations and which can exploit knowledge of known actives that span multiple scaffold families.
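The abstract does not spell out how the permutational significance tests were run, so the following is only a minimal sketch of one common way to compare two VS methods on a per-target metric. It assumes a paired sign-flip permutation test on per-target ROC AUC values; the function names `roc_auc` and `paired_permutation_test`, the choice of AUC as the metric, and the sign-flip scheme are illustrative assumptions, not the authors' procedure.

```python
import random


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, decoy) pairs ranked correctly.

    labels: 1 for a known active, 0 for a decoy. Ties count as half a win.
    This is the pairwise (Mann-Whitney) form, O(n_actives * n_decoys).
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def paired_permutation_test(metric_a, metric_b, n_permutations=10000, seed=0):
    """Two-sided paired permutation test on per-target metric values.

    Assumption (not from the paper): under the null hypothesis the two
    methods are exchangeable on each target, so the sign of each
    per-target difference can be flipped at random.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(metric_a, metric_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_permutations):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical per-target AUCs for a 2D and a 3D method (made-up numbers).
auc_2d = [0.85, 0.78, 0.91, 0.66, 0.74, 0.88, 0.69, 0.81, 0.77, 0.83]
auc_3d = [0.72, 0.70, 0.80, 0.58, 0.61, 0.79, 0.60, 0.70, 0.65, 0.75]
p_value = paired_permutation_test(auc_2d, auc_3d)
```

With real data, each list would hold one metric value per DUD target for the method in question, and a small `p_value` would suggest that the difference in mean performance is unlikely to arise from chance pairing alone.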