Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States.
Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., P.O. Box B, Frederick, Maryland 21702, United States.
J Chem Inf Model. 2021 Feb 22;61(2):699-714. doi: 10.1021/acs.jcim.0c00598. Epub 2021 Jan 25.
Enrichment of ligands versus property-matched decoys is widely used to test and optimize docking library screens. However, unconstrained optimization of enrichment alone can mislead, producing false confidence in prospective performance. This can arise from over-optimizing for enrichment against property-matched decoys without considering the full spectrum of molecules found in a true large library screen. Adding decoys representing charge extrema helps mitigate over-optimizing for electrostatic interactions. Adding decoys that represent the overall characteristics of the library to be docked allows one to sample molecules that are not represented by ligands and property-matched decoys but that one will encounter in a prospective screen. An optimized version of the DUD-E set (DUDE-Z), as well as Extrema sets and sets representing broad features of the library (Goldilocks), are developed here. We also explore the variability that one can encounter in enrichment calculations and how that should temper one's confidence in small enrichment differences. The new tools and new decoy sets are freely available at http://tldr.docking.org and http://dudez.docking.org.
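To make the point about variability in enrichment calculations concrete, the following is a minimal sketch of how one might estimate the uncertainty of an enrichment value. It rests on several assumptions that are not taken from the abstract: enrichment is measured here as plain ROC AUC (the abstract does not name a metric), lower docking scores are treated as better, the interval is a simple percentile bootstrap, and the scores and the function names `roc_auc` and `bootstrap_auc_interval` are hypothetical.

```python
import random


def roc_auc(ligand_scores, decoy_scores):
    """Fraction of ligand/decoy pairs in which the ligand ranks better.

    Assumes lower score = better; ties count as half a win.
    """
    wins = 0.0
    for lig in ligand_scores:
        for dec in decoy_scores:
            if lig < dec:
                wins += 1.0
            elif lig == dec:
                wins += 0.5
    return wins / (len(ligand_scores) * len(decoy_scores))


def bootstrap_auc_interval(ligand_scores, decoy_scores,
                           n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC.

    Ligands and decoys are resampled with replacement, independently.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_resamples):
        ligs = rng.choices(ligand_scores, k=len(ligand_scores))
        decs = rng.choices(decoy_scores, k=len(decoy_scores))
        aucs.append(roc_auc(ligs, decs))
    aucs.sort()
    # Clamp the percentile indices so they always stay in range.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return aucs[lo_idx], aucs[hi_idx]


# Hypothetical docking scores, for illustration only.
ligands = [-45.2, -41.0, -38.5, -30.1]
decoys = [-40.0, -35.0, -33.3, -28.0, -25.5, -20.0]

auc = roc_auc(ligands, decoys)  # 20 of 24 pairs are ranked correctly
low, high = bootstrap_auc_interval(ligands, decoys)
print(f"AUC = {auc:.3f}, 95% bootstrap interval = [{low:.3f}, {high:.3f}]")
```

With only a handful of ligands the interval is typically wide, which is the practical reason to be cautious about small differences in enrichment between two docking setups.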