Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94158, United States.
Program in Biophysics, University of California, San Francisco, San Francisco, California 94158, United States.
J Chem Inf Model. 2024 Oct 14;64(19):7398-7408. doi: 10.1021/acs.jcim.4c00683. Epub 2024 Oct 3.
Make-on-demand chemical libraries have drastically increased the reach of molecular docking, with the enumerated ready-to-dock ZINC-22 library approaching 6.4 billion molecules (July 2024). While ever-growing libraries result in better-scoring molecules, the computational resources required to dock all of ZINC-22 make this endeavor infeasible for most. Here, we organize and traverse chemical space with hierarchical navigable small-world graphs, a method we term retrieval augmented docking (RAD). RAD recovers most virtual actives, despite docking only a fraction of the library. Furthermore, RAD is protein-agnostic, supporting additional docking campaigns without additional computational overhead. In depth, we assess RAD on published large-scale docking campaigns against D4 and AmpC spanning 99.5 million and 138 million molecules, respectively. RAD recovers 95% of DOCK virtual actives for both targets after evaluating only 10% of the libraries. In breadth, RAD shows widespread applicability against 43 DUDE-Z proteins, evaluating 50.3 million associations. On average, RAD recovers 87% of virtual actives while docking 10% of the library without sacrificing chemical diversity.
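The idea of traversing a similarity graph while docking only a fraction of the library can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the "molecules" are random 2D points standing in for fingerprints, `toy_dock_score` is a hypothetical score (lower is better), and a flat brute-force k-nearest-neighbor graph replaces a real multi-layer HNSW index. The graph is built once from molecular similarity alone, which is why it could be reused for a different target by swapping the scoring function.

```python
import heapq
import random


def build_knn_graph(points, k):
    """Brute-force k-nearest-neighbor graph (a flat stand-in for one HNSW layer)."""
    graph = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        graph.append([j for _, j in dists[:k]])
    return graph


def toy_dock_score(point, pocket):
    """Hypothetical docking score: squared distance to `pocket`, lower is better."""
    return sum((a - b) ** 2 for a, b in zip(point, pocket))


def guided_traversal(points, graph, score_fn, seeds, budget):
    """Best-first traversal: repeatedly expand the best-scoring docked node.

    Returns {molecule index: score} for every molecule that was docked,
    never exceeding `budget` score evaluations.
    """
    scores = {}
    frontier = []
    for s in seeds[:budget]:
        scores[s] = score_fn(points[s])
        heapq.heappush(frontier, (scores[s], s))
    while frontier and len(scores) < budget:
        _, node = heapq.heappop(frontier)
        for nb in graph[node]:
            if len(scores) >= budget:
                break
            if nb not in scores:
                scores[nb] = score_fn(points[nb])
                heapq.heappush(frontier, (scores[nb], nb))
    return scores


def recall_of_top(all_scores, docked, top_n):
    """Fraction of the `top_n` best-scoring molecules that were actually docked."""
    top = sorted(range(len(all_scores)), key=all_scores.__getitem__)[:top_n]
    return sum(1 for i in top if i in docked) / top_n


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(400)]  # toy "library"
graph = build_knn_graph(points, k=8)  # built once, independent of the target

pocket = (0.3, 0.7)  # hypothetical target; changing it reuses the same graph


def score_fn(p):
    return toy_dock_score(p, pocket)


seeds = rng.sample(range(len(points)), 10)
budget = len(points) // 10  # dock only 10% of the library
docked = guided_traversal(points, graph, score_fn, seeds, budget)

# Exhaustive scoring is done here only to measure recall on the toy data.
all_scores = [score_fn(p) for p in points]
recall = recall_of_top(all_scores, docked, top_n=8)
print(f"docked {len(docked)} of {len(points)}; recall of top 8: {recall:.2f}")
```

In this toy setting the score varies smoothly over the space, so following the best-scoring neighbors tends to concentrate the budget near the top-scoring region. Real docking scores are far noisier, and a production index would use an actual HNSW library rather than a brute-force neighbor search.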