UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA.
Mol Inform. 2024 Jan;43(1):e202300207. doi: 10.1002/minf.202300207. Epub 2023 Dec 19.
The recent rapid expansion of make-on-demand, purchasable chemical libraries comprising tens of billions or even trillions of molecules has challenged the efficient application of traditional structure-based virtual screening methods that rely on molecular docking. We present a novel computational methodology termed HIDDEN GEM (HIt Discovery using Docking ENriched by GEnerative Modeling) that greatly accelerates virtual screening. This workflow uniquely integrates machine learning, generative chemistry, massive chemical similarity searching, and molecular docking of small, selected libraries at the beginning and the end of the workflow. For each target, HIDDEN GEM nominates a small number of top-scoring virtual hits prioritized from ultra-large chemical libraries. We benchmarked HIDDEN GEM by conducting virtual screening campaigns for 16 diverse protein targets using the Enamine REAL Space library comprising 37 billion molecules. We show that HIDDEN GEM yields the highest enrichment factors compared with state-of-the-art accelerated virtual screening methods, while requiring the least computational resources. HIDDEN GEM can be executed with any docking software and employed by users with limited computational resources.
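The abstract compares methods by enrichment factor without defining it. As a minimal sketch, assuming the commonly used definition (the hit rate among the nominated compounds divided by the hit rate across the whole library), it can be computed as below. The function name, its parameters, and the example counts are hypothetical illustrations; they are not taken from the paper, whose exact definition of a hit and of the enrichment factor may differ.

```python
def enrichment_factor(hits_selected: int, n_selected: int,
                      hits_total: int, n_total: int) -> float:
    """Enrichment factor under the common definition (an assumption here):

        EF = (hits_selected / n_selected) / (hits_total / n_total)

    hits_selected: number of hits among the nominated compounds
    n_selected:    number of nominated compounds
    hits_total:    number of hits in the whole library
    n_total:       size of the whole library
    """
    if n_selected <= 0 or n_total <= 0:
        raise ValueError("n_selected and n_total must be positive")
    if hits_total <= 0:
        raise ValueError("hits_total must be positive; EF is undefined otherwise")
    if hits_selected > n_selected or hits_total > n_total:
        raise ValueError("hit counts cannot exceed the corresponding set sizes")
    # Algebraically the same ratio, rearranged so that only one division occurs.
    return (hits_selected * n_total) / (n_selected * hits_total)


if __name__ == "__main__":
    # Made-up counts for illustration only: 50 hits among 1,000 nominated
    # compounds, drawn from a library of 1,000,000 compounds with 1,000 hits.
    print(enrichment_factor(50, 1_000, 1_000, 1_000_000))
```

An enrichment factor of 1 means the selection does no better than random sampling from the library; larger values mean the nominated set is proportionally richer in hits.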