

Machine Learning-Boosted Docking Enables the Efficient Structure-Based Virtual Screening of Giga-Scale Enumerated Chemical Libraries.

Affiliations

School of Pharmacy, University of Eastern Finland, Kuopio FI-70211, Finland.

CSC─IT Center for Science Ltd., Espoo FI-02101, Finland.

Publication information

J Chem Inf Model. 2023 Sep 25;63(18):5773-5783. doi: 10.1021/acs.jcim.3c01239. Epub 2023 Sep 1.

Abstract

The emergence of ultra-large screening libraries, filled to the brim with billions of readily available compounds, poses a growing challenge for docking-based virtual screening. Machine learning (ML)-boosted strategies like the tool HASTEN combine rapid ML prediction with the brute-force docking of small fractions of such libraries to increase screening throughput and take on giga-scale libraries. In our case study of an anti-bacterial chaperone and an anti-viral kinase, we first generated a brute-force docking baseline for 1.56 billion compounds in the Enamine REAL lead-like library with the fast Glide high-throughput virtual screening protocol. With HASTEN, we observed robust recall of 90% of the true 1000 top-scoring virtual hits in both targets when docking only 1% of the entire library. This reduction of the required docking experiments by 99% significantly shortens the screening time. In the kinase target, the employment of a hydrogen bonding constraint resulted in a major proportion of unsuccessful docking attempts and hampered ML predictions. We demonstrate the optimization potential in the treatment of failed compounds when performing ML-boosted screening and benchmark and showcase HASTEN as a fast and robust tool in a growing arsenal of approaches to unlock the chemical space covered by giga-scale screening libraries for everyday drug discovery campaigns.
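To make the headline numbers concrete, the minimal Python sketch below computes the docking budget implied by screening 1% of a 1.56 billion compound library and shows one way to calculate top-k recall against a brute-force baseline. It is an illustration only, not HASTEN's code: the function name top_k_recall, the compound IDs, and the scores are hypothetical, and it assumes that lower (more negative) docking scores are better.

```python
import heapq


def top_k_recall(baseline_scores, docked_ids, k):
    """Fraction of the k best baseline compounds found among the docked ones.

    baseline_scores: dict mapping compound ID -> brute-force docking score.
    docked_ids: iterable of compound IDs docked during the boosted screen.
    Assumption: a lower (more negative) score means a better pose.
    """
    true_top = set(heapq.nsmallest(k, baseline_scores, key=baseline_scores.get))
    return len(true_top & set(docked_ids)) / k


# Docking budget from the abstract: 1% of 1.56 billion compounds.
library_size = 1_560_000_000
n_docked = library_size // 100  # 15.6 million docking runs instead of 1.56 billion

# Toy baseline with made-up IDs and scores (hypothetical data).
baseline = {
    "c0": -9.5, "c1": -4.2, "c2": -8.8, "c3": -3.1, "c4": -7.9,
    "c5": -5.0, "c6": -8.1, "c7": -2.4, "c8": -6.3, "c9": -1.0,
}
# Compounds the ML-guided procedure chose to dock (hypothetical selection).
docked = ["c0", "c2", "c4", "c1", "c5"]

recall = top_k_recall(baseline, docked, k=4)
print(f"docked {n_docked:,} of {library_size:,} compounds")
print(f"top-4 recall on toy data: {recall:.2f}")
```

In the toy data the true top four are c0, c2, c6, and c4, and three of them were docked, so the recall is 0.75. The study reports the analogous quantity with k = 1000 on the full library, where a recall of 90% means that about 900 of the true 1000 best-scoring compounds were recovered.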


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bac7/10523430/30d238f61b23/ci3c01239_0002.jpg
