Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California 90089, United States.
Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States.
J Chem Inf Model. 2024 Aug 26;64(16):6450-6463. doi: 10.1021/acs.jcim.4c01193. Epub 2024 Jul 26.
Recently, the remarkable growth of available crystal structure data and libraries of commercially available or readily synthesizable molecules has unlocked previously inaccessible regions of chemical space for drug development. Paired with improvements in virtual ligand screening methods, these expanded libraries are having a notable impact on early drug design efforts. Yet screening-based methods still face scalability limits due to computational constraints and the sheer scale of drug-like space. Machine learning approaches are overcoming these limitations by learning the fundamental intra- and intermolecular relationships in drug-target systems from existing data. Here, we introduce DrugHIVE, a deep hierarchical variational autoencoder that outperforms state-of-the-art autoregressive and diffusion-based methods in both speed and performance on common generative benchmarks. DrugHIVE's hierarchical design enables improved control over molecular generation. Its capabilities include dramatically increasing virtual screening efficiency and accelerating a wide range of common drug design tasks, including de novo generation, molecular optimization, scaffold hopping, linker design, and high-throughput pattern replacement. Our highly scalable method can even be applied to receptors with high-confidence AlphaFold-predicted structures, extending the ability to generate high-quality drug-like molecules to a majority of the unsolved human proteome.