Filella-Merce Isaac, Molina Alexis, Díaz Lucía, Orzechowski Marek, Berchiche Yamina A, Zhu Yang Ming, Vilalta-Mor Júlia, Malo Laura, Yekkirala Ajay S, Ray Soumya, Guallar Victor
Barcelona Supercomputing Center (BSC), Barcelona, Spain.
Nostrum Biodiscovery, Barcelona, Spain.
Commun Chem. 2025 Aug 8;8(1):238. doi: 10.1038/s42004-025-01635-7.
Machine learning is transforming drug discovery, with generative models (GMs) gaining attention for their ability to design molecules with specific properties. However, GMs often struggle with target engagement, synthetic accessibility, or generalization. To address these limitations, we developed a GM workflow that integrates a variational autoencoder with two nested active learning cycles. These cycles iteratively refine the model's output using chemoinformatics and molecular modeling predictors. We tested the workflow on two systems, CDK2 and KRAS, and generated diverse, drug-like molecules with high predicted affinity and synthetic accessibility. Notably, we generated novel scaffolds distinct from those known for each target. For CDK2, we synthesized 9 molecules, 8 of which showed in vitro activity, including one with nanomolar potency. For KRAS, in silico methods validated by the CDK2 assays identified 4 molecules with potential activity. These findings showcase the workflow's ability to explore novel chemical spaces tailored to specific targets, thereby opening new avenues in drug discovery.
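The nested active learning idea described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the "molecules" are plain floats, `generate` stands in for sampling from the variational autoencoder, `cheap_filter` stands in for the chemoinformatics predictors, and `expensive_score` stands in for a molecular modeling predictor. All function names, parameters, and the assignment of the cheap predictor to the inner cycle and the expensive one to the outer cycle are assumptions made for illustration only.

```python
import random


def generate(training_set, n, rng):
    """Toy stand-in for VAE sampling: perturb random members of the training set."""
    return [rng.choice(training_set) + rng.gauss(0.0, 1.0) for _ in range(n)]


def cheap_filter(x):
    """Hypothetical fast check (stand-in for chemoinformatics predictors)."""
    return abs(x) < 10.0


def expensive_score(x):
    """Hypothetical slow score (stand-in for molecular modeling); lower is better."""
    return (x - 5.0) ** 2


def nested_active_learning(seed_set, outer_rounds=3, inner_rounds=4,
                           batch=50, keep=10, seed=0):
    rng = random.Random(seed)
    training_set = list(seed_set)
    for _ in range(outer_rounds):
        candidates = []
        working_set = list(training_set)
        # Inner cycle: generate, apply the cheap filter, and feed the
        # survivors back so later samples are drawn from a refined set.
        for _ in range(inner_rounds):
            samples = generate(working_set, batch, rng)
            passed = [s for s in samples if cheap_filter(s)]
            candidates.extend(passed)
            working_set.extend(passed)
        # Outer cycle: rank accumulated candidates with the expensive score
        # and promote only the best ones into the permanent training set.
        best = sorted(candidates, key=expensive_score)[:keep]
        training_set.extend(best)
    return training_set


if __name__ == "__main__":
    result = nested_active_learning([0.0])
    print(len(result), min(expensive_score(x) for x in result))
```

In this toy version "retraining" is simply growing the pool that samples are drawn from; in a real workflow each cycle would fine-tune the generative model on the selected molecules.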