PASTEUR, Département de chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France.
Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400 Vitry-sur-Seine, France.
J Chem Inf Model. 2020 Dec 28;60(12):5637-5646. doi: 10.1021/acs.jcim.0c01015. Epub 2020 Dec 10.
A major application of generative models in drug discovery is the lead-optimization phase. During the optimization of a lead series, scaffold constraints are commonly imposed on the structures of the designed molecules. Unless such constraints are enforced, the probability of generating molecules with the required scaffold is extremely low, which limits the practicality of generative models for de novo drug design. To tackle this issue, we introduce a new algorithm, named SAMOA (Scaffold Constrained Molecular Generation), to perform scaffold-constrained in silico molecular design. We build on the well-known SMILES-based recurrent neural network (RNN) generative model and modify its sampling procedure to achieve scaffold-constrained generation. The approach directly benefits from the associated reinforcement learning methods, which makes it possible to design molecules optimized for different properties while exploring only the relevant chemical space. We showcase the method's ability to perform scaffold-constrained generation on three tasks: designing novel molecules around scaffolds extracted from SureChEMBL chemical series, generating novel active molecules for the dopamine receptor D2 (DRD2) target, and designing predicted actives for the MMP-12 series, an industrial lead-optimization project.
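The abstract does not spell out the modified sampling procedure, so the following is only a minimal sketch of the general idea of scaffold-constrained sampling, not the SAMOA algorithm itself. It assumes a scaffold written as a SMILES-like string in which `*` marks an open position, copies the fixed scaffold tokens unchanged, and samples new tokens only at the open positions. The uniform `toy_next_token_probs` function is a hypothetical stand-in for a trained SMILES RNN, and the names `sample_decoration`, `sample_with_scaffold`, the `<end>` token, and the three-atom vocabulary are all illustrative assumptions.

```python
import random

ATOMS = ["C", "N", "O"]   # toy decoration vocabulary (assumption)
END = "<end>"             # hypothetical stop token for one decoration
WILDCARD = "*"            # marks an open position in the scaffold string


def toy_next_token_probs(prefix):
    """Stand-in for a trained SMILES RNN: uniform over ATOMS plus END."""
    vocab = ATOMS + [END]
    return {tok: 1.0 / len(vocab) for tok in vocab}


def sample_decoration(prefix, rng, max_len=5):
    """Sample a linear chain of atoms for one open position."""
    out = []
    while len(out) < max_len:
        probs = toy_next_token_probs(prefix + out)
        if not out:
            # Forbid stopping immediately so every open position gets an atom.
            probs = {t: p for t, p in probs.items() if t != END}
        tokens, weights = zip(*probs.items())
        tok = rng.choices(tokens, weights=weights, k=1)[0]
        if tok == END:
            break
        out.append(tok)
    return out


def sample_with_scaffold(scaffold, seed=0):
    """Copy fixed scaffold tokens; sample only where the scaffold is open."""
    rng = random.Random(seed)
    generated = []
    # One character per token, which is adequate for this toy scaffold only.
    for tok in scaffold:
        if tok == WILDCARD:
            generated.extend(sample_decoration(generated, rng))
        else:
            generated.append(tok)  # constrained step: no sampling here
    return "".join(generated)


if __name__ == "__main__":
    print(sample_with_scaffold("c1ccc(*)cc1*", seed=1))
```

In this sketch every generated string contains the scaffold by construction, which is the property the abstract contrasts with unconstrained sampling. A real implementation would need a proper SMILES tokenizer, a trained model whose hidden state is updated on the copied scaffold tokens, and handling of branches and ring closures inside decorations.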