Olivecrona Marcus, Blaschke Thomas, Engkvist Ola, Chen Hongming
Hit Discovery, Discovery Sciences, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, 43183, Mölndal, Sweden.
J Cheminform. 2017 Sep 4;9(1):48. doi: 10.1186/s13321-017-0235-x.
This work introduces a method for tuning a sequence-based generative model for de novo molecular design so that, through augmented episodic likelihood, it learns to generate structures with specified desirable properties. We demonstrate that the model can carry out a range of tasks, such as generating analogues of a query structure and generating compounds predicted to be active against a biological target. As a proof of principle, the model is first trained to generate molecules that do not contain sulphur. As a second example, the model is trained to generate analogues of the drug Celecoxib, a technique that could be used for scaffold hopping or library expansion starting from a single molecule. Finally, when the model is tuned towards generating compounds predicted to be active against the dopamine receptor type 2, more than 95% of the structures it generates are predicted to be active, including experimentally confirmed actives that were not used to train either the generative model or the activity prediction model.
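The abstract does not spell out the augmented episodic likelihood, so the following is only a minimal sketch under stated assumptions: the prior model's log-likelihood of a generated sequence is shifted by a scalar `sigma` times a task score, and the tuned model (the agent) is penalised by the squared gap between that target and its own log-likelihood. The function names, the crude SMILES tokenizer, the value of `sigma`, and the 1/0 sulphur score are all illustrative and hypothetical; a real pipeline would parse and validate structures with a cheminformatics toolkit.

```python
import re

# Hypothetical, deliberately crude SMILES atom tokenizer: bracket atoms,
# then two-letter halogens, then single-letter organic-subset atoms.
_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Inside a bracket atom: optional isotope digits, then the element symbol.
_BRACKET_ELEMENT = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def contains_sulphur(smiles: str) -> bool:
    """Return True if any atom token is sulphur (aliphatic S or aromatic s)."""
    for token in _ATOM.findall(smiles):
        if token.startswith("["):
            match = _BRACKET_ELEMENT.match(token)
            if match and match.group(1) in ("S", "s"):
                return True
        elif token in ("S", "s"):
            return True
    return False


def no_sulphur_score(smiles: str) -> float:
    """Illustrative task score: 1.0 without sulphur, 0.0 with it.
    Validity checking is omitted here (assumption: input is a valid SMILES)."""
    return 0.0 if contains_sulphur(smiles) else 1.0


def augmented_log_likelihood(prior_ll: float, score: float, sigma: float) -> float:
    """Assumed form: prior log-likelihood shifted by sigma * score."""
    return prior_ll + sigma * score


def agent_loss(agent_ll: float, prior_ll: float, score: float, sigma: float) -> float:
    """Squared gap between the augmented target and the agent's log-likelihood."""
    return (augmented_log_likelihood(prior_ll, score, sigma) - agent_ll) ** 2


if __name__ == "__main__":
    celecoxib = "CC1=CC=C(C=C1)C2=CC(=NN2C3=CC=C(C=C3)S(=O)(=O)N)C(F)(F)F"
    for smi in (celecoxib, "c1ccccc1"):
        s = no_sulphur_score(smi)
        # Log-likelihoods of -20.0 and sigma = 5.0 are made-up illustrative numbers.
        print(smi, s, agent_loss(-20.0, -20.0, s, sigma=5.0))
```

In this sketch a sequence that scores well raises the target likelihood above the prior's, so minimising the loss pulls the agent towards such sequences while keeping it anchored to the prior.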