Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In pre-clinical early drug discovery, novel compounds are often designed by structurally modifying an existing promising starting compound to further optimize its properties. Recently, transformer-based deep learning models have been explored for molecular optimization by training on pairs of similar molecules. This provides a starting point for generating molecules similar to a given input molecule, but offers limited flexibility with respect to user-defined property profiles. Here, we evaluate the effect of reinforcement learning on transformer-based molecular generative models. The generative model can be considered a pre-trained model with knowledge of the chemical space close to an input compound, while reinforcement learning can be viewed as a tuning phase that steers the model towards chemical space with user-specific desirable properties. The evaluation on two distinct tasks, molecular optimization and scaffold discovery, suggests that reinforcement learning could guide the transformer-based generative model towards generating more compounds of interest. Additionally, the impact of pre-trained models, learning steps and learning rates is investigated.

Scientific contribution

Our study investigates the effect of reinforcement learning on a transformer-based generative model initially trained to generate molecules similar to starting molecules. The reinforcement learning framework is applied to facilitate multiparameter optimization of starting molecules. This approach allows more flexibility in optimizing user-specific property profiles and helps find more ideas of interest.
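The abstract does not state which reinforcement learning objective or scoring scheme is used. As an illustration only, the sketch below shows one common formulation from REINVENT-style molecular design: per-property scores in [0, 1] are combined into a single score, and the agent (the model being tuned) is pulled towards an "augmented" likelihood built from the pre-trained prior's log-likelihood plus a scaled score. The function names, the geometric-mean aggregation, and the default `sigma` are assumptions made for this example, not details taken from the study.

```python
from math import prod


def aggregate_score(component_scores):
    """Combine per-property scores (each in [0, 1]) into one score.

    Assumption: unweighted geometric mean, so any single property
    scoring 0 drives the overall score to 0.
    """
    if not component_scores:
        raise ValueError("at least one component score is required")
    return prod(component_scores) ** (1.0 / len(component_scores))


def rl_loss(prior_loglik, agent_loglik, score, sigma=60.0):
    """REINVENT-style loss for one sampled molecule (illustrative).

    prior_loglik: log-likelihood of the molecule under the frozen
                  pre-trained model.
    agent_loglik: log-likelihood under the model being tuned.
    score:        aggregated property score in [0, 1].
    sigma:        hypothetical scaling factor weighting the score.
    """
    # The target likelihood rewards high-scoring molecules.
    augmented_loglik = prior_loglik + sigma * score
    # Squared difference between target and agent log-likelihoods.
    return (augmented_loglik - agent_loglik) ** 2


if __name__ == "__main__":
    s = aggregate_score([0.25, 1.0])
    print(s, rl_loss(-20.0, -20.0, s, sigma=10.0))
```

In an actual training loop the log-likelihoods would come from the transformer conditioned on the starting molecule, and the loss would be averaged over a batch of sampled molecules before a gradient step on the agent only.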
