Thomas Morgan, O'Boyle Noel M, Bender Andreas, de Graaf Chris
Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK.
Computational Chemistry, Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG, UK.
J Cheminform. 2022 Oct 3;14(1):68. doi: 10.1186/s13321-022-00646-z.
A plethora of AI-based techniques now exists to conduct de novo molecule generation that can devise molecules conditioned towards a particular endpoint in the context of drug design. One popular approach is using reinforcement learning to update a recurrent neural network or language-based de novo molecule generator. However, reinforcement learning can be inefficient, sometimes requiring on the order of 10^5 molecules to be sampled to optimize more complex objectives, which poses a limitation when using computationally expensive scoring functions such as docking or computer-aided synthesis planning models. In this work, we propose a reinforcement learning strategy called Augmented Hill-Climb, a simple, hypothesis-driven hybrid of REINVENT and Hill-Climb that improves sample-efficiency by addressing the limitations of both currently used strategies. We compare its ability to optimize several docking tasks with REINVENT and benchmark it against other commonly used reinforcement learning strategies, including REINFORCE, REINVENT (versions 1 and 2), Hill-Climb and best agent reminder. We find that optimization ability is improved ~1.5-fold and sample-efficiency is improved ~45-fold compared to REINVENT, while still delivering appealing chemistry as output. Diversity filters were used, and their parameters were tuned to mitigate observed failure modes that exploit certain diversity filter configurations. We find that Augmented Hill-Climb outperforms the other reinforcement learning strategies on six tasks, especially in the early stages of training and for more difficult objectives. Lastly, we show improved performance not only with recurrent neural networks but also with a reinforcement learning-stabilized transformer architecture. Overall, we show that Augmented Hill-Climb improves sample-efficiency when conditioning language-based de novo molecule generation via reinforcement learning, compared to the current state-of-the-art. This makes more computationally expensive scoring functions, such as docking, accessible on a relevant timescale.
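As a rough illustration of the hybrid described above, the sketch below shows one plausible form of an Augmented Hill-Climb update step: only the top-scoring fraction of each sampled batch (the Hill-Climb component) is used to compute a REINVENT-style augmented-likelihood regression loss. This is a minimal PyTorch sketch under assumed naming; agent_nll, prior_nll, score_fn-style inputs, sigma = 60 and the 25% top-k fraction are illustrative choices, not values or code taken from the paper.

```python
import torch

def ahc_loss(agent_nll, prior_nll, scores, sigma=60.0, topk_fraction=0.25):
    """Illustrative Augmented Hill-Climb loss for one sampled batch.

    agent_nll, prior_nll : per-molecule negative log-likelihoods of the sampled
                           SMILES under the agent and the fixed prior (assumed
                           to be provided by the generator; names are ours).
    scores               : scoring-function values in [0, 1] for the same batch.
    sigma                : weight of the score in the augmented likelihood.
    topk_fraction        : fraction of the batch kept for the update.
    """
    # Hill-Climb component: keep only the best-scoring fraction of the batch.
    k = max(1, int(topk_fraction * scores.numel()))
    top_idx = torch.topk(scores, k).indices

    # REINVENT component: augmented (negative) log-likelihood combines the
    # prior likelihood with the scaled score of each kept molecule.
    augmented_nll = prior_nll[top_idx] - sigma * scores[top_idx]

    # Regress the agent's likelihood towards the augmented likelihood.
    return torch.mean((augmented_nll - agent_nll[top_idx]) ** 2)

if __name__ == "__main__":
    # Stand-in tensors for illustration only; in practice these come from
    # sampling the agent and scoring the resulting molecules.
    batch = 64
    agent_nll = torch.rand(batch) * 30
    prior_nll = torch.rand(batch) * 30
    scores = torch.rand(batch)
    print(ahc_loss(agent_nll, prior_nll, scores))
```

In this reading, restricting the augmented-likelihood update to the top-k molecules is what recovers Hill-Climb's sample-efficiency while retaining REINVENT's prior regularization; the actual selection fraction and sigma used in the study should be taken from the paper itself.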