Sample efficient reinforcement learning with active learning for molecular design

Authors

Dodds Michael, Guo Jeff, Löhr Thomas, Tibo Alessandro, Engkvist Ola, Janet Jon Paul

Affiliations

Molecular AI, Discovery Sciences, R&D, AstraZeneca, 431 50 Gothenburg, Sweden

Publication Information

Chem Sci. 2024 Feb 8;15(11):4146-4160. doi: 10.1039/d3sc04653b. eCollection 2024 Mar 13.

Abstract

Reinforcement learning (RL) is a powerful and flexible paradigm for searching for solutions in high-dimensional action spaces. However, bridging the gap between playing computer games with thousands of simulated episodes and solving real scientific problems with complex and involved environments (up to actual laboratory experiments) requires improvements in terms of sample efficiency to make the most of expensive information. The discovery of new drugs is a major commercial application of RL, motivated by the very large nature of the chemical space and the need to perform multiparameter optimization (MPO) across different properties. Methods such as virtual library screening (VS) and molecular generation with RL show great promise in accelerating this search. However, incorporation of increasingly complex computational models in these workflows requires increasing sample efficiency. Here, we introduce an active learning system linked with an RL model (RL-AL) for molecular design, which aims to improve the sample efficiency of the optimization process. We identify and characterize unique challenges in combining RL and AL, investigate the interplay between the two systems, and develop a novel AL approach to solve the MPO problem. Our approach greatly expedites the search for novel solutions relative to baseline RL for simple ligand- and structure-based oracle functions, with a 5-66-fold increase in hits generated for a fixed oracle budget and a 4-64-fold reduction in computational time to find a specific number of hits. Furthermore, compounds discovered through RL-AL display substantial enrichment of a multi-parameter scoring objective, indicating superior efficacy in curating high-scoring compounds without a reduction in output diversity. This significant acceleration improves the feasibility of oracle functions that have largely been overlooked in RL due to high computational costs, for example free energy perturbation methods, and in principle is applicable to any RL domain.
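To make the RL-AL coupling described above concrete, the sketch below shows the general shape of such a loop: a generative agent proposes a batch of candidates, a cheap surrogate model scores them, only the most promising or uncertain ones are sent to the expensive oracle, and the surrogate's predictions stand in as rewards for the rest. Everything here is a toy stand-in for illustration only (1-D "molecule" descriptors, an oracle with an optimum at 3, a nearest-neighbour surrogate, a greedy policy nudge); it is not the actual generative agent, acquisition function, or docking/FEP oracles used in the paper.

```python
"""Minimal, hypothetical sketch of an RL-AL loop for molecular design.

Toy stand-ins throughout: not the paper's agent, surrogate, or oracles.
"""

import random

random.seed(0)


def generate_batch(policy_bias: float, n: int = 32) -> list[float]:
    # Toy "generative agent": samples 1-D descriptors around a learnable
    # bias, standing in for an RL policy over molecular structures.
    return [random.gauss(policy_bias, 1.0) for _ in range(n)]


def oracle(x: float) -> float:
    # Toy expensive oracle (a docking or FEP score in the paper's setting);
    # here, a simple quadratic with its optimum at x = 3.
    return -(x - 3.0) ** 2


class Surrogate:
    # Cheap proxy retrained on all oracle-labelled data seen so far.
    # A 1-nearest-neighbour predictor keeps the sketch dependency-free.
    def __init__(self) -> None:
        self.xs: list[float] = []
        self.ys: list[float] = []

    def fit(self, xs: list[float], ys: list[float]) -> None:
        self.xs += xs
        self.ys += ys

    def predict(self, x: float) -> tuple[float, float]:
        if not self.xs:
            return 0.0, float("inf")  # no data yet: maximal uncertainty
        d, y = min((abs(x - xi), yi) for xi, yi in zip(self.xs, self.ys))
        return y, d  # distance to nearest label as a crude uncertainty


surrogate = Surrogate()
policy_bias = 0.0
ORACLE_BUDGET_PER_STEP = 4  # only a few expensive calls per RL step

for step in range(20):
    batch = generate_batch(policy_bias)
    # Acquisition: rank the batch by predicted score + uncertainty and
    # send only the top candidates to the expensive oracle.
    ranked = sorted(batch, key=lambda x: sum(surrogate.predict(x)), reverse=True)
    acquired = ranked[:ORACLE_BUDGET_PER_STEP]
    surrogate.fit(acquired, [oracle(x) for x in acquired])
    # Reward the whole batch with surrogate predictions, so the policy
    # update sees a full batch at a fraction of the oracle cost.
    rewards = [surrogate.predict(x)[0] for x in batch]
    # Toy "policy update": nudge the generator toward the best sample.
    best = max(zip(rewards, batch))[1]
    policy_bias += 0.3 * (best - policy_bias)

print(f"final policy bias ~ {policy_bias:.2f} (toy optimum at 3.0)")
```

Run as-is, the toy policy drifts toward the oracle optimum while spending only 4 oracle calls per step on a batch of 32, which is the essential sample-efficiency trade the abstract describes.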

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e919/10935729/ecd963225897/d3sc04653b-f1.jpg
