Reinforcement Learning for Improving Chemical Reaction Performance.

Author information

Hoque Ajnabiul, Surve Mihir, Kalyanakrishnan Shivaram, Sunoj Raghavan B

Affiliations

Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.

Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.

Publication information

J Am Chem Soc. 2024 Oct 2. doi: 10.1021/jacs.4c08866.

DOI:10.1021/jacs.4c08866

PMID:39356950

Abstract

Deep learning (DL) methods have gained prominence in predictive and generative tasks in molecular space. However, they remain underutilized for chemical reactions. Chemical reactions are intrinsically complex: they typically involve multiple molecules in addition to bond-breaking/forming events. In reaction discovery, one aims to maximize yield and/or selectivity, which depend on a number of factors, mostly centered on the reacting partners and reaction conditions. Herein, we introduce RE-EXPLORE, a novel approach that integrates deep reinforcement learning (RL) with an RNN-based deep generative model to identify prospective new reactants/catalysts, whose yield/selectivity is estimated using a pretrained regressor. Three chemical databases (ChEMBL, ZINC, and COCONUT, containing half a million to one million unlabeled molecules) are independently used for pretraining the generators to enrich them with valuable information from diverse chemical space. Standard RL methods are found to be insufficient, as learners tend to prioritize exploitation for immediate gains, resulting in repetitive generation of the same or similar molecules. Our engineered reward function includes a Tanimoto-based uniqueness factor within the RL loop that improved exploration of the environment and helped accrue larger returns. Integration of a user-defined core fragment into the generated molecules facilitated learning of specific reaction types. Together, RE-EXPLORE can navigate the reaction space toward practically meaningful regions and offers notable improvements across the three distinct reaction types considered in this study. It identifies high-yielding substrates and highly enantioselective chiral catalysts. This RL-based approach has the potential to expedite reaction discovery and aid in the synthesis planning of important compounds, including drugs and pharmaceuticals.
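
The abstract does not give the functional form of the Tanimoto-based uniqueness factor, so the following is only a minimal sketch of the general idea, not the authors' reward function. It assumes that a fingerprint is represented as a set of "on" bit indices, that the regressor's predicted yield/selectivity is scaled to [0, 1], and that the factor is one minus the highest Tanimoto similarity to previously generated molecules. The names `uniqueness_factor`, `shaped_reward`, and `memory` are hypothetical.

```python
from typing import FrozenSet, Sequence

# Assumption: a molecular fingerprint is modeled as the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def uniqueness_factor(candidate: Fingerprint, memory: Sequence[Fingerprint]) -> float:
    """Hypothetical factor: 1 minus the highest similarity to any earlier molecule."""
    if not memory:
        return 1.0  # nothing generated yet, so the candidate is fully novel
    return 1.0 - max(tanimoto(candidate, seen) for seen in memory)


def shaped_reward(predicted_score: float,
                  candidate: Fingerprint,
                  memory: Sequence[Fingerprint]) -> float:
    """Scale the regressor's prediction (assumed in [0, 1]) by the uniqueness factor."""
    return predicted_score * uniqueness_factor(candidate, memory)


if __name__ == "__main__":
    seen = [frozenset({1, 2, 3})]
    novel = frozenset({2, 3, 4})      # shares 2 of 4 distinct bits with the seen molecule
    repeat = frozenset({1, 2, 3})     # exact repeat of the seen molecule
    print(shaped_reward(0.8, novel, seen))
    print(shaped_reward(0.8, repeat, seen))
```

Under these assumptions, an exact repeat of an earlier molecule earns zero reward regardless of its predicted score, which is one simple way to discourage the repetitive generation the abstract describes. A real pipeline would compute fingerprints with a cheminformatics toolkit rather than hand-written sets.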
