Materials discovery with extreme properties via reinforcement learning-guided combinatorial chemistry.

Authors

Kim Hyunseung, Choi Haeyeon, Kang Dongju, Lee Won Bo, Na Jonggeol

Affiliations

School of Chemical and Biological Engineering, Seoul National University, Republic of Korea

Department of Chemical Engineering and Materials Science, Ewha Womans University, Republic of Korea

Publication information

Chem Sci. 2024 Apr 24;15(21):7908-7925. doi: 10.1039/d3sc05281h. eCollection 2024 May 29.

DOI:10.1039/d3sc05281h

PMID:38817562

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11134411/

Abstract

The goal of most materials discovery is to find materials that are superior to those currently known. Fundamentally, this is close to extrapolation, which is a weak point for most machine learning models that learn the probability distribution of data. Herein, we develop reinforcement learning-guided combinatorial chemistry, a rule-based molecular designer driven by a trained policy that selects subsequent molecular fragments to build a target molecule. Since our model has the potential to generate all possible molecular structures that can be obtained from combinations of molecular fragments, unknown molecules with superior properties can be discovered. We theoretically and empirically demonstrate that our model is more suitable for discovering better compounds than probability distribution-learning models. In an experiment aimed at discovering molecules that hit seven extreme target properties, our model discovered 1315 molecules that hit all seven targets and 7629 molecules that hit five targets out of 100 000 trials, whereas the probability distribution-learning models failed. Moreover, every molecule generated under the binding rules of molecular fragments was confirmed to be chemically valid. To illustrate performance on real problems, we also demonstrate that our model works well in two practical applications: discovering protein docking molecules and HIV inhibitors.
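The fragment-by-fragment design loop described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the fragment names, the site types `a`/`b`/`c`, the `ALLOWED` binding rules, and the `max_fragments` cutoff are hypothetical, and a uniform random choice stands in for the trained policy. The point it shows is that when every action is drawn only from rule-permitted attachments, each assembled structure satisfies the binding rules by construction.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical toy fragment: a name plus its open attachment-site types."""
    name: str
    sites: tuple


# Hypothetical binding rules: unordered pairs of site types allowed to bond.
ALLOWED = {frozenset(("a", "b")), frozenset(("b", "c"))}

# Hypothetical fragment library.
FRAGMENTS = [
    Fragment("F1", ("a",)),
    Fragment("F2", ("b",)),
    Fragment("F3", ("a", "b")),
    Fragment("F4", ("b", "c")),
    Fragment("F5", ("c",)),
]


def can_bond(s, t):
    """True if site types s and t may be joined under the toy rules."""
    return frozenset((s, t)) in ALLOWED


def valid_actions(open_sites):
    """Enumerate (open-site index, fragment, fragment-site index) allowed by the rules."""
    return [
        (i, frag, j)
        for i, s in enumerate(open_sites)
        for frag in FRAGMENTS
        for j, t in enumerate(frag.sites)
        if can_bond(s, t)
    ]


def random_policy(open_sites, actions, rng):
    """Stand-in for a trained policy: picks uniformly among the valid actions."""
    return rng.choice(actions)


def build_molecule(policy, rng, max_fragments=6):
    """Assemble one structure by repeatedly attaching a fragment chosen by the policy."""
    start = rng.choice(FRAGMENTS)
    open_sites = list(start.sites)
    names, bonds = [start.name], []
    while open_sites and len(names) < max_fragments:
        actions = valid_actions(open_sites)
        if not actions:
            break
        i, frag, j = policy(open_sites, actions, rng)
        bonds.append((open_sites[i], frag.sites[j]))  # record the bonded site pair
        open_sites.pop(i)                             # that site is now used
        # The new fragment's remaining sites become open sites of the molecule.
        open_sites.extend(t for k, t in enumerate(frag.sites) if k != j)
        names.append(frag.name)
    return names, bonds, open_sites


rng = random.Random(0)
names, bonds, open_sites = build_molecule(random_policy, rng)
print(names, bonds, open_sites)
```

In a reinforcement learning setting, `random_policy` would be replaced by a learned function that scores the valid actions, with a reward computed from the properties of the finished molecule. Any sites still open when the toy cutoff is reached are simply returned here.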
