A transfer learning approach for reaction discovery in small data situations using generative model.

Authors

Singh Sukriti, Sunoj Raghavan B

Affiliations

Department of Chemistry, Indian Institute of Technology Bombay, Mumbai 400076, India.

Centre for Machine Intelligence and Data Science, Indian Institute of Technology Bombay, Mumbai 400076, India.

Publication information

iScience. 2022 Jun 22;25(7):104661. doi: 10.1016/j.isci.2022.104661. eCollection 2022 Jul 15.

DOI:10.1016/j.isci.2022.104661

PMID:35832891

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9272387/

Abstract

Sustainable practices in chemical sciences can be better realized by adopting interdisciplinary approaches that combine the advantages of machine learning (ML) on the initially acquired small data in reaction discovery. Developing new reactions generally remains heuristic and even time and resource intensive. For instance, synthesis of fluorine-containing compounds, which constitute ∼20% of the marketed drugs, relies on deoxyfluorination of abundantly available alcohols. Herein, we demonstrate the use of a recurrent neural network-based deep generative model built on a library of just 37 alcohols for effective learning and exploration of the chemical space. The proof-of-concept ML model is able to generate good quality, synthetically accessible, higher-yielding novel alcohol molecules. This protocol would have superior utility for deployment into a practical reaction discovery pipeline.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b0e/9272387/ba08e80a45cb/fx1.jpg

Similar articles

Molecular Machine Learning for Chemical Catalysis: Prospects and Challenges.

Acc Chem Res. 2023 Feb 7;56(3):402-412. doi: 10.1021/acs.accounts.2c00801. Epub 2023 Jan 30.

Combining Cloud-Based Free-Energy Calculations, Synthetically Aware Enumerations, and Goal-Directed Generative Machine Learning for Rapid Large-Scale Chemical Exploration and Optimization.

J Chem Inf Model. 2020 Sep 28;60(9):4311-4325. doi: 10.1021/acs.jcim.0c00120. Epub 2020 Jun 19.

De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update.

J Chem Inf Model. 2022 Feb 28;62(4):761-774. doi: 10.1021/acs.jcim.1c01361. Epub 2022 Feb 7.

Computational Discovery of TTF Molecules with Deep Generative Models.

Front Chem. 2021 Dec 23;9:800133. doi: 10.3389/fchem.2021.800133. eCollection 2021.

Navigating Transition-Metal Chemical Space: Artificial Intelligence for First-Principles Design.

Acc Chem Res. 2021 Feb 2;54(3):532-545. doi: 10.1021/acs.accounts.0c00686. Epub 2021 Jan 22.

Deep Learning and Computational Chemistry.

Methods Mol Biol. 2022;2390:125-151. doi: 10.1007/978-1-0716-1787-8_5.

The power of deep learning to ligand-based novel drug discovery.

Expert Opin Drug Discov. 2020 Jul;15(7):755-764. doi: 10.1080/17460441.2020.1745183. Epub 2020 Mar 31.

Generative machine learning for de novo drug discovery: A systematic review.

Comput Biol Med. 2022 Jun;145:105403. doi: 10.1016/j.compbiomed.2022.105403. Epub 2022 Mar 13.

Generative and reinforcement learning approaches for the automated de novo design of bioactive compounds.

Commun Chem. 2022 Oct 18;5(1):129. doi: 10.1038/s42004-022-00733-0.

Cited by

Molecular Machine Learning Approach to Enantioselective C-H Bond Activation Reactions: From Generative AI to Experimental Validation.

Chem Sci. 2025 Jun 10. doi: 10.1039/d5sc01098e.

Bayesian Meta-Learning for Few-Shot Reaction Outcome Prediction of Asymmetric Hydrogenation of Olefins.

Angew Chem Int Ed Engl. 2025 Jul;64(27):e202503821. doi: 10.1002/anie.202503821. Epub 2025 May 2.

A meta-learning approach for selectivity prediction in asymmetric catalysis.

Nat Commun. 2025 Apr 16;16(1):3599. doi: 10.1038/s41467-025-58854-8.

A systematic review of deep learning chemical language models in recent era.

J Cheminform. 2024 Nov 18;16(1):129. doi: 10.1186/s13321-024-00916-y.

Motif2Mol: Prediction of New Active Compounds Based on Sequence Motifs of Ligand Binding Sites in Proteins Using a Biochemical Language Model.

Biomolecules. 2023 May 13;13(5):833. doi: 10.3390/biom13050833.
