Substructure-based neural machine translation for retrosynthetic prediction.

Authors

Ucak Umit V, Kang Taek, Ko Junsu, Lee Juyong

Affiliations

Division of Chemistry and Biochemistry, Department of Chemistry, Kangwon National University, Chuncheon, South Korea.

Center for Neuro-Medicine, Brain Science Institute, Korea Institute of Science and Technology, Seoul, South Korea.

Publication information

J Cheminform. 2021 Jan 11;13(1):4. doi: 10.1186/s13321-020-00482-z.

DOI:10.1186/s13321-020-00482-z

PMID:33431017

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7802345/

Abstract

With the rapid improvement of machine translation approaches, neural machine translation has started to play an important role in retrosynthesis planning, which finds reasonable synthetic pathways for a target molecule. Previous studies showed that utilizing the sequence-to-sequence frameworks of neural machine translation is a promising approach to tackle the retrosynthetic planning problem. In this work, we recast the retrosynthetic planning problem as a language translation problem using a template-free sequence-to-sequence model. The model is trained in an end-to-end and a fully data-driven fashion. Unlike previous models translating the SMILES strings of reactants and products, we introduced a new way of representing a chemical reaction based on molecular fragments. It is demonstrated that the new approach yields better prediction results than current state-of-the-art computational methods. The new approach resolves the major drawbacks of existing retrosynthetic methods such as generating invalid SMILES strings. Specifically, our approach predicts highly similar reactant molecules with an accuracy of 57.7%. In addition, our method yields more robust predictions than existing methods.
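
The fragment-based representation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fragment scheme, the token format, or the similarity measure, so everything here is an assumption made for illustration: fragments are hypothetical integer identifiers, the `F<id>` token format is invented, and Tanimoto similarity over fragment sets stands in for whatever measure the authors use to judge "highly similar" reactants.

```python
from typing import FrozenSet, Iterable


def to_sentence(fragment_ids: Iterable[int]) -> str:
    """Encode a molecule as a 'sentence' of fragment tokens.

    Assumption: each molecule is already described by a set of integer
    fragment identifiers; the 'F<id>' token format is hypothetical.
    """
    return " ".join(f"F{i}" for i in sorted(set(fragment_ids)))


def from_sentence(sentence: str) -> FrozenSet[int]:
    """Decode a fragment sentence back into a set of fragment identifiers."""
    return frozenset(int(token[1:]) for token in sentence.split())


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fragment sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# A translation model would map a product sentence to a reactant sentence.
# Here the "prediction" is hard-coded to show how it could be scored.
reference = from_sentence("F3 F17 F42 F58")   # hypothetical true reactants
predicted = from_sentence("F3 F17 F58")       # hypothetical model output
score = tanimoto(predicted, reference)        # 3 shared / 4 total = 0.75
```

Because the output is a set of fragment tokens rather than a character string, a prediction of this kind cannot be a syntactically invalid SMILES string, which is consistent with the advantage the abstract claims.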

Fig. 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91fb/7802345/cc68d3789b88/13321_2020_482_Fig1_HTML.jpg

Similar articles

Substructure-based neural machine translation for retrosynthetic prediction.

J Cheminform. 2021 Jan 11;13(1):4. doi: 10.1186/s13321-020-00482-z.

Transfer Learning: Making Retrosynthetic Predictions Based on a Small Chemical Reaction Dataset Scale to a New Level.

Molecules. 2020 May 19;25(10):2357. doi: 10.3390/molecules25102357.

Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments.

Nat Commun. 2022 Mar 4;13(1):1186. doi: 10.1038/s41467-022-28857-w.

Automatic retrosynthetic route planning using template-free models.

Chem Sci. 2020 Mar 3;11(12):3355-3364. doi: 10.1039/c9sc03666k.

Predicting Retrosynthetic Reactions Using Self-Corrected Transformer Neural Networks.

J Chem Inf Model. 2020 Jan 27;60(1):47-55. doi: 10.1021/acs.jcim.9b00949. Epub 2019 Dec 24.

Ualign: pushing the limit of template-free retrosynthesis prediction with unsupervised SMILES alignment.

J Cheminform. 2024 Jul 15;16(1):80. doi: 10.1186/s13321-024-00877-2.

Molecular Transformer unifies reaction prediction and retrosynthesis across pharma chemical space.

Chem Commun (Camb). 2019 Oct 8;55(81):12152-12155. doi: 10.1039/c9cc05122h.

Enhancing Retrosynthetic Reaction Prediction with Deep Learning Using Multiscale Reaction Classification.

J Chem Inf Model. 2019 Feb 25;59(2):673-688. doi: 10.1021/acs.jcim.8b00801. Epub 2019 Feb 1.

Permutation Invariant Graph-to-Sequence Model for Template-Free Retrosynthesis and Reaction Prediction.

J Chem Inf Model. 2022 Aug 8;62(15):3503-3513. doi: 10.1021/acs.jcim.2c00321. Epub 2022 Jul 26.

Site-specific template generative approach for retrosynthetic planning.

Nat Commun. 2024 Sep 6;15(1):7818. doi: 10.1038/s41467-024-52048-4.

Cited by

TransMA: an explainable multi-modal deep learning model for predicting properties of ionizable lipid nanoparticles in mRNA delivery.

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf307.

Site-specific template generative approach for retrosynthetic planning.

Nat Commun. 2024 Sep 6;15(1):7818. doi: 10.1038/s41467-024-52048-4.

Artificial Intelligence Methods and Models for Retro-Biosynthesis: A Scoping Review.

ACS Synth Biol. 2024 Aug 16;13(8):2276-2294. doi: 10.1021/acssynbio.4c00091. Epub 2024 Jul 24.

CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning.

Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad503.

Predictive chemistry: machine learning for reaction deployment, reaction development, and reaction discovery.

Chem Sci. 2022 Nov 28;14(2):226-244. doi: 10.1039/d2sc05089g. eCollection 2023 Jan 4.

Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments.

Nat Commun. 2022 Mar 4;13(1):1186. doi: 10.1038/s41467-022-28857-w.

Improving Few- and Zero-Shot Reaction Template Prediction Using Modern Hopfield Networks.

J Chem Inf Model. 2022 May 9;62(9):2111-2120. doi: 10.1021/acs.jcim.1c01065. Epub 2022 Jan 15.
