Root-aligned SMILES: a tight representation for chemical reaction prediction.

Authors

Zhong Zipeng, Song Jie, Feng Zunlei, Liu Tiantao, Jia Lingxiang, Yao Shaolun, Wu Min, Hou Tingjun, Song Mingli

Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, P. R. China.

School of Software Technology, Zhejiang University, Ningbo 315048, P. R. China.

Publication information

Chem Sci. 2022 Jul 12;13(31):9023-9034. doi: 10.1039/d2sc02763a. eCollection 2022 Aug 10.

Abstract

Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the complex syntax and dedicated to learning the chemical knowledge for reactions. We compare the proposed R-SMILES with various state-of-the-art baselines and show that it significantly outperforms them all, demonstrating the superiority of the proposed method.
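The reduced edit distance mentioned in the abstract can be illustrated with a small sketch. The reaction below (acetyl chloride plus ethanol giving ethyl acetate) and its hand-written SMILES strings are illustrative assumptions, not data from the paper, and the distance is measured per character rather than per SMILES token. The root alignment is done by hand here; the paper's own procedure for generating R-SMILES is not reproduced.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical example: ethyl acetate as the product.
product = "CC(=O)OCC"

# Reactants written from the same root atom and in the same atom order
# as the product (aligned by hand for this illustration).
reactants_aligned = "CC(=O)Cl.OCC"

# The same two reactant molecules written from different root atoms.
reactants_unaligned = "ClC(C)=O.CCO"

d_aligned = edit_distance(product, reactants_aligned)
d_unaligned = edit_distance(product, reactants_unaligned)
print(d_aligned, d_unaligned)
```

In this toy case the aligned reactant string differs from the product only by the inserted `Cl.` fragment, whereas the unaligned string for the same molecules requires more edits. A sequence-to-sequence model trained on aligned pairs therefore mostly copies its input and only has to learn the local change.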

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3cd2/9365080/209c1f75595d/d2sc02763a-f1.jpg