Automated reaction database and reaction network analysis: extraction of reaction templates using cheminformatics.

Author information

Plehiers Pieter P, Marin Guy B, Stevens Christian V, Van Geem Kevin M

Affiliations

Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 914, 9052, Ghent, Belgium.

SynBioC Research Group, Department of Sustainable Organic Chemistry and Technology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium.

Publication information

J Cheminform. 2018 Mar 9;10(1):11. doi: 10.1186/s13321-018-0269-8.

Abstract

Both the automated generation of reaction networks and the automated prediction of synthetic trees require, in one way or another, a definition of the possible transformations a molecule can undergo. One way of providing this is through reaction templates. Given the growing number of known reactions, it has become increasingly difficult to envision all possible transformations that could occur in a studied system. Nonetheless, most reaction network generation tools rely on user-defined reaction templates. This not only limits the amount of chemistry that can be accounted for in the reaction networks, it also hinders widespread use of the tools by a broad audience. In retrosynthetic analysis, the quality of the analysis depends on what percentage of the known chemistry is accounted for, so using databases to identify templates is crucial. For this purpose, an algorithm has been developed to extract reaction templates from various types of chemical databases. Some databases, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and RMG, do not report an atom-atom mapping (AAM) for their reactions, which makes extracting a template non-trivial. If no mapping is available, it is calculated by the Reaction Decoder Tool (RDT). With a correct AAM, either calculated by RDT or specified, the algorithm consistently extracts a correct template for a wide variety of reactions, both elementary and non-elementary. The developed algorithm is a first step towards data-driven generation of synthetic trees or reaction networks, and towards greater accessibility for non-expert users.
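To make the idea of template extraction from an atom-atom mapping concrete, the following is a minimal sketch in plain Python. It is not the authors' algorithm. It assumes a correct AAM is already available, and it represents each side of a reaction as a table of bonds keyed by atom-map numbers. The reaction center is taken to be every mapped atom that takes part in a bond that is formed, broken, or changes order, and the "template" is simply the set of bonds among those atoms on each side. All names (`make_bonds`, `reaction_center`, `extract_template`) are hypothetical, and the example reaction (hydrogen abstraction from methane by a hydroxyl radical) is chosen only for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Bond table: unordered pair of atom-map numbers -> bond order.
Bonds = Dict[FrozenSet[int], float]


def make_bonds(*triples: Tuple[int, int, float]) -> Bonds:
    """Build a bond table from (map_a, map_b, order) triples."""
    return {frozenset((a, b)): order for a, b, order in triples}


def reaction_center(reactants: Bonds, products: Bonds) -> Set[int]:
    """Return the mapped atoms whose bonding differs between the two sides."""
    center: Set[int] = set()
    for pair in reactants.keys() | products.keys():
        # Order 0 stands for "no bond", so formed and broken bonds count too.
        if reactants.get(pair, 0) != products.get(pair, 0):
            center |= pair
    return center


def extract_template(reactants: Bonds, products: Bonds,
                     elements: Dict[int, str]) -> dict:
    """Reduce a mapped reaction to its reaction center (simplified template)."""
    center = reaction_center(reactants, products)

    def restrict(table: Bonds) -> list:
        # Keep only bonds whose two atoms both belong to the reaction center.
        return sorted((tuple(sorted(pair)), order)
                      for pair, order in table.items() if pair <= center)

    return {
        "atoms": {i: elements[i] for i in sorted(center)},
        "reactant_bonds": restrict(reactants),
        "product_bonds": restrict(products),
    }


# Illustrative mapped reaction: CH4 + OH radical -> CH3 radical + H2O.
# Map numbers (assumed): C=1, abstracted H=2, O=3, other H atoms=4..7.
elements = {1: "C", 2: "H", 3: "O", 4: "H", 5: "H", 6: "H", 7: "H"}
reactant_bonds = make_bonds((1, 2, 1), (1, 4, 1), (1, 5, 1), (1, 6, 1), (3, 7, 1))
product_bonds = make_bonds((1, 4, 1), (1, 5, 1), (1, 6, 1), (2, 3, 1), (3, 7, 1))

template = extract_template(reactant_bonds, product_bonds, elements)
print(template)
```

In this example only the C-H bond that breaks and the O-H bond that forms survive, so the spectator hydrogens drop out of the template. A practical implementation would likely work on molecular graphs from a cheminformatics toolkit and keep some neighbouring atoms as context. That is outside the scope of this sketch.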

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9378/5845084/cff4873510f6/13321_2018_269_Fig1_HTML.jpg
