Phan Tieu-Long, Weinbauer Klaus, Laffitte Marcos E González, Pan Yingjie, Merkle Daniel, Andersen Jakob L, Fagerberg Rolf, Flamm Christoph, Stadler Peter F
Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics & School for Embedded and Composite Artificial Intelligence (SECAI), Leipzig University, Härtelstraße 16-18, D-04107 Leipzig, Germany.
Department of Mathematics and Computer Science, University of Southern Denmark, DK-5230 Odense M, Denmark.
J Chem Inf Model. 2025 Mar 24;65(6):2882-2896. doi: 10.1021/acs.jcim.4c01795. Epub 2025 Feb 28.
Reaction templates are graphs that represent the reaction center as well as the surrounding context in order to specify salient features of chemical reactions. They are subgraphs of imaginary transition state (ITS) graphs, which are equivalent to double pushout graph rewriting rules and thus can be applied directly to predict reaction outcomes at the structural formula level. We introduce here SynTemp, a framework designed to extract and hierarchically cluster reaction templates from large-scale reaction data repositories. Rule inference is implemented as a robust graph-theoretic approach, which first computes an atom-atom mapping (AAM) as a consensus over partial predictions from multiple state-of-the-art tools, then augments the raw AAM with mechanistically relevant hydrogen atoms, and finally extracts the reaction center extended by relevant context. SynTemp achieves an exceptional accuracy of 99.5% and a success rate of 71.23% in obtaining AAMs on the chemical reaction dataset. Hierarchical clustering of the extended reaction centers based on topological features results in a library of 311 transformation rules explaining 86% of the reaction dataset.
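The template idea described in the abstract can be illustrated with a minimal Python sketch. It is not SynTemp's API: the function names (`its_graph`, `reaction_center`, `extend_by_context`), the bond-table representation, and the toy substitution reaction are all assumptions made for illustration, and hydrogen atoms are omitted. The sketch overlays the reactant and product bond tables of an atom-mapped reaction, takes the bonds whose order changes as the reaction center, and then grows that center by a fixed number of bonds to capture context.

```python
from collections import deque


def bond(a, b):
    """Undirected bond between two atom-map numbers (hypothetical helper)."""
    return frozenset((a, b))


def its_graph(reactant_bonds, product_bonds):
    """Overlay both bond tables; each edge carries (order before, order after)."""
    its = {}
    for edge in set(reactant_bonds) | set(product_bonds):
        its[edge] = (reactant_bonds.get(edge, 0), product_bonds.get(edge, 0))
    return its


def reaction_center(its):
    """Return the atoms and bonds whose bond order changes in the reaction."""
    changed = {edge for edge, (before, after) in its.items() if before != after}
    atoms = set().union(*changed)
    return atoms, changed


def extend_by_context(its, center_atoms, radius):
    """Add every atom within `radius` bonds of the reaction center."""
    adjacency = {}
    for edge in its:
        a, b = tuple(edge)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set(center_atoms)
    frontier = deque((atom, 0) for atom in center_atoms)
    while frontier:
        atom, dist = frontier.popleft()
        if dist == radius:
            continue
        for neighbor in adjacency.get(atom, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


# Toy example (assumed): C4-C1-Br2 + O3 -> C4-C1-O3 + Br2, heavy atoms only.
reactants = {bond(4, 1): 1, bond(1, 2): 1}
products = {bond(4, 1): 1, bond(1, 3): 1}

its = its_graph(reactants, products)
center_atoms, center_bonds = reaction_center(its)
template_atoms = extend_by_context(its, center_atoms, radius=1)
```

In this toy case the C-Br bond is broken and the C-O bond is formed, so atoms 1, 2 and 3 form the reaction center, and a context radius of 1 additionally pulls in the neighboring carbon 4. How SynTemp itself chooses the context and clusters the resulting templates is not specified in the abstract and is not modeled here.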