CLCA: maximum common molecular substructure queries within the MetRxn database.

Author information

The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802, United States.

Publication information

J Chem Inf Model. 2014 Dec 22;54(12):3417-38. doi: 10.1021/ci5003922. Epub 2014 Dec 1.

DOI: 10.1021/ci5003922

PMID: 25412255

Abstract

The challenge of automatically identifying the preserved molecular moieties in a chemical reaction is referred to as the atom mapping problem. Reaction atom maps provide the ability to locate the fate of individual atoms across an entire metabolic network. Atom maps are used to track atoms in isotope labeling experiments for metabolic flux elucidation, trace novel biosynthetic routes to a target compound, and contrast entire pathways for structural homology. However, rapid computation of the reaction atom mappings remains elusive despite significant research. We present a novel substructure search algorithm, canonical labeling for clique approximation (CLCA), with polynomial run-time complexity to quickly generate atom maps for all the reactions present in MetRxn. CLCA uses number theory (i.e., prime factorization) to generate canonical labels or unique IDs and identify a bijection between the vertices (atoms) of two distinct molecular graphs. CLCA utilizes molecular graphs generated by combining atomistic information on reactions and metabolites from 112 metabolic models and 8 metabolic databases. CLCA offers improvements in run time, accuracy, and memory utilization over existing heuristic and combinatorial maximum common substructure (MCS) search algorithms. We provide detailed examples on the various advantages as well as failure modes of CLCA over existing algorithms.
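The abstract's idea of using prime factorization to derive canonical atom labels can be illustrated with a small sketch. The code below is not the authors' CLCA implementation; it is a simplified label-refinement scheme written under stated assumptions: molecules are given as a list of element symbols plus a list of bonded index pairs, bond orders and hydrogens are ignored, and the names `first_primes`, `refine_labels`, and `map_atoms` are hypothetical. Each round assigns a prime to every current atom class and multiplies the primes of an atom's neighbours; by unique factorization, that product encodes the multiset of neighbouring classes, so atoms with different environments receive different labels.

```python
from math import prod  # available in Python 3.8+


def first_primes(n):
    """Return the first n primes by trial division (fine for small molecules)."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def refine_labels(elements, bonds):
    """Return one integer label per atom, independent of the input atom order.

    Illustrative assumption: atoms are distinguished only by element and
    connectivity. Symmetric atoms keep the same label.
    """
    n = len(elements)
    neighbours = [[] for _ in range(n)]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Initial classes: rank of each element symbol in sorted order.
    ranks = {e: i for i, e in enumerate(sorted(set(elements)))}
    labels = [ranks[e] for e in elements]
    primes = first_primes(n)

    while True:
        # Key = (own class, product of the primes of the neighbours' classes).
        keys = [
            (labels[i], prod(primes[labels[j]] for j in neighbours[i]))
            for i in range(n)
        ]
        ranks = {k: r for r, k in enumerate(sorted(set(keys)))}
        new_labels = [ranks[k] for k in keys]
        # Stop once a round no longer splits any class.
        if len(set(new_labels)) == len(set(labels)):
            return new_labels
        labels = new_labels


def map_atoms(mol_a, mol_b):
    """Pair atoms whose label is unique in both molecules.

    Assumes the two inputs describe the same structure with different atom
    numbering; labels are ranks within one molecule, so they are not
    comparable across different structures.
    """
    labels_a, labels_b = refine_labels(*mol_a), refine_labels(*mol_b)
    index_b = {}
    for j, lab in enumerate(labels_b):
        index_b.setdefault(lab, []).append(j)
    return {
        i: index_b[lab][0]
        for i, lab in enumerate(labels_a)
        if labels_a.count(lab) == 1 and len(index_b.get(lab, [])) == 1
    }


# Ethanol heavy atoms, numbered C-C-O and then O-C-C.
ethanol_a = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_b = (["O", "C", "C"], [(0, 1), (1, 2)])
mapping = map_atoms(ethanol_a, ethanol_b)
```

For the two ethanol numberings, `mapping` pairs the methyl carbon, the hydroxyl-bearing carbon, and the oxygen with their counterparts. Atoms that are equivalent by symmetry (for example the two end carbons of propane) share a label and are left unmapped by this sketch; a full atom mapper needs an additional tie-breaking step, and matching substructures between non-identical molecules requires more than what is shown here.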

Similar articles

Construction of an E. Coli genome-scale atom mapping model for MFA calculations.

Biotechnol Bioeng. 2011 Jun;108(6):1372-82. doi: 10.1002/bit.23070. Epub 2011 Feb 19.

Reconstruction of biological pathways and metabolic networks from in silico labeled metabolites.

Biotechnol J. 2017 Jan;12(1). doi: 10.1002/biot.201600464.

Accurate atom-mapping computation for biochemical reactions.

J Chem Inf Model. 2012 Nov 26;52(11):2970-82. doi: 10.1021/ci3002217. Epub 2012 Oct 15.

Computing atom mappings for biochemical reactions without subgraph isomorphism.

J Comput Biol. 2011 Jan;18(1):43-58. doi: 10.1089/cmb.2009.0216.

Quantifying and assessing the effect of chemical symmetry in metabolic pathways.

J Chem Inf Model. 2012 Oct 22;52(10):2684-96. doi: 10.1021/ci300259u. Epub 2012 Sep 28.

MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases.

BMC Bioinformatics. 2012 Jan 10;13:6. doi: 10.1186/1471-2105-13-6.

Identification of Conserved Moieties in Metabolic Networks by Graph Theoretical Analysis of Atom Transition Networks.

PLoS Comput Biol. 2016 Nov 21;12(11):e1004999. doi: 10.1371/journal.pcbi.1004999. eCollection 2016 Nov.

ReMatch: a web-based tool to construct, store and share stoichiometric metabolic models with carbon maps for metabolic flux analysis.

J Integr Bioinform. 2008 Aug 25;5(2):102. doi: 10.2390/biecoll-jib-2008-102.

An automated workflow that generates atom mappings for large-scale metabolic models and its application to Arabidopsis thaliana.

Plant J. 2022 Sep;111(5):1486-1500. doi: 10.1111/tpj.15903. Epub 2022 Jul 22.

Cited by

Computation of Protein-Ligand Binding Free Energies with a Quantum Mechanics-Based Mining Minima Algorithm.

J Chem Theory Comput. 2025 Apr 22;21(8):4236-4265. doi: 10.1021/acs.jctc.4c01707. Epub 2025 Apr 9.

Rapid, Accurate, Ranking of Protein-Ligand Binding Affinities with VM2, the Second-Generation Mining Minima Method.

J Chem Theory Comput. 2024 Jul 23;20(14):6328-6340. doi: 10.1021/acs.jctc.4c00407. Epub 2024 Jul 11.

MetAMDB: Metabolic Atom Mapping Database.

Metabolites. 2022 Jan 27;12(2):122. doi: 10.3390/metabo12020122.

dGPredictor: Automated fragmentation method for metabolic reaction free energy prediction and de novo pathway design.

PLoS Comput Biol. 2021 Sep 27;17(9):e1009448. doi: 10.1371/journal.pcbi.1009448. eCollection 2021 Sep.

NetFlow: A tool for isolating carbon flows in genome-scale metabolic networks.

Metab Eng Commun. 2020 Dec 2;12:e00154. doi: 10.1016/j.mec.2020.e00154. eCollection 2021 Jun.

Inferring Biochemical Reactions and Metabolite Structures to Understand Metabolic Pathway Drift.

iScience. 2020 Feb 21;23(2):100849. doi: 10.1016/j.isci.2020.100849. Epub 2020 Jan 17.

Harnessing biocompatible chemistry for developing improved and novel microbial cell factories.

Microb Biotechnol. 2020 Jan;13(1):54-66. doi: 10.1111/1751-7915.13472. Epub 2019 Aug 6.

The Design of FluxML: A Universal Modeling Language for 13C Metabolic Flux Analysis.

Front Microbiol. 2019 May 24;10:1022. doi: 10.3389/fmicb.2019.01022. eCollection 2019.

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0.

Nat Protoc. 2019 Mar;14(3):639-702. doi: 10.1038/s41596-018-0098-2.

Genome-Scale Fluxome of Synechococcus elongatus UTEX 2973 Using Transient 13C-Labeling Data.

Plant Physiol. 2019 Feb;179(2):761-769. doi: 10.1104/pp.18.01357. Epub 2018 Dec 14.
