MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics.

Authors

Jeffryes James G, Colastani Ricardo L, Elbadawi-Sidhu Mona, Kind Tobias, Niehaus Thomas D, Broadbelt Linda J, Hanson Andrew D, Fiehn Oliver, Tyo Keith E J, Henry Christopher S

Affiliations

Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA.

Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA.

Publication information

J Cheminform. 2015 Aug 28;7:44. doi: 10.1186/s13321-015-0087-1. eCollection 2015.

DOI:10.1186/s13321-015-0087-1

PMID:26322134

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4550642/

Abstract

BACKGROUND

In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases.

DESCRIPTION

Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectrum than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted.
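The accurate-mass annotation step mentioned above can be illustrated with a minimal sketch. This is not the MINE implementation or its API: the candidate list, the function names (`ppm_error`, `search_by_mz`), the adduct table, and the 5 ppm tolerance are all assumptions chosen for illustration. The idea is simply that an observed m/z is compared, for each assumed adduct, against the theoretical m/z of every candidate in a compound database, and candidates within a ppm tolerance are returned.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da; mass of a proton (hydrogen minus one electron)

# Assumed adduct table: adduct name -> shift added to the neutral monoisotopic mass.
ADDUCT_SHIFTS = {
    "[M+H]+": PROTON_MASS,
    "[M+Na]+": 22.989218,
    "[M-H]-": -PROTON_MASS,
}


@dataclass(frozen=True)
class Candidate:
    name: str
    mono_mass: float  # neutral monoisotopic mass in Da


# Hypothetical stand-in for a compound database (a real MINE holds >571,000 entries).
CANDIDATES = [
    Candidate("D-glucose", 180.06339),
    Candidate("D-fructose", 180.06339),
    Candidate("L-tyrosine", 181.07389),
    Candidate("citric acid", 192.02700),
]


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_by_mz(observed_mz, candidates, adducts=ADDUCT_SHIFTS, tol_ppm=5.0):
    """Return (name, adduct, ppm error) for every candidate/adduct pair
    whose theoretical m/z lies within tol_ppm of the observed m/z."""
    hits = []
    for cand in candidates:
        for adduct, shift in adducts.items():
            err = ppm_error(observed_mz, cand.mono_mass + shift)
            if abs(err) <= tol_ppm:
                hits.append((cand.name, adduct, round(err, 2)))
    # Smallest absolute error first.
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    for hit in search_by_mz(181.0707, CANDIDATES):
        print(hit)
```

Note that the two hexose isomers are indistinguishable by accurate mass alone, which is why the number of candidates returned per spectrum matters: a smaller, biochemically plausible candidate set leaves less work for downstream ranking.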

CONCLUSIONS

MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures.

Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6cdb/4551763/a0afe013345f/13321_2015_87_Figa_HTML.jpg
