Knowledge and data-driven two-layer networking for accurate metabolite annotation in untargeted metabolomics.

Authors

Zhang Haosong, Zeng Xinhao, Yin Yandong, Zhu Zheng-Jiang

Affiliations

Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, China.

University of Chinese Academy of Sciences, Beijing, China.

Publication information

Nat Commun. 2025 Aug 30;16(1):8118. doi: 10.1038/s41467-025-63536-6.

Abstract

Metabolite annotation in untargeted metabolomics remains challenging due to the vast structural diversity of metabolites. Network-based approaches have emerged as powerful strategies, particularly for annotating metabolites lacking chemical standards. Here, we develop a two-layer interactive networking topology that integrates data-driven and knowledge-driven networks to enhance metabolite annotation. A comprehensive metabolic reaction network is curated using graph neural network-based prediction of reaction relationships, enhancing both coverage and network connectivity. Experimental data are pre-mapped onto this network via sequential MS1 matching, reaction relationship mapping, and MS2 similarity constraints. The generated networking topology enables interactive annotation propagation with over 10-fold improved computational efficiency. In common biological samples, it annotates over 1,600 seed metabolites with chemical standards and more than 12,000 putatively annotated metabolites through network-based propagation. Notably, two previously uncharacterized endogenous metabolites absent from human metabolome databases have been discovered. Overall, this strategy significantly improves the coverage, accuracy, and efficiency of metabolite annotation and is freely available as MetDNA3.
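
The three pre-mapping steps named in the abstract (MS1 matching, reaction relationship mapping, MS2 similarity constraints) can be illustrated with a small sketch. This is not the MetDNA3 implementation: the function names, the 10 ppm tolerance, the 0.5 similarity cutoff, the binned cosine score, and the toy data are all assumptions chosen for illustration, and the expected m/z values are assumed to be precomputed for a single ion form.

```python
from itertools import combinations
from math import sqrt

# Hypothetical parameters; the abstract does not state the actual values.
PPM_TOL = 10.0      # assumed MS1 mass tolerance in ppm
MS2_MIN_SIM = 0.5   # assumed minimum MS2 similarity


def ms1_candidates(feature_mz, metabolite_mz, ppm=PPM_TOL):
    """Return metabolite IDs whose expected m/z lies within `ppm` of the feature."""
    return {
        met for met, mz in metabolite_mz.items()
        if abs(feature_mz - mz) / mz * 1e6 <= ppm
    }


def cosine_similarity(spec_a, spec_b, bin_width=0.01):
    """Binned cosine similarity of two spectra given as [(mz, intensity), ...]."""
    def binned(spec):
        out = {}
        for mz, inten in spec:
            key = round(mz / bin_width)
            out[key] = out.get(key, 0.0) + inten
        return out

    a, b = binned(spec_a), binned(spec_b)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def premap(features, metabolite_mz, reaction_pairs, min_sim=MS2_MIN_SIM):
    """Keep feature pairs that pass the three filters in sequence."""
    # Step 1: MS1 matching assigns candidate metabolites to each feature.
    cands = {fid: ms1_candidates(f["mz"], metabolite_mz) for fid, f in features.items()}
    edges = []
    for fa, fb in combinations(sorted(features), 2):
        # Step 2: keep the pair only if some candidates form a known reaction pair.
        linked = any(
            frozenset((ma, mb)) in reaction_pairs
            for ma in cands[fa] for mb in cands[fb]
        )
        if not linked:
            continue
        # Step 3: require the two MS2 spectra to be sufficiently similar.
        sim = cosine_similarity(features[fa]["ms2"], features[fb]["ms2"])
        if sim >= min_sim:
            edges.append((fa, fb, sim))
    return edges


# Toy knowledge layer: three metabolites, one reaction pair (M1 <-> M2).
metabolite_mz = {"M1": 100.0, "M2": 116.0, "M3": 200.0}
reaction_pairs = {frozenset(("M1", "M2"))}

# Toy data layer: three measured features with MS1 m/z and MS2 peaks.
features = {
    "F1": {"mz": 100.0005, "ms2": [(50.0, 100.0), (75.0, 50.0)]},
    "F2": {"mz": 116.0003, "ms2": [(50.0, 100.0), (75.0, 40.0)]},
    "F3": {"mz": 200.0010, "ms2": [(50.0, 100.0), (75.0, 50.0)]},
}

edges = premap(features, metabolite_mz, reaction_pairs)
```

In this toy case only the F1 and F2 pair survives: F3 has an identical spectrum to F1, but its only MS1 candidate (M3) has no reaction relationship to M1 or M2, so the knowledge layer rules the edge out before any spectra are compared.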
