Yan Yan, Jiménez Beatriz, Judge Michael T, Athersuch Toby, De Iorio Maria, Ebbels Timothy M D
Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, London W12 0NN, United Kingdom.
National Phenome Centre & Section of Bioanalytical Chemistry, Department of Metabolism, Digestion and Reproduction, Imperial College London, London W12 0NN, United Kingdom.
Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf045.
Metabolomics extensively utilizes nuclear magnetic resonance (NMR) spectroscopy due to its excellent reproducibility and high throughput. Both 1D and 2D NMR spectra provide crucial information for metabolite annotation and quantification, yet present complex overlapping patterns which may require sophisticated machine learning algorithms to decipher. Unfortunately, the limited availability of labeled spectra can hamper application of machine learning, especially deep learning algorithms which require large amounts of labeled data. In this context, simulation of spectral data becomes a tractable solution for algorithm development.
Here, we introduce MetAssimulo 2.0, a comprehensive upgrade of the MetAssimulo 1.0 metabolomic 1H NMR simulation tool, reimplemented as a Python-based web application. Where MetAssimulo 1.0 only simulated 1D 1H spectra of human urine, MetAssimulo 2.0 expands functionality to urine, blood, and cerebrospinal fluid, and enhances the realism of blood spectra by incorporating a broad protein background. This enhancement enables a closer approximation to real blood spectra, achieving a Pearson correlation of approximately 0.82. Moreover, the tool now includes simulation capabilities for 2D J-resolved (J-Res) and Correlation Spectroscopy spectra, significantly broadening its utility in complex mixture analysis. MetAssimulo 2.0 simulates both single spectra and groups of spectra, with both discrete (case-control, e.g. heart transplant versus healthy) and continuous (e.g. body mass index) outcomes, and includes inter-metabolite correlations. It thus supports a range of experimental designs and can demonstrate associations between metabolite profiles and biomedical responses. By enhancing NMR spectral simulations, MetAssimulo 2.0 is well positioned to support and enhance research at the intersection of deep learning and metabolomics.
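The ideas described above (peaks summed into a mixture spectrum, a broad background, correlated metabolite levels, and a case-control design) can be illustrated with a minimal NumPy sketch. This is not the MetAssimulo 2.0 implementation or its API: the function names, the Lorentzian peak templates, the peak positions, the Gaussian background, and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
ppm = np.linspace(0.0, 10.0, 4000)  # chemical-shift axis (illustrative grid)


def lorentzian(x, center, width):
    """Unit-height Lorentzian line; `width` is the half-width at half-maximum."""
    return width**2 / ((x - center) ** 2 + width**2)


# Hypothetical one-peak templates for three unnamed metabolites.
# A real tool would use measured reference spectra instead.
templates = np.stack([
    lorentzian(ppm, 1.33, 0.005),
    lorentzian(ppm, 3.03, 0.005),
    lorentzian(ppm, 5.23, 0.005),
])

# Assumed inter-metabolite correlation: metabolites 0 and 1 co-vary.
corr = np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])


def simulate_group(n, mean_conc, corr, sd=0.2):
    """Draw correlated log-normal concentrations and build n mixture spectra."""
    cov = sd**2 * corr  # covariance of the log-concentrations
    log_conc = rng.multivariate_normal(np.log(mean_conc), cov, size=n)
    conc = np.exp(log_conc)  # shape (n, 3)
    # Broad Gaussian hump standing in for a protein background (assumption).
    background = 0.3 * np.exp(-0.5 * ((ppm - 1.5) / 1.0) ** 2)
    noise = rng.normal(0.0, 0.01, size=(n, ppm.size))
    spectra = conc @ templates + background + noise  # shape (n, len(ppm))
    return conc, spectra


# Case-control design: metabolite 0 is doubled in cases (assumed effect size).
conc_ctrl, spec_ctrl = simulate_group(50, [1.0, 1.0, 1.0], corr)
conc_case, spec_case = simulate_group(50, [2.0, 1.0, 1.0], corr)

# Pearson correlation between the two group-mean spectra, the same kind of
# similarity measure one could use to compare a simulated and a real spectrum.
r = np.corrcoef(spec_ctrl.mean(axis=0), spec_case.mean(axis=0))[0, 1]
print(f"Pearson r between group mean spectra: {r:.3f}")
```

Drawing concentrations on the log scale keeps them positive while still allowing a simple covariance matrix to encode inter-metabolite correlations; a continuous outcome could be handled in the same sketch by making the mean concentration a function of the outcome variable.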
The code and detailed instructions/tutorial for MetAssimulo 2.0 are available at https://github.com/yanyan5420/MetAssimulo_2.git. The relevant NMR spectra for metabolites are deposited in MetaboLights with accession number MTBLS12081.