Simulated LC-MS Data Set for Assessing the Metabolomics Data Processing Pipeline Implemented into MVAPACK.

Affiliations

Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska 68588-0304, United States.

Nebraska Center for Integrated Biomolecular Communication, University of Nebraska-Lincoln, Lincoln, Nebraska 68588-0304, United States.

Publication information

Anal Chem. 2024 Aug 13;96(32):12943-12956. doi: 10.1021/acs.analchem.3c04979. Epub 2024 Jul 30.

DOI:10.1021/acs.analchem.3c04979

PMID:39078713

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11610799/

Abstract

Metabolomics commonly relies on using one-dimensional (1D) ¹H NMR spectroscopy or liquid chromatography-mass spectrometry (LC-MS) to derive scientific insights from large collections of biological samples. NMR and MS approaches to metabolomics require, among other things, a data processing pipeline. Quantitative assessment of the performance of these software platforms is challenged by a lack of standardized data sets with "known" outcomes. To resolve this issue, we created a novel simulated LC-MS data set with known peak locations and intensities, defined metabolite differences between groups (i.e., fold change > 2, coefficient of variation ≤ 25%), and different amounts of added Gaussian noise (0, 5, or 10%) and missing features (0, 10, or 20%). This data set was developed to improve benchmarking of existing LC-MS metabolomics software and to validate the updated version of our MVAPACK software, which added gas chromatography-MS and LC-MS functionality to its existing 1D and two-dimensional NMR data processing capabilities. We also included two experimental LC-MS data sets acquired from a standard mixture and cell lysates, since a simulated data set alone may not capture all the unique characteristics and variability of real spectra needed to assess software performance properly. Our simulated and experimental LC-MS data sets were processed with the MS-DIAL and XCMSOnline software packages and our MVAPACK toolkit to showcase the utility of our data sets to benchmark MVAPACK against community standards. Our results demonstrate the enhanced objectivity and clarity of software assessment that can be achieved when both simulated and experimental data are employed, since distinctly different software performances were observed with the simulated and experimental LC-MS data sets. We also demonstrate that the performance of MVAPACK is equivalent to or exceeds existing LC-MS software programs while providing a single platform for processing and analyzing both NMR and MS data sets.
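
To make the simulation design in the abstract concrete, the following is a minimal Python sketch of a two-group feature table with a known ground truth. It is not the authors' generator and does not reflect MVAPACK's code. The function names (`simulate_feature_table`, `observed_fold_change`), the intensity range, the fraction of changed features, and the choice to remove missing values at random from individual cells are all assumptions made for illustration. Only the fold change (> 2), the coefficient of variation (≤ 25%), and the noise and missing-value levels come from the abstract.

```python
import random
import statistics


def simulate_feature_table(n_features=200, n_per_group=10, frac_changed=0.2,
                           fold_change=2.5, cv=0.20, noise_pct=5.0,
                           missing_pct=10.0, seed=0):
    """Return (table, truth) for a hypothetical two-group experiment.

    table[i] holds 2 * n_per_group intensities for feature i
    (group A first, then group B); None marks a missing value.
    truth[i] is True when feature i was given a group difference.
    """
    rng = random.Random(seed)
    n_changed = int(n_features * frac_changed)
    table, truth = [], []
    for i in range(n_features):
        base = rng.uniform(1e4, 1e6)  # assumed intensity range
        changed = i < n_changed
        row = []
        for group in (0, 1):
            mean = base * fold_change if (changed and group == 1) else base
            for _ in range(n_per_group):
                # within-group spread drawn with the requested CV
                value = rng.gauss(mean, cv * mean)
                # added Gaussian noise, as a percentage of the mean intensity
                value += rng.gauss(0.0, noise_pct / 100.0 * mean)
                row.append(max(value, 0.0))
        table.append(row)
        truth.append(changed)
    # assumption: missing values are dropped at random from single cells
    cells = [(i, j) for i in range(n_features) for j in range(2 * n_per_group)]
    for i, j in rng.sample(cells, int(len(cells) * missing_pct / 100.0)):
        table[i][j] = None
    return table, truth


def observed_fold_change(row, n_per_group):
    """Mean(group B) / mean(group A), ignoring missing values."""
    a = [v for v in row[:n_per_group] if v is not None]
    b = [v for v in row[n_per_group:] if v is not None]
    if not a or not b or statistics.mean(a) == 0:
        return None
    return statistics.mean(b) / statistics.mean(a)


if __name__ == "__main__":
    # sweep the noise and missing-value levels listed in the abstract
    for noise in (0.0, 5.0, 10.0):
        for missing in (0.0, 10.0, 20.0):
            table, truth = simulate_feature_table(noise_pct=noise,
                                                  missing_pct=missing)
            fcs = [observed_fold_change(r, 10)
                   for r, t in zip(table, truth) if t]
            fcs = [fc for fc in fcs if fc is not None]
            recovered = sum(fc > 2 for fc in fcs)
            print(f"noise={noise:>4}% missing={missing:>4}% "
                  f"changed features with observed FC > 2: "
                  f"{recovered}/{len(fcs)}")
```

Because the changed features are known in advance, any processing pipeline's output can be scored against `truth`. That is the kind of quantitative comparison the abstract says is hard to make with experimental data alone.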

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f58/11610799/bd014b33bb5a/nihms-2036298-f0001.jpg
