metabCombiner: Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets

Authors

Habra Hani, Kachman Maureen, Bullock Kevin, Clish Clary, Evans Charles R, Karnovsky Alla

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109, United States.

Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan, 1000 Wall Street, Ann Arbor, Michigan 48105, United States.

Publication information

Anal Chem. 2021 Mar 30;93(12):5028-5036. doi: 10.1021/acs.analchem.0c03693. Epub 2021 Mar 16.

DOI:10.1021/acs.analchem.0c03693

PMID:33724799

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9906987/

Abstract

LC-HRMS experiments detect thousands of compounds, with only a small fraction of them identified in most studies. Traditional data processing pipelines contain an alignment step to assemble the measurements of overlapping features across samples into a unified table. However, data sets acquired under nonidentical conditions are not amenable to this process, mostly due to significant alterations in chromatographic retention times. Alignment of features between disparately acquired LC-MS metabolomics data could aid collaborative compound identification efforts and enable meta-analyses of expanded data sets. Here, we describe metabCombiner, a new computational pipeline for matching known and unknown features in a pair of untargeted LC-MS data sets and concatenating their abundances into a combined table of intersecting feature measurements. metabCombiner groups features by mass-to-charge (m/z) values to generate a search space of possible feature pair alignments, fits a spline through a set of selected retention time ordered pairs, and ranks alignments by m/z, mapped retention time, and relative abundance similarity. We evaluated this workflow on a pair of plasma metabolomics data sets acquired with different gradient elution methods, achieving a mean absolute retention time prediction error of roughly 0.06 min and a weighted per-compound matching accuracy of approximately 90%. We further demonstrate the utility of this method by comprehensively mapping features in urine and muscle metabolomics data sets acquired from different laboratories. metabCombiner has the potential to bridge the gap between otherwise incompatible metabolomics data sets and is available as an R package at https://github.com/hhabra/metabCombiner.
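The three steps named in the abstract (grouping by m/z, mapping retention times, and ranking candidate pairs) can be illustrated with a minimal Python sketch. This is not the metabCombiner R package or its API: every name here (`Feature`, `candidate_pairs`, `fit_rt_map`, `score`, `rank_pairs`), the m/z tolerance, and the penalty weights are hypothetical, and piecewise-linear interpolation stands in for the spline fit so that the example needs only the standard library.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float  # mass-to-charge ratio
    rt: float  # retention time in minutes
    q: float   # relative abundance expressed as a quantile in [0, 1]


def candidate_pairs(xs, ys, mz_tol=0.005):
    """Step 1: pair features from the two data sets whose m/z values are close.

    mz_tol is an illustrative tolerance, not a value taken from the paper.
    """
    return [(x, y) for x in xs for y in ys if abs(x.mz - y.mz) <= mz_tol]


def fit_rt_map(anchors):
    """Step 2: map data set X retention times onto data set Y's time scale.

    `anchors` is a list of (rt_x, rt_y) ordered pairs with distinct rt_x.
    Piecewise-linear interpolation is used here in place of a spline.
    """
    pts = sorted(anchors)
    if len(pts) < 2:
        raise ValueError("need at least two anchor pairs")
    ax = [p[0] for p in pts]
    ay = [p[1] for p in pts]

    def predict(rt):
        # Pick the enclosing segment; the end segments also handle extrapolation.
        i = min(max(bisect_left(ax, rt), 1), len(ax) - 1)
        slope = (ay[i] - ay[i - 1]) / (ax[i] - ax[i - 1])
        return ay[i - 1] + slope * (rt - ax[i - 1])

    return predict


def score(x, y, predict, a=50.0, b=5.0, c=0.5):
    """Step 3: similarity in (0, 1]; a, b, c are hypothetical penalty weights."""
    penalty = (a * abs(x.mz - y.mz)
               + b * abs(predict(x.rt) - y.rt)
               + c * abs(x.q - y.q))
    return 1.0 / (1.0 + penalty)


def rank_pairs(xs, ys, anchors, mz_tol=0.005):
    """Return (score, x, y) tuples for all candidate pairs, best first."""
    predict = fit_rt_map(anchors)
    scored = [(score(x, y, predict), x, y)
              for x, y in candidate_pairs(xs, ys, mz_tol)]
    return sorted(scored, key=lambda t: t[0], reverse=True)


if __name__ == "__main__":
    xs = [Feature(180.0634, 2.0, 0.80)]
    ys = [Feature(180.0636, 3.1, 0.75), Feature(180.0630, 6.0, 0.20)]
    anchors = [(1.0, 1.5), (3.0, 4.5), (5.0, 7.0)]
    for s, x, y in rank_pairs(xs, ys, anchors):
        print(round(s, 3), x, y)
```

In this toy input both features in the second data set fall inside the m/z window, so the mapped retention time and the abundance quantile are what separate the plausible match from the implausible one.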




