
Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets.

Authors

Habra Hani, Kachman Maureen, Bullock Kevin, Clish Clary, Evans Charles R, Karnovsky Alla

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109, United States.

Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan, 1000 Wall Street, Ann Arbor, Michigan 48105, United States.

Publication information

Anal Chem. 2021 Mar 30;93(12):5028-5036. doi: 10.1021/acs.analchem.0c03693. Epub 2021 Mar 16.

Abstract

LC-HRMS experiments detect thousands of compounds, with only a small fraction of them identified in most studies. Traditional data processing pipelines contain an alignment step to assemble the measurements of overlapping features across samples into a unified table. However, data sets acquired under nonidentical conditions are not amenable to this process, mostly due to significant alterations in chromatographic retention times. Alignment of features between disparately acquired LC-MS metabolomics data could aid collaborative compound identification efforts and enable meta-analyses of expanded data sets. Here, we describe metabCombiner, a new computational pipeline for matching known and unknown features in a pair of untargeted LC-MS data sets and concatenating their abundances into a combined table of intersecting feature measurements. metabCombiner groups features by mass-to-charge (m/z) values to generate a search space of possible feature pair alignments, fits a spline through a set of selected retention time ordered pairs, and ranks alignments by m/z, mapped retention time, and relative abundance similarity. We evaluated this workflow on a pair of plasma metabolomics data sets acquired with different gradient elution methods, achieving a mean absolute retention time prediction error of roughly 0.06 min and a weighted per-compound matching accuracy of approximately 90%. We further demonstrate the utility of this method by comprehensively mapping features in urine and muscle metabolomics data sets acquired from different laboratories. metabCombiner has the potential to bridge the gap between otherwise incompatible metabolomics data sets and is available as an R package at https://github.com/hhabra/metabCombiner.
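
The three steps named in the abstract (m/z grouping, retention time mapping, and similarity ranking) can be illustrated with a minimal Python sketch. This is not the package's implementation: the feature records, the m/z tolerance, the anchor pairs, and the scoring weights `a`, `b`, `c` are all hypothetical, and piecewise-linear interpolation stands in for the spline fit to keep the example dependency-free.

```python
from bisect import bisect_right

def candidate_pairs(xs, ys, mz_tol=0.005):
    """Pair each feature in xs with every feature in ys whose m/z is within mz_tol (assumed tolerance)."""
    return [(x, y) for x in xs for y in ys if abs(x["mz"] - y["mz"]) <= mz_tol]

def make_rt_map(anchors):
    """Build a function mapping retention time in data set X to a predicted time in data set Y.

    Piecewise-linear interpolation through the anchor pairs is used here in place of a spline.
    Assumes at least two anchors with distinct X retention times.
    """
    pts = sorted(anchors)
    ax = [p[0] for p in pts]
    ay = [p[1] for p in pts]

    def rt_map(rt):
        if rt <= ax[0]:
            return ay[0]
        if rt >= ax[-1]:
            return ay[-1]
        i = bisect_right(ax, rt)  # ax[i-1] <= rt < ax[i]
        frac = (rt - ax[i - 1]) / (ax[i] - ax[i - 1])
        return ay[i - 1] + (ay[i] - ay[i - 1]) * frac

    return rt_map

def score(x, y, rt_map, a=50.0, b=5.0, c=1.0):
    """Similarity in (0, 1]; larger means closer in m/z, mapped RT, and abundance rank.

    The weights a, b, c are illustrative placeholders, not published values.
    """
    penalty = (a * abs(x["mz"] - y["mz"])
               + b * abs(rt_map(x["rt"]) - y["rt"])
               + c * abs(x["q"] - y["q"]))
    return 1.0 / (1.0 + penalty)

# Hypothetical features: m/z, retention time (min), relative abundance rank q in [0, 1].
xs = [{"id": "x1", "mz": 180.0634, "rt": 2.0, "q": 0.80},
      {"id": "x2", "mz": 180.0640, "rt": 5.0, "q": 0.30}]
ys = [{"id": "y1", "mz": 180.0636, "rt": 3.1, "q": 0.75},
      {"id": "y2", "mz": 180.0639, "rt": 7.4, "q": 0.35}]

# Hypothetical anchor pairs (RT in X, RT in Y) taken from confidently matched features.
rt_map = make_rt_map([(1.0, 1.5), (3.0, 4.5), (6.0, 9.0)])

# Rank all candidate alignments in the shared m/z group, best first.
ranked = sorted(candidate_pairs(xs, ys),
                key=lambda pair: score(pair[0], pair[1], rt_map),
                reverse=True)
for x, y in ranked:
    print(x["id"], y["id"], round(score(x, y, rt_map), 3))
```

In this toy example all four features fall in one m/z group, so m/z alone cannot separate them; the mapped retention time is what distinguishes the plausible pairs from the implausible ones.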

