MetTailor: dynamic block summary and intensity normalization for robust analysis of mass spectrometry data in metabolomics.

Author information

Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore, Singapore.

Interdisciplinary Research Group in Infectious Diseases, Singapore-MIT Alliance for Research & Technology, Singapore, Singapore.

Publication information

Bioinformatics. 2015 Nov 15;31(22):3645-52. doi: 10.1093/bioinformatics/btv434. Epub 2015 Jul 27.

Abstract

MOTIVATION

Accurate cross-sample peak alignment and reliable intensity normalization are critical steps for robust quantitative analysis in untargeted metabolomics, since tandem mass spectrometry (MS/MS) is rarely used for compound identification. Shortcomings in these data processing steps can therefore easily introduce false positives through misalignments and erroneous normalization adjustments in large sample studies.

RESULTS

In this work, we developed a software package, MetTailor, featuring two novel data preprocessing steps to remedy drawbacks in existing processing tools. First, we propose a novel dynamic block summarization (DBS) method for correcting misalignments from peak alignment algorithms, which alleviates the missing data problem caused by misalignments. To verify that realignments are correct, we propose using the cross-sample consistency in isotopic intensity ratios as a quality metric. Second, we developed a flexible intensity normalization procedure that adjusts normalizing factors against the temporal variations in the total ion chromatogram (TIC) along the chromatographic retention time (RT). We first evaluated the DBS algorithm using a curated metabolomics dataset, illustrating that the algorithm identifies misaligned peaks and correctly realigns them with good sensitivity. We next demonstrated the DBS algorithm and the RT-based normalization procedure in a large-scale dataset featuring >100 serum samples from a primary dengue infection study. Although the initial alignment was successful for the majority of peaks, the DBS algorithm still corrected ∼7000 misaligned peaks in this dataset, and many recovered peaks showed isotopic patterns consistent with the peaks they were realigned to. In addition, the RT-based normalization algorithm efficiently removed visible local variations in TIC along the RT without sacrificing the sensitivity of detecting differentially expressed metabolites.
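
The idea of adjusting normalizing factors against local TIC variation along the RT can be illustrated with a minimal Python sketch. It is not the MetTailor implementation, which is an R package whose exact procedure is not described in this abstract. The function names, the fixed RT window (`half_width`), and the use of the cross-sample median as the reference are all assumptions made for illustration.

```python
from statistics import median


def local_tic(rt, intensity, center, half_width):
    """Sum one sample's intensities over peaks whose RT lies within the window."""
    return sum(x for t, x in zip(rt, intensity) if abs(t - center) <= half_width)


def rt_local_normalize(rt, samples, half_width=1.0):
    """Scale each peak by (median local TIC across samples) / (own local TIC).

    rt      : retention times of the aligned peaks, shared by all samples
    samples : dict mapping sample name -> intensities, in the same order as rt
    Both the window width and the median reference are illustrative assumptions.
    """
    normalized = {name: [] for name in samples}
    for i, center in enumerate(rt):
        # Local TIC of every sample around this peak's retention time.
        tics = {name: local_tic(rt, values, center, half_width)
                for name, values in samples.items()}
        reference = median(tics.values())
        for name, values in samples.items():
            # Leave the peak unscaled if the sample has no signal in the window.
            factor = reference / tics[name] if tics[name] > 0 else 1.0
            normalized[name].append(values[i] * factor)
    return normalized


# Toy data: sample "b" matches "a" at early RT but is twice as intense at late RT.
rt = [1.0, 1.5, 5.0, 5.5]
samples = {
    "a": [100.0, 200.0, 100.0, 300.0],
    "b": [100.0, 200.0, 200.0, 600.0],
}
normalized = rt_local_normalize(rt, samples, half_width=1.0)
```

In this toy case a single global factor would distort the early peaks in order to correct the late ones, whereas the windowed factors leave the early region untouched and rescale only the late region.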

AVAILABILITY AND IMPLEMENTATION

The R package MetTailor is freely available at the SourceForge website http://mettailor.sourceforge.net/.

CONTACT

hyung_won_choi@nuhs.edu.sg

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
