Karpievitch Yuliya V, Nikolic Sonja B, Wilson Richard, Sharman James E, Edwards Lindsay M
School of Mathematics and Physics, University of Tasmania, Hobart, TAS, Australia.
Menzies Research Institute Tasmania, University of Tasmania, Hobart, TAS, Australia.
PLoS One. 2014 Dec 30;9(12):e116221. doi: 10.1371/journal.pone.0116221. eCollection 2014.
Liquid chromatography-mass spectrometry (LC-MS) has become one of the analytical platforms of choice for metabolomics studies. However, LC-MS metabolomics data can suffer from various systematic biases. These include batch effects, day-to-day variations in instrument performance, signal intensity loss due to time-dependent effects of LC column performance, accumulation of contaminants in the MS ion source, and MS sensitivity, among others. In this study we aimed to test a singular value decomposition-based method, called EigenMS, for normalization of metabolomics data. We analyzed a clinical human dataset in which LC-MS serum metabolomics data and physiological measurements were collected from thirty-nine healthy subjects and forty subjects with type 2 diabetes, and applied EigenMS to detect and correct for any systematic bias. EigenMS works in several stages. First, EigenMS preserves the treatment group differences in the metabolomics data by estimating treatment effects with an ANOVA model (multiple fixed effects can be estimated). Singular value decomposition of the residuals matrix is then used to determine bias trends in the data. The number of bias trends is then estimated via a permutation test, and the effects of the bias trends are eliminated. EigenMS removed bias of unknown complexity from the LC-MS metabolomics data, allowing for increased sensitivity in differential analysis. Moreover, normalized samples correlated better both with other normalized samples and with corresponding physiological data, such as blood glucose level, glycated haemoglobin, exercise central augmentation pressure normalized to a heart rate of 75, and total cholesterol. We were able to report 2578 discriminatory metabolite peaks in the normalized data (p<0.05), compared with only 1840 metabolite signals in the raw data. Our results support the use of singular value decomposition-based normalization for metabolomics data.
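The stages described in the abstract (ANOVA fit of group effects, singular value decomposition of the residuals, a permutation test for the number of bias trends, and removal of those trends) can be illustrated with a minimal Python sketch. This is not the authors' EigenMS implementation. The function name `eigenms_sketch`, its parameters, the row-wise permutation scheme, and the rule of counting leading significant components are all assumptions made for illustration, and the model is restricted to a single treatment factor.

```python
import numpy as np


def eigenms_sketch(data, groups, n_perm=200, alpha=0.05, seed=0):
    """Illustrative SVD-based normalization (hypothetical helper, not EigenMS itself).

    data   : (n_peaks, n_samples) array of log-scale intensities, no missing values
    groups : (n_samples,) array of treatment-group labels
    Returns (normalized_data, n_bias_trends).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    groups = np.asarray(groups)

    # Step 1: one-way ANOVA fit per peak. The fitted value of each sample is
    # its group mean, obtained by projecting onto the group-indicator design.
    labels = np.unique(groups)
    design = (groups[:, None] == labels[None, :]).astype(float)
    hat = design @ np.linalg.pinv(design)      # symmetric projection matrix
    residuals = data - data @ hat

    # Step 2: SVD of the residual matrix. Right singular vectors are
    # candidate bias trends across samples.
    u, s, vt = np.linalg.svd(residuals, full_matrices=False)
    observed = s**2 / np.sum(s**2)             # proportion of variance explained

    # Step 3: permutation test (assumed scheme). Shuffling each peak's
    # residuals independently destroys any trend shared across peaks.
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        permuted = rng.permuted(residuals, axis=1)
        sp = np.linalg.svd(permuted, compute_uv=False)
        exceed += (sp**2 / np.sum(sp**2)) >= observed
    pvals = (exceed + 1.0) / (n_perm + 1.0)

    # Count leading components that are significant at level alpha.
    n_bias = 0
    for p in pvals:
        if p >= alpha:
            break
        n_bias += 1

    # Step 4: subtract the rank-n_bias reconstruction of the bias trends.
    # The trends lie in the residual space, so group means are unchanged.
    bias = (u[:, :n_bias] * s[:n_bias]) @ vt[:n_bias]
    return data - bias, n_bias
```

Because the bias trends are estimated only from the residuals, subtracting them leaves each peak's group means untouched, which is how the treatment differences are preserved in this sketch. Real LC-MS data typically contain missing values and may involve several fixed effects, neither of which is handled here.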