Zacharias Helena U, Altenbuchinger Michael, Gronwald Wolfram
Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany.
Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, Am Biopark 9, 93053 Regensburg, Germany.
Metabolites. 2018 Aug 28;8(3):47. doi: 10.3390/metabo8030047.
In this review, we summarize established and recent bioinformatic and statistical methods for the analysis of NMR-based metabolomics data. The analysis of NMR metabolic fingerprints poses several challenges, including unwanted biases, high dimensionality, and typically low sample numbers. Common analysis tasks include the identification of differential metabolites and the classification of specimens. However, analysis results depend strongly on how the data are preprocessed, and there is not yet a consensus on how to remove unwanted biases and experimental variance prior to statistical analysis. Here, we first review established and new preprocessing protocols, including different data normalizations and transformations, and illustrate their pros and cons. Second, we give a brief overview of state-of-the-art statistical analysis in NMR-based metabolomics. Finally, we discuss a recent development in statistical data analysis that makes data normalization unnecessary. This method, called zero-sum regression, builds metabolite signatures whose estimation and predictions are both independent of prior normalization.
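The normalization independence claimed for zero-sum regression can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes log-transformed intensities, so that a sample-specific scaling factor (for example, dilution) becomes a constant added to every feature of that sample, and it replaces the penalized fit with plain least squares under the constraint that the coefficients sum to zero. The function names `fit_zero_sum_ls` and `predict` and the simulated data are hypothetical.

```python
import numpy as np

def fit_zero_sum_ls(X, y):
    """Least squares with intercept under the constraint sum(beta) == 0.

    Simplifying assumption: no sparsity penalty is used here.
    """
    n, p = X.shape
    # Orthonormal basis Z (p x (p-1)) of all vectors orthogonal to the ones vector.
    Q, _ = np.linalg.qr(np.ones((p, 1)), mode="complete")
    Z = Q[:, 1:]
    # Writing beta = Z @ gamma enforces the zero-sum constraint by construction.
    design = np.column_stack([np.ones(n), X @ Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    intercept, gamma = coef[0], coef[1:]
    return intercept, Z @ gamma

def predict(X, intercept, beta):
    return intercept + X @ beta

# Simulated log-intensities (illustrative data only).
rng = np.random.default_rng(0)
n, p = 40, 6
X_log = rng.normal(size=(n, p))
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0, 0.0])  # sums to zero
y = 2.0 + X_log @ beta_true + 0.1 * rng.normal(size=n)

# A per-sample scaling factor becomes a per-sample shift on the log scale.
log_scale = rng.uniform(-3.0, 3.0, size=n)
X_shifted = X_log + log_scale[:, None]

b0, beta = fit_zero_sum_ls(X_log, y)
b0_s, beta_s = fit_zero_sum_ls(X_shifted, y)

print("sum of coefficients:", beta.sum())
print("same coefficients after rescaling:", np.allclose(beta, beta_s))
print("same predictions after rescaling:",
      np.allclose(predict(X_log, b0, beta), predict(X_shifted, b0, beta)))
```

Because the coefficients sum to zero, a constant added to all log-features of a sample cancels in the linear predictor, so neither the fit nor the predictions change when samples are rescaled.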