Comparing normalization methods and the impact of noise.

Affiliations

Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE, 68583-0963, USA.

Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE, 68588-0304, USA.

Publication information

Metabolomics. 2018 Aug 10;14(8):108. doi: 10.1007/s11306-018-1400-6.

DOI:10.1007/s11306-018-1400-6

PMID:30830388

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6638559/

Abstract

INTRODUCTION

Failure to properly account for normal systematic variations in OMICS datasets may result in misleading biological conclusions. Accordingly, normalization is a necessary step in the proper preprocessing of OMICS datasets. In this regard, an optimal normalization method will effectively reduce unwanted biases and increase the accuracy of downstream quantitative analyses. However, it is currently unclear which normalization method is best, since each algorithm addresses systematic noise in a different way.

OBJECTIVE

To determine the optimal normalization method for preprocessing metabolomics datasets.

METHODS

Nine MVAPACK normalization algorithms were compared using simulated and experimental NMR spectra modified with added Gaussian noise and random dilution factors. Methods were evaluated on their ability to recover the intensities of the true spectral peaks and to reproduce the true classifying features of an orthogonal projections to latent structures discriminant analysis (OPLS-DA) model.
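To make the setup concrete, the following is a minimal Python sketch, not the MVAPACK implementation and not the authors' simulation protocol. It perturbs a hypothetical spectrum with random dilution factors and additive Gaussian noise, then applies constant sum (CS) and a common formulation of probabilistic quotient (PQ) normalization. The spectrum, the dilution range, the noise level, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def constant_sum(X):
    """Scale each spectrum (row) so that its intensities sum to 1."""
    return X / X.sum(axis=1, keepdims=True)

def probabilistic_quotient(X):
    """PQ normalization (common formulation, assumed here): constant-sum
    first, then divide each row by the median of its quotients against
    a median reference spectrum."""
    Xcs = constant_sum(X)
    reference = np.median(Xcs, axis=0)
    keep = reference > 0  # skip empty reference bins to avoid division by zero
    quotients = Xcs[:, keep] / reference[keep]
    return Xcs / np.median(quotients, axis=1, keepdims=True)

rng = np.random.default_rng(0)
true_spectrum = rng.uniform(1.0, 10.0, size=200)   # hypothetical peak intensities
dilution = rng.uniform(0.5, 2.0, size=(20, 1))     # one random dilution factor per sample
noise = rng.normal(0.0, 0.05, size=(20, 200))      # additive Gaussian noise
observed = dilution * true_spectrum + noise

cs = constant_sum(observed)
pq = probabilistic_quotient(observed)
```

In this low-noise setting both methods remove almost all of the sample-to-sample scale differences introduced by dilution; increasing the standard deviation passed to `rng.normal` is one way to explore how noise degrades the result.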

RESULTS

Most normalization methods (except histogram matching) performed equally well at modest levels of signal variance. Only probabilistic quotient (PQ) and constant sum (CS) maintained the highest level of peak recovery (> 67%) and correlation with true loadings (> 0.6) at maximal noise.
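The abstract does not define how peak recovery or correlation with true loadings was computed, so the sketch below is only one plausible reading. It treats recovery as the fraction of peaks whose estimated intensity falls within a relative tolerance of the true value, and uses the Pearson correlation for loadings. The tolerance, the function names, and the example values are all hypothetical.

```python
import numpy as np

def peak_recovery(true_peaks, estimated_peaks, rel_tol=0.1):
    """Fraction of peaks whose estimated intensity lies within rel_tol
    (relative) of the true intensity. This definition is an assumption,
    not the one used in the paper."""
    true_peaks = np.asarray(true_peaks, dtype=float)
    estimated_peaks = np.asarray(estimated_peaks, dtype=float)
    within = np.abs(estimated_peaks - true_peaks) <= rel_tol * np.abs(true_peaks)
    return within.mean()

def loading_correlation(true_loadings, estimated_loadings):
    """Pearson correlation between two loading vectors."""
    return np.corrcoef(true_loadings, estimated_loadings)[0, 1]

# Hypothetical values for illustration only
true_peaks = np.array([1.0, 2.0, 4.0, 8.0])
estimated = np.array([1.05, 2.5, 3.9, 8.2])

recovery = peak_recovery(true_peaks, estimated)      # 3 of 4 peaks within 10%
corr = loading_correlation(true_peaks, estimated)
```

Under these assumed definitions, the reported thresholds would correspond to `recovery > 0.67` and `corr > 0.6` at the highest noise level.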

CONCLUSION

PQ and CS performed the best at recovering peak intensities and reproducing the true classifying features for an OPLS-DA model regardless of spectral noise level. Our findings suggest that performance is largely determined by the level of noise in the dataset, while the effect of dilution factors was negligible. A minimal allowable noise level of 20% was also identified for a valid NMR metabolomics dataset.

