Environmental and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA.
National Security Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA.
Rapid Commun Mass Spectrom. 2021 May 15;35(9):e9068. doi: 10.1002/rcm.9068.
Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is a preferred technique for analyzing complex organic mixtures. Currently, there is no consensus normalization approach, nor an objective method for selecting one, for quantitative analyses of FT-ICR-MS data. We investigate a method to evaluate and score the amount of bias various normalization approaches introduce into the data.
We evaluate the ability of the Statistical Procedure for the Analysis of Normalization Strategies (SPANS) to guide the selection of appropriate normalization approaches for two different FT-ICR-MS data sets. Furthermore, we test the robustness of SPANS results to changes in SPANS parameter values and assess the impact of using various normalization approaches on downstream statistical analyses.
The normalization approach identified by SPANS differed between the two data sets. Normalization methods affected the statistical significance of peaks differently, underscoring the importance of carefully evaluating candidate methods. SPANS scores were more consistent when at least 120 significant peaks were used; larger sets of peaks were obtained by increasing the p-value threshold. Notably, total sum scaling and highest peak normalization, both used in previous studies, underperformed relative to the SPANS-recommended normalization approaches.
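For readers unfamiliar with the two previously used approaches named above, the following is a minimal sketch of what they are commonly understood to compute for a single sample. It is not the authors' implementation. The function names, the use of `None` for unobserved peaks, and the choice to normalize on the raw (not log-transformed) intensity scale are all assumptions made for illustration.

```python
def total_sum_scaling(sample):
    """Divide each observed intensity by the sum of observed intensities.

    `sample` is a list of peak intensities for one sample; None marks a peak
    that was not observed (an assumption of this sketch).
    """
    observed = [v for v in sample if v is not None]
    total = sum(observed)
    if total <= 0:
        raise ValueError("sample has no positive observed intensity")
    return [None if v is None else v / total for v in sample]


def highest_peak_normalization(sample):
    """Divide each observed intensity by the largest observed intensity."""
    observed = [v for v in sample if v is not None]
    if not observed or max(observed) <= 0:
        raise ValueError("sample has no positive observed intensity")
    top = max(observed)
    return [None if v is None else v / top for v in sample]


# Hypothetical intensities for one sample; the last peak was not observed.
example = [10.0, 30.0, 60.0, None]
tss = total_sum_scaling(example)
hpn = highest_peak_normalization(example)
```

After total sum scaling the observed values of a sample sum to 1, and after highest peak normalization the largest observed value equals 1. Both rescale every peak in a sample by one sample-specific factor, which is the kind of adjustment whose bias SPANS is used to score.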
Although no single normalization method is best for all data sets, SPANS provides a mechanism for identifying an appropriate normalization method for quantitative analysis of FT-ICR-MS data. The number of peaks used in the background distributions of SPANS contributes more to the reproducibility of results than the p-value thresholds used to obtain those peaks.
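The practical implication, relaxing the p-value threshold until enough significant peaks are available, can be sketched as follows. This is an illustration under stated assumptions and not part of SPANS itself. The function name, the grid of candidate thresholds, and the use of `<=` for the cutoff are hypothetical. Only the target of at least 120 peaks comes from the results above.

```python
def smallest_threshold_with_enough_peaks(p_values, candidates, min_peaks=120):
    """Return (threshold, n_peaks) for the smallest candidate threshold that
    keeps at least `min_peaks` peaks, or None if no candidate is sufficient.

    `p_values` holds one p-value per peak; `candidates` is a hypothetical
    grid of thresholds to try.
    """
    for threshold in sorted(candidates):
        n_peaks = sum(1 for p in p_values if p <= threshold)
        if n_peaks >= min_peaks:
            return threshold, n_peaks
    return None


# Hypothetical p-values: 500 peaks spread evenly from 0.001 to 0.5.
p_values = [i / 1000 for i in range(1, 501)]
choice = smallest_threshold_with_enough_peaks(
    p_values, candidates=[0.01, 0.05, 0.1, 0.2]
)
```

With these made-up inputs the thresholds 0.01, 0.05, and 0.1 keep 10, 50, and 100 peaks respectively, so the first candidate that reaches 120 peaks is 0.2.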