Department of Chemistry, University of Alberta, Edmonton, AB, T6G 2G2, Canada.
The Metabolomics Innovation Centre, Edmonton, AB, Canada.
Metabolomics. 2024 Feb 12;20(2):22. doi: 10.1007/s11306-023-02086-8.
For many samples studied by GC-based metabolomics applications, extensive sample preparation involving extraction followed by a two-step derivatization procedure of methoximation and trimethylsilylation (TMS) is typically required to expand the metabolome coverage. Performing normalization is critical to correct for variations present in samples and any biases added during the sample preparation steps and analytical runs. Addressing the totality of variations with an adequate normalization method increases the reliability of the downstream data analysis and interpretation of the results.
Normalizing to sample mass is one of the most commonly employed strategies, while the total peak area (TPA) as a normalization factor is also frequently used as a post-acquisition technique. Here, we present a new normalization approach, total derivatized peak area (TDPA), where data are normalized to the intensity of all derivatized compounds. TDPA relies on the benefits of silylation as a universal derivatization method for GC-based metabolomics studies.
Two sample classes were simulated, each comprising samples of systematically incremented mass; the only difference between the classes was the concentration of added amino acids. The samples were TMS derivatized and analyzed using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC × GC-TOFMS). The performance of five normalization strategies (no normalization, normalization to sample mass, TPA, total useful peak area (TUPA), and TDPA) was evaluated on the acquired data.
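As a rough illustration of how the post-acquisition factors differ, the sketch below computes TPA, TUPA, and TDPA factors for a toy peak table. It is not the authors' implementation. The function names, the toy data, and the `is_derivatized` flags are hypothetical, and the sketch assumes that a "useful" peak is one detected in every sample and that derivatized peaks have already been flagged by some means not described here.

```python
import numpy as np

def tpa_factor(peaks):
    """Total peak area: sum of all peak areas in each sample (row)."""
    return peaks.sum(axis=1)

def tupa_factor(peaks):
    """Total useful peak area (assumption: 'useful' = detected in every sample).

    Needs an aligned peak table, because the same column must refer to
    the same compound in all samples.
    """
    useful = (peaks > 0).all(axis=0)
    return peaks[:, useful].sum(axis=1)

def tdpa_factor(peaks, is_derivatized):
    """Total derivatized peak area: sum over peaks flagged as derivatized.

    Each row is summed independently, so the factor for one sample does
    not depend on any other sample.
    """
    mask = np.asarray(is_derivatized, dtype=bool)
    return peaks[:, mask].sum(axis=1)

def normalize(peaks, factors):
    """Divide every peak area in a sample by that sample's factor."""
    factors = np.asarray(factors, dtype=float)
    return peaks / factors[:, None]

# Toy table: 3 samples (rows) x 4 peaks (columns); areas scale with sample mass.
peaks = np.array([
    [10.0, 20.0, 0.0,  5.0],
    [20.0, 40.0, 4.0, 10.0],
    [30.0, 60.0, 6.0, 15.0],
])
is_derivatized = [True, True, False, False]  # hypothetical flags

tpa = tpa_factor(peaks)
tupa = tupa_factor(peaks)
tdpa = tdpa_factor(peaks, is_derivatized)
normalized = normalize(peaks, tdpa)
```

In this toy table the first peak has the same normalized value in all three samples after division by the TDPA factor, which is the behavior a normalization factor is meant to deliver when only sample mass varies.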
Of the five normalization techniques compared, TUPA and TDPA were the most effective: in PCA score space, both gave a clear separation between the two classes.
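Class separation in PCA score space can be checked along the following lines. This is a minimal sketch using mean-centering and a singular value decomposition; the preprocessing actually used in the study is not stated in the abstract, and the function name and toy data are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores for the rows of X (mean-centered, no scaling)."""
    Xc = X - X.mean(axis=0)
    # Economy-size SVD: the scores are U * S, the loadings are the rows of Vt.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Toy normalized data: 6 samples x 3 features.
# The first three rows are class A, the last three class B; the classes
# differ mainly in the first feature.
X = np.array([
    [1.0, 5.0, 2.0],
    [1.1, 5.1, 2.0],
    [0.9, 4.9, 2.1],
    [3.0, 5.0, 2.0],
    [3.1, 5.1, 2.1],
    [2.9, 4.9, 2.0],
])

scores = pca_scores(X)
class_a_pc1 = scores[:3, 0]
class_b_pc1 = scores[3:, 0]

# The classes are separated on PC1 if their score ranges do not overlap.
separated = (class_a_pc1.max() < class_b_pc1.min()
             or class_b_pc1.max() < class_a_pc1.min())
```

The sign of a principal component is arbitrary, so the check compares the two score ranges in both orders rather than assuming which class lies on the positive side.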
TUPA and TDPA carry different strengths: TUPA requires peak alignment across all samples, which can only be performed once the study is complete, whereas TDPA requires no alignment. These findings should make data normalization strategies more convenient and effective to use and help address the normalization challenges that the metabolomics community currently faces.