Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, 850 Republican Street, Seattle, Washington 98109, United States.
Department of Electronic Science, Xiamen University, Xiamen 361005, China.
Anal Chem. 2020 Oct 6;92(19):12925-12933. doi: 10.1021/acs.analchem.0c01493. Epub 2020 Sep 14.
Data quality in global metabolomics is of great importance for biomarker discovery and systems biology studies. However, comprehensive metrics and methods to evaluate and compare the data quality of global metabolomics data sets are lacking. In this work, we combine newly developed metrics with well-known measures to comprehensively and quantitatively characterize the data quality across two similar liquid chromatography-mass spectrometry (LC-MS) platforms, with the goal of providing an efficient and improved way to evaluate data quality in global metabolite profiling experiments. A pooled human serum sample was run 50 times on two high-resolution LC-QTOF-MS platforms to provide profile and centroid MS data. These data were processed using Progenesis QI software and then analyzed using five data quality measures: retention time drift, the number of compounds detected, missing values, and MS reproducibility (2 measures). The detected compounds were fit to a γ distribution versus compound abundance, which was normalized to allow comparison of different platforms. To evaluate missing values, characteristic curves were obtained by plotting the compound detection percentage versus extraction frequency. To characterize reproducibility, the cumulative coefficient of variation (CV) versus the percentage of total compounds detected and the intraclass correlation coefficient (ICC) versus compound abundance were investigated. Key findings include significantly better performance using profile mode data compared to centroid mode data, as well as quantitatively better performance from the newer, higher resolution instrument. A summary table gives a snapshot of the experimental results and provides a template to evaluate the global metabolite profiling workflow. Taken together, these measures give a good overall view of data quality in global profiling and allow comparisons of data acquisition strategies and platforms as well as optimization of parameters.
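As an illustration only, two of the measures named above (the cumulative CV curve and a missing-value detection curve) might be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function names, the toy data, and the reading of "extraction frequency" as the fraction of replicate runs in which a compound is detected are all assumptions made for this example.

```python
import numpy as np

def cv_percent(abundance):
    """Per-compound CV (%) across replicate runs, ignoring missing values (NaN).

    abundance: array of shape (n_runs, n_compounds).
    """
    mean = np.nanmean(abundance, axis=0)
    std = np.nanstd(abundance, axis=0, ddof=1)  # sample standard deviation
    return 100.0 * std / mean

def cumulative_cv_curve(cv, thresholds):
    """Percentage of compounds whose CV is at or below each threshold."""
    cv = cv[np.isfinite(cv)]
    return np.array([100.0 * np.mean(cv <= t) for t in thresholds])

def detection_curve(abundance, frequencies):
    """Percentage of compounds detected in at least a given fraction of runs.

    Assumption: a compound counts as "detected" in a run if its value is not NaN.
    """
    detected_fraction = np.mean(~np.isnan(abundance), axis=0)
    return np.array([100.0 * np.mean(detected_fraction >= f) for f in frequencies])

# Hypothetical toy data: 50 replicate runs of 200 compounds with
# gamma-distributed abundances, 10% multiplicative noise, 5% missing values.
rng = np.random.default_rng(0)
n_runs, n_compounds = 50, 200
true_level = rng.gamma(shape=2.0, scale=1000.0, size=n_compounds)
abundance = true_level * rng.normal(1.0, 0.1, size=(n_runs, n_compounds))
abundance[rng.random((n_runs, n_compounds)) < 0.05] = np.nan

cv = cv_percent(abundance)
cv_curve = cumulative_cv_curve(cv, thresholds=[10, 20, 30])
det_curve = detection_curve(abundance, frequencies=[0.5, 0.8, 1.0])
```

Plotting `cv_curve` against its thresholds, or `det_curve` against its frequencies, for each platform and acquisition mode would give curves that can be compared side by side, which is the kind of comparison the abstract describes.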