Biosystems Data Analysis, Swammerdam Institute for Life Sciences, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands.
Analytical Biosciences, LACDR, Leiden University, 2333 CC Leiden, The Netherlands.
Anal Chem. 2020 Oct 20;92(20):13614-13621. doi: 10.1021/acs.analchem.9b05613. Epub 2020 Oct 9.
Metabolomics is becoming a mature part of analytical chemistry, as evidenced by the growing number of publications and of attendees at international conferences dedicated to the topic. Yet a systematic treatment of the fundamental structure and properties of metabolomics data is lagging behind. We aim to fill this gap by introducing two fundamental theories concerning metabolomics data: data theory and measurement theory. Our approach is to ask simple questions, the answers to which require applying these theories to metabolomics. We show that at least four levels of metabolomics data, each with different properties, can be distinguished, and we warn against confusing data with numbers. This treatment provides a theoretical underpinning for preprocessing and postprocessing methods in metabolomics and argues for a proper match between the type of metabolomics data and the biological question to be answered. The approach can be extended to other omics measurements, such as proteomics, and is thus relevant to a large analytical chemistry community.