Traquete Francisco, Luz João, Cordeiro Carlos, Sousa Silva Marta, Ferreira António E N
Laboratório de FT-ICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.
Front Mol Biosci. 2022 Jul 22;9:917911. doi: 10.3389/fmolb.2022.917911. eCollection 2022.
Untargeted metabolomics seeks to identify and quantify most metabolites in a biological system. In general, metabolomics results are represented as numerical matrices containing the intensities of the detected variables. These matrices are subsequently analyzed by methods that seek to extract significant biological information from the data. In mass spectrometry-based metabolomics, if mass is measured with sufficient accuracy (error below 1 ppm), it is possible to derive mass-difference networks, which have spectral features as nodes and chemical changes as edges. These networks have previously been used to assist formula annotation and to rank the importance of chemical transformations. In this work, we propose a novel role for such networks in untargeted metabolomics data analysis: we demonstrate that their properties as graphs can also be used as signatures for metabolic profiling and class discrimination. For several benchmark examples, we computed six graph properties and found that the degree profile was consistently the property that gave the best performance across several clustering and classification methods, reaching levels competitive with those obtained using intensity data matrices and traditional pretreatment procedures. Furthermore, we propose two new metrics, derived from network properties, for ranking chemical transformations; these can be applied to sample comparison or clustering. These metrics illustrate how the graph properties of mass-difference networks can highlight aspects of the information contained in the data that are complementary to the information extracted from intensity-based data analysis.
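As a rough illustration of the mass-difference network idea described in the abstract, the sketch below links any two masses whose difference matches a known chemical transformation within a ppm tolerance, then counts each node's edges. It is a minimal sketch under stated assumptions, not the authors' implementation: the transformation list, the example masses, the function names `build_mdn` and `degree_profile`, and the choice to scale the tolerance by the larger of the two masses are all hypothetical. The "degree profile" is taken here simply as the list of node degrees, which may differ from the definition used in the paper.

```python
from itertools import combinations

# Hypothetical transformation list: monoisotopic mass differences in Da.
# The set of transformations used in the paper is not given in the abstract.
TRANSFORMATIONS = {
    "H2": 2.015650,
    "CH2": 14.015650,
    "O": 15.994915,
    "H2O": 18.010565,
    "CO2": 43.989829,
}


def build_mdn(masses, transformations=TRANSFORMATIONS, ppm=1.0):
    """Return edges (i, j, name) between masses that differ by a transformation.

    Assumption: the tolerance is `ppm` parts per million of the larger mass
    in each pair.
    """
    edges = []
    for i, j in combinations(range(len(masses)), 2):
        low, high = sorted((masses[i], masses[j]))
        diff = high - low
        tol = ppm * 1e-6 * high
        for name, delta in transformations.items():
            if abs(diff - delta) <= tol:
                edges.append((i, j, name))
    return edges


def degree_profile(n_nodes, edges):
    """Count how many edges touch each node."""
    degrees = [0] * n_nodes
    for i, j, _ in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


# Made-up neutral masses: a base mass, base + CH2, base + H2O, and one
# unrelated mass that should stay unconnected.
example_masses = [180.063388, 194.079038, 198.073953, 250.123456]
example_edges = build_mdn(example_masses)
example_degrees = degree_profile(len(example_masses), example_edges)
print(example_edges)
print(example_degrees)
```

Computing such a degree list for each sample would give one numerical signature per sample, which is the kind of input a clustering or classification method could then use in place of an intensity matrix.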