Iacovacci Jacopo, Peluso Alina, Ebbels Timothy, Ralser Markus, Glen Robert C
Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London SW7 2AZ, UK.
The Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London NW1 1AT, UK.
Metabolites. 2020 Oct 29;10(11):435. doi: 10.3390/metabo10110435.
Mass spectrometry technologies are widely used in ionomics and metabolomics to simultaneously profile the intracellular concentrations of, e.g., amino acids or elements in genome-wide mutant libraries. These molecular or sub-molecular features are generally non-Gaussian, and their covariance reveals patterns of correlation that reflect the systems nature of cell biochemistry and biology. Here, we introduce two similarity measures, the Mahalanobis cosine and the hybrid Mahalanobis cosine, that incorporate information from the empirical covariance matrix of omics data from high-throughput screening and that can be used to quantify similarities between the profiled features of different mutants. We evaluate the performance of these similarity measures in the task of inferring and integrating genetic networks from short-profile ionomics/metabolomics data through an analysis of experimental data sets related to the ionome and the metabolome of the model organism . The study of the resulting ionome-metabolome multilayer genetic network, which encodes multiple omic-specific levels of correlation between genes, shows that the proposed measures can provide an alternative description of the relations between biological processes compared with the commonly used Pearson's correlation coefficient, and that they have the potential to guide the construction of novel hypotheses on the function of uncharacterised genes.
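To illustrate how a covariance-aware similarity between mutant profiles might be computed, the sketch below implements one common reading of a Mahalanobis cosine: the cosine of the angle between two profiles under the inner product induced by the inverse of the empirical covariance matrix. The abstract does not give the exact formula, so this definition, the column-wise centring step, the use of a pseudo-inverse, and the function name `mahalanobis_cosine` are all assumptions made for illustration; the hybrid variant is not attempted because its definition is not stated here.

```python
import numpy as np

def mahalanobis_cosine(X, cov=None):
    """Pairwise covariance-weighted cosine similarity between rows of X.

    X   : array of shape (n_mutants, n_features), one profile per mutant.
    cov : optional (n_features, n_features) covariance matrix; if None,
          the empirical covariance of the features is used.

    Assumed definition (not taken from the paper):
        s(x, y) = x^T C^-1 y / sqrt((x^T C^-1 x) * (y^T C^-1 y))
    """
    X = np.asarray(X, dtype=float)
    # Assumption: centre each feature so profiles are deviations from the mean.
    Xc = X - X.mean(axis=0)
    if cov is None:
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))
    # Pseudo-inverse keeps the sketch usable if the covariance is singular.
    precision = np.linalg.pinv(cov)
    # Gram matrix of all profiles under the covariance-induced inner product.
    gram = Xc @ precision @ Xc.T
    norms = np.sqrt(np.clip(np.diag(gram), 0.0, None))
    # Avoid division by zero for profiles with zero norm.
    norms[norms == 0.0] = np.inf
    return gram / np.outer(norms, norms)
```

Passing an identity matrix as `cov` reduces this to the ordinary cosine similarity of the centred profiles, which gives a simple baseline for judging how much the covariance weighting changes the inferred similarities.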