Radboud University, Institute for Molecules and Materials (IMM), Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands.
Biometris, Wageningen UR, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands.
Sci Rep. 2019 Feb 4;9(1):1123. doi: 10.1038/s41598-018-37494-7.
Platforms such as metabolomics provide an unprecedented view of the chemical versatility of biomedical samples. Many diseases manifest as perturbations in specific combinations of metabolites. Multivariate analyses are essential to detect such combinations and associate them with specific diseases. This is usually done by targeted discrimination of samples associated with a specific disease from non-diseased control samples. Such targeted data interpretation may not respect the heterogeneity of metabolic responses, both between and within diseases. Here we show that multivariate methods that find any set of perturbed metabolites in a single patient may be combined with data collected with a single metabolomics technology to investigate a large array of diseases simultaneously. Several such untargeted data analysis approaches have already been proposed in other fields to find both expected and unexpected perturbations, e.g. in Statistical Process Control. We have critically compared several of these approaches for their sensitivity and their ability to correctly identify the specifically perturbed metabolites. We also introduce a new approach for this purpose. The newly introduced Sparse Mean approach, which we find to be the most sensitive and best able to identify the specifically perturbed metabolites, turns metabolomics into an untargeted diagnostic platform. Beyond metabolomics, the proposed approach may greatly benefit fault diagnosis with untargeted analyses in many other fields, such as industrial process control, food adulteration detection, and intrusion detection.
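To make the idea of an untargeted, single-patient analysis concrete, the following is a minimal sketch in the spirit of classical Statistical Process Control. It is not the Sparse Mean approach, whose details are not given in this abstract. It assumes that control samples are roughly Gaussian, that a Hotelling-type T² statistic summarises the overall deviation of one patient, and that per-metabolite z-scores give a crude indication of which metabolites are perturbed. The function names, the ridge term, the threshold of 3 and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_controls(X_controls):
    """Estimate the mean vector and covariance matrix from control samples.

    X_controls has one row per control sample and one column per metabolite.
    """
    mean = X_controls.mean(axis=0)
    cov = np.cov(X_controls, rowvar=False)  # sample covariance (ddof=1)
    return mean, cov


def score_patient(x, mean, cov, ridge=1e-6):
    """Score one patient profile against the control model.

    Returns a Hotelling-type T2 statistic (overall deviation) and
    per-metabolite z-scores (which metabolites deviate).
    The small ridge term is an assumption added for numerical stability.
    """
    diff = x - mean
    cov_reg = cov + ridge * np.eye(cov.shape[0])
    t2 = float(diff @ np.linalg.solve(cov_reg, diff))
    z = diff / np.sqrt(np.diag(cov))
    return t2, z


# Synthetic illustration only: 200 controls, 10 metabolites.
rng = np.random.default_rng(0)
controls = rng.normal(size=(200, 10))
mean, cov = fit_controls(controls)

# Hypothetical patient: identical to the control mean except for
# metabolites 2 and 7, which are shifted by six control standard deviations.
patient = mean.copy()
std = controls.std(axis=0, ddof=1)
patient[[2, 7]] += 6.0 * std[[2, 7]]

t2, z = score_patient(patient, mean, cov)
flagged = np.flatnonzero(np.abs(z) > 3.0)  # illustrative threshold
print("T2 statistic:", t2)
print("Flagged metabolite indices:", flagged.tolist())
```

By construction, only the two shifted metabolites exceed the threshold in this toy example. With real metabolomics data, correlated metabolites and small control groups make both the overall statistic and the attribution to individual metabolites considerably harder, which is the problem the compared approaches address.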