MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Garscube Estate, Glasgow G61 1QH, UK.
Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, Sir Alexander Fleming Building, Exhibition Road, South Kensington, London SW7 2AZ, UK.
Methods. 2018 Dec 1;151:12-20. doi: 10.1016/j.ymeth.2018.02.004. Epub 2018 Feb 10.
Metabolic phenotyping technologies based on Nuclear Magnetic Resonance (NMR) spectroscopy and Mass Spectrometry (MS) generate vast amounts of unrefined data from biological samples. Clustering strategies are frequently employed to provide insight into patterns of relationships between samples and metabolites. Here, we propose a bi-clustering strategy driven by non-negative matrix factorization for metabolic phenotyping data, in order to discover subsets of interrelated metabolites that exhibit similar behaviour across subsets of samples. The proposed strategy incorporates bi-cross validation and statistical segmentation techniques to automatically determine the number and structure of bi-clusters. This approach contrasts with the widely used conventional clustering approaches, which incorporate all molecular peaks for clustering in metabolic studies and require a priori specification of the number of clusters. We perform a comparative analysis of the proposed strategy against other bi-clustering approaches that were developed in the context of genomics and transcriptomics research. We demonstrate the superior performance of the proposed bi-clustering strategy on both simulated (NMR) and real (MS) bacterial metabolic data.
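To make the idea of factorization-driven bi-clustering concrete, the following is a minimal sketch in Python with NumPy and is not the authors' implementation. It factorizes a non-negative samples-by-metabolites matrix with the standard Lee-Seung multiplicative updates, then reads one bi-cluster off each factor. Two simplifications are assumptions made purely for illustration: the number of factors (rank) is fixed by hand rather than chosen by bi-cross validation, and a simple relative threshold (the hypothetical parameter frac) stands in for the statistical segmentation step. The function names nmf and biclusters and the toy data are likewise hypothetical.

```python
import numpy as np


def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate non-negative X (samples x metabolites) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    The rank is supplied by the caller (assumption: no bi-cross validation here).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def biclusters(W, H, frac=0.5):
    """Read one bi-cluster per factor by thresholding its loadings.

    Assumption: a sample (or metabolite) belongs to factor k if its loading
    exceeds frac times the largest loading on that factor. This is a crude
    placeholder for a statistical segmentation of the loadings.
    """
    result = []
    for k in range(W.shape[1]):
        rows = np.flatnonzero(W[:, k] > frac * W[:, k].max())
        cols = np.flatnonzero(H[k] > frac * H[k].max())
        result.append((rows, cols))
    return result


# Toy data (hypothetical): 20 samples x 30 metabolites with low background
# noise and two planted blocks of co-varying metabolites.
rng = np.random.default_rng(1)
X = 0.01 * rng.random((20, 30))
X[:8, :10] += 1.0      # block 1: samples 0-7, metabolites 0-9
X[12:, 20:] += 1.0     # block 2: samples 12-19, metabolites 20-29

W, H = nmf(X, rank=2)
found = biclusters(W, H)
for rows, cols in found:
    print("samples:", rows.tolist(), "metabolites:", cols.tolist())
```

Unlike conventional clustering, each bi-cluster here involves only a subset of samples and a subset of metabolites, and rows or columns that load weakly on every factor are left unassigned. On real NMR or MS data the fixed rank and the threshold would need to be replaced by data-driven choices such as those described in the abstract.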