Metabolomics data exploration guided by prior knowledge.

Authors

van den Berg Robert A, Rubingh Carina M, Westerhuis Johan A, van der Werf Mariët J, Smilde Age K

Affiliation

TNO Quality of Life, P.O. Box 360, 3700 AJ Zeist, The Netherlands.

Publication information

Anal Chim Acta. 2009 Oct 5;651(2):173-81. doi: 10.1016/j.aca.2009.08.029. Epub 2009 Aug 25.

Abstract

In metabolomics research, it is often important to focus the data analysis on specific areas of interest within the metabolome. In this paper, we describe the application of consensus principal component analysis (CPCA) and canonical correlation analysis (CCA) as a means to explore the relation between metabolome data and (i) biochemically related metabolites and (ii) an amino acid biosynthesis pathway. CPCA searches for major trends in the behavior of metabolite concentrations that are common to the metabolites of interest and the remainder of the metabolome. CCA identifies the strongest correlations between the metabolites of interest and the remainder of the metabolome. CPCA and CCA were applied to two different microbial metabolomics data sets. The first data set, derived from Pseudomonas putida S12, was relatively simple, as it contained metabolomes obtained under only four environmental conditions. The second data set, obtained from Escherichia coli, was much more complex, as it consisted of metabolomes obtained under 28 different environmental conditions. For the simple and coherent P. putida S12 data set, CCA and CPCA gave similar results because the variation in the subset of selected metabolites and in the remainder of the metabolome was similar. In contrast, CCA and CPCA yielded different results for the E. coli data set. With CPCA, the trends in the selected subset (the phenylalanine biosynthesis pathway) dominated the results. The main trends were related to high and low phenylalanine productivity; the metabolites showing similar concentration behavior were, in the subset, metabolites regulating the phenylalanine biosynthesis route and, in the remainder of the metabolome, metabolites related to general amino acid metabolism. With CCA, neither subset truly dominated the data analysis. CCA described the differences between the wild type and the overproducing strain and the differences between succinate-grown and glucose-grown cells. For the difference between the wild type and the overproducing strain, metabolites from the beginning and the end of the aromatic amino acid pathways, such as erythrose-4-phosphate, tryptophan, and phenylalanine, were important among the selected metabolites. CCA and CPCA proved to be complementary data analysis tools that make it possible to focus the data analysis on groups of metabolites that are of specific interest in relation to the remainder of the metabolome. Compared with an ordinary PCA, focusing the data analysis on biologically relevant metabolites led to a better biological interpretation of the data, especially for the complex E. coli data.
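
The contrast between the two methods can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it runs on synthetic data, not on the P. putida or E. coli measurements. It assumes one common formulation of CPCA (an ordinary PCA on the concatenated blocks after scaling each block to equal total variance) and a plain, unregularized CCA computed via QR and SVD. The function names, block sizes, and loadings are hypothetical. The plain CCA shown here also assumes more samples than variables in each block; real metabolomics data often have more metabolites than samples, in which case some dimension reduction or regularization would be needed first.

```python
import numpy as np

def block_scale(block):
    """Mean-center the columns and scale the block to unit total sum of squares."""
    centered = block - block.mean(axis=0)
    return centered / np.linalg.norm(centered)

def cpca_super_scores(subset, remainder, n_components=2):
    """CPCA-style analysis (assumed formulation): PCA on concatenated, block-scaled data."""
    combined = np.hstack([block_scale(subset), block_scale(remainder)])
    u, s, vt = np.linalg.svd(combined, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # common trends per sample
    loadings = vt[:n_components].T                    # one row per metabolite
    return scores, loadings

def cca(subset, remainder):
    """Plain CCA via QR + SVD; needs more samples than variables in each block."""
    qx, _ = np.linalg.qr(subset - subset.mean(axis=0))
    qy, _ = np.linalg.qr(remainder - remainder.mean(axis=0))
    u, s, vt = np.linalg.svd(qx.T @ qy, full_matrices=False)
    # Singular values are the canonical correlations; clip guards against round-off.
    return np.clip(s, 0.0, 1.0), qx @ u, qy @ vt.T

# Synthetic example: one shared latent trend drives both blocks (hypothetical values).
rng = np.random.default_rng(0)
n_samples = 40
latent = rng.normal(size=(n_samples, 1))
subset_loadings = np.array([[1.0, -0.8, 0.5]])
remainder_loadings = np.array([[0.9, 0.7, -0.6, 0.4, 1.1, -1.0]])
subset = latent @ subset_loadings + 0.1 * rng.normal(size=(n_samples, 3))
remainder = latent @ remainder_loadings + 0.1 * rng.normal(size=(n_samples, 6))

scores, loadings = cpca_super_scores(subset, remainder)
canon_corr, subset_variates, remainder_variates = cca(subset, remainder)

print("CPCA scores shape:", scores.shape)
print("Canonical correlations:", canon_corr)
```

In this sketch, CPCA ranks directions by how much variance they explain across both blocks, so a block with strong internal trends can dominate the result. CCA ranks directions only by how strongly the two blocks correlate, regardless of how much variance those directions carry. This is consistent with the abstract's observation that the two methods agree on a coherent data set and diverge on a more complex one.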

