Kuligowski Julia, Pérez-Guaita David, Sánchez-Illana Ángel, León-González Zacarías, de la Guardia Miguel, Vento Máximo, Lock Eric F, Quintás Guillermo
Neonatal Research Centre, Health Research Institute La Fe, Valencia, Spain.
Analyst. 2015 Jul 7;140(13):4521-9. doi: 10.1039/c5an00706b.
Metabolic profiling is increasingly used to understand biological processes, but no single analytical technique provides a complete quantitative or qualitative profile of the metabolome. Data fusion (i.e. the joint analysis of data from multiple sources) has the potential to circumvent this limitation, facilitating knowledge discovery and reliable biomarker identification. Another application of data fusion is the simultaneous analysis of metabolomic changes across several biofluids or tissues. However, metabolomics typically deals with large datasets containing hundreds to thousands of variables, and identifying shared and individual factors or structures across multiple sources is challenging because of high variable-to-sample ratios and differences in intensity and noise range. In this work we apply a recent method, Joint and Individual Variation Explained (JIVE), to the integrated unsupervised analysis of metabolomic profiles from multiple data sources. The method separates the patterns shared among data sources (i.e. the joint structure) from the individual structure of each data source that is unrelated to the joint structure. Two examples illustrate the applicability of JIVE to the simultaneous analysis of multi-source data: (i) plasma samples subjected to different analytical techniques, sample treatments and measurement conditions; and (ii) plasma and urine samples measured by liquid chromatography-mass spectrometry under two ionization conditions.
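The separation into joint and individual structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each data block is a variables-by-samples matrix measured on the same samples, models each block as joint + individual + residual, and alternates truncated SVDs while keeping each individual part orthogonal to the joint sample space. The function names (`low_rank`, `jive_sketch`) are hypothetical, and the ranks are supplied by hand here rather than estimated from the data.

```python
import numpy as np

def low_rank(M, r):
    """Best rank-r approximation of M and its first r right singular vectors."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r], Vt[:r]

def jive_sketch(blocks, joint_rank, indiv_ranks, n_iter=200, tol=1e-8):
    """Simplified JIVE-style decomposition (illustrative assumption, fixed ranks).

    blocks: list of arrays, each (variables x samples) with identical columns.
    Returns the scaled blocks, joint parts J_i and individual parts A_i.
    """
    # Row-centre each block and scale it to unit Frobenius norm so that
    # blocks with different intensity ranges contribute comparably.
    scaled = []
    for X in blocks:
        Xc = X - X.mean(axis=1, keepdims=True)
        scaled.append(Xc / np.linalg.norm(Xc))

    X_all = np.vstack(scaled)
    n = X_all.shape[1]
    splits = np.cumsum([X.shape[0] for X in scaled])[:-1]

    A = [np.zeros_like(X) for X in scaled]
    J_prev = np.zeros_like(X_all)
    for _ in range(n_iter):
        # Joint step: low-rank fit to the stacked data minus individual parts.
        J_all, Vt = low_rank(X_all - np.vstack(A), joint_rank)
        J = np.split(J_all, splits, axis=0)
        # Individual step: fit each block's remainder after projecting out
        # the joint sample space, so A_i is orthogonal to J_i by construction.
        P = np.eye(n) - Vt.T @ Vt
        A = [low_rank((X - Ji) @ P, r)[0]
             for X, Ji, r in zip(scaled, J, indiv_ranks)]
        if np.linalg.norm(J_all - J_prev) < tol:
            break
        J_prev = J_all
    return scaled, J, A

if __name__ == "__main__":
    # Synthetic two-block example: one shared and one block-specific pattern each.
    rng = np.random.default_rng(0)
    n = 25
    shared = rng.normal(size=(1, n))
    data = []
    for p in (40, 60):
        own = rng.normal(size=(1, n))
        data.append(rng.normal(size=(p, 1)) @ shared
                    + rng.normal(size=(p, 1)) @ own
                    + 0.05 * rng.normal(size=(p, n)))
    X, J, A = jive_sketch(data, joint_rank=1, indiv_ranks=[1, 1])
    for i, (Xi, Ji, Ai) in enumerate(zip(X, J, A)):
        total = np.linalg.norm(Xi) ** 2
        print(f"block {i}: joint {np.linalg.norm(Ji)**2 / total:.2f}, "
              f"individual {np.linalg.norm(Ai)**2 / total:.2f}")
```

The printed fractions give the share of each block's variation assigned to the joint and individual parts. Scaling every block to unit Frobenius norm is one simple way to handle the differences in intensity range mentioned above; other normalisations are possible.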