Institut de Mathématiques - Université de Toulouse III et CNRS, UMR 5219, F-31062 Toulouse, France.
BioData Min. 2012 Nov 13;5(1):19. doi: 10.1186/1756-0381-5-19.
Each omics platform can now generate large amounts of data. Genomics, proteomics, metabolomics and interactomics data are compiled at an ever-increasing pace and now form a core part of the fundamental systems biology framework. Recently, several integrative approaches have been proposed to extract meaningful information. However, these approaches lack the visualisation outputs needed to fully unravel the complex associations between different biological entities.
The multivariate statistical approaches 'regularized Canonical Correlation Analysis' and 'sparse Partial Least Squares regression' were recently developed to integrate two types of high-dimensional 'omics' data and to select relevant information. Using the results of these methods, we propose to revisit a few graphical outputs to better understand the relationships between two 'omics' data sets and to better visualise the correlation structure between the different biological entities. These graphical outputs include Correlation Circle plots, Relevance Networks and Clustered Image Maps. We demonstrate the usefulness of such graphical outputs on several biological data sets and further assess their biological relevance using gene ontology analysis.
Such graphical outputs are undoubtedly useful to aid the interpretation of these promising integrative analysis tools and will certainly help in addressing fundamental biological questions and understanding systems as a whole.
The graphical tools described in this paper are implemented in the freely available R package mixOmics and in its associated web application.
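As a rough illustration of the idea behind a Correlation Circle plot, the sketch below places each variable at the point given by its Pearson correlations with two component scores. It is a minimal sketch in plain Python, not the mixOmics implementation: the component scores, the variable names (`gene_A`, `metab_B`, `gene_C`) and the helper functions are all hypothetical, and the components are assumed to be centred and uncorrelated, in which case every point falls inside the unit circle.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def circle_coordinates(variables, comp1, comp2):
    """Map each variable name to (cor with comp1, cor with comp2)."""
    return {
        name: (pearson(values, comp1), pearson(values, comp2))
        for name, values in variables.items()
    }


# Hypothetical component scores for six samples (centred, uncorrelated).
comp1 = [-2.0, -1.0, 0.0, 0.0, 1.0, 2.0]
comp2 = [1.0, -1.0, -1.0, 1.0, -1.0, 1.0]

# Hypothetical variables from two 'omics' data sets measured on the same samples.
variables = {
    "gene_A": [-4.0, -2.0, 0.0, 0.0, 2.0, 4.0],    # tracks component 1
    "metab_B": [2.0, 1.5, 0.5, -0.5, -1.0, -2.5],  # opposes component 1
    "gene_C": [8.0, 2.0, 2.0, 8.0, 2.0, 8.0],      # tracks component 2
}

coords = circle_coordinates(variables, comp1, comp2)
for name, (x, y) in coords.items():
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

In such a plot, variables that lie close to the circle and point in the same direction are positively correlated, variables on opposite sides are negatively correlated, and variables near the origin are poorly represented by the two chosen components.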