Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Krakow, Poland.
Department of Medical Physics, Jagiellonian University, Marian Smoluchowski Institute of Physics, Krakow, Poland.
PLoS One. 2020 Jul 29;15(7):e0235398. doi: 10.1371/journal.pone.0235398. eCollection 2020.
Large amounts of fragmented biological data are collected in various databases, and the need to describe the relationships among them by theoretical methods drives the development of data integration methods. This paper presents an approach to omics data analysis that integrates biological knowledge with mathematical procedures, as implemented in the OmicsON R library. OmicsON is a tool for integrating two types of data: transcriptomic and metabolomic. The library's workflow applies functional grouping followed by statistical analysis. Subgroups within the transcriptomic and metabolomic sets are created based on the biological knowledge stored in the Reactome and STRING databases. This grouping makes it possible to analyze such data sets with multivariate statistical procedures such as Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). Integrating metabolomic and transcriptomic data with the methodology contained in OmicsON makes it easy to obtain information on the connections between the two data sets. This information can significantly help in assessing the relationship between gene expression and metabolite concentrations, which in turn facilitates the biological interpretation of the analyzed process.
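As an illustration only, the functional-grouping step described above can be sketched in Python. OmicsON itself is an R library and this is not its API. The pathway table, the gene and metabolite names, and all values are hypothetical stand-ins for a Reactome lookup, and a pairwise Pearson correlation is used here as a simple placeholder for the CCA or PLS step.

```python
from math import sqrt

# Hypothetical pathway membership, standing in for a Reactome query.
PATHWAYS = {
    "fatty acid metabolism": {"genes": {"ACADM", "CPT1A"},
                              "metabolites": {"palmitate"}},
    "glycolysis": {"genes": {"HK1", "PKM"},
                   "metabolites": {"pyruvate", "lactate"}},
}

# Toy measurements (made-up values): one list entry per sample.
expression = {"HK1": [1, 2, 3, 4], "PKM": [2, 1, 4, 3], "TP53": [5, 5, 6, 7]}
concentrations = {"pyruvate": [2, 4, 6, 8], "palmitate": [1, 0, 1, 0]}


def group_by_pathway(expression, concentrations, pathways):
    """Return {pathway: (gene_subset, metabolite_subset)}.

    Only measured features are kept, and a pathway is dropped when
    either its gene subset or its metabolite subset is empty.
    """
    groups = {}
    for name, members in pathways.items():
        genes = {g: expression[g]
                 for g in sorted(members["genes"]) if g in expression}
        mets = {m: concentrations[m]
                for m in sorted(members["metabolites"]) if m in concentrations}
        if genes and mets:
            groups[name] = (genes, mets)
    return groups


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


groups = group_by_pathway(expression, concentrations, PATHWAYS)

# Within each functional group, relate every gene to every metabolite.
for pathway, (genes, mets) in groups.items():
    for gene, expr in genes.items():
        for met, conc in mets.items():
            print(pathway, gene, met, round(pearson(expr, conc), 3))
```

In this toy input only the glycolysis group survives, because none of the fatty acid metabolism genes were measured. In a real analysis, each group's gene matrix and metabolite matrix would be passed to a multivariate method such as CCA or PLS instead of being compared pair by pair.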