Duruflé Harold, Déjean Sébastien
INRAE, ONF, BioForA, UMR 0588, Orléans, France.
Institut de Mathématiques de Toulouse, Université de Toulouse, CNRS, UPS, UMR 5219, Toulouse, France.
Methods Mol Biol. 2023;2642:295-318. doi: 10.1007/978-1-0716-3044-0_16.
High-throughput data generated by new biotechnologies can help answer new biological questions, but they require specific, adapted statistical treatment. In abiotic stress signaling studies, understanding how cascading mechanisms are integrated, from stress perception to biochemical and physiological adjustments, requires efficient and valid analysis of multilevel, heterogeneous data. In this chapter, we present examples of how to manage, analyze, and integrate heterogeneous multi-omics data. The workflow is organized around a series of general biological questions, each addressed with detailed code, data analysis, and multiple visualizations, and each followed by a brief interpretation. We illustrate it with the mixOmics package for R, which specifically provides tools for vertical and horizontal data integration. The example data are the omics datasets biologists commonly generate (phenomics, metabolomics, proteomics, and transcriptomics), collected from two organs (leaf rosettes and floral stems) of five ecotypes of the model plant Arabidopsis thaliana grown under two temperature conditions. They are available in the R package WallOmicsData. The workflow is not limited to Arabidopsis thaliana: it can be applied to any plant species and, more broadly, to other organisms and biological questions.