Iperi Cristian, Fernández-Ochoa Álvaro, Barturen Guillermo, Pers Jacques-Olivier, Foulquier Nathan, Bettacchioli Eleonore, Alarcón-Riquelme Marta, Cornec Divi, Bordron Anne, Jamin Christophe
LBAI, UMR1227, Univ Brest, Inserm, Brest, France.
Department of Analytical Chemistry, University of Granada, Granada, Spain.
BMC Bioinformatics. 2025 Jan 10;26(1):8. doi: 10.1186/s12859-024-06022-y.
Understanding changes in biological systems requires interpreting vast amounts of multi-omics data. While user-friendly tools exist for single-omics analysis, integrating multiple omics still requires bioinformatics expertise, limiting accessibility for the broader scientific community.
BiomiX tackles the bottleneck in high-throughput omics data analysis, enabling efficient, integrated analysis of multi-omics data obtained from two cohorts. BiomiX incorporates diverse omics data: transcriptomics is analyzed with the DESeq2/Limma packages, and metabolomics peak differences are quantified and evaluated via the Wilcoxon test with False Discovery Rate correction. Annotation of untargeted Liquid Chromatography-Mass Spectrometry metabolomics is additionally supported using the mass-to-charge ratio in the CEU Mass Mediator database and fragmentation spectra in the TidyMass package. Methylomics analysis is performed using the ChAMP R package. Finally, Multi-Omics Factor Analysis (MOFA) integration identifies shared sources of variation across omics data. BiomiX also generates statistics and report figures, and integrates EnrichR and GSEA for biological process exploration, as well as subgroup analysis based on user-defined gene panels to enhance condition subtyping. BiomiX fine-tunes MOFA models to optimize the number of factors for distinguishing between cohorts and provides tools to interpret the discriminative MOFA factors. The interpretation relies on an innovative literature search on PubMed, which retrieves the articles most related to the contributors to each discriminant factor. Furthermore, discriminant MOFA factors are correlated with clinical data, and the top contributing pathways are explored, all with the aim of guiding the user in factor interpretation.
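The metabolomics step described above (a per-peak Wilcoxon test followed by False Discovery Rate correction) can be illustrated with a minimal sketch. This is not BiomiX code, which is R-based. It assumes the two-cohort test is the Wilcoxon rank-sum form, uses a normal approximation with tie correction, and assumes the Benjamini-Hochberg procedure for the FDR step. The function names and the `peaks` input layout are hypothetical.

```python
import math
from statistics import NormalDist


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differential_peaks(peaks):
    """peaks: {peak_name: (cohort_a_values, cohort_b_values)} (hypothetical layout)."""
    names = list(peaks)
    raw = [rank_sum_pvalue(*peaks[name]) for name in names]
    return dict(zip(names, benjamini_hochberg(raw)))
```

Calling `differential_peaks` on a dictionary of peak intensities per cohort returns one adjusted p-value per peak. The normal approximation is only reasonable for moderately sized cohorts, so an exact test may be preferable for small samples.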
The analysis of single-omics and multi-omics integration in a standalone tool, along with MOFA implementation and its interpretability via literature, represents significant progress in the multi-omics field in line with the "Findable, Accessible, Interoperable, and Reusable" data principles. BiomiX offers a wide range of parameters and interactive data visualization, allowing for personalized analysis tailored to user needs. This R-based, user-friendly tool is compatible with multiple operating systems and aims to make multi-omics analysis accessible to non-experts in bioinformatics.