Singh Amrit, Shannon Casey P, Gautier Benoît, Rohart Florian, Vacher Michaël, Tebbutt Scott J, Lê Cao Kim-Anh
Prevention of Organ Failure (PROOF) Centre of Excellence, University of British Columbia, Vancouver, BC, Canada.
The University of Queensland Diamantina Institute, Translational Research Institute, Woolloongabba, Queensland, Australia.
Bioinformatics. 2019 Sep 1;35(17):3055-3062. doi: 10.1093/bioinformatics/bty1054.
In the continuously expanding omics era, novel computational and statistical strategies are needed for data integration and for the identification of biomarkers and molecular signatures. We present Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO), a multi-omics integrative method that seeks common information across different data types by selecting a subset of molecular features while discriminating between multiple phenotypic groups.
Using simulations and benchmark multi-omics studies, we show that DIABLO identifies features with superior biological relevance compared with existing unsupervised integrative methods, while achieving predictive performance comparable to state-of-the-art supervised approaches. DIABLO is versatile, allowing for modular-based analyses and cross-over study designs. In two case studies, DIABLO identified both known and novel multi-omics biomarkers consisting of mRNAs, miRNAs, CpGs, proteins and metabolites.
DIABLO is implemented in the mixOmics R Bioconductor package, with functions for parameter selection and visualization that assist in interpreting the integrative analyses. Tutorials are available at http://mixomics.org and in our Bioconductor vignette.
Supplementary data are available at Bioinformatics online.