Zhang Xiaoyu, Zhou Zhenwei, Xu Hanfei, Liu Ching-Ti
Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA.
Wiley Interdiscip Rev Comput Stat. 2022 May-Jun;14(3). doi: 10.1002/wics.1553. Epub 2021 Feb 7.
Integrative analysis of multi-omics data has drawn much attention from the scientific community, as technological advances have generated a wide variety of omics data. Leveraging these multi-omics data can potentially provide a more comprehensive view of disease mechanisms and biological processes. Integrative multi-omics clustering is an unsupervised integrative approach for finding coherent groups of samples or features by using information across multiple omics data types. It aims to better stratify diseases and to suggest biological mechanisms and potential targeted therapies. However, applying integrative multi-omics clustering is both statistically and computationally challenging, for reasons such as high dimensionality and heterogeneity. In this review, we summarize integrative multi-omics clustering methods into three general categories, based on when and how the multi-omics data are processed for clustering. We further classify the methods in each category into different approaches based on the main statistical strategy used during clustering. In addition, we provide recommended practices tailored to four real-life scenarios to help researchers select integrative multi-omics clustering methods for their future studies.