Chen Wengxiang, Qiu Hang
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China.
Comput Methods Programs Biomed. 2025 Nov;271:109024. doi: 10.1016/j.cmpb.2025.109024. Epub 2025 Aug 13.
Integrating multi-omics data facilitates a comprehensive understanding of the etiology of complex diseases, which is critical for achieving precision medicine. Recently, graph-based approaches have been increasingly used in integrative multi-omics data analysis because of their strong expressive capability. However, these methods still face two limitations: 1) they rely predominantly on a fixed sample similarity graph (SSG) to obtain omics-specific feature representations, and 2) they insufficiently explore the interrelations between features from different omics. To address these limitations, we propose MGDMCL, a framework for integrating multi-omics data based on masked graph dynamic learning and multi-granularity feature contrastive learning.
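To make the first limitation concrete, the following is a minimal sketch of how a fixed SSG might be built for one omics type. The abstract does not specify the construction, so the cosine similarity measure, the k-nearest-neighbour rule, and the names `cosine` and `build_ssg` are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_ssg(features, k):
    """Hypothetical fixed SSG: link each sample to its k most similar samples.

    Returns a symmetric 0/1 adjacency matrix as a list of lists.
    """
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        # Highest similarity first; ties are broken by the lower sample index.
        sims.sort(key=lambda pair: (-pair[0], pair[1]))
        for _, j in sims[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # keep the graph undirected
    return adj


# Toy data: four samples with two features each (made-up values).
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
adjacency = build_ssg(samples, k=1)
```

A graph built this way is computed once from the raw features and never revised during training, which is the rigidity that a dynamically learned SSG is meant to avoid.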
For each type of omics data, a masked graph dynamic learning approach adaptively adjusts the SSG structure to learn a reliable SSG, and obtains multi-layer feature representations from the different graph convolutional network (GCN) layers. The multi-layer feature representations of the different omics are then concatenated at the layer level, and multi-granularity feature contrastive learning is designed to learn layer-specific consensus feature representations. Furthermore, to enhance classification robustness, the true class probability is introduced to evaluate the classification confidence of the consensus feature representations from different layers.
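The true class probability mentioned above is the softmax probability that a classifier assigns to the correct label. The sketch below computes it for the predictions of two layers and then, as a purely hypothetical illustration, uses those confidences to weight a fusion of the layer predictions. The abstract does not say how MGDMCL combines the layers, so `confidence_weighted_fusion` and the toy logits are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def true_class_probability(logits, true_label):
    """Softmax probability assigned to the ground-truth class."""
    return softmax(logits)[true_label]


def confidence_weighted_fusion(layer_logits, confidences):
    """Hypothetical fusion: average layer probabilities, weighted by confidence."""
    total = sum(confidences)
    n_classes = len(layer_logits[0])
    fused = [0.0] * n_classes
    for logits, conf in zip(layer_logits, confidences):
        probs = softmax(logits)
        for c in range(n_classes):
            fused[c] += (conf / total) * probs[c]
    return fused


# Toy logits from two layers for one sample whose true label is class 0.
layer_logits = [[2.0, 0.0], [0.0, 1.0]]
true_label = 0
tcp = [true_class_probability(logits, true_label) for logits in layer_logits]
fused = confidence_weighted_fusion(layer_logits, tcp)
```

Because the true class probability depends on the ground-truth label, it can be computed directly only for labelled samples; how the confidence is obtained for unlabelled samples is not described in the abstract.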
Extensive experiments on five public datasets, including LGG, ROSMAP, LUSC, BRCA, and KIPAN, show that MGDMCL significantly surpasses state-of-the-art baselines in various biomedical classification tasks.
The proposed MGDMCL provides a more effective approach for integrative multi-omics data analysis, exhibiting great potential in biomedical classification applications. The implementation code of MGDMCL has been released at https://www.github.com/wxchen-uestc/MGDMCL.