Peng Cheng Laboratory, Shenzhen, 518055, China; School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China.
Peng Cheng Laboratory, Shenzhen, 518055, China.
Comput Biol Med. 2023 Sep;164:107303. doi: 10.1016/j.compbiomed.2023.107303. Epub 2023 Aug 2.
With the rapid development of high-throughput sequencing technology and the accumulation of omics data, many studies have sought a more comprehensive understanding of human diseases from a multi-omics perspective. Meanwhile, graph-based methods have been widely used to process multi-omics data because of their strong expressive power. However, most existing graph-based methods learn sample embeddings on fixed graphs, which often leads to sub-optimal results. Furthermore, treating the embeddings of different omics equally tends to yield less informative integrated representations, and the complex correlations between omics are not fully taken into account. To address these issues, we propose MOGLAM, an end-to-end interpretable multi-omics integration method for disease classification. First, a dynamic graph convolutional network with feature selection obtains higher-quality omic-specific embeddings by adaptively learning the graph structure, and it also discovers important biomarkers. Then, a multi-omics attention mechanism adaptively weights the embeddings of the different omics, yielding a more informative integrated representation. Finally, we propose omic-integrated representation learning to capture the complex common and complementary information between omics while performing multi-omics integration. Experimental results on three datasets show that MOGLAM outperforms other state-of-the-art multi-omics integration methods. Moreover, MOGLAM can identify important biomarkers from different omics data types in an end-to-end manner.
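The abstract does not give the form of the multi-omics attention mechanism. As a rough illustration only, the sketch below shows one common way to weight omic-specific embeddings adaptively: each omics embedding is summarized per sample, a small two-layer scoring network produces one score per omics, and a softmax turns the scores into weights. The function name `attention_weight_omics`, the parameters `w1` and `w2`, and the mean-pooling summary are all assumptions made for this example, not details taken from MOGLAM.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def attention_weight_omics(embeddings, w1, w2):
    """Weight K omic-specific embeddings per sample (hypothetical sketch).

    embeddings: list of K arrays, each of shape (n_samples, d)
    w1: array of shape (K, h), first scoring layer (assumed, normally learned)
    w2: array of shape (h, K), second scoring layer (assumed, normally learned)
    Returns (weighted, weights) with shapes (n_samples, K, d) and (n_samples, K).
    """
    stacked = np.stack(embeddings, axis=1)      # (n, K, d)
    summary = stacked.mean(axis=2)              # (n, K): one scalar per omics per sample
    hidden = np.maximum(summary @ w1, 0.0)      # (n, h): ReLU hidden layer
    scores = hidden @ w2                        # (n, K): one score per omics
    weights = softmax(scores, axis=1)           # weights sum to 1 across omics
    weighted = stacked * weights[:, :, None]    # rescale each omics embedding
    return weighted, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three toy omics embeddings for five samples, four dimensions each.
    embs = [rng.normal(size=(5, 4)) for _ in range(3)]
    w1 = rng.normal(size=(3, 8))
    w2 = rng.normal(size=(8, 3))
    weighted, weights = attention_weight_omics(embs, w1, w2)
    print(weights.sum(axis=1))  # each row of weights sums to 1
```

In a trained model `w1` and `w2` would be learned jointly with the rest of the network; here they are random placeholders so the example runs on its own. The weighted embeddings could then be concatenated or otherwise combined before classification, which is again a design choice the abstract leaves open.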