School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China.
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China.
Bioinformatics. 2021 Dec 11;37(24):4779-4786. doi: 10.1093/bioinformatics/btab535.
MOTIVATION: Cancer subtype identification aims to divide cancer patients into subgroups with distinct clinical phenotypes and to facilitate the development of subgroup-specific therapies. The massive amount of multi-omics data accumulated in public databases has provided unprecedented opportunities to fulfill this task. As a result, considerable computational effort has been devoted to accurately identifying cancer subtypes via integrative analysis of these multi-omics datasets.
RESULTS: In this article, we propose a Consensus Guided Graph Autoencoder (CGGA) to identify cancer subtypes. First, we learn a new feature matrix for each omic by using graph autoencoders, so that both structure information and node features are incorporated during learning. Second, we learn a set of omic-specific similarity matrices, together with a consensus matrix, based on the features obtained in the first step. The learned omic-specific similarity matrices are then fed back to the graph autoencoders to guide the feature learning. By iterating these two steps, our method obtains a final consensus similarity matrix for cancer subtyping. To comprehensively evaluate the prediction performance of our method, we compare CGGA with several approaches ranging from general-purpose multi-view clustering algorithms to multi-omics-specific integrative methods. The experimental results on both generic datasets and cancer datasets confirm the superiority of our method. Moreover, we validate the effectiveness of our method in leveraging multi-omics datasets to identify cancer subtypes. In addition, we investigate the clinical implications of the obtained clusters for glioblastoma and provide new insights into the treatment of patients with different subtypes.
AVAILABILITY AND IMPLEMENTATION: The source code of our method is freely available at https://github.com/alcs417/CGGA.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
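The two-step iteration described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that). It rests on several simplifying assumptions: a single untrained graph-propagation step stands in for the graph autoencoder, a k-nearest-neighbour Gaussian kernel stands in for the learned omic-specific similarity matrices, and a plain average stands in for the jointly learned consensus matrix. The function names and the parameters `k`, `alpha` and `n_iter` are hypothetical.

```python
import numpy as np


def knn_similarity(features, k=5):
    """Symmetric k-nearest-neighbour Gaussian similarity (assumed stand-in
    for a learned omic-specific similarity matrix)."""
    sq = np.sum(features ** 2, axis=1)
    dist = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    sigma = np.median(dist) + 1e-12          # kernel width taken from the data
    sim = np.exp(-dist / sigma)
    np.fill_diagonal(sim, 0.0)               # no self-similarity
    idx = np.argsort(-sim, axis=1)[:, :k]    # k strongest neighbours per sample
    mask = np.zeros(sim.shape, dtype=bool)
    mask[np.arange(sim.shape[0])[:, None], idx] = True
    return np.where(mask | mask.T, sim, 0.0)  # keep an edge if either end chose it


def propagate(features, sim):
    """One normalized propagation step, D^-1/2 (S + I) D^-1/2 X. An untrained
    stand-in for a graph-autoencoder encoder, used only for illustration."""
    adj = sim + np.eye(sim.shape[0])
    deg = adj.sum(axis=1)
    return (adj / np.sqrt(np.outer(deg, deg))) @ features


def consensus_similarity(views, k=5, alpha=0.5, n_iter=3):
    """Alternate between feature learning and similarity learning.

    views: list of (n_samples, n_features_v) arrays, one per omic.
    Returns an (n_samples, n_samples) consensus similarity matrix.
    """
    sims = [knn_similarity(x, k) for x in views]
    consensus = sum(sims) / len(sims)
    for _ in range(n_iter):
        # Step 1: per-omic features from node features plus graph structure.
        feats = [propagate(x, s) for x, s in zip(views, sims)]
        # Step 2: per-omic similarities and their consensus (here: the mean).
        sims = [knn_similarity(f, k) for f in feats]
        consensus = sum(sims) / len(sims)
        # Feedback: pull each omic-specific graph toward the consensus
        # (hypothetical blend; the paper learns these matrices jointly).
        sims = [(1.0 - alpha) * s + alpha * consensus for s in sims]
    return consensus
```

The returned consensus matrix could then be passed to any similarity-based clustering method, such as spectral clustering, to assign patients to subtypes.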