Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342037, India.
School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, Rajasthan, 342037, India.
Sci Rep. 2022 Sep 17;12(1):15629. doi: 10.1038/s41598-022-17585-2.
Cancer subtype identification is one of the critical steps toward advancing personalized anti-cancer therapies. The accumulation of massive amounts of multi-platform omics data measured across the same set of samples provides an opportunity to examine this deadly disease from several views simultaneously. A few integrative clustering approaches have been developed to capture the information shared across all views and thereby identify cancer subtypes. However, they have certain limitations. The challenge is to identify the most relevant feature space in each omic view and to integrate these spaces systematically. Both steps should lead toward a global clustering solution with biological significance. To this end, this study presents a novel multi-omics clustering algorithm named RISynG (Recursive Integration of Synergised Graph-representations). RISynG represents each omic view by two representation matrices, a Gramian and a Laplacian. A parameterised combination function is defined to obtain a synergy matrix from these representation matrices. A recursive multi-kernel approach is then applied to integrate the most relevant, shared, and complementary information captured by the respective synergy matrices. Finally, clustering is applied to the integrated subspace. RISynG is benchmarked on five multi-omics cancer datasets taken from The Cancer Genome Atlas. The experimental results demonstrate the efficiency of RISynG over the other approaches in this domain.
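The pipeline described in the abstract (per-view Gramian and Laplacian, a parameterised combination into a synergy matrix, integration across views, clustering in the integrated subspace) can be illustrated with a minimal sketch. This is not the published RISynG implementation: the abstract does not specify the combination function or the recursion, so the sketch assumes a Gaussian similarity graph, a convex mix weighted by a hypothetical parameter `alpha`, a plain average over views in place of the recursive multi-kernel step, and a basic k-means. All function names are illustrative.

```python
import numpy as np

def gramian(X):
    """Linear-kernel Gramian of a (samples x features) view, features centred."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T

def normalized_laplacian(X, sigma=1.0):
    """Symmetric normalised Laplacian of a Gaussian similarity graph (assumed graph choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def synergy(X, alpha=0.5, sigma=1.0):
    """Hypothetical combination function: convex mix of the scaled Gramian
    and the scaled normalised affinity (identity minus Laplacian)."""
    G = gramian(X)
    A = np.eye(len(X)) - normalized_laplacian(X, sigma)
    G = G / max(np.linalg.norm(G), 1e-12)  # Frobenius scaling keeps both terms comparable
    A = A / max(np.linalg.norm(A), 1e-12)
    return alpha * G + (1.0 - alpha) * A

def kmeans(Z, k, n_iter=50):
    """Small Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def integrate_and_cluster(views, k, alpha=0.5, sigma=1.0):
    """Average the per-view synergy matrices (a simplification of the recursive
    integration), embed samples with the top-k eigenvectors, then cluster."""
    S = sum(synergy(X, alpha, sigma) for X in views) / len(views)
    S = (S + S.T) / 2.0               # guard against round-off asymmetry
    _, vecs = np.linalg.eigh(S)       # eigenvalues returned in ascending order
    return kmeans(vecs[:, -k:], k)

# Toy data: two synthetic "omic views" of the same 20 samples, two groups of 10.
rng = np.random.default_rng(0)
shift = np.r_[np.full(10, -3.0), np.full(10, 3.0)][:, None]
views = [shift + 0.5 * rng.standard_normal((20, 5)),
         shift + 0.5 * rng.standard_normal((20, 8))]
labels = integrate_and_cluster(views, k=2)
```

On real data the views would be preprocessed omics matrices sharing the same sample order, and `alpha`, `sigma`, and the number of clusters would need to be tuned; the values above are arbitrary choices for the toy example.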