School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore.
Sci Rep. 2022 Sep 14;12(1):15425. doi: 10.1038/s41598-022-19019-5.
Multi-omics data are increasingly being gathered for investigations of complex diseases such as cancer. However, the high dimensionality, small sample size, and heterogeneity of different omics types pose major challenges to integrated analysis. In this paper, we evaluate two network-based approaches for integrating multi-omics data, applied to clinical outcome prediction in neuroblastoma. As a first step, we derive a Patient Similarity Network (PSN) for each individual omics type by computing distances among patients from the omics features. The fusion of different omics can then be carried out in two ways: network-level fusion, which uses the Similarity Network Fusion algorithm to fuse the PSNs derived for the individual omics types; and feature-level fusion, which fuses the network features obtained from the individual PSNs. We demonstrate our methods on two high-risk neuroblastoma datasets, from the SEQC project and the TARGET project. We propose Deep Neural Network and Machine Learning methods with Recursive Feature Elimination as predictors of the survival status of neuroblastoma patients. Our results indicate that network-level fusion outperformed feature-level fusion for the integration of different omics data, whereas feature-level fusion is more suitable for incorporating different feature types derived from the same omics type. We conclude that network-based methods are capable of handling heterogeneity and high dimensionality well in the integration of multi-omics data.
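The two fusion routes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel and its median-distance bandwidth, the two node features (weighted degree and strongest edge), the synthetic data, and all function names are assumptions made for illustration. In particular, `network_level_fusion` uses a plain average of row-normalised networks as a simple stand-in; it is not the Similarity Network Fusion algorithm, which refines the networks iteratively.

```python
import numpy as np


def patient_similarity_network(X, sigma=None):
    """Build a dense patient-by-patient similarity matrix from a
    (patients x features) array. Assumption: Gaussian kernel on
    Euclidean distance, bandwidth set to the median pairwise distance."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives from round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if sigma is None:
        off_diagonal = d2[~np.eye(len(X), dtype=bool)]
        sigma = np.sqrt(np.median(off_diagonal))
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def network_level_fusion(networks):
    """Stand-in for SNF (NOT the SNF algorithm): average the
    row-normalised networks, then symmetrise the result."""
    normalised = [W / W.sum(axis=1, keepdims=True) for W in networks]
    fused = np.mean(normalised, axis=0)
    return (fused + fused.T) / 2.0


def network_features(W):
    """Hypothetical per-patient network features: weighted degree
    and strongest edge weight."""
    return np.column_stack([W.sum(axis=1), W.max(axis=1)])


def feature_level_fusion(networks):
    """Concatenate the network features extracted from each PSN."""
    return np.hstack([network_features(W) for W in networks])


# Synthetic data: 6 patients measured on two omics types (made-up sizes).
rng = np.random.default_rng(0)
omics_a = rng.normal(size=(6, 50))
omics_b = rng.normal(size=(6, 30))

psns = [patient_similarity_network(omics_a),
        patient_similarity_network(omics_b)]
fused_network = network_level_fusion(psns)    # one 6 x 6 network
fused_features = feature_level_fusion(psns)   # 6 patients x 4 features
```

In either route, the fused output (features derived from the fused network, or the concatenated feature matrix) would then be passed to a downstream classifier of survival status.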