Ren Yanjiao, Gao Yimeng, Du Wei, Qiao Weibo, Li Wei, Yang Qianqian, Liang Yanchun, Li Gaoyang
College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China.
College of Computer Science and Technology, Jilin University, Changchun, China.
Front Genet. 2024 Feb 20;15:1363896. doi: 10.3389/fgene.2024.1363896. eCollection 2024.
Cancer grading and subtyping serve as evaluation indices and are associated with diverse clinical, pathological, and molecular characteristics that carry prognostic and therapeutic implications. Although researchers have begun to study the prediction of cancer differentiation and subtypes, most of the relevant methods are based on traditional machine learning and rely on single-omics data. A deep learning algorithm that integrates multi-omics data is therefore needed to classify cancer differentiation and subtypes. This paper proposes a multi-omics data fusion algorithm based on a multi-view graph neural network (MVGNN) for predicting cancer differentiation and classifying subtypes. The model framework consists of a graph convolutional network (GCN) module, which learns features from the different omics data, and an attention module, which integrates the multi-omics data. Three different types of omics data are used. For each type, feature selection is performed using methods such as the chi-square test and minimum redundancy maximum relevance (mRMR). Weighted patient similarity networks are constructed from the selected omics features, and the GCN is trained on the omics features and their corresponding similarity networks. Finally, the attention module integrates the different types of omics features and performs the final cancer classification. To validate the predictive performance of the MVGNN model, we compared it experimentally, using 5-fold cross-validation, with traditional machine learning models and with currently popular methods that integrate multi-omics data. We also performed comparative experiments on cancer differentiation and subtype classification using one, two, and three types of omics data. The proposed MVGNN model performed well in cancer classification based on multi-omics data.
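The pipeline described in the abstract (one patient similarity network per omics type, a GCN per view, then attention-based fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the similarity measure, the network sparsification, or the form of the attention module, so the cosine similarity, the top-k neighbour rule, the scoring vector `q`, and all function names below are assumptions chosen for illustration. Feature selection (chi-square, mRMR) is assumed to have been applied already, the data are random placeholders, and the weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def similarity_network(X, k=3):
    """Weighted patient graph (assumed: cosine similarity, top-k neighbours)."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = np.clip(Xn @ Xn.T, 0.0, None)        # drop negative similarities
    np.fill_diagonal(S, 0.0)                 # self-loops are added later
    A = np.zeros_like(S)
    idx = np.argsort(-S, axis=1)[:, :k]      # k most similar patients per row
    rows = np.arange(S.shape[0])[:, None]
    A[rows, idx] = S[rows, idx]
    return np.maximum(A, A.T)                # symmetrise the weighted graph

def normalize_adj(A):
    """Symmetric normalisation D^-1/2 (A + I) D^-1/2 used by a standard GCN."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One graph convolution followed by ReLU."""
    return np.maximum(A_norm @ H @ W, 0.0)

def attention_fuse(views, q):
    """Per-patient softmax attention over views (assumed form of the module)."""
    scores = np.stack([np.tanh(Z) @ q for Z in views], axis=1)   # (n, n_views)
    scores = scores - scores.max(axis=1, keepdims=True)          # stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    fused = sum(alpha[:, v:v + 1] * Z for v, Z in enumerate(views))
    return fused, alpha

# Placeholder data: 8 patients, three omics views with already-selected features.
n_patients, hidden, n_classes = 8, 4, 3
omics = [rng.normal(size=(n_patients, p)) for p in (20, 15, 10)]

adjs = [similarity_network(X) for X in omics]
views = [
    gcn_layer(normalize_adj(A), X, 0.1 * rng.normal(size=(X.shape[1], hidden)))
    for A, X in zip(adjs, omics)
]
fused, alpha = attention_fuse(views, rng.normal(size=hidden))

# Untrained linear classifier head with a softmax over classes.
logits = fused @ rng.normal(size=(hidden, n_classes))
logits = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

print(fused.shape, alpha.shape, probs.shape)
```

In a trained model, the GCN weights, the attention parameters, and the classifier head would be learned jointly from labelled patients. The sketch only shows how the three views flow through the graph convolution and are combined into a single representation per patient.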