School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China.
School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab165.
Accurate identification of miRNA-disease associations (MDAs) helps to elucidate the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming, so computational methods for predicting MDAs are urgently needed. In the present study, MDA prediction is framed, on the basis of graph theory, as a node classification task. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. This method models both the potential connections in the feature space and the structural relationships of the MDA data. The nodes of the graphs are represented by disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, for the first time, we consider six tasks simultaneously on the MDA prediction problem, which ensures that, under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that MDA-GCNFTG achieves satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. The effectiveness of GCNs via the graph sampling strategy and of the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
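One of the node representations named above, the Gaussian interaction profile kernel similarity, can be illustrated with a minimal sketch. The abstract does not give the exact formulation or bandwidth used in MDA-GCNFTG, so the code below assumes the commonly used form, in which the similarity of two entities is exp(-gamma * ||IP(i) - IP(j)||^2) and gamma is a base bandwidth divided by the mean squared norm of the interaction profiles. The function name `gip_kernel`, the parameter `gamma_prime` and the toy association matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    `profiles` is a binary association matrix in which row i is the
    interaction profile of entity i. `gamma_prime` is an assumed base
    bandwidth (hypothetical default), normalised by the mean squared norm.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between interaction profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off below zero
    return np.exp(-gamma * sq_dists)


# Toy data (made up): 4 miRNAs x 3 diseases, 1 = known association.
associations = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])

mirna_similarity = gip_kernel(associations)      # shape (4, 4)
disease_similarity = gip_kernel(associations.T)  # shape (3, 3)
```

Rows of the association matrix give miRNA profiles and columns give disease profiles, so the same function yields both similarity matrices. An entity with no known associations has an all-zero profile, which is one reason such kernels are typically combined with semantic or functional similarity, as the abstract describes.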