School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China.
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae168.
Identifying disease-associated microRNAs (miRNAs) helps reveal the underlying mechanisms of diseases and thereby supports the development of new medicines. Network-based approaches have recently been widely proposed for inferring potential associations between miRNAs and diseases. However, these approaches ignore the importance of the different relations in meta-paths when learning the embeddings of miRNAs and diseases. They also pay little attention to selecting reliable negative samples, which is crucial for improving prediction accuracy. In this study, we propose a novel approach named MGCNSS that combines multi-layer graph convolution with a high-quality negative sample selection strategy. Specifically, MGCNSS first constructs a comprehensive heterogeneous network by integrating miRNA and disease similarity networks with their known associations. We then employ multi-layer graph convolution to automatically capture meta-path relations of different lengths in the heterogeneous network and to learn discriminative representations of miRNAs and diseases. After that, MGCNSS builds a highly reliable negative sample set from the unlabeled sample set using a negative distance-based sample selection strategy. Finally, we train MGCNSS in an unsupervised manner and predict the potential associations between miRNAs and diseases. The experimental results show that MGCNSS outperforms all baseline methods on both balanced and imbalanced datasets. In addition, case studies on colon neoplasms and esophageal neoplasms further confirm the ability of MGCNSS to detect potential candidate miRNAs. The source code is publicly available on GitHub at https://github.com/15136943622/MGCNSS/tree/master.
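The pipeline described above (heterogeneous network, multi-layer propagation, distance-based negative selection) can be illustrated with a minimal NumPy sketch. This is not the MGCNSS implementation: every function name here is hypothetical, the propagation uses no learned weights or nonlinearities, and the selection rule (keep the unlabeled pairs farthest from the centroid of known positive pairs) is only one assumed reading of "distance-based" selection. The linked repository contains the authors' actual method.

```python
import numpy as np

def build_heterogeneous_adjacency(mirna_sim, disease_sim, assoc):
    """Stack similarity and association matrices into one block adjacency.

    mirna_sim: (m, m), disease_sim: (d, d), assoc: (m, d) with 1 = known link.
    """
    top = np.hstack([mirna_sim, assoc])
    bottom = np.hstack([assoc.T, disease_sim])
    return np.vstack([top, bottom]).astype(float)

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, guarding zero degrees."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def multi_layer_embeddings(adj, features, num_layers=2):
    """Average the outputs of successive propagation steps.

    Step k mixes information along paths of length k, which is one simple
    way to cover meta-paths of different lengths (assumption: plain
    averaging, no trainable parameters).
    """
    a_hat = normalize(adj)
    h = np.asarray(features, dtype=float)
    outputs = []
    for _ in range(num_layers):
        h = a_hat @ h
        outputs.append(h)
    return np.mean(outputs, axis=0)

def select_negatives(embeddings, assoc, num_negatives):
    """Pick the unlabeled pairs farthest from the centroid of known positives.

    Assumed heuristic for illustration only. Returns (miRNA, disease) index pairs.
    """
    m, d = assoc.shape
    mirna_emb, disease_emb = embeddings[:m], embeddings[m:]
    # Row i * d + j holds the concatenated embedding of pair (miRNA i, disease j).
    pairs = np.concatenate(
        [np.repeat(mirna_emb, d, axis=0), np.tile(disease_emb, (m, 1))], axis=1
    )
    labels = np.asarray(assoc).reshape(-1)
    centroid = pairs[labels == 1].mean(axis=0)
    dist = np.linalg.norm(pairs - centroid, axis=1)
    unlabeled = np.flatnonzero(labels == 0)
    farthest = unlabeled[np.argsort(-dist[unlabeled])][:num_negatives]
    return [(int(i // d), int(i % d)) for i in farthest]
```

With toy inputs (for example identity similarity matrices and a small 0/1 association matrix), `build_heterogeneous_adjacency` yields an `(m + d) x (m + d)` symmetric matrix, and `select_negatives` returns only pairs that are not among the known associations.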