School of Computer Science and Engineering, Central South University, 410075, Changsha, China.
School of Software, Xinjiang University, 830046, Urumqi, China.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae231.
Drug repurposing offers a viable strategy for discovering new drugs and therapeutic targets through the analysis of drug-gene interactions. However, traditional experimental methods are costly and inefficient. Although graph convolutional network (GCN)-based models achieve state-of-the-art prediction performance, their reliance on supervised learning leaves them vulnerable to data sparsity, a common challenge in drug discovery that further complicates model development. In this study, we propose SGCLDGA, a novel computational model that combines graph neural networks and contrastive learning to predict unknown drug-gene associations. SGCLDGA first uses GCNs to extract vector representations of drugs and genes from the original bipartite graph. It then applies singular value decomposition (SVD) to enhance the graph and generate multiple views. The model performs contrastive learning across these views, optimizing the vector representations with a contrastive loss function so that positive and negative samples are better distinguished. Finally, association scores between drugs and genes are computed as inner products of their representations. Experimental results on the DGIdb4.0 dataset show that SGCLDGA outperforms six state-of-the-art methods. Ablation studies and case analyses confirm the contribution of contrastive learning and SVD and highlight SGCLDGA's potential for discovering new drug-gene associations. The code and dataset for SGCLDGA are freely available at https://github.com/one-melon/SGCLDGA.
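The pipeline the abstract outlines (propagation over the bipartite graph, an SVD-based second view, a contrastive loss across views, and inner-product scoring) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository above. The toy adjacency matrix, all function names, the single weight-free propagation layer, the cosine-similarity InfoNCE loss, the temperature `tau`, and the use of power iteration with deflation in place of a library SVD routine are all assumptions made for illustration, and no training step is included.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def normalize_adjacency(adj):
    """Scale each edge (i, j) by 1 / sqrt(deg_drug[i] * deg_gene[j])."""
    row_deg = [sum(r) for r in adj]
    col_deg = [sum(c) for c in zip(*adj)]
    return [[v / math.sqrt(row_deg[i] * col_deg[j]) if v else 0.0
             for j, v in enumerate(row)] for i, row in enumerate(adj)]

def rank1_svd(m, iters=200):
    """Leading singular triplet (u, s, v) of m via power iteration."""
    v = [random.gauss(0, 1) for _ in m[0]]
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in m]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(row[j] * u[i] for i, row in enumerate(m))
             for j in range(len(m[0]))]
        s = math.sqrt(sum(x * x for x in v))
        v = [x / s for x in v]
    return u, s, v

def low_rank_view(m, rank=2):
    """Rank-limited reconstruction of m (assumed stand-in for truncated SVD)."""
    residual = [row[:] for row in m]
    approx = [[0.0] * len(m[0]) for _ in m]
    for _ in range(rank):
        u, s, v = rank1_svd(residual)
        for i in range(len(m)):
            for j in range(len(m[0])):
                approx[i][j] += s * u[i] * v[j]
                residual[i][j] -= s * u[i] * v[j]  # deflate the found component
    return approx

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-12)

def info_nce(view_a, view_b, tau=0.5):
    """InfoNCE: node i in view_a is positive with node i in view_b only."""
    total = 0.0
    for i, z in enumerate(view_a):
        logits = [cosine(z, other) / tau for other in view_b]
        total += math.log(sum(math.exp(l) for l in logits)) - logits[i]
    return total / len(view_a)

# Hypothetical toy graph: 3 drugs (rows) x 4 genes (columns), 1 = known association.
adj = [[1, 1, 0, 0],
       [0, 1, 1, 0],
       [0, 0, 1, 1]]
a_norm = normalize_adjacency(adj)
a_svd = low_rank_view(a_norm, rank=2)

dim = 4  # embedding size, chosen arbitrarily
drug0 = [[random.gauss(0, 1) for _ in range(dim)] for _ in adj]
gene0 = [[random.gauss(0, 1) for _ in range(dim)] for _ in adj[0]]

# One propagation step on the original graph and on the low-rank view.
drug_main, gene_main = matmul(a_norm, gene0), matmul(transpose(a_norm), drug0)
drug_svd, gene_svd = matmul(a_svd, gene0), matmul(transpose(a_svd), drug0)

# Contrastive loss across the two views, then inner-product association scores.
loss = info_nce(drug_main, drug_svd) + info_nce(gene_main, gene_svd)
scores = matmul(drug_main, transpose(gene_main))  # shape: drugs x genes
```

In this sketch, `scores[i][j]` is the untrained association score for drug `i` and gene `j`. A rank of at least 2 is used for the second view because a rank-1 view makes every node's representation parallel, which leaves a cosine-based contrastive loss with no signal.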