The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China.
University of Chinese Academy of Sciences, Beijing, China.
BMC Bioinformatics. 2022 Dec 1;23(1):516. doi: 10.1186/s12859-022-05069-z.
Drug repositioning is an important task that provides critical information for exploring the potential efficacy of drugs. Yet developing computational models that effectively predict drug-disease associations (DDAs) remains challenging. Previous studies suggest that integrating different types of biological features can improve the accuracy of DDA prediction, but how to integrate them effectively is still an open problem in accurately discovering new indications for approved drugs.
In this paper, we propose a novel meta-path-based graph representation learning model, namely RLFDDA, to predict potential DDAs on heterogeneous biological networks. RLFDDA first calculates drug-drug similarities and disease-disease similarities as the intrinsic biological features of drugs and diseases. A heterogeneous network is then constructed by integrating DDAs, disease-protein associations and drug-protein associations. On this network, RLFDDA adopts a meta-path random walk model to learn the latent representations of drugs and diseases, which are concatenated to construct joint representations of drug-disease associations. As the last step, we employ a random forest classifier to predict potential DDAs from their joint representations.
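The meta-path random walk and the concatenation step can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy nodes and edges, the drug-protein-disease-protein-drug meta-path, and the function names `build_adjacency`, `meta_path_walk` and `joint_representation`. Embedding learning and classification are omitted; in practice, walks like these are often fed to a skip-gram model, and the concatenated vectors to a random forest.

```python
import random
from collections import defaultdict

# Toy heterogeneous network (hypothetical nodes and edges, for illustration only).
NODE_TYPE = {
    "drugA": "drug", "drugB": "drug",
    "prot1": "protein", "prot2": "protein",
    "dis1": "disease", "dis2": "disease",
}
EDGES = [
    ("drugA", "prot1"), ("drugB", "prot1"), ("drugB", "prot2"),  # drug-protein
    ("dis1", "prot1"), ("dis2", "prot2"),                        # disease-protein
    ("drugA", "dis1"),                                           # known DDA
]
# Assumed cyclic meta-path: the first and last node types coincide.
META_PATH = ["drug", "protein", "disease", "protein", "drug"]


def build_adjacency(edges):
    """Build an undirected adjacency list from an edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def meta_path_walk(adj, node_type, start, meta_path, walk_length, rng):
    """Random walk that only steps to neighbors of the type the meta-path expects next."""
    if node_type[start] != meta_path[0]:
        raise ValueError("start node type must match the first meta-path type")
    pattern = meta_path[:-1]  # drop the repeated end type so the pattern can cycle
    walk = [start]
    while len(walk) < walk_length:
        next_type = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == next_type]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


def joint_representation(drug_vec, disease_vec):
    """Concatenate a drug vector and a disease vector into one pair vector."""
    return list(drug_vec) + list(disease_vec)


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print(meta_path_walk(adj, NODE_TYPE, "drugA", META_PATH, 9, random.Random(0)))
```

Restricting each step to the node type dictated by the meta-path keeps the sampled contexts semantically consistent, for example relating drugs to diseases through shared proteins, which an unconstrained walk would not guarantee.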
To demonstrate the effectiveness of RLFDDA, we have conducted a series of experiments on two benchmark datasets by following a ten-fold cross-validation scheme. The results show that RLFDDA yields the best performance in terms of AUC and F1-score when compared with several state-of-the-art DDA prediction models. We have also conducted a case study on a common drug and a common disease, i.e., paclitaxel and lung tumors, and found that 7 of the top-10 predicted diseases for paclitaxel and 8 of the top-10 predicted drugs for lung tumors have already been validated by literature evidence. Hence, the promising performance of RLFDDA may provide a new perspective for novel DDA discovery over heterogeneous networks.
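For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of ten-fold splitting and of the two reported metrics, written with the standard library only. The helper names `ten_fold_indices`, `f1_score` and `auc` are hypothetical, and the sketch does not reproduce the paper's datasets, negative sampling, or results.

```python
import random


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = association)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a full experiment, a classifier would be trained on each `train` split and scored on the matching `test` split, with AUC and F1 averaged over the ten folds.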