School of Software, East China Jiaotong University, Nanchang, 330013, China.
School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
BMC Med Genomics. 2022 Mar 6;15(1):48. doi: 10.1186/s12920-022-01203-1.
Recent advances in pharmacogenomics indicate that drugs, besides binding to proteins, can also regulate the expression of non-coding RNAs (ncRNAs). The polypharmacological nature of drugs makes it possible to find new uses for existing drugs (drug repositioning). However, current computational methods for drug repositioning mainly consider proteins as drug targets. Moreover, these methods identify only statistical relationships between drugs and diseases and provide little information about how drug-disease associations arise at the molecular target level.
Herein, we first comprehensively collect proteins and two categories of ncRNAs as drug targets from public databases to construct drug-target interactions. Experimentally confirmed drug-disease associations are downloaded from an established database. A canonical correlation analysis (CCA) based method is then applied to the two datasets to extract correlated sets of targets and diseases. Each correlated set is regarded as a canonical component and is used to investigate drugs' mechanisms of action. Finally, we develop a strategy that predicts novel drug-disease associations for drug repositioning by combining all the extracted correlated sets.
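To make the CCA step concrete, the following is a minimal sketch on toy data, not the authors' implementation. It assumes a plain ridge-regularized CCA solved by whitening and SVD; the abstract does not say which CCA variant, regularization, or scoring rule is used. The function name `regularized_cca`, the parameter `reg`, the toy matrices, and the score formula (projections weighted by canonical correlations and summed over components) are all illustrative assumptions.

```python
import numpy as np

def regularized_cca(X, Y, n_components, reg=1e-2):
    """Ridge-regularized CCA via whitening + SVD (an assumed, generic variant).

    X: drugs x targets matrix, Y: drugs x diseases matrix (rows are the same drugs).
    Returns target weights A, disease weights B, and canonical correlations rho.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance blocks; reg keeps them invertible.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return (V / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    A = Wx @ U[:, :n_components]    # one weight per target, per component
    B = Wy @ Vt[:n_components].T    # one weight per disease, per component
    return A, B, s[:n_components]

# Toy data (hypothetical): 40 drugs, 8 targets, 5 diseases.
rng = np.random.default_rng(0)
X = (rng.random((40, 8)) < 0.3).astype(float)
Y = np.zeros((40, 5))
Y[:, 0] = X[:, 0]                        # disease 0 tracks target 0
Y[:, 1] = X[:, 1] * X[:, 2]              # disease 1 needs targets 1 and 2
Y[:, 2:] = rng.random((40, 3)) < 0.2     # unrelated diseases

A, B, rho = regularized_cca(X, Y, n_components=3)

# Top-ranking targets and diseases of the first component, by absolute weight.
top_targets = np.argsort(-np.abs(A[:, 0]))[:3]
top_diseases = np.argsort(-np.abs(B[:, 0]))[:3]

# Assumed scoring rule: combine all components, weighted by their correlations.
scores = (X - X.mean(axis=0)) @ A @ np.diag(rho) @ B.T
```

Ranking the entries of one column of `A` and `B` by absolute weight gives the kind of "correlated set" of targets and diseases described above, while `scores` combines all components into a drugs x diseases matrix that can be ranked per drug.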
We obtain 400 canonical components that correlate targets with diseases. We select 4 components for analysis and find that some top-ranking diseases in an extracted set might be treated by drugs interacting with the top-ranking targets in the same set. Results from 10-fold cross-validation show that integrating different categories of target information yields better prediction performance than using only proteins or only ncRNAs as targets. Compared with 3 state-of-the-art approaches, our method achieves the highest AUC value, 0.8576. We use our method to predict new indications for 789 drugs and confirm 24 of the top-1 predictions.
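For readers unfamiliar with the evaluation, the sketch below shows one way to compute an AUC and to split items into 10 folds. It is an illustration under stated assumptions, not the authors' protocol: the abstract does not say whether folds are formed over drugs or over drug-disease pairs, and the helper names `auc_score` and `kfold_indices` are hypothetical.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def kfold_indices(n_items, n_folds=10, seed=0):
    """Shuffle item indices and split them into n_folds nearly equal test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_items), n_folds)

# Hypothetical usage: 25 items split into 10 folds, one held out at a time.
folds = kfold_indices(25, n_folds=10)
example_auc = auc_score([1, 1, 0, 0], [0.9, 0.2, 0.3, 0.1])  # 3 of 4 pairs ordered correctly
```

In each round, one fold is held out, the model is fitted on the rest, and the held-out associations are scored; the AUC is then computed from the held-out labels and scores.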
To the best of our knowledge, this is the first computational effort that combines both proteins and ncRNAs as drug targets for drug repositioning. Our study provides a biologically relevant interpretation of how drug-disease associations are formed, which is useful for guiding future biomedical tests.