Discovering protein drug targets using knowledge graph embeddings.

Author information

Data Science Institute, College of Engineering and Informatics.

Insight Centre for Data Analytics, NUI Galway, Galway, Ireland.

Publication information

Bioinformatics. 2020 Jan 15;36(2):603-610. doi: 10.1093/bioinformatics/btz600.

PMID:31368482

Abstract

MOTIVATION

Computational approaches for predicting drug-target interactions (DTIs) can provide valuable insights into the drug mechanism of action. DTI predictions can help to quickly identify new promising (on-target) or unintended (off-target) effects of drugs. However, existing models face several challenges. Many can only process a limited number of drugs and/or have poor proteome coverage. The current approaches also often suffer from high false positive prediction rates.

RESULTS

We propose a novel computational approach for predicting drug target proteins. The approach formulates the problem as link prediction in knowledge graphs (robust, machine-readable representations of networked knowledge). We use biomedical knowledge bases to create a knowledge graph of entities connected to both drugs and their potential targets. We propose a specific knowledge graph embedding model, TriModel, to learn vector representations (i.e. embeddings) for all drugs and targets in the created knowledge graph. These representations are then used to infer candidate drug-target interactions based on the scores computed by the trained TriModel. We have experimentally evaluated our method using computer simulations and compared it to five existing models. The results show that our approach outperforms all five in terms of both area under the ROC curve and area under the precision-recall curve in standard benchmark tests.
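
The link-prediction framing above can be illustrated with a minimal sketch. The abstract does not give TriModel's scoring function, so the code below substitutes a generic DistMult-style trilinear score as a stand-in; it is not the authors' model. The entity names (`drugA`, `prot1`, `prot2`), the relation name `targets`, and the hand-set embedding values are hypothetical, and training is omitted entirely. The sketch only shows how learned embeddings could be used to rank candidate targets for a drug and how an area under the ROC curve could be computed from the resulting scores.

```python
def score(entity_emb, relation_emb, subject, predicate, obj):
    """DistMult-style trilinear score (an assumed stand-in, not TriModel itself).

    Sums the element-wise product of the subject, predicate and object vectors.
    Higher scores mean the triple is considered more plausible.
    """
    s = entity_emb[subject]
    p = relation_emb[predicate]
    o = entity_emb[obj]
    return sum(a * b * c for a, b, c in zip(s, p, o))


def rank_targets(entity_emb, relation_emb, drug, predicate, candidates):
    """Return candidate targets sorted from most to least plausible."""
    scored = [(t, score(entity_emb, relation_emb, drug, predicate, t))
              for t in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve as the probability that a known interaction
    outscores a non-interaction (ties count as half a win)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in positive_scores
               for n in negative_scores)
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical, hand-set embeddings; a real model would learn these
# from the knowledge graph.
entity_emb = {
    "drugA": [1.0, 0.0],
    "prot1": [1.0, 0.0],
    "prot2": [0.0, 1.0],
}
relation_emb = {"targets": [1.0, 1.0]}

ranking = rank_targets(entity_emb, relation_emb, "drugA", "targets",
                       ["prot1", "prot2"])
```

Here `ranking` lists `prot1` ahead of `prot2` because its vector aligns with that of `drugA` under the `targets` relation. In a benchmark setting, scores for held-out known interactions and for sampled non-interactions would be passed to `roc_auc` (and to an analogous precision-recall computation).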

AVAILABILITY AND IMPLEMENTATION

The data, predictions and models are available at: drugtargets.insight-centre.org.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
