Department of Computer Science, Radboud University Nijmegen, Nijmegen, The Netherlands.
Bioinformatics. 2011 Nov 1;27(21):3036-43. doi: 10.1093/bioinformatics/btr500. Epub 2011 Sep 4.
The in silico prediction of potential interactions between drugs and target proteins is of core importance for the identification of new drugs or novel targets for existing drugs. However, only a tiny portion of all drug-target pairs in current datasets are experimentally validated interactions. This motivates the need for developing computational methods that predict true interaction pairs with high accuracy.
We show that a simple machine learning method that uses the drug-target network as the only source of information is capable of predicting true interaction pairs with high accuracy. Specifically, we introduce interaction profiles of drugs (and of targets) in a network, which are binary vectors specifying the presence or absence of interaction with every target (drug) in that network. We define a kernel on these profiles, called the Gaussian Interaction Profile (GIP) kernel, and use a simple classifier, (kernel) Regularized Least Squares (RLS), for predicting drug-target interactions. We comparatively evaluate the effectiveness of RLS with the GIP kernel on four drug-target interaction networks used in previous studies. The proposed algorithm achieves an area under the precision-recall curve (AUPR) of up to 92.7, significantly improving over the results of state-of-the-art methods. Moreover, we show that also using kernels based on chemical and genomic information further increases accuracy, with a clear improvement on small datasets. These results substantiate the relevance of the network topology (in the form of interaction profiles) as a source of information for predicting drug-target interactions.
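As an illustration of the two ingredients named above, the following Python sketch builds a Gaussian kernel on binary interaction profiles and feeds it to a kernel RLS predictor. It is a minimal sketch under stated assumptions, not the authors' released software: the function names `gip_kernel` and `rls_scores`, the parameters `gamma_scale` and `sigma`, the normalisation of the bandwidth by the mean squared profile norm, the averaging of drug-side and target-side scores, and the toy matrix are all choices made here for illustration.

```python
import numpy as np


def gip_kernel(profiles, gamma_scale=1.0):
    """Gaussian kernel between the rows of a binary interaction matrix.

    Each row is one interaction profile. `gamma_scale` is a hypothetical
    bandwidth parameter introduced for this sketch.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    # Squared Euclidean distance between every pair of profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    # Assumption: scale the bandwidth by the mean squared profile norm so the
    # kernel does not depend on how many interactions the network contains.
    mean_sq_norm = sq_norms.mean()
    gamma = gamma_scale / mean_sq_norm if mean_sq_norm > 0 else gamma_scale
    return np.exp(-gamma * sq_dists)


def rls_scores(kernel, labels, sigma=1.0):
    """Kernel regularized least squares: K (K + sigma I)^-1 Y."""
    n = kernel.shape[0]
    return kernel @ np.linalg.solve(kernel + sigma * np.eye(n), labels)


# Toy network (made up for this example): 3 drugs x 4 targets.
Y = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
], dtype=float)

K_drug = gip_kernel(Y)      # profiles of drugs are the rows of Y
K_target = gip_kernel(Y.T)  # profiles of targets are the columns of Y

# Assumption: combine the two views by averaging their scores.
scores = 0.5 * (rls_scores(K_drug, Y) + rls_scores(K_target, Y.T).T)
print(np.round(scores, 3))
```

In this sketch a higher entry of `scores` marks a drug-target pair as a more likely interaction. For a fair evaluation, the profiles would need to be recomputed with the held-out interactions set to zero, so that the kernel never sees the pairs being predicted.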
Software and Supplementary Material are available at http://cs.ru.nl/~tvanlaarhoven/drugtarget2011/.
tvanlaarhoven@cs.ru.nl; elenam@cs.ru.nl.
Supplementary data are available at Bioinformatics online.