Keum Jongsoo, Nam Hojung
School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, Republic of Korea.
PLoS One. 2017 Feb 13;12(2):e0171839. doi: 10.1371/journal.pone.0171839. eCollection 2017.
Predicting drug-target interactions is important for developing novel drugs and for repositioning existing ones. A number of methods predict such interactions from drug and target protein similarity. Although these methods, such as the bipartite local model (BLM), show promise, they often categorize unknown interactions as negative interactions. They are therefore not ideal for finding potential drug-target interactions that have not yet been validated as positive. Here we propose a method that integrates machine learning techniques, namely a self-training support vector machine (SVM) and BLM, into a self-training bipartite local model (SELF-BLM) that facilitates the identification of potential interactions. The method first uses clustering to separate the unknown interactions into unlabeled interactions and negative interactions. Then, using the BLM method and the self-training SVM, the unlabeled interactions are self-trained and the final local classification models are constructed. When applied to four classes of proteins (enzymes, G-protein coupled receptors (GPCRs), ion channels, and nuclear receptors), SELF-BLM showed the best performance in three of the protein classes for predicting not only known interactions but also potential interactions, compared with other related studies. The implemented software and supporting data are available at https://github.com/GIST-CSBL/SELF-BLM.
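The two steps described in the abstract (separating unknown interactions into unlabeled and negative, then self-training a local model) can be illustrated with a minimal pure-Python sketch for a single target. This is not the authors' implementation. It makes two loudly stated substitutions: a simple similarity threshold stands in for the clustering step, and a mean-similarity score stands in for the SVM. The function names, the `neg_threshold` and `margin` parameters, and the toy similarity matrix are all hypothetical.

```python
from typing import Dict, List, Optional, Set

Labels = Dict[int, Optional[int]]  # 1 = positive, -1 = negative, None = unlabeled


def split_unknown(sim: List[List[float]], positives: Set[int],
                  neg_threshold: float = 0.15) -> Labels:
    """Stand-in for the clustering step (assumption, not the paper's method):
    an unknown drug is called negative only if it is dissimilar to every
    known positive; otherwise it stays unlabeled."""
    labels: Labels = {}
    for i in range(len(sim)):
        if i in positives:
            labels[i] = 1
        else:
            nearest = max(sim[i][p] for p in positives)
            labels[i] = -1 if nearest < neg_threshold else None
    return labels


def score(i: int, sim: List[List[float]], labels: Labels) -> float:
    """Stand-in for the SVM decision value (assumption): mean similarity to
    labeled positives minus mean similarity to labeled negatives."""
    pos = [sim[i][j] for j, y in labels.items() if y == 1 and j != i]
    neg = [sim[i][j] for j, y in labels.items() if y == -1 and j != i]
    mean_pos = sum(pos) / len(pos) if pos else 0.0
    mean_neg = sum(neg) / len(neg) if neg else 0.0
    return mean_pos - mean_neg


def self_train(sim: List[List[float]], labels: Labels,
               margin: float = 0.3, max_rounds: int = 10) -> Labels:
    """Self-training loop: in each round, score every unlabeled drug with the
    current labels, pseudo-label the confident ones, and repeat until no
    label changes or max_rounds is reached."""
    labels = dict(labels)  # do not mutate the caller's dictionary
    for _ in range(max_rounds):
        scores = {i: score(i, sim, labels)
                  for i, y in labels.items() if y is None}
        confident = {i: (1 if s > 0 else -1)
                     for i, s in scores.items() if abs(s) >= margin}
        if not confident:
            break
        labels.update(confident)
    return labels


# Toy drug-drug similarity matrix for one target (made-up values).
sim = [
    [1.0, 0.8, 0.5, 0.2, 0.1],
    [0.8, 1.0, 0.6, 0.3, 0.1],
    [0.5, 0.6, 1.0, 0.5, 0.4],
    [0.2, 0.3, 0.5, 1.0, 0.9],
    [0.1, 0.1, 0.4, 0.9, 1.0],
]
initial = split_unknown(sim, positives={0})
final = self_train(sim, initial)
print(initial)  # {0: 1, 1: None, 2: None, 3: None, 4: -1}
print(final)    # {0: 1, 1: 1, 2: None, 3: -1, 4: -1}
```

In this toy run, drug 1 is pseudo-labeled positive and drug 3 negative, while drug 2 stays unlabeled because its score never clears the margin. A bipartite local model would repeat the same procedure from the drug side, using target-target similarity, and combine the two predictions.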