Wang Qi, Feng YangHe, Huang JinCai, Wang TengJiao, Cheng GuangQuan
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China.
Second Military Medical University, Shanghai, China.
PLoS One. 2017 Apr 28;12(4):e0176486. doi: 10.1371/journal.pone.0176486. eCollection 2017.
The identification of drug target proteins (IDTP) plays a critical role in biometrics. The aim of this study was to retrieve potential drug target proteins (DTPs) from a collected protein dataset, a demanding task of great significance. Previously reported methods for this task generally rely on protein-protein interaction networks but neglect informative biochemical attributes. We formulated a novel framework that uses biochemical attributes to address this problem. In the framework, a biased support vector machine (BSVM) is combined with a deep embedded representation extracted by a deep learning model, stacked auto-encoders (SAEs). When the set of non-drug target proteins (NDTPs) is contaminated by DTPs, the framework is beneficial because the SAE provides an efficient representation and the BSVM relieves the imbalance effect. The experimental results demonstrated the effectiveness of the framework, and its generalization capability was confirmed by comparison with other models. This study is the first to exploit a deep learning model for IDTP. In summary, nearly 23% of the NDTPs were predicted to be likely DTPs and await further verification in biomedical experiments.
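The role of the BSVM can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a common biased-SVM formulation in which margin violations on known DTPs cost more (`c_pos`) than violations on the NDTP set (`c_neg`), uses a linear kernel fit by batch subgradient descent, and takes made-up 2-D points in place of the SAE embeddings. All names and parameter values (`train_biased_svm`, `decision`, `c_pos`, `c_neg`, `lr`, `epochs`) are hypothetical.

```python
def decision(w, b, x):
    """Signed score w.x + b; a positive score means 'predicted drug target'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_biased_svm(X, y, c_pos=10.0, c_neg=1.0, lr=0.001, epochs=5000):
    """Fit a linear SVM with class-dependent hinge penalties.

    Minimizes 0.5*||w||^2 + sum_i c_i * max(0, 1 - y_i * (w.x_i + b)),
    with c_i = c_pos when y_i = +1 and c_i = c_neg when y_i = -1,
    using batch subgradient descent with a fixed step size.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5*||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1.0:  # margin violated: hinge term active
                cost = c_pos if yi > 0 else c_neg
                for j, xj in enumerate(xi):
                    grad_w[j] -= cost * yi * xj
                grad_b -= cost * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy 2-D "embeddings" (invented for illustration, not data from the study).
known_dtps = [[2.0, 2.0], [2.5, 1.8], [1.8, 2.4], [2.2, 2.6]]
clean_ndtps = [[0.0, 0.0], [0.3, -0.2], [-0.4, 0.5],
               [0.1, 0.4], [-0.2, -0.3], [0.5, 0.1]]
hidden_dtps = [[2.1, 2.2], [2.4, 2.3]]  # labeled as NDTPs but resemble DTPs

X = known_dtps + clean_ndtps + hidden_dtps
y = [1] * len(known_dtps) + [-1] * (len(clean_ndtps) + len(hidden_dtps))

w, b = train_biased_svm(X, y)

# NDTP-labeled points that still receive a positive score are candidate DTPs.
candidates = [x for x in clean_ndtps + hidden_dtps if decision(w, b, x) > 0]
print("NDTPs flagged as likely DTPs:", candidates)
```

Because errors on the known DTPs are penalized ten times more heavily in this sketch, the classifier tolerates "misclassifying" the NDTP-labeled points that sit among the DTPs, which is how such points surface as candidates for experimental follow-up.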