Sawada Ryusuke, Kotera Masaaki, Yamanishi Yoshihiro
Division of System Cohort, Multi-scale Research Center for Medical Science, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka 812-8582, Japan. Phone/fax: +81-92-642-6699/+81-92-642-6692.
Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.
Mol Inform. 2014 Dec;33(11-12):719-31. doi: 10.1002/minf.201400066. Epub 2014 Nov 24.
The identification of drug-target interactions, that is, interactions between drug candidate compounds and target candidate proteins, is a crucial step in genomic drug discovery. In silico chemogenomic methods have recently been recognized as a promising approach for genome-wide prediction of drug-target interactions, but their prediction performance depends heavily on the descriptors and similarity measures used for drugs and proteins. In this paper, we investigated the performance of various descriptors and similarity measures of drugs and proteins for drug-target interaction prediction using a chemogenomic approach. We compared the prediction accuracy of 18 chemical descriptors of drugs (e.g., ECFP, FCFP, E-state, CDK, KlekotaRoth, MACCS, PubChem, Dragon, KCF-S, and graph kernels) and 4 descriptors of proteins (e.g., amino acid composition, domain profile, local sequence similarity, and string kernel) on approximately 100,000 drug-target interactions. We examined the combinatorial effects of drug descriptors and protein descriptors using the same benchmark data under several experimental conditions. Large-scale experiments showed that our proposed KCF-S descriptor achieved the best prediction accuracy. The comparative results are expected to be useful for selecting chemical descriptors in various pharmaceutical applications.
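To make the role of descriptors and similarity measures concrete, the following is a minimal sketch of how a drug similarity and a protein similarity might be combined into a single score for two drug-target pairs. It is an illustration under stated assumptions, not the method used in the paper: the abstract does not specify how the two similarities are combined. All function names are hypothetical, the fingerprints are toy sets of "on" bit indices rather than real ECFP or KCF-S output, and multiplying the two similarities is only one common pairwise-kernel construction.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, used for the composition descriptor.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def aa_composition(sequence):
    """Fraction of each standard amino acid in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_similarity(drug_a, protein_a, drug_b, protein_b):
    """Similarity of two drug-target pairs as a product of the drug
    similarity and the protein similarity (an assumed combination rule)."""
    drug_sim = tanimoto(drug_a, drug_b)
    protein_sim = cosine(aa_composition(protein_a), aa_composition(protein_b))
    return drug_sim * protein_sim


if __name__ == "__main__":
    # Toy inputs: made-up fingerprint bits and short made-up sequences.
    score = pair_similarity({1, 2, 3}, "ACDA", {2, 3, 4}, "ACDA")
    print(f"pair similarity: {score:.3f}")
```

Swapping `tanimoto` or `aa_composition` for another descriptor-specific similarity changes the pair score without touching the rest of the pipeline, which is why the choice of descriptor can be compared on a fixed benchmark.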