Dai Xu, Jing Runyu, Guo Yanzhi, Dong Yongcheng, Wang Yuelong, Liu Yuan, Pu Xuemei, Li Menglong
College of Chemistry, Sichuan University, Chengdu, Sichuan 610064, PR China.
Curr Pharm Des. 2015;21(21):3051-61. doi: 10.2174/1381612821666150309143106.
Protein-protein interactions (PPIs) are becoming highly attractive targets for drug discovery. Motivated by the rapid accumulation of PPI data in public databases and by success stories in targeting PPIs, a machine-learning method based on sequence and structure properties was developed to assess the druggability of PPIs. Here, a comprehensive non-redundant set of 34 druggable and 122 less druggable PPIs is first presented from the perspective of pockets. When tested by outer 5-fold cross-validation, the most representative model for discriminating druggable PPIs from less druggable ones yielded an average accuracy of 88.24% (sensitivity of 82.38% and specificity of 92.00%). Moreover, a promising result was also obtained on the independent test set. Compared to other methods, the method performs comparably, most likely because the training set includes less druggable PPIs as well as information on active pockets that have evolved to bind a natural ligand.
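As an illustration of how the reported figures are defined, the following minimal sketch computes accuracy, sensitivity, and specificity per fold and then averages them across folds. It assumes druggable PPIs are labeled 1 and less druggable PPIs 0, and that the reported values are simple per-fold averages; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = druggable, 0 = less druggable)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def fold_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for one fold.

    Assumes the fold contains at least one example of each class.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # fraction of druggable PPIs recovered
    specificity = tn / (tn + fp)  # fraction of less druggable PPIs recovered
    return accuracy, sensitivity, specificity


def average_over_folds(folds):
    """Average each metric over a list of (y_true, y_pred) folds."""
    per_fold = [fold_metrics(y_true, y_pred) for y_true, y_pred in folds]
    n = len(per_fold)
    return tuple(sum(values) / n for values in zip(*per_fold))


# Toy labels for two hypothetical folds (not data from the study).
toy_folds = [
    ([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]),
    ([1, 0], [1, 0]),
]
avg_accuracy, avg_sensitivity, avg_specificity = average_over_folds(toy_folds)
```

Because the metrics are averaged fold by fold in this sketch, the mean accuracy need not equal the class-weighted combination of the mean sensitivity and mean specificity.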