School of Pharmaceutical Sciences, Central South University, Changsha 410013, PR China.
Anal Chim Acta. 2013 Aug 20;792:10-8. doi: 10.1016/j.aca.2013.07.003. Epub 2013 Jul 10.
The kinase family is one of the largest target families in the human genome. Its key role in signal transduction in all organisms makes it a very attractive target class for therapeutic intervention in many disease states, such as cancer, diabetes, inflammation and arthritis. A first step toward accelerating the kinase drug discovery process is to quickly identify whether a chemical and a kinase interact. Experimentally, these interactions can be identified by in vitro binding assays, an expensive and laborious procedure that is not applicable on a large scale. Therefore, there is an urgent need to develop statistically efficient approaches for identifying kinase-inhibitor interactions. For the first time, the quantitative binding affinities of kinase-inhibitor pairs are used as the measure for deciding whether an inhibitor interacts with a kinase, and a chemogenomics framework combining an unbiased set of general integrated features (drug descriptors and protein descriptors) with random forest (RF) is then employed to construct a predictive model that can accurately classify kinase-inhibitor pairs. Our results show that RF with integrated features gave a prediction accuracy of 93.76%, a sensitivity of 92.26%, and a specificity of 95.27%. These results are superior to those obtained by considering the two spaces (chemical space and protein space) separately, demonstrating that the integrated features contribute cooperatively. Based on the constructed model, we provide a high-confidence list of drug-target associations, at a low false discovery rate, to guide subsequent experimental investigation.
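As an illustration of the ideas above, the following minimal sketch shows one common way to represent a kinase-inhibitor pair as a single integrated feature vector (drug descriptors concatenated with protein descriptors) and how accuracy, sensitivity and specificity are conventionally computed from binary predictions. The function names `pair_features` and `binary_metrics` and the toy labels are hypothetical and are not taken from the study; the sketch does not reproduce the authors' descriptors, model or data, and it assumes both classes are present in the labels.

```python
from typing import Dict, List, Sequence


def pair_features(drug_desc: Sequence[float], protein_desc: Sequence[float]) -> List[float]:
    """Concatenate drug and protein descriptors into one pair-level vector.

    Hypothetical helper: plain concatenation is one common way to build
    integrated features for a chemogenomics classifier.
    """
    return list(drug_desc) + list(protein_desc)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Assumes at least one positive and one negative example in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy example only: 1 = interacting pair, 0 = non-interacting pair.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

In a full pipeline, the vectors returned by `pair_features` would be the input rows of a classifier such as a random forest, and `binary_metrics` would be applied to its held-out predictions.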