Kuthuru Srikanth, Szafran Adam T, Stossi Fabio, Mancini Michael A, Rao Arvind
Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA.
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
Cancer Inform. 2019 Jun 12;18:1176935119856595. doi: 10.1177/1176935119856595. eCollection 2019.
In recent years, protein kinases have become some of the most significant drug targets in cancer patients. Kinases are known to regulate the activity of many human proteins, and consequently their inhibition has been used to control cancer proliferation. A significant challenge in drug discovery is the rapid and efficient identification of new small molecules. In this study, we propose a novel in silico drug discovery approach to identify kinase targets that impinge on nuclear receptor signaling, using data generated by high-content analysis (HCA). A high-throughput imaging dataset was generated from an siRNA human kinome screen on engineered cells that allow direct visualization of effects on estrogen receptor-α or a chimeric progesterone receptor B binding to specific DNA. Two types of kinase descriptors are extracted from these imaging data: a population-median-based descriptor, and a bag-of-words (BoW) descriptor that can capture heterogeneity information in the imaging data. Using these descriptors, we report predictions of drug-kinase-target interactions based on single-task learning, multi-task learning, and collaborative filtering methods. The best-performing model in target-based drug discovery gives an area under the receiver operating characteristic curve (AUC) of 0.86, whereas the best model in ligand-based discovery gives an AUC of 0.79. These promising results suggest that imaging-based information can serve as an additional source of information alongside existing virtual screening methods, thereby making the drug discovery process more time- and cost-efficient.
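The two descriptor types named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a small k-means codebook for the bag-of-words step, and the layout of the per-cell feature matrix (one row per cell, one column per image feature) are all assumptions made here for illustration.

```python
import numpy as np

def median_descriptor(cells):
    """Population-median descriptor: per-feature median over all cells.

    cells: (n_cells, n_features) array for one kinase knockdown (assumed layout).
    """
    return np.median(np.asarray(cells, dtype=float), axis=0)

def fit_codebook(all_cells, k, n_iter=20, seed=0):
    """Learn k 'visual words' with plain Lloyd's k-means (illustrative choice)."""
    all_cells = np.asarray(all_cells, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct cells.
    centers = all_cells[rng.choice(len(all_cells), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every cell to every center.
        d = ((all_cells[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = all_cells[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def bow_descriptor(cells, centers):
    """Bag-of-words descriptor: normalised histogram of nearest-word assignments."""
    cells = np.asarray(cells, dtype=float)
    d = ((cells[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for single-cell features of two knockdowns.
    uniform_pop = rng.normal(0.0, 1.0, size=(200, 3))
    mixed_pop = np.vstack([rng.normal(-3.0, 1.0, size=(100, 3)),
                           rng.normal(3.0, 1.0, size=(100, 3))])
    codebook = fit_codebook(np.vstack([uniform_pop, mixed_pop]), k=4)
    for name, pop in [("uniform", uniform_pop), ("mixed", mixed_pop)]:
        print(name, median_descriptor(pop), bow_descriptor(pop, codebook))
```

The median collapses each cell population to a single point per feature, whereas the histogram records how cells are distributed across the learned words, which is how a BoW descriptor can retain heterogeneity that the median discards.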