Department of Pathology and Immunology, Washington University in St. Louis, Saint Louis, Missouri 63110, United States.
Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115, United States.
J Chem Inf Model. 2018 Aug 27;58(8):1483-1500. doi: 10.1021/acs.jcim.8b00104. Epub 2018 Jul 23.
Scientists rely on high-throughput screening tools to identify promising small-molecule compounds for the development of biochemical probes and drugs. This study focuses on the identification of promiscuous bioactive compounds: compounds that appear active in many high-throughput screening experiments against diverse targets but are often false positives that cannot easily be developed into successful probes. These compounds can exhibit bioactivity through nonspecific, intractable mechanisms of action and/or through interference with specific assay technology readouts. Such "frequent hitters" are now commonly identified using substructure filters, including pan assay interference compounds (PAINS) filters. Herein, we show that mechanistic modeling of small-molecule reactivity using deep learning can improve upon PAINS filters when modeling promiscuous bioactivity in PubChem assays. Without training on high-throughput screening data, a deep learning model of small-molecule reactivity achieves a sensitivity of 18.5% and a specificity of 95.5% in identifying promiscuous bioactive compounds. This performance is similar to that of PAINS filters, which achieve a sensitivity of 20.3% at the same specificity. Importantly, such reactivity modeling is complementary to PAINS filters. When PAINS filters and reactivity models are combined, the resulting model outperforms either method alone, achieving a sensitivity of 24% at the same specificity. Because the deep learning model is probabilistic, its sensitivity and specificity can also be tuned by adjusting the threshold. Moreover, for a subset of PAINS filters, this reactivity model can help discriminate between promiscuous and nonpromiscuous bioactive compounds even among compounds matching those filters. Critically, the reactivity model provides mechanistic hypotheses for assay interference by predicting the precise atoms involved in compound reactivity.
Overall, our analysis suggests that deep learning approaches to modeling promiscuous compound bioactivity may provide a complementary approach to current methods for identifying promiscuous compounds.
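The abstract compares methods by their sensitivity at a fixed specificity and notes that a probabilistic model can be tuned by moving its threshold. The sketch below illustrates that general idea only; it is not the authors' code, the function names `sensitivity_specificity` and `threshold_for_specificity` are hypothetical, and the scores and labels are made-up toy values rather than results from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Flag compounds with score >= threshold; return (sensitivity, specificity).

    labels[i] is True when compound i is promiscuous (the positive class).
    Assumes at least one positive and one negative compound.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return tp / positives, tn / negatives


def threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[bool], target: float
) -> float:
    """Return the lowest observed score whose specificity is at least `target`.

    Specificity never decreases as the threshold rises, so the first
    qualifying threshold also gives the highest sensitivity at that level.
    """
    for t in sorted(set(scores)):
        _, spec = sensitivity_specificity(scores, labels, t)
        if spec >= target:
            return t
    return float("inf")  # no observed score qualifies: flag nothing


if __name__ == "__main__":
    # Toy reactivity scores and promiscuity labels (illustrative only).
    toy_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
    toy_labels = [True, True, False, False, True, False, False, False]

    t = threshold_for_specificity(toy_scores, toy_labels, target=0.8)
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, t)
    print(f"threshold={t:.2f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Fixing the specificity first and then reading off the sensitivity is what makes two methods comparable on the same footing, as in the 18.5% versus 20.3% comparison above. A binary substructure filter has a single operating point, whereas a scored model can be moved along this trade-off.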