Wawer Mathias J, Jaramillo David E, Dančík Vlado, Fass Daniel M, Haggarty Stephen J, Shamji Alykhan F, Wagner Bridget K, Schreiber Stuart L, Clemons Paul A
Center for the Science of Therapeutics, Broad Institute, Cambridge, MA, USA.
Center for the Science of Therapeutics, Broad Institute, Cambridge, MA, USA; Mathematical Institute of the Slovak Academy of Sciences, Košice, Slovakia (on leave).
J Biomol Screen. 2014 Jun;19(5):738-48. doi: 10.1177/1087057114530783. Epub 2014 Apr 7.
Understanding the structure-activity relationships (SARs) of small molecules is important for developing probes and novel therapeutic agents in chemical biology and drug discovery. Increasingly, multiplexed small-molecule profiling assays allow simultaneous measurement of many biological response parameters for the same compound (e.g., expression levels for many genes or binding constants against many proteins). Although such methods promise to capture SARs with high granularity, few computational methods are available to support SAR analyses of high-dimensional compound activity profiles. Many of these methods are not generally applicable or reduce the activity space to scalar summary statistics before establishing SARs. In this article, we present a versatile computational method that automatically extracts interpretable SAR rules from high-dimensional profiling data. The rules connect chemical structural features of compounds to patterns in their biological activity profiles. We applied our method to data from novel cell-based gene-expression and imaging assays collected on more than 30,000 small molecules. Based on the rules identified for this data set, we prioritized groups of compounds for further study, including a novel set of putative histone deacetylase inhibitors.
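The abstract does not specify the rule-extraction algorithm, so the following is only a toy sketch of the general idea of linking one structural feature to a pattern across several readouts. It is not the authors' method. The function name `feature_rule`, the `threshold` parameter, and the example data are all hypothetical and chosen for illustration.

```python
from statistics import mean


def feature_rule(profiles, has_feature, threshold=1.0):
    """Toy rule: find readouts whose mean differs by at least `threshold`
    between compounds with and without a structural feature.

    profiles    -- list of equal-length activity profiles (one per compound)
    has_feature -- list of booleans, True if the compound has the feature
    Returns a dict {readout index: mean difference (with - without)}.
    """
    with_f = [p for p, f in zip(profiles, has_feature) if f]
    without_f = [p for p, f in zip(profiles, has_feature) if not f]
    if not with_f or not without_f:
        return {}  # no comparison possible without both groups

    rule = {}
    for j in range(len(profiles[0])):
        diff = mean(p[j] for p in with_f) - mean(p[j] for p in without_f)
        if abs(diff) >= threshold:
            rule[j] = diff
    return rule


# Hypothetical data: 4 compounds, 3 readouts (e.g., 3 gene-expression values).
profiles = [
    [2.0, 0.1, -1.5],
    [1.8, -0.2, -1.7],
    [0.1, 0.0, 0.2],
    [-0.1, 0.1, 0.0],
]
has_feature = [True, True, False, False]

rule = feature_rule(profiles, has_feature)
for readout, diff in sorted(rule.items()):
    direction = "up" if diff > 0 else "down"
    print(f"feature -> readout {readout} {direction} ({diff:+.2f})")

# For contrast, a scalar summary collapses each profile to a single number,
# so the opposing shifts in readouts 0 and 2 largely cancel out.
scalar_summaries = [mean(p) for p in profiles]
```

In this toy example the rule retains which readouts shift and in which direction, whereas the per-compound scalar summary discards that pattern. This is the loss of granularity the abstract attributes to methods that reduce the activity space before establishing SARs.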