Novartis Institutes for BioMedical Research, Novartis Pharma AG, Forum 1, Novartis Campus, CH-4056 Basel, Switzerland, and 250 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA.
J Chem Inf Model. 2010 Dec 27;50(12):2067-78. doi: 10.1021/ci100203e. Epub 2010 Nov 12.
The main goal of high-throughput screening (HTS) is to identify active chemical series rather than just individual active compounds. In light of this goal, a new method (called compound set enrichment) to identify active chemical series from primary screening data is proposed. The method employs the scaffold tree compound classification in conjunction with the Kolmogorov-Smirnov statistic to assess the overall activity of a compound scaffold. The application of this method to seven PubChem data sets (containing between 9389 and 263679 molecules) is presented, and the ability of this method to identify compound classes with only weakly active compounds (potentially latent hits) is demonstrated. The analysis presented here shows how methods based on an activity cutoff can distort activity information, leading to the incorrect activity assignment of compound series. These results suggest that this method might have utility in the rational selection of active classes of compounds (and not just individual active compounds) for follow-up and validation.
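The core idea of the abstract, scoring a scaffold by comparing the activity distribution of its compounds with that of the whole screen, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy activity values, and the choice of the full screen as the background distribution are all assumptions, and the scaffold tree classification step is replaced by a ready-made mapping from scaffold label to activity values.

```python
from bisect import bisect_right


def ks_statistic(sample, background):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical
    cumulative distribution functions (ECDFs) of the two samples.
    """
    a = sorted(sample)
    b = sorted(background)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def score_scaffolds(activity_by_scaffold):
    """Score each scaffold against all compounds in the screen.

    `activity_by_scaffold` is a hypothetical mapping from a scaffold label
    to the primary-screen activity values of its member compounds.
    Using the full screen as background is an assumption of this sketch.
    """
    background = [v for values in activity_by_scaffold.values() for v in values]
    return {
        scaffold: ks_statistic(values, background)
        for scaffold, values in activity_by_scaffold.items()
    }


# Toy data (invented for illustration): percent inhibition per compound.
screen = {
    "scaffold_A": [55.0, 60.0, 48.0, 52.0],
    "scaffold_B": [2.0, -1.0, 5.0, 0.0, 3.0, 1.0],
}
scores = score_scaffolds(screen)
```

Because the statistic uses every activity value rather than a hit/non-hit label, a scaffold whose members are all only weakly active can still stand out from the background, which is the situation the abstract describes as latent hits. A real analysis would also need a significance estimate and the direction of the shift, neither of which this sketch computes.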