Mackey Mark D, Melville James L
Cresset BioMolecular Discovery Ltd., BioPark Hertfordshire, Welwyn Garden City, Hertfordshire AL7 3AX, United Kingdom.
J Chem Inf Model. 2009 May;49(5):1154-62. doi: 10.1021/ci8003978.
Chemotype enrichment is increasingly recognized as an important measure of virtual screening performance. However, little attention has been paid to producing metrics that can quantify chemotype retrieval. Here, we examine two different protocols for analyzing chemotype retrieval: "cluster averaging", where the contribution of each active to the scoring metric is inversely proportional to the number of actives sharing its chemotype, and "first found", where only the first active retrieved for a given chemotype contributes to the score. We demonstrate that the latter approach, common in the qualitative analyses found in the current literature, has important drawbacks when combined with quantitative metrics.
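The two protocols can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: cluster averaging is read as giving each active a weight of 1 / (number of actives in its chemotype), first found as giving weight 1 to the earliest-ranked active of each chemotype and 0 to the rest, and `weighted_recall` is a hypothetical stand-in metric rather than one taken from the paper. The ranks and chemotype labels are invented.

```python
from collections import Counter


def cluster_average_weights(chemotypes):
    """Weight each active by 1 / (number of actives in its chemotype).

    Assumed reading of "cluster averaging": every chemotype contributes
    a total weight of 1, shared equally among its members.
    """
    sizes = Counter(chemotypes)
    return [1.0 / sizes[c] for c in chemotypes]


def first_found_weights(chemotypes):
    """Weight 1 for the first active seen per chemotype, 0 for the rest.

    The input must already be ordered by rank, best-ranked first.
    """
    seen = set()
    weights = []
    for c in chemotypes:
        weights.append(0.0 if c in seen else 1.0)
        seen.add(c)
    return weights


def weighted_recall(weights, ranks, cutoff):
    """Fraction of the total weight found at or above the rank cutoff.

    Hypothetical metric used only to compare the two weightings.
    """
    total = sum(weights)
    found = sum(w for w, r in zip(weights, ranks) if r <= cutoff)
    return found / total


# Invented example: five actives, listed in ranked order.
ranks = [1, 2, 3, 10, 50]
chemotypes = ["A", "A", "A", "B", "C"]

ca = cluster_average_weights(chemotypes)
ff = first_found_weights(chemotypes)

# With only two of the three "A" actives retrieved, the protocols disagree.
recall_ca_top2 = weighted_recall(ca, ranks, cutoff=2)
recall_ff_top2 = weighted_recall(ff, ranks, cutoff=2)
print(recall_ca_top2, recall_ff_top2)
```

In this toy case, first found credits chemotype A in full as soon as its first member appears, whereas cluster averaging credits it gradually as further members are retrieved. The sketch is not intended to reproduce the specific drawbacks the authors report.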