Institut de Chimie Organique et Analytique (ICOA), Université d'Orléans-CNRS, UMR 7311 B.P. 6759 Rue de Chartres, 45067 Orléans Cedex 2, France.
J Chem Inf Model. 2012 Feb 27;52(2):327-42. doi: 10.1021/ci200535y. Epub 2012 Jan 5.
High-throughput screening (HTS) is a standard technique widely used to find hit compounds in drug discovery projects. The high cost of such experiments has highlighted the need to design screening libraries carefully in order to avoid wasting resources. Molecular diversity is an established concept that has been used to this end for many years. In this article, a new approach to quantifying the molecular diversity of screening libraries is presented. The approach is based on the Delimited Reference Chemical Subspace (DRCS) methodology, a new method that delimits the densest subspace spanned by a reference library in a reduced 2D continuous space. The methodology is used here to remove outliers and to obtain a relevant cell-based partition of the subspace, and a total of 22 diversity indices were implemented or adapted to it. The behavior of these indices was assessed and compared in various extreme situations and with respect to a set of theoretical rules that a diversity function should satisfy when libraries of different sizes have to be compared. Some gold-standard indices were found to be inappropriate in this context, and none of the tested indices behaved perfectly in all cases. Five DRCS-based indices accounting for different aspects of diversity were finally selected, and a simple framework is proposed for using them effectively. Various libraries were also profiled with respect to more specific subspaces, which further illustrates the value of the method.
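To make the idea of a cell-based diversity index concrete, the following is a minimal sketch and not the authors' DRCS implementation. It assumes the compounds have already been projected into a 2D space and that the delimited subspace can be approximated by a rectangular bounding box split into a regular grid. Both are simplifying assumptions made here, and the function names (`cell_counts`, `coverage`, `normalized_shannon_entropy`) are hypothetical. Two generic indices are shown: the fraction of occupied cells, and the Shannon entropy of the cell occupancy normalized by its maximum.

```python
import math
from collections import Counter


def cell_counts(points, bounds, n_bins):
    """Count points per grid cell; points outside `bounds` are treated as outliers.

    `bounds` is ((xmin, xmax), (ymin, ymax)), a rectangular stand-in
    (assumption) for the delimited subspace.
    """
    (xmin, xmax), (ymin, ymax) = bounds
    counts = Counter()
    for x, y in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # outlier: ignored, as it falls outside the subspace
        # Map each coordinate to a bin index; clamp the upper edge into the last bin.
        i = min(int((x - xmin) / (xmax - xmin) * n_bins), n_bins - 1)
        j = min(int((y - ymin) / (ymax - ymin) * n_bins), n_bins - 1)
        counts[(i, j)] += 1
    return counts


def coverage(counts, n_bins):
    """Fraction of the n_bins x n_bins cells that contain at least one point."""
    return len(counts) / (n_bins * n_bins)


def normalized_shannon_entropy(counts, n_bins):
    """Shannon entropy of cell occupancy divided by its maximum, log(n_bins**2).

    Requires n_bins >= 2 so that the normalizing constant is non-zero.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n_bins * n_bins)


if __name__ == "__main__":
    # Hypothetical 2D coordinates; the last point lies outside the box.
    library = [(0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1), (5.0, 5.0)]
    box = ((0.0, 1.0), (0.0, 1.0))
    counts = cell_counts(library, box, n_bins=2)
    print(coverage(counts, 2), normalized_shannon_entropy(counts, 2))
```

Coverage depends only on which cells are occupied, whereas the entropy also reflects how evenly compounds are spread across them. This is one reason a single index may not capture every aspect of diversity when libraries of different sizes are compared.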