Gobbi Alberto, Lee Man-Ling
Anadys Pharmaceuticals Inc., San Diego, California 92121, USA.
J Chem Inf Comput Sci. 2003 Jan-Feb;43(1):317-23. doi: 10.1021/ci025554v.
The Sphere Exclusion algorithm is widely used to select diverse subsets from chemical-compound libraries or collections. It can be applied with any distance measure between two structures. It is popular because the method has an intuitive geometrical interpretation and performs well on large data sets. This paper describes Directed Sphere Exclusion (DISE), a modification of the Sphere Exclusion algorithm that retains all of its positive properties but generates a more even distribution of the selected compounds in chemical space. In addition, the computational requirement is significantly reduced, so DISE can be applied to very large data sets.
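For orientation, the basic Sphere Exclusion idea that the abstract refers to can be sketched as below. This is a minimal illustration of the generic, order-dependent variant only: it is not DISE, since the abstract does not give the details of the directed modification. The function names, the toy fingerprints, the Tanimoto distance, and the radius of 0.5 are all assumptions chosen for the example.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sphere_exclusion(
    items: Sequence[T],
    distance: Callable[[T, T], float],
    radius: float,
) -> List[int]:
    """Return indices of a subset in which no two members are closer than `radius`.

    Items are visited in input order; each selected item excludes every later
    item that falls inside its sphere of the given radius.
    """
    selected: List[int] = []
    for i, candidate in enumerate(items):
        # Keep the candidate only if it lies outside every sphere placed so far.
        if all(distance(candidate, items[j]) >= radius for j in selected):
            selected.append(i)
    return selected


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """1 - Tanimoto similarity on feature sets (an assumed example metric)."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


# Hypothetical toy "fingerprints": sets of feature ids, not real compounds.
toy_fingerprints = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3, 4}),
    frozenset({7, 8, 9}),
    frozenset({7, 8}),
    frozenset({1, 9}),
]

picked = sphere_exclusion(toy_fingerprints, tanimoto_distance, radius=0.5)
print(picked)
```

Because any callable can be passed as `distance`, the sketch reflects the abstract's point that the method works with any distance measure between two structures. The result of this plain variant depends on the order in which items are visited.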