Molecular Modeling and Drug Design Lab and the Alex Grass Center for Drug Design and Synthesis, Institute of Drug Research, The Hebrew University of Jerusalem, Israel.
J Chem Inf Model. 2010 Mar 22;50(3):437-45. doi: 10.1021/ci9004354.
Integrating computational methods into the early stages of drug discovery has been one of the key trends in the pharmaceutical industry. Starting with high-quality drug candidates should ultimately reduce clinical attrition and raise success rates. In this paper, we present a novel approach for indexing the oral druglikeness of compounds. Using the Iterative Stochastic Elimination (ISE) algorithm, we distinguish between orally available drugs and nondrugs by generating sets of optimized descriptor ranges, each set constituting a "filter". We describe how to produce an ensemble of the best k-descriptor sets out of the huge number of possibilities, and how to construct a "filter bank" that retains diverse filters by clustering. Finally, we define the "orally bioavailable drug-like" character of individual molecules by combining the filters into an "Orally Bioavailable Druglike Index" (OB-DLI), which may be used to prioritize molecules in databases, and we discuss its use in several potential applications. The predictive power of sets of 4-6 descriptors is high (e.g., one filter of 5 descriptors retrieved 81% of true positives and >77% of true negatives). Thus, OB-DLI has advantages over binary decisions (which use only one filter), not only in raising discriminative power but also in ranking drug candidates according to their chance of becoming successful oral drugs. We demonstrate the ability of our approach to discover molecular entities that have the required property, orally bioavailable druglikeness, yet are structurally dissimilar to those of the training set. Comparison of this ISE application with some of the current main classification methods shows that our approach gives a >13% improvement in the Matthews Correlation Coefficient, which measures classification quality from the counts of true and false positives and negatives.
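The ideas of a range-based "filter", an index built from several filters, and the Matthews Correlation Coefficient can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the descriptor names, the numeric ranges, and the helper names `passes` and `simple_index` are hypothetical, and the index shown is simply the fraction of filters a molecule passes, since the abstract does not give the actual OB-DLI formula. Only the MCC function follows a standard, well-known definition.

```python
from math import sqrt

# Hypothetical filters: each maps a descriptor name to an allowed (low, high) range.
# Names and ranges are illustrative assumptions, not values from the paper.
FILTERS = [
    {"mol_weight": (150.0, 500.0), "logp": (-1.0, 5.0)},
    {"h_bond_donors": (0, 5), "mol_weight": (100.0, 600.0)},
    {"logp": (0.0, 4.0), "h_bond_donors": (0, 3)},
]


def passes(filter_ranges, descriptors):
    """Return True if every descriptor value lies inside the filter's range."""
    return all(
        low <= descriptors[name] <= high
        for name, (low, high) in filter_ranges.items()
    )


def simple_index(filters, descriptors):
    """Assumed stand-in for an index: the fraction of filters passed.

    This is NOT the OB-DLI formula from the paper, only a placeholder
    showing how several binary filters can yield a graded score.
    """
    return sum(passes(f, descriptors) for f in filters) / len(filters)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# A made-up molecule described by three hypothetical descriptors.
molecule = {"mol_weight": 320.0, "logp": 4.5, "h_bond_donors": 2}
score = simple_index(FILTERS, molecule)
```

A single filter gives only a pass/fail answer, whereas a score aggregated over many filters allows molecules to be ranked, which is the advantage over binary decisions that the abstract describes.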