Collaborative Drug Discovery, Inc., 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States.
J Chem Inf Model. 2014 Oct 27;54(10):2996-3004. doi: 10.1021/ci500445u. Epub 2014 Oct 7.
In a decade with over half a billion dollars of investment, more than 300 chemical probes have been identified as having biological activity through NIH-funded screening efforts. We have collected the evaluations of an experienced medicinal chemist on the likely chemistry quality of these probes, based on a number of criteria including literature related to the probe and potential chemical reactivity. Over 20% of these probes were found to be undesirable. Analysis of molecular properties suggested that the compounds scored as desirable had higher pKa, molecular weight, heavy atom count, and rotatable bond number. We were particularly interested in whether the human evaluation aspect of medicinal chemistry due diligence could be computationally predicted. We used a process of sequential Bayesian model building and iterative testing as we included additional probes. After externally validating these methods and comparing different machine learning methods, we identified Bayesian models with accuracy comparable to other measures of drug-likeness and filtering rules created to date.
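The idea of sequential model building with iterative testing can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain Bernoulli naive Bayes classifier as a stand-in, and the three binary features, the smoothing value, and all data rows are invented purely for illustration. Each new batch of probes is first used as a test set for the current model and then folded into the training counts.

```python
import math


class SequentialBernoulliNB:
    """Naive Bayes over binary features; counts are updated batch by batch."""

    def __init__(self, n_features, alpha=1.0):
        self.n_features = n_features
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.class_counts = {0: 0, 1: 0}
        self.feature_counts = {0: [0] * n_features, 1: [0] * n_features}

    def update(self, rows, labels):
        """Fold a new batch of labeled compounds into the running counts."""
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, bit in enumerate(row):
                self.feature_counts[label][i] += bit

    def log_score(self, row, label):
        """Log prior plus log likelihood of the row under one class."""
        total = sum(self.class_counts.values())
        n = self.class_counts[label]
        score = math.log((n + self.alpha) / (total + 2 * self.alpha))
        for i, bit in enumerate(row):
            p = (self.feature_counts[label][i] + self.alpha) / (n + 2 * self.alpha)
            score += math.log(p if bit else 1.0 - p)
        return score

    def predict(self, row):
        return 1 if self.log_score(row, 1) > self.log_score(row, 0) else 0


def accuracy(model, rows, labels):
    hits = sum(model.predict(r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


# Hypothetical toy data. Features: [reactive_group, high_mw, many_rot_bonds].
# Labels: 1 = scored desirable, 0 = scored undesirable. All values are made up.
batches = [
    ([[1, 0, 0], [0, 1, 1], [1, 0, 1], [0, 1, 0]], [0, 1, 0, 1]),
    ([[1, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0]], [0, 1, 1, 0]),
]

model = SequentialBernoulliNB(n_features=3)
history = []  # accuracy on each new batch, measured before training on it
for rows, labels in batches:
    if sum(model.class_counts.values()) > 0:
        history.append(accuracy(model, rows, labels))
    model.update(rows, labels)
```

Because the model only stores counts, adding probes never requires retraining from scratch, and `history` records how well each earlier model anticipated the next batch of expert scores.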