Schrodinger, Inc., 120 West 45th Street, New York, NY 10036, United States.
GlaxoSmithKline, 1250 South Collegeville Road, Collegeville, PA 19426, United States.
Curr Opin Struct Biol. 2018 Oct;52:103-110. doi: 10.1016/j.sbi.2018.09.002. Epub 2018 Oct 12.
Drug discovery is widely recognized to be a difficult and costly activity, in large part because of the challenge of identifying chemical matter that simultaneously optimizes multiple properties, one of which is affinity for the primary biological target. Further, many of these properties are difficult to predict ahead of expensive and time-consuming compound synthesis and experimental testing. Here we highlight recent work to develop compound affinity prediction models, and extensively investigate the value such models may provide to preclinical drug discovery. We demonstrate that the ability of these models to improve the overall probability of success is crucially dependent on the shape of the error distribution, not just the root-mean-square error. In particular, while scoring more molecule ideas generally improves the probability of project success when the error distribution is Gaussian, fat-tailed distributions, such as the Cauchy distribution, can lead to a situation where scoring more ideas actually decreases the overall probability of success.
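The dependence on error shape can be illustrated with a small Monte Carlo sketch. This is a toy model built on assumptions that are not taken from the paper: true affinities are drawn from a standard normal distribution, the model's prediction is the true value plus an error term, only the single top-scoring idea is synthesized, and the project "succeeds" if that compound's true affinity exceeds a hypothetical threshold of 1.5. The Gaussian sigma (1.0), the Cauchy scale (0.5), and all function names are likewise illustrative choices.

```python
import math
import random


def success_rate(n_ideas, draw_error, threshold=1.5, trials=1000, rng=None):
    """Fraction of trials in which the top-scored idea truly beats the threshold."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(trials):
        best_pred, best_true = -math.inf, 0.0
        for _ in range(n_ideas):
            true_aff = rng.gauss(0.0, 1.0)        # assumed true affinity (arbitrary units)
            pred = true_aff + draw_error(rng)     # model score = truth + prediction error
            if pred > best_pred:
                best_pred, best_true = pred, true_aff
        hits += best_true > threshold             # success only if the chosen compound is truly good
    return hits / trials


def gaussian_error(rng, sigma=1.0):
    """Thin-tailed error: normal with standard deviation sigma."""
    return rng.gauss(0.0, sigma)


def cauchy_error(rng, scale=0.5):
    """Fat-tailed error: Cauchy sample via the inverse CDF."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def run(sizes=(10, 100, 1000), seed=0):
    """Estimate success rates for each error model and each number of scored ideas."""
    results = {}
    for name, draw in (("gaussian", gaussian_error), ("cauchy", cauchy_error)):
        rng = random.Random(seed)
        results[name] = {n: success_rate(n, draw, rng=rng) for n in sizes}
    return results


if __name__ == "__main__":
    for name, rates in run().items():
        print(name, rates)
```

Under these assumptions, the Gaussian success rate should rise as more ideas are scored. With Cauchy errors, the top score is increasingly likely to belong to an idea with an extreme positive error, so the success rate should fall back toward the base rate of a randomly chosen compound. The exact numbers depend on the seed and on the parameter choices above.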