Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm, U1068, Marseille, F-13009, France.
CNRS, UMR7258, Marseille, F-13009, France.
Sci Rep. 2017 Jun 19;7(1):3820. doi: 10.1038/s41598-017-04264-w.
Many computational methods for predicting the macromolecular targets of small organic molecules have been presented to date. Despite this progress, target prediction methods still have important limitations. For example, the most accurate methods implicitly restrict their predictions to a relatively small number of targets, are not systematically validated on drugs (whose targets are harder to predict than those of non-drug molecules), and often lack a reliability score for each predicted target. Here we present a systematic validation of ligand-centric target prediction methods on a set of clinical drugs. These methods exploit a knowledge base covering 887,435 known ligand-target associations between 504,755 molecules and 4,167 targets. Based on this dataset, we provide a new estimate of the polypharmacology of drugs, which on average have 11.5 targets with IC50 below 10 µM. The average performance achieved across clinical drugs is remarkable (0.348 precision and 0.423 recall, with large drug-dependent variability), especially given the unusually large coverage of the target space. Furthermore, we show how using a sparse ligand-target bioactivity matrix to retrospectively validate target prediction methods could underestimate prospective performance. Lastly, we present and validate a first-in-kind score capable of accurately predicting the reliability of target predictions.
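To make the reported precision and recall concrete, the following is a minimal sketch of a generic ligand-centric prediction followed by a per-drug evaluation. It is not the authors' method, which the abstract does not specify: the nearest-neighbour rule, the Tanimoto similarity on sets of fingerprint bits, the function names, the parameter `k`, and the toy knowledge base are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical knowledge-base entry: (fingerprint bits of a ligand, its known targets).
Entry = Tuple[Set[int], Set[str]]


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_targets(query: Set[int], knowledge_base: List[Entry], k: int = 2) -> Set[str]:
    """Predict targets as the union of targets of the k most similar known ligands."""
    ranked = sorted(knowledge_base, key=lambda e: tanimoto(query, e[0]), reverse=True)
    predicted: Set[str] = set()
    for _, targets in ranked[:k]:
        predicted |= targets
    return predicted


def precision_recall(predicted: Set[str], known: Set[str]) -> Tuple[float, float]:
    """Precision and recall of the predicted target set for a single drug."""
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Toy data, invented for illustration only.
toy_knowledge_base: List[Entry] = [
    ({1, 2, 3, 4}, {"T1", "T2"}),
    ({1, 2, 3, 5}, {"T2"}),
    ({7, 8, 9}, {"T3"}),
    ({1, 9, 10}, {"T4"}),
]
toy_query = {1, 2, 3, 6}
toy_known_targets = {"T1", "T3"}

toy_predicted = predict_targets(toy_query, toy_knowledge_base, k=2)
toy_precision, toy_recall = precision_recall(toy_predicted, toy_known_targets)
print(sorted(toy_predicted), toy_precision, toy_recall)
```

Averaging such per-drug values over a set of drugs would give figures comparable in kind to the 0.348 precision and 0.423 recall quoted above. In a sparse bioactivity matrix, a predicted target that was simply never tested counts as a false positive, which is one way a retrospective evaluation can understate prospective performance.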