Lampa Samuel, Alvarsson Jonathan, Arvidsson Mc Shane Staffan, Berg Arvid, Ahlberg Ernst, Spjuth Ola
Pharmaceutical Bioinformatics Group, Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
Predictive Compound ADME and Safety, Drug Safety and Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
Front Pharmacol. 2018 Nov 6;9:1256. doi: 10.3389/fphar.2018.01256. eCollection 2018.
Ligand-based models can be used in drug discovery to obtain an early indication of potential off-target interactions that could be linked to adverse effects. Another application is to combine such models into a panel, which makes it possible to compare and search for compounds with similar profiles. Most contemporary methods and implementations, however, lack valid measures of confidence in their predictions and provide only point predictions. Here we describe a methodology that uses conformal prediction to predict off-target interactions, with models trained on data from 31 targets in the ExCAPE-DB dataset, selected for their utility in broad early hazard assessment. Chemicals were represented by the signature molecular descriptor, and support vector machines were used as the underlying machine learning method. With conformal prediction, the prediction results come in the form of confidence p-values for each class. The full pre-processing and model training process is openly available as scientific workflows on GitHub, rendering it fully reproducible. We illustrate the usefulness of the developed methodology on a set of compounds extracted from DrugBank. The resulting models are published online and are available via a graphical web interface and an OpenAPI interface for programmatic access.
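To make the idea of a confidence p-value for each class concrete, the following is a minimal sketch of how class-wise (Mondrian) inductive conformal prediction typically turns nonconformity scores into p-values. It is not the authors' implementation: the function names, the class labels `active` and `nonactive`, and all numeric scores are hypothetical, and the nonconformity scores are assumed to come from some underlying model such as a support vector machine, with higher values meaning the compound looks less typical of that class.

```python
from typing import Dict, Sequence, Set


def conformal_p_values(
    calibration_scores: Dict[str, Sequence[float]],
    test_scores: Dict[str, float],
) -> Dict[str, float]:
    """Return one p-value per class for a single test compound.

    calibration_scores: nonconformity scores of held-out calibration
        compounds, grouped by their true class (the Mondrian condition).
    test_scores: nonconformity score of the test compound under each
        tentatively assigned class.
    """
    p_values = {}
    for label, test_score in test_scores.items():
        cal = calibration_scores[label]
        # Count calibration compounds at least as nonconforming as the test compound.
        n_at_least = sum(1 for s in cal if s >= test_score)
        # The +1 terms account for the test compound itself.
        p_values[label] = (n_at_least + 1) / (len(cal) + 1)
    return p_values


def prediction_set(p_values: Dict[str, float], significance: float) -> Set[str]:
    """Keep every class whose p-value exceeds the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Hypothetical scores, invented purely for illustration.
calibration = {
    "active": [0.1, 0.4, 0.7, 0.9],
    "nonactive": [0.2, 0.3, 0.5, 0.8],
}
test_compound = {"active": 0.3, "nonactive": 0.85}

p = conformal_p_values(calibration, test_compound)
labels_strict = prediction_set(p, significance=0.2)
labels_loose = prediction_set(p, significance=0.1)
```

In this toy example the compound receives a high p-value for `active` and a low one for `nonactive`, so a stricter significance level yields a single-label prediction, while a looser one returns both labels. This is the sense in which a conformal predictor reports its uncertainty rather than a bare point prediction.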