Off-targetP ML: an open source machine learning framework for off-target panel safety assessment of small molecules.

Authors

Naga Doha, Muster Wolfgang, Musvasva Eunice, Ecker Gerhard F

Affiliations

Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland.

Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria.

Publication information

J Cheminform. 2022 May 7;14(1):27. doi: 10.1186/s13321-022-00603-w.

Abstract

According to several studies, unpredicted drug safety issues account for the majority of failures in the pharmaceutical industry. Some of these preclinical safety issues can be attributed to non-selective binding of compounds to targets other than their intended therapeutic target, which causes undesired adverse events. Consequently, pharmaceutical companies routinely run in vitro safety screens to detect off-target activities before preclinical and clinical studies. Here we present an open source machine learning (ML) framework that predicts the activities of our in-house panel of 50 off-targets for ~4000 compounds directly from their structure. The framework is intended to guide chemists in the drug design process prior to synthesis and to accelerate drug discovery. We also present a set of ML approaches that require minimal programming experience to deploy. The workflow incorporates different ML approaches, such as deep learning and automated machine learning (AutoML). It also handles common issues in bioactivity prediction, such as data imbalance, inter-target duplicated measurements, and duplicated public compound identifiers. Throughout the workflow development, we explore and compare the capability of neural networks and AutoML to construct prediction models for fifty off-targets spanning different protein classes, different dataset sizes, and high class imbalance. Outcomes from the different methods are compared in terms of efficiency and efficacy. The most important challenges and factors affecting model construction and performance are also discussed, along with suggestions on how to overcome them.
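Two of the data issues named in the abstract, duplicated measurements and class imbalance, can be illustrated with a minimal sketch. This is not the authors' implementation: the majority-vote merge rule, the function names `merge_duplicates` and `balanced_class_weights`, and the toy compound and target identifiers are all assumptions made for illustration. The weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
from collections import Counter, defaultdict


def merge_duplicates(records):
    """Collapse repeated (compound_id, target) measurements into one label.

    records: iterable of (compound_id, target, label) with label 0 or 1.
    Replicates are merged by majority vote (an assumed rule); groups with
    an exact tie are dropped because no majority label exists.
    """
    groups = defaultdict(list)
    for compound_id, target, label in records:
        groups[(compound_id, target)].append(label)

    merged = {}
    for key, labels in groups.items():
        actives = sum(labels)
        inactives = len(labels) - actives
        if actives == inactives:
            continue  # conflicting replicates with no majority: skip
        merged[key] = 1 if actives > inactives else 0
    return merged


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy data: identifiers and labels are invented for illustration.
records = [
    ("CPD-1", "TARGET_A", 1),
    ("CPD-1", "TARGET_A", 1),  # agreeing replicate, merged into one label
    ("CPD-2", "TARGET_A", 0),
    ("CPD-3", "TARGET_A", 0),
    ("CPD-3", "TARGET_A", 1),  # conflicting pair, dropped
    ("CPD-4", "TARGET_A", 0),
    ("CPD-5", "TARGET_A", 0),
]

merged = merge_duplicates(records)
weights = balanced_class_weights(list(merged.values()))
```

With these toy records, one active and three inactive compounds remain after merging, so the minority (active) class receives the larger weight. Such weights could then be passed to a classifier's loss function so that rare actives are not ignored during training.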

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce82/9077900/f7f123f9041f/13321_2022_603_Fig1_HTML.jpg
