ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland.
KNIME GmbH, Reichenaustrasse 11, 78467, Konstanz, Germany.
ChemMedChem. 2018 Nov 6;13(21):2281-2289. doi: 10.1002/cmdc.201800309. Epub 2018 Oct 2.
The metabolism of xenobiotics by humans and other organisms is a complex process involving numerous enzymes that catalyze phase I (functionalization) and phase II (conjugation) reactions. Herein we introduce MetScore, a machine learning model that can predict both phase I and phase II reaction sites of drugs in a single prediction run. We developed cheminformatics workflows to filter and process reactions to obtain suitable phase I and phase II data sets for model training. Employing a recently developed molecular representation based on quantum chemical partial charges, we constructed random forest machine learning models for phase I and phase II reactions. After combining these models with our previous cytochrome P450 model and calibrating the combination against Bayer in-house data, we obtained the MetScore model, which shows good performance, with Matthews correlation coefficients of 0.61 and 0.76 for diverse phase I and phase II reaction types, respectively. We validated its potential applicability to lead optimization campaigns using a new, independent data set compiled from recent publications. The results of this study demonstrate the usefulness of quantum-chemistry-derived molecular representations for reactivity prediction.
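The performance figures above are Matthews correlation coefficients (MCC), which range from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect prediction). As a minimal sketch of how such a score could be computed for per-atom binary labels, the following uses the standard MCC definition. The function name `matthews_corrcoef_binary` and the toy labels are illustrative assumptions, not data or code from the study.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed convention for this sketch: 1 = atom is a reaction site,
    0 = atom is not a reaction site.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example (made-up labels for eight atoms of a hypothetical molecule).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
mcc = matthews_corrcoef_binary(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

Because reaction sites are typically a small minority of the atoms in a molecule, MCC is a reasonable choice here: unlike plain accuracy, it stays near zero for a model that simply labels every atom as a non-site.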