State Key Laboratory for Chemistry and Molecular Engineering of Medicinal Resources/Key Laboratory for Chemistry and Molecular Engineering of Medicinal Resources (Ministry of Education of China), Collaborative Innovation Center for Guangxi Ethnic Medicine, School of Chemistry and Pharmaceutical Sciences, Guangxi Normal University, 15 Yucai Road, Guilin, 541004, People's Republic of China.
School of Life and Pharmaceutical Sciences, Dalian University of Technology, 2 Dagong Road, Panjin, 124221, People's Republic of China.
Arch Toxicol. 2024 May;98(5):1457-1467. doi: 10.1007/s00204-024-03701-w. Epub 2024 Mar 16.
Cytochrome P450 (P450)-mediated bioactivation, which can lead to hepatotoxicity through the formation of reactive metabolites (RMs), is regarded as a major cause of drug failure. Herein, we aimed to establish machine learning models to predict P450-mediated bioactivation. Based on a literature-derived bioactivation dataset, models for the benzene ring, nitrogen heterocycle and sulfur heterocycle were developed using four machine learning methods: Random Forest, Random Subspace, support vector machine (SVM) and Naïve Bayes. The models were assessed with metrics including precision, recall, F-measure and AUC (area under the curve). Random Forest showed the best predictive performance, with AUC values of 0.949, 0.973 and 0.958 for the test sets of the benzene ring, nitrogen heterocycle and sulfur heterocycle models, respectively. 2D descriptors such as topological indices, 2D autocorrelations and Burden eigenvalues contributed most to the models. Furthermore, the models were applied to predict the occurrence of bioactivation in an external verification set. Drugs such as selpercatinib, glafenine and encorafenib were predicted to undergo bioactivation into toxic RMs. To validate the predictions, an in vitro IC50 shift experiment was performed to assess bioactivation potential. Encorafenib and tirbanibulin showed bioactivation potential, with shifts of approximately 3- to 6-fold. Overall, this study provides a reliable and robust strategy for predicting P450-mediated bioactivation, which will be helpful for assessing adverse drug reactions (ADRs) in the clinic and for designing new drug candidates with lower toxicity.
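For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how precision, recall, F-measure and AUC can be computed for a binary classifier (1 = bioactivated, 0 = not bioactivated). The function names and the toy labels and scores are illustrative assumptions; they are not the authors' code or data, and the abstract does not state which software was used.

```python
def precision_recall_f(y_true, y_pred):
    """Precision, recall and F-measure for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the study).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_predictions = [1, 0, 0, 1, 1, 0]
toy_scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]

toy_prf = precision_recall_f(toy_labels, toy_predictions)
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise formulation of AUC is quadratic in the number of samples, which is acceptable for small test sets; a sort-based rank computation would be preferable for large ones.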