Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States.
Mol Pharm. 2020 Jul 6;17(7):2628-2637. doi: 10.1021/acs.molpharmaceut.0c00326. Epub 2020 Jun 8.
Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups to generate computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both internal 5-fold cross-validation metrics and the prediction accuracy on an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database, with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. Alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced statistics similar to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models, grouped in a tool called MegaTox, that can be used to screen early-stage clinical compounds, as well as recently FDA-approved drugs, for potential DILI.
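The four statistics reported for the best-performing model (ROC, sensitivity, specificity, accuracy) can be illustrated with a minimal sketch. It assumes binary labels (1 = DILI concern, 0 = no concern), a model that outputs a score per compound, a decision threshold of 0.5, and that the reported ROC value is the area under the ROC curve; none of these details are stated in the abstract. The function names and the toy data are hypothetical and are not taken from Assay Central or from the study's data sets.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff; the abstract does not give one
) -> Dict[str, float]:
    """Return ROC AUC, sensitivity, specificity, and accuracy."""
    auc = roc_auc(y_true, scores)  # also validates that both classes exist
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "roc_auc": auc,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / len(y_true),
    }


if __name__ == "__main__":
    # Toy labels and scores for eight hypothetical compounds.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    model_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
    print(classification_metrics(labels, model_scores))
```

In a 5-fold cross-validation setting, a function like this would be applied to the held-out predictions of each fold (or to the pooled held-out predictions); which of the two the study used is not specified in the abstract.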