Kotsampasakou Eleni, Montanari Floriane, Ecker Gerhard F
University of Vienna, Department of Pharmaceutical Chemistry, Althanstrasse 14, 1090 Vienna, Austria.
Toxicology. 2017 Aug 15;389:139-145. doi: 10.1016/j.tox.2017.06.003. Epub 2017 Jun 23.
Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention and prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models were validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and comparable values for two of the three external test sets (AUC 59%, 71% and 66%, respectively). We also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, OATP1B1 and OATP1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors so that it can be provided to the community as a Python script.
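The abstract reports performance as accuracy and AUC for a 2-class model. As a reminder of what these two figures measure, the following is a minimal sketch in plain Python. It is not the authors' script: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive compound
    receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = DILI positive, 0 = DILI negative.
y_true = [1, 1, 1, 0, 0, 0]
# Hypothetical predicted probabilities of the positive class.
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.667
print(f"AUC      = {roc_auc(y_true, scores):.3f}")   # AUC      = 0.778
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two figures can differ for the same model. The pairwise loop is quadratic in the number of compounds; for datasets of the size described above a library routine would normally be used instead.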