Seal Srijit, Williams Dominic P, Hosseini-Gerami Layla, Mahale Manas, Carpenter Anne E, Spjuth Ola, Bender Andreas
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, CB2 1EW, Cambridge, United Kingdom.
Imaging Platform, Broad Institute of MIT and Harvard, US.
bioRxiv. 2024 Jun 8:2024.01.10.575128. doi: 10.1101/2024.01.10.575128.
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. The existing suite of in vitro proxy-DILI assays is generally effective at identifying compounds with hepatotoxicity. However, there is considerable interest in enhancing in silico prediction of DILI because it allows large sets of compounds to be evaluated more quickly and cost-effectively, particularly in the early stages of projects. In this study, we examine ML models for DILI prediction that first predict nine proxy-DILI labels and then use them as features, in addition to chemical structural features, to predict DILI. The features include in vitro (e.g., mitochondrial toxicity, bile salt export pump inhibition) data, in vivo (e.g., preclinical rat hepatotoxicity studies) data, pharmacokinetic parameters of maximum concentration, structural fingerprints, and physicochemical parameters. We trained DILI-prediction models on 888 compounds from the DILIst dataset and tested them on a held-out external test set of 223 compounds from the DILIst dataset. The best model, DILIPredictor, attained an AUC-ROC of 0.79. This model enabled the detection of the top 25 toxic compounds compared to models using only structural features (2.68 LR+, positive likelihood ratio, score). Using feature interpretation from DILIPredictor, we were able to identify the chemical substructures causing DILI as well as differentiate cases where DILI is caused by compounds in animals but not in humans. For example, DILIPredictor correctly recognized 2-butoxyethanol as non-toxic in humans despite its hepatotoxicity in mouse models. Overall, the DILIPredictor model improves the detection of compounds causing DILI, with an improved differentiation between animal and human sensitivity as well as the potential for mechanism evaluation.
DILIPredictor is publicly available at https://broad.io/DILIPredictor for use via a web interface, and all code is available for download and local implementation via https://pypi.org/project/dilipred/.
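
The LR+ score quoted in the abstract is the positive likelihood ratio, conventionally defined as sensitivity divided by the false positive rate (1 - specificity). The following is a minimal sketch of that standard definition only. The function name and the toy labels are illustrative assumptions, and it does not reproduce how the authors selected thresholds or the top 25 compounds.

```python
def positive_likelihood_ratio(y_true, y_pred):
    """Return LR+ = sensitivity / (1 - specificity) for binary labels.

    y_true and y_pred are equal-length sequences of 0/1 values, where 1
    means "DILI positive". This helper is hypothetical and is not part of
    the dilipred package.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")

    sensitivity = tp / (tp + fn)          # true positive rate
    false_positive_rate = fp / (fp + tn)  # 1 - specificity

    if false_positive_rate == 0:
        # No false positives: the ratio is unbounded.
        return float("inf")
    return sensitivity / false_positive_rate


# Toy example (made-up labels, not data from the study):
# 3 of 4 toxic compounds flagged, 1 of 4 non-toxic compounds flagged.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_lr_plus = positive_likelihood_ratio(toy_true, toy_pred)
```

An LR+ above 1 means a positive prediction raises the odds that a compound is truly hepatotoxic; in the toy example the sensitivity is 0.75 and the false positive rate is 0.25.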