State Key Laboratory of Fine Chemicals, Dalian University of Technology, Dalian, Liaoning 116024, China.
Toxicology. 2024 Feb;502:153736. doi: 10.1016/j.tox.2024.153736. Epub 2024 Feb 1.
Drug-induced liver injury (DILI) is a rare adverse drug reaction (ADR) and a multifactorial endpoint. Current preclinical animal models struggle to predict it, and in silico methods have emerged as a promising alternative. In this study, a high-quality dataset of 1573 compounds was assembled. A total of 48 classification models were built by combining six molecular fingerprints with a deep neural network (DNN) and seven machine learning algorithms. Comparing the DNN and machine learning models, the best-performing model was the DNN with ECFP_6 as input, which achieved an area under the receiver operating characteristic curve (AUC) of 0.713, a balanced accuracy (BA) of 0.680, and an F1 score of 0.753. In addition, we used the SHapley Additive exPlanations (SHAP) algorithm to interpret the models, identified the crucial structural fragments related to DILI risk, and selected the ten substructures with the highest contributions to serve as warning indicators for subsequent drug hepatotoxicity screening studies. The study demonstrates that DNN models based on molecular fingerprints can be a trustworthy and efficient tool for assessing DILI risk during the early development of novel medications.
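The three metrics reported above (AUC, BA, and F1) can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration; none of the numbers come from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = DILI-positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity (assumes both classes are present)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f"AUC={roc_auc(y_true, scores):.3f}")
print(f"BA={balanced_accuracy(y_true, y_pred):.3f}")
print(f"F1={f1_score(y_true, y_pred):.3f}")
```

AUC is computed from the raw scores, whereas BA and F1 depend on the chosen threshold. BA weights sensitivity and specificity equally, so it is less affected by class imbalance than plain accuracy.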