Li Xiao, Chen Yaojie, Song Xinrui, Zhang Yuan, Li Huanhuan, Zhao Yong
Beijing Beike Deyuan Bio-Pharm Technology Co. Ltd., 7 Fengxian Road, Beijing 100094, China
Beijing Key Laboratory of Cloud Computing Key Technology and Application, Beijing Computing Center, Beijing Academy of Science and Technology, 7 Fengxian Road, Beijing 100094, China
RSC Adv. 2018 Feb 20;8(15):8101-8111. doi: 10.1039/c7ra12957b. eCollection 2018 Feb 19.
Drug-induced liver injury (DILI), caused by drugs, herbal agents or nutritional supplements, is a major issue for patients and the pharmaceutical industry. It has been a leading cause of clinical trial failure and of withdrawal of FDA approval. In this research, we focused on estimating the DILI potential of chemicals in humans, using a set of structurally diverse organic chemicals. We developed a series of binary classification models using five different machine learning methods and eight different feature reduction methods. The model built with the support vector machine (SVM) and the MACCS fingerprint performed best on both the test set and the external validation set, achieving a prediction accuracy of 80.39% on the test set and 82.78% on external validation. The model is freely available at http://opensource.vslead.com/, where users can predict the DILI potential of molecules. Furthermore, we analyzed how the distributions of 12 key physicochemical properties differ between DILI-positive and DILI-negative compounds, and we identified 20 privileged substructures responsible for DILI from the Klekota-Roth fingerprint. Moreover, since traditional Chinese medicine (TCM)-induced liver injury is also a major concern among toxic effects, we evaluated the DILI potential of TCM ingredients using the MACCS_SVM model developed in this study. We hope that the model and the privileged substructures will be useful complementary tools for chemical DILI evaluation.
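The accuracy figures quoted in the abstract are the fraction of compounds whose predicted DILI class matches the known class. The following is a minimal sketch of that calculation for a binary classifier, not the authors' evaluation code: the function names `confusion_counts` and `accuracy` are hypothetical, and the labels are made-up illustration data (1 = DILI-positive, 0 = DILI-negative), not results from the study.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1  # positive compound correctly flagged
        elif truth == 0 and pred == 0:
            counts["tn"] += 1  # negative compound correctly cleared
        elif truth == 0 and pred == 1:
            counts["fp"] += 1  # negative compound wrongly flagged
        else:
            counts["fn"] += 1  # positive compound missed
    return counts


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    counts = confusion_counts(y_true, y_pred)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    return (counts["tp"] + counts["tn"]) / total


# Made-up labels for ten hypothetical compounds (illustration only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.2%}")
```

Keeping the four confusion counts separate, rather than only counting matches, makes it easy to see whether errors are mostly missed DILI-positive compounds or falsely flagged DILI-negative ones.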