Han Jiyeon, Zhung Wonho, Jang Insoo, Lee Joongwon, Kang Min Ji, Lee Timothy Dain, Kwack Seung Jun, Kim Kyu-Bong, Hwang Daehee, Lee Byungwook, Kim Hyung Sik, Kim Woo Youn, Lee Sanghyuk
Department of Bio-Information Science, Ewha Womans University, Seoul, 03760, Republic of Korea.
Department of Chemistry, KAIST, Daejeon, 34141, Republic of Korea.
J Cheminform. 2025 Apr 8;17(1):48. doi: 10.1186/s13321-025-00992-8.
Liver toxicity poses a critical challenge in drug development because of the liver's pivotal role in drug metabolism and detoxification. Accurately predicting liver toxicity is crucial but is hindered by scattered information sources, a lack of curation standards, and the heterogeneity of data perspectives. To address these challenges, we developed the HepatoToxicity Portal (HTP), which integrates an expert-curated knowledgebase (HTP-KB) and a state-of-the-art machine learning model for toxicity prediction (HTP-Pred). The HTP-KB consolidates hepatotoxicity data from nine major databases. The data were reviewed by hepatotoxicity experts and categorized into three levels (in vitro, in vivo, and clinical) using the Medical Dictionary for Regulatory Activities (MedDRA) terminology. The knowledgebase includes information on 8,306 chemicals. This curated dataset was used to build a hepatotoxicity prediction module by fine-tuning a GNN-based foundation model that was pre-trained on approximately 10 million chemicals in the PubChem database. Our model achieved an area under the ROC curve (AUROC) of 0.761, surpassing existing methods for hepatotoxicity prediction. The HTP is publicly accessible at https://kobic.re.kr/htp/ and offers both curated data and prediction services through an intuitive interface, thus supporting drug development efforts.

Scientific contributions
HTP-KB consolidates comprehensive curated information on liver toxicity gathered from nine sources. HTP-Pred utilizes advanced deep learning techniques, significantly enhancing predictive accuracy. Together, these tools provide valuable resources for researchers and practitioners in drug development, accessible through a user-friendly interface.
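For readers less familiar with the AUROC metric quoted in the abstract, the following minimal sketch shows how it can be computed for a binary toxic/non-toxic classifier. It is not the authors' evaluation code: the function name `auroc` and the toy labels and scores are hypothetical and are not taken from HTP-KB or HTP-Pred. The sketch uses the pairwise (Mann-Whitney) definition, in which AUROC is the probability that a randomly chosen toxic compound receives a higher score than a randomly chosen non-toxic one.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1 values (1 = hepatotoxic, 0 = non-hepatotoxic)
    scores: iterable of predicted toxicity scores, higher = more likely toxic
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from HTP).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
print(round(auroc(toy_labels, toy_scores), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.761 means that a toxic compound is ranked above a non-toxic one in roughly three out of four such pairs. The quadratic loop is adequate for a small illustration; for large datasets a rank-based implementation is preferable.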