Department of Computer Science, Aalto University, Konemiehentie 2, 02150 Espoo, Finland.
Predictive, Investigative and Translational Toxicology, PSTS, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, 2340 Beerse, Belgium.
Chem Res Toxicol. 2023 Aug 21;36(8):1238-1247. doi: 10.1021/acs.chemrestox.2c00378. Epub 2023 Aug 9.
Drug-induced liver injury (DILI) is an important safety concern and a major reason for withdrawing drugs from the market. Recent advances in machine learning have led to a wide range of in silico models that predict DILI from molecular chemical structures (fingerprints). Existing publicly available DILI data sets used for model building are based on the interpretation of drug labels or patient case reports, which typically yields a binary clinical DILI annotation. We developed a novel phenotype-based annotation to process hepatotoxicity information extracted from repeated-dose in vivo preclinical toxicology studies, using INHAND annotation to provide a more informative and reliable data set for machine learning algorithms. This work resulted in a data set of 430 unique compounds covering diverse liver pathology findings, which were used to develop multiple DILI prediction models trained on the publicly available data (TG-GATEs) using the compounds' fingerprints. We demonstrate that the DILI labels of the TG-GATEs compounds can be predicted well, and we show how the differences between the TG-GATEs compounds and the external test compounds (Johnson & Johnson) affect model generalization performance.
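The abstract attributes the generalization gap to differences between the training compounds and the external test compounds. One common way to quantify such a difference, which is not necessarily the analysis the authors performed, is the Tanimoto similarity between each external compound's fingerprint and its nearest training compound. The sketch below is a minimal illustration under stated assumptions: fingerprints are represented as sets of "on" bit indices, the toy values and the names `tanimoto`, `nearest_train_similarity`, `train_fps` and `external_fps` are hypothetical, and no real TG-GATEs or Johnson & Johnson data is involved.

```python
from typing import Dict, FrozenSet, List

# Assumption: a fingerprint is the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_train_similarity(test_fp: Fingerprint,
                             train_fps: List[Fingerprint]) -> float:
    """Similarity of a test compound to its most similar training compound."""
    return max(tanimoto(test_fp, train_fp) for train_fp in train_fps)


# Hypothetical toy fingerprints standing in for a training set.
train_fps: List[Fingerprint] = [
    frozenset({1, 4, 9, 16}),
    frozenset({2, 4, 8, 16}),
    frozenset({1, 2, 3, 4}),
]

# Hypothetical external compounds: one close to the training set, one far from it.
external_fps: Dict[str, Fingerprint] = {
    "in_domain_like": frozenset({1, 4, 9, 17}),
    "out_of_domain_like": frozenset({4, 30, 31, 32}),
}

for name, fp in external_fps.items():
    print(name, round(nearest_train_similarity(fp, train_fps), 3))
```

A low nearest-neighbor similarity suggests that a compound lies outside the chemical space covered by the training data, which is one situation in which a fingerprint-based model might be expected to generalize poorly.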