National Institutes for Food and Drug Control, Institute for Control of Chinese Traditional Medicine and Ethnic Medicine, Beijing, 102629, China.
Chinese Academy of Medical Science and Peking Union Medical College, Institute of Materia Medica, Beijing, 100006, China.
J Ethnopharmacol. 2022 Nov 15;298:115620. doi: 10.1016/j.jep.2022.115620. Epub 2022 Aug 10.
Polygonum multiflorum Thunb. (PM) is an herb whose extracts have long been used in traditional Chinese medicine. Although it is believed to benefit the liver, heart, and kidneys, it can cause idiosyncratic drug-induced liver injury (DILI).
We propose that the intrinsic DILI caused by natural products in PM (NPPM) is an important complementary mechanism to PM-related herb-induced liver injury, and aim to identify the ingredients with high DILI potential by machine learning methods.
A total of 197 NPPM were collected from the literature and screened for intrinsically hepatotoxic compounds. In addition, a DILI-labeled dataset of 2384 compounds was collected and randomly split into training and test sets. A diparametric optimization method was developed to tune the parameters of the molecular fingerprints (extended-connectivity fingerprints (ECFPs), RDKit fingerprints, and atom-pair fingerprints) together with those of the machine-learning (ML) algorithms. Subsequently, K-means clustering was applied to the NPPM predicted to have a high DILI risk. An in vitro cell-viability assay in HepaRG cells was performed to validate the predictions.
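The abstract does not spell out how the diparametric optimization works, so the following is only a minimal sketch of the general idea of tuning a featurization parameter and a model parameter jointly on a seeded random split. Everything in it is an assumption made for illustration: the fingerprints are toy sets of integer feature ids folded to `n_bits`, the classifier is a Tanimoto k-nearest-neighbour vote rather than the SVM used in the study, and the names `fold`, `knn_predict`, and `joint_grid_search` are hypothetical.

```python
import itertools
import random


def fold(features, n_bits):
    """Fold integer feature ids into a fixed-size bit set (toy fingerprint)."""
    return frozenset(f % n_bits for f in features)


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(train, query, k):
    """Majority vote (1 = DILI-positive) over the k most similar training items."""
    ranked = sorted(train, key=lambda item: tanimoto(item[0], query), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return int(sum(votes) * 2 >= len(votes))  # ties count as positive


def joint_grid_search(data, bit_sizes, ks, seed=0, holdout_fraction=0.25):
    """Score every (n_bits, k) pair on one seeded random hold-out split.

    Returns (best_accuracy, best_n_bits, best_k).
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]

    best = None
    for n_bits, k in itertools.product(bit_sizes, ks):
        folded_train = [(fold(f, n_bits), y) for f, y in train]
        hits = sum(
            knn_predict(folded_train, fold(f, n_bits), k) == y for f, y in holdout
        )
        score = hits / len(holdout)
        if best is None or score > best[0]:
            best = (score, n_bits, k)
    return best


# Synthetic, made-up data: positives and negatives draw feature ids from
# overlapping ranges, so the two classes are only partly separable.
rng = random.Random(1)
data = []
for i in range(40):
    label = i % 2
    pool = range(0, 60) if label else range(40, 100)
    data.append((set(rng.sample(pool, 10)), label))

print(joint_grid_search(data, bit_sizes=[8, 64, 1024], ks=[1, 3, 5]))
```

In practice the pair would be chosen by cross-validation on the training set only, so that the test set stays untouched for the final evaluation.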
ECFPs restricted to the top 35% of features ranked by F-value, combined with a support vector machine (SVM), yielded the best performance. The optimized SVM model achieved an accuracy of 0.761 and a recall of 0.834 on the test dataset. In silico screening of the NPPM identified 47 ingredients with high DILI potential, which were clustered into six groups, with the number of clusters chosen by the elbow method. A representative subgroup of 21 ingredients was then constructed; within it, two dianthrones exhibited the lowest IC50 values (0.7-0.9 μM), while anthraquinones showed moderate toxicity (15-25 μM).
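To make the feature-ranking step and the two reported metrics concrete, here is a small standard-library sketch. It computes a one-way ANOVA F-value per fingerprint bit, keeps the highest-scoring share of bits, and evaluates accuracy and recall from predicted labels. The function names, the rounding rule for how many bits count as the "top 35%", and the toy matrix are assumptions for illustration and are not taken from the study's code.

```python
def f_value(values, labels):
    """One-way ANOVA F statistic of one feature across the label groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n = len(values)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if within == 0:
        # Perfectly separating feature, or a constant (uninformative) one.
        return float("inf") if between > 0 else 0.0
    return (between / (len(groups) - 1)) / (within / (n - len(groups)))


def select_top_percent(X, y, percent):
    """Indices of the top `percent` of columns of X ranked by F-value.

    Assumption: the number kept is round(n_features * percent / 100), at least 1.
    """
    n_features = len(X[0])
    scores = [f_value([row[j] for row in X], y) for j in range(n_features)]
    n_keep = max(1, round(n_features * percent / 100))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


def accuracy_and_recall(y_true, y_pred):
    """Accuracy = correct / total; recall = TP / (TP + FN) for the positive class 1."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return correct / len(y_true), tp / (tp + fn)


# Toy fingerprint matrix (rows = compounds, columns = bits); made-up values.
X = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]

kept = select_top_percent(X, y, percent=35)
acc, rec = accuracy_and_recall([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
print(kept, acc, rec)
```

A recall higher than the accuracy, as reported for the optimized model, means the classifier misses relatively few true DILI-positive compounds at the cost of more false alarms, which is usually the preferred trade-off for a toxicity screen.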
Using ML methods and in vitro screening, two classes of compounds, dianthrones and anthraquinones, were predicted and validated to carry a high risk of DILI. The diparametric optimization method used in this study could provide a useful tool for screening large datasets for toxicants and is available at https://github.com/dreadlesss/Hepatotoxicity_predictor.