Chen Zhao, Zhao Mengzhu, You Liangzhen, Zheng Rui, Jiang Yin, Zhang Xiaoyu, Qiu Ruijin, Sun Yang, Pan Haie, He Tianmai, Wei Xuxu, Chen Zhineng, Zhao Chen, Shang Hongcai
Key Laboratory of Chinese Internal Medicine of Ministry of Education, Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, China.
Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
Chin Med. 2022 May 17;17(1):58. doi: 10.1186/s13020-022-00617-4.
Combining traditional Chinese medicine with Western medicine (TCM-WMC) increases the complexity of the compounds that patients ingest.
This study aimed to develop a method for screening hepatotoxic compounds in TCM-WMC from their chemical structures using artificial intelligence (AI) methods.
Drug-induced liver injury (DILI) data were collected from public databases and the published literature. The full dataset was randomly divided into a training set and a test set at a ratio of approximately 3:1. Nine machine learning models, namely SGD (Stochastic Gradient Descent), kNN (k-Nearest Neighbor), SVM (Support Vector Machine), NB (Naive Bayes), DT (Decision Tree), RF (Random Forest), ANN (Artificial Neural Network), AdaBoost, and LR (Logistic Regression), and one deep learning model (deep belief network, DBN) were used to build classifiers for screening hepatotoxic compounds.
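The random split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the split was implemented, so the helper name `split_indices`, the fixed seed, and the choice to fix the test-set size are all hypothetical. The sizes used (2035 compounds in total, 530 in the test set) are the ones reported in the results.

```python
import random


def split_indices(n_total, n_test, seed=0):
    """Shuffle indices 0..n_total-1 and return (train, test) index lists.

    Hypothetical helper: the seed and the fixed test size are assumptions
    made for illustration, not details taken from the study.
    """
    if not 0 < n_test < n_total:
        raise ValueError("n_test must be between 1 and n_total - 1")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    return indices[n_test:], indices[:n_test]


# Sizes reported in the abstract: 2035 compounds, 530 held out for testing.
train_idx, test_idx = split_indices(2035, 530)
ratio = len(train_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(ratio, 2))  # 1505 530 2.84
```

With these sizes the realized train-to-test ratio is about 2.84:1, which matches the "approximately 3:1" wording.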
A dataset of 2035 hepatotoxic compounds was collected, of which 1505 formed the training set and 530 the test set. RF achieved a classification accuracy (CA) of 0.838, an F1-score of 0.827, a precision of 0.832, a recall of 0.838, and an area under the curve (AUC) of 0.814 on the training set, and a CA of 0.767, an F1-score of 0.731, a precision of 0.739, a recall of 0.767, and an AUC of 0.739 on the test set, outperforming the other eight machine learning methods. The DBN reached 82.2% accuracy on the test set, higher than any of the machine learning models on that set.
The DILI AI models are expected to screen hepatotoxic compounds in TCM-WMC effectively.
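In the figures reported for RF, recall equals CA on both the training set (0.838) and the test set (0.767). That is what one would observe if precision, recall, and F1 were averaged over classes with weights proportional to class support, because support-weighted recall is algebraically identical to accuracy. The abstract does not state the averaging scheme, so this is an inference rather than a documented detail. The sketch below uses a hypothetical function `weighted_scores` and made-up toy labels to illustrate the identity.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Illustrative only: the averaging scheme is an assumption, not a
    detail confirmed by the study.
    """
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0
    for label, support in Counter(y_true).items():
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / support
        denom = prec_c + rec_c
        f1_c = 2 * prec_c * rec_c / denom if denom else 0.0
        weight = support / n  # class share of the true labels
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = hepatotoxic, 0 = not); invented for this example.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]
scores = weighted_scores(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

On the toy data, accuracy and weighted recall agree while precision and F1 differ from them, the same pattern seen in the reported RF metrics.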