Huo Mengqi, Peng Sha, Li Jing, Zhang Yanling, Qiao Yanjiang
Key Laboratory of TCM-information Engineer of State Administration of TCM, Beijing University of Chinese Medicine, Beijing, China.
School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China.
J Oncol. 2022 Jul 8;2022:8704784. doi: 10.1155/2022/8704784. eCollection 2022.
Accurate characterization of diseases and compounds is key to predicting compound-disease relationships (CDRs). However, because CDRs are difficult to describe comprehensively, traditional drug development models usually perform unsatisfactorily on large-scale CDR prediction. To address this problem, we propose a new method that integrates molecular descriptors of compounds with symptom descriptors of diseases to build a two-dimensional CDR matrix for predicting candidate active compounds. Grayscale images of the CDR matrices were drawn in MATLAB and used as a benchmark dataset for training convolutional neural network (CNN) models. The trained model was then used to predict candidate antitumor active compounds. Of the AlexNet and GoogLeNet models, we selected GoogLeNet for predicting active compounds in Chinese medicine; its accuracy (Acc), sensitivity (Sen), precision (Pre), F-measure, Matthews correlation coefficient (MCC), and area under the ROC curve (AUC) were 0.960, 0.956, 0.965, 0.960, 0.920, and 0.964, respectively. The prediction yielded 1624 candidate CDRs across 124 Chinese medicines. Among them, we obtained 31 features of candidate antitumor active compounds. This method provides new insights for the discovery of candidate active compounds in Chinese medicine.
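For readers unfamiliar with the reported evaluation metrics, the following is a minimal sketch of how Acc, Sen, Pre, F-measure, and MCC are conventionally computed from the confusion-matrix counts of a binary classifier. It uses the standard textbook definitions and is not the authors' evaluation code; the function names `binary_metrics` and `f_measure` are hypothetical. AUC is omitted because it requires per-sample scores rather than counts.

```python
import math


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f_measure(pre, sen):
    """Harmonic mean of precision and sensitivity (recall)."""
    return _ratio(2 * pre * sen, pre + sen)


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives (illustrative names, assumed here).
    """
    acc = _ratio(tp + tn, tp + fp + tn + fn)   # accuracy
    sen = _ratio(tp, tp + fn)                  # sensitivity (recall)
    pre = _ratio(tp, tp + fp)                  # precision
    # Matthews correlation coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)
    return {
        "acc": acc,
        "sen": sen,
        "pre": pre,
        "f_measure": f_measure(pre, sen),
        "mcc": mcc,
    }
```

As a consistency check on the reported figures, `f_measure(0.965, 0.956)` evaluates to approximately 0.960, which agrees with the F-measure stated in the abstract.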