Yoo Sunyong, Yang Hyung Chae, Lee Seongyeong, Shin Jaewook, Min Seyoung, Lee Eunjoo, Song Minkeun, Lee Doheon
School of Electronics and Computer Engineering, Chonnam National University, Gwangju, South Korea.
Department of Otorhinolaryngology-Head and Neck Surgery, Chonnam National University Medical School and Chonnam National University Hospital, Gwangju, South Korea.
Front Pharmacol. 2020 Nov 30;11:584875. doi: 10.3389/fphar.2020.584875. eCollection 2020.
Medicinal plants and their extracts have long been important sources for drug discovery. In particular, plant-derived natural compounds, including phytochemicals, antioxidants, vitamins, and minerals, are gaining attention because they promote health and prevent disease. Although several methods have been developed to confirm the biological activities of natural compounds, there is still considerable room to reduce the time and cost involved. Several methods have been proposed for large-scale analysis, but they remain limited in their ability to handle incomplete and heterogeneous natural compound data. Here, we propose a deep learning-based approach that identifies the medicinal uses of natural compounds by exploiting massive and heterogeneous drug and natural compound data. The rationale is that deep learning can combine heterogeneous features effectively and thereby compensate for incomplete information. Based on latent knowledge, molecular interactions, and chemical property features, we generated 686-dimensional feature vectors for 4,507 natural compounds and 2,882 approved and investigational drugs. The deep learning model was trained on the drug features together with verified drug indication information. When the features of natural compounds were given as input to the trained model, their potential efficacies were predicted with high accuracy, sensitivity, and specificity.
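The workflow summarized in the abstract (train on drug features with known indications, then score natural compounds with the trained model) can be illustrated with a minimal sketch. Only the feature dimensionality (686) comes from the abstract. The network architecture, the hyperparameters, the single binary indication label, the sample counts, and the random synthetic data are all assumptions made for illustration, and every function name here is hypothetical. It is not the authors' implementation.

```python
import numpy as np

N_FEATURES = 686  # feature dimensionality stated in the abstract


def init_params(rng, n_in, n_hidden):
    """Randomly initialise a one-hidden-layer network (assumed architecture)."""
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, np.sqrt(1.0 / n_hidden), (n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Return hidden activations and predicted probabilities."""
    H = np.maximum(0.0, X @ params["W1"] + params["b1"])  # ReLU
    logits = (H @ params["W2"] + params["b2"]).ravel()
    logits = np.clip(logits, -30.0, 30.0)  # keep exp() numerically safe
    return H, 1.0 / (1.0 + np.exp(-logits))


def train(params, X, y, lr=0.05, epochs=200):
    """Full-batch gradient descent on mean binary cross-entropy."""
    n = len(y)
    for _ in range(epochs):
        H, p = forward(params, X)
        d_logits = ((p - y) / n)[:, None]  # gradient w.r.t. the logits
        dH = d_logits @ params["W2"].T
        dH[H <= 0.0] = 0.0                 # ReLU gradient
        params["W2"] -= lr * (H.T @ d_logits)
        params["b2"] -= lr * d_logits.sum(axis=0)
        params["W1"] -= lr * (X.T @ dH)
        params["b1"] -= lr * dH.sum(axis=0)
    return params


def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    accuracy = (tp + tn) / max(tp + tn + fp + fn, 1)
    sensitivity = tp / max(tp + fn, 1)  # true positive rate
    specificity = tn / max(tn + fp, 1)  # true negative rate
    return accuracy, sensitivity, specificity


# Synthetic stand-ins for the real data (assumption: random features,
# one binary "has this indication" label per drug).
rng = np.random.default_rng(0)
X_drugs = rng.normal(size=(200, N_FEATURES))
y_drugs = (X_drugs @ rng.normal(size=N_FEATURES) > 0).astype(float)
X_natural = rng.normal(size=(50, N_FEATURES))

params = train(init_params(rng, N_FEATURES, 32), X_drugs, y_drugs)

# Score natural compounds with the model trained on drugs.
_, natural_scores = forward(params, X_natural)

# Metrics on the training drugs, shown only to illustrate the calculation.
_, drug_probs = forward(params, X_drugs)
acc, sens, spec = binary_metrics(y_drugs, (drug_probs >= 0.5).astype(float))
```

Here `natural_scores` holds one predicted probability per natural compound for the single assumed indication. In practice the metrics would be computed on held-out drugs rather than on the training set, and a real model would cover many indications at once.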