Zhao Haochen, Li Yaohang, Wang Jianxin
Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China.
Department of Computer Science, Old Dominion University, Norfolk, VA 23529-0001, USA.
Bioinformatics. 2021 Sep 29;37(18):2841-2847. doi: 10.1093/bioinformatics/btab204.
The Anatomical Therapeutic Chemical (ATC) system is the official classification system for medicines established by the World Health Organization. Correctly assigning ATC classes to given compounds is an important research problem in drug discovery: it can not only reveal the possible active ingredients of the compounds, but also help infer their therapeutic, pharmacological and chemical properties.
In this article, we develop an end-to-end multi-label classifier called CGATCPred to predict the 14 main ATC classes for given compounds. To extract rich features for each compound, we use a deep convolutional neural network with shortcut connections to represent and learn from the seven association scores between the given compound and the others. Moreover, we construct a correlation graph of the ATC classes and apply a graph convolutional network to this graph for label embedding abstraction. We use all label embeddings to guide the learning of the compound representation. Under the jackknife test, CGATCPred obtains an Aiming of 81.94%, Coverage of 82.88%, Accuracy of 80.81%, Absolute True of 76.58% and Absolute False of 2.75%, yielding significant improvements over existing multi-label classifiers.
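The five reported measures are not defined in this abstract. The sketch below assumes the definitions commonly used in the ATC multi-label prediction literature: Aiming divides the number of correctly predicted labels by the number of predicted labels, Coverage divides it by the number of true labels, Accuracy divides it by the size of the union, Absolute True counts exact matches, and Absolute False counts wrong labels (missed plus spurious) divided by the number of classes. The function name `multilabel_metrics`, its parameters, and the handling of empty label sets are illustrative assumptions and are not taken from the CGATCPred code.

```python
from typing import Dict, Sequence, Set


def _ratio(numerator: int, denominator: int) -> float:
    # Convention chosen for this sketch: an empty denominator contributes 0.
    return numerator / denominator if denominator else 0.0


def multilabel_metrics(
    true_sets: Sequence[Set[int]],
    pred_sets: Sequence[Set[int]],
    num_classes: int = 14,  # the 14 main ATC classes
) -> Dict[str, float]:
    """Average the five multi-label measures over all compounds.

    Assumed definitions (not stated in the abstract), per compound with
    true label set T and predicted label set P:
      Aiming         = |T & P| / |P|
      Coverage       = |T & P| / |T|
      Accuracy       = |T & P| / |T | P|
      Absolute True  = 1 if T == P else 0
      Absolute False = (|T | P| - |T & P|) / num_classes
    """
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty sequences of equal length")

    totals = {
        "aiming": 0.0,
        "coverage": 0.0,
        "accuracy": 0.0,
        "absolute_true": 0.0,
        "absolute_false": 0.0,
    }
    for true, pred in zip(true_sets, pred_sets):
        inter = len(true & pred)
        union = len(true | pred)
        totals["aiming"] += _ratio(inter, len(pred))
        totals["coverage"] += _ratio(inter, len(true))
        totals["accuracy"] += _ratio(inter, union)
        totals["absolute_true"] += 1.0 if true == pred else 0.0
        totals["absolute_false"] += (union - inter) / num_classes

    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


if __name__ == "__main__":
    # Toy example with three compounds; class indices are arbitrary.
    truth = [{0, 1}, {2}, {3, 4}]
    preds = [{0, 1}, {2, 5}, {3}]
    for name, value in multilabel_metrics(truth, preds).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, higher is better for the first four measures and lower is better for Absolute False, which is consistent with the reported 2.75% being presented as a favorable result.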
The code for CGATCPred is available at https://github.com/zhc940702/CGATCPred and https://zenodo.org/record/4552917.