Wang Xiangeng, Wang Yanjing, Xu Zhenyu, Xiong Yi, Wei Dong-Qing
State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Front Pharmacol. 2019 Sep 5;10:971. doi: 10.3389/fphar.2019.00971. eCollection 2019.
The Anatomical Therapeutic Chemical (ATC) classification system proposed by the World Health Organization is a widely accepted drug classification scheme in both academia and industry. It is a multilabel system that assigns drugs to multiple classes according to their therapeutic, pharmacological, and chemical attributes. In this study, we adopted a data-driven network-based label space partition (NLSP) method to predict the ATC classes of a given compound within the multilabel learning framework. The proposed method, ATC-NLSP, is trained on similarity-based features such as the chemical-chemical interactions and the structural and fingerprint similarities of a compound to other compounds belonging to the different ATC categories. The NLSP method trains a predictor for each label cluster (clusters may overlap) detected by community detection algorithms and takes the ensemble of the predicted labels as the final prediction for a compound. Experimental evaluation with the jackknife test on the benchmark dataset showed that our method raised the absolute true rate, the most stringent evaluation metric in this study, from 0.6330 to 0.7497 compared with state-of-the-art approaches. Moreover, the community structures of the label relation graph were detected with the label propagation method. Label-wise analysis showed the advantage of multilabel learning over single-label models. Our study indicates that ATC-NLSP, which adopts ideas from the network research community and captures label correlations in a data-driven manner, is the top-performing model in the ATC prediction task. We believe that the potential of NLSP for multilabel learning tasks in drug discovery remains to be fully exploited. The source code is freely available at https://github.com/dqwei-lab/ATC.
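Two ideas in the abstract can be illustrated with a minimal sketch: the absolute true rate, taken here as the fraction of compounds whose predicted label set matches the true label set exactly, and the partition of the label space by label propagation on a label co-occurrence graph. This is not the authors' implementation, which is available at the repository above. The function names `absolute_true`, `cooccurrence_graph`, and `label_propagation` are hypothetical, the edge weighting by co-occurrence counts is an assumption, and the propagation rule is a simplified deterministic variant that yields non-overlapping clusters. Training one classifier per cluster and combining their outputs is omitted.

```python
from collections import Counter
from itertools import combinations


def absolute_true(y_true, y_pred):
    """Fraction of samples whose predicted label set equals the true set exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(set(t) == set(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def cooccurrence_graph(label_sets):
    """Weighted label graph (assumed weighting): edge weight = number of
    samples that carry both labels."""
    graph = {}
    for labels in label_sets:
        for label in labels:
            graph.setdefault(label, Counter())
        for a, b in combinations(sorted(set(labels)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_rounds=100):
    """Simplified, deterministic label propagation (an illustrative variant).

    Each node repeatedly adopts the community with the largest total edge
    weight among its neighbours; ties are broken alphabetically.
    """
    community = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated label keeps its own community
            votes = Counter()
            for neighbour, weight in graph[node].items():
                votes[community[neighbour]] += weight
            best = min(votes, key=lambda c: (-votes[c], c))
            if best != community[node]:
                community[node] = best
                changed = True
        if not changed:
            break
    clusters = {}
    for node, comm in community.items():
        clusters.setdefault(comm, set()).add(node)
    return sorted(sorted(members) for members in clusters.values())
```

In an NLSP-style pipeline, each cluster returned by `label_propagation` would get its own multilabel predictor, and the labels predicted across clusters would be merged before scoring with `absolute_true`.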