IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):256-265. doi: 10.1109/TCBB.2022.3155453. Epub 2023 Feb 3.
Identifying drug phenotypic effects, including therapeutic effects and adverse drug reactions (ADRs), is an integral part of evaluating the potential of new drug candidates (NDCs). However, current computational methods for predicting the phenotypic effects of NDCs are mainly based on the overall structure of an NDC or of a related target. These approaches often lead to inconsistencies between structure and function and limit the prediction space of NDCs. In this study, we first constructed quantitative substructure-domain, domain-ADR, and domain-ATC (Anatomical Therapeutic Chemical Classification System code) associations using L1LOG and L1SVM machine learning models. These associations represent relationships between phenotypes (ADRs and ATCs) and the local structures of drugs and proteins. Based on these associations, we then constructed substructure-phenotype relationships, which were used to quantify drug-phenotype relationships. This approach therefore enables high-throughput, effective evaluation of the druggability of NDCs from the established substructure-phenotype relationships and the structural information of the NDCs alone, without additional prior knowledge. Using this computational pipeline, a total of 83,205 drug-ATC relationships (covering 1,479 drugs and 178 ATCs) and 306,421 drug-ADR relationships (covering 1,752 drugs and 454 ADRs) were predicted. The predictions were validated at four levels: five-fold cross-validation, public databases, literature, and molecular docking. Furthermore, three case studies demonstrated the feasibility of our method. For Maraviroc, an approved drug, 79 ATCs and 269 ADRs were predicted, including its existing antiviral effect in clinical use. We also identified risk substructures for severe ADRs; for example, SUB215 (>= 1, saturated or only aromatic carbon ring size 7) can result in shock.
We also analyzed the mechanism of action (MOA) of drugs of interest based on the established drug-substructure-domain-protein associations. In summary, by establishing drug-substructure-phenotype relationships, this approach enables quantitative prediction of phenotypes for a given NDC or drug with no prior knowledge beyond its structural information. In this way, the relationships between the substructures and phenotypes of a compound can be obtained directly, which makes it easier to analyze the phenotypic mechanisms of drugs and accelerates rational drug design.
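The chaining of associations described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the sum-of-products composition below is an assumption, not the authors' method. The identifiers (`SUB_A`, `DOM_1`, `ADR_x`, `ATC_y`), the weights, and the functions `compose` and `score_drug` are all hypothetical.

```python
from typing import Dict, Iterable

Weights = Dict[str, Dict[str, float]]

# Hypothetical toy weights standing in for learned model coefficients;
# none of these values come from the paper.
SUB_DOMAIN: Weights = {          # substructure -> domain -> weight
    "SUB_A": {"DOM_1": 0.8, "DOM_2": 0.1},
    "SUB_B": {"DOM_2": 0.5},
}
DOMAIN_PHENO: Weights = {        # domain -> phenotype (ADR or ATC) -> weight
    "DOM_1": {"ADR_x": 0.5},
    "DOM_2": {"ADR_x": 0.2, "ATC_y": 1.0},
}


def compose(sub_domain: Weights, domain_pheno: Weights) -> Weights:
    """Chain substructure-domain and domain-phenotype weights.

    Assumption: a substructure-phenotype weight is the sum, over shared
    domains, of the product of the two association weights.
    """
    result: Weights = {}
    for sub, domains in sub_domain.items():
        scores: Dict[str, float] = {}
        for dom, w_sd in domains.items():
            for pheno, w_dp in domain_pheno.get(dom, {}).items():
                scores[pheno] = scores.get(pheno, 0.0) + w_sd * w_dp
        result[sub] = scores
    return result


def score_drug(substructures: Iterable[str], sub_pheno: Weights) -> Dict[str, float]:
    """Score a compound by summing the weights of the substructures it contains.

    Substructures with no known association contribute nothing.
    """
    scores: Dict[str, float] = {}
    for sub in substructures:
        for pheno, w in sub_pheno.get(sub, {}).items():
            scores[pheno] = scores.get(pheno, 0.0) + w
    return scores


SUB_PHENO = compose(SUB_DOMAIN, DOMAIN_PHENO)
# A hypothetical candidate described only by the substructures it contains.
drug_scores = score_drug(["SUB_A", "SUB_B"], SUB_PHENO)
```

Under these assumptions, scoring a new compound needs only its substructure list, which mirrors the abstract's claim that no prior knowledge beyond structure is required. Thresholding or ranking the resulting scores is left out because the abstract does not specify how predictions are called.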