Ye Qing, Zeng Yundian, Jiang Linlong, Kang Yu, Pan Peichen, Chen Jiming, Deng Yafeng, Zhao Haitao, He Shibo, Hou Tingjun, Hsieh Chang-Yu
College of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang, 310027, China.
College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China.
Adv Sci (Weinh). 2025 Apr;12(16):e2412402. doi: 10.1002/advs.202412402. Epub 2025 Mar 6.
Discovering therapeutic molecules requires integrating phenotype-based drug discovery (PDD) and target-based drug discovery (TDD). However, this integration remains challenging because of the inherent heterogeneity, noise, and bias in biomedical data. In this study, the Knowledge-Guided Drug Relational Predictor (KGDRP), a graph representation learning approach, is developed to integrate multimodal biomedical data within a single heterogeneous graph (HG) structure: network data containing biological system information, gene expression data, and sequence data that incorporates chemical molecular structures. By incorporating the biomedical HG (BioHG) into a heterogeneous graph neural network (HGNN)-based architecture, KGDRP achieves a 12% improvement over previous methods in real-world screening scenarios. Notably, the biology-informed representation derived from KGDRP enhances target prioritization by 26% in drug target discovery. Furthermore, in a zero-shot evaluation on COVID-19, KGDRP exhibited a notably higher success rate in identifying diverse potential drugs. The use of BioHG also enables a KGDRP-based analysis of cell-target-drug interactions, which helps elucidate drug mechanisms. Overall, KGDRP provides a robust infrastructure for the seamless integration of multimodal data and biomedical networks, accelerating PDD, guiding therapeutic target discovery, and ultimately expediting therapeutic molecule discovery.