College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae438.
Accurate prediction of molecular properties is crucial in drug discovery. Traditional methods often overlook the fact that real-world molecules typically carry multiple property labels with complex correlations among them. To address this, we propose HiPM, a Hierarchical Prompted Molecular representation learning framework. HiPM uses task-aware prompts to make molecular representations more task-specific and to mitigate the negative transfer caused by conflicting information from individual tasks. The framework comprises two core components: the Molecular Representation Encoder (MRE) and the Task-Aware Prompter (TAP). MRE employs a hierarchical message-passing network to capture molecular features at both the atom and motif levels. TAP uses an agglomerative hierarchical clustering algorithm to construct a prompt tree that reflects task affinity and distinctiveness, so the model can take multi-granular correlations among tasks into account and handle the complexity of multi-label property prediction. Extensive experiments demonstrate that HiPM achieves state-of-the-art performance across various multi-label datasets, offering a novel perspective on multi-label molecular representation learning.
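To make the prompt-tree idea concrete, the following is a minimal sketch of agglomerative hierarchical clustering over tasks. It is not HiPM's implementation: the abstract does not say how task affinity is measured or which linkage is used, so the distance 1 - |Pearson correlation| between label columns, the average linkage, the nested-tuple tree, the function names `build_prompt_tree` and `prompt_path`, and the toy labels and task names are all assumptions made for illustration.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (vx * vy) ** 0.5


def build_prompt_tree(labels, task_names):
    """Average-linkage agglomerative clustering over tasks.

    labels: one row per molecule, one 0/1 label per task.
    Returns a nested-tuple binary tree whose leaves are task names.
    """
    columns = list(zip(*labels))
    n = len(task_names)
    # Assumed affinity measure: distance = 1 - |label correlation|.
    dist = {(i, j): 1.0 - abs(pearson(columns[i], columns[j]))
            for i, j in combinations(range(n), 2)}

    # Each cluster is (subtree, indices of the tasks it contains).
    clusters = [(task_names[i], [i]) for i in range(n)]

    def linkage(a, b):
        pairs = [(min(i, j), max(i, j)) for i in a[1] for j in b[1]]
        return sum(dist[p] for p in pairs) / len(pairs)

    # Repeatedly merge the two closest clusters until one tree remains.
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: linkage(*ab))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


def prompt_path(tree, task, path=()):
    """Return the root-to-leaf sequence of nodes for a task, or None."""
    path = path + (tree,)
    if tree == task:
        return path
    if isinstance(tree, tuple):
        for child in tree:
            found = prompt_path(child, task, path)
            if found:
                return found
    return None


# Toy data (hypothetical): six molecules, four tasks.
task_names = ["t1", "t2", "t3", "t4"]
labels = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
tree = build_prompt_tree(labels, task_names)
path = prompt_path(tree, "t3")
print(tree)
print(len(path))
```

In a prompt-based model, each node on the path returned by `prompt_path` could own a learnable prompt vector, so a task would combine coarse prompts shared with related tasks and a fine prompt of its own. That reading is an interpretation of "multi-granular correlation information", not a detail stated in the abstract.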