Self-Supervised Molecular Representation Learning With Topology and Geometry.

Authors

Zang Xuan, Zhang Junjie, Tang Buzhou

Publication information

IEEE J Biomed Health Inform. 2025 Jan;29(1):700-710. doi: 10.1109/JBHI.2024.3479194. Epub 2025 Jan 7.

Abstract

Molecular representation learning is central to the analysis of drug molecules. Self-supervised pre-training has shown great promise in this area because it helps overcome the scarcity of labeled molecular property data. Recent studies concentrate on pre-training molecular representation encoders that integrate both 2D topological and 3D geometric structures. However, existing methods rely on molecule-level or atom-level alignment between the different views and overlook hierarchical self-supervised learning, which can capture both inter-molecule and intra-molecule correlations. Additionally, most methods employ 2D or 3D encoders individually to extract molecular characteristics locally or globally for molecular property prediction, and the potential for effectively fusing the two representations remains to be explored. In this work, we propose a Multi-View Molecular Representation Learning method (MVMRL) for molecular property prediction. First, we design hierarchical pre-training pretext tasks: fine-grained atom-level tasks for 2D molecular graphs and coarse-grained molecule-level tasks for 3D molecular graphs, so that the two views provide complementary information to each other. Second, we present a motif-level fusion pattern for multi-view molecular representations during fine-tuning to improve molecular property prediction. We evaluate MVMRL against state-of-the-art baselines on molecular property prediction tasks, and the experimental results demonstrate its superiority.
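The abstract does not say how the motif-level fusion is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each atom already has an embedding from a 2D encoder and from a 3D encoder, that a motif is a list of atom indices, and that fusion means mean-pooling each view within a motif and concatenating the two pooled vectors. The names `mean_pool` and `fuse_by_motif` and the toy data are hypothetical.

```python
from typing import List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fuse_by_motif(
    atom_2d: Sequence[Sequence[float]],
    atom_3d: Sequence[Sequence[float]],
    motifs: Sequence[Sequence[int]],
) -> List[float]:
    """Hypothetical motif-level fusion (an assumption, not the paper's design).

    For every motif, pool the 2D and 3D atom embeddings separately,
    concatenate the two pooled vectors, then average over motifs to
    obtain a single molecule-level vector.
    """
    motif_vectors = []
    for atom_ids in motifs:
        pooled_2d = mean_pool([atom_2d[i] for i in atom_ids])
        pooled_3d = mean_pool([atom_3d[i] for i in atom_ids])
        motif_vectors.append(pooled_2d + pooled_3d)  # list concatenation
    return mean_pool(motif_vectors)


# Toy molecule: four atoms split into two motifs of two atoms each.
atom_2d = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
atom_3d = [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0], [8.0, 8.0]]
motifs = [[0, 1], [2, 3]]

fused = fuse_by_motif(atom_2d, atom_3d, motifs)
print(fused)
```

In a real pipeline the pooled motif vectors would more likely pass through learned layers before the prediction head; plain averaging is used here only to keep the sketch dependency-free.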
