MoleMCL: a multi-level contrastive learning framework for molecular pre-training.

Affiliations

Department of Computer Science and Technology, Xiamen University, Xiamen 361005, China.

National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China.

Publication information

Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae164.

Abstract

MOTIVATION

Molecular representation learning plays an indispensable role in crucial tasks such as property prediction and drug design. Despite the notable achievements of molecular pre-training models, current methods often fail to capture both the structural and feature semantics of molecular graphs. Moreover, while graph contrastive learning has opened new prospects, existing augmentation techniques often struggle to preserve the core semantics of the molecular graphs they modify. To overcome these limitations, we propose a gradient-compensated encoder parameter perturbation approach that ensures efficient and stable feature augmentation. By combining augmentation strategies based on attribute masking and parameter perturbation, we introduce MoleMCL, a new MOLEcular pre-training model based on multi-level contrastive learning.

RESULTS

Experimental results demonstrate that MoleMCL effectively captures the structural and feature semantics of molecular graphs and surpasses current state-of-the-art models in molecular prediction tasks, opening a new avenue for molecular modeling.

AVAILABILITY AND IMPLEMENTATION

The code and data underlying this work are available on GitHub at https://github.com/BioSequenceAnalysis/MoleMCL.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4eec/11001485/c2e5f4dd1c4c/btae164f1.jpg
