MolPLA: a molecular pretraining framework for learning cores, R-groups and their linker joints.

Affiliations

Department of Computer Science, Korea University, Seoul 02841, Republic of Korea.

AIGEN Sciences, Seoul 04778, Republic of Korea.

Publication information

Bioinformatics. 2024 Jun 28;40(Suppl 1):i369-i380. doi: 10.1093/bioinformatics/btae256.

DOI:10.1093/bioinformatics/btae256

PMID:38940143

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11211832/

Abstract

MOTIVATION

Molecular core structures and R-groups are essential concepts in drug development. Integrating these concepts with conventional graph pre-training approaches can promote a deeper understanding of molecules. We propose MolPLA, a novel pre-training framework that employs masked graph contrastive learning to understand the underlying decomposable parts of molecules that imply their core structure and peripheral R-groups. Furthermore, we formulate an additional framework that enables MolPLA to help chemists find replaceable R-groups in lead optimization scenarios.
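To make the term "contrastive learning" concrete, the following is a minimal sketch of a generic InfoNCE-style loss over paired embeddings, written with the Python standard library only. The abstract does not specify MolPLA's exact objective, so everything here is an assumption for illustration: the function names `cosine` and `info_nce`, the temperature value, and the idea that `anchors[i]` and `positives[i]` stand for two views of the same molecule (for example, a masked graph and its counterpart).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    positives[i] is treated as the positive for anchors[i]; every other
    entry of `positives` acts as an in-batch negative. This is a generic
    formulation, not necessarily the loss used by MolPLA.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (made up for illustration).
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b_aligned = [[1.0, 0.0], [0.0, 1.0]]
views_b_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(views_a, views_b_aligned)
loss_swapped = info_nce(views_a, views_b_swapped)
```

In this toy batch the loss is small when each anchor is closest to its own positive and large when the pairs are swapped, which is the behavior a contrastive objective is designed to reward.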

RESULTS

Experimental results on molecular property prediction show that MolPLA achieves predictive performance comparable to current state-of-the-art models. Qualitative analyses indicate that MolPLA is capable of distinguishing core and R-group substructures, identifying decomposable regions in molecules, and contributing to lead optimization by rationally suggesting R-group replacements for various query core templates.
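Suggesting R-group replacements for a query core can be framed as embedding-based retrieval. The sketch below ranks candidate R-groups by cosine similarity to a query embedding. It is an illustrative assumption, not the procedure described in the paper: the function `rank_r_groups`, the candidate names, and all vectors are hypothetical, and in practice the embeddings would come from a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_r_groups(query_embedding, candidate_embeddings, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first.

    candidate_embeddings maps a hypothetical R-group name to its vector.
    """
    scored = [
        (name, cosine(query_embedding, embedding))
        for name, embedding in candidate_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up query and candidate embeddings, for illustration only.
query = [1.0, 0.0]
candidates = {
    "candidate_a": [1.0, 0.1],
    "candidate_b": [0.0, 1.0],
    "candidate_c": [-1.0, 0.0],
}

top_two = rank_r_groups(query, candidates, top_k=2)
```

Here `top_two` holds the two candidates whose embeddings point most nearly in the same direction as the query, which is one simple way to turn learned representations into ranked suggestions.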

AVAILABILITY AND IMPLEMENTATION

The code for MolPLA and its pre-trained model checkpoint are available at https://github.com/dmis-lab/MolPLA.
