Mix-Key: graph mixup with key structures for molecular property prediction.

Affiliations

Institute of Cyberspace Security, College of Information Engineering, Zhejiang University of Technology, 310023, Hangzhou, China.

Binjiang Institute of Artificial Intelligence, Zhejiang University of Technology, 310056, Hangzhou, China.

Publication information

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae165.

Abstract

Molecular property prediction faces the challenge of limited labeled data as it necessitates a series of specialized experiments to annotate target molecules. Data augmentation techniques can effectively address the issue of data scarcity. In recent years, Mixup has achieved significant success in traditional domains such as image processing. However, its application in molecular property prediction is relatively limited due to the irregular, non-Euclidean nature of graphs and the fact that minor variations in molecular structures can lead to alterations in their properties. To address these challenges, we propose a novel data augmentation method called Mix-Key tailored for molecular property prediction. Mix-Key aims to capture crucial features of molecular graphs, focusing separately on the molecular scaffolds and functional groups. By generating isomers that are relatively invariant to the scaffolds or functional groups, we effectively preserve the core information of molecules. Additionally, to capture interactive information between the scaffolds and functional groups while ensuring correlation between the original and augmented graphs, we introduce molecular fingerprint similarity and node similarity. Through these steps, Mix-Key determines the mixup ratio between the original graph and two isomers, thus generating more informative augmented molecular graphs. We extensively validate our approach on molecular datasets of different scales with several Graph Neural Network architectures. The results demonstrate that Mix-Key consistently outperforms other data augmentation methods in enhancing molecular property prediction on several datasets.
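To make the idea of a similarity-driven mixup ratio concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the formula, so the choice of Tanimoto similarity over fingerprint bit sets, the sum-to-one normalization, the mixing of fixed-length embeddings rather than graphs, and the names `tanimoto`, `mix_ratios`, and `mix_embeddings` are all assumptions made for illustration. Node similarity, which the abstract also mentions, is left out.

```python
from typing import List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an assumption).
    return len(a & b) / union if union else 1.0


def mix_ratios(similarities: Sequence[float]) -> List[float]:
    """Normalize non-negative similarity scores into weights that sum to 1.

    Hypothetical rule: the abstract only says similarity determines the ratio.
    """
    total = sum(similarities)
    if total == 0:
        return [1.0 / len(similarities)] * len(similarities)
    return [s / total for s in similarities]


def mix_embeddings(embeddings: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Convex combination of equally sized embedding vectors."""
    dim = len(embeddings[0])
    return [sum(w * e[i] for w, e in zip(weights, embeddings))
            for i in range(dim)]


# Toy fingerprints (sets of on-bits), invented for this example.
fp_original = {1, 2, 3, 4}
fp_scaffold_variant = {1, 2, 3, 5}   # variant that keeps the scaffold
fp_group_variant = {1, 2, 6, 7}      # variant that keeps the functional groups

sims = [
    tanimoto(fp_original, fp_original),
    tanimoto(fp_original, fp_scaffold_variant),
    tanimoto(fp_original, fp_group_variant),
]
weights = mix_ratios(sims)

# Toy 2-dimensional graph embeddings for the original and the two variants.
mixed = mix_embeddings([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights)
```

Under these assumptions the original graph always gets the largest weight, because its similarity to itself is 1, and a variant that drifts further from the original contributes less to the augmented sample.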

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b14/11070654/bffcc34f7f20/bbae165f1.jpg
