

An efficient curriculum learning-based strategy for molecular graph learning.

Author information

Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China.

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.

Publication information

Brief Bioinform. 2022 May 13;23(3). doi: 10.1093/bib/bbac099.

Abstract

Computational methods have been widely applied to core problems in drug discovery, such as molecular property prediction. In recent years, deep learning, a data-driven computational method, has achieved a number of impressive successes in various domains. In drug discovery, graph neural networks (GNNs) take molecular graph data as input and learn graph-level representations in non-Euclidean space. A large number of well-performing GNNs have been proposed for molecular graph learning. The efficient use of molecular data during training, however, has received little attention. Curriculum learning (CL) is a training strategy that reorders the training queue according to the computed difficulty of each sample, yet its effectiveness has not been established for molecular graph learning. In this study, inspired by chemical domain knowledge and task prior information, we propose CurrMG, a novel CL-based training strategy that improves the training efficiency of molecular graph learning. Consisting of a difficulty measurer and a training scheduler, CurrMG is designed as a plug-and-play module that is model-independent and easy to use on molecular data. Extensive experiments demonstrate that molecular graph learning models benefit from CurrMG, with noticeable improvement across five GNN models and eight molecular property prediction tasks (overall improvement of 4.08%). We further observe encouraging potential for CurrMG in resource-constrained molecular property prediction. These results indicate that CurrMG can be used as a reliable and efficient training strategy for molecular graph learning. Availability: The source code is available at https://github.com/gu-yaowen/CurrMG.
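The two components named in the abstract, a difficulty measurer and a training scheduler, can be illustrated with a minimal sketch. This is not the CurrMG implementation (see the repository above for that); it is a generic curriculum learning loop under stated assumptions. The function names `rank_by_difficulty`, `linear_pacing` and `curriculum_batches` are hypothetical, the difficulty score is a placeholder (SMILES string length) rather than the chemistry-informed measure the paper describes, and the linear pacing function is one common choice in the CL literature, not necessarily the one used by the authors.

```python
import math
import random
from typing import Callable, Iterator, List, Sequence, TypeVar

S = TypeVar("S")


def rank_by_difficulty(samples: Sequence[S], difficulty: Callable[[S], float]) -> List[S]:
    """Difficulty measurer: return the samples sorted from easiest to hardest."""
    return sorted(samples, key=difficulty)


def linear_pacing(step: int, total_steps: int, start_fraction: float = 0.25) -> float:
    """Training scheduler: fraction of the sorted data that is available at `step`.

    Grows linearly from `start_fraction` at step 0 to 1.0 at the last step.
    """
    progress = min(step / max(total_steps - 1, 1), 1.0)
    return start_fraction + (1.0 - start_fraction) * progress


def curriculum_batches(
    samples: Sequence[S],
    difficulty: Callable[[S], float],
    total_steps: int,
    batch_size: int,
    start_fraction: float = 0.25,
    seed: int = 0,
) -> Iterator[List[S]]:
    """Yield one batch per step, drawn uniformly from the easiest samples unlocked so far."""
    rng = random.Random(seed)
    ordered = rank_by_difficulty(samples, difficulty)
    for step in range(total_steps):
        fraction = linear_pacing(step, total_steps, start_fraction)
        # Always unlock at least one batch worth of samples.
        pool_size = max(batch_size, math.ceil(fraction * len(ordered)))
        pool = ordered[:pool_size]
        yield rng.sample(pool, min(batch_size, len(pool)))


# Toy usage: SMILES length stands in for a real difficulty score (an assumption).
smiles = ["C", "CC", "CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]
for batch in curriculum_batches(smiles, difficulty=len, total_steps=4, batch_size=2):
    print(batch)
```

Because the scheduler only decides which samples a batch may be drawn from, a loop like this can sit in front of any GNN training step, which is the sense in which a curriculum module can be plug-and-play and model-independent.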

