Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA.
Meta AI Research, Menlo Park, California 94025, USA.
J Chem Phys. 2022 May 14;156(18):184702. doi: 10.1063/5.0088019.
Recent advances in Graph Neural Networks (GNNs) have transformed the space of molecular and catalyst discovery. Although the underlying physics across these domains remains the same, most prior work has focused on building domain-specific models for either small molecules or materials. However, building large datasets across all domains is computationally expensive; therefore, the use of transfer learning (TL) to generalize to different domains is a promising but under-explored approach to this problem. To evaluate this hypothesis, we use a model pretrained on the Open Catalyst Dataset (OC20) and study its behavior when fine-tuned on a set of different datasets and tasks. These include MD17, the *CO adsorbate dataset, and OC20 across different tasks. Through extensive TL experiments, we demonstrate that the initial layers of GNNs learn a more basic representation that is consistent across domains, whereas the final layers learn more task-specific features. Moreover, these well-known TL strategies show significant improvements over non-pretrained models on in-domain tasks, with gains of 53% for the *CO dataset and 17% for the Open Catalyst Project (OCP) task, respectively. TL approaches also result in up to a 4× speedup in model training, depending on the target data and task. However, these approaches do not perform well on the MD17 dataset, yielding worse performance than the non-pretrained model for a few molecules. Based on these observations, we propose transfer learning using attentions across atomic systems with graph neural networks (TAAG), an attention-based approach that learns to prioritize and transfer important features from the interaction layers of GNNs. The proposed method outperforms the best TL approach for out-of-domain datasets, such as MD17, and gives a mean improvement of 6% over a model trained from scratch.
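The sketch below illustrates, in PyTorch, the kind of attention-weighted transfer the abstract describes: rather than fine-tuning only the final layer of a pretrained GNN, attention scores are learned over the outputs of all interaction layers and their weighted combination feeds a new task head. This is a minimal illustration under stated assumptions, not the paper's implementation; the module and variable names (AttentionOverLayers, FineTuneHead, hidden_dim) and the mean-pooled per-layer scoring are hypothetical choices, and the actual TAAG architecture may differ.

# Minimal sketch (hypothetical names) of attention over per-interaction-layer
# GNN features. Assumes a pretrained backbone (not shown) that exposes one
# [num_atoms, hidden_dim] feature tensor per interaction layer; the backbone
# would typically be frozen while only this head is fine-tuned.

import torch
import torch.nn as nn


class AttentionOverLayers(nn.Module):
    """Scores each interaction layer's features and returns their
    attention-weighted sum across layers."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # One scalar score per layer, computed from a pooled summary of its features.
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, layer_features: list) -> torch.Tensor:
        # layer_features: list of [num_atoms, hidden_dim] tensors, one per interaction layer.
        stacked = torch.stack(layer_features, dim=0)           # [L, num_atoms, hidden_dim]
        scores = self.score(stacked.mean(dim=1))               # [L, 1] pooled per-layer summary
        weights = torch.softmax(scores, dim=0).unsqueeze(-1)   # [L, 1, 1]
        return (weights * stacked).sum(dim=0)                  # [num_atoms, hidden_dim]


class FineTuneHead(nn.Module):
    """New task head on top of the (frozen) pretrained backbone: per-layer
    outputs are combined by attention, then pooled into a per-structure
    scalar prediction (e.g., an energy)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.attend = AttentionOverLayers(hidden_dim)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, layer_features: list) -> torch.Tensor:
        node_feats = self.attend(layer_features)   # [num_atoms, hidden_dim]
        return self.out(node_feats).sum(dim=0)     # scalar prediction for one structure


if __name__ == "__main__":
    # Toy example: 4 interaction layers, 10 atoms, 64-dimensional features.
    feats = [torch.randn(10, 64) for _ in range(4)]
    head = FineTuneHead(hidden_dim=64)
    print(head(feats).shape)  # torch.Size([1])

In a fine-tuning setup of this kind, only the attention module and the output layer would receive gradient updates (e.g., by setting requires_grad = False on the backbone's parameters), which is consistent with the abstract's observation that early layers encode domain-general features worth reusing.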