Lederer Jonas, Gastegger Michael, Schütt Kristof T, Kampffmeyer Michael, Müller Klaus-Robert, Unke Oliver T
Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany.
Phys Chem Chem Phys. 2023 Oct 4;25(38):26370-26379. doi: 10.1039/d3cp03845a.
In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations. This enables a variety of applications beyond property prediction that would otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN or be learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, we demonstrate the versatility of our approach by selecting representative entries in chemical databases, automatically constructing coarse-grained force fields, and identifying reaction coordinates.
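The idea of reading moieties off atomic representations can be illustrated with a small sketch. This is not the method of the paper, whose details the abstract does not give. It swaps in plain k-means to assign each atom a type from its feature vector, then groups bonded atoms of the same type into a moiety. The feature vectors, the bond list, and the function names `kmeans` and `group_into_moieties` are all hypothetical; in practice the features would come from an MPNN.

```python
import numpy as np


def kmeans(features, n_types, n_iter=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [features[0]]
    for _ in range(n_types - 1):
        # Distance of every atom to its nearest already-chosen center.
        dist = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for k in range(n_types):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(axis=0)
    return labels


def group_into_moieties(labels, adjacency):
    """Connected components over bonds that join atoms of the same type."""
    n_atoms = len(labels)
    moiety = -np.ones(n_atoms, dtype=int)
    current = 0
    for start in range(n_atoms):
        if moiety[start] != -1:
            continue
        moiety[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.nonzero(adjacency[i])[0]:
                if moiety[j] == -1 and labels[j] == labels[i]:
                    moiety[j] = current
                    stack.append(j)
        current += 1
    return moiety


# Hypothetical 2-D atomic representations for a six-atom chain (toy data).
features = np.array([
    [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
    [1.0, 1.1], [1.1, 1.0], [1.0, 1.0],
])
adjacency = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adjacency[i, j] = adjacency[j, i] = 1

atom_types = kmeans(features, n_types=2)
moieties = group_into_moieties(atom_types, adjacency)
print("atom types:", atom_types)
print("moiety index per atom:", moieties)
```

On this toy chain the first three atoms and the last three atoms fall into separate moieties, since the only bond between them joins atoms of different types. A hard clustering like this is only a stand-in; a learned, differentiable assignment would be needed to train the representation from scratch as the abstract describes.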