Schmidt Jonathan, Pettersson Love, Verdozzi Claudio, Botti Silvana, Marques Miguel A L
Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany.
Department of Physics, Lund University, Box 118, 221 00 Lund, Sweden.
Sci Adv. 2021 Dec 3;7(49):eabi7948. doi: 10.1126/sciadv.abi7948.
Graph neural networks for crystal structures typically use the atomic positions and the atomic species as input. Unfortunately, this information is not available when predicting new materials, for which the precise geometrical information is unknown. We circumvent this problem by replacing the precise bond distances with embeddings of graph distances. This allows our networks to be applied directly in high-throughput studies based on both composition and crystal structure prototype without using relaxed structures as input. To train these networks, we curate a dataset of over 2 million density functional calculations of crystals with consistent calculation parameters. We apply the resulting model to the high-throughput search of 15 million tetragonal perovskites of composition ABCD. As a result, we identify several thousand potentially stable compounds and demonstrate that transfer learning from the newly curated dataset reduces the required training data by 50%.
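The idea of replacing bond distances with embeddings of graph distances can be illustrated with a small sketch. The abstract does not say how graph distances are defined, so this example assumes one plausible reading: each neighbor is labeled by the rank of its coordination shell (1 for the nearest shell, 2 for the next, and so on), and that integer indexes a table of embedding vectors. The names `shell_indices`, `embed_shells`, and `table`, the tolerance, and the distances are all hypothetical and are not taken from the paper.

```python
import numpy as np


def shell_indices(distances, tol=1e-3):
    """Rank one atom's neighbor distances into integer shells (1 = nearest).

    Distances closer than `tol` to the previous sorted distance share a shell.
    """
    distances = np.asarray(distances, dtype=float)
    ranks = np.empty(distances.shape[0], dtype=int)
    rank, previous = 0, None
    for idx in np.argsort(distances, kind="stable"):
        if previous is None or distances[idx] - previous > tol:
            rank += 1  # a gap larger than tol opens a new shell
        previous = distances[idx]
        ranks[idx] = rank
    return ranks


def embed_shells(ranks, table):
    """Look up one embedding row per edge; shells beyond the table share its last row."""
    clipped = np.clip(np.asarray(ranks), 1, table.shape[0]) - 1
    return table[clipped]


# Hypothetical neighbor distances (in angstrom) for one atom in a prototype cell.
prototype = [2.00, 2.00, 2.83, 2.00, 3.46]
# The same environment after a uniform 10% expansion, standing in for relaxation.
expanded = [1.1 * d for d in prototype]

# Random stand-in for a table that a real model would learn: 4 shells, 8 features.
table = np.random.default_rng(0).normal(size=(4, 8))

ranks = shell_indices(prototype)
edge_features = embed_shells(ranks, table)
print(ranks.tolist(), edge_features.shape)
```

Under this assumption the edge features depend only on the ordering of the neighbors, not on their exact separations, so the prototype and expanded environments yield identical inputs. That is the property that would let a model of this kind run on unrelaxed prototype structures.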