Hou Bowen, Wu Jinyuan, Qiu Diana Y
Department of Mechanical Engineering and Materials Science, Yale University, New Haven, CT, 06511, USA.
Nat Commun. 2024 Nov 2;15(1):9481. doi: 10.1038/s41467-024-53748-7.
Representation learning for the electronic structure problem is a major challenge of machine learning in computational condensed matter and materials physics. Within quantum mechanical first principles approaches, density functional theory (DFT) is the preeminent tool for understanding electronic structure, and the high-dimensional DFT wavefunctions serve as building blocks for downstream calculations of correlated many-body excitations and related physical observables. Here, we use variational autoencoders (VAE) for the unsupervised learning of DFT wavefunctions and show that these wavefunctions lie in a low-dimensional manifold within latent space. Our model autonomously determines the optimal representation of the electronic structure, avoiding limitations due to manual feature engineering. To demonstrate the utility of the latent space representation of the DFT wavefunction, we use it for the supervised training of neural networks (NN) for downstream prediction of quasiparticle bandstructures within the GW formalism. The GW prediction achieves a low error of 0.11 eV for a combined test set of two-dimensional metals and semiconductors, suggesting that the latent space representation captures key physical information from the original data. Finally, we explore the generative ability and interpretability of the VAE representation.
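The abstract does not specify the network architecture or the exact loss, so the following is only a minimal sketch of the generic VAE objective that this kind of unsupervised wavefunction encoding relies on: a reconstruction term plus a Kullback-Leibler penalty that pulls the latent distribution toward a standard normal. The function names, the squared-error reconstruction term, and the `beta` weight are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    mu and log_var are the encoder outputs (mean and log-variance of a
    diagonal Gaussian over the latent vector).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction term (assumed here to be a plain sum of squared errors)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL penalty (beta is hypothetical)."""
    return squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)
```

In a setup like the one described, `x` would stand for a flattened DFT wavefunction (for example, its real-space modulus on a grid), `mu` would serve as the low-dimensional latent representation, and that latent vector would then be fed to a separate supervised network that predicts GW quasiparticle energies.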