Knøsgaard Nikolaj Rørbæk, Thygesen Kristian Sommer
Computational Atomic-scale Materials Design (CAMD), Department of Physics, Technical University of Denmark, 2800, Kgs. Lyngby, Denmark.
Nat Commun. 2022 Feb 3;13(1):468. doi: 10.1038/s41467-022-28122-0.
Choosing optimal representations of atomic and electronic structures is essential when applying machine learning to the properties of materials. We address the problem of representing quantum states of electrons in a solid for the purpose of machine learning state-specific electronic properties. Specifically, we construct a fingerprint based on energy decomposed operator matrix elements (ENDOME) and radially decomposed projected density of states (RAD-PDOS), both of which are obtainable from a standard density functional theory (DFT) calculation. Using these fingerprints, we train a gradient boosting model on a set of 46k GW quasiparticle energies. The resulting model predicts the self-energy correction of states in materials not seen by the model with a mean absolute error of 0.14 eV. Including the material's calculated dielectric constant in the fingerprint reduces the error by a further 30%, which we find is due to an enhanced ability to learn the correlation/screening part of the self-energy. Our work paves the way for accurate estimates of quasiparticle band structures at the cost of a standard DFT calculation.
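The workflow summarized in the abstract (per-state fingerprints, a gradient boosting regressor, and evaluation on materials the model has not seen) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic random numbers rather than ENDOME/RAD-PDOS fingerprints or GW energies, scikit-learn's `GradientBoostingRegressor` stands in for whatever gradient boosting implementation the authors used, and the names `X`, `y`, and `material_id` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's dataset): one row per electronic state.
n_states, n_features, n_materials = 600, 8, 60
X = rng.normal(size=(n_states, n_features))            # placeholder fingerprint vectors
material_id = rng.integers(0, n_materials, n_states)   # material each state belongs to
# Placeholder target playing the role of the self-energy correction.
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2 + 0.05 * rng.normal(size=n_states)

# Hold out whole materials, so every test state comes from a material
# that contributed no states to the training set.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=material_id))

# Hyperparameters are arbitrary illustrative choices, not tuned values.
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=3, learning_rate=0.1, random_state=0
)
model.fit(X[train_idx], y[train_idx])

# Mean absolute error on states from unseen materials.
mae = mean_absolute_error(y[test_idx], model.predict(X[test_idx]))
print(f"MAE on held-out materials: {mae:.3f} (synthetic units)")
```

Splitting by material rather than by individual state is what makes the reported error correspond to "materials not seen by the model": a random per-state split would let states from the same material appear in both training and test sets.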