Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Phys Chem Chem Phys. 2023 Mar 15;25(11):8103-8116. doi: 10.1039/d3cp00258f.
Virtual high-throughput screening (VHTS) and machine learning (ML) with density functional theory (DFT) suffer from inaccuracies from the underlying density functional approximation (DFA). Many of these inaccuracies can be traced to the lack of derivative discontinuity that leads to a curvature in the energy with electron addition or removal. Over a dataset of nearly one thousand transition metal complexes typical of VHTS applications, we computed and analyzed the average curvature (i.e., deviation from piecewise linearity) for 23 density functional approximations spanning multiple rungs of "Jacob's ladder". While we observe the expected dependence of the curvatures on Hartree-Fock exchange, we note limited correlation of curvature values between different rungs of "Jacob's ladder". We train ML models (i.e., artificial neural networks or ANNs) to predict the curvature and the associated frontier orbital energies for each of these 23 functionals and then interpret differences in curvature among the different DFAs through analysis of the ML models. Notably, we observe spin to play a much more important role in determining the curvature of range-separated and double hybrids in comparison to semi-local functionals, explaining why curvature values are weakly correlated between these and other families of functionals. Over a space of 187.2k hypothetical compounds, we use our ANNs to pinpoint DFAs for which representative transition metal complexes have near-zero curvature with low uncertainty, demonstrating an approach to accelerate screening of complexes with targeted optical gaps.
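To illustrate what "average curvature" means in terms of frontier orbital energies, the following is a minimal sketch and not the authors' code. It assumes Janak's theorem, so that the slope of the energy with respect to fractional electron count equals the HOMO eigenvalue of the N-electron system at one end of the interval and the LUMO eigenvalue of the (N-1)-electron system at the other, and it assumes a simple quadratic model of the energy between the two integer electron counts. The function names and all numerical values are hypothetical.

```python
def average_curvature(homo_n: float, lumo_n_minus_1: float) -> float:
    """Average curvature of E over the interval [N-1, N] (same units as inputs).

    Assumption (Janak's theorem): dE/dN equals homo_n at N electrons and
    lumo_n_minus_1 at N-1 electrons, so the average second derivative over
    the unit interval is the difference of the two slopes.
    """
    return homo_n - lumo_n_minus_1


def quadratic_energy(x: float, e_start: float, e_end: float, curvature: float) -> float:
    """Model energy at fractional electron count N-1+x, with 0 <= x <= 1.

    Assumption: E is quadratic in x with fixed endpoints e_start = E(N-1) and
    e_end = E(N). A curvature of zero recovers exact piecewise linearity.
    """
    linear = (1.0 - x) * e_start + x * e_end
    return linear - 0.5 * curvature * x * (1.0 - x)


def midpoint_deviation(curvature: float) -> float:
    """Deviation from the straight line at x = 0.5 under the quadratic model."""
    return -curvature / 8.0


if __name__ == "__main__":
    # Hypothetical eigenvalues in eV, chosen only for illustration.
    c = average_curvature(homo_n=-5.0, lumo_n_minus_1=-8.0)
    print("average curvature (eV):", c)
    print("midpoint deviation from linearity (eV):", midpoint_deviation(c))
```

Under these assumptions a positive curvature corresponds to a convex energy curve that lies below the straight line between integer electron counts, and a negative curvature to a concave one; a functional with near-zero curvature for a given complex is the case the screening described above aims to identify.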