Nair Akhil S, Foppa Lucas, Scheffler Matthias
The NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, D-14195, Berlin, Germany.
Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195, Berlin, Germany.
Sci Data. 2025 Aug 29;12(1):1518. doi: 10.1038/s41597-025-05867-z.
Materials databases built from calculations based on density functional approximations play an important role in the discovery of materials with improved properties. Most databases thus constructed rely on the generalized gradient approximation (GGA) for electron exchange and correlation. This limits the reliability of these databases, as well as that of the artificial intelligence (AI) models trained on them, in particular for materials and properties that are not accurately described by GGA. Here, we describe a database of 7,024 inorganic materials presenting diverse structures and compositions. Crucially, the database was generated using hybrid functional calculations, efficiently implemented in the all-electron code FHI-aims. The database is used to evaluate the thermodynamic and electrochemical stability of oxides relevant to catalysis and energy-related applications. We illustrate how the database can be used to train AI models for material properties using the sure-independence screening and sparsifying operator (SISSO) approach.