Wan Zhongyu, Wang Quan-De, Liu Dongchang, Liang Jinhu
Low Carbon Energy Institute and School of Chemical Engineering, China University of Mining and Technology, Xuzhou, 221008, People's Republic of China.
Department of Physics, Sungkyunkwan University, Suwon 16419, Korea.
Phys Chem Chem Phys. 2021 Jul 28;23(29):15675-15684. doi: 10.1039/d1cp02066h.
Metal oxides are widely used in chemistry, physics and materials science. The oxygen vacancy formation energy is a key parameter for describing the chemical, mechanical, and thermodynamic properties of metal oxides. Obtaining it quickly and accurately remains a challenge for both experimental and theoretical researchers. Herein, we propose a machine learning model for predicting the oxygen vacancy formation energy, based on data-driven analysis and the definition of simple descriptors. Starting from a database of oxygen vacancy formation energies for 1750 structurally diverse metal oxides, we define new descriptors that avoid the shortcomings of molecular fingerprints, molecular graph descriptors and site descriptors. The descriptors have clear physical meaning and are widely applicable. Multiple linear regression analysis is then used to screen important features for model development, yielding two strongly associated features. The selected descriptors are used as input to train 21 machine learning models, from which the most accurate is selected and developed. Finally, systematic error analysis shows that the least squares support vector regression method performs best in predicting the target oxygen vacancy formation energy, and the prediction accuracy is further verified on an external dataset. Our work establishes a novel and simple computational approach for accurate prediction of the oxygen vacancy formation energy of metal oxides and highlights the usefulness of data-driven analysis for metal oxide materials research.
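The workflow described above (feature screening by multiple linear regression, followed by least squares support vector regression) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the five candidate features are placeholders and not the descriptors defined in the paper, the screening rule (rank by absolute standardized regression coefficient) is one plausible reading of "multiple linear regression analysis", and the RBF kernel and the hyperparameters `gamma` and `sigma` are arbitrary choices. The LS-SVR step uses the standard dual formulation, in which training reduces to solving one linear system.

```python
import numpy as np

def screen_features(X, y, k=2):
    """Keep the k features with the largest |standardized OLS coefficient|.
    (Assumed screening rule; the paper's exact criterion is not given here.)"""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    design = np.column_stack([np.ones(len(Xs)), Xs])  # intercept + features
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.sort(np.argsort(-np.abs(coef[1:]))[:k])

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma=100.0, sigma=2.0):
    """Solve the LS-SVR dual system
    [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=2.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic stand-in data: 5 candidate features, only columns 0 and 3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.3 * np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Step 1: screen features on the training split only.
selected = screen_features(X_train, y_train, k=2)

# Step 2: standardize the selected features with training statistics.
mu = X_train[:, selected].mean(axis=0)
sd = X_train[:, selected].std(axis=0)
Z_train = (X_train[:, selected] - mu) / sd
Z_test = (X_test[:, selected] - mu) / sd

# Step 3: fit LS-SVR and evaluate on the held-out split.
b, alpha = lssvr_fit(Z_train, y_train)
y_pred = lssvr_predict(Z_train, b, alpha, Z_test)
rmse = float(np.sqrt(np.mean((y_pred - y_test) ** 2)))

print("selected feature columns:", selected.tolist())
print("held-out RMSE:", rmse)
```

In practice `gamma` and `sigma` would be tuned by cross-validation, and the held-out split here plays the role that the external dataset plays in the abstract.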