Department of Computer Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, United States.
Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States.
J Chem Inf Model. 2024 Jan 22;64(2):327-339. doi: 10.1021/acs.jcim.3c00594. Epub 2024 Jan 10.
Catalyst screening is a critical step in the discovery and development of heterogeneous catalysts, which are vital for a wide range of chemical processes. In recent years, computational catalyst screening, primarily through density functional theory (DFT), has gained significant attention as a method for identifying promising catalysts. However, computing adsorption energies for all likely chemical intermediates in complex surface chemistries is costly, and the results carry the intrinsic idiosyncrasies of the methods or data sets used. This study introduces a novel machine learning (ML) method that learns adsorption energies from multiple DFT functionals by using invariant molecular representations (IMRs). To do this, we first extract molecular fingerprints for the reaction intermediates and then use a Siamese-neural-network-based training strategy to learn IMRs across all available functionals. Our Siamese-network-based representations outperform other molecular representations in predicting adsorption energies. Notably, against mean absolute adsorption energies of 0.43 eV (PBE-D3), 0.46 eV (BEEF-vdW), 0.81 eV (RPBE), and 0.37 eV (SCAN+rVV10), our IMR method achieved the lowest mean absolute errors (MAEs) of 0.18, 0.10, 0.16, and 0.18 eV, respectively. These empirical findings demonstrate the efficacy, robustness, and dependability of the proposed ML approach in predicting adsorption energies, specifically for propane dehydrogenation on a platinum catalyst surface.
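The abstract does not give the network architecture, the loss function, or the way training pairs are formed, so the following is only a minimal sketch of the general Siamese idea, not the authors' implementation. It assumes a single shared linear-plus-tanh encoder, a standard contrastive loss, and that a "same" pair consists of two descriptions of the same intermediate obtained under different functionals. All names (`init_weights`, `encode`, `contrastive_loss`) and the toy input vectors are hypothetical.

```python
import math
import random


def init_weights(n_in, n_out, seed=0):
    """Random weights for a single shared linear layer (hypothetical encoder)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def encode(weights, features):
    """Shared encoder: one linear layer followed by tanh.

    Both branches of the Siamese pair call this with the SAME weights.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def contrastive_loss(weights, x1, x2, same, margin=1.0):
    """Standard contrastive loss on a pair of inputs.

    same=True  (assumed: same intermediate, different functional): pull together.
    same=False (assumed: different intermediates): push apart up to `margin`.
    """
    d = distance(encode(weights, x1), encode(weights, x2))
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy inputs standing in for fingerprint-derived feature vectors (made up).
weights = init_weights(n_in=4, n_out=3, seed=0)
species_a_functional_1 = [1.0, 0.0, 1.0, 0.2]
species_a_functional_2 = [1.0, 0.0, 1.0, 0.3]
species_b_functional_1 = [0.0, 1.0, 0.0, 0.2]

print(contrastive_loss(weights, species_a_functional_1, species_a_functional_2, same=True))
print(contrastive_loss(weights, species_a_functional_1, species_b_functional_1, same=False))
```

Minimizing such a loss over many pairs would push the encoder toward embeddings that depend on the intermediate rather than on the functional; a downstream regressor could then map those embeddings to adsorption energies. Whether the paper uses this particular loss or pairing scheme is not stated in the abstract.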