Kayode Gbolade O, Montemore Matthew M
Department of Chemical and Biomolecular Engineering, Tulane University, New Orleans, Louisiana 70118, United States.
JACS Au. 2023 Dec 19;4(1):80-91. doi: 10.1021/jacsau.3c00419. eCollection 2024 Jan 22.
Machine learning has been successfully applied in recent years to screen materials for a variety of applications. However, despite recent advances, most screening-based machine learning approaches are limited in generality and transferability, requiring new models to be created from scratch for each new application. This is particularly apparent in catalysis, where there are many possible intermediates and transition states of interest in addition to a large number of potential catalytic materials. In this work, we developed a new machine learning framework that is built on chemical principles and allows the creation of general, interpretable, reusable models. Our new architecture uses latent variables to create a set of submodels that each take on a relatively simple learning task, leading to higher data efficiency and promoting transfer learning. This architecture infuses fundamental chemical principles, such as the existence of elements as discrete entities. We show that this architecture allows for the creation of models that can be reused for many different applications, providing significant improvements in efficiency and convenience. For example, our architecture allows simultaneous prediction of adsorption energies for many adsorbates on a broad array of alloy surfaces with mean absolute errors (MAEs) around 0.20-0.25 eV. The integration of latent variables provides physical interpretability, as predictions can be explained in terms of the learned chemical environment as represented by the latent space. Further, these latent variables also serve as new feature representations, allowing efficient transfer learning. For example, new models with useful levels of accuracy can be created with fewer than 10 data points, including transfer learning to an experimental data set with an MAE less than 0.15 eV. Lastly, we show that our new machine learning architecture is general and robust enough to handle heterogeneous and multifidelity data sets, allowing researchers to leverage existing data sets to speed up screening using their own computational setup.
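The abstract does not give implementation details, but the idea of element-wise submodels that feed a shared latent space can be illustrated with a minimal sketch. The code below is an assumption-laden toy in PyTorch, not the authors' published model: the class names, descriptor dimensions, and the way per-element contributions are pooled into the latent vector are all hypothetical, chosen only to show how simple submodels, a learned latent representation, and a small prediction head could fit together.

    import torch
    import torch.nn as nn

    class LatentSubmodel(nn.Module):
        # One simple submodel: maps a per-site descriptor for a single element
        # to a latent "chemical environment" vector (hypothetical architecture).
        def __init__(self, n_features, latent_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, 32),
                nn.Tanh(),
                nn.Linear(32, latent_dim),
            )

        def forward(self, x):
            return self.net(x)

    class AdsorptionEnergyModel(nn.Module):
        # Treats each element as a discrete entity with its own submodel,
        # pools the submodel outputs into one latent vector, and predicts
        # an adsorption energy from that latent representation.
        def __init__(self, elements, n_features, latent_dim):
            super().__init__()
            self.submodels = nn.ModuleDict(
                {el: LatentSubmodel(n_features, latent_dim) for el in elements}
            )
            self.head = nn.Sequential(nn.Linear(latent_dim, 16), nn.Tanh(), nn.Linear(16, 1))

        def forward(self, element_batches):
            # element_batches: dict mapping element symbol -> tensor of site descriptors
            latent = sum(
                self.submodels[el](x).mean(dim=0) for el, x in element_batches.items()
            )
            return self.head(latent)

    # Hypothetical usage: a two-element alloy surface with 4-dimensional site descriptors.
    model = AdsorptionEnergyModel(elements=["Cu", "Pt"], n_features=4, latent_dim=8)
    batch = {"Cu": torch.randn(3, 4), "Pt": torch.randn(2, 4)}
    energy = model(batch)  # predicted adsorption energy (1-element tensor)

In a sketch like this, transfer learning to a small data set (e.g., fewer than 10 points, as in the abstract) would amount to freezing the trained submodels, treating their pooled latent output as a fixed feature representation, and refitting only the small prediction head on the new data.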