Osaro Etinosa, Fajardo-Rojas Fernando, Cooper Gregory M, Gómez-Gualdrón Diego, Colón Yamil J
Department of Chemical and Biomolecular Engineering, University of Notre Dame, IN 46556, USA.
Department of Chemical and Biological Engineering, Colorado School of Mines, 1500 Illinois St, Golden, CO 80401, USA.
Chem Sci. 2024 Oct 8;15(42):17671-84. doi: 10.1039/d4sc02156h.
Adsorption is a fundamental process in materials science and engineering because it plays a critical role in applications such as gas storage and separation. Understanding and predicting gas adsorption within porous materials demands comprehensive computational simulations that are often resource-intensive, which limits the identification of promising materials. Active learning (AL) methods offer an effective strategy to reduce the computational burden by selectively acquiring the data most critical for model training. Metal-organic frameworks (MOFs) show great potential across adsorption applications because of their porosity and modular construction, which yield diverse pore sizes and chemistries and make them an ideal platform for developing adsorption models. Here, we demonstrate the efficacy of AL in predicting gas adsorption within MOFs using "alchemical" molecules and their interactions as surrogates for real molecules. We first applied AL separately to each MOF, reducing the training dataset size by 57.5% while retaining predictive accuracy. Subsequently, we combined the refined datasets across 1800 MOFs to train a multilayer perceptron (MLP) model, which successfully predicted adsorption of real molecules. Furthermore, by integrating MOF features into the AL framework using principal component analysis (PCA), we navigated MOF space effectively, achieving high predictive accuracy with only a subset of MOFs. Our results highlight AL's efficiency in reducing dataset size, enhancing model performance, and offering insights into adsorption phenomena in large datasets of MOFs. This study underscores AL's crucial role in advancing computational materials science and in developing more accurate and less data-intensive models for gas adsorption in porous materials.
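The per-MOF active learning step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the surrogate model, the acquisition rule, or the stopping criterion, so everything below is an assumption made for illustration: a zero-mean Gaussian process with a fixed squared-exponential kernel, selection of the pool point with the largest predictive uncertainty, and a hypothetical Langmuir-like function `toy_uptake` standing in for an expensive adsorption simulation. The two input features (a scaled pressure and a scaled interaction strength) are likewise hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP (unit prior variance)."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    solved = np.linalg.solve(k_tt, k_qt.T)
    var = 1.0 - np.einsum("ij,ji->i", k_qt, solved)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_uptake(x):
    """Hypothetical stand-in for a simulation: Langmuir-like fractional uptake."""
    pressure, strength = x[:, 0], x[:, 1]
    k = 20.0 * strength
    return k * pressure / (1.0 + k * pressure)

def active_learning(pool, simulate, n_init=5, max_points=40, tol=0.02, seed=0):
    """Label pool points one at a time, always choosing the most uncertain one."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(pool), size=n_init, replace=False)]
    values = list(simulate(pool[labeled]))
    while len(labeled) < max_points:
        _, std = gp_predict(pool[labeled], np.array(values), pool)
        std[labeled] = 0.0               # never re-select a labeled point
        candidate = int(np.argmax(std))
        if std[candidate] < tol:         # assumed stopping rule: model is confident everywhere
            break
        values.append(simulate(pool[candidate:candidate + 1])[0])
        labeled.append(candidate)
    return labeled, np.array(values)

# Candidate pool: a 15 x 15 grid over the two hypothetical, scaled features.
grid = np.linspace(0.0, 1.0, 15)
pool = np.array([[p, s] for p in grid for s in grid])
labeled, values = active_learning(pool, toy_uptake)
print(f"simulated {len(labeled)} of {len(pool)} candidate points")
```

The loop only calls the expensive function for points the surrogate is least sure about, which is the general mechanism by which active learning can shrink a training set. The fixed kernel length scale and the tolerance `tol` are arbitrary choices here; in practice they would be fitted or tuned.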