Tang Hongjian, Xu Qisong, Wang Mao, Jiang Jianwen
Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore.
ACS Appl Mater Interfaces. 2021 Nov 17;13(45):53454-53467. doi: 10.1021/acsami.1c13786. Epub 2021 Oct 19.
At present, 100 000+ metal-organic frameworks (MOFs) have been synthesized, and it is challenging to identify the best candidate for a specific application. In this study, MOFs are rapidly screened via a hierarchical approach for propane/propylene (C₃H₈/C₃H₆) separation. First, the adsorption capacity and selectivity of a C₃H₈/C₃H₆ mixture in "Computation-Ready, Experimental" (CoRE) MOFs are predicted via a molecular simulation (MS) method. The relationships between separation metrics and structural factors are established, and top-performing CoRE MOFs are identified. Then, machine learning (ML) models are trained on the CoRE MOFs using pore size, pore geometry, and framework chemistry as feature descriptors. By introducing binned pore size distributions and geometric descriptors, the accuracy of the ML models is substantially improved. The feature importance of the descriptors is physically interpreted using Gini impurities and SHapley Additive exPlanations (SHAP). Subsequently, the ML models are used to rapidly screen experimental "Cambridge Structural Database" (CSD) MOFs and hypothetical MOFs for C₃H₈/C₃H₆ separation. For the CSD MOFs, the out-of-sample predictions agree well with simulation results, demonstrating the excellent transferability of the ML models from the CoRE to the CSD MOFs. Moreover, nine CSD MOFs are identified as having separation performance superior to that of the top-performing CoRE MOFs. Finally, the similarity and diversity among experimental and hypothetical MOFs are visualized and compared using t-Distributed Stochastic Neighbor Embedding (t-SNE) feature projections. Remarkably, the CoRE and CSD MOFs are revealed to share a close similarity in both chemical and geometric feature spaces. By synergizing MS and ML, the hierarchical approach developed in this study would advance the rapid screening of MOFs across different databases toward industrially important separation processes.
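The two quantities the abstract leans on, adsorption selectivity and binned pore size distributions, can be illustrated with a minimal sketch. The abstract gives neither the exact selectivity definition nor the binning scheme, so what follows are assumptions: the selectivity uses the common textbook form (ratio of adsorbed amounts divided by ratio of bulk mole fractions), and the function names `adsorption_selectivity` and `binned_psd`, the bin edges, and all numbers are hypothetical.

```python
from bisect import bisect_right


def adsorption_selectivity(q_a, q_b, y_a, y_b):
    """Selectivity of A over B: (q_a / q_b) / (y_a / y_b).

    q_a, q_b: adsorbed amounts of each component (same units).
    y_a, y_b: bulk-phase mole fractions of each component.
    Assumed textbook definition; the paper may define it differently.
    """
    if q_b <= 0 or y_a <= 0 or y_b <= 0:
        raise ValueError("q_b, y_a and y_b must be positive")
    return (q_a / q_b) / (y_a / y_b)


def binned_psd(pore_diameters, weights, bin_edges):
    """Turn a pore size distribution into a fixed-length feature vector.

    Each pore diameter contributes its weight to the half-open bin
    [edge_i, edge_{i+1}) it falls in; diameters outside the edges are
    ignored. The result is normalized to sum to 1 (or left as zeros
    if nothing falls inside the bins).
    """
    counts = [0.0] * (len(bin_edges) - 1)
    for d, w in zip(pore_diameters, weights):
        if bin_edges[0] <= d < bin_edges[-1]:
            idx = bisect_right(bin_edges, d) - 1
            counts[idx] += w
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


# Hypothetical numbers, for illustration only.
selectivity = adsorption_selectivity(q_a=2.0, q_b=1.0, y_a=0.5, y_b=0.5)
features = binned_psd(
    pore_diameters=[3.0, 5.0, 5.5, 20.0],  # in angstroms (assumed unit)
    weights=[1.0, 1.0, 2.0, 1.0],
    bin_edges=[0, 4, 8, 12],
)
```

A fixed-length vector of this kind can sit alongside scalar pore descriptors as model input, which is one plausible reading of how binned distributions add information beyond a single pore-size value.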