School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China.
J Chem Inf Model. 2019 Nov 25;59(11):4636-4644. doi: 10.1021/acs.jcim.9b00623. Epub 2019 Nov 12.
In this work, we propose a computational framework for machine learning prediction of the structural and performance properties of nanoporous materials for methane storage applications. Two descriptors based on pore geometry barcodes were developed for this purpose: the first is a set of distances from a structure to the most diverse set of structures in barcode space, and the second extracts and uses the most important features of the barcodes. To identify the optimal conditions for machine learning prediction, we first investigated the effects of the training set preparation method, the training set size, and the machine learning model. Our analysis showed that kernel ridge regression provides the highest prediction accuracy, and that a randomly selected 5% of the structures in the entire set works well as a training set. Both descriptors accurately predicted the performance properties, and even the structural properties, of zeolites. Furthermore, we demonstrated that our approach accurately predicts the properties of metal-organic frameworks, which suggests that it could be readily applied to other types of nanoporous materials.
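The distance-based descriptor and the kernel ridge regression step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors and target property are synthetic, plain Euclidean distance stands in for the barcode distance, greedy farthest-point sampling is only one assumed way of choosing a "most diverse set", and the function names and hyperparameters (`gamma`, `lam`, 20 landmarks) are hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def farthest_point_landmarks(x, k, start=0):
    """Greedy max-min selection of k mutually distant rows of x.

    Assumption: used here as a simple proxy for a 'most diverse set'.
    """
    chosen = [start]
    min_d = pairwise_dist(x, x[[start]]).ravel()
    for _ in range(k - 1):
        nxt = int(np.argmax(min_d))  # point farthest from all chosen so far
        chosen.append(nxt)
        min_d = np.minimum(min_d, pairwise_dist(x, x[[nxt]]).ravel())
    return np.array(chosen)


def krr_fit(d_train, y_train, gamma, lam):
    """Closed-form kernel ridge regression with a Gaussian kernel."""
    k = np.exp(-gamma * pairwise_dist(d_train, d_train) ** 2)
    return np.linalg.solve(k + lam * np.eye(len(d_train)), y_train)


def krr_predict(d_new, d_train, alpha, gamma):
    """Predict with the dual coefficients returned by krr_fit."""
    k = np.exp(-gamma * pairwise_dist(d_new, d_train) ** 2)
    return k @ alpha


rng = np.random.default_rng(0)
n = 2000
# Synthetic stand-ins: real inputs would be barcode-derived quantities.
features = rng.normal(size=(n, 8))
target = np.sin(features[:, 0]) + 0.5 * features[:, 1] ** 2

# Descriptor: distances from each structure to the diverse landmark set.
landmarks = farthest_point_landmarks(features, k=20)
descriptor = pairwise_dist(features, features[landmarks])

# Random 5% of the full set as training data; the rest is held out.
train_idx = rng.choice(n, size=n // 20, replace=False)
test_mask = np.ones(n, dtype=bool)
test_mask[train_idx] = False

# Centre the target so the kernel model only fits deviations from the mean.
mean_y = target[train_idx].mean()
alpha = krr_fit(descriptor[train_idx], target[train_idx] - mean_y,
                gamma=0.05, lam=1e-3)
pred = krr_predict(descriptor[test_mask], descriptor[train_idx],
                   alpha, gamma=0.05) + mean_y

mae = np.abs(pred - target[test_mask]).mean()
print(f"held-out mean absolute error: {mae:.3f}")
```

In practice the kernel width and regularisation strength would be tuned by cross-validation on the training subset; the values above are arbitrary and the printed error says nothing about the accuracy reported in the paper.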