Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29201, United States.
Department of Mechanical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States.
J Phys Chem A. 2021 Jan 14;125(1):435-450. doi: 10.1021/acs.jpca.0c08103. Epub 2020 Dec 23.
Prediction models of lattice thermal conductivity (κ) have wide applications in the discovery of thermoelectrics, thermal barrier coatings, and thermal management of semiconductors. However, κ is notoriously difficult to predict. Although classic models such as the Debye-Callaway model and the Slack model have been used to approximate the κ of inorganic compounds, their accuracy is far from satisfactory. Herein we propose a genetic programming-based symbolic regression (SR) approach for finding analytical κ models and compare them with multilayer perceptron neural networks and random forest regression models using a hybrid cross-validation (CV) approach that combines k-fold CV and holdout validation. Our SR approach discovered four formulae that outperform the Slack formula as evaluated on our dataset. Through analysis of our models' performance and the generated formulae, we found that the trained formulae successfully reproduce the correct physical law that governs the lattice thermal conductivity of materials. We also systematically show that extrapolative prediction over datasets whose distributions differ from that of the training set remains a major challenge for both SR and machine learning-based prediction models.
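To make the hybrid validation scheme concrete, the following is a minimal sketch of how k-fold CV can be combined with a holdout set. It is not the authors' code: the function name `hybrid_cv_splits`, the default `k=5`, the 20% holdout fraction, and the random shuffling are all illustrative assumptions, since the abstract does not state these details.

```python
import numpy as np


def hybrid_cv_splits(n_samples, k=5, holdout_fraction=0.2, seed=0):
    """Split sample indices into a holdout set plus k CV folds.

    Hypothetical helper: k, holdout_fraction, and seed are assumed values,
    not settings reported in the paper.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)

    # Reserve a holdout set that is never used during cross-validation.
    n_holdout = int(round(holdout_fraction * n_samples))
    holdout_idx = order[:n_holdout]
    dev_idx = order[n_holdout:]

    # Partition the remaining samples into k nearly equal parts.
    parts = np.array_split(dev_idx, k)

    folds = []
    for i in range(k):
        val_idx = parts[i]
        train_idx = np.concatenate([parts[j] for j in range(k) if j != i])
        folds.append((train_idx, val_idx))
    return folds, holdout_idx


# Usage: each candidate model is fit on train_idx and scored on val_idx,
# and the selected model is scored once on holdout_idx at the end.
folds, holdout_idx = hybrid_cv_splits(n_samples=100)
```

A random holdout like this one shares the distribution of the training data, so it measures interpolative performance. Probing the extrapolation problem described above would require a holdout set drawn from a different distribution.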