Department of Mechanical Engineering, Indian Institute of Technology Guwahati, Guwahati, 781039, India.
School of Engineering, London South Bank University, 103 Borough Road, London, SE1 0AA, UK.
Sci Rep. 2023 Mar 23;13(1):4811. doi: 10.1038/s41598-023-31461-7.
Nearly 10^8 types of high-entropy alloys (HEAs) can be developed from about 64 elements in the periodic table. A major challenge for materials scientists and metallurgists at this stage is to predict their crystal structure and, therefore, their mechanical properties, so as to reduce experimental effort, which is energy- and time-intensive. In this paper, we show that machine learning (ML) can be used for phase prediction to develop novel HEAs. We tested five robust algorithms, namely K-nearest neighbours (KNN), support vector machine (SVM), decision tree classifier (DTC), random forest classifier (RFC) and XGBoost (XGB), in their vanilla form (base models) on a large dataset screened specifically from experimental data on HEAs fabricated by melting and casting. This screening was necessary to avoid the discrepancy inherent in comparing HEAs obtained from different synthesis routes, which causes spurious effects when treating imbalanced data, an erroneous practice we observed in the reported literature. We found that (i) the RFC model's predictions were more reliable than those of the other models and (ii) synthetic data augmentation is not a sound practice in materials science, especially for developing HEAs, where it cannot assure phase information reliably. To substantiate our claim, we compared the vanilla RFC (V-RFC) model on the original data (1200 datasets) with the SMOTE-Tomek links augmented RFC (ST-RFC) model on the new data (1200 original + 192 generated = 1392 datasets). Although the ST-RFC model showed a higher average test accuracy of 92%, no significant improvement was observed when examining the number of correct and incorrect predictions using the confusion matrix and ROC-AUC scores for individual phases. Based on our RFC model, we report the development of a new HEA (NiCuFeCoAl) exhibiting an FCC phase, demonstrating the robustness of our predictions.
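To illustrate why synthetic augmentation cannot assure phase information, the following minimal Python sketch reproduces only the interpolation step at the heart of SMOTE-style oversampling. It is not the authors' pipeline and not the imbalanced-learn implementation: the function names, the two-descriptor feature vectors and their values are hypothetical, and the Tomek-links cleaning step is omitted. Each generated point is a convex combination of two real minority-phase samples and simply inherits their phase label, with no experimental confirmation that an alloy at that point would form the same phase.

```python
import math
import random

def nearest_neighbours(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda q: math.dist(point, q))[:k]

def smote_like_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic feature vectors by SMOTE-style interpolation.

    Assumption: `minority` holds feature vectors that all share one phase
    label; every synthetic vector is assigned that same label by construction.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        candidates = [p for p in minority if p is not base]
        neighbour = rng.choice(nearest_neighbours(base, candidates, k))
        gap = rng.random()  # interpolation weight drawn from [0, 1)
        # The new point lies on the line segment between base and neighbour.
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-phase samples: (descriptor_1, descriptor_2).
minority_phase = [(7.8, 4.1), (8.0, 3.6), (8.3, 5.0), (7.5, 4.4)]
new_points = smote_like_samples(minority_phase, n_new=3)

# Dataset sizes quoted in the abstract.
original_count, generated_count = 1200, 192
augmented_count = original_count + generated_count
```

Because the interpolated points never leave the region spanned by the existing minority samples, they add no independent evidence about phase formation, which is consistent with the abstract's observation that higher average accuracy did not translate into better per-phase predictions.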