Peivaste Iman, Jossou Ericmoore, Tiamiyu Ahmed A
Department of Mechanical and Manufacturing Engineering, University of Calgary, Calgary, Alberta, T2N 1N4, Canada.
Nuclear Science and Technology Department, Brookhaven National Laboratory, Upton, NY, 11973, USA.
Sci Rep. 2023 Dec 18;13(1):22556. doi: 10.1038/s41598-023-50044-0.
High-entropy alloys (HEAs) are a promising class of materials with exceptional structural and functional properties. However, their design and optimization are challenging because of the large composition-phase space and the complex, diverse nature of phase formation dynamics. This study proposes a data-driven approach that uses machine learning (ML) to predict HEA phases and how they depend on composition. Using a comprehensive dataset of 5692 experimental records spanning 50 elements and 11 phase categories, we compare the performance of various ML models and identify the most influential features for accurate phase prediction. Class imbalance is addressed with data augmentation, which raises the number of records to 1500 in each phase category so that all categories are equally represented. XGBoost and Random Forest consistently outperform the other models, achieving 86% accuracy in predicting all phases. This work also provides an extensive analysis of HEA phase formers, showing how elements and features contribute to the presence of specific phases, and examines how including different phases affects ML model accuracy and feature significance. Notably, because feature importance varies across models and phases, the findings underscore the need to select ML models according to the specific application and desired predictions. The study advances the understanding of HEA phase formation and supports targeted alloy design.
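The class-balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which augmentation method was used, so this example assumes plain random oversampling with replacement, and it also assumes that any category already larger than the target is capped by downsampling. The function name `oversample_to_target`, the toy phase labels, and the two-value descriptor tuples are hypothetical and are not taken from the study's dataset.

```python
import random
from collections import Counter, defaultdict


def oversample_to_target(records, labels, target_per_class=1500, seed=0):
    """Return a class-balanced copy of (records, labels).

    Assumption: minority classes are grown by random resampling with
    replacement; classes above the target are randomly downsampled.
    This is an illustrative stand-in, not the study's actual method.
    """
    rng = random.Random(seed)

    # Group the records by their phase label.
    by_class = defaultdict(list)
    for record, label in zip(records, labels):
        by_class[label].append(record)

    balanced_records, balanced_labels = [], []
    for label, class_records in by_class.items():
        if len(class_records) >= target_per_class:
            # Cap large classes (assumption, see lead-in).
            chosen = rng.sample(class_records, target_per_class)
        else:
            # Keep every original record and add resampled duplicates.
            n_extra = target_per_class - len(class_records)
            chosen = class_records + rng.choices(class_records, k=n_extra)
        balanced_records.extend(chosen)
        balanced_labels.extend([label] * target_per_class)
    return balanced_records, balanced_labels


# Hypothetical toy data: three phase categories with unequal counts.
# Each record is a made-up (descriptor_1, descriptor_2) tuple.
toy_counts = {"FCC": 40, "BCC": 10, "FCC+BCC": 5}
toy_rng = random.Random(42)
toy_records, toy_labels = [], []
for phase, count in toy_counts.items():
    for _ in range(count):
        toy_records.append((toy_rng.random(), toy_rng.random()))
        toy_labels.append(phase)

balanced_records, balanced_labels = oversample_to_target(
    toy_records, toy_labels, target_per_class=50, seed=0
)
balanced_counts = Counter(balanced_labels)
print(balanced_counts)
```

With the default `target_per_class=1500`, the same call would bring every category to the per-category count quoted in the abstract. Duplicating records is the simplest way to balance classes. If balancing is done before the train/test split, copies of one record can land on both sides of the split and inflate the measured accuracy, so the resampling is normally applied to the training portion only.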