Kamnis Spyros, Delibasis Konstantinos
Department of Computer Science and Biomedical Informatics, University of Thessaly, 35100, Lamia, Greece.
Castolin Eutectic-Monitor Coatings Ltd., Newcastle upon Tyne, NE29 8SE, UK.
Sci Rep. 2025 Apr 7;15(1):11861. doi: 10.1038/s41598-025-95170-z.
This study introduces a machine learning model based on a language transformer to predict key mechanical properties of high-entropy alloys (HEAs), addressing the challenges posed by their complex, multi-principal element compositions and by limited experimental data. The transformer is pre-trained on extensive synthetic materials data and then fine-tuned on specific HEA datasets, allowing it to capture intricate elemental interactions through self-attention mechanisms. This transfer learning approach mitigates data scarcity and improves predictive accuracy for properties such as elongation (%) and ultimate tensile strength relative to traditional regression models such as random forests and Gaussian processes. Visualizing the attention weights improves the model's interpretability and reveals significant elemental relationships that align with known metallurgical principles. This work demonstrates the potential of transformer models to accelerate materials discovery and optimization through accurate property predictions, thereby advancing the field of materials informatics. To fully realize the model's potential in practical applications, future studies should incorporate more advanced preprocessing methods, realistic constraints during synthetic dataset generation, and more refined tokenization techniques.
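As a rough illustration of the ideas named in the abstract (tokenizing a composition and inspecting self-attention weights between element tokens), the following minimal Python sketch may help. It is not the authors' model: the function names `tokenize_composition`, `embed`, and `attention_weights`, the example alloy string, and the `PLACEHOLDER_FEATURES` values are all assumptions made for illustration, and the attention uses the raw token vectors as both queries and keys, with no learned projections.

```python
import math
import re

# Arbitrary placeholder vectors, NOT learned embeddings or measured properties.
PLACEHOLDER_FEATURES = {
    "Al": [0.2, 0.9],
    "Co": [0.7, 0.3],
    "Cr": [0.5, 0.5],
    "Fe": [0.6, 0.4],
    "Ni": [0.8, 0.2],
}


def tokenize_composition(formula):
    """Split a string such as 'Al0.5CoCrFeNi' into (element, amount) pairs.

    Elements without an explicit amount are assumed to have amount 1.0.
    """
    pairs = re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula)
    return [(element, float(amount) if amount else 1.0) for element, amount in pairs]


def embed(tokens):
    """Append each element's molar fraction to its placeholder vector."""
    total = sum(amount for _, amount in tokens)
    return [PLACEHOLDER_FEATURES[element] + [amount / total] for element, amount in tokens]


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    norm = sum(exps)
    return [value / norm for value in exps]


def attention_weights(vectors):
    """Scaled dot-product attention weights with queries = keys = vectors."""
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    return weights


if __name__ == "__main__":
    tokens = tokenize_composition("Al0.5CoCrFeNi")
    weights = attention_weights(embed(tokens))
    for (element, _), row in zip(tokens, weights):
        print(element, [round(w, 3) for w in row])
```

Each printed row is one element's attention distribution over all elements in the alloy and sums to 1. In a trained transformer these weights would come from learned query and key projections, and a heatmap of such a matrix is the kind of visualization the abstract refers to.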