Lee Jin-Woong, Park Chaewon, Do Lee Byung, Park Joonseo, Goo Nam Hoon, Sohn Kee-Sun
Nanotechnology & Advanced Materials Engineering, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul, 143-747, South Korea.
Advanced Research Team, Hyundai Steel DangJin Works, DangJin, Chungnam, 31719, South Korea.
Sci Rep. 2021 May 26;11(1):11012. doi: 10.1038/s41598-021-90237-z.
Predicting mechanical properties such as yield strength (YS) and ultimate tensile strength (UTS) remains difficult in practice, despite the many well-established theoretical and empirical models. A data-driven approach should therefore be a basic step in any YS/UTS prediction. For this study, we collected 16 descriptors (attributes) that capture compositional and processing information, together with the corresponding YS/UTS values, for 5473 thermo-mechanically controlled processed (TMCP) steel alloys. We set up an integrated machine-learning (ML) platform consisting of 16 ML algorithms to predict YS/UTS from the descriptors. The platform comprised regularization-based linear regression algorithms, ensemble ML algorithms, and several non-linear ML algorithms. Although most real-world industrial data are messy, we obtained acceptable holdout test results, such as R > 0.6 and MSE < 0.01, for seven non-linear ML algorithms. The seven fully trained non-linear ML models were then used for 'inverse design (prediction)' based on an elitist-reinforced, non-dominated sorting genetic algorithm (NSGA-II). NSGA-II enabled us to predict solutions that exhibit desirable YS/UTS values for each ML algorithm. In addition, the NSGA-II-driven solutions in the 16-dimensional input feature space were visualized using the holographic research strategy (HRS) so that the inverse-predicted solutions for each ML algorithm could be compared and analyzed systematically.
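The holdout acceptance check described above (R > 0.6 and MSE < 0.01) can be illustrated with a minimal sketch. It is not the authors' code. It assumes that R denotes the Pearson correlation between measured and predicted values (the abstract does not define it) and that the targets are scaled so that an MSE near 0.01 is meaningful. The function names `holdout_split`, `mse`, `pearson_r`, and `passes_threshold` are hypothetical, as is the 20% test fraction.

```python
import math
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def passes_threshold(y_true, y_pred, r_min=0.6, mse_max=0.01):
    """Apply the acceptance criteria quoted in the abstract."""
    return pearson_r(y_true, y_pred) > r_min and mse(y_true, y_pred) < mse_max


# Synthetic, scaled values for illustration only (not data from the study).
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.41]
print(passes_threshold(measured, predicted))
```

In a full workflow, each of the 16 algorithms would be fitted on the training portion and scored on the held-out portion with checks of this kind.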
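The non-dominated sorting step at the core of NSGA-II can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It assumes that both objectives (predicted YS and UTS) are to be maximized, which the abstract does not state explicitly. The function names and the sample values are hypothetical. A complete NSGA-II would add crowding-distance ranking, elitist selection, crossover, and mutation, none of which are shown here.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (both objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated_fronts(points):
    """Group point indices into fronts: front 0 is the Pareto set,
    front 1 is the Pareto set once front 0 is removed, and so on."""
    n = len(points)
    dominated = [[] for _ in range(n)]  # indices that point i dominates
    n_dominators = [0] * n              # how many points dominate point i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated[i].append(j)
            elif dominates(points[j], points[i]):
                n_dominators[i] += 1

    fronts = []
    current = [i for i in range(n) if n_dominators[i] == 0]
    while current:
        fronts.append(current)
        next_front = []
        for i in current:
            for j in dominated[i]:
                n_dominators[j] -= 1
                if n_dominators[j] == 0:
                    next_front.append(j)
        current = next_front
    return fronts


# Hypothetical (YS, UTS) predictions for five candidate alloys.
candidates = [(500, 600), (450, 650), (400, 550), (520, 580), (300, 300)]
fronts = non_dominated_fronts(candidates)
print(fronts)
```

In an inverse-design loop of this kind, each candidate would be a 16-dimensional descriptor vector, and its (YS, UTS) pair would come from one of the trained ML models.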