Khan Naseer Muhammad, Ma Liqiang, Inqiad Waleed Bin, Khan Muhammad Saud, Iqbal Imtiaz, Emad Muhammad Zaka, Alarifi Saad S
Xinjiang Key Laboratory of Coal-bearing Resources Exploration and Exploitation, Xinjiang Institute of Engineering, Urumqi, 830023, China.
Key Laboratory of Xinjiang Coal Resources Green Mining (Xinjiang Institute of Engineering), Ministry of Education, Urumqi, 830023, China.
Sci Rep. 2025 Jun 3;15(1):19414. doi: 10.1038/s41598-025-01327-1.
Using naturally available materials such as metakaolin (MK) can greatly reduce the use of emission-intensive materials like cement in the construction sector. This would ease the pressure on depleting natural resources and foster a sustainable construction industry. However, laboratory determination of the 28-day compressive strength (CS) of MK-based mortar is time- and resource-intensive. This study therefore developed reliable empirical prediction models that estimate the CS of MK-based mortar from its mixture proportions using machine learning algorithms, including gene expression programming (GEP), extreme gradient boosting (XGB), multi expression programming (MEP), bagging regressor (BR), and AdaBoost. A comprehensive dataset compiled from published literature, with five input parameters including water-to-binder ratio, mortar age, and maximum aggregate diameter, was used for this purpose. The models were validated using error metrics, residual assessment, and external validation checks, which showed that XGB was the most accurate algorithm with a testing [Formula: see text] of 0.998, followed by BR with a [Formula: see text] of 0.946, while MEP had the lowest testing [Formula: see text] of 0.893. Unlike the black-box algorithms, however, MEP and GEP express their output as empirical equations. In addition, interpretable machine learning techniques, namely SHapley Additive exPlanations (SHAP), individual conditional expectation (ICE), and partial dependence plots (PDP), were applied to the XGB model and highlighted water-to-binder ratio and sample age as among the most significant variables for predicting the CS of MK-based cement mortars. Finally, a graphical user interface (GUI) was built to support application of the study's findings in the civil engineering industry.