Department of Environmental Sciences, Shahid Rajaee Teacher Training University, Lavizan, 1678815811, Tehran, Iran.
International Network for Environment and Health (INEH), School of Geography, Archaeology and Irish Studies, University of Galway, Galway, Ireland.
Environ Geochem Health. 2024 Feb 17;46(3):80. doi: 10.1007/s10653-023-01845-9.
Combining the predictions of base models into a meta-model is an ensemble approach known as stacking. In this study, a stack of five base learners (eXtreme gradient boosting, random forest, feed-forward neural networks, generalized linear models with Lasso or Elastic Net regularization, and support vector machines) was used to study the spatial variation of Mn, Cd, Pb, and nitrate in the Qom-Kahak aquifers, Iran. Compared with the individual learners, the stacking strategy proved to be an effective alternative predictor owing to its high accuracy and stability. In contrast, no single base model performed best for all of the parameters studied. For instance, in the case of cadmium, random forest was the best-performing base learner, with an adjusted R² and RMSE of 0.108 and 0.014, as opposed to 0.337 and 0.013 obtained by the stacking method. Redundancy analysis (RDA) showed that Mn and Cd were closely linked with phosphate, which suggests an influence of phosphate fertilizers applied in agricultural operations. To analyze the causes of groundwater pollution, spatial methods can be combined with multivariate analytic techniques such as RDA to help uncover hidden sources of contamination that would otherwise go undetected. According to the probabilistic health risk assessment, lead poses a larger health risk than nitrate: 34.4% and 6.3% of the simulated values for children and adults, respectively, exceeded HQ = 1. Furthermore, cadmium exposure risk affected 84% of children and 47% of adults in the study area.
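As an illustration of the stacking idea described in the abstract, the following is a minimal sketch using scikit-learn, not the authors' implementation. Several choices here are assumptions made only for the example: the data are synthetic (random predictors and a made-up response, not the Qom-Kahak measurements), `GradientBoostingRegressor` stands in for eXtreme gradient boosting so that the block depends on scikit-learn alone, the meta-model is a plain linear regression, and all hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 200 samples, 4 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Five base learners mirroring the families named in the abstract.
# Scale-sensitive models are wrapped in a pipeline with a scaler.
base_learners = [
    # Stand-in for eXtreme gradient boosting (assumption).
    ("gbm", GradientBoostingRegressor(n_estimators=100, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("mlp", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )),
    ("enet", make_pipeline(StandardScaler(), ElasticNet(alpha=0.01, l1_ratio=0.5))),
    ("svm", make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))),
]

# The meta-model is trained on out-of-fold predictions of the base learners.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

pred = stack.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"stacked RMSE on held-out data: {rmse:.3f}")
```

The fitted `stack.final_estimator_.coef_` holds one weight per base learner, which gives a rough view of how much the meta-model relies on each of them.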