Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning.

Affiliations

School of Hydraulic Engineering, Dalian University of Technology, Dalian, 116024, China.

British Geological Survey, Keyworth, Nottingham, NG12 5GG, UK.

Publication information

Environ Geochem Health. 2024 Oct 29;46(11):482. doi: 10.1007/s10653-024-02201-1.

Abstract

Groundwater nitrate contamination poses a potential threat to human health and environmental safety globally. This study proposes an interpretable stacking ensemble learning (SEL) framework that improves and explains spatial predictions of groundwater nitrate by integrating a two-level heterogeneous SEL model with SHapley Additive exPlanations (SHAP). In the SEL model, five commonly used machine learning models serve as base models (gradient boosting decision tree, extreme gradient boosting, random forest, extremely randomized trees, and k-nearest neighbor), and their outputs are used as input data for the meta-model. When applied to the Eden Valley, an agriculturally intensive area in the UK, the SEL model outperformed the individual models in predictive performance and generalization ability. The model indicates a mean groundwater nitrate level of 2.22 mg/L-N, with 2.46% of sandstone aquifers exceeding the drinking water standard of 11.3 mg/L-N. Alarmingly, 8.74% of areas with high groundwater nitrate remain outside the designated nitrate vulnerable zones. Moreover, SHAP identified transmissivity, baseflow index, hydraulic conductivity, the percentage of arable land, and the soil C:N ratio as the top five driving factors of groundwater nitrate. With nitrate threatening groundwater globally, this study presents a high-accuracy, interpretable, and flexible modeling framework that enhances our understanding of the mechanisms behind groundwater nitrate contamination. The results suggest that the interpretable SEL framework has great promise for providing valuable evidence for environmental management, water resource protection, and sustainable development, particularly in data-scarce areas.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e02/11522174/7176b8ab4f61/10653_2024_2201_Fig1_HTML.jpg
