Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Campus Monterrey, Eugenio Garza Sada 2501, Monterrey, NL 64849, Mexico.
Sci Total Environ. 2023 Dec 20;905:166863. doi: 10.1016/j.scitotenv.2023.166863. Epub 2023 Sep 9.
Nitrate contamination in groundwater poses a significant threat to water quality and public health, especially in regions with limited data availability. This study addresses this challenge by employing machine learning (ML) techniques to predict nitrate (NO3-N) concentrations in Mexico's groundwater. Four ML algorithms, Extreme Gradient Boosting (XGB), Boosted Regression Trees (BRT), Random Forest (RF), and Support Vector Machines (SVM), were applied to model NO3-N concentrations across the country. Despite data limitations, the ML models achieved robust predictive performance. The XGB and BRT algorithms demonstrated superior accuracy (0.80 and 0.78, respectively). Notably, this was achieved using ∼10 times less information than previous large-scale assessments. The novelty lies in the first-ever implementation of the 'Support Points-based Split Approach' during data pre-processing. The models initially considered 68 covariates and identified 13-19 significant predictors of NO3-N concentration spanning climate, geomorphology, soil, hydrogeology, and human factors. Rainfall, elevation, and slope emerged as key predictors. A validation incorporating nationwide waste disposal sites yielded an encouraging correlation. Spatial risk mapping revealed significant pollution hotspots across Mexico. Regions with elevated NO3-N concentrations (>10 mg/L) were identified, particularly in the north-central and northeast parts of the country, associated with agricultural and industrial activities. Approximately 21 million people, accounting for 10% of Mexico's population, are potentially exposed to elevated NO3-N levels in groundwater. Moreover, the NO3-N hotspots align with reported nitrate-related health implications such as gastric and colorectal cancer. This study not only demonstrates the potential of ML in data-scarce regions but also offers actionable insights for policy and management strategies.
Our research underscores the urgency of implementing sustainable agricultural practices and comprehensive domestic waste management measures to mitigate NO-N contamination. Moreover, it advocates for the establishment of effective policies based on real-time monitoring and collaboration among stakeholders.
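The exposure figure in the abstract rests on a simple rule: people living where predicted nitrate exceeds 10 mg/L count as potentially exposed. A minimal sketch of that arithmetic follows, assuming the risk map can be treated as a list of cells that each carry a predicted concentration and a population count. The `Cell` type, the function names, and all numbers are hypothetical illustrations, not the study's data or method.

```python
from dataclasses import dataclass

# Threshold from the abstract: concentrations above 10 mg/L count as elevated.
THRESHOLD_MG_L = 10.0


@dataclass
class Cell:
    """One hypothetical map cell: predicted nitrate level and resident population."""
    predicted_mg_l: float
    population: int


def exposed_population(cells, threshold=THRESHOLD_MG_L):
    """Sum the population of cells whose prediction strictly exceeds the threshold."""
    return sum(c.population for c in cells if c.predicted_mg_l > threshold)


def exposed_share(cells, threshold=THRESHOLD_MG_L):
    """Fraction of the total population living in cells above the threshold."""
    total = sum(c.population for c in cells)
    if total == 0:
        return 0.0  # avoid dividing by zero on an empty or unpopulated map
    return exposed_population(cells, threshold) / total


# Made-up values for illustration only; not data from the study.
cells = [Cell(2.5, 1000), Cell(12.0, 300), Cell(10.0, 500), Cell(18.4, 200)]
print(exposed_population(cells))  # 500
print(exposed_share(cells))       # 0.25
```

The comparison is strict, so a cell predicted at exactly 10 mg/L is not counted, which matches the ">10 mg/L" wording of the abstract.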