U.S. Geological Survey, California Water Science Center Sacramento, Sacramento, CA, United States.
U.S. Geological Survey Headquarters, Reston, VA, United States.
Sci Total Environ. 2022 Feb 10;807(Pt 3):151065. doi: 10.1016/j.scitotenv.2021.151065. Epub 2021 Oct 18.
Groundwater is an important source of drinking water in the conterminous United States (CONUS), and the presence of high nitrate concentrations may limit the usability of groundwater in some areas because of potential negative health effects. Predicting the locations of high-nitrate groundwater is needed to focus mitigation and relief efforts. A three-dimensional extreme gradient boosting (XGB) machine learning model was developed to predict the distribution of nitrate. Nitrate was predicted at a 1 km resolution for two drinking water zones, each of variable depth, one for domestic supply and one for public supply. The model used measured nitrate concentrations from 12,082 wells and included predictor variables representing well characteristics, hydrologic conditions, soil type, geology, land use, climate, and nitrogen inputs. Predictor variables derived from empirical or numerical process-based models were also included to integrate information on controlling processes and conditions. The model provided accurate estimates at national and regional scales: the training (R² of 0.83) and hold-out (R² of 0.49) data fits compared favorably to previous studies. Predicted nitrate concentrations were less than 1 mg/L across most of the CONUS. Nationally, well depth, soil and climate characteristics, and the absence of developed land use were among the most influential explanatory factors. Only 1% of the area in either water supply zone had predicted nitrate concentrations greater than 10 mg/L; however, about 1.4 million people depend on groundwater for their drinking supplies in those areas. Predicted high concentrations of nitrate were most prevalent in the central CONUS. In areas of predicted high nitrate concentration, applied manure, farm fertilizer, and agricultural land use were influential predictor variables.
This work represents the first application of XGB to a three-dimensional national-scale groundwater quality model and marks a significant milestone in efforts to document nitrate in groundwater across the CONUS.
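As a rough illustration of the two kinds of summary numbers quoted in the abstract, the sketch below computes a coefficient of determination and the share of predictions above a concentration threshold. It assumes the reported training and hold-out fit statistics are coefficients of determination computed as 1 - SS_res/SS_tot; the abstract does not state the formula or any data transformation, so this is not necessarily the authors' procedure. The function names `r_squared` and `fraction_above` and all sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def fraction_above(values, threshold=10.0):
    """Share of values strictly above a threshold (here 10 mg/L)."""
    return sum(v > threshold for v in values) / len(values)


# Hypothetical toy concentrations in mg/L, invented for illustration only.
observed = [0.5, 1.0, 2.0, 12.0]
predicted = [0.6, 0.8, 2.5, 11.0]

fit = r_squared(observed, predicted)
share_high = fraction_above(predicted)
print(f"R^2 = {fit:.3f}, share above 10 mg/L = {share_high:.2f}")
```

Applying the same function separately to wells used for fitting and to wells withheld from fitting would give a training and a hold-out value; a gap between the two indicates how much of the fit generalizes to unseen wells.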