U.S. Geological Survey, Nashville, TN, USA.
U.S. Geological Survey, Carlisle, MA, USA.
Ground Water. 2022 May;60(3):362-376. doi: 10.1111/gwat.13164. Epub 2022 Jan 7.
Manganese (Mn) concentrations and the probability that arsenic (As) exceeds the drinking-water standard of 10 μg/L were predicted in the Mississippi River Valley alluvial aquifer (MRVA) using boosted regression trees (BRT). The BRT models, a type of ensemble-tree machine-learning model, were built from predictor variables that affect Mn and As distribution in groundwater. These variables included iron (Fe) concentrations and specific conductance predicted from previously developed BRT models, groundwater flux and age estimates from MODFLOW, and hydrologic characteristics. The models also included results from the first airborne geophysical survey conducted in the United States to target an entire aquifer system. High Mn and As were predicted where Fe was high. Predicted high Mn concentrations were correlated with the fraction of young groundwater (younger than 65 years) computed from MODFLOW results. High probabilities of As exceedance were predicted where groundwater was relatively old and airborne electromagnetic resistivity was high, typically proximal to streams. Two-variable partial-dependence plots and sensitivity analysis were used to provide insight into the factors controlling Mn and As distribution in groundwater. The maps of predicted Mn concentrations and As exceedance probabilities can be used to identify areas where these constituents may be high and that could be targeted for further study. This paper shows that incorporating a selected set of process-informed data, such as MODFLOW results and airborne geophysics, into a machine-learning model improves model interpretability. Incorporating process-rich information into machine-learning models will likely be useful for addressing a wide range of problems of interest to groundwater hydrologists.
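To make the two central ideas of the abstract concrete (a boosted-tree model that outputs an exceedance probability, and partial dependence used to read that model), the following is a minimal sketch in plain Python. It is not the authors' model or software. Everything in it is an assumption made for illustration: the data are synthetic, the two predictors (`fe` and `young`, standing in for an Fe indicator and a young-groundwater fraction, both scaled 0 to 1) and the rule that generates the labels are invented, and the function names `fit_stump`, `fit_boosted_stumps`, `predict_proba`, and `partial_dependence` are hypothetical. The trees are depth-1 stumps and each leaf takes the mean residual, a simplification of the usual boosted-tree fitting procedure.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)  # candidate split midway between neighbouring values
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]


def fit_boosted_stumps(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting on log loss: each round fits a stump to y - p."""
    base_rate = sum(y) / len(y)
    intercept = math.log(base_rate / (1.0 - base_rate))
    scores = [intercept] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, ml, mr = fit_stump(X, residuals)
        stumps.append((j, t, ml, mr))
        scores = [s + learning_rate * (ml if row[j] <= t else mr)
                  for row, s in zip(X, scores)]
    return intercept, learning_rate, stumps


def predict_proba(model, row):
    """Predicted probability that the (synthetic) threshold is exceeded."""
    intercept, learning_rate, stumps = model
    score = intercept + sum(learning_rate * (ml if row[j] <= t else mr)
                            for j, t, ml, mr in stumps)
    return sigmoid(score)


def partial_dependence(model, X, fixed):
    """Mean predicted probability with the features in `fixed` ({index: value}) overridden."""
    total = 0.0
    for row in X:
        modified = list(row)
        for j, v in fixed.items():
            modified[j] = v
        total += predict_proba(model, modified)
    return total / len(X)


# Synthetic data only: the label rule below is invented, not taken from the study.
random.seed(0)
X, y = [], []
for _ in range(150):
    fe = random.random()      # hypothetical Fe indicator, 0 to 1
    young = random.random()   # hypothetical young-groundwater fraction, 0 to 1
    p_true = sigmoid(8.0 * (fe - 0.5) - 4.0 * (young - 0.5))
    X.append([fe, young])
    y.append(1 if random.random() < p_true else 0)

model = fit_boosted_stumps(X, y)
pd_low_fe = partial_dependence(model, X, {0: 0.1})
pd_high_fe = partial_dependence(model, X, {0: 0.9})
pd_two_way = partial_dependence(model, X, {0: 0.9, 1: 0.1})  # two-variable case
print("PD at low Fe: ", round(pd_low_fe, 3))
print("PD at high Fe:", round(pd_high_fe, 3))
print("PD at high Fe and low young fraction:", round(pd_two_way, 3))
```

Passing two entries in `fixed` gives one cell of a two-variable partial-dependence surface; evaluating it over a grid of value pairs yields the kind of plot the abstract refers to. Because depth-1 stumps cannot represent interactions between predictors, this toy model is additive on the log-odds scale, so deeper trees would be needed for a two-variable plot to reveal interaction effects.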