Guo Wenjing, Gao Zhipeng, Guo Huaming, Cao Wengeng
State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Beijing 100083, PR China; MOE Key Laboratory of Groundwater Circulation and Environmental Evolution, School of Water Resources and Environment, China University of Geosciences (Beijing), Beijing 100083, PR China.
Sci Total Environ. 2023 Nov 1;897:165511. doi: 10.1016/j.scitotenv.2023.165511. Epub 2023 Jul 12.
The relative importance of groundwater geochemical parameters and sediment characteristics in predicting groundwater arsenic distributions has rarely been documented. To address this gap, we built a random forest machine-learning model to predict groundwater arsenic distributions in the Hetao Basin, China, using 22 variables covering climate, topographic features, soil properties, sediment characteristics, groundwater geochemistry, and hydraulic gradients for 492 groundwater samples. The model captured the patchy distribution of groundwater arsenic concentrations in the basin with an AUC of 0.84. Fe(II) was the most important variable for predicting groundwater arsenic concentrations, which supports the interpretation that arsenic enrichment in groundwater is caused by the reductive dissolution of Fe(III) oxides. The high relative importance of SO₄²⁻ indicates that sulfate reduction also contributes to groundwater arsenic enrichment in inland basins. By comparison, climate variables, sediment characteristics, and soil properties were of secondary importance in predicting groundwater arsenic concentrations. Two other models, which excluded groundwater geochemical parameters and/or sediment characteristics, predicted much worse than the model that considered all variables. This highlights the importance of groundwater geochemical and sediment variables for improving the precision and accuracy of the predictions. Future studies should explore how to build a high-precision random forest prediction model from a limited number of groundwater and sediment samples.
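The two quantities the abstract relies on, AUC and the relative importance of a variable, can be illustrated with a minimal sketch. It is not the authors' code. It assumes a binary high/low arsenic label, uses a hypothetical stand-in scoring function in place of the trained random forest, and measures importance by permutation (the drop in AUC after shuffling one column), which may differ from the importance measure used in the study. The column names `fe2` and `noise` and all data values are invented for illustration.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_importance(score_fn, rows, labels, column, rng, repeats=20):
    """Mean drop in AUC when the values of one column are shuffled across rows."""
    base = auc(labels, [score_fn(r) for r in rows])
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Copy each row with only the chosen column replaced.
        permuted = [dict(r, **{column: v}) for r, v in zip(rows, shuffled)]
        drops.append(base - auc(labels, [score_fn(r) for r in permuted]))
    return sum(drops) / repeats


# Hypothetical toy data, NOT from the study: label 1 = high arsenic.
rows = [
    {"fe2": 0.1, "noise": 3.0}, {"fe2": 0.2, "noise": 1.0},
    {"fe2": 0.3, "noise": 4.0}, {"fe2": 0.4, "noise": 1.5},
    {"fe2": 0.9, "noise": 2.0}, {"fe2": 1.2, "noise": 3.5},
    {"fe2": 1.5, "noise": 0.5}, {"fe2": 2.0, "noise": 2.5},
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def score_fn(row):
    # Stand-in for a trained model's predicted probability (assumption).
    return row["fe2"]


print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
rng = random.Random(0)
for col in ("fe2", "noise"):
    print(col, permutation_importance(score_fn, rows, labels, col, rng))
```

In this toy setup the stand-in scorer ignores `noise`, so shuffling that column leaves the AUC unchanged, while shuffling `fe2` lowers it. The same comparison, run on a fitted model with and without groups of variables, is the kind of evidence the abstract summarizes.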