Zheng Hang, Liu Yueyi, Wan Wenhua, Zhao Jianshi, Xie Guanti
School of Environment and Civil Engineering, Dongguan University of Technology, Dongguan, 523808, China.
Department of Hydraulic Engineering, Tsinghua University, Beijing, 100084, China.
J Environ Manage. 2023 Apr 1;331:117309. doi: 10.1016/j.jenvman.2023.117309. Epub 2023 Jan 17.
Deep learning methods, which can map highly nonlinear relationships at acceptable computational speed, have been increasingly applied to water quality prediction in recent studies. However, it is argued that their practicality is limited because they lack physical mechanisms to explain the predicted changes in water quality. A knowledge gap exists in rationalizing deep learning results for water quality prediction. To address this gap, an interpretable deep learning framework was established to predict the spatiotemporal variations of water quality parameters over a large spatial region. Meteorological, land-use, and socioeconomic variables were adopted to predict the daily variations of stream water quality parameters across 138 sub-catchments covering a total of over 575,250 km² in southern China. The coefficients of determination for the chemical oxygen demand (COD), total phosphorus (TP), and total nitrogen (TN) predictions were over 0.80, suggesting satisfactory prediction performance. Prediction accuracy could be improved by including land-use and socioeconomic predictors in addition to hydrological variables. The SHapley Additive exPlanations (SHAP) method used in this study was demonstrated to be effective for interpreting the prediction results by identifying the significant variables and the direction of their influence on the variation of each water quality parameter. The air temperature, proportion of forest area, grain production, population density, and proportion of urban area in each sub-catchment, as well as the accumulated rainfall within the previous 3 days, were identified as the most significant variables affecting the variations of dissolved oxygen, COD, ammoniacal nitrogen (NH3-N), TN, TP, and turbidity, respectively, in the stream water of the case area.
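The two quantities the abstract relies on, the coefficient of determination and Shapley-value attributions, can be illustrated with a minimal sketch. This is not the authors' code: the study applies SHAP to a deep learning model, whereas the sketch below enumerates exact Shapley values for a small hypothetical function. The names `toy_model`, `FEATURES`, `x`, and `baseline`, and every coefficient and input value, are assumptions made up for illustration only.

```python
from itertools import combinations
from math import factorial

# Hypothetical predictor names, loosely echoing variables named in the abstract.
FEATURES = ["air_temperature", "forest_fraction", "rain_prev_3_days"]


def toy_model(z):
    """Stand-in for a trained model (invented coefficients, not from the study)."""
    temp, forest, rain = z
    return 14.0 - 0.2 * temp + 2.0 * forest - 0.001 * rain * temp


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical sample and reference point (e.g. long-term average conditions).
x = [30.0, 0.6, 50.0]
baseline = [20.0, 0.3, 10.0]
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(FEATURES, phi):
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name}: {contribution:+.3f} ({direction} the prediction)")
```

The sign of each attribution gives the direction of influence for that sample, and the attributions sum to the difference between the prediction at `x` and the prediction at `baseline`. In practice an approximate explainer would be used for a deep network, since exact enumeration is infeasible for many predictors.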