National Key Laboratory of Water Environmental Simulation and Pollution Control, Guangdong Key Laboratory of Water and Air Pollution Control, South China Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Guangzhou 510530, China.
Sci Total Environ. 2020 Aug 20;731:139099. doi: 10.1016/j.scitotenv.2020.139099. Epub 2020 May 4.
Dissolved oxygen (DO) concentration is an essential index for water environment assessment. Here, we present a modeling approach that estimates DO concentrations by combining input variable selection with a data-driven model. Specifically, the maximal information coefficient (MIC), an input variable selection technique, was used to identify and screen the primary environmental factors driving variation in DO. Support vector regression (SVR), a data-driven model, was then used to construct a robust model for estimating DO concentration. The approach is illustrated through a case study of the Pearl River Basin in China. We show that the MIC technique can effectively screen the major local environmental factors affecting DO concentrations. MIC values tended to stabilize once the sample size exceeded 3000, and EC had the highest score at both stations, with MIC >0.3. Relative to the complete candidate sets, the variable-reduced datasets improved the performance of the SVR model, reducing the root mean square error (RMSE) by 28.65% and increasing the correlation coefficient (R) and the Nash-Sutcliffe efficiency (NSE) by 22.16% and 56.27%, respectively. The MIC-SVR model constructed for the tidal river network outperformed the one constructed for the nontidal river network, with approximately 63.01% lower RMSE, 62.36% higher NSE, and R >0.9. Overall, the proposed technique was able to handle nonlinearity among environmental factors and accurately estimate DO concentrations in tidal river network regions.
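The evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the abstract does not give formulas, so the standard textbook definitions of RMSE, Pearson's R, and NSE are assumed, as is the convention that the reported percentages are changes relative to the baseline model. The function names and the four-sample DO series are hypothetical and are not data from the study.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_o) ** 2 for o in obs)


def percent_change(baseline, new):
    """Signed change of `new` relative to `baseline`, in percent (assumed convention)."""
    return 100.0 * (new - baseline) / abs(baseline)


# Made-up DO values in mg/L, for illustration only (not from the study).
observed = [6.0, 7.0, 8.0, 9.0]
sim_all_inputs = [7.0, 6.0, 9.0, 8.0]      # hypothetical model using every candidate input
sim_screened = [6.5, 6.5, 8.5, 8.5]        # hypothetical model using screened inputs

rmse_change = percent_change(rmse(observed, sim_all_inputs),
                             rmse(observed, sim_screened))
nse_change = percent_change(nse(observed, sim_all_inputs),
                            nse(observed, sim_screened))
print(f"RMSE change: {rmse_change:.2f}%  NSE change: {nse_change:.2f}%")
print(f"R (screened inputs): {pearson_r(observed, sim_screened):.3f}")
```

Under this convention a negative RMSE change and a positive NSE change both indicate improvement, which matches how the abstract reports a reduction in RMSE alongside increases in R and NSE.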