Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi, 110 001, India.
Environ Sci Pollut Res Int. 2014 May;21(9):6001-15. doi: 10.1007/s11356-014-2517-4. Epub 2014 Jan 25.
Groundwater hydrochemistry of an urban industrial region in the Indo-Gangetic plains of north India was investigated. Groundwater samples were collected from both the industrial and non-industrial areas of Kanpur. The hydrochemical data were analyzed using various water quality indices and nonparametric statistical methods. Principal components analysis (PCA) was performed to identify the factors responsible for groundwater contamination. Ensemble learning-based decision treeboost (DTB) models were constructed to develop discriminating and regression functions to differentiate the groundwater hydrochemistry of the three different areas, to identify the responsible factors, and to predict the groundwater quality using selected measured variables. The results indicated a non-normal distribution and wide variability of the water quality variables in all the study areas, suggesting a nonhomogeneous distribution of sources in the region. PCA results showed that contaminants of industrial origin dominate in the region. The DTB classification model identified pH, redox potential, total-Cr, and λ 254 as the variables discriminating the water quality of the three areas, with an average accuracy of 99.51% on the complete data. The regression model predicted groundwater chemical oxygen demand values that correlated highly with the measured values (0.962 in training; 0.918 in test), with low root mean-squared errors of 2.24 and 2.01 in the training and test sets, respectively. The statistical and chemometric approaches used here suggest that groundwater hydrochemistry differs among the three areas and is dominated by different variables. The proposed methods can be used as effective tools in groundwater management.
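The two regression metrics reported above, the correlation between predicted and measured values and the root mean-squared error, can be illustrated with a minimal sketch. It is not the authors' code. The abstract does not say which correlation coefficient was used, so the Pearson coefficient is an assumption here, and the function names `rmse` and `pearson_r` and the sample values are hypothetical, invented purely for illustration.

```python
import math


def rmse(measured, predicted):
    """Root mean-squared error between two equal-length sequences."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(measured, predicted):
    """Pearson correlation coefficient (assumed; the abstract does not name it)."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_p)


# Hypothetical chemical oxygen demand values, not data from the study.
measured_cod = [10.0, 12.0, 14.0, 16.0]
predicted_cod = [11.0, 11.0, 15.0, 15.0]

print(f"RMSE = {rmse(measured_cod, predicted_cod):.3f}")
print(f"r    = {pearson_r(measured_cod, predicted_cod):.3f}")
```

Computing both metrics separately on the training and test subsets, as the abstract reports, gives a quick check on whether a model generalizes beyond the data it was fitted on.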