Department of Computer Science and Engineering, Indian Institute of Information Technology Ranchi, Namkum, Ranchi, 834010, Jharkhand, India.
Environ Monit Assess. 2023 May 5;195(6):641. doi: 10.1007/s10661-023-11231-8.
Groundwater is an essential resource; around 2.5 billion people depend on it for drinking and irrigation. Groundwater arsenic contamination arises from both natural and anthropogenic sources. The World Health Organization (WHO) has proposed a guideline value of 10 μg/L for arsenic concentration in groundwater. Continuous consumption of arsenic-contaminated water poses various carcinogenic and non-carcinogenic health risks. In this paper, we introduce a geospatial machine learning method for classifying arsenic concentration levels as high (1) or low (0) using the physicochemical properties of water, soil type, land use/land cover, digital elevation, and the subsoil sand, silt, clay, and organic content of the region. Groundwater samples were collected from multiple sites along the banks of the river Ganga in Varanasi district, Uttar Pradesh, India. Descriptive statistics and spatial analysis were carried out for all parameters. The study assesses the parameters contributing to the occurrence of arsenic in the study area using Pearson correlation-based feature selection. The performance of six machine learning models, namely Extreme Gradient Boosting (XGBoost), Gradient Boosting Machine (GBM), Decision Tree, Random Forest, Naïve Bayes, and Deep Neural Network (DNN), was compared to validate the parameters responsible for the dissolution of arsenic in groundwater aquifers. Among these models, the DNN outperformed the other classifiers, with an accuracy of 92.30%, a sensitivity of 100%, and a specificity of 75%. Policymakers can use the DNN model to estimate the number of individuals prone to arsenic poisoning and to formulate mitigation strategies based on spatial maps.
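The relationship between the binary labels and the three reported metrics can be illustrated with a minimal Python sketch. It rests on several assumptions not stated in the abstract: that the high/low cut-off is the WHO guideline value of 10 μg/L, that "high" means strictly above that value, and that the function names `label_arsenic` and `binary_metrics` are purely illustrative. The label vectors are hypothetical and are not the study's data; they were chosen only because one such split (9 high and 4 low samples with a single false positive) happens to give the same sensitivity and specificity and an accuracy of about 92.3%.

```python
# Assumed cut-off: the WHO guideline value cited in the abstract (in μg/L).
WHO_GUIDELINE_UG_PER_L = 10.0


def label_arsenic(concentration_ug_per_l, threshold=WHO_GUIDELINE_UG_PER_L):
    """Return 1 (high) if the concentration exceeds the threshold, else 0 (low)."""
    return 1 if concentration_ug_per_l > threshold else 0


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # high predicted as high
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # low predicted as low
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # low predicted as high
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # high predicted as low
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical test split (NOT the study's data): 9 high and 4 low samples,
# with one low sample misclassified as high.
y_true = [1] * 9 + [0] * 4
y_pred = [1] * 9 + [1] + [0] * 3

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this illustration every high-arsenic sample is detected, so sensitivity is 1.0, while one of the four low-arsenic samples is flagged as high, so specificity is 0.75. A classifier tuned this way errs toward false alarms rather than missed contamination, which is usually the safer failure mode for a screening tool.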