Biswas Sourav, Chattopadhyay Aparajita, Schilling Kathrin, Das Ayushi
Department of Population & Development, International Institute for Population Sciences, Mumbai, Maharashtra, 400088, India.
Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, 650 West 168th Street, New York, NY, USA.
J Expo Sci Environ Epidemiol. 2025 Jun 4. doi: 10.1038/s41370-025-00776-0.
One-fourth of Indians are hypertensive, and most of the population relies on groundwater for drinking. However, the role of groundwater physicochemical properties and contamination in hypertension remains understudied.
This study investigates the association of physicochemical groundwater characteristics and contaminants with hypertension risk in India.
This study used data from the fifth round of the National Family Health Survey (NFHS-5, collected 2019-2021), including health, socio-demographic, and food and dietary information (n = 712,666 individuals). Groundwater physicochemical data were derived from the Central Groundwater Board (CGWB, 2019-2021). The groundwater values, taken from raster maps, were assigned to NFHS-5 survey clusters using cluster shapefiles and then merged with individual records via cluster IDs.
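The linkage step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the grid layout, the coordinates, the `ph` field name, and the helper names `sample_raster` and `link_individuals` are all assumptions made for illustration, and a real workflow would read the raster and shapefiles with a geospatial library.

```python
from typing import Dict, List, Optional, Tuple


def sample_raster(grid: List[List[float]], origin_lon: float, origin_lat: float,
                  cell_size: float, lon: float, lat: float) -> Optional[float]:
    """Return the value of the raster cell containing (lon, lat).

    Assumes a north-up grid whose top-left corner is (origin_lon, origin_lat)
    and whose first row is the northernmost one.
    """
    col = int((lon - origin_lon) // cell_size)
    row = int((origin_lat - lat) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None  # point falls outside the raster


def link_individuals(individuals: List[dict], cluster_values: Dict[int, Optional[float]],
                     field: str) -> List[dict]:
    """Attach the cluster-level raster value to each individual record by cluster ID."""
    return [{**person, field: cluster_values.get(person["cluster_id"])}
            for person in individuals]


# Hypothetical 2 x 2 groundwater pH raster (values are made up).
ph_grid = [[7.2, 8.7],
           [7.9, 8.1]]

# Hypothetical survey cluster locations: cluster ID -> (lon, lat).
clusters: Dict[int, Tuple[float, float]] = {
    101: (70.5, 29.5),
    102: (71.5, 29.5),
    103: (70.2, 28.3),
}

cluster_ph = {cid: sample_raster(ph_grid, 70.0, 30.0, 1.0, lon, lat)
              for cid, (lon, lat) in clusters.items()}

individuals = [
    {"person_id": 1, "cluster_id": 101},
    {"person_id": 2, "cluster_id": 102},
    {"person_id": 3, "cluster_id": 103},
]

linked = link_individuals(individuals, cluster_ph, "ph")
```

Every individual in a cluster receives the same groundwater value, so the exposure is measured at the cluster level even though the regression is run at the individual level.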
Bivariate and multivariable regressions were used to identify factors associated with hypertension at the individual level. Moran's I statistics, Local Indicator of Spatial Association (LISA) cluster maps, and the Spatial Error Model (SEM) were used at district levels to investigate the spatial association. Machine learning models, including Artificial Neural Networks (ANN), Random Forest and Extreme Gradient Boosting (XGBoost), were used to predict hypertension risk zones.
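As a rough illustration of the spatial autocorrelation statistic mentioned above, global Moran's I for values x_i with spatial weights w_ij is I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The sketch below assumes binary contiguity weights on four hypothetical districts arranged in a line; the weighting scheme and values are illustrative and are not taken from the study.

```python
from typing import List


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Compute global Moran's I for the given values and spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    total_weight = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if total_weight == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / total_weight) * (num / denom)


# Four hypothetical districts in a row; neighbours share a border (weight 1).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

gradient = morans_i([1.0, 2.0, 3.0, 4.0], w)     # similar values sit together
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # neighbours are dissimilar
```

Positive values indicate that neighbouring districts tend to have similar prevalence, and negative values indicate that neighbours tend to differ. LISA maps decompose the same cross-product into one local term per district.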
Physicochemical drinking water composition is a key factor in hypertension risk. Elevated groundwater pH (>8.5, Adjusted Odds Ratio (AOR): 2.12), electrical conductivity (>300 μS/cm, AOR: 1.06), sulphate (>200 mg/L, AOR: 1.16), arsenic (>0.01 mg/L, AOR: 1.09), nitrate (>45 mg/L, AOR: 1.07), and magnesium (>30 mg/L, AOR: 1.03) are associated with higher odds of hypertension. The Random Forest model demonstrated the highest predictive performance, with a coefficient of determination (R²) of 0.9970, mean absolute error (MAE) of 0.0012, and mean squared error (MSE) of 0.0077. It effectively identified high-risk zones in the northwestern (Delhi, Punjab, Haryana, and Rajasthan) and eastern (West Bengal and Bihar) regions of India.
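For readers unfamiliar with the three performance metrics reported above, the following sketch shows how R², MAE, and MSE are conventionally computed from observed and predicted values. The numbers are toy values chosen for easy arithmetic and are not the study's data; the function name `regression_metrics` is an assumption for illustration.

```python
from typing import Dict, List


def regression_metrics(y_true: List[float], y_pred: List[float]) -> Dict[str, float]:
    """Return R², MAE, and MSE for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                              # coefficient of determination

    return {"r2": r2, "mae": mae, "mse": mse}


# Toy example: three exact predictions and one that is off by 1.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

MAE is in the units of the outcome while MSE is in squared units, so the two are not directly comparable in magnitude. R² describes the share of outcome variance that the predictions reproduce.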
This study highlights how important groundwater quality is in determining the incidence of hypertension, pointing to groundwater physicochemical properties and contaminants such as electrical conductivity, sulphate, arsenic, nitrate, and magnesium as essential factors. Our research is the first of its kind to comprehensively map hypertension risk zones using machine learning models and geospatial analysis. The findings highlight that water quality is a modifiable risk factor, reinforcing the need for improved drinking water supply systems, regular water quality testing, and targeted interventions in high-risk regions. This study emphasizes the importance of intersectoral collaborations to enhance public health outcomes.