School of Environmental Studies, Queen's University, 99 University Ave, Kingston, ON, Canada.
Technological University Dublin, Park House, 191 N Circular Rd, Dublin, Ireland.
Sci Total Environ. 2024 Oct 15;947:174408. doi: 10.1016/j.scitotenv.2024.174408. Epub 2024 Jul 5.
Big data have become increasingly important for policymakers and scientists but have yet to be employed for developing spatially specific groundwater contamination indices or for protecting human and environmental health. The current study sought to develop a series of indices via analyses of three variables: non-E. coli coliform (NEC) concentration, E. coli concentration, and the calculated NEC:E. coli concentration ratio. A large microbial water quality dataset comprising 1,104,094 samples collected from 292,638 Ontarian wells between 2010 and 2021 was used. Getis-Ord Gi* (Gi*), Local Moran's I (LMI), and space-time scanning were employed for index development based on identified cluster recurrence. Gi* and LMI identify hot and cold spots, i.e., spatially proximal subregions with similarly high or low contamination magnitudes. Indices were statistically compared with mapped well density and age-adjusted enteric infection rates (i.e., campylobacteriosis, cryptosporidiosis, giardiasis, verotoxigenic E. coli (VTEC) enteritis) at a subregional (N = 298) resolution for evaluation and final index selection. Findings suggest that index development via Gi* represented the most efficacious approach. Developed Gi* indices exhibited no correlation with well density, implying that indices are not biased by rural population density. Gi* indices exhibited positive correlations with mapped infection rates, and were particularly associated with higher bacterial (Campylobacter, VTEC) infection rates among younger sub-populations (p < 0.05). Conversely, no association was found between developed indices and giardiasis rates, an infection not typically associated with private groundwater contamination. Findings suggest that a notable proportion of bacterial infections are associated with groundwater and that the developed Gi* index represents an appropriate spatiotemporal reflection of long-term groundwater quality. Bacterial infection correlations with the NEC:E. coli ratio index (p < 0.001) differed markedly from correlations with the E. coli index, implying that the ratio may supplement E. coli monitoring as a groundwater assessment metric capable of elucidating contamination mechanisms. This study may serve as a methodological blueprint for the development of big data-based groundwater contamination indices across the globe.
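For readers unfamiliar with the Gi* statistic named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It computes the standard Getis-Ord Gi* z-score for each subregion from a list of contamination values and a spatial weights matrix, then counts how often each subregion is flagged as a hot spot across periods. The abstract does not state the spatial weights, the significance threshold, or how cluster recurrence was scored, so the binary neighbour weights, the 1.96 cutoff, the function names, and the toy values below are all assumptions for illustration only.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values  : list of floats, one contamination value per subregion
    weights : n x n list of lists; weights[i][i] must be non-zero because
              Gi* (unlike Gi) includes the focal location itself
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the values, as in the Gi* definition
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)

    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * x for wij, x in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        # Guard against constant values or degenerate weights
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def hot_spot_recurrence(period_scores, z_crit=1.96):
    """Hypothetical recurrence index: the share of periods in which a
    location's Gi* z-score exceeds z_crit (an assumed threshold)."""
    n_periods = len(period_scores)
    n_locations = len(period_scores[0])
    return [
        sum(1 for scores in period_scores if scores[i] > z_crit) / n_periods
        for i in range(n_locations)
    ]


# Toy example (made-up values): five subregions along a line, each
# neighbouring the adjacent subregions and itself (binary weights).
toy_values = [1.0, 1.0, 1.0, 10.0, 10.0]
toy_weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
toy_scores = getis_ord_gi_star(toy_values, toy_weights)
```

In this toy case the subregions at the high-value end of the line receive positive z-scores and those at the low-value end receive negative ones, which is the hot spot and cold spot distinction the abstract describes. A real analysis would more likely rely on an established spatial statistics package and a weights matrix derived from actual subregion geometry.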