Sosa-Moreno Andrea, Lee Gwenyth O, Wu Zhenke, Fanny S Aya, Trueba Gabriel, Cooper Philip J, Levy Karen, Eisenberg Joseph N S
Department of Epidemiology, University of Michigan, 1415 Washington Heights, Ann Arbor 48109, Michigan, United States.
Rutgers Global Health Institute, 112 Paterson St., New Brunswick 08901, New Jersey, United States.
ACS ES T Water. 2025 Apr 23;5(5):2244-2254. doi: 10.1021/acsestwater.4c01117. eCollection 2025 May 9.
Methods to measure concentrations in water vary in precision, complexity, and cost. Low-precision methods are more affordable, faster, and simpler to implement in low-resource settings but may reduce statistical power. We compared the statistical power of low- and high-precision methods using data from UNICEF's Multiple Indicator Cluster Surveys across 11 low-income regions and from a birth cohort study in Ecuador. Both data sets included continuous concentrations from high-precision methods, which we categorized to emulate the output of low-precision methods. Using logistic regression, we modeled associations between water quality and two dichotomous outcomes: water treatment (treated/untreated) and water storage (stored/not stored). Using a bootstrap-based algorithm to calculate power, we compared the sample sizes needed to reach 80% power for detecting statistically significant differences between these groups. Compared with continuous measures, categorized concentrations required 10-90% larger sample sizes in treatment models and about 10% larger sample sizes in storage models, except in regions with good water quality, where similar or smaller sample sizes were sufficient. Our findings indicate that low-precision methods can reliably infer associations between water practices and water quality but often require larger sample sizes, highlighting a trade-off between cost and statistical power in resource-limited settings.
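The power comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or data: the pilot data are simulated, the single cut point used to emulate a low-precision method is hypothetical, and the helper names (`fit_logit`, `bootstrap_power`, `smallest_n`) are introduced here for illustration only. The sketch resamples a pilot data set at a candidate sample size, fits a one-predictor logistic regression to each resample, and reports the share of resamples with a significant Wald test as the estimated power.

```python
import math
import random

def fit_logit(x, y, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return (b1, se) or None."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            eta = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:  # no variation in x, or (near) perfect separation
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-8:
            break
    return b1, math.sqrt(h00 / det)  # slope and its Wald standard error

def wald_p(x, y):
    """Two-sided Wald p-value for the slope; 1.0 if the fit is degenerate."""
    fit = fit_logit(x, y)
    if fit is None:
        return 1.0
    b1, se = fit
    return math.erfc(abs(b1 / se) / math.sqrt(2.0))

def bootstrap_power(x, y, n, reps=200, alpha=0.05, seed=0):
    """Share of size-n bootstrap resamples in which the slope is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        idx = [rng.randrange(len(x)) for _ in range(n)]
        if wald_p([x[i] for i in idx], [y[i] for i in idx]) < alpha:
            hits += 1
    return hits / reps

def smallest_n(x, y, grid, target=0.8):
    """First sample size in grid whose estimated power reaches the target."""
    for n in grid:
        if bootstrap_power(x, y, n) >= target:
            return n
    return None

def simulate_pilot(n=500, seed=1):
    """Simulated pilot data (NOT study data): log10 concentration and practice."""
    rng = random.Random(seed)
    treated = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    log_conc = [rng.gauss(1.5 - 1.0 * t, 1.0) for t in treated]
    return log_conc, treated

def dichotomize(log_conc, cut):
    """Emulate a low-precision method with one hypothetical cut point."""
    return [1.0 if v > cut else 0.0 for v in log_conc]

if __name__ == "__main__":
    x, y = simulate_pilot()
    grid = range(10, 201, 10)
    print("continuous n for 80% power: ", smallest_n(x, y, grid))
    print("categorized n for 80% power:", smallest_n(dichotomize(x, 1.0), y, grid))
```

Running the script prints the smallest sample size on the grid that reaches 80% estimated power for the continuous and the dichotomized predictor. The gap between the two numbers depends entirely on the simulated effect size and the chosen cut point, so it should not be read as a reproduction of the 10-90% range reported above.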