Department of Sociology, University of Washington, Seattle, WA, USA.
Department of Statistics, University of Washington, Seattle, WA, USA.
Stat Methods Med Res. 2021 May;30(5):1187-1210. doi: 10.1177/0962280220988742. Epub 2021 Feb 1.
There is an increasing focus on reducing inequalities in health outcomes in developing countries. Subnational variation is of particular interest, with geographically indexed data being used to understand the spatial risk of detrimental outcomes and to identify who is at greatest risk. While some health surveys provide observations with associated geographic coordinates (point data), many others mask the locations and report only the strata (polygon information) within which the data reside (masked data). How to harmonize these data sources for spatial analysis has been considered previously, although only ad hoc methods have been proposed and a comparison of methods is lacking. In this paper, we present a new method for analyzing masked survey data that is consistent with the data-generating process. In addition, we critique two previously proposed approaches to analyzing masked data and illustrate that they are fundamentally flawed methodologically. To validate our method, we compare our approach with previously formulated solutions in several realistic simulation environments in which the underlying structure of the risk field is known. We simulate samples from spatiotemporal fields in a way that mimics the sampling frame implemented in the most common health surveys in low- and middle-income countries, the Demographic and Health Surveys and Multiple Indicator Cluster Surveys. In simulations, the newly proposed approach outperforms previously proposed approaches in terms of minimizing error while increasing the precision of estimates. The approaches are subsequently compared using child mortality data from the Dominican Republic, where our findings are reinforced.
The ability to accurately increase the precision of child mortality estimates, and of health outcome estimates in general, by leveraging various types of data improves our ability to implement precision public health initiatives and to better understand the landscape of geographic health inequalities.