Department of Geography, Planning and Recreation, Northern Arizona University, PO Box 15015, Flagstaff, AZ, 86011, USA.
Department of Geography, Penn State University, 302 Walker Building, University Park, PA, 16801, USA.
Environ Health. 2021 May 4;20(1):51. doi: 10.1186/s12940-021-00734-x.
The growth of geolocated data has opened the door to a wealth of new research opportunities in the health fields. One avenue of particular interest is the relationship between the spaces where people spend time and their health outcomes. This research model typically intersects individual data collected on a specific cohort with publicly available socioeconomic or environmental aggregate data. In spatial terms, individuals are represented as points on a map at a particular time, and context is represented as polygons containing aggregated or modeled data from sampled observations. Uncertainty abounds in these kinds of complex representations.
We present four sensitivity analysis approaches that interrogate the stability of spatial and temporal relationships between point and polygon data. Positional accuracy assesses the significance of assigning the point to the correct polygon. Neighborhood size investigates how the size of the context assumed to be relevant impacts observed results. Life course considers the impact of variation in contextual effects over time. Time of day recognizes that most people occupy different spaces throughout the day, and that exposure is not simply a function of residential location. We use eight years of point data from a longitudinal study of children living in rural Pennsylvania and North Carolina, together with eight years of air pollution and population data presented in 0.5-mile (0.805 km) grid cells. We first identify the challenges facing research that attempts to match individual outcomes to contextual effects, then present methods for estimating the effect this uncertainty could introduce into an analysis, and finally contextualize these measures as part of a larger framework for uncertainty analysis.
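To make the first two approaches concrete, the following is a minimal sketch and not the authors' implementation, which the abstract does not describe. It assumes planar coordinates in kilometres, a square grid with the 0.805 km cell size mentioned above, and Gaussian positional error. The function names (`cell_index`, `assignment_stability`, `neighborhood_mean`) and the error model are hypothetical choices made for illustration only.

```python
import numpy as np

CELL_KM = 0.805  # grid cell edge length from the text (0.5 mile)


def cell_index(x_km, y_km, cell=CELL_KM):
    """Return (col, row) of the square grid cell containing each point."""
    col = np.floor(np.asarray(x_km) / cell).astype(int)
    row = np.floor(np.asarray(y_km) / cell).astype(int)
    return col, row


def assignment_stability(x_km, y_km, sigma_km, n_draws=1000, seed=0):
    """Positional accuracy (illustrative): share of randomly perturbed
    locations that fall in the same cell as the recorded point.

    Assumes isotropic Gaussian error with standard deviation sigma_km,
    which is an assumption of this sketch, not a claim from the study.
    """
    rng = np.random.default_rng(seed)
    col0, row0 = cell_index(x_km, y_km)
    dx = rng.normal(0.0, sigma_km, n_draws)
    dy = rng.normal(0.0, sigma_km, n_draws)
    col, row = cell_index(x_km + dx, y_km + dy)
    return float(np.mean((col == col0) & (row == row0)))


def neighborhood_mean(grid, row, col, radius_cells):
    """Neighborhood size (illustrative): mean of the gridded variable in a
    square window of (2 * radius_cells + 1) cells, clipped at the grid edge."""
    r0, r1 = max(row - radius_cells, 0), min(row + radius_cells + 1, grid.shape[0])
    c0, c1 = max(col - radius_cells, 0), min(col + radius_cells + 1, grid.shape[1])
    return float(grid[r0:r1, c0:c1].mean())


if __name__ == "__main__":
    # A point at a cell centre versus one close to a cell corner.
    centre = CELL_KM / 2
    print(assignment_stability(centre, centre, sigma_km=0.05))
    print(assignment_stability(0.01, 0.01, sigma_km=0.05))

    # How the contextual value changes as the assumed neighborhood grows.
    demo_grid = np.arange(25, dtype=float).reshape(5, 5)
    for radius in (0, 1, 2):
        print(radius, neighborhood_mean(demo_grid, 0, 0, radius))
```

Under these assumptions, a point near a cell boundary receives a lower stability score than one at a cell centre, and comparing `neighborhood_mean` across radii indicates how sensitive the assigned context is to the neighborhood definition. Computing both per individual is in keeping with the point that uncertainty varies across observations.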
Spatial and temporal uncertainty is highly variable across the children within our cohort and the population in general. For our test datasets, we find greater uncertainty over the life course than in positional accuracy and neighborhood size. Time of day uncertainty is relatively low for these children.
Spatial and temporal uncertainty should be considered for each individual in a study since the magnitude can vary considerably across observations. The underlying assumptions driving the source data play an important role in the level of measured uncertainty.