Zandbergen Paul A, Chakraborty Jayajit
Department of Geography, University of South Florida, 4202 E. Fowler Ave, NES107, Tampa, FL 33620, USA.
Int J Health Geogr. 2006 May 25;5:23. doi: 10.1186/1476-072X-5-23.
Assessments of environmental exposure and health risks that utilize Geographic Information Systems (GIS) often make two simplifying assumptions: (a) they use one or more discrete buffer distances to define the spatial extent of impacted regions, and (b) they use demographic data aggregated at the level of census enumeration units to derive the characteristics of the potentially exposed population. A case study of school children in Orange County, Florida, is used to demonstrate how these limitations can be overcome by applying cumulative distribution functions (CDFs) and individual geocoded locations. Exposure potential for 159,923 school children was determined at the children's home residences and at school locations by calculating the distance to the nearest gasoline station, stationary air pollution source, and industrial facility listed in the Toxic Release Inventory (TRI). Errors and biases introduced by the use of discrete buffer distances and data aggregation were examined.
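The distance-to-nearest-source measure described above can be sketched as follows. This is a minimal illustration, not the study's GIS workflow: it assumes planar (projected) coordinates in metres and straight-line distance, and the function name `nearest_distance` and all coordinates are hypothetical.

```python
import math

def nearest_distance(point, facilities):
    """Return the straight-line distance from point to the closest facility."""
    px, py = point
    return min(math.hypot(px - fx, py - fy) for fx, fy in facilities)

# Hypothetical projected coordinates in metres (not data from the study).
homes = [(0.0, 0.0), (300.0, 400.0), (1000.0, 0.0)]
stations = [(0.0, 500.0), (1000.0, 100.0)]

# One exposure-potential value per child location.
home_distances = [nearest_distance(h, stations) for h in homes]
print(home_distances)
```

The same function could be applied to school locations and to each source type in turn; for datasets of the size reported here, a spatial index would likely replace the brute-force minimum.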
The use of discrete buffer distances in proximity-based exposure analysis introduced substantial bias in determining the potentially exposed population, and the results were strongly dependent on the choice of buffer distance(s). Comparisons of exposure potential between home and school locations indicated that different buffer distances yield different results and contradictory conclusions. The use of a CDF provided a much more meaningful representation because it is not based on the a priori assumption that any particular distance is more relevant than another. The use of individual geocoded locations also provided a more accurate characterization of the exposed population and allowed for more reliable comparisons among sub-groups. In the comparison of children's home residences and school locations, the use of data aggregated at the census block group and tract level introduced variability as well as bias, leading to incorrect conclusions as to whether exposure potential was higher at school or at home.
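A small sketch can illustrate why a single buffer distance may give contradictory home-versus-school comparisons while the full CDF does not hide them. The distances below are invented for illustration only, and `ecdf` is a hypothetical helper that returns the empirical CDF, i.e. the share of locations within a given distance of the nearest source.

```python
from bisect import bisect_right

def ecdf(distances):
    """Return F, where F(d) is the share of distances less than or equal to d."""
    ordered = sorted(distances)
    n = len(ordered)

    def F(d):
        return bisect_right(ordered, d) / n

    return F

# Hypothetical nearest-source distances in metres (not data from the study).
home = [100, 200, 900, 950, 1200]
school = [400, 450, 480, 1500, 1600]

F_home = ecdf(home)
F_school = ecdf(school)

# Each buffer distance is just one point on the CDF.
for buffer_m in (250, 500, 1000):
    print(buffer_m, F_home(buffer_m), F_school(buffer_m))
```

With these made-up values, a 250 m or 1000 m buffer suggests higher exposure potential at home, while a 500 m buffer suggests the opposite; evaluating the CDF over the whole distance range shows where the two curves cross rather than committing to one cut-off.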
The use of CDFs in distance-based environmental exposure assessment provides more robust results than the use of discrete buffer distances. Unless specific circumstances warrant the use of discrete buffer distances, their application should be discouraged in favor of CDFs. The use of aggregated data at the census tract or block group level introduces substantial bias in environmental exposure assessment, which can be reduced through individual geocoding. The use of aggregation should be minimized when individual-level data are available. Existing GIS analysis techniques are well suited to determining CDFs and to reliably geocoding large datasets, and computational issues do not present a barrier to their more widespread use in environmental exposure and risk assessment.
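The aggregation effect can be sketched in the same spirit. The example assumes one common simplification, namely that every child in a census unit is assigned the location of a single representative point (here the mean of the home coordinates); the abstract does not state that the study aggregated in exactly this way, and all coordinates and names are hypothetical.

```python
import math

def dist(a, b):
    """Straight-line distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical facility and homes within one census unit, in metres.
facility = (0.0, 0.0)
homes = [(100.0, 0.0), (200.0, 0.0), (900.0, 0.0), (1200.0, 0.0)]
buffer_m = 500.0

# Individual-level estimate: share of homes inside the buffer.
individual_share = sum(dist(h, facility) <= buffer_m for h in homes) / len(homes)

# Aggregated estimate: all children placed at one representative point.
centroid = (
    sum(x for x, _ in homes) / len(homes),
    sum(y for _, y in homes) / len(homes),
)
aggregated_share = 1.0 if dist(centroid, facility) <= buffer_m else 0.0

print(individual_share, aggregated_share)
```

In this toy case half of the homes lie within the buffer, yet the aggregated unit is counted as entirely outside it, which is the kind of bias that individual geocoding avoids.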