Murdie R A, Spitzer W O, Suissa S
Department of Geography, York University, Toronto, Ontario, Canada.
Scand J Work Environ Health. 1988 Jun;14(3):168-74. doi: 10.5271/sjweh.1935.
A procedure for selecting reference areas in epidemiologic research employing census data and squared Euclidean distance is described. The procedure was adapted from cluster analysis, a multivariate statistical technique that has been applied in many disciplines. With the use of 12 census variables as the basis for evaluating sociodemographic differentiation, squared Euclidean distances were calculated between a geographically delineated index area in southwest Alberta, where residents had complained for several years about the effects of exposure to sour gas emissions, and 119 provincial census tracts in the rest of nonmetropolitan southern Alberta. The squared Euclidean distances can be interpreted as social distance scores, with values close to zero representing a high level of sociodemographic similarity between the index area and potential reference areas. The social distance scores, in association with environmental data, suggested a clear choice for the most comparable unexposed reference area and illustrated the difficulty of finding a suitable most comparable exposed reference area. Results from the demographic component of the subsequent health survey indicated that the index area and reference area were similar in most respects. Furthermore, tests with and without statistical adjustment for confounding variables produced negligible differences on most of the important target outcome variables.
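The distance calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the 12 census variables were scaled, so the z-score standardization here is an assumption (a common step before distance-based clustering). The function names, tract labels, and the three-variable data are hypothetical and made up purely for illustration.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so variables on different scales contribute comparably.

    Assumption: the abstract does not state how variables were scaled.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    sds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_euclidean(a, b):
    """Sum of squared differences between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_reference_areas(index_profile, candidates):
    """Return (name, score) pairs sorted from most to least similar.

    A score near zero means the candidate closely resembles the index area.
    """
    names = list(candidates)
    rows = [index_profile] + [candidates[n] for n in names]
    z = standardize(rows)
    index_z, candidates_z = z[0], z[1:]
    scores = [(n, squared_euclidean(index_z, cz)) for n, cz in zip(names, candidates_z)]
    return sorted(scores, key=lambda pair: pair[1])


# Hypothetical data: three made-up census variables per area (the study used 12).
index_area = [35.0, 12.0, 60.0]
tracts = {
    "tract_A": [36.0, 11.0, 58.0],
    "tract_B": [50.0, 30.0, 20.0],
    "tract_C": [34.0, 13.0, 61.0],
}

ranking = rank_reference_areas(index_area, tracts)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the lowest score marks the candidate most similar to the index area. As the abstract notes, such scores were read together with environmental data, so a low score alone would not settle the choice of reference area.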