Earnest Arul, Morgan Geoff, Mengersen Kerrie, Ryan Louise, Summerhayes Richard, Beard John
Northern Rivers University Department of Rural Health, The University of Sydney, New South Wales, Australia.
Int J Health Geogr. 2007 Nov 29;6:54. doi: 10.1186/1476-072X-6-54.
The Conditional Autoregressive (CAR) model is widely used in small-area ecological studies to analyse outcomes measured at an areal level. However, the influence of different neighbourhood weight matrix structures on the amount of smoothing performed by the CAR model has received little evaluation. We examined this issue in detail.
We created several neighbourhood weight matrices and applied them to a large dataset of births and birth defects across 198 Statistical Local Areas in New South Wales (NSW), Australia. Between 1995 and 2003, there were 17,595 geocoded birth defects and 770,638 geocoded birth records with available data. Spatio-temporal models were developed with data from 1995-2000 and their fit was evaluated against the subsequent period, 2001-2003.
We created four adjacency-based weight matrices, seven distance-based weight matrices and one matrix based on similarity in a key covariate (maternal age). In terms of agreement between observed and predicted relative risks, categorised into epidemiologically relevant groups, the distance-based matrices generally performed better than the adjacency-based neighbourhoods. In terms of recovering the underlying risk structure, the weight-7 model (the 'Covariate model', which smooths by maternal age) correctly classified 35/47 high-risk areas (sensitivity 74%) with a specificity of 47%, and the 'Gravity' model had sensitivity and specificity values of 74% and 39% respectively.
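To make the three families of weight matrix concrete, the following is a minimal sketch of one adjacency-based, one distance-based and one covariate-similarity matrix. It is illustrative only: the function names, the inverse-distance form, the cutoff, the similarity formula and the four-area toy data are all assumptions, not the specifications used in the study.

```python
import numpy as np

def adjacency_weights(borders, n):
    """Binary adjacency: w_ij = 1 if areas i and j share a border, else 0."""
    W = np.zeros((n, n))
    for i, j in borders:
        W[i, j] = W[j, i] = 1.0
    return W

def inverse_distance_weights(coords, cutoff):
    """w_ij = 1 / d_ij for centroids within `cutoff` of each other, else 0.
    The inverse-distance form and the cutoff are assumed for illustration."""
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    W = np.zeros_like(d)
    mask = (d > 0) & (d <= cutoff)
    W[mask] = 1.0 / d[mask]
    return W

def covariate_similarity_weights(x):
    """w_ij = 1 / (1 + |x_i - x_j|): areas with similar covariate values
    get larger weights. This formula is a hypothetical choice."""
    gap = np.abs(x[:, None] - x[None, :])
    W = 1.0 / (1.0 + gap)
    np.fill_diagonal(W, 0.0)  # an area is not its own neighbour
    return W

# Toy data (made up): four areas along a line.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
borders = [(0, 1), (1, 2), (2, 3)]
maternal_age = np.array([27.0, 31.0, 28.0, 30.0])  # mean maternal age per area

W_adj = adjacency_weights(borders, n=4)
W_dist = inverse_distance_weights(coords, cutoff=2.5)
W_cov = covariate_similarity_weights(maternal_age)
```

All three matrices are symmetric with a zero diagonal, but they define different neighbours for the same area: under `W_cov`, area 0 is most strongly tied to area 2, which it does not border.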
We found considerable differences in the smoothing properties of the CAR model, depending on the type of neighbours specified. This in turn affected the models' ability to recover the observed risk in an area. Prior to risk mapping or ecological modelling, we recommend an exploratory analysis of candidate neighbourhood weight matrices to guide the choice of a suitable one. Alternatively, the weight matrix can be chosen a priori based on decision-theoretic considerations including loss, cost and inferential aims.
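The dependence of smoothing on the neighbourhood definition can be illustrated with the conditional mean of the standard intrinsic CAR prior, under which the expected random effect of area i given all others is the weighted average of its neighbours' effects, sum_j w_ij b_j / sum_j w_ij. The sketch below assumes that form; the function name, the four-area example and the log relative risks are hypothetical and are not taken from the study.

```python
import numpy as np

def car_conditional_mean(W, b):
    """Conditional mean under an intrinsic CAR prior:
    E[b_i | b_-i] = sum_j w_ij * b_j / sum_j w_ij."""
    row_sums = W.sum(axis=1)
    if np.any(row_sums == 0):
        raise ValueError("every area needs at least one neighbour")
    return (W @ b) / row_sums

# Made-up log relative risks: one elevated area at the end of a line of four.
log_rr = np.array([0.0, 0.0, 0.0, 1.2])

# Neighbours defined by shared borders (areas arranged in a line).
W_adj = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

# Every area treated as an equally weighted neighbour of every other area.
W_all = np.ones((4, 4)) - np.eye(4)

local_mean = car_conditional_mean(W_adj, log_rr)
global_mean = car_conditional_mean(W_all, log_rr)
```

With the adjacency matrix only the area bordering the elevated one is pulled upwards, whereas the fully connected matrix spreads the elevated risk evenly over the other three areas. The same data therefore yield different smoothed surfaces depending on the weight matrix.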