Banerjee Sudipto, Wall Melanie M, Carlin Bradley P
Division of Biostatistics, School of Public Health, University of Minnesota, Mayo Mail Code 303, Minneapolis, Minnesota 55455, USA.
Biostatistics. 2003 Jan;4(1):123-42. doi: 10.1093/biostatistics/4.1.123.
The use of survival models involving a random effect or 'frailty' term is becoming more common. Usually the random effects are assumed to represent different clusters, and clusters are assumed to be independent. In this paper, we consider random effects corresponding to clusters that are spatially arranged, such as clinical sites or geographical regions. That is, we might suspect that random effects corresponding to strata in closer proximity to each other are also similar in magnitude. Such spatial arrangement of the strata can be modeled in several ways, but we group these ways into two general settings: geostatistical approaches, where we use the exact geographic locations (e.g. latitude and longitude) of the strata, and lattice approaches, where we use only the positions of the strata relative to each other (e.g. which counties neighbor which others). We compare our approaches in the context of a dataset on infant mortality in Minnesota counties between 1992 and 1996. Our main substantive goal here is to explain the pattern of infant mortality using important covariates (sex, race, birth weight, age of mother, etc.) while accounting for possible (spatially correlated) differences in hazard among the counties. We use the ArcView GIS to map the resulting fitted hazard rates, to help search for possible lingering spatial correlation. The deviance information criterion (DIC; Spiegelhalter et al., Journal of the Royal Statistical Society, Series B 2002, to appear) is used to choose among various competing models. We investigate the quality of fit of our chosen model, and compare its results when used to investigate neonatal versus post-neonatal mortality. We also compare use of our time-to-event outcome survival model with the simpler dichotomous outcome logistic model. Finally, we summarize our findings and suggest directions for future research.
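The distinction the abstract draws between geostatistical and lattice approaches can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name a covariance function or a lattice prior, so the exponential covariance, the intrinsic conditionally autoregressive (CAR) precision matrix, the function names, the parameters (sigma2, phi, tau), and the four-region toy layout are all assumptions chosen for illustration.

```python
import numpy as np


def geostatistical_covariance(coords, sigma2=1.0, phi=1.0):
    """Covariance of region-level frailties built from exact locations.

    Assumption: an exponential form sigma2 * exp(-phi * distance); the
    abstract only says that exact geographic locations are used.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    return sigma2 * np.exp(-phi * dist)


def car_precision(adjacency, tau=1.0):
    """Precision matrix of region-level frailties built from adjacency only.

    Assumption: an intrinsic CAR structure tau * (D - W), where W is the
    0/1 neighbor matrix and D holds each region's neighbor count.
    """
    W = np.asarray(adjacency, dtype=float)
    D = np.diag(W.sum(axis=1))
    return tau * (D - W)


# Hypothetical layout: four regions in a row, each touching the next one.
coords = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

cov = geostatistical_covariance(coords, sigma2=1.0, phi=0.5)
prec = car_precision(adjacency, tau=2.0)

print("geostatistical covariance:\n", np.round(cov, 3))
print("lattice (CAR) precision:\n", prec)
```

In the geostatistical matrix, correlation between two frailties decays smoothly with the distance between their locations. In the lattice matrix, only neighboring regions are linked directly, and the row sums are zero, which is why an intrinsic CAR prior of this kind is improper and is usually paired with a sum-to-zero constraint on the frailties.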