Banerjee Sudipto, Carlin Bradley P
Division of Biostatistics, School of Public Health, University of Minnesota, MMC 303, 420 Delaware Street SE, Minneapolis, Minnesota 55455, USA.
Biometrics. 2004 Mar;60(1):268-75. doi: 10.1111/j.0006-341X.2004.00032.x.
Several recent papers (e.g., Chen, Ibrahim, and Sinha, 1999, Journal of the American Statistical Association 94, 909-919; Ibrahim, Chen, and Sinha, 2001a, Biometrics 57, 383-388) have described statistical methods for use with time-to-event data featuring a surviving fraction (i.e., a proportion of the population that never experiences the event). Such cure rate models and their multivariate generalizations are quite useful in studies of multiple diseases to which an individual may never succumb, or from which an individual may reasonably be expected to recover following treatment (e.g., various types of cancer). In this article we extend these models to allow for spatial correlation (estimable via zip code identifiers for the subjects) as well as interval censoring. Our approach is Bayesian, where posterior summaries are obtained via a hybrid Markov chain Monte Carlo algorithm. We compare across a broad collection of rather high-dimensional hierarchical models using the deviance information criterion, a tool recently developed for just this purpose. We apply our approach to the analysis of a smoking cessation study where the subjects reside in 53 southeastern Minnesota zip codes. In addition to the usual posterior estimates, our approach yields smoothed zip code level maps of model parameters related to the relapse rates over time and the ultimate proportion of quitters (the cure rates).
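As a rough illustration of the surviving fraction and interval censoring mentioned above, the sketch below uses the bounded cumulative hazard form of the cure rate model from Chen, Ibrahim, and Sinha (1999), in which the population survival function is S(t) = exp(-theta * F(t)) and the cure rate is exp(-theta). The Weibull choice for F, the parameter values, and all function names are assumptions made for illustration. The spatial random effects and the MCMC fitting described in the abstract are not shown.

```python
import math


def weibull_cdf(t, shape, scale):
    """Assumed latent event-time distribution F(t); Weibull is an illustrative choice."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def population_survival(t, theta, shape, scale):
    """Cure rate survival function S(t) = exp(-theta * F(t))."""
    return math.exp(-theta * weibull_cdf(t, shape, scale))


def cure_fraction(theta):
    """Limit of S(t) as t grows: the proportion that never experiences the event."""
    return math.exp(-theta)


def likelihood_contribution(left, right, theta, shape, scale):
    """Likelihood contribution of one subject.

    right is None  -> right-censored at `left`: contributes S(left)
    otherwise      -> event known to fall in (left, right]: S(left) - S(right)
    """
    s_left = population_survival(left, theta, shape, scale)
    if right is None:
        return s_left
    return s_left - population_survival(right, theta, shape, scale)


# Hypothetical parameter values, chosen only to exercise the functions.
theta, shape, scale = 1.0, 1.5, 2.0
print("cure fraction:", cure_fraction(theta))
print("S(1), S(5), S(50):",
      [population_survival(t, theta, shape, scale) for t in (1, 5, 50)])
print("event in (1, 3]:", likelihood_contribution(1.0, 3.0, theta, shape, scale))
print("right-censored at 3:", likelihood_contribution(3.0, None, theta, shape, scale))
```

In this form the survival curve levels off at exp(-theta) rather than at zero, which is what allows a positive proportion of eventual quitters to be estimated.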