Zhang Yufen, Hodges James S, Banerjee Sudipto
Novartis Pharmaceuticals, East Hanover, New Jersey 07936, USA.
Ann Appl Stat. 2009;3(4):1805-1830. doi: 10.1214/09-AOAS267.
Rapid developments in geographical information systems (GIS) continue to generate interest in analyzing complex spatial datasets. One area of activity is in creating smoothed disease maps to describe the geographic variation of disease and generate hypotheses for apparent differences in risk. With multiple diseases, a multivariate conditionally autoregressive (MCAR) model is often used to smooth across space while accounting for associations between the diseases. The MCAR, however, imposes complex covariance structures that are difficult to interpret and estimate. This article develops a much simpler alternative approach building upon the techniques of smoothed ANOVA (SANOVA). Instead of simply shrinking effects without any structure, here we use SANOVA to smooth spatial random effects by taking advantage of the spatial structure. We extend SANOVA to cases in which one factor is a spatial lattice, which is smoothed using a CAR model, and a second factor is, for example, type of cancer. Datasets routinely lack enough information to identify the additional structure of MCAR. SANOVA offers a simpler and more intelligible structure than the MCAR while performing equally well. We demonstrate our approach with simulation studies designed to compare SANOVA with different design matrices versus MCAR with different priors. Subsequently, a cancer-surveillance dataset, describing the incidence of three cancers in Minnesota's 87 counties, is analyzed using both approaches, showing the competitiveness of the SANOVA approach.
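The structural idea in the abstract (a CAR-smoothed spatial factor crossed with a second factor such as cancer type) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function names `car_precision` and `helmert_contrasts`, the 2 x 2 toy lattice, the Helmert-style contrasts (one possible choice of design matrix), and the precision values in `tau`. Under those assumptions, the sketch checks numerically that a design built as a Kronecker product of cancer contrasts and the eigenvectors of the CAR precision matrix, with one smoothing precision per contrast, gives a joint precision of the form Omega ⊗ Q in which the eigenvectors of Omega are fixed in advance.

```python
import numpy as np


def car_precision(adjacency):
    """Intrinsic CAR precision Q = D - W for a symmetric 0/1 adjacency matrix."""
    W = np.asarray(adjacency, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def helmert_contrasts(n):
    """Orthonormal n x n matrix: constant first column, Helmert-style contrasts after."""
    H = np.zeros((n, n))
    H[:, 0] = 1.0 / np.sqrt(n)
    for j in range(1, n):
        H[:j, j] = 1.0
        H[j, j] = -float(j)
        H[:, j] /= np.linalg.norm(H[:, j])
    return H


# Hypothetical 4-region lattice (2 x 2 grid, rook adjacency); not real data.
W = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
Q = car_precision(W)

# Three "cancers": one averaged spatial effect plus two contrasts.
H = helmert_contrasts(3)

# Illustrative smoothing precisions, one per column of H (assumed values).
tau = np.array([2.0, 0.5, 1.5])

# Joint precision with a Kronecker structure: Omega (x) Q, Omega = H diag(tau) H^T.
omega = H @ np.diag(tau) @ H.T
joint_precision = np.kron(omega, Q)

# Same matrix built from a design (H (x) V) with a diagonal precision,
# where V holds the eigenvectors of Q and eigvals its eigenvalues.
eigvals, V = np.linalg.eigh(Q)
design = np.kron(H, V)
diag_precision = np.kron(tau, eigvals)
reconstructed = design @ np.diag(diag_precision) @ design.T

print(np.allclose(joint_precision, reconstructed))
```

In this sketch only the entries of `tau` would need to be estimated, whereas a general MCAR would treat the whole matrix `omega` as unknown; that contrast is the sense in which the fixed-design structure is simpler.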