Ribeiro Sérgio Henrique Rodrigues, Costa Marcelo Azevedo
Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, MG 31270-901, Brazil.
Spat Spatiotemporal Epidemiol. 2012 Jun;3(2):107-20. doi: 10.1016/j.sste.2012.04.004. Epub 2012 Apr 21.
Circular and elliptic spatial scan statistics require the user to choose a maximum cluster size. A common value for this parameter is 50% of the underlying population. In addition to the detected primary cluster, the user may be interested in analyzing significant secondary clusters. It can also be argued that, if the true cluster is irregular, choosing a small maximum cluster size and evaluating significant secondary clusters may improve cluster detection and avoid the need for irregular cluster methods. This work explores the performance of the circular, elliptic and double scan statistics for different values of the maximum cluster size and different options for the analysis of secondary clusters. Empirical results show that, for hot-spot clusters, analyzing statistically significant secondary clusters does not, on average, improve detection of the true unknown cluster. There is evidence that a variable maximum cluster size improves performance. Specifically, the double scan statistic applies an early-stopping procedure that improves positive predictive values.
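To make the role of the maximum cluster size concrete, the following is a minimal sketch of a circular scan, not the authors' implementation. It assumes a Poisson model with Kulldorff's log-likelihood ratio, area centroids as circle centers, and a population-weighted positive predictive value; the function and parameter names (`poisson_llr`, `circular_scan`, `ppv`, `max_frac`) are hypothetical. Monte Carlo significance testing, secondary clusters, and the elliptic and double scan variants are omitted.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a zone with c observed and
    e expected cases, out of `total` cases overall (assumed model)."""
    if e <= 0 or e >= total or c <= e:
        return 0.0  # only zones with an excess of cases are of interest
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the zone.
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside


def circular_scan(coords, cases, pop, max_frac=0.5):
    """Return (zone, llr) for the most likely circular cluster.

    Circles are centered on each area centroid and grown one nearest
    neighbor at a time until adding the next area would exceed
    max_frac of the total population (the maximum cluster size).
    """
    total_cases = sum(cases)
    total_pop = sum(pop)
    best_zone, best_llr = [], 0.0
    for xi, yi in coords:
        # Areas ordered by distance from the current circle center.
        order = sorted(
            range(len(coords)),
            key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi),
        )
        zone, c, p = [], 0, 0
        for j in order:
            if p + pop[j] > max_frac * total_pop:
                break  # maximum cluster size reached for this center
            zone.append(j)
            c += cases[j]
            p += pop[j]
            expected = total_cases * p / total_pop
            llr = poisson_llr(c, expected, total_cases)
            if llr > best_llr:
                best_zone, best_llr = list(zone), llr
    return best_zone, best_llr


def ppv(detected, true_cluster, pop):
    """Population-weighted positive predictive value (assumed definition):
    share of the detected population that lies in the true cluster."""
    detected = set(detected)
    detected_pop = sum(pop[j] for j in detected)
    if detected_pop == 0:
        return 0.0
    overlap = detected & set(true_cluster)
    return sum(pop[j] for j in overlap) / detected_pop
```

Running `circular_scan` on the same data with several values of `max_frac` and comparing `ppv` against a known simulated cluster gives a rough feel for the trade-off the abstract describes: a smaller cap limits how far the detected zone can spread beyond the true cluster, but may also truncate it.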