Kim Sehwi, Jung Inkyung
Department of Biostatistics and Medical Informatics, Yonsei University College of Medicine, Seoul, Korea.
PLoS One. 2017 Jul 28;12(7):e0182234. doi: 10.1371/journal.pone.0182234. eCollection 2017.
The spatial scan statistic is an important tool for spatial cluster detection. Scanning window shapes have been studied extensively, but little research has addressed the maximum scanning window size or the maximum reported cluster size. Recently, Han et al. proposed using the Gini coefficient to optimize the maximum reported cluster size; however, their method was developed and evaluated only for the Poisson model. We adapt the Gini coefficient to the spatial scan statistic for ordinal data in order to determine the optimal maximum reported cluster size, and we evaluate the proposed approach through a simulation study and an application to a real data example. With some sophisticated modification, the Gini coefficient can be employed effectively for the ordinal model. The Gini coefficient most often selected, with very high accuracy, optimal maximum reported cluster sizes that were equal to or smaller than the true cluster sizes. Using the Gini coefficient appears to yield a more refined collection of clusters. A Gini coefficient developed specifically for the ordinal model can be useful for optimizing the maximum reported cluster size for ordinal data and can help in discovering cluster patterns properly and informatively.
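For readers unfamiliar with the quantity the abstract relies on, the following is a minimal sketch of the generic Gini coefficient for a list of non-negative values. It is not the cluster-based version of Han et al., nor the ordinal-model modification proposed here, neither of which the abstract specifies. The function name `gini` and the sample inputs are illustrative assumptions.

```python
def gini(values):
    """Generic Gini coefficient of non-negative values (0 = perfectly even).

    Assumption: this is the textbook sample formula, not the
    cluster-specific or ordinal-model definition used in the paper.
    """
    xs = sorted(values)          # ascending order, as the formula requires
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0               # convention chosen here for degenerate input
    # G = 2 * sum_i(i * x_i) / (n * total) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))    # evenly spread values
    print(gini([0, 0, 0, 1]))    # everything concentrated in one unit
```

In a Gini-based selection of the kind the abstract describes, a coefficient would be computed for each candidate maximum reported cluster size and the candidates compared. How the inputs are formed from the reported clusters is specific to the method and is not reproduced in this sketch.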