Torabi Mahmoud, Rosychuk Rhonda J
Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada.
Int J Health Geogr. 2008 Dec 12;7:61. doi: 10.1186/1476-072X-7-61.
In geographic surveillance of disease, areas with large numbers of disease cases are identified so that the causes of high disease rates can be investigated. Areas with high rates are called disease clusters, and statistical cluster detection tests are used to identify geographic areas with higher disease rates than expected by chance alone. Typically, cluster detection tests are applied to incident or prevalent cases of disease, but surveillance of disease-related events, where an individual may have multiple events, may also be of interest. A compound Poisson approach has previously been proposed that detects clusters of events by testing individual areas, which may be combined with their neighbours. However, the relevant probabilities from the compound Poisson distribution are obtained from a recursion relation that can be cumbersome if the number of events is large or if analyses are performed by strata. We propose a simpler approach that uses an approximate normal distribution. This method is easy to implement and is applicable to situations where the population sizes are large and the population distribution by important strata may differ by area. We demonstrate the approach on pediatric self-inflicted injury presentations to emergency departments and compare the probabilities obtained from the recursion with those from the normal approach. We also implement a Monte Carlo simulation to study the performance of the proposed approach.
In a self-inflicted injury data example, the normal approach identifies thirteen clusters, twelve of which coincide with the twelve significant clusters detected by the compound Poisson approach. In simulation studies, the normal approach approximates the compound Poisson approach well across a variety of population sizes and case and event thresholds.
A drawback of the compound Poisson approach is that the relevant probabilities must be determined through a recursion relation and such calculations can be computationally intensive if the cluster size is relatively large or if analyses are conducted with strata variables. On the other hand, the normal approach is very flexible, easily implemented, and hence, more appealing for users. Moreover, the concepts may be more easily conveyed to non-statisticians interested in understanding the methodology associated with cluster detection test results.
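The contrast between the two ways of obtaining tail probabilities can be illustrated with a minimal sketch. It assumes a generic compound Poisson total S = X_1 + ... + X_N, where N (the number of cases) is Poisson with mean `lam` and each X_i (events per case) follows a discrete distribution `severity` on 1, 2, ...; it is not the authors' exact test statistic, and the function names, the continuity correction, and any numerical inputs are assumptions made for illustration only. The exact probabilities come from the standard Panjer recursion for compound Poisson sums, and the approximation uses the mean `lam * E[X]` and variance `lam * E[X^2]` of S.

```python
import math


def compound_poisson_pmf(lam, severity, s_max):
    """Return [P(S = 0), ..., P(S = s_max)] via the Panjer recursion.

    severity[j] is P(X = j); severity[0] must be 0 (every case has >= 1 event).
    """
    p = [math.exp(-lam)] + [0.0] * s_max
    for s in range(1, s_max + 1):
        upper = min(s, len(severity) - 1)
        total = sum(j * severity[j] * p[s - j] for j in range(1, upper + 1))
        p[s] = lam / s * total
    return p


def upper_tail_exact(lam, severity, k):
    """P(S >= k) computed from the recursion."""
    if k <= 0:
        return 1.0
    pmf = compound_poisson_pmf(lam, severity, k - 1)
    return max(0.0, 1.0 - sum(pmf))


def upper_tail_normal(lam, severity, k):
    """Normal approximation to P(S >= k), with a continuity correction."""
    mean = lam * sum(j * f for j, f in enumerate(severity))
    var = lam * sum(j * j * f for j, f in enumerate(severity))
    z = (k - 0.5 - mean) / math.sqrt(var)
    # 1 - Phi(z) expressed through the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

The recursion costs on the order of `k` times the support size of `severity` for each threshold `k`, and it must be repeated for every area and stratum, whereas the normal version needs only two moments. For a hypothetical area with `lam = 200` and `severity = [0.0, 0.7, 0.2, 0.1]`, calling `upper_tail_exact(200.0, severity, 320)` and `upper_tail_normal(200.0, severity, 320)` gives a quick sense of how close the two tail probabilities are when the expected number of cases is large.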