Van Meter Karla C, Christiansen Lasse E, Hertz-Picciotto Irva, Azari Rahman, Carpenter Tim E
Department of Public Health Sciences, School of Medicine, University of California, Davis, USA.
Int J Health Geogr. 2008 May 28;7:26. doi: 10.1186/1476-072X-7-26.
Individual point data can be analyzed against an entire cohort, rather than only against sampled controls, to give an accurate picture of the geographic distribution of populations at risk for low-prevalence diseases. When the data are analyzed as individual points, many smaller clusters with high relative risks (RR) and low empirical p values are indistinguishable from a random distribution. When points are aggregated into areal units, a small cluster may appear as a larger cluster with a low RR, or may be lost entirely if it is split into pieces that fall within units whose larger populations show no increased prevalence. Previous simulation studies showed that spatial scan tests have lower validity for true clusters with low RR. Using simulations, this study explored the effects of low cluster RR and areal unit size on local area clustering test (LACT) results and proposes a procedure to improve the accuracy of cohort spatial analysis for rare events.
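The aggregation effect described above can be illustrated with simple arithmetic. The following is a minimal sketch, not part of the study: it assumes a cluster lying wholly inside one areal unit, a uniform baseline prevalence everywhere else, and RR measured against that baseline. The function name `diluted_rr` and the example numbers are hypothetical.

```python
def diluted_rr(cluster_pop: int, unit_pop: int, true_rr: float) -> float:
    """Relative risk observed for a whole areal unit that contains a cluster.

    Assumption (illustrative only): residents inside the cluster have
    true_rr times the baseline prevalence, and all other residents of the
    unit have exactly the baseline prevalence.
    """
    if not 0 < cluster_pop <= unit_pop:
        raise ValueError("cluster_pop must be positive and no larger than unit_pop")
    cluster_fraction = cluster_pop / unit_pop
    # Population-weighted average of the elevated risk and the baseline risk (1.0).
    return cluster_fraction * true_rr + (1.0 - cluster_fraction) * 1.0


# Hypothetical numbers: a cluster of 1,000 births with RR 3.0,
# absorbed into an areal unit of 10,000 births.
unit_rr = diluted_rr(cluster_pop=1_000, unit_pop=10_000, true_rr=3.0)
print(round(unit_rr, 2))
```

Under these assumptions, a threefold risk in one tenth of the unit shows up as a unit-level RR of about 1.2, which is in the low-RR range where the abstract reports that scan tests perform poorly.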
Our simulations demonstrated the relationship of true RR to observed RR and p values across a variety of randomly located cluster shapes, areal unit sizes and scanning window shapes in a diverse population distribution. Clusters with RR < 1.7 had elevated observed RRs and high p values. We propose a cluster identification procedure that applies multiple LACTs in parallel, one on point data and three on two distinct sets of areal units created with varying population parameters that minimize the range of population sizes among units. By accepting only clusters that are identified by all LACTs and that have a minimum population size, a minimum relative risk and a maximum p value, this procedure improves on the specificity achieved by any one of these tests alone in a cohort study of low-prevalence data, while retaining sensitivity for small clusters. The procedure is demonstrated on two study regions, each with a five-year cohort of births and cases of a rare developmental disorder.
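The acceptance rule described above can be sketched as a simple filter. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that clusters from the different LACTs have already been matched to a shared identifier (the abstract does not say how clusters are matched across tests), and that the thresholds must hold in every test. The names `Cluster` and `accept_clusters` and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    cluster_id: str        # hypothetical shared ID after matching across tests
    population: int        # cohort members inside the cluster
    relative_risk: float   # observed RR reported by the test
    p_value: float         # empirical p value reported by the test


def accept_clusters(results_per_test, min_population, min_rr, max_p):
    """Return IDs of clusters found by every test that meet all thresholds.

    results_per_test: list of dicts mapping cluster_id -> Cluster, one dict
    per LACT run (for example one on point data, three on areal units).
    """
    if not results_per_test:
        return []
    # Keep only clusters that every test identified.
    common = set(results_per_test[0])
    for result in results_per_test[1:]:
        common &= set(result)

    def passes(c: Cluster) -> bool:
        return (c.population >= min_population
                and c.relative_risk >= min_rr
                and c.p_value <= max_p)

    # Assumption: the thresholds must be satisfied in each test's result.
    return sorted(cid for cid in common
                  if all(passes(result[cid]) for result in results_per_test))


# Hypothetical results from four parallel tests.
point = {"A": Cluster("A", 1200, 2.4, 0.01), "B": Cluster("B", 300, 3.0, 0.02),
         "C": Cluster("C", 2000, 2.1, 0.01)}
areal_1 = {"A": Cluster("A", 1500, 2.0, 0.03), "B": Cluster("B", 350, 2.8, 0.02),
           "C": Cluster("C", 2100, 2.0, 0.02)}
areal_2 = {"A": Cluster("A", 1400, 2.2, 0.02), "B": Cluster("B", 320, 2.9, 0.03)}
areal_3 = {"A": Cluster("A", 1300, 2.1, 0.04), "B": Cluster("B", 310, 3.1, 0.01),
           "C": Cluster("C", 1900, 2.3, 0.01)}

accepted = accept_clusters([point, areal_1, areal_2, areal_3],
                           min_population=500, min_rr=1.7, max_p=0.05)
print(accepted)
```

In this example only cluster "A" is accepted: "B" falls below the hypothetical minimum population, and "C" is missed by one of the areal tests. Requiring agreement across all tests is what trades some sensitivity for the improved specificity the abstract describes.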
For truly exploratory research on a rare disorder, false positive clusters can divert research efforts at considerable cost. By limiting false positives, this procedure identifies 'crude' clusters that can then be analyzed for known demographic risk factors, so that exploration for geographically based environmental exposures can focus on areas of otherwise unexplained raised incidence.