Lemke Dorothea, Mattauch Volkmar, Heidinger Oliver, Pebesma Edzer, Hense Hans-Werner
Institute of Epidemiology and Social Medicine, Medical Faculty, Westfälische Wilhelms-Universität Münster, Münster, Germany.
Institute for Geoinformatics, Geosciences Faculty, Westfälische Wilhelms-Universität Münster, Münster, Germany.
Int J Health Geogr. 2015 Mar 31;14:15. doi: 10.1186/s12942-015-0005-9.
Monitoring spatial disease risk (e.g., identifying risk areas) is highly relevant to public health research, especially cancer epidemiology. A common strategy uses case-control data to estimate a spatial relative risk function (sRRF) via kernel density estimation (KDE). This study evaluated sRRF estimation based on fixed versus adaptive bandwidth KDE and assessed how well each approach detected 'risk areas' using case data from a population-based cancer registry.
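To make the idea of an sRRF concrete, the following minimal sketch computes a log relative risk as the log ratio of a case density to a control density, both estimated with a fixed-bandwidth Gaussian kernel. The function names, the isotropic Gaussian kernel, the shared bandwidth `h`, and the toy coordinates are assumptions chosen for illustration; they are not taken from the study, and edge correction is omitted.

```python
import math

def gaussian_kde_2d(points, x, y, h):
    """Fixed-bandwidth isotropic Gaussian kernel density estimate at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * h * h * len(points))
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h))
    return norm * total

def log_relative_risk(cases, controls, x, y, h):
    """Log ratio of the case density to the control density at (x, y)."""
    f = gaussian_kde_2d(cases, x, y, h)
    g = gaussian_kde_2d(controls, x, y, h)
    return math.log(f) - math.log(g)

# Toy coordinates (hypothetical, not from the study).
cases = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1), (2.0, 2.0)]
controls = [(0.0, 0.0), (2.0, 2.0), (2.1, 1.9), (1.9, 2.1)]

# Positive where cases are over-represented, negative where controls dominate.
rho_near_cases = log_relative_risk(cases, controls, 0.1, 0.0, h=0.5)
rho_near_controls = log_relative_risk(cases, controls, 2.0, 2.0, h=0.5)
```

Values above zero indicate locations where cases are denser than expected from the control distribution. If the control sample under-represents the population somewhere, the ratio is inflated there, which is the mechanism the study exploits to define its 'true risk areas'.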
The sRRFs were estimated within a defined study area from the locations of incident cancer cases and of a spatial sample of controls. The controls were drawn from a high-resolution population grid known to underestimate the resident population in urban centers. The spatial extent of the areas with underestimated resident population was quantified using population reference data, and these areas served as the 'true risk areas' in this study. Sensitivity and specificity were assessed by spatially overlaying the 'true risk areas' with the significant (α = 0.05) p-value contour lines obtained from the sRRF.
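The overlay-based evaluation can be illustrated on a rasterized version of the problem. This sketch assumes that both the 'true risk areas' and the area enclosed by the significant contour have been converted to one boolean flag per grid cell; the study itself describes a spatial overlay, so the cell-wise formulation and all names here are illustrative assumptions.

```python
def sensitivity_specificity(truth, flagged):
    """Cell-wise sensitivity and specificity.

    truth   -- iterable of booleans, True where a cell lies in a 'true risk area'
    flagged -- iterable of booleans, True where a cell lies inside the
               significant contour of the estimated risk surface
    """
    tp = fp = tn = fn = 0
    for t, f in zip(truth, flagged):
        if t and f:
            tp += 1
        elif t and not f:
            fn += 1
        elif not t and f:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Toy grid of eight cells (hypothetical values).
truth = [True, True, True, False, False, False, False, False]
flagged = [True, True, False, True, False, False, False, False]
sens, spec = sensitivity_specificity(truth, flagged)
```

Here two of the three true risk cells are detected and four of the five remaining cells are correctly left unflagged. The false positive rate discussed in the conclusions is one minus the specificity.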
The fixed bandwidth sRRF behaved conservatively in identifying these urban 'risk areas': compared with the adaptive risk estimator, oversmoothing reduced its sensitivity but increased its specificity. The adaptive estimator, in contrast, appeared more competitive owing to variance stabilization, achieving a higher sensitivity with a specificity equal to that of the fixed risk estimator. Halving the originally determined bandwidths improved both the sensitivity and the specificity of the adaptive sRRF, whereas it reduced the specificity of the fixed estimator.
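The difference between the two bandwidth regimes can be sketched as follows. The abstract does not state which adaptive scheme was used; this example assumes Abramson's square-root rule, a common choice in which each observation receives a bandwidth inversely proportional to the square root of a pilot density at its location. The function names, the global bandwidth `h0`, the pilot bandwidth `pilot_h`, and the toy points are all hypothetical.

```python
import math

def fixed_kde(points, x, y, h):
    """Fixed-bandwidth isotropic Gaussian KDE at (x, y)."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h))
    return total / (2.0 * math.pi * h * h * len(points))

def abramson_bandwidths(points, h0, pilot_h):
    """Per-point bandwidths h_i = h0 * sqrt(gm / pilot(x_i)).

    gm is the geometric mean of the pilot densities, so the bandwidths
    are centered (geometrically) on the global bandwidth h0.
    """
    pilot = [fixed_kde(points, px, py, pilot_h) for px, py in points]
    gm = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h0 * math.sqrt(gm / p) for p in pilot]

def adaptive_kde(points, bandwidths, x, y):
    """Adaptive Gaussian KDE: each point contributes with its own bandwidth."""
    total = 0.0
    for (px, py), h in zip(points, bandwidths):
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * h * h)) / (2.0 * math.pi * h * h)
    return total / len(points)

# Toy data: a dense "urban" cluster and one isolated "rural" point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
bandwidths = abramson_bandwidths(points, h0=0.5, pilot_h=0.5)
density_at_cluster = adaptive_kde(points, bandwidths, 0.05, 0.05)
```

Points in the dense cluster receive bandwidths below `h0`, so less smoothing is applied where data are plentiful, while the isolated point receives a larger bandwidth. This is consistent with the reported pattern that the fixed estimator oversmooths in urban areas. Halving the bandwidths, as examined in the study, would correspond to halving `h0` in this sketch.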
The fixed risk estimator tended to oversmooth in urban areas while overestimating the risk in rural areas. An adaptive bandwidth regime attenuated this pattern but generally led to a higher false positive rate because, in our study design, most true risk areas were located in urban areas. There remains a strong need to further optimize bandwidth selection methods, especially for the adaptive sRRF.