Department of Biostatistics, University of Florida, Gainesville, FL 32611, U.S.A.
Stat Med. 2018 Jun 15;37(13):2094-2107. doi: 10.1002/sim.7622. Epub 2018 Feb 21.
To monitor the incidence rates of cancers, AIDS, cardiovascular diseases, and other chronic or infectious diseases, global, national, and regional reporting systems have been built to collect and provide population-based data on disease incidence. Such databases usually report daily, monthly, or yearly disease incidence numbers at the city, county, state, or country level. The incidence numbers collected at different places and times are often correlated, with those closer in place or time being more strongly correlated. This correlation reflects the impact of various confounding risk factors, such as weather, demographic factors, lifestyles, and other cultural and environmental factors. Because that impact is complicated and challenging to describe, the spatiotemporal (ST) correlation in the observed disease incidence data has a complicated structure as well. Furthermore, the ST correlation is hidden in the observed data and cannot be observed directly. ST data modeling has been discussed in the literature, but the existing methods either impose restrictive assumptions on the ST correlation that are hard to justify, or ignore the ST correlation partially or entirely. This paper aims to develop a flexible and effective method for modeling ST disease incidence data using nonparametric local smoothing. The method can properly accommodate the ST data correlation. Theoretical justifications and numerical studies show that it works well in practice.