Khan Diba, Rossen Lauren M, Hedegaard Holly, Warner Margaret
Division of Research Methodology, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD 20782.
Division of Vital Statistics, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD 20782.
J Data Sci. 2018 Jan;16(1):147-182.
Hierarchical Bayes models have been used in disease mapping to examine small-scale geographic variation. State-level geographic variation in less common causes of mortality has been reported; however, county-level variation is rarely examined. Because of concerns about statistical reliability and confidentiality, county-level mortality rates based on fewer than 20 deaths are suppressed under the statistical reliability criteria of the Division of Vital Statistics, National Center for Health Statistics (NCHS). This suppression precludes using direct estimates to examine county-level spatio-temporal variation in less common mortality outcomes such as suicide rates (SRs). Existing Bayesian spatio-temporal modeling strategies can be applied via Integrated Nested Laplace Approximation (INLA) in R to a large number of rare mortality outcomes, enabling examination of spatio-temporal variation on smaller geographic scales such as counties. This method allows examination of spatio-temporal variation across the entire U.S., even where the data are sparse. As one application of this strategy, we used mortality data from 2005-2015 and R-INLA to predict year- and county-specific SRs and explore their spatio-temporal variation. Specifically, hierarchical Bayesian spatio-temporal models were implemented in R-INLA with spatially structured and unstructured random effects, correlated time effects, time-varying confounders, and space-time interaction terms, borrowing strength across both counties and years to produce smoothed county-level SRs. Model-based estimates of SRs were mapped to explore geographic variation.
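For orientation, the model components listed above can be written in a generic form that is common in the disease-mapping literature. This is only a sketch under stated assumptions: the Poisson likelihood, the population offset, and all symbols below are illustrative choices, not the specification reported by the authors.

```latex
% Illustrative sketch only; likelihood, offset, and notation are assumptions.
% y_{it}: suicide deaths in county i, year t;  n_{it}: population at risk
\begin{align}
  y_{it} \mid \theta_{it} &\sim \mathrm{Poisson}\!\left(n_{it}\,\theta_{it}\right), \\
  \log \theta_{it} &= \alpha + \mathbf{x}_{it}^{\top}\boldsymbol{\beta}
      + u_i + v_i + \gamma_t + \delta_{it}.
\end{align}
% \alpha:        overall intercept
% \mathbf{x}_{it}: time-varying confounders with coefficients \boldsymbol{\beta}
% u_i:           spatially structured county effect (e.g., an intrinsic CAR prior)
% v_i:           unstructured county effect (e.g., i.i.d. Gaussian)
% \gamma_t:      correlated time effect (e.g., a random walk over years)
% \delta_{it}:   space-time interaction term
```

In a form like this, the smoothed rate for county $i$ in year $t$ is $\theta_{it}$. The shared priors on $u_i$, $\gamma_t$, and $\delta_{it}$ are what allow estimates for sparse counties to borrow strength from neighboring counties and adjacent years.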