Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran; Non-Communicable Disease Research Center, Endocrinology and Metabolism Population Science Research Institute, Tehran University of Medical Sciences, Tehran, Iran. mohamadk@tums.ac.ir
Non-Communicable Disease Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran; Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Institute, Tehran University of Medical Sciences, Tehran, Iran.
Arch Iran Med. 2014 Jan;17(1):28-33.
Identifying the burden of disease and its inequality between geographical regions is important for setting health priorities. Estimating the burden of disease with statistical models is unavoidable, especially when data are sparse. For this purpose, a spatio-temporal model provides a statistically sound approach to explaining a response variable observed across a region and at various times. However, the analysis of such complex data raises several methodological challenges. Our primary objective is to provide remedies for these challenges.
Data from nationally representative surveys and systematic reviews have been gathered across contiguous areal units over a period of more than 20 years (1990-2013). Generally, observations of areal units are spatially and temporally correlated, so that observations closer in space and time tend to be more correlated than those farther apart. It is critical to determine the correlation structure of a space-time process that has been observed over a set of irregular regions. Moreover, these data sets have a high percentage of missing values, include misaligned areal units and areas with small sample sizes, and may show nonlinear trends over space and time. Furthermore, the Gaussian assumption might be too restrictive to represent the data. In this setting, traditional statistical techniques are not appropriate, and a more flexible and comprehensive methodology is required.
In particular, we focus on approaches that extend spatio-temporal models previously proposed in the literature. Because the outcomes to be modeled are both continuous and categorical, we assume a latent variable framework to describe the underlying structure of the mixed outcomes and use a conditionally autoregressive (CAR) prior for the random effects. In addition, we will employ misalignment modeling to combine areal units that are incompatible between data sources and/or across years, in order to obtain a unified, clear picture of population health status over this period. To take parameter uncertainty into account, we pursue Bayesian sampling-based inference, and a hierarchical Bayes approach is therefore constructed to model the data. The hierarchical structure enables us to "borrow information" from neighboring areal units to improve estimates for areas with missing values or few observations. Given their general applicability and ease of implementation, Markov chain Monte Carlo (MCMC) methods are the best-suited tool for performing the Bayesian inference.
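As an illustration of how a CAR prior lets an area "borrow information" from its neighbors, the following minimal sketch builds the precision matrix of a proper CAR prior and computes the prior conditional mean of one area's random effect given the others. It is not the authors' model: the abstract does not say which CAR variant is used, and the function names, the four-area adjacency matrix, and the values of `tau` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np

def car_precision(W, tau=1.0, rho=0.9):
    """Precision matrix of a proper CAR prior, tau * (D - rho * W).

    W is a symmetric 0/1 adjacency matrix, D is the diagonal matrix of
    neighbor counts, tau is a precision scale, and |rho| < 1 controls the
    strength of spatial dependence. All values here are illustrative.
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)

def conditional_mean(Q, phi, i):
    """E[phi_i | all other phi] for a zero-mean Gaussian with precision Q."""
    others = np.arange(len(phi)) != i
    return -(Q[i, others] @ phi[others]) / Q[i, i]

# Hypothetical map: four areas on a line, 0 - 1 - 2 - 3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Q = car_precision(W, tau=2.0, rho=0.9)

# Random effects for the four areas; the entry for area 1 is a placeholder,
# since it is excluded when conditioning on the other areas.
phi = np.array([0.5, 0.0, 1.5, 1.0])

m = conditional_mean(Q, phi, 1)   # rho * average of the neighbors 0 and 2
v = 1.0 / Q[1, 1]                 # conditional variance, 1 / (tau * n_neighbors)
print(m, v)
```

Under this prior, the conditional mean for an area is `rho` times the average of its neighbors' effects, and the conditional variance shrinks as the number of neighbors grows. In a full hierarchical model these conditionals would be combined with the likelihood inside an MCMC sampler.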
This study aims to combine the different available data sources to produce precise and reliable evidence on the burden of diseases and risk factors in Iran and on their disparities among geographical regions over time. Appropriate statistical methods and models are crucial for overcoming the problems described above, obtaining satisfactory estimates of model parameters, and reaching an accurate assessment.