Virgili-Gervais Gabrielle, Schmidt Alexandra M, Bixby Honor, Cavanaugh Alicia, Owusu George, Agyei-Mensah Samuel, Robinson Brian, Baumgartner Jill
Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada.
Institute of Public Health and Wellbeing, University of Essex, Colchester, UK.
J R Stat Soc Ser A Stat Soc. 2024 Aug 20:qnae080. doi: 10.1093/jrsssa/qnae080.
We propose a Bayesian hierarchical model to estimate a socio-economic status (SES) index from mixed dichotomous and continuous variables. In particular, we extend Quinn's ([2004]. Bayesian factor analysis for mixed ordinal and continuous responses. (4), 338-353. https://doi.org/10.1093/pan/mph022) and Schliep and Hoeting's ([2013]. Multilevel latent Gaussian process model for mixed discrete and continuous multivariate response data. (4), 492-513. https://doi.org/10.1007/s13253-013-0136-z) factor analysis models for mixed dichotomous and continuous variables by allowing key model parameters to have a spatial hierarchical structure. Unlike most SES assessment models proposed in the literature, the hierarchical nature of this model enables the use of census observations at the household level without aggregating any information. It therefore better accommodates the variability of SES between census tracts and the number of households per area. The proposed model is used to estimate a socio-economic index from 10% of the 2010 Ghana census in the Greater Accra Metropolitan Area. Of the 20 observed variables, the number of people per room, access to piped water, and access to flush toilets best differentiated high- and low-SES areas.
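To make the data structure concrete, the following is a minimal simulation sketch of the kind of model the abstract describes, not the authors' specification. It assumes a single latent SES factor per household, area-level means for that factor, and dichotomous variables obtained by thresholding a latent Gaussian response at zero (a probit-style construction in the spirit of Quinn's approach). The function name, parameter values, and the independent (non-spatial) area means are all hypothetical simplifications chosen for illustration.

```python
import numpy as np


def simulate_mixed_factor_data(n_areas=5, households_per_area=200,
                               n_binary=3, n_continuous=2, seed=0):
    """Simulate household-level mixed data from a one-factor model.

    Hypothetical illustration only: loadings, intercepts, and variances
    are arbitrary, and area means are drawn independently rather than
    with a spatial dependence structure.
    """
    rng = np.random.default_rng(seed)
    n_vars = n_binary + n_continuous

    # Variable-specific loadings (kept positive here) and intercepts.
    loadings = rng.uniform(0.5, 1.5, size=n_vars)
    intercepts = rng.normal(0.0, 0.5, size=n_vars)

    # Area-level mean SES (assumption: independent across areas).
    area_mean = rng.normal(0.0, 1.0, size=n_areas)

    # Area label for every household; no aggregation takes place.
    area = np.repeat(np.arange(n_areas), households_per_area)

    # Household-level latent SES centred on its area mean.
    ses = area_mean[area] + rng.normal(0.0, 1.0, size=area.size)

    # Latent Gaussian response for each household and variable.
    noise = rng.normal(0.0, 1.0, size=(area.size, n_vars))
    latent = intercepts + np.outer(ses, loadings) + noise

    # Dichotomous variables: indicator that the latent response is positive.
    binary = (latent[:, :n_binary] > 0).astype(int)
    # Continuous variables: observed directly.
    continuous = latent[:, n_binary:]

    return area, ses, binary, continuous, area_mean


if __name__ == "__main__":
    area, ses, binary, continuous, area_mean = simulate_mixed_factor_data()
    print(binary.shape, continuous.shape)
```

In this sketch every household contributes its own row, so areas with more households simply contribute more rows; fitting such a model would additionally require priors and a sampler, which are outside the scope of this illustration.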