Murray Jared S, Dunson David B, Carin Lawrence, Lucas Joseph E
Dept. of Statistical Science, Duke University, Durham, NC 27708 (
J Am Stat Assoc. 2013 Jun 1;108(502):656-665. doi: 10.1080/01621459.2012.762328.
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables or generalized latent trait models accommodating measurements in the exponential family. However, when generalizing to non-Gaussian measured variables, the latent variables typically influence both the dependence structure and the form of the marginal distributions, complicating interpretation and introducing artifacts. To address this problem, we propose a novel class of Bayesian Gaussian copula factor models that decouple the latent factors from the marginal distributions. A semiparametric specification for the marginals based on the extended rank likelihood yields straightforward implementation and substantial computational gains. We provide new theoretical and empirical justifications for using this likelihood in Bayesian inference. We propose new default priors for the factor loadings and develop efficient parameter-expanded Gibbs sampling for posterior computation. The methods are evaluated through simulations and applied to a dataset in political science. The models in this paper are implemented in the R package bfa.
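The decoupling of latent factors from marginal distributions can be illustrated with a small simulation. The following is a minimal sketch, not the bfa implementation: it assumes a latent Gaussian factor structure with unit idiosyncratic variances, and the function names (`normal_cdf`, `simulate_copula_factor`), the loadings, and the three marginals are hypothetical choices made only for this example. Each latent Gaussian variable is standardized, mapped to a uniform through the normal CDF, and then pushed through an arbitrary inverse CDF, so the loadings govern dependence while the marginals are chosen freely.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def simulate_copula_factor(n, loadings, inverse_cdfs, rng):
    """Draw n rows from a Gaussian copula factor model (illustrative sketch)."""
    p, k = loadings.shape
    eta = rng.standard_normal((n, k))                    # latent factors
    z = eta @ loadings.T + rng.standard_normal((n, p))   # latent Gaussian variables
    scale = np.sqrt(1.0 + np.sum(loadings**2, axis=1))   # marginal sd of each z_j
    u = normal_cdf(z / scale)                            # uniform(0, 1) marginals
    # Each column gets its own marginal via a nondecreasing inverse CDF.
    y = np.column_stack([inverse_cdfs[j](u[:, j]) for j in range(p)])
    return z, y

rng = np.random.default_rng(0)
loadings = np.array([[0.9], [0.7], [-0.5]])  # hypothetical: p = 3 variables, k = 1 factor

inverse_cdfs = [
    lambda u: -np.log1p(-u),                        # exponential(1): skewed, continuous
    lambda u: np.searchsorted([0.2, 0.6, 0.9], u),  # ordinal with four levels
    lambda u: u,                                    # uniform(0, 1)
]
z, y = simulate_copula_factor(500, loadings, inverse_cdfs, rng)

# A strictly increasing marginal transform leaves the ordering unchanged,
# so the ranks of y[:, 0] match the ranks of the latent z[:, 0].
same_order = np.array_equal(np.argsort(z[:, 0]), np.argsort(y[:, 0]))
print(same_order)
```

Because the marginal transforms are monotone, the ordering of each observed variable carries the same information about the latent scale regardless of which marginal is used. This ordering is what a rank-based likelihood relies on, which is why the marginals need not be modeled parametrically.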