Zhang Xiao
Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA.
Stat Med. 2020 Nov 10;39(25):3637-3652. doi: 10.1002/sim.8685. Epub 2020 Jul 24.
Data augmentation is commonly used in Bayesian analysis of correlated binary data with multivariate probit models. However, the identification issue in multivariate probit models necessitates a rigorous Metropolis-Hastings algorithm for sampling a correlation matrix, which may cause slow convergence and inefficiency of the Markov chains. It is well known that parameter-expanded data augmentation, by introducing a working (artificial) parameter or parameter vector, renders an identifiable model non-identifiable and improves the mixing and convergence of the data augmentation components. We are therefore motivated to develop efficient parameter-expanded data augmentation algorithms for analyzing correlated binary data with multivariate probit models. We investigate both the identifiable and non-identifiable multivariate probit models and develop the corresponding parameter-expanded data augmentation algorithms. We point out that the approaches based on a non-identifiable model circumvent the Metropolis-Hastings algorithm for sampling a correlation matrix and improve the convergence and mixing of the correlation parameters, whereas the identifiable model may produce estimated regression parameters with smaller standard errors than the non-identifiable model does. We illustrate the proposed approaches with simulation studies and with an application to a longitudinal dataset from the Six Cities study.
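The identification issue mentioned in the abstract can be illustrated with a small numerical sketch. It is not the paper's algorithm; it only assumes the standard latent-variable form of a multivariate probit model, in which each binary response is the sign of a latent Gaussian variable. The covariate, coefficients, covariance matrix, and the helper `cov_to_corr` are all hypothetical values and names chosen for illustration. Rescaling each latent coordinate by a positive constant leaves the observed binary data unchanged, so only the correlation matrix and the standardized coefficients are determined by the data.

```python
import numpy as np


def cov_to_corr(sigma):
    """Rescale a covariance matrix to the correlation matrix D^{-1/2} Sigma D^{-1/2}."""
    d = np.sqrt(np.diag(sigma))
    return sigma / np.outer(d, d)


rng = np.random.default_rng(0)
n, p = 500, 3

# Hypothetical design: one covariate and one coefficient per binary response.
x = rng.normal(size=(n, 1))
beta = np.array([0.5, -1.0, 0.8])

# Hypothetical unrestricted (positive definite) covariance of the latent variables.
sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 0.5]])

# Latent variables z_i ~ N(x_i * beta, sigma); observed data y_ij = 1(z_ij > 0).
z = x * beta + rng.multivariate_normal(np.zeros(p), sigma, size=n)
y = (z > 0).astype(int)

# Rescale every latent coordinate by an arbitrary positive constant.
scale = np.array([3.0, 0.2, 7.0])
y_scaled = (z * scale > 0).astype(int)
same_data = np.array_equal(y, y_scaled)  # the binary data cannot see the scale

# Parameters implied by the rescaled latent variables.
beta_scaled = beta * scale
sigma_scaled = sigma * np.outer(scale, scale)

# Quantities that survive the rescaling: correlations and standardized coefficients.
corr = cov_to_corr(sigma)
beta_identified = beta / np.sqrt(np.diag(sigma))
```

Under these assumptions, a sampler that works with an unrestricted covariance matrix can report `cov_to_corr(sigma)` and the standardized coefficients at each iteration, which is the general idea behind working with a non-identifiable model instead of sampling a correlation matrix directly.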