Lu Kaifeng, Chen Fang
Global Statistics, BeiGene, Ridgefield Park, NJ, USA.
SAS Institute Inc., Cary, NC, USA.
Stat Methods Med Res. 2022 Dec;31(12):2261-2286. doi: 10.1177/09622802221122403. Epub 2022 Sep 21.
Dichotomous response data observed over multiple time points, especially data that exhibit longitudinal structures, are important in many applied fields. The multivariate probit model has been an attractive tool in such situations for its ability to handle correlations among the outcomes, typically by modeling the covariance (correlation) structure of the latent variables. In addition, a multivariate probit model facilitates controlled imputations for nonignorable dropout, a phenomenon commonly observed in clinical trials of experimental drugs or biologic products. While the model is relatively simple to specify, estimation is less straightforward, particularly from a Bayesian perspective that relies on Markov chain Monte Carlo sampling. Here we compare five sampling algorithms for the correlation matrix and discuss their merits: a parameter-expanded Metropolis-Hastings algorithm (Zhang et al., 2006), a parameter-expanded Gibbs sampling algorithm (Talhouk et al., 2012), a parameter-expanded Gibbs sampling algorithm with unit constraints on conditional variances (Tang, 2018), a partial autocorrelation parameterization approach (Gaskins et al., 2014), and a semi-partial correlation parameterization approach (Ghosh et al., 2021). We describe each algorithm, use simulation studies to evaluate their performance, and focus on comparison criteria such as computational cost, convergence time, robustness, and ease of implementation. We find that the parameter-expanded Gibbs sampling algorithm by Talhouk et al. (2012) often has the most efficient convergence with relatively low computational complexity, while the partial autocorrelation parameterization approach is more flexible for estimating the correlation matrix of latent variables for typical late-phase longitudinal studies.
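To make the latent-variable structure described in the abstract concrete, the following minimal sketch simulates dichotomous longitudinal responses from a multivariate probit model. It is an illustration only and not any of the five samplers compared in the paper. The AR(1) correlation matrix, the latent means, the sample size, and all function names are assumptions chosen for the example. Each subject's latent vector z is drawn from a normal distribution with correlation matrix R, and the observed response at each time point is 1 when the latent value is positive.

```python
import math
import random


def ar1_correlation(n_times, rho):
    """Return an AR(1) correlation matrix (an assumed, illustrative structure)."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulate_probit_responses(means, corr, n_subjects, rng):
    """Draw latent z ~ N(means, corr) per subject; return y = 1 if z > 0 else 0."""
    lower = cholesky(corr)
    n = len(means)
    data = []
    for _ in range(n_subjects):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = means + L @ eps has correlation matrix L @ L^T = corr
        z = [means[i] + sum(lower[i][k] * eps[k] for k in range(i + 1))
             for i in range(n)]
        data.append([1 if value > 0 else 0 for value in z])
    return data


# Hypothetical settings: 4 visits, AR(1) correlation 0.6, 2000 subjects.
rng = random.Random(2022)
corr = ar1_correlation(4, 0.6)
responses = simulate_probit_responses([0.0, 0.2, 0.4, 0.6], corr, 2000, rng)
rates = [sum(row[t] for row in responses) / len(responses) for t in range(4)]
```

Because the latent variables have unit variance, the marginal response probability at each time point is the standard normal CDF evaluated at the latent mean. Only the correlation matrix, not a general covariance matrix, is identifiable from the binary data, which is why the algorithms compared in the abstract target the correlation matrix.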