Yang Hongxia, O'Brien Sean, Dunson David B
Mathematical Sciences Department, Watson Research Center, IBM, Yorktown Heights, NY 10598
J Am Stat Assoc. 2011 Sep 1;106(495):807-817. doi: 10.1198/jasa.2011.ap10058.
Latent class models (LCMs) are used increasingly for addressing a broad variety of problems, including sparse modeling of multivariate and longitudinal data, model-based clustering, and flexible inferences on predictor effects. Typical frequentist LCMs require estimation of a single finite number of classes, which does not increase with the sample size, and have a well-known sensitivity to parametric assumptions on the distributions within a class. Bayesian nonparametric methods have been developed to allow an infinite number of classes in the general population, with the number represented in a sample increasing with sample size. In this article, we propose a new nonparametric Bayes model that allows predictors to flexibly impact the allocation to latent classes, while limiting sensitivity to parametric assumptions by allowing class-specific distributions to be unknown subject to a stochastic ordering constraint. An efficient MCMC algorithm is developed for posterior computation. The methods are validated using simulation studies and applied to the problem of ranking medical procedures in terms of the distribution of patient morbidity.
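The statement that the number of classes represented in a sample grows with sample size can be illustrated with a small simulation. The sketch below uses a Chinese restaurant process, a standard Bayesian nonparametric prior with this property; it is an assumption chosen for illustration only and is not the predictor-dependent, stochastically ordered model proposed in the article. The function name `crp_class_counts` and the parameters `alpha` and `seed` are hypothetical.

```python
import random


def crp_class_counts(n, alpha, seed=0):
    """Simulate n draws from a Chinese restaurant process (illustrative only).

    alpha is the concentration parameter (assumed > 0).
    Returns a list whose i-th entry is the number of distinct classes
    occupied after the first i + 1 observations.
    """
    rng = random.Random(seed)
    sizes = []     # sizes[k] = number of observations currently in class k
    occupied = []  # running count of distinct occupied classes
    for i in range(n):
        # Observation i opens a new class with probability alpha / (i + alpha);
        # otherwise it joins class k with probability sizes[k] / (i + alpha).
        u = rng.random() * (i + alpha)
        if u < alpha or not sizes:
            sizes.append(1)
        else:
            u -= alpha
            for k, size in enumerate(sizes):
                if u < size:
                    sizes[k] += 1
                    break
                u -= size
            else:
                sizes[-1] += 1  # guard against floating-point round-off
        occupied.append(len(sizes))
    return occupied


if __name__ == "__main__":
    counts = crp_class_counts(10000, alpha=1.0)
    for n in (10, 100, 1000, 10000):
        print(n, counts[n - 1])
```

Under this prior the expected number of occupied classes after n observations is the sum of alpha / (alpha + i) for i = 0, ..., n - 1, which grows roughly like alpha log(n), so new classes keep appearing but at a slowing rate. A finite LCM, by contrast, fixes the number of classes regardless of n.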