Zhao Shiwen, Engelhardt Barbara E, Mukherjee Sayan, Dunson David B
Department of Statistical Science, Duke University, Durham, NC.
Department of Computer Science and Center for Statistics and Machine Learning, Princeton University, Princeton, NJ.
J Am Stat Assoc. 2018;113(524):1528-1540. doi: 10.1080/01621459.2017.1341839. Epub 2018 Nov 13.
We develop a generalized method of moments (GMM) approach for fast parameter estimation in a new class of Dirichlet latent variable models with mixed data types. Parameter estimation via GMM has computational and statistical advantages over alternative methods, such as expectation maximization, variational inference, and Markov chain Monte Carlo. A key computational advantage of our method, Moment Estimation for latent Dirichlet models (MELD), is that parameter estimation does not require instantiation of the latent variables. Moreover, performance is agnostic to distributional assumptions of the observations. We derive population moment conditions after marginalizing out the sample-specific Dirichlet latent variables; these moment conditions depend only on the component mean parameters. We illustrate the utility of our approach on simulated data, comparing results from MELD to those of alternative methods, and we show its promise through application to several datasets. Supplementary materials for this article are available online.
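To make the idea of a moment condition that needs no latent variables concrete, here is a minimal sketch in Python. It is not the MELD estimator and does not reproduce the article's moment conditions, which the abstract does not state. It assumes a simplified mixed-membership model for a single categorical variable: each sample draws proportions x from Dirichlet(alpha), picks a component with those proportions, and then draws a category from the corresponding column of a component matrix Phi. Under that assumption, marginalizing out x gives E[one-hot(y)] = Phi (alpha / alpha_0), where alpha_0 is the sum of alpha. The names `simulate` and `first_moment_residual` and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n, Phi, alpha, rng):
    """Draw n one-hot categorical observations from the assumed toy model."""
    d, k = Phi.shape
    # Sample-specific Dirichlet proportions, used only to generate data.
    X = rng.dirichlet(alpha, size=n)
    # Component membership for each sample, drawn from its own proportions.
    M = np.array([rng.choice(k, p=x) for x in X])
    # Observed category, drawn from the chosen component's column of Phi.
    Y = np.array([rng.choice(d, p=Phi[:, m]) for m in M])
    return np.eye(d)[Y]


def first_moment_residual(Y_onehot, Phi, alpha):
    """Sample mean minus the marginal population mean Phi @ (alpha / alpha_0).

    Only the observations and the parameters appear here; the latent
    proportions X are never instantiated.
    """
    return Y_onehot.mean(axis=0) - Phi @ (alpha / alpha.sum())


rng = np.random.default_rng(0)
# Hypothetical parameters: 3 categories, 2 components; columns of Phi sum to 1.
Phi = np.array([[0.7, 0.1],
                [0.2, 0.3],
                [0.1, 0.6]])
alpha = np.array([0.5, 1.5])

Y = simulate(10000, Phi, alpha, rng)
residual = first_moment_residual(Y, Phi, alpha)
print(residual)  # entries should be close to zero at the true parameters
```

In a GMM estimator, residuals of this kind would be driven toward zero by minimizing a weighted norm over the parameters, rather than being evaluated at known values as in this check.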