Fast Moment Estimation for Generalized Latent Dirichlet Models.

Author information

Zhao Shiwen, Engelhardt Barbara E, Mukherjee Sayan, Dunson David B

Affiliations

Department of Statistical Science, Duke University, Durham, NC.

Department of Computer Science and Center for Statistics and Machine Learning, Princeton University, Princeton, NJ.

Publication information

J Am Stat Assoc. 2018;113(524):1528-1540. doi: 10.1080/01621459.2017.1341839. Epub 2018 Nov 13.

Abstract

We develop a generalized method of moments (GMM) approach for fast parameter estimation in a new class of Dirichlet latent variable models with mixed data types. Parameter estimation via GMM has computational and statistical advantages over alternative methods, such as expectation maximization, variational inference, and Markov chain Monte Carlo. A key computational advantage of our method, Moment Estimation for latent Dirichlet models (MELD), is that parameter estimation does not require instantiation of the latent variables. Moreover, performance is agnostic to distributional assumptions of the observations. We derive population moment conditions after marginalizing out the sample-specific Dirichlet latent variables. The moment conditions only depend on component mean parameters. We illustrate the utility of our approach on simulated data, comparing results from MELD to alternative methods, and we show the promise of our approach through the application to several datasets. Supplementary materials for this article are available online.
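
The claim that the moment conditions depend only on component mean parameters can be illustrated with a small simulation. The sketch below is not the MELD implementation. It assumes a hypothetical setup in which each observed variable has conditional mean `phi @ x` given a latent `x ~ Dirichlet(alpha)`, and in which different variables are conditionally independent given `x`. The names `phi_j`, `phi_t`, `alpha`, and the Poisson and Gaussian observation models are illustrative choices, not taken from the abstract. Under those assumptions, the standard Dirichlet identity E[x xᵀ] = (diag(α) + α αᵀ) / (α₀(α₀ + 1)), with α₀ the sum of α, gives a cross moment that can be compared with data without ever estimating the latent `x`.

```python
import numpy as np


def dirichlet_second_moment(alpha):
    """Return E[x x^T] for x ~ Dirichlet(alpha) (standard identity)."""
    alpha = np.asarray(alpha, dtype=float)
    a0 = alpha.sum()
    return (np.diag(alpha) + np.outer(alpha, alpha)) / (a0 * (a0 + 1.0))


def population_cross_moment(phi_j, phi_t, alpha):
    """E[y_j y_t^T] when E[y | x] = phi @ x and y_j, y_t are
    conditionally independent given x (assumed setup)."""
    return phi_j @ dirichlet_second_moment(alpha) @ phi_t.T


# Hypothetical parameters: 3 latent components, mixed data types.
alpha = np.array([0.5, 1.0, 1.5])
phi_j = np.array([[1.0, 4.0, 2.0],
                  [3.0, 1.0, 5.0]])   # means of two count variables
phi_t = np.array([[0.5, -1.0, 2.0]])  # means of one continuous variable

rng = np.random.default_rng(0)
n = 200_000
x = rng.dirichlet(alpha, size=n)                 # latent, used only to simulate
y_j = rng.poisson(x @ phi_j.T)                   # counts with mean phi_j @ x
y_t = x @ phi_t.T + rng.normal(size=(n, 1))      # Gaussian with mean phi_t @ x

# The empirical moment uses the observations only; x is not needed.
empirical = y_j.T @ y_t / n
population = population_cross_moment(phi_j, phi_t, alpha)

print("empirical cross moment:\n", empirical)
print("population cross moment:\n", population)
```

The Poisson and Gaussian noise models never appear in `population_cross_moment`; only the mean matrices and `alpha` do. A GMM estimator would choose the parameters that minimize a weighted distance between such empirical and population moments.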
