Simplex Factor Models for Multivariate Unordered Categorical Data.

Authors

Bhattacharya Anirban, Dunson David B

Affiliation

Department of Statistical Science, Duke University, NC 27708.

Publication information

J Am Stat Assoc. 2012 Mar 1;107(497):362-377. doi: 10.1080/01621459.2011.646934.

Abstract

Gaussian latent factor models are routinely used for modeling of dependence in continuous, binary, and ordered categorical data. For unordered categorical variables, Gaussian latent factor models lead to challenging computation and complex modeling structures. As an alternative, we propose a novel class of simplex factor models. In the single-factor case, the model treats the different categorical outcomes as independent with unknown marginals. The model can characterize flexible dependence structures parsimoniously with few factors, and as factors are added, any multivariate categorical data distribution can be accurately approximated. Using a Bayesian approach for computation and inferences, a Markov chain Monte Carlo (MCMC) algorithm is proposed that scales well with increasing dimension, with the number of factors treated as unknown. We develop an efficient proposal for updating the base probability vector in hierarchical Dirichlet models. Theoretical properties are described, and we evaluate the approach through simulation examples. Applications are described for modeling dependence in nucleotide sequences and prediction from high-dimensional categorical features.
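
To make the model structure in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a mixed-membership form in which each subject has a latent vector `eta` on the probability simplex and each variable's category probabilities are a convex combination of factor-specific probability vectors `lam[j]`. The names `simulate_simplex_factor`, `eta`, `lam`, `alpha`, and the symmetric Dirichlet draws are illustrative assumptions that do not appear in the abstract.

```python
import numpy as np

def simulate_simplex_factor(n, levels, k, alpha=1.0, seed=0):
    """Simulate n rows of unordered categorical variables (hypothetical helper).

    Assumed form (not stated in the abstract):
      eta_i ~ Dirichlet(alpha, ..., alpha) on the (k-1)-simplex,
      P(y_ij = c | eta_i) = sum_h eta_ih * lam[j][h, c].
    levels[j] is the number of categories of variable j.
    """
    rng = np.random.default_rng(seed)
    # One (k x d_j) matrix per variable; each row is a probability vector.
    lam = [rng.dirichlet(np.ones(d), size=k) for d in levels]
    # Subject-level latent factors; each row lies on the simplex.
    eta = rng.dirichlet(np.full(k, alpha), size=n)
    y = np.empty((n, len(levels)), dtype=int)
    for j, d in enumerate(levels):
        probs = eta @ lam[j]  # (n x d_j) category probabilities per subject
        for i in range(n):
            y[i, j] = rng.choice(d, p=probs[i])
    return y, eta, lam

# Example: 200 sequences of 10 nucleotide positions (4 categories each), 3 factors.
y, eta, lam = simulate_simplex_factor(n=200, levels=[4] * 10, k=3)
```

Under this assumed form, the variables are conditionally independent given `eta`, and dependence across variables arises only through the shared latent vector, which is why a small `k` can still induce flexible joint structure.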

Similar articles

1
Simplex Factor Models for Multivariate Unordered Categorical Data.
J Am Stat Assoc. 2012 Mar 1;107(497):362-377. doi: 10.1080/01621459.2011.646934.
2
Nonparametric Bayes Modeling of Multivariate Categorical Data.
J Am Stat Assoc. 2012 Jan 1;104(487):1042-1051. doi: 10.1198/jasa.2009.tm08439.
3
Bayesian Conditional Tensor Factorizations for High-Dimensional Classification.
J Am Stat Assoc. 2016;111(514):656-669. doi: 10.1080/01621459.2015.1029129. Epub 2016 Aug 18.
4
Tensor Decompositions and Sparse Log-Linear Models.
Ann Stat. 2017;45(1):1-38. doi: 10.1214/15-AOS1414. Epub 2017 Feb 21.
6
Bayesian Gaussian Copula Factor Models for Mixed Data.
J Am Stat Assoc. 2013 Jun 1;108(502):656-665. doi: 10.1080/01621459.2012.762328.
7
Bayesian dynamic modeling of latent trait distributions.
Biostatistics. 2006 Oct;7(4):551-68. doi: 10.1093/biostatistics/kxj025. Epub 2006 Feb 17.

Cited by

2
Optimal High-order Tensor SVD via Tensor-Train Orthogonal Iteration.
IEEE Trans Inf Theory. 2022 Jun;68(6):3991-4019. doi: 10.1109/tit.2022.3152733. Epub 2022 Feb 18.
3
Fast Moment Estimation for Generalized Latent Dirichlet Models.
J Am Stat Assoc. 2018;113(524):1528-1540. doi: 10.1080/01621459.2017.1341839. Epub 2018 Nov 13.
6
Bayesian Conditional Tensor Factorizations for High-Dimensional Classification.
J Am Stat Assoc. 2016;111(514):656-669. doi: 10.1080/01621459.2015.1029129. Epub 2016 Aug 18.
7
Bayesian factorizations of big sparse tensors.
J Am Stat Assoc. 2015;110(512):1562-1576. doi: 10.1080/01621459.2014.983233. Epub 2016 Jan 15.
