Nonparametric Bayes Modeling of Multivariate Categorical Data.

Authors

Dunson David B, Xing Chuanhua

Affiliation

Department of Statistical Science, Duke University, Durham, NC 27705.

Publication information

J Am Stat Assoc. 2012 Jan 1;104(487):1042-1051. doi: 10.1198/jasa.2009.tm08439.

Abstract

Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
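
The mixture form named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a truncated stick-breaking representation of the Dirichlet process (truncation level `H`), symmetric Dirichlet priors with parameter `a` on each component's category probabilities, and hypothetical function names chosen for illustration. It draws data from a mixture of product multinomials and computes the joint probability tensor the mixture implies.

```python
import numpy as np


def stick_breaking(alpha, H, rng):
    """Truncated stick-breaking weights for a Dirichlet process (assumed truncation H)."""
    v = rng.beta(1.0, alpha, size=H)
    v[-1] = 1.0  # close the stick so the H weights sum to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_dp_product_multinomial(n, levels, alpha=1.0, H=20, a=1.0, seed=0):
    """Draw n observations of len(levels) nominal variables from the mixture.

    levels[j] is the number of categories of variable j. Within a mixture
    component the variables are independent; dependence arises only from
    averaging over components.
    """
    rng = np.random.default_rng(seed)
    nu = stick_breaking(alpha, H, rng)
    # psi[j][h] is the category-probability vector of variable j in component h
    psi = [rng.dirichlet(np.full(d, a), size=H) for d in levels]
    z = rng.choice(H, size=n, p=nu)  # latent component label per observation
    x = np.empty((n, len(levels)), dtype=int)
    for j, d in enumerate(levels):
        for i in range(n):
            x[i, j] = rng.choice(d, p=psi[j][z[i]])
    return x, z, nu, psi


def joint_pmf(nu, psi):
    """Joint probability tensor: sum over h of nu[h] times the outer product of psi[j][h]."""
    levels = [p.shape[1] for p in psi]
    pi = np.zeros(levels)
    for h, w in enumerate(nu):
        comp = np.array(w)
        for p in psi:
            comp = np.multiply.outer(comp, p[h])
        pi += comp
    return pi


# Example: three motif positions, each taking one of four nucleotides (A, C, G, T)
x, z, nu, psi = sample_dp_product_multinomial(n=50, levels=[4, 4, 4])
pi = joint_pmf(nu, psi)
```

Each component treats the variables as independent, so a single component yields a rank-one probability tensor; the weighted sum over components is what allows the implied joint distribution to capture dependence between positions.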
