Camerlenghi Federico, Dunson David B, Lijoi Antonio, Prünster Igor, Rodríguez Abel
Department of Economics, Management and Statistics, University of Milano - Bicocca, Piazza dell'Ateneo Nuovo 1, 20126 Milano, Italy.
Also affiliated with Collegio Carlo Alberto, Torino and BIDSA, Bocconi University, Milano, Italy.
Bayesian Anal. 2019 Dec;14(4):1303-1356. doi: 10.1214/19-BA1169. Epub 2019 Jun 27.
Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among other tasks. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, which is inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalizing to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
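The "add, then normalize" construction described in the abstract can be illustrated with a minimal sketch. It is not the paper's sampler, and it rests on several assumptions that do not come from the abstract: each completely random measure is replaced by a finite-dimensional stand-in with independent gamma weights, atom locations are drawn from a standard normal base measure, and the nesting layer (under which group-specific measures may coincide) is left out. The function names `sample_finite_crm` and `add_and_normalize` are hypothetical.

```python
import numpy as np


def sample_finite_crm(rng, n_atoms, total_mass):
    """Finite stand-in for a gamma CRM (an assumption of this sketch).

    Returns n_atoms locations drawn from N(0, 1) and independent
    Gamma(total_mass / n_atoms, 1) weights.
    """
    weights = rng.gamma(shape=total_mass / n_atoms, scale=1.0, size=n_atoms)
    atoms = rng.normal(size=n_atoms)
    return atoms, weights


def add_and_normalize(common, specific):
    """Superpose a common and a group-specific measure, then normalize."""
    atoms = np.concatenate([common[0], specific[0]])
    weights = np.concatenate([common[1], specific[1]])
    return atoms, weights / weights.sum()


rng = np.random.default_rng(0)
n_atoms = 50

# One common measure shared by every group.
common = sample_finite_crm(rng, n_atoms, total_mass=5.0)

# Each group adds its own independent measure before normalizing.
groups = [
    add_and_normalize(common, sample_finite_crm(rng, n_atoms, total_mass=5.0))
    for _ in range(2)
]
```

In this sketch the two resulting probability measures share the atoms of the common component and differ in their group-specific atoms, which is what makes them dependent without being identical. The weights on the shared atoms are proportional across groups but not equal, because each group has its own normalizing constant.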