Filippi Sarah, Holmes Chris C, Nieto-Barajas Luis E
Department of Statistics, University of Oxford, England.
Department of Statistics, ITAM, Mexico.
Electron J Stat. 2016 Nov 16;10(2):3338-3354. doi: 10.1214/16-ejs1171.
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard, we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not computationally feasible. To address this, we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as in a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.