Wang Yixin, Degleris Anthony, Williams Alex, Linderman Scott W
Department of Statistics, University of Michigan, Ann Arbor, MI, USA.
Department of Electrical Engineering, Stanford University, Stanford, CA, USA.
J Am Stat Assoc. 2024;119(547):2382-2395. doi: 10.1080/01621459.2023.2257896. Epub 2023 Nov 9.
Neyman-Scott processes (NSPs) are point process models that generate clusters of points in time or space. They are natural models for a wide range of phenomena, ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e. clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively fewer works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.
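The doubly stochastic construction described above can be illustrated with a minimal simulation sketch. It is not the authors' implementation, and every specific choice in it is an assumption made for illustration: a one-dimensional time window, homogeneous rates, a Gaussian spread of observed points around each latent event, and the hypothetical names `sample_poisson`, `simulate_nsp`, `latent_rate`, `mean_children`, and `spread`.

```python
import random


def sample_poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_nsp(duration, latent_rate, mean_children, spread, seed=0):
    """Simulate a simplified 1-D Neyman-Scott process on [0, duration].

    Assumptions (not taken from the abstract): homogeneous latent rate,
    a fixed mean number of points per latent event, and Gaussian offsets.
    """
    rng = random.Random(seed)

    # Step 1: latent events come from a homogeneous Poisson process, so the
    # number of clusters is itself a random variable.
    num_latent = sample_poisson(rng, latent_rate * duration)
    latent_times = sorted(rng.uniform(0.0, duration) for _ in range(num_latent))

    # Step 2: each latent event generates its own Poisson-distributed number
    # of observed points, scattered around the latent event time.
    points, labels = [], []
    for k, center in enumerate(latent_times):
        for _ in range(sample_poisson(rng, mean_children)):
            # Gaussian offsets may place a point slightly outside the window.
            points.append(rng.gauss(center, spread))
            labels.append(k)
    return latent_times, points, labels


latent_times, points, labels = simulate_nsp(
    duration=100.0, latent_rate=0.05, mean_children=20.0, spread=1.0
)
print(len(latent_times), "latent events;", len(points), "observed points")
```

In this sketch `labels` records which latent event produced each point. In an inference setting those assignments are unobserved, and recovering them together with the number of latent events is the problem that the collapsed Gibbs sampler mentioned above addresses.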