Caron François, Fox Emily B
University of Oxford UK.
University of Washington Seattle USA.
J R Stat Soc Series B Stat Methodol. 2017 Nov;79(5):1295-1366. doi: 10.1111/rssb.12233. Epub 2017 Sep 23.
Statistical network modelling has focused on representing the graph as a discrete structure, namely the adjacency matrix. When assuming exchangeability of this array-which can aid in modelling, computations and theoretical analysis-the Aldous-Hoover theorem informs us that the graph is necessarily either dense or empty. We instead consider representing the graph as an exchangeable and appeal to the Kallenberg representation theorem for this object. We explore using completely random measures (CRMs) to define the exchangeable random measure, and we show how our CRM construction enables us to achieve sparse graphs while maintaining the attractive properties of exchangeability. We relate the sparsity of the graph to the Lévy measure defining the CRM. For a specific choice of CRM, our graphs can be tuned from dense to sparse on the basis of a single parameter. We present a scalable Hamiltonian Monte Carlo algorithm for posterior inference, which we use to analyse network properties in a range of real data sets, including networks with hundreds of thousands of nodes and millions of edges.
统计网络建模一直专注于将图表示为一种离散结构,即邻接矩阵。当假设这个数组具有可交换性时(这有助于建模、计算和理论分析),奥尔德斯 - 胡佛定理告诉我们,该图必然要么是密集的,要么是空的。相反,我们考虑将图表示为可交换的,并诉诸于针对此对象的卡伦伯格表示定理。我们探索使用完全随机测度(CRM)来定义可交换随机测度,并展示我们的CRM构造如何使我们能够实现稀疏图,同时保持可交换性的吸引人的性质。我们将图的稀疏性与定义CRM的列维测度联系起来。对于CRM的特定选择,我们的图可以基于单个参数从密集调整为稀疏。我们提出了一种用于后验推断的可扩展哈密顿蒙特卡罗算法,我们用它来分析一系列真实数据集中的网络属性,包括具有数十万节点和数百万条边的网络。