Suppr超能文献

稀疏图的半监督聚类:跨越信息论阈值

Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold.

作者信息

Sheng Junda, Strohmer Thomas

机构信息

Department of Mathematics, University of California, Davis, CA 95616-5270, USA.

Department of Mathematics and Center of Data Science and Artificial Intelligence Research, University of California, Davis, CA 95616-5270, USA.

出版信息

J Mach Learn. 2024;3(1):64-106. doi: 10.4208/jml.230624.

Abstract

The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data. Decades of extensive study on the problem have established many profound results, among which the phase transition at the Kesten-Stigum threshold is particularly interesting both from a mathematical and an applied standpoint. It states that no estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold. Nevertheless, if we slightly extend the horizon to the ubiquitous semi-supervised setting, such a fundamental limitation will disappear completely. We prove that with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain. Moreover, we introduce two efficient algorithms, one combinatorial and one based on optimization, to integrate label information with graph structures. Our work brings a new perspective to the stochastic model of networks and semidefinite program research.

摘要

随机块模型是一种用于对网络结构数据进行聚类和社区检测的典型随机图模型。数十年来对该问题的广泛研究已经建立了许多深刻的结果,其中在凯斯滕 - 斯蒂古姆阈值处的相变从数学和应用的角度来看都特别有趣。它表明,如果模型参数低于某个阈值,那么基于网络拓扑的任何估计器在稀疏图上的表现都不会比随机猜测好太多。然而,如果我们将视野稍微扩展到普遍存在的半监督设置,这样一个基本限制将完全消失。我们证明,在揭示任意比例的标签的情况下,检测问题在整个参数域内都是可行的。此外,我们引入了两种高效算法,一种是组合算法,另一种基于优化算法,用于将标签信息与图结构相结合。我们的工作为网络的随机模型和半定规划研究带来了新的视角。

相似文献

4
Multi-graph Fusion Graph Convolutional Networks with pseudo-label supervision.具有伪标签监督的多图融合图卷积网络
Neural Netw. 2023 Jan;158:305-317. doi: 10.1016/j.neunet.2022.11.027. Epub 2022 Nov 28.

本文引用的文献

2
Phase transitions in semidefinite relaxations.半定松弛中的相变。
Proc Natl Acad Sci U S A. 2016 Apr 19;113(16):E2218-23. doi: 10.1073/pnas.1523097113. Epub 2016 Mar 21.
3
Phase transitions in semisupervised clustering of sparse networks.稀疏网络半监督聚类中的相变
Phys Rev E Stat Nonlin Soft Matter Phys. 2014 Nov;90(5-1):052802. doi: 10.1103/PhysRevE.90.052802. Epub 2014 Nov 5.
4
5
Spectral redemption in clustering sparse networks.聚类稀疏网络中的谱救赎。
Proc Natl Acad Sci U S A. 2013 Dec 24;110(52):20935-40. doi: 10.1073/pnas.1312486110. Epub 2013 Nov 25.
6
Efficient discovery of overlapping communities in massive networks.在大规模网络中高效发现重叠社区。
Proc Natl Acad Sci U S A. 2013 Sep 3;110(36):14534-9. doi: 10.1073/pnas.1221839110. Epub 2013 Aug 15.
7
Graph spectra and the detectability of community structure in networks.图频谱与网络中社区结构的可检测性。
Phys Rev Lett. 2012 May 4;108(18):188701. doi: 10.1103/PhysRevLett.108.188701. Epub 2012 May 1.
8
Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications.模块化网络随机块模型的渐近分析及其算法应用。
Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Dec;84(6 Pt 2):066106. doi: 10.1103/PhysRevE.84.066106. Epub 2011 Dec 12.
10
Finding and evaluating community structure in networks.在网络中寻找并评估社区结构。
Phys Rev E Stat Nonlin Soft Matter Phys. 2004 Feb;69(2 Pt 2):026113. doi: 10.1103/PhysRevE.69.026113. Epub 2004 Feb 26.

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验