Community detection in networks using graph embeddings.

Author information

Tandon Aditya, Albeshri Aiiad, Thayananthan Vijey, Alhalabi Wadee, Radicchi Filippo, Fortunato Santo

Affiliations

Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, Indiana 47408, USA.

Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Kingdom of Saudi Arabia.

Publication information

Phys Rev E. 2021 Feb;103(2-1):022316. doi: 10.1103/PhysRevE.103.022316.

DOI:10.1103/PhysRevE.103.022316

PMID:33736102

Abstract

Graph embedding methods are becoming increasingly popular in the machine learning community, where they are widely used for tasks such as node classification and link prediction. Embedding graphs in geometric spaces should aid the identification of network communities as well because nodes in the same community should be projected close to each other in the geometric space, where they can be detected via standard data clustering algorithms. In this paper, we test the ability of several graph embedding techniques to detect communities on benchmark graphs. We compare their performance against that of traditional community detection algorithms. We find that the performance is comparable, if the parameters of the embedding techniques are suitably chosen. However, the optimal parameter set varies with the specific features of the benchmark graphs, like their size, whereas popular community detection algorithms do not require any parameter. So, it is not possible to indicate beforehand good parameter sets for the analysis of real networks. This finding, along with the high computational cost of embedding a network and grouping the points, suggests that, for community detection, current embedding techniques do not represent an improvement over network clustering algorithms.
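The embed-then-cluster pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or experimental setup: the planted-partition graph, the Laplacian-eigenvector embedding, the small k-means routine, the pair-agreement score, and all function names and parameter values (`planted_partition`, `spectral_embedding`, `kmeans`, `pair_agreement`, `run_demo`, 4 groups of 30 nodes, `p_in=0.5`, `p_out=0.02`) are assumptions chosen only to keep the example short and dependent on NumPy alone.

```python
import numpy as np


def planted_partition(n_per_group, n_groups, p_in, p_out, rng):
    """Return (adjacency matrix, planted labels) for a planted-partition graph."""
    labels = np.repeat(np.arange(n_groups), n_per_group)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Sample each node pair once (upper triangle), then symmetrize.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float), labels


def spectral_embedding(adj, dim):
    """Embed nodes with eigenvectors of the symmetric normalized Laplacian."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    return vecs[:, 1:dim + 1]      # drop the trivial first eigenvector


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd iterations with farthest-first initialization."""
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = points[assign == j].mean(axis=0)
    return assign


def pair_agreement(a, b):
    """Fraction of node pairs on which two partitions agree (Rand index)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return float((same_a[iu] == same_b[iu]).mean())


def run_demo(seed=0):
    """Embed a toy benchmark graph, cluster the points, score against the planted groups."""
    rng = np.random.default_rng(seed)
    adj, planted = planted_partition(30, 4, 0.5, 0.02, rng)  # assumed toy values
    points = spectral_embedding(adj, dim=3)                  # assumed: dim = groups - 1
    found = kmeans(points, k=4, rng=rng)                     # assumed: k known in advance
    return pair_agreement(planted, found)


if __name__ == "__main__":
    print(f"pair agreement with planted communities: {run_demo():.3f}")
```

In this sketch the embedding dimension and the number of clusters are set from the known number of planted groups. That information is not available for real networks, which is the parameter-selection difficulty the abstract points to.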