Max Planck Institute for Intelligent Systems, Cyber Valley, 72076, Tübingen, Germany.
Department of Methodology, London School of Economics and Political Science, London, WC2A 2AE, UK.
Sci Rep. 2020 Sep 25;10(1):15736. doi: 10.1038/s41598-020-72626-y.
Community detection in networks is commonly performed using information about interactions between nodes. Recent advances incorporate multiple types of interactions, thus generalizing standard methods to multilayer networks. Often, though, one also has access to additional information about individual nodes, in the form of attributes or covariates. A relevant question is thus how to properly incorporate this extra information into such frameworks. Here we develop a method that uses both the topology of interactions and node attributes to extract communities in multilayer networks. We propose a principled probabilistic method that does not assume any a priori correlation structure between attributes and communities but rather infers it from data. This leads to an efficient algorithmic implementation that exploits the sparsity of the dataset and can be used to perform several inference tasks; we provide an open-source implementation online. We demonstrate our method on both synthetic and real-world data and compare its performance with methods that do not use any attribute information. We find that including node information helps in predicting missing links or attributes. It also leads to more interpretable community structures and allows one to quantify the impact of the node attributes given as input.
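To make the idea of combining topology and attributes in one probabilistic objective concrete, here is a minimal sketch. It is not taken from the abstract and should not be read as the authors' exact formulation: it assumes Poisson-distributed edges driven by mixed community memberships, categorical attributes driven by the same memberships, and a hypothetical weight `gamma` that balances the two terms. All names (`joint_log_likelihood`, `U`, `W`, `beta`, `gamma`) are illustrative assumptions.

```python
import math


def joint_log_likelihood(A, X, U, W, beta, gamma, eps=1e-12):
    """Weighted sum of an edge term and an attribute term (illustrative only).

    A     : list of layers, each an n x n matrix of non-negative edge counts
    X     : n x z one-hot matrix of node attributes
    U     : n x k community memberships (each row sums to 1)
    W     : list of k x k affinity matrices, one per layer
    beta  : k x z matrix, beta[a][c] = P(attribute c | community a)
    gamma : assumed weight in [0, 1]; 0 uses edges only, 1 attributes only
    """
    n, k = len(U), len(U[0])

    # Edge term: Poisson log-likelihood, dropping the constant log(A_ij!).
    ll_edges = 0.0
    for layer, W_l in zip(A, W):
        for i in range(n):
            for j in range(n):
                rate = sum(U[i][a] * W_l[a][b] * U[j][b]
                           for a in range(k) for b in range(k))
                ll_edges += layer[i][j] * math.log(rate + eps) - rate

    # Attribute term: categorical log-likelihood of the observed attributes.
    ll_attr = 0.0
    for i in range(n):
        for c, x_ic in enumerate(X[i]):
            if x_ic:
                p = sum(U[i][a] * beta[a][c] for a in range(k))
                ll_attr += x_ic * math.log(p + eps)

    return (1 - gamma) * ll_edges + gamma * ll_attr


# Toy data: 3 nodes, 2 communities, 1 layer, 2 attribute categories.
U = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
W = [[[1.0, 0.1], [0.1, 1.0]]]
beta = [[0.9, 0.1], [0.2, 0.8]]
A = [[[0, 1, 0], [1, 0, 0], [0, 0, 0]]]
X = [[1, 0], [1, 0], [0, 1]]

score = joint_log_likelihood(A, X, U, W, beta, gamma=0.5)
```

In a sketch like this, the correlation between attributes and communities is carried entirely by `beta`, which would be estimated from data rather than fixed in advance; the toy values above are hand-picked only so that the function can be called.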