Koskin Vladimir, Kells Adam, Clayton Joe, Hartmann Alexander K, Annibale Alessia, Rosta Edina
Department of Chemistry, King's College London, SE1 1DB London, United Kingdom.
Department of Physics and Astronomy, University College London, WC1E 6BT London, United Kingdom.
J Chem Phys. 2023 Mar 14;158(10):104112. doi: 10.1063/5.0105099.
Efficiently identifying the most important communities and key transition nodes in weighted and unweighted networks is a prevalent problem in a wide range of disciplines. Here, we focus on the optimal clustering using variational kinetic parameters, linked to Markov processes defined on the underlying networks, namely, the slowest relaxation time and the Kemeny constant. We derive novel relations in terms of mean first passage times for optimizing clustering via the Kemeny constant and show that the optimal clustering boundaries have equal round-trip times to the clusters they separate. We also propose an efficient method that first projects the network nodes onto a 1D reaction coordinate and subsequently performs a variational boundary search using a parallel tempering algorithm, where the variational kinetic parameters act as an energy function to be extremized. We find that maximization of the Kemeny constant is effective in detecting communities, while the slowest relaxation time allows for detection of transition nodes. We demonstrate the validity of our method on several test systems, including synthetic networks generated from the stochastic block model and real-world networks (Santa Fe Institute collaboration network, a network of co-purchased political books, and a street network of multiple cities in Luxembourg). Our approach is compared with existing clustering algorithms based on modularity and the robust Perron cluster analysis, and the identified transition nodes are compared with different notions of node centrality.
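As a rough illustration of the two kinetic parameters named in the abstract, the following Python sketch computes the Kemeny constant and the slowest relaxation time of a random walk on a small toy network, then scans a single boundary along a 1D coordinate to find the two-cluster split that maximizes the Kemeny constant of the coarse-grained chain. Several choices here are assumptions and not taken from the paper: the convention K = sum over i >= 2 of 1/(1 - lambda_i) (some authors add 1), a unit lag time for the relaxation time, the second eigenvector of the random walk as the 1D coordinate, a simple local-equilibrium (stationary flux) lumping of the transition matrix, and an exhaustive boundary scan standing in for the parallel tempering search. All function names are hypothetical.

```python
import numpy as np

def transition_matrix(A):
    """Row-stochastic random-walk matrix P = D^{-1} A for adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    return A / A.sum(axis=1, keepdims=True)

def stationary_distribution(A):
    """For a connected undirected graph, pi_i is proportional to node strength."""
    strength = np.asarray(A, dtype=float).sum(axis=1)
    return strength / strength.sum()

def sorted_eigenvalues(P):
    """Eigenvalues of P in descending order (real for a reversible chain)."""
    return np.sort(np.linalg.eigvals(P).real)[::-1]

def kemeny_constant(P):
    """Assumed convention: K = sum_{i>=2} 1 / (1 - lambda_i)."""
    lam = sorted_eigenvalues(P)
    return float(np.sum(1.0 / (1.0 - lam[1:])))

def slowest_relaxation_time(P):
    """t_2 = -1 / ln(lambda_2), assuming unit lag time and lambda_2 > 0."""
    lam = sorted_eigenvalues(P)
    return float(-1.0 / np.log(lam[1]))

def reaction_coordinate(A):
    """Assumed 1D coordinate: second right eigenvector of P, obtained from
    the symmetric matrix D^{-1/2} A D^{-1/2} for numerical stability."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, -2] / np.sqrt(d)

def lump(P, pi, labels):
    """Local-equilibrium lumping (an assumption, not the paper's scheme):
    cluster-to-cluster stationary flux, normalized by row."""
    n_clusters = labels.max() + 1
    M = np.zeros((len(labels), n_clusters))
    M[np.arange(len(labels)), labels] = 1.0
    flux = M.T @ (pi[:, None] * P) @ M
    return flux / flux.sum(axis=1, keepdims=True)

def best_two_cluster_split(A):
    """Exhaustive scan of one boundary along the 1D coordinate, keeping the
    split whose lumped chain has the largest Kemeny constant."""
    P, pi = transition_matrix(A), stationary_distribution(A)
    order = np.argsort(reaction_coordinate(A))
    best_k, best_labels = -np.inf, None
    for cut in range(1, len(order)):
        labels = np.zeros(len(order), dtype=int)
        labels[order[cut:]] = 1
        k = kemeny_constant(lump(P, pi, labels))
        if k > best_k:
            best_k, best_labels = k, labels
    return best_k, best_labels

# Toy network: two 4-node cliques joined by a single edge between nodes 3 and 4.
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

P = transition_matrix(A)
full_kemeny = kemeny_constant(P)
t2 = slowest_relaxation_time(P)
split_kemeny, labels = best_two_cluster_split(A)
print(full_kemeny, t2, split_kemeny, labels)
```

For two clusters under this lumping, the coarse-grained Kemeny constant reduces to pi_I * pi_J / F, where F is the stationary flux across the boundary, so maximizing it favors balanced clusters separated by a weak link. More than two clusters or a stochastic search would need a multi-boundary extension of `best_two_cluster_split`.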