Wu Haixia, Song Chunyao, Ge Yao, Ge Tingjian
College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, Tianjin, China.
University of Massachusetts Lowell, Massachusetts, United States.
Data Sci Eng. 2022;7(3):253-278. doi: 10.1007/s41019-022-00188-2. Epub 2022 Jun 21.
Complex networks are widely used to model many kinds of relationships. The outbreak of COVID-19 has had a huge impact on various real-world complex networks, for example global trade networks, air transport networks, and even social networks, where the spread of the epidemic has given rise to racial equality issues. Link prediction plays an important role in complex network analysis because, by analyzing the existing network structure, it can find missing links or predict links that will arise in the future. It is therefore important to study the link prediction problem on complex networks. A variety of link prediction techniques exist, based on the topology of the network and the properties of its entities. In this work, we propose a new taxonomy that divides link prediction methods into five categories and provide a comprehensive overview of these methods. We also investigate network embedding-based methods, especially graph neural network-based methods, which have attracted increasing attention in recent years. Moreover, we analyze thirty-six datasets, divide them into seven types of networks according to their topological features, and perform comprehensive experiments on these networks. We further analyze the experimental results in detail, aiming to identify the most suitable approach for each kind of network.
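To make the idea of predicting links from existing network structure concrete, the following is a minimal sketch of one classic topology-based heuristic, common-neighbor scoring. It is illustrative only and is not taken from the paper: the function name `common_neighbor_scores` and the toy edge list are hypothetical, and the heuristic is only one simple example of a structure-based approach.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Rank unlinked node pairs by how many neighbors they share.

    `edges` is an iterable of undirected (u, v) pairs. This is a
    hypothetical helper for illustration, not the paper's method.
    """
    # Build an adjacency map: node -> set of neighboring nodes.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Score every pair of nodes that is not already linked.
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v not in neighbors[u]:
            scores[(u, v)] = len(neighbors[u] & neighbors[v])

    # Highest score first; ties broken by node-pair order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy undirected graph (assumed data, for illustration only).
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e")]
ranked = common_neighbor_scores(toy_edges)
for pair, score in ranked:
    print(pair, score)
```

In this toy graph the pair `("a", "d")` ranks first because the two nodes share two neighbors, so a common-neighbor predictor would treat it as the most likely missing or future link.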