Grover Aditya, Leskovec Jure
Stanford University.
KDD. 2016 Aug;2016:855-864. doi: 10.1145/2939672.2939754.
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
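The biased random walk mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the walk's parameters; the return parameter `p` and in-out parameter `q` used here follow the full paper, while the function name `biased_walk`, the adjacency-dict representation, the toy graph, and the restriction to an unweighted, undirected graph are assumptions made for illustration only.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one second-order biased random walk (illustrative sketch).

    adj    : dict mapping each node to the set of its neighbors
             (assumed unweighted and undirected)
    p, q   : return and in-out parameters that bias the next step
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducible sampling
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move outward, distance 2 from prev
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph, used only to exercise the function.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
walks = {
    node: biased_walk(graph, node, 6, p=0.5, q=2.0, rng=random.Random(0))
    for node in sorted(graph)
}
```

Under these assumptions, a small `p` makes the walk more likely to revisit the node it just left, and a large `q` keeps it close to the starting region; the reverse settings push it outward. The sampled walks would then serve as the "neighborhoods" whose likelihood the learned node features are trained to preserve.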