IEEE Trans Neural Netw Learn Syst. 2012 Nov;23(11):1793-804. doi: 10.1109/TNNLS.2012.2215337.
Predicting new links in a network is a problem of interest in many application domains. Most prediction methods use information about the network's entities, such as nodes, to build a model of links. Network structure itself is usually not exploited, except in networks with similarity or relatedness semantics. In this paper, we use network structure for link prediction on a more general class of networks via latent feature models. The difficulty with these models is the computational cost of training them directly on large data. We propose a method that solves this problem using kernels, casting link prediction as a binary classification problem. The key idea is not to infer latent features explicitly, but to represent them implicitly in the kernels, making the method scalable to large networks. In contrast to other methods for latent feature models, ours inherits all the advantages of the kernel framework: optimality, efficiency, and nonlinearity. On sparse graphs, we show that our proposed kernels are close to the ideal kernels defined directly on latent features. We apply our method to real data from protein-protein interaction and gene regulatory networks to demonstrate its merits.
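To illustrate the general recipe of casting link prediction as binary classification with a kernel on node pairs, here is a minimal sketch. The toy graph, the pair feature, and the use of a common-neighbor-style inner product are all illustrative assumptions for exposition; they are not the kernels proposed in the paper, which are constructed to approximate kernels on latent features.

```python
import numpy as np
from sklearn.svm import SVC

# Toy adjacency matrix of a small undirected graph (assumed example,
# not data from the paper).
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
])

def pair_feature(i, j):
    # A simple stand-in for an implicit latent representation of the pair
    # (i, j): the element-wise product of neighborhood indicator vectors.
    # The inner product of two such pair features counts shared common
    # neighbors, so the linear kernel below acts as a structural kernel
    # on node pairs (a hypothetical choice, not the paper's kernel).
    return A[i] * A[j]

# Training pairs: existing edges as positives, sampled non-edges as negatives.
pos = [(0, 1), (1, 2), (2, 4), (1, 3)]
neg = [(0, 3), (0, 4), (2, 3)]
X = np.array([pair_feature(i, j) for i, j in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))

# A kernel SVM then turns link prediction into binary classification:
# no latent features are ever computed explicitly.
clf = SVC(kernel="linear").fit(X, y)
score = clf.decision_function([pair_feature(3, 4)])  # candidate-link score
```

The scalability argument in the paper rests on the kernel being computable directly from the observed graph, so the (potentially expensive) inference of latent features is bypassed entirely.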