Wang Jingyan, Li Yongping
Shanghai Institute of Applied Physics, Chinese Academy of Sciences, 2019 Jialuo Road, Jiading District, Shanghai 201800, PR China.
J Bioinform Comput Biol. 2011 Dec;9(6):663-79. doi: 10.1142/s0219720011005550.
Predicting protein function is one of the most challenging problems of the post-genomic era. The development of experimental methods for genome-scale analysis of molecular interaction networks has provided new approaches to inferring protein function. In this paper we introduce a new graph-based semi-supervised classification algorithm, Sequential Linear Neighborhood Propagation (SLNP), which addresses the classification of partially labeled protein interaction networks. SLNP first constructs a sequence of node sets according to their shortest distance to the labeled nodes, and then predicts the function of the unlabeled proteins set by set, starting with the set closest to the labeled nodes, using Linear Neighborhood Propagation. Its performance is assessed on Saccharomyces cerevisiae PPI network data sets, where it compares favorably with three current state-of-the-art algorithms, especially in settings where only a small fraction of the proteins are labeled.
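The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the propagation step here is an assumed stand-in that uses row-normalized adjacency weights in the update F <- alpha * W F + (1 - alpha) * Y, rather than the reconstruction weights of Linear Neighborhood Propagation. The function names (`distance_layers`, `propagate`, `slnp_sketch`), the parameter `alpha`, and the iteration count are all hypothetical.

```python
from collections import deque

import numpy as np


def distance_layers(adj, labeled):
    """Group unlabeled nodes by hop distance to the nearest labeled node (BFS)."""
    n = len(adj)
    dist = [-1] * n
    queue = deque()
    for v in labeled:
        dist[v] = 0
        queue.append(v)
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(adj[u]):
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    layers = {}
    for v, d in enumerate(dist):
        if d > 0:  # skips seeds (d == 0) and unreachable nodes (d == -1)
            layers.setdefault(d, []).append(v)
    return [layers[d] for d in sorted(layers)]


def propagate(adj, labels, alpha=0.9, iters=200):
    """Assumed stand-in for LNP: iterate F <- alpha * W F + (1 - alpha) * Y."""
    deg = adj.sum(axis=1, keepdims=True)
    W = adj / np.maximum(deg, 1)  # row-normalized adjacency (assumption)
    classes = sorted(set(labels.values()))
    Y = np.zeros((adj.shape[0], len(classes)))
    for v, c in labels.items():
        Y[v, classes.index(c)] = 1.0
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (W @ F) + (1 - alpha) * Y
    return F, classes


def slnp_sketch(adj, seed_labels, alpha=0.9):
    """Label node sets in order of increasing distance from the seed labels."""
    adj = np.asarray(adj, dtype=float)
    labels = dict(seed_labels)
    for layer in distance_layers(adj, list(seed_labels)):
        # Nodes labeled in earlier rounds act as labeled nodes in this round.
        F, classes = propagate(adj, labels, alpha)
        for v in layer:
            labels[v] = classes[int(np.argmax(F[v]))]
    return labels


if __name__ == "__main__":
    # Toy network: a six-node path with the two end nodes labeled.
    path = np.zeros((6, 6))
    for i in range(5):
        path[i, i + 1] = path[i + 1, i] = 1
    print(sorted(slnp_sketch(path, {0: "A", 5: "B"}).items()))
```

On the toy path, each half of the chain receives the label of its nearer end. Nodes that cannot be reached from any labeled node are left unlabeled in this sketch.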