Piovesan Damiano, Giollo Manuel, Ferrari Carlo, Tosatto Silvio C E
Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, 35131, Padua, Italy.
Department of Information Engineering, University of Padua, Via Gradenigo 6, 35121, Padua, Italy.
Amino Acids. 2015 Dec;47(12):2583-92. doi: 10.1007/s00726-015-2049-3. Epub 2015 Jul 28.
Protein function prediction from sequence using the Gene Ontology (GO) classification is useful in many biological problems. It has recently attracted increasing interest, thanks in part to the Critical Assessment of Function Annotation (CAFA) challenge. In this paper, we introduce Guilty by Association on STRING (GAS), a tool that predicts protein function from protein-protein interaction (PPI) networks without relying on sequence similarity. The underlying assumption is that a protein takes part in the same biological process, and is located in the same cellular compartment, as the proteins it interacts with. GAS retrieves the interaction partners of a query protein from the STRING database and measures the enrichment of their functional annotations to generate a sorted list of putative functions. A performance evaluation based on CAFA metrics and a fair comparison with optimized BLAST similarity searches are provided. The consensus of GAS and BLAST is shown to improve overall performance. The PPI approach is shown to outperform similarity searches for biological process and cellular compartment GO predictions. An analysis of best practices for exploiting protein-protein interaction networks is also provided.
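The guilty-by-association idea summarized in the abstract (collect the interaction partners of a query protein, test which GO terms are enriched among them, and return the terms in sorted order) can be illustrated with a minimal sketch. The abstract does not state which enrichment statistic GAS uses, so the one-sided hypergeometric test below is an assumption, not the authors' method. The function names, the toy network, and the `GO:toy*` labels are hypothetical, and no STRING query is performed: the network is a hard-coded dictionary.

```python
from math import comb


def hypergeom_tail(k, population, carriers, draws):
    """P(X >= k): chance of seeing at least k term carriers among `draws`
    proteins sampled without replacement from `population` proteins,
    `carriers` of which are annotated with the term."""
    upper = min(carriers, draws)
    total = comb(population, draws)
    hits = sum(
        comb(carriers, i) * comb(population - carriers, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def rank_terms(query, network, annotations):
    """Return (term, p_value) pairs for the query, most enriched first.

    network: protein -> set of interaction partners (hypothetical layout).
    annotations: protein -> set of GO term labels (hypothetical layout).
    """
    # Only partners with at least one annotation can contribute evidence.
    partners = [p for p in network.get(query, set()) if p in annotations]

    # Background frequency of every term over all annotated proteins.
    background = {}
    for terms in annotations.values():
        for term in terms:
            background[term] = background.get(term, 0) + 1

    # Frequency of every term among the query's partners.
    observed = {}
    for partner in partners:
        for term in annotations[partner]:
            observed[term] = observed.get(term, 0) + 1

    scored = [
        (term, hypergeom_tail(k, len(annotations), background[term], len(partners)))
        for term, k in observed.items()
    ]
    # Smaller p-value means stronger enrichment; ties break on the term label.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Toy data, invented for illustration only.
TOY_NETWORK = {"Q": {"A", "B", "C"}}
TOY_ANNOTATIONS = {
    "A": {"GO:toy1", "GO:toy2"},
    "B": {"GO:toy1"},
    "C": {"GO:toy1", "GO:toy3"},
    "D": {"GO:toy2"},
    "E": {"GO:toy3"},
    "F": {"GO:toy2"},
}

for term, p_value in rank_terms("Q", TOY_NETWORK, TOY_ANNOTATIONS):
    print(f"{term}\t{p_value:.2f}")
```

In this toy network all three partners of `Q` carry `GO:toy1`, which only three of the six annotated proteins have, so that term ranks first. A real pipeline would also need to weight partners by STRING confidence score and propagate terms up the GO hierarchy, neither of which is attempted here.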