AptRank: an adaptive PageRank model for protein function prediction on bi-relational graphs.

Affiliations

Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.

Department of Mathematics, Purdue University, West Lafayette, IN, USA.

Publication information

Bioinformatics. 2017 Jun 15;33(12):1829-1836. doi: 10.1093/bioinformatics/btx029.

Abstract

MOTIVATION

Diffusion-based network models are widely used for protein function prediction using protein network data and have been shown to outperform neighborhood-based and module-based methods. Recent studies have shown that integrating the hierarchical structure of the Gene Ontology (GO) data dramatically improves prediction accuracy. However, previous methods usually either used the GO hierarchy to refine the prediction results of multiple classifiers, or flattened the hierarchy into a function-function similarity kernel. No study has taken the GO hierarchy into account together with the protein network as a two-layer network model.

RESULTS

We first construct a bi-relational graph (Birg) model comprising both protein-protein association and function-function hierarchical networks. We then propose two diffusion-based methods, BirgRank and AptRank, both of which use PageRank to diffuse information on this two-layer graph model. BirgRank is a direct application of traditional PageRank with fixed decay parameters. In contrast, AptRank uses an adaptive diffusion mechanism to improve on the performance of BirgRank. We evaluate the ability of both methods to predict protein function on yeast, fly and human protein datasets, and compare them with four previous methods: GeneMANIA, TMC, ProteinRank and clusDCA. We design four validation strategies (missing function prediction, de novo function prediction, guided function prediction and newly discovered function prediction) to comprehensively evaluate the predictive performance of all six methods. We find that both BirgRank and AptRank outperform the previous methods, especially in missing function prediction when using only 10% of the data for training.
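
To make the idea of diffusing information over a two-layer graph concrete, here is a minimal Python sketch of personalized PageRank with a fixed decay parameter on a toy graph that stacks a protein layer on top of a function layer. It is an illustration only, not the authors' MATLAB implementation: the abstract does not give the exact matrix construction, so the edge directions, the handling of dangling nodes, the value `alpha = 0.85`, the toy data, and all function names below are assumptions.

```python
def build_two_layer_graph(ppi_edges, annotations, go_edges, n_proteins, n_functions):
    """Adjacency over proteins (indices 0..n_proteins-1) followed by functions.

    adj[target][source] = 1 means score can flow from source to target.
    Edge directions here are an assumption made for illustration.
    """
    size = n_proteins + n_functions
    adj = [[0.0] * size for _ in range(size)]
    for a, b in ppi_edges:              # protein-protein association, undirected
        adj[a][b] = adj[b][a] = 1.0
    for p, f in annotations:            # known protein-function annotation, both ways
        adj[n_proteins + f][p] = 1.0
        adj[p][n_proteins + f] = 1.0
    for child, parent in go_edges:      # hierarchy: child term -> parent term
        adj[n_proteins + parent][n_proteins + child] = 1.0
    return adj


def column_normalize(matrix):
    """Scale each column to sum to 1; all-zero (dangling) columns stay zero."""
    n = len(matrix)
    sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [[matrix[r][c] / sums[c] if sums[c] else 0.0 for c in range(n)]
            for r in range(n)]


def personalized_pagerank(transition, seed_index, alpha=0.85, tol=1e-12, max_iter=10000):
    """Power iteration for x = alpha * T x + (1 - alpha) * seed.

    Mass that lands on dangling nodes is returned to the seed so that the
    score vector keeps summing to 1 (an assumed convention).
    """
    n = len(transition)
    seed = [0.0] * n
    seed[seed_index] = 1.0
    x = seed[:]
    for _ in range(max_iter):
        walked = [sum(transition[r][c] * x[c] for c in range(n)) for r in range(n)]
        lost = 1.0 - sum(walked)        # mass that sat on dangling columns
        new = [alpha * walked[r] + (1.0 - alpha + alpha * lost) * seed[r]
               for r in range(n)]
        if sum(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


# Toy data (hypothetical): 3 proteins in a chain, 3 GO-like terms with term 0 as root.
N_PROTEINS, N_FUNCTIONS = 3, 3
ppi_edges = [(0, 1), (1, 2)]
annotations = [(0, 1), (2, 2)]          # protein 0 -> term 1, protein 2 -> term 2
go_edges = [(1, 0), (2, 0)]             # terms 1 and 2 are children of term 0

transition = column_normalize(
    build_two_layer_graph(ppi_edges, annotations, go_edges, N_PROTEINS, N_FUNCTIONS))

# Diffuse from the unannotated protein 1 and read off the function-layer scores.
scores = personalized_pagerank(transition, seed_index=1)
function_scores = scores[N_PROTEINS:]
print(function_scores)
```

In this sketch the function-layer entries of the converged vector act as a ranking of candidate terms for the seed protein. An adaptive scheme in the spirit of AptRank would replace the single fixed `alpha` with weights chosen from the data; the abstract does not specify how, so that part is left out.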

AVAILABILITY AND IMPLEMENTATION

The MATLAB code is available at https://github.rcac.purdue.edu/mgribsko/aptrank .

CONTACT

gribskov@purdue.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

