Prediction of protein functions with gene ontology and interspecies protein homology data.

Affiliation

Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 715 Broadway, 10th floor, New York, NY 10003, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):775-84. doi: 10.1109/TCBB.2010.15.

Abstract

Accurate computational prediction of protein functions increasingly relies on network-inspired models for the protein function transfer. This task can become challenging for proteins isolated in their own network or those with poor or uncharacterized neighborhoods. Here, we present a novel probabilistic chain-graph-based approach for predicting protein functions that builds on connecting networks of two (or more) different species by links of high interspecies sequence homology. In this way, proteins are able to "exchange" functional information with their neighbors-homologs from a different species. The knowledge of interspecies relationships, such as the sequence homology, can become crucial in cases of limited information from other sources of data, including the protein-protein interactions or cellular locations of proteins. We further enhance our model to account for the Gene Ontology dependencies by linking multiple but related functional ontology categories within and across multiple species. The resulting networks are of significantly higher complexity than most traditional protein network models. We comprehensively benchmark our method by applying it to two largest protein networks, the Yeast and the Fly. The joint Fly-Yeast network provides substantial improvements in precision, accuracy, and false positive rate over networks that consider either of the sources in isolation. At the same time, the new model retains the computational efficiency similar to that of the simpler networks.
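The idea of letting proteins "exchange" functional information across species can be illustrated with a much simpler stand-in than the chain-graph model the abstract describes. The sketch below is not the authors' method: it uses plain neighbor averaging on a merged network, and every protein name, edge, and function name in it (`Y1`, `F1`, `build_joint_network`, `propagate`) is a hypothetical chosen for illustration. It only shows why a homology link can help a protein whose own-species neighborhood is uncharacterized.

```python
from collections import defaultdict


def build_joint_network(ppi_edges, homology_edges):
    """Merge within-species interaction edges and cross-species homology
    edges into one undirected adjacency map (hypothetical helper)."""
    adj = defaultdict(set)
    for a, b in list(ppi_edges) + list(homology_edges):
        adj[a].add(b)
        adj[b].add(a)
    return adj


def propagate(adj, known, rounds=10):
    """Simple neighbor averaging, NOT the paper's chain-graph inference.

    Annotated proteins stay clamped to their known score (1.0 = has the
    function, 0.0 = does not); every other protein repeatedly takes the
    mean score of its neighbors. 0.5 means "no information".
    """
    scores = {p: 0.5 for p in adj}
    scores.update(known)
    for _ in range(rounds):
        updated = dict(scores)
        for p, nbrs in adj.items():
            if p not in known:
                updated[p] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Toy data (all hypothetical): fly proteins F1 and F2 interact but neither
# is annotated; yeast protein Y1 is annotated and is a homolog of F1.
fly_ppi = [("F1", "F2")]
yeast_ppi = [("Y1", "Y2")]
homology = [("F1", "Y1")]
known = {"Y1": 1.0}

fly_only = propagate(build_joint_network(fly_ppi, []), known)
joint = propagate(build_joint_network(fly_ppi + yeast_ppi, homology), known)

print("fly-only score for F1:", fly_only["F1"])
print("joint score for F1:", round(joint["F1"], 3))
```

In the fly-only network, F1 has no annotated neighbor, so its score never leaves the uninformative 0.5. Once the homology edge to Y1 is added, F1's score rises above 0.5, and F2 benefits indirectly through F1. The published model replaces this averaging with probabilistic inference and additionally links related Gene Ontology categories, which this sketch does not attempt.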
