Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP.

Affiliation

Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA.

Publication information

BMC Bioinformatics. 2010 May 19;11:265. doi: 10.1186/1471-2105-11-265.

DOI: 10.1186/1471-2105-11-265

PMID: 20482861

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2882935/

Abstract

BACKGROUND

A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance.

RESULTS

Here we have applied high confidence function predictions from our automated prediction system, PFP, to three genomes: Escherichia coli, Saccharomyces cerevisiae, and Plasmodium falciparum (malaria). PFP increases the proportion of annotated genes to over 90% in all three genomes. Using this large annotation coverage, we introduce functional similarity networks, which represent the functional space of the proteomes. Four functional similarity networks are constructed for each proteome: three based on similarity within a single Gene Ontology (GO) category (Biological Process, Cellular Component, or Molecular Function), and one based on overall similarity as measured by the funSim score. The functional similarity networks are shown to have higher modularity than the protein-protein interaction network. Moreover, the funSim score network differs from the single GO-score networks in showing a higher clustering degree exponent and thus a stronger tendency to be hierarchical. In addition, examining function assignments in the protein-protein interaction network and in local regions of the genomes identified numerous cases where subnetworks or local regions contain functionally coherent proteins. These results will help in interpreting protein interactions and gene order in a genome. Several examples of both analyses are highlighted.
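The network construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not give the funSim formula or an edge threshold, so the mean-of-squares combination in `fun_sim`, the threshold of 0.5, and the protein names and scores are all assumptions made for the example.

```python
def fun_sim(bp, cc, mf):
    """Combine three per-category GO similarity scores (each in [0, 1])
    into one overall score. ASSUMPTION: mean of the squared scores;
    the abstract does not state the exact formula."""
    return (bp ** 2 + cc ** 2 + mf ** 2) / 3.0


def build_network(scores, threshold):
    """Return an adjacency dict that links every protein pair whose
    similarity score is at least `threshold`."""
    adjacency = {}
    for (a, b), score in scores.items():
        # Register both proteins even if they end up with no edges.
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical (BP, CC, MF) similarity scores for three protein pairs.
category_scores = {
    ("P1", "P2"): (0.9, 0.8, 0.7),
    ("P1", "P3"): (0.2, 0.5, 0.1),
    ("P2", "P3"): (0.6, 0.9, 0.9),
}

# One overall-similarity network; a single-category network would be
# built the same way from one column of the scores.
overall = {pair: fun_sim(*s) for pair, s in category_scores.items()}
network = build_network(overall, threshold=0.5)
```

Building the three single-category networks and the overall network from the same pairwise table keeps them directly comparable, since only the score fed into `build_network` changes.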

CONCLUSION

The analyses demonstrate that applying high confidence predictions from PFP can have a significant impact on researchers' ability to interpret the immense volume of biological data being generated today. The newly introduced functional similarity networks of the three organisms show different network properties from those of the protein-protein interaction networks.
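One of the network properties compared in this study, the clustering degree exponent, can be estimated as sketched below. The sketch assumes the common definition in which the average clustering coefficient of nodes with degree k scales as C(k) ∝ k^(-β), and it estimates β by an ordinary least-squares fit on log-log axes. The abstract does not specify the fitting procedure, so the function names, the fitting method, and the toy graph are illustrative assumptions.

```python
import math
from collections import defaultdict


def local_clustering(adjacency, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1 for u in neighbors for v in neighbors
        if u < v and v in adjacency[u]
    )
    return 2.0 * links / (k * (k - 1))


def clustering_degree_exponent(adjacency):
    """Estimate beta in C(k) ~ k^(-beta) by least squares on log-log axes."""
    by_degree = defaultdict(list)
    for node, neighbors in adjacency.items():
        if len(neighbors) >= 2:
            by_degree[len(neighbors)].append(local_clustering(adjacency, node))
    # Keep only degrees whose mean clustering is positive (log is defined).
    points = [
        (math.log(k), math.log(sum(cs) / len(cs)))
        for k, cs in by_degree.items() if sum(cs) > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two degrees with non-zero clustering")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in points)
        / sum((x - mean_x) ** 2 for x, _ in points)
    )
    return -slope


# Toy graph: hub H joined to two triangles (H-A-B and H-C-D).
toy = {
    "H": {"A", "B", "C", "D"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H", "D"},
    "D": {"H", "C"},
}
beta = clustering_degree_exponent(toy)
```

In the toy graph the low-degree nodes sit in fully connected triangles while the hub has sparsely linked neighbours, so clustering falls with degree and β is positive, which is the signature usually read as hierarchical organisation.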

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/863c/2882935/b6ac26a5d6e8/1471-2105-11-265-1.jpg

Similar articles

PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

Proteins. 2009 Feb 15;74(3):566-82. doi: 10.1002/prot.22172.

AVID: an integrative framework for discovering functional relationships among proteins.

BMC Bioinformatics. 2005 Jun 1;6:136. doi: 10.1186/1471-2105-6-136.

Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks.

BMC Bioinformatics. 2007 Jul 10;8:243. doi: 10.1186/1471-2105-8-243.

Enhanced automated function prediction using distantly related sequences and contextual association by PFP.

Protein Sci. 2006 Jun;15(6):1550-6. doi: 10.1110/ps.062153506. Epub 2006 May 2.

Functional annotation of hierarchical modularity.

PLoS One. 2012;7(4):e33744. doi: 10.1371/journal.pone.0033744. Epub 2012 Apr 4.

An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology.

BMC Bioinformatics. 2010 Nov 15;11:562. doi: 10.1186/1471-2105-11-562.

Using PFP and ESG Protein Function Prediction Web Servers.

Methods Mol Biol. 2017;1611:1-14. doi: 10.1007/978-1-4939-7015-5_1.

Modular biological function is most effectively captured by combining molecular interaction data types.

PLoS One. 2013 May 3;8(5):e62670. doi: 10.1371/journal.pone.0062670. Print 2013.

Topology aware functional similarity of protein interaction networks based on gene ontology.

Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:6857-60. doi: 10.1109/IEMBS.2011.6091691.

Cited by

Computational identification of protein-protein interactions in model plant proteomes.

Sci Rep. 2019 Jun 19;9(1):8740. doi: 10.1038/s41598-019-45072-8.

Cytotoxicity and Transcriptomic Analysis of Silver Nanoparticles in Mouse Embryonic Fibroblast Cells.

Int J Mol Sci. 2018 Nov 16;19(11):3618. doi: 10.3390/ijms19113618.

DextMP: deep dive into text for predicting moonlighting proteins.

Bioinformatics. 2017 Jul 15;33(14):i83-i91. doi: 10.1093/bioinformatics/btx231.

Semantic particularity measure for functional characterization of gene sets using gene ontology.

PLoS One. 2014 Jan 28;9(1):e86525. doi: 10.1371/journal.pone.0086525. eCollection 2014.

Revisiting the variation of clustering coefficient of biological networks suggests new modular structure.

BMC Syst Biol. 2012 May 1;6:34. doi: 10.1186/1752-0509-6-34.

Structure- and sequence-based function prediction for non-homologous proteins.

J Struct Funct Genomics. 2012 Jun;13(2):111-23. doi: 10.1007/s10969-012-9126-6. Epub 2012 Jan 22.

An iterative network partition algorithm for accurate identification of dense network modules.

Nucleic Acids Res. 2012 Feb;40(3):e18. doi: 10.1093/nar/gkr1103. Epub 2011 Nov 25.

Quantification of protein group coherence and pathway assignment using functional association.

BMC Bioinformatics. 2011 Sep 19;12:373. doi: 10.1186/1471-2105-12-373.

A network-based gene-weighting approach for pathway analysis.

Cell Res. 2012 Mar;22(3):565-80. doi: 10.1038/cr.2011.149. Epub 2011 Sep 6.
