Enhanced automated function prediction using distantly related sequences and contextual association by PFP.

Authors

Hawkins Troy, Luban Stanislav, Kihara Daisuke

Affiliation

Department of Biological Sciences, College of Sciences, Purdue University, West Lafayette, Indiana 47907, USA.

Publication information

Protein Sci. 2006 Jun;15(6):1550-6. doi: 10.1110/ps.062153506. Epub 2006 May 2.

DOI:10.1110/ps.062153506

PMID:16672240

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2242549/

Abstract

The impetus for the recent development and emergence of automated function prediction methods is an exponentially growing flood of new experimental data, the interpretation of which is hindered by a shortage of reliable annotations for proteins that lack experimental characterization or significant homologs in current databases. Here we introduce PFP, an automated function prediction server that provides the most probable annotations for a query sequence in each of the three branches of the Gene Ontology: biological process, molecular function, and cellular component. Rather than utilizing precise pattern matching to identify functional motifs in the sequences and structures of these proteins, we designed PFP to increase the coverage of function annotation by lowering the resolution of predictions when a detailed function is not predictable. To do this we extend a traditional PSI-BLAST search by extracting and scoring annotations (GO terms) individually, including annotations from distantly related sequences, and applying a novel data mining tool, the Function Association Matrix, to score strongly associated pairs of annotations. We show that PFP can correctly assign function using only weakly similar sequences with significantly better accuracy and coverage than a standard PSI-BLAST search, improving it more than fivefold. The most descriptive annotations predicted by PFP (GO depth ≥ 8) can identify a significant subgraph in the GO with >60% accuracy and approximately 100% coverage for our benchmark set. We also provide examples of the superb performance of PFP in an assessment of automated function prediction servers at the Automated Function Prediction Special Interest Group meeting at ISMB 2005 (AFP-SIG '05).
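
The scoring idea described in the abstract (score each GO term individually across PSI-BLAST hits, keep weak hits, and let strongly associated term pairs contribute) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the log-E-value weight, the `shift` constant that lets hits with E-values above 1 still contribute, the nested-dictionary form of the association matrix, the function name `score_go_terms`, and all numeric values below are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def score_go_terms(hits, association, shift=2.0):
    """Rank GO terms from a list of sequence hits.

    hits        -- list of (e_value, [go_term, ...]) pairs, one per hit
    association -- dict: observed term -> {associated term: P(associated | observed)}
                   (a toy stand-in for the Function Association Matrix)
    shift       -- assumed constant added to -log10(E) so that weak hits
                   (E-value up to 10**shift) keep a small positive weight
    """
    scores = defaultdict(float)
    for e_value, go_terms in hits:
        weight = -math.log10(e_value) + shift
        if weight <= 0:
            continue  # hit is too weak to contribute under this assumption
        for observed in go_terms:
            # Direct annotation of the hit counts with full weight.
            scores[observed] += weight
            # Strongly associated terms receive a share of the weight.
            for associated, prob in association.get(observed, {}).items():
                if associated != observed:
                    scores[associated] += weight * prob
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hits: one strong match and two weak ones (E-values are made up).
hits = [
    (1e-30, ["GO:0004672"]),
    (5.0, ["GO:0004672", "GO:0005524"]),
    (50.0, ["GO:0006468"]),
]
# Hypothetical association probability, not taken from the paper.
association = {"GO:0004672": {"GO:0006468": 0.8}}

ranked = score_go_terms(hits, association)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

In this toy run the two weak hits (E-values of 5 and 50) would be discarded by a conventional significance cutoff, yet here they still add a small amount of evidence, and the term associated with the top annotation is ranked even though only a weak hit carries it directly.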

Similar articles

Enhanced automated function prediction using distantly related sequences and contextual association by PFP.

Protein Sci. 2006 Jun;15(6):1550-6. doi: 10.1110/ps.062153506. Epub 2006 May 2.

PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

Proteins. 2009 Feb 15;74(3):566-82. doi: 10.1002/prot.22172.

Using PFP and ESG Protein Function Prediction Web Servers.

Methods Mol Biol. 2017;1611:1-14. doi: 10.1007/978-1-4939-7015-5_1.

PFP/ESG: automated protein function prediction servers enhanced with Gene Ontology visualization tool.

Bioinformatics. 2015 Jan 15;31(2):271-2. doi: 10.1093/bioinformatics/btu646. Epub 2014 Oct 1.

In-depth performance evaluation of PFP and ESG sequence-based function prediction methods in CAFA 2011 experiment.

BMC Bioinformatics. 2013;14 Suppl 3(Suppl 3):S2. doi: 10.1186/1471-2105-14-S3-S2. Epub 2013 Feb 28.

Phylo-PFP: improved automated protein function prediction using phylogenetic distance of distantly related sequences.

Bioinformatics. 2019 Mar 1;35(5):753-759. doi: 10.1093/bioinformatics/bty704.

A novel neural response algorithm for protein function prediction.

BMC Syst Biol. 2012;6 Suppl 1(Suppl 1):S19. doi: 10.1186/1752-0509-6-S1-S19. Epub 2012 Jul 16.

The PFP and ESG protein function prediction methods in 2014: effect of database updates and ensemble approaches.

Gigascience. 2015 Sep 14;4:43. doi: 10.1186/s13742-015-0083-4. eCollection 2015.

A categorization approach to automated ontological function annotation.

Protein Sci. 2006 Jun;15(6):1544-9. doi: 10.1110/ps.062184006. Epub 2006 May 2.

Utilizing shared interacting domain patterns and Gene Ontology information to improve protein-protein interaction prediction.

Comput Biol Med. 2010 Jun;40(6):555-64. doi: 10.1016/j.compbiomed.2010.03.009. Epub 2010 Apr 24.

Cited by

Translating a GO Term List to Human Readable Function Description Using GO2Sum.

Methods Mol Biol. 2025;2941:85-99. doi: 10.1007/978-1-0716-4623-6_5.

Proteomic analysis of unicellular cyanobacterium ATCC 51142 under extended light or dark growth.

bioRxiv. 2024 Jul 29:2024.07.29.605499. doi: 10.1101/2024.07.29.605499.

Characterization of Intracellular Localization Signals and Structural Features of Mosquito Densovirus (MDV) Viral Proteins.

bioRxiv. 2023 Dec 14:2023.12.13.571551. doi: 10.1101/2023.12.13.571551.

The field of protein function prediction as viewed by different domain scientists.

Bioinform Adv. 2022 Aug 17;2(1):vbac057. doi: 10.1093/bioadv/vbac057. eCollection 2022.

ContactPFP: Protein function prediction using predicted contact information.

Front Bioinform. 2022 Jun;2. doi: 10.3389/fbinf.2022.896295. Epub 2022 Jun 2.

Multiple Profile Models Extract Features from Protein Sequence Data and Resolve Functional Diversity of Very Different Protein Families.

Mol Biol Evol. 2022 Apr 10;39(4). doi: 10.1093/molbev/msac070.

The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction.

Bioinformatics. 2020 Jul 1;36(Suppl_1):i219-i226. doi: 10.1093/bioinformatics/btaa468.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Genome Biol. 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.

Phylo-PFP: improved automated protein function prediction using phylogenetic distance of distantly related sequences.

Bioinformatics. 2019 Mar 1;35(5):753-759. doi: 10.1093/bioinformatics/bty704.

The evolutionary signal in metagenome phyletic profiles predicts many gene functions.

Microbiome. 2018 Jul 10;6(1):129. doi: 10.1186/s40168-018-0506-4.

References

Predicting protein function from sequence and structural data.

Curr Opin Struct Biol. 2005 Jun;15(3):275-84. doi: 10.1016/j.sbi.2005.04.003.

Inference of protein function from protein structure.

Structure. 2005 Jan;13(1):121-30. doi: 10.1016/j.str.2004.10.015.

GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes.

BMC Bioinformatics. 2004 Nov 18;5:178. doi: 10.1186/1471-2105-5-178.

The Gene Ontology (GO) database and informatics resource.

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D258-61. doi: 10.1093/nar/gkh036.

GoFigure: automated Gene Ontology annotation.

Bioinformatics. 2003 Dec 12;19(18):2484-5. doi: 10.1093/bioinformatics/btg338.

Automated Gene Ontology annotation for anonymous sequence data.

Nucleic Acids Res. 2003 Jul 1;31(13):3712-5. doi: 10.1093/nar/gkg582.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Nucleic Acids Res. 1997 Sep 1;25(17):3389-402. doi: 10.1093/nar/25.17.3389.

Basic local alignment search tool.

J Mol Biol. 1990 Oct 5;215(3):403-10. doi: 10.1016/S0022-2836(05)80360-2.
