
An expanded evaluation of protein function prediction methods shows an improvement in accuracy.

Authors

Jiang Yuxiang, Oron Tal Ronnen, Clark Wyatt T, Bankapur Asma R, D'Andrea Daniel, Lepore Rosalba, Funk Christopher S, Kahanda Indika, Verspoor Karin M, Ben-Hur Asa, Koo Da Chen Emily, Penfold-Brown Duncan, Shasha Dennis, Youngs Noah, Bonneau Richard, Lin Alexandra, Sahraeian Sayed M E, Martelli Pier Luigi, Profiti Giuseppe, Casadio Rita, Cao Renzhi, Zhong Zhaolong, Cheng Jianlin, Altenhoff Adrian, Skunca Nives, Dessimoz Christophe, Dogan Tunca, Hakala Kai, Kaewphan Suwisa, Mehryary Farrokh, Salakoski Tapio, Ginter Filip, Fang Hai, Smithers Ben, Oates Matt, Gough Julian, Törönen Petri, Koskinen Patrik, Holm Liisa, Chen Ching-Tai, Hsu Wen-Lian, Bryson Kevin, Cozzetto Domenico, Minneci Federico, Jones David T, Chapman Samuel, Bkc Dukka, Khan Ishita K, Kihara Daisuke, Ofer Dan, Rappoport Nadav, Stern Amos, Cibrian-Uhalte Elena, Denny Paul, Foulger Rebecca E, Hieta Reija, Legge Duncan, Lovering Ruth C, Magrane Michele, Melidoni Anna N, Mutowo-Meullenet Prudence, Pichler Klemens, Shypitsyna Aleksandra, Li Biao, Zakeri Pooya, ElShal Sarah, Tranchevent Léon-Charles, Das Sayoni, Dawson Natalie L, Lee David, Lees Jonathan G, Sillitoe Ian, Bhat Prajwal, Nepusz Tamás, Romero Alfonso E, Sasidharan Rajkumar, Yang Haixuan, Paccanaro Alberto, Gillis Jesse, Sedeño-Cortés Adriana E, Pavlidis Paul, Feng Shou, Cejuela Juan M, Goldberg Tatyana, Hamp Tobias, Richter Lothar, Salamov Asaf, Gabaldon Toni, Marcet-Houben Marina, Supek Fran, Gong Qingtian, Ning Wei, Zhou Yuanpeng, Tian Weidong, Falda Marco, Fontana Paolo, Lavezzo Enrico, Toppo Stefano, Ferrari Carlo, Giollo Manuel, Piovesan Damiano, Tosatto Silvio C E, Del Pozo Angela, Fernández José M, Maietta Paolo, Valencia Alfonso, Tress Michael L, Benso Alfredo, Di Carlo Stefano, Politano Gianfranco, Savino Alessandro, Rehman Hafeez Ur, Re Matteo, Mesiti Marco, Valentini Giorgio, Bargsten Joachim W, van Dijk Aalt D J, Gemovic Branislava, Glisic Sanja, Perovic Vladmir, Veljkovic Veljko, Veljkovic Nevena, Almeida-E-Silva Danillo C, Vencio Ricardo Z N, Sharan Malvika, Vogel Jörg, Kansakar Lakesh, Zhang Shanshan, Vucetic Slobodan, Wang Zheng, Sternberg Michael J E, Wass Mark N, Huntley Rachael P, Martin Maria J, O'Donovan Claire, Robinson Peter N, Moreau Yves, Tramontano Anna, Babbitt Patricia C, Brenner Steven E, Linial Michal, Orengo Christine A, Rost Burkhard, Greene Casey S, Mooney Sean D, Friedberg Iddo, Radivojac Predrag

Affiliations

Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA.

Buck Institute for Research on Aging, Novato, CA, USA.

Publication information

Genome Biol. 2016 Sep 7;17(1):184. doi: 10.1186/s13059-016-1037-6.

DOI: 10.1186/s13059-016-1037-6
PMID: 27604469
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5015320/

Abstract

BACKGROUND

A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

RESULTS

We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

CONCLUSIONS

The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/253abfc0d42c/13059_2016_1037_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/06a04972f3c0/13059_2016_1037_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/6b6f50913958/13059_2016_1037_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/e03c3d8e33b7/13059_2016_1037_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/4322a82f57fa/13059_2016_1037_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/8cb93b48e3dc/13059_2016_1037_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/f99f0ded4992/13059_2016_1037_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/3a0852fe1d43/13059_2016_1037_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/29bb1b8214f5/13059_2016_1037_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/b78c9282e77c/13059_2016_1037_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f02/5015320/6cfba90a424b/13059_2016_1037_Fig11_HTML.jpg
