
FFPred 2.0: improved homology-independent prediction of gene ontology terms for eukaryotic protein sequences.

Affiliation

Bioinformatics Group, Department of Computer Science, University College London, London, United Kingdom.

Publication information

PLoS One. 2013 May 22;8(5):e63754. doi: 10.1371/journal.pone.0063754. Print 2013.

DOI:10.1371/journal.pone.0063754
PMID:23717476
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3661659/

Abstract

To understand fully cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet, functional information - either experimentally validated or computationally inferred by similarity - remains completely missing for approximately 30% of human proteins. FFPred was initially developed to bridge this gap by targeting sequences with distant or no homologues of known function and by exploiting clear patterns of intrinsic disorder associated with particular molecular activities and biological processes. Here, we present an updated and improved version, which builds on larger datasets of protein sequences and annotations, and uses updated component feature predictors as well as revised training procedures. FFPred 2.0 includes support vector regression models for the prediction of 442 Gene Ontology (GO) terms, which largely expand the coverage of the ontology and of the biological process category in particular. The GO term list mainly revolves around macromolecular interactions and their role in regulatory, signalling, developmental and metabolic processes. Benchmarking experiments on newly annotated proteins show that FFPred 2.0 provides more accurate functional assignments than its predecessor and the ProtFun server do; also, its assignments can complement information obtained using BLAST-based transfer of annotations, improving especially prediction in the biological process category. Furthermore, FFPred 2.0 can be used to annotate proteins belonging to several eukaryotic organisms with a limited decrease in prediction quality. We illustrate all these points through the use of both precision-recall plots and of the COGIC scores, which we recently proposed as an alternative numerical evaluation measure of function prediction accuracy.
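
The abstract reports results as precision-recall plots. As a rough illustration of how such points can be computed from scored GO term predictions, the sketch below pools true and false positives over all proteins at each score threshold. It is not the authors' evaluation protocol: the function name `precision_recall_points`, the dictionary layout, the protein labels, and the toy scores are all assumptions made for this example, and the GO identifiers serve only as placeholders.

```python
from typing import Dict, List, Set, Tuple


def precision_recall_points(
    scores: Dict[str, Dict[str, float]],
    truth: Dict[str, Set[str]],
    thresholds: List[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) tuples pooled over all proteins."""
    points = []
    for t in thresholds:
        tp = fp = fn = 0
        for protein, term_scores in scores.items():
            # A term counts as predicted when its score reaches the threshold.
            predicted = {term for term, s in term_scores.items() if s >= t}
            actual = truth.get(protein, set())
            tp += len(predicted & actual)
            fp += len(predicted - actual)
            fn += len(actual - predicted)
        if tp + fp == 0:
            # Precision is undefined when nothing is predicted; skip the point.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((t, precision, recall))
    return points


# Hypothetical toy data: per-protein scores for a few GO terms (assumed values).
scores = {
    "P1": {"GO:0003677": 0.9, "GO:0006355": 0.6, "GO:0005515": 0.2},
    "P2": {"GO:0003677": 0.4, "GO:0005515": 0.8},
}
# Hypothetical reference annotations for the same proteins.
truth = {
    "P1": {"GO:0003677", "GO:0006355"},
    "P2": {"GO:0005515", "GO:0006355"},
}

points = precision_recall_points(scores, truth, [0.3, 0.5, 0.7])
for threshold, precision, recall in points:
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold trades recall for precision, which is the trade-off a precision-recall plot makes visible. Pooling counts across proteins is only one possible averaging choice; per-protein or per-term averaging would give different curves.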

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3af/3661659/6fedd50f01be/pone.0063754.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3af/3661659/7194d78762cc/pone.0063754.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3af/3661659/7f134601c2c2/pone.0063754.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3af/3661659/1685affea257/pone.0063754.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3af/3661659/5ea81afa03a3/pone.0063754.g005.jpg
