ENZYMAP: exploiting protein annotation for modeling and predicting EC number changes in UniProt/Swiss-Prot.

Author affiliations

Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil; Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.

Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.

Publication information

PLoS One. 2014 Feb 19;9(2):e89162. doi: 10.1371/journal.pone.0089162. eCollection 2014.

DOI: 10.1371/journal.pone.0089162
PMID: 24586563
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3929618/

Abstract

The volume and diversity of biological data are increasing at very high rates. Vast amounts of protein sequences and structures, protein and genetic interactions and phenotype studies have been produced. The majority of data generated by high-throughput devices is automatically annotated because manually annotating them is not possible. Thus, efficient and precise automatic annotation methods are required to ensure the quality and reliability of both the biological data and associated annotations. We proposed ENZYMatic Annotation Predictor (ENZYMAP), a technique to characterize and predict EC number changes based on annotations from UniProt/Swiss-Prot using a supervised learning approach. We evaluated ENZYMAP experimentally, using test data sets from both UniProt/Swiss-Prot and UniProt/TrEMBL, and showed that predicting EC changes using selected types of annotation is possible. Finally, we compared ENZYMAP and DETECT with respect to their predictions and checked both against the UniProt/Swiss-Prot annotations. ENZYMAP was shown to be more accurate than DETECT, coming closer to the actual changes in UniProt/Swiss-Prot. Our proposal is intended to be an automatic complementary method (that can be used together with other techniques like the ones based on protein sequence and structure) that helps to improve the quality and reliability of enzyme annotations over time, suggesting possible corrections, anticipating annotation changes and propagating the implicit knowledge for the whole dataset.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1708/3929618/620c74b89042/pone.0089162.g001.jpg
