In search of genome annotation consistency: solid gene clusters and how to use them.

Authors

Davis James J, Olsen Gary J, Overbeek Ross, Vonstein Veronika, Xia Fangfang

Affiliations

Institute for Genomic Biology, MC-195, University of Illinois at Urbana-Champaign, 1206 W. Gregory Dr., Urbana, IL, 61801, USA.

Department of Microbiology, University of Illinois at Urbana-Champaign, 601 S. Goodwin Ave., Urbana, IL, 61801, USA.

Publication information

3 Biotech. 2014 Jun;4(3):331-335. doi: 10.1007/s13205-013-0152-2. Epub 2013 Jul 6.

DOI:10.1007/s13205-013-0152-2
PMID:28324432
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4026451/
Abstract

Maintaining consistency in genome annotations is important for supporting many computational tasks, particularly metabolic modeling. The SEED project has implemented a process that improves annotation consistencies across microbial genomes for proteins with conserved sequences and genomic context. In this research report, we describe this process and show how this effort has resulted in improvements to microbial genome annotations in the SEED. We also compare SEED annotation consistencies with other commonly used resources such as IMG (the Joint Genome Institute's Integrated Microbial Genomes system), RefSeq (the National Center for Biotechnology Information's Reference Sequence Database), Swiss-Prot (the annotated protein sequence database of the Swiss Institute of Bioinformatics, European Molecular Biology Laboratory and the European Bioinformatics Institute) and TrEMBL (Translated European Molecular Biology Laboratory nucleotide sequence data Library). Our analysis indicates that manual and computational efforts are paying off for the databases where consistency is a major goal.
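
The abstract does not say how annotation consistency is scored, so the following is only a minimal sketch of one plausible measure, not the SEED project's actual procedure. It assumes proteins have already been grouped into clusters of conserved sequences, and it counts a cluster as consistent when every member carries the same functional annotation string. The function names, the case-insensitive string comparison, and the toy data are all hypothetical.

```python
from collections import Counter


def cluster_consistency(annotations):
    """Fraction of cluster members that carry the most common annotation.

    Annotations are compared after trimming whitespace and lowercasing,
    which is an assumption made here for illustration only.
    """
    if not annotations:
        raise ValueError("cluster has no annotations")
    counts = Counter(a.strip().lower() for a in annotations)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(annotations)


def fraction_consistent(clusters):
    """Fraction of clusters in which all members share one annotation."""
    if not clusters:
        return 0.0
    consistent = sum(
        1 for members in clusters.values() if cluster_consistency(members) == 1.0
    )
    return consistent / len(clusters)


# Hypothetical toy data: cluster id -> annotations of its member proteins.
clusters = {
    "cluster_1": [
        "DNA gyrase subunit A",
        "DNA gyrase subunit A",
        "DNA gyrase subunit A",
    ],
    "cluster_2": [
        "Hypothetical protein",
        "Thioredoxin",
        "Thioredoxin",
        "Thioredoxin",
    ],
}

per_cluster = {cid: cluster_consistency(m) for cid, m in clusters.items()}
overall = fraction_consistent(clusters)
print(per_cluster)
print(overall)
```

Under this definition, the same score could be computed separately for the annotations each resource assigns to the same clusters, which is the kind of cross-database comparison the abstract describes. Exact string matching is a deliberately crude choice: real annotations often differ in wording while meaning the same function, so a practical measure would need some normalization step.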
