TIGRFAMs and Genome Properties in 2013.

Affiliation

Informatics, J Craig Venter Institute, Rockville, MD 20850, USA.

Publication information

Nucleic Acids Res. 2013 Jan;41(Database issue):D387-95. doi: 10.1093/nar/gks1234. Epub 2012 Nov 28.

Abstract

TIGRFAMs, available online at http://www.jcvi.org/tigrfams, is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral sequence. Models describing more functionally heterogeneous families are designated subfamily or domain, and assign less specific but more widely applicable annotations. The Genome Properties database, available at http://www.jcvi.org/genome-properties, specifies how computed evidence, including TIGRFAMs HMM results, should be used to judge whether an enzymatic pathway, a protein complex or another type of molecular subsystem is encoded in a genome. TIGRFAMs and Genome Properties content are developed in concert because subsystems reconstruction for large numbers of genomes guides selection of seed alignment sequences and cutoff values during protein family construction. Both databases specialize heavily in bacterial and archaeal subsystems. At present, 4284 models appear in TIGRFAMs, while 628 systems are described by Genome Properties. Content derives both from subsystem discovery work and from biocuration of the scientific literature.
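
The two decision steps described in the abstract (cutoff-based family membership, then subsystem calling from that evidence) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class and function names, the model names MODEL_A and MODEL_B, the scores, the cutoffs, and the three-state YES/PARTIAL/NO result are hypothetical and are not taken from the TIGRFAMs or Genome Properties software. Real Genome Properties rules may weigh evidence differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for one protein family model and its cutoff."""
    name: str
    trusted_cutoff: float


@dataclass(frozen=True)
class Hit:
    """Hypothetical HMM search result: one protein scored against one model."""
    protein: str
    model: str
    score: float


def members(hits, models):
    """Map each model name to the proteins scoring at or above its cutoff."""
    by_name = {m.name: m for m in models}
    result = {m.name: set() for m in models}
    for hit in hits:
        model = by_name.get(hit.model)
        if model is not None and hit.score >= model.trusted_cutoff:
            result[model.name].add(hit.protein)
    return result


def evaluate_property(required_models, membership):
    """Assumed rule: YES if every required model has a member, NO if none, else PARTIAL."""
    found = [name for name in required_models if membership.get(name)]
    if len(found) == len(required_models):
        return "YES"
    return "PARTIAL" if found else "NO"


# Illustrative data only; none of these values come from the databases.
models = [Model("MODEL_A", 100.0), Model("MODEL_B", 250.0)]
hits = [Hit("prot1", "MODEL_A", 140.2), Hit("prot2", "MODEL_B", 180.0)]

membership = members(hits, models)
state = evaluate_property(["MODEL_A", "MODEL_B"], membership)
print(membership, state)
```

In this toy genome, prot1 clears the cutoff for MODEL_A while prot2 falls short of the cutoff for MODEL_B, so a subsystem requiring both models is only partly supported under the assumed rule.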
