The SUPERFAMILY database in 2004: additions and improvements.

Authors

Martin Madera, Christine Vogel, Sarah K. Kummerfeld, Cyrus Chothia, Julian Gough

Affiliation

MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK.

Publication details

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D235-9. doi: 10.1093/nar/gkh117.

DOI:10.1093/nar/gkh117

PMID:14681402

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC308851/

Abstract

The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
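The two coverage figures quoted in the abstract (the share of proteins with at least one match, and the share of residues covered by assignments) can be illustrated with a minimal Python sketch. It is not the SUPERFAMILY pipeline: the function name `coverage`, the `(protein_id, start, end)` tuple layout, and the 1-based inclusive coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def coverage(protein_lengths, assignments):
    """Return (fraction of proteins matched, fraction of residues covered).

    protein_lengths: dict mapping a protein identifier to its length.
    assignments: iterable of (protein_id, start, end) tuples with 1-based
    inclusive coordinates. This layout is hypothetical, not the database's
    actual output format. Every protein_id is assumed to be a key of
    protein_lengths.
    """
    regions = defaultdict(list)
    for protein_id, start, end in assignments:
        regions[protein_id].append((start, end))

    covered_residues = 0
    for spans in regions.values():
        # Merge overlapping spans so that no residue is counted twice.
        current_start, current_end = None, None
        for start, end in sorted(spans):
            if current_end is None or start > current_end:
                if current_end is not None:
                    covered_residues += current_end - current_start + 1
                current_start, current_end = start, end
            else:
                current_end = max(current_end, end)
        covered_residues += current_end - current_start + 1

    matched = sum(1 for protein_id in protein_lengths if protein_id in regions)
    total_residues = sum(protein_lengths.values())
    return matched / len(protein_lengths), covered_residues / total_residues


# Toy data: three proteins, two of which carry domain assignments.
lengths = {"p1": 100, "p2": 50, "p3": 50}
hits = [("p1", 1, 40), ("p1", 31, 60), ("p2", 11, 30)]
protein_fraction, residue_fraction = coverage(lengths, hits)
```

In the toy data, two of the three proteins have a match, and the overlapping assignments on `p1` are merged before residues are counted, so shared positions contribute only once.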

Similar articles

The SUPERFAMILY database in 2004: additions and improvements.

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D235-9. doi: 10.1093/nar/gkh117.

Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure.

J Mol Biol. 2001 Nov 2;313(4):903-19. doi: 10.1006/jmbi.2001.5080.

3D-GENOMICS: a database to compare structural and functional annotations of proteins between sequenced genomes.

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D245-50. doi: 10.1093/nar/gkh064.

SUPFAM: a database of sequence superfamilies of protein domains.

BMC Bioinformatics. 2004 Mar 15;5:28. doi: 10.1186/1471-2105-5-28.

AutoSCOP: automated prediction of SCOP classifications using unique pattern-class mappings.

Bioinformatics. 2007 May 15;23(10):1203-10. doi: 10.1093/bioinformatics/btm089. Epub 2007 Mar 22.

GenDiS: Genomic Distribution of protein structural domain Superfamilies.

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D252-5. doi: 10.1093/nar/gki087.

Accurate domain identification with structure-anchored hidden Markov models, saHMMs.

Proteins. 2009 Aug 1;76(2):343-52. doi: 10.1002/prot.22349.

Fast model-based protein homology detection without alignment.

Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.

SCOPEC: a database of protein catalytic domains.

Bioinformatics. 2004 Aug 4;20 Suppl 1:i130-6. doi: 10.1093/bioinformatics/bth948.

The SYSTERS Protein Family Database in 2005.

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D226-9. doi: 10.1093/nar/gki030.

Cited by

The Phosphatase Cascade Nem1/Spo7-Pah1 Regulates Fungal Development, Lipid Homeostasis, and Virulence in Botryosphaeria dothidea.

Microbiol Spectr. 2023 Jun 15;11(3):e0388122. doi: 10.1128/spectrum.03881-22. Epub 2023 May 16.

Roadmap to the study of gene and protein phylogeny and evolution-A practical guide.

PLoS One. 2023 Feb 24;18(2):e0279597. doi: 10.1371/journal.pone.0279597. eCollection 2023.

Phosphorylation-mediated regulation of the Nem1-Spo7/Pah1 phosphatase cascade in yeast lipid synthesis.

Adv Biol Regul. 2022 May;84:100889. doi: 10.1016/j.jbior.2022.100889. Epub 2022 Feb 23.

Chromosome-level genome assembly of the shuttles hoppfish, Periophthalmus modestus.

Gigascience. 2022 Jan 12;11(1). doi: 10.1093/gigascience/giab089.

Hymenoptera Genome Database: new genomes and annotation datasets for improved go enrichment and orthologue analyses.

Nucleic Acids Res. 2022 Jan 7;50(D1):D1032-D1039. doi: 10.1093/nar/gkab1018.

The Draft Genome Sequence of a New Land-Hopper .

Front Genet. 2021 Jan 11;11:621301. doi: 10.3389/fgene.2020.621301. eCollection 2020.

First draft genome for the sand-hopper Trinorchestia longiramus.

Sci Data. 2020 Mar 9;7(1):85. doi: 10.1038/s41597-020-0424-8.

Immediate Effects of Ammonia Shock on Transcription and Composition of a Biogas Reactor Microbiome.

Front Microbiol. 2019 Sep 6;10:2064. doi: 10.3389/fmicb.2019.02064. eCollection 2019.

Protein kinase C mediates the phosphorylation of the Nem1-Spo7 protein phosphatase complex in yeast.

J Biol Chem. 2019 Nov 1;294(44):15997-16009. doi: 10.1074/jbc.RA119.010592. Epub 2019 Sep 9.

Fat-regulating phosphatidic acid phosphatase: a review of its roles and regulation in lipid homeostasis.

J Lipid Res. 2019 Jan;60(1):2-6. doi: 10.1194/jlr.S087452. Epub 2018 Dec 7.

References

Evolution of the protein repertoire.

Science. 2003 Jun 13;300(5626):1701-3. doi: 10.1126/science.1085371.

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.

Nucleic Acids Res. 2003 Jan 1;31(1):365-70. doi: 10.1093/nar/gkg095.

The InterPro Database, 2003 brings increased coverage and new features.

Nucleic Acids Res. 2003 Jan 1;31(1):315-8. doi: 10.1093/nar/gkg046.

Genome sequence of the dissimilatory metal ion-reducing bacterium Shewanella oneidensis.

Nat Biotechnol. 2002 Nov;20(11):1118-23. doi: 10.1038/nbt749. Epub 2002 Oct 7.

A comparison of profile hidden Markov model procedures for remote homology detection.

Nucleic Acids Res. 2002 Oct 1;30(19):4321-8. doi: 10.1093/nar/gkf544.

Within the twilight zone: a sensitive profile-profile comparison tool based on information theory.

J Mol Biol. 2002 Feb 1;315(5):1257-75. doi: 10.1006/jmbi.2001.5293.

The geometry of domain combination in proteins.

J Mol Biol. 2002 Jan 25;315(4):927-39. doi: 10.1006/jmbi.2001.5288.

SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments.

Nucleic Acids Res. 2002 Jan 1;30(1):268-72. doi: 10.1093/nar/30.1.268.

Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure.

J Mol Biol. 2001 Nov 2;313(4):903-19. doi: 10.1006/jmbi.2001.5080.

Annotation transfer for genomics: measuring functional divergence in multi-domain proteins.

Genome Res. 2001 Oct;11(10):1632-40. doi: 10.1101/gr.183801.
