The SUPERFAMILY 1.75 database in 2014: a doubling of data.

Authors

Oates Matt E, Stahlhacke Jonathan, Vavoulis Dimitrios V, Smithers Ben, Rackham Owen J L, Sardar Adam J, Zaucha Jan, Thurlby Natalie, Fang Hai, Gough Julian

Affiliation

Computer Science, University of Bristol, Bristol, BS8 1UB, UK.

Publication information

Nucleic Acids Res. 2015 Jan;43(Database issue):D227-33. doi: 10.1093/nar/gku1041. Epub 2014 Nov 20.

DOI:10.1093/nar/gku1041

PMID:25414345

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4383889/

Abstract

We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as: UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and--in the case of whole genomes--with enrichment analysis against a taxonomically defined background.
