Derek Wilson, Martin Madera, Christine Vogel, Cyrus Chothia, Julian Gough
MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK.
Nucleic Acids Res. 2007 Jan;35(Database issue):D308-13. doi: 10.1093/nar/gkl910. Epub 2006 Nov 10.
The SUPERFAMILY database provides protein domain assignments, at the SCOP 'superfamily' level, for the predicted protein sequences in over 400 completed genomes. A superfamily groups together domains from different families that share a common evolutionary ancestor, as judged from structural, functional and sequence data. SUPERFAMILY domain assignments are generated using an expert-curated set of profile hidden Markov models. All models and structural assignments are available for browsing and download from http://supfam.org. The web interface offers domain architectures and alignment details for all protein assignments, searchable domain combinations, domain occurrence network visualization, detection of superfamilies that are over- or under-represented in a given genome relative to other genomes, assignment of manually submitted sequences, and keyword searches. In this update we describe the SUPERFAMILY database and outline two major developments: (i) the incorporation of family-level assignments and (ii) a superfamily-level functional annotation. The SUPERFAMILY database can be used for general protein evolution studies, superfamily-specific studies, genomic annotation, and the suggestion and assessment of structural genomics targets.
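The abstract mentions detecting over- or under-represented superfamilies by comparing one genome with others, but does not say which statistic is used. As a minimal sketch only, the idea could be illustrated with a z-score on the fraction of a genome's domains assigned to each superfamily. The function name `representation_z_scores`, the input layout, and the choice of z-score are all assumptions made for illustration, not the method used by SUPERFAMILY.

```python
from statistics import mean, stdev


def representation_z_scores(target, background):
    """Score each superfamily in `target` against the background genomes.

    target:     {superfamily: fraction of the genome's domains assigned to it}
    background: list of dicts with the same layout, one per comparison genome

    Hypothetical helper: the z-score is an assumed statistic, not the
    one documented for the SUPERFAMILY web interface.
    """
    scores = {}
    for superfamily, fraction in target.items():
        # A superfamily absent from a background genome counts as 0.0 there.
        others = [genome.get(superfamily, 0.0) for genome in background]
        if len(others) < 2:
            continue  # a sample standard deviation needs at least two genomes
        spread = stdev(others)
        if spread == 0:
            continue  # no variation in the background, z-score is undefined
        scores[superfamily] = (fraction - mean(others)) / spread
    return scores
```

Under these assumptions, a large positive score would suggest over-representation in the target genome and a large negative score under-representation. Working with fractions rather than raw counts is one simple way to keep genomes of different sizes comparable.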