ProClass Protein Family Database

Authors

Wu C H, Shivakumar S, Huang H

Affiliation

Department of Epidemiology/Biomathematics, University of Texas Health Center at Tyler, Tyler, TX 75710, USA.

Publication

Nucleic Acids Res. 1999 Jan 1;27(1):272-4. doi: 10.1093/nar/27.1.272.

DOI:10.1093/nar/27.1.272

PMID:9847199

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC148154/

Abstract

ProClass is a protein family database that organizes non-redundant sequence entries into families defined collectively by PROSITE patterns and PIR superfamilies. By combining global similarities and functional motifs into a single classification scheme, ProClass helps to reveal domain and family relationships and classify multi-domain proteins. The database currently consists of more than 120 000 sequence entries, approximately 60% of which are classified into about 3500 families. To maximize family information retrieval, the database provides links to various protein family/domain and structural class databases and contains multiple motif alignments of all PROSITE patterns as well as global alignments of PIR superfamilies. The motif sequences are retrieved from both PIR-International and SWISS-PROT databases, including a large number of new members detected by our GeneFIND family identification system. Because of its high classification rate, ProClass can be used to support full-scale genomic annotation. The ProClass database is available for on-line search and record retrieval from our WWW server at http://diana.uthct.edu/proclass.html
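
To make the idea of a PROSITE-pattern-defined family concrete, the following is a minimal sketch of how a pattern can be matched against a protein sequence. It is not the ProClass or GeneFIND implementation, which the abstract does not describe. The function names `prosite_to_regex` and `find_motifs` are hypothetical, the converter handles only the common syntax elements (`x`, `[...]`, `{...}`, repetition counts, and `<`/`>` anchors at the ends of the pattern), and the sequence is a short Ras-like fragment chosen purely for illustration. The pattern shown is the PROSITE P-loop motif PS00017.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: anchors inside brackets and other rare syntax
    are not handled.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # C-terminal anchor
            suffix, element = "$", element[:-1]

        # Separate the residue specification from an optional
        # repetition count such as (4) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()

        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # excluded residues
            core = "[^" + core[1:-1] + "]"
        # "[...]" and single letters are already valid regex.

        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"

        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motifs(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# PROSITE PS00017: ATP/GTP-binding site motif A (P-loop).
P_LOOP = "[AG]-x(4)-G-K-[ST]."

# Illustrative Ras-like N-terminal fragment (assumption, not from the paper).
FRAGMENT = "MTEYKLVVVGAGGVGKSALT"

if __name__ == "__main__":
    print(prosite_to_regex(P_LOOP))
    print(find_motifs(P_LOOP, FRAGMENT))
```

A sequence with one or more hits would be a candidate member of the family defined by that pattern. The abstract indicates that ProClass combines such motif evidence with global similarity from PIR superfamilies, which this sketch does not attempt to model.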
