Wu C H, Shivakumar S
Department of Epidemiology/Biomathematics, University of Texas Health Center at Tyler 75710, USA.
Pac Symp Biocomput. 1998:719-30.
ProClass is a protein family database that organizes non-redundant sequence entries into families defined collectively by ProSite patterns and PIR superfamilies. The database contains about 100,000 entries, more than half of which are classified into about 3,000 families. The new version includes links to various protein family/domain and structural class databases and contains gapped motif alignments for all ProSite patterns. The motif sequences are retrieved from both the SwissProt and PIR-International databases and include numerous new members detected by our GeneFIND family identification system. The motif collection represents a 50% increase over the motifs catalogued in ProSite. The ProClass database can be used to maximize family information retrieval, help organize protein sequence databases, and support full-scale genomic annotation. The database and its query program are freely available for on-line record retrieval and direct file transfer from our WWW server at http://diana.uthct.edu/proclass.html.