PASS2: a semi-automated database of protein alignments organised as structural superfamilies.

Author information

Mallika V, Bhaduri Anirban, Sowdhamini R

Affiliation

National Centre for Biological Sciences, UAS-GKVK Campus, Bangalore 560 065, India.

Publication information

Nucleic Acids Res. 2002 Jan 1;30(1):284-8. doi: 10.1093/nar/30.1.284.

Abstract

PASS2 is a nearly automated version of CAMPASS and contains sequence alignments of proteins grouped at the superfamily level. The database has been built to correspond to the SCOP database (release 1.53) and currently consists of 110 multi-member superfamilies and 613 single-member superfamilies. In multi-member superfamilies, only protein chains with no more than 25% sequence identity have been considered for alignment, so the database aims to provide sequence alignments that represent the 26 219 protein domains in SCOP release 1.53. Structure-based sequence alignments have been obtained with COMPARER; the initial equivalences are provided automatically from a MALIGN alignment and subsequently augmented using STAMP4.0. The final sequence alignments have been annotated for structural features using JOY4.0. Links are provided to other related databases and to genome sequence relatives. The availability of reliable sequence alignments of distantly related proteins, despite poor sequence identity and single-member superfamilies, permits better sampling of structures in libraries for fold recognition of new sequences and a better understanding of the protein structure-function relationships of individual superfamilies. The database can be queried by keyword and by sequence search through a PSI-BLAST interface. Structure-annotated sequence alignments and several accessory structural files can be retrieved for all superfamilies, including the user-input sequence. The database can be accessed from http://www.ncbs.res.in/%7Efaculty/mini/campass/pass.html.
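
The 25% sequence identity cutoff mentioned in the abstract can be illustrated with a minimal sketch. This is not the PASS2 pipeline: the identity definition used here (matches divided by columns where both sequences have a residue), the greedy selection order, the function names, and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: '-' marks a gap, and gapped columns are excluded from the count.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    paired = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not paired:
        return 0.0
    matches = sum(1 for x, y in paired if x == y)
    return 100.0 * matches / len(paired)


def select_representatives(aligned: Dict[str, str], cutoff: float = 25.0) -> List[str]:
    """Greedily keep chains whose identity to every already-kept chain is <= cutoff.

    Assumption: chains are visited in insertion order; a different order can
    produce a different, equally valid, set of representatives.
    """
    kept: List[str] = []
    for name, seq in aligned.items():
        if all(percent_identity(seq, aligned[k]) <= cutoff for k in kept):
            kept.append(name)
    return kept


# Hypothetical pre-aligned toy sequences, not taken from PASS2 or SCOP.
toy = {
    "chainA": "ACDEFGHK",
    "chainB": "ACDEFGHR",  # 7 of 8 positions match chainA
    "chainC": "AWYTMNPQ",  # 1 of 8 positions matches chainA
}

if __name__ == "__main__":
    print(select_representatives(toy))
```

With the toy input, chainB is dropped because it is too similar to chainA, while chainC is retained as a distant relative. Other identity definitions (for example, normalising by the shorter sequence length) would give different values near the cutoff.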