Mapping PDB chains to UniProtKB entries.
Author information
Martin Andrew C R
Affiliation
Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.
Publication information
Bioinformatics. 2005 Dec 1;21(23):4297-301. doi: 10.1093/bioinformatics/bti694. Epub 2005 Sep 27.
MOTIVATION
UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. Through its cross-references, the database serves as a jumping-off point to many other resources, including other primary databases, secondary databases, the Gene Ontology and OMIM. Although it provides a large number of links to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource that allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file.
RESULTS
We have created a fully automatically maintained database that maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/trEMBL entries. The protocol uses links from PDB to UniProtKB and from UniProtKB to PDB, together with a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally, the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping.
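The final alignment step can be illustrated with a minimal Python sketch. This is not the implementation behind the database. It assumes a simple semi-global dynamic-programming alignment with illustrative scores, in which the whole PDB chain is aligned and unaligned UniProtKB residues at either end cost nothing. The function names, scoring values, example sequences and residue numbers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative scores only (an assumption, not taken from the paper).
MATCH, MISMATCH, GAP = 2, -1, -2


def align_chain_to_entry(chain: str, entry: str) -> List[Tuple[int, Optional[int]]]:
    """Align a PDB chain sequence against a UniProtKB sequence.

    Semi-global alignment: every chain residue is placed, while skipping
    residues at the start or end of the entry is free. Returns one pair
    (chain_index, entry_index) per chain residue, 0-based; entry_index is
    None when the chain residue has no counterpart in the entry.
    """
    n, m = len(chain), len(entry)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP  # chain residues before the entry start are penalised
    # Row 0 stays at 0, so skipping a prefix of the entry is free.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if chain[i - 1] == entry[j - 1] else MISMATCH
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + GAP,
                              score[i][j - 1] + GAP)

    # Skipping a suffix of the entry is free: start at the best cell of the last row.
    i, j = n, max(range(m + 1), key=lambda k: score[n][k])
    pairs: List[Tuple[int, Optional[int]]] = []
    while i > 0:
        sub = MATCH if j > 0 and chain[i - 1] == entry[j - 1] else MISMATCH
        if j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))  # aligned pair (match or substitution)
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + GAP:
            pairs.append((i - 1, None))   # chain residue with no entry residue
            i -= 1
        else:
            j -= 1                        # entry residue absent from the chain
    pairs.reverse()
    return pairs


def residue_map(chain_seq: str, chain_resnums: List[int],
                entry_seq: str) -> Dict[int, Optional[int]]:
    """Map PDB residue numbers to 1-based UniProtKB sequence positions."""
    pairs = align_chain_to_entry(chain_seq, entry_seq)
    return {chain_resnums[ci]: (None if ej is None else ej + 1) for ci, ej in pairs}


if __name__ == "__main__":
    # Hypothetical data: a chain covering part of a longer entry sequence.
    entry_seq = "MKTAYIAKQRQISFVKSHFSRQ"
    chain_seq = "AYIAKQRQ"
    chain_resnums = list(range(10, 18))  # PDB numbering need not start at 1
    print(residue_map(chain_seq, chain_resnums, entry_seq))
```

In this sketch a substituted residue is still mapped to the position it aligns with, so engineered mutations keep a UniProtKB position. A production pipeline would more likely use a substitution matrix and affine gap penalties.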
AVAILABILITY
The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.