Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy.
IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina.
Nucleic Acids Res. 2021 Jan 8;49(D1):D452-D457. doi: 10.1093/nar/gkaa1097.
The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification, which combines top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) that require sequence similarity and describe repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
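As an illustration of the five-level hierarchy described above, the following Python fragment is a minimal sketch of how a classification path might be represented and grouped at a chosen level. It is not the RepeatsDB data model or API: the names `Classification` and `group_by_level` and all labels such as `C1` or `clan-a` are hypothetical placeholders. Only the level names and their order come from the abstract.

```python
from collections import defaultdict
from typing import NamedTuple

# Level names taken from the abstract, ordered from broadest to narrowest.
LEVELS = ("class", "topology", "fold", "clan", "family")
STRUCTURE_LEVELS = LEVELS[:3]  # based solely on structural similarity
SEQUENCE_LEVELS = LEVELS[3:]   # additionally require sequence similarity


class Classification(NamedTuple):
    """Position of one repeat region in the hierarchy (labels are hypothetical)."""
    repeat_class: str
    topology: str
    fold: str
    clan: str
    family: str

    def path(self, depth: int = len(LEVELS)) -> str:
        """Join the first `depth` levels as 'Class > Topology > ...'."""
        return " > ".join(self[:depth])


def group_by_level(entries, level):
    """Group classifications by their path down to the named level."""
    depth = LEVELS.index(level) + 1
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.path(depth)].append(entry)
    return dict(groups)


# Placeholder entries, not real RepeatsDB identifiers.
entries = [
    Classification("C1", "T1", "F1", "clan-a", "family-x"),
    Classification("C1", "T1", "F1", "clan-a", "family-y"),
    Classification("C1", "T1", "F2", "clan-b", "family-z"),
]
by_fold = group_by_level(entries, "fold")
```

Grouping at `"fold"` uses only the structure-based levels, whereas grouping at `"clan"` or `"family"` also separates entries by the sequence-based levels.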