RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures.

Author affiliations

Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy.

IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina.

Publication information

Nucleic Acids Res. 2021 Jan 8;49(D1):D452-D457. doi: 10.1093/nar/gkaa1097.

DOI:10.1093/nar/gkaa1097

PMID:33237313

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7778985/

Abstract

The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification, which combines top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) that require sequence similarity and describe repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
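The five-level hierarchy described in the abstract can be pictured as a top-down path of labels. The following is a minimal sketch, not RepeatsDB's actual data model or API: the `Classification` class, its methods, and the placeholder labels are all hypothetical. Only the level names and the split between structure-based and sequence-based levels come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Level names as given in the abstract, ordered from most general to most specific.
LEVELS = ("Class", "Topology", "Fold", "Clan", "Family")
# Per the abstract, the top three levels rely solely on structural similarity.
STRUCTURE_ONLY = {"Class", "Topology", "Fold"}


@dataclass(frozen=True)
class Classification:
    """Hypothetical path through the hierarchy; deeper levels may be omitted."""

    labels: Tuple[str, ...]  # one label per level, top-down

    def __post_init__(self) -> None:
        if not 1 <= len(self.labels) <= len(LEVELS):
            raise ValueError(f"expected 1 to {len(LEVELS)} labels")

    def level(self) -> str:
        """Name of the deepest level this path reaches."""
        return LEVELS[len(self.labels) - 1]

    def requires_sequence_similarity(self) -> bool:
        """True for the two new levels (Clan, Family)."""
        return self.level() not in STRUCTURE_ONLY

    def parent(self) -> Optional["Classification"]:
        """The path one level up, or None at the Class level."""
        if len(self.labels) == 1:
            return None
        return Classification(self.labels[:-1])


# Placeholder labels only; these are not real RepeatsDB entries.
entry = Classification(("class-1", "topology-1", "fold-1", "clan-1"))
print(entry.level(), entry.requires_sequence_similarity())
```

Storing the path as a tuple keeps the parent lookup trivial and makes entries hashable, which would suit grouping structures by any prefix of the hierarchy.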

Similar articles

RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures.

Nucleic Acids Res. 2017 Jan 4;45(D1):D308-D312. doi: 10.1093/nar/gkw1136. Epub 2016 Nov 29.

RepeatsDB: a database of tandem repeat protein structures.

Nucleic Acids Res. 2014 Jan;42(Database issue):D352-7. doi: 10.1093/nar/gkt1175. Epub 2013 Dec 5.

RepeatsDB-lite: a web server for unit annotation of tandem repeat proteins.

Nucleic Acids Res. 2018 Jul 2;46(W1):W402-W407. doi: 10.1093/nar/gky360.

Comparison of protein repeat classifications based on structure and sequence families.

Biochem Soc Trans. 2015 Oct;43(5):832-7. doi: 10.1042/BST20150079.

The Pfam protein families database in 2019.

Nucleic Acids Res. 2019 Jan 8;47(D1):D427-D432. doi: 10.1093/nar/gky995.

Identification of repetitive units in protein structures with ReUPred.

Amino Acids. 2016 Jun;48(6):1391-400. doi: 10.1007/s00726-016-2187-2. Epub 2016 Feb 22.

Classification of β-hairpin repeat proteins.

J Struct Biol. 2018 Feb;201(2):130-138. doi: 10.1016/j.jsb.2017.10.001. Epub 2017 Oct 7.

The repetitive structure of DNA clamps: An overlooked protein tandem repeat.

J Struct Biol. 2023 Sep;215(3):108001. doi: 10.1016/j.jsb.2023.108001. Epub 2023 Jul 17.

DbStRiPs: Database of structural repeats in proteins.

Protein Sci. 2022 Jan;31(1):23-36. doi: 10.1002/pro.4052. Epub 2021 Mar 6.

Cited by

A self-assembled protein β-helix as a self-contained biofunctional motif.

Nat Commun. 2025 May 15;16(1):4535. doi: 10.1038/s41467-025-59873-1.

Signalling by co-operative higher-order assembly formation: linking evidence at molecular and cellular levels.

Biochem J. 2025 Mar 5;482(5):275-294. doi: 10.1042/BCJ20220094.

InterPro: the protein sequence classification resource in 2025.

Nucleic Acids Res. 2025 Jan 6;53(D1):D444-D456. doi: 10.1093/nar/gkae1082.

Discovery and Analysis of Repeat and Low-Complexity Architectures in Proteins and Their Conserved Evolutionary Relationships Using Self-Homology Dot Plots.

Methods Mol Biol. 2025;2870:95-116. doi: 10.1007/978-1-0716-4213-9_7.

The 3D Invariant Positioning for Protein Molecules / Molecular Complexes with Matching Subunits.

Methods Mol Biol. 2025;2870:41-50. doi: 10.1007/978-1-0716-4213-9_3.

The Pfam protein families database: embracing AI/ML.

Nucleic Acids Res. 2025 Jan 6;53(D1):D523-D534. doi: 10.1093/nar/gkae997.

RepeatsDB in 2025: expanding annotations of structured tandem repeats proteins on AlphaFoldDB.

Nucleic Acids Res. 2025 Jan 6;53(D1):D575-D581. doi: 10.1093/nar/gkae965.

Diversity and structural-functional insights of alpha-solenoid proteins.

Protein Sci. 2024 Nov;33(11):e5189. doi: 10.1002/pro.5189.

Tandem-repeat lectins: structural and functional insights.

Glycobiology. 2024 May 26;34(7). doi: 10.1093/glycob/cwae041.

Structured Tandem Repeats in Protein Interactions.

Int J Mol Sci. 2024 Mar 5;25(5):2994. doi: 10.3390/ijms25052994.
