Max Planck Institute for Molecular Plant Physiology, Bioinformatics, Am Muehlenberg 1, 14476 Potsdam-Golm, Germany.
Nucleic Acids Res. 2010 Jan;38(Database issue):D326-30. doi: 10.1093/nar/gkp980. Epub 2009 Nov 11.
As the amount of experimental data grows, the number of known protein structures increases continuously. Classifying protein structures helps to understand the relationships between protein structure and function. The main classification methods based on secondary structures are SCOP, CATH and TOPS; each classifies according to different criteria, so they can yield different results. We developed a mathematically unique representation of protein structure topologies at a higher level of abstraction, which provides new aspects of classification and enables fast searching of the data. The Protein Topology Graph Library (PTGL; http://ptgl.zib.de) aims to provide a database of protein secondary structure topologies, including search facilities, visualization as intuitive topology diagrams and within the 3D structure, and additional information. Secondary structure-based protein topologies are represented uniquely as undirected labeled graphs in four different ways, allowing exploration from different perspectives. The linear notations and the 2D and 3D diagrams of each notation facilitate a deeper understanding of protein topologies. Several search functions for topologies and sub-topologies, a BLAST search option, and links to SCOP, CATH and PDBsum support both individual and large-scale investigation of protein structures. Currently, PTGL comprises the topologies of 54,859 protein structures. The main structural patterns for common structural motifs such as the TIM-barrel or Jelly Roll are pre-implemented and can easily be searched.
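To make the idea of an undirected labeled graph concrete, the following is a minimal sketch, not PTGL's actual data model or code. It assumes that vertices stand for secondary structure elements and that edges stand for spatial contacts between them. The class name `TopologyGraph`, the vertex labels `"H"` (helix) and `"E"` (strand), and the edge labels `"a"` (antiparallel) and `"p"` (parallel) are all illustrative assumptions.

```python
from collections import defaultdict


class TopologyGraph:
    """Hypothetical undirected labeled graph of secondary structure elements."""

    def __init__(self):
        self.vertex_labels = []          # vertex index -> label, e.g. "H" or "E"
        self.edges = defaultdict(dict)   # u -> {v: edge label}

    def add_sse(self, label):
        """Add one secondary structure element and return its vertex index."""
        self.vertex_labels.append(label)
        return len(self.vertex_labels) - 1

    def add_contact(self, u, v, label):
        """Add an undirected labeled edge by storing it in both directions."""
        self.edges[u][v] = label
        self.edges[v][u] = label

    def components(self):
        """Return the connected components as sorted lists of vertex indices."""
        seen, result = set(), []
        for start in range(len(self.vertex_labels)):
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], []
            while stack:
                node = stack.pop()
                component.append(node)
                # .get avoids creating empty entries for isolated vertices
                for neighbor in self.edges.get(node, {}):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        stack.append(neighbor)
            result.append(sorted(component))
        return result


# Toy example (invented, not taken from the database): four strands forming
# an antiparallel sheet, plus one helix with no contacts.
graph = TopologyGraph()
strands = [graph.add_sse("E") for _ in range(4)]
helix = graph.add_sse("H")
for left, right in zip(strands, strands[1:]):
    graph.add_contact(left, right, "a")

print(graph.components())
```

In this sketch each connected component groups elements that are linked by contacts, which is one simple way a topology could be split into independently searchable parts; how PTGL itself defines and encodes its four graph notations is not specified here.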