

Protein domain assignment from the recurrence of locally similar structures.

Author information

Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.

Publication information

Proteins. 2011 Mar;79(3):853-66. doi: 10.1002/prot.22923. Epub 2010 Dec 22.

Abstract

Domains are basic units of protein structure and are essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Data Bank (PDB) is increasing dramatically, and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to those of several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds, as suggested by recent studies.
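The abstract does not specify the three clustering procedures, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each LSSP is given as the set of query residue indices it covers, counts how often two residues fall in the same LSSP, and merges residues whose co-occurrence count reaches a threshold. The function names, the `min_shared` parameter, and the single-linkage merging rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(lssps):
    """Count, for each residue pair, how many LSSPs cover both residues."""
    counts = defaultdict(int)
    for piece in lssps:
        for i, j in combinations(sorted(set(piece)), 2):
            counts[(i, j)] += 1
    return counts


def cluster_residues(n_residues, lssps, min_shared=2):
    """Group residues that share at least `min_shared` LSSPs (hypothetical rule).

    Uses union-find, i.e. single-linkage merging over the thresholded
    co-occurrence counts. Returns a sorted list of residue-index lists.
    """
    parent = list(range(n_residues))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), count in cooccurrence(lssps).items():
        if count >= min_shared:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for residue in range(n_residues):
        groups[find(residue)].append(residue)
    return sorted(groups.values())


# Toy input: two recurrent pieces plus one piece seen only once that bridges them.
toy_lssps = [range(0, 5), range(0, 5), range(5, 10), range(5, 10), [4, 5]]
domains = cluster_residues(10, toy_lssps)
```

In this toy input the bridging piece occurs only once, so with the default threshold residues 0 to 4 and 5 to 9 end up in separate groups. Lowering `min_shared` to 1 merges them into one group, which illustrates why recurrence, rather than a single structural match, is what carries the signal in this kind of scheme.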




