Protein domain assignment from the recurrence of locally similar structures.

Author information

Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.

Publication information

Proteins. 2011 Mar;79(3):853-66. doi: 10.1002/prot.22923. Epub 2010 Dec 22.

Abstract

Domains are basic units of protein structure and essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Databank (PDB) is increasing dramatically and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds as suggested by recent studies.
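
The abstract does not spell out the three clustering procedures, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each LSSP is given as a collection of query-residue indices, counts how many LSSPs cover each residue pair, and then groups residues by single-linkage clustering on that count. The names `cooccurrence`, `assign_domains`, and the threshold `min_shared` are hypothetical and chosen for illustration.

```python
def cooccurrence(n_residues, lssps):
    """Count, for each residue pair, how many LSSPs cover both residues."""
    counts = [[0] * n_residues for _ in range(n_residues)]
    for piece in lssps:
        members = sorted(set(piece))
        for i in members:
            for j in members:
                counts[i][j] += 1
    return counts


def assign_domains(n_residues, lssps, min_shared=2):
    """Assumed stand-in clustering: link two residues when at least
    `min_shared` LSSPs contain both, then take connected components."""
    counts = cooccurrence(n_residues, lssps)
    parent = list(range(n_residues))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n_residues):
        for j in range(i + 1, n_residues):
            if counts[i][j] >= min_shared:
                parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n_residues)]


# Toy input: six made-up LSSPs over a 10-residue query.
toy_lssps = [range(0, 4), range(1, 5), range(0, 5),
             range(5, 10), range(6, 10), range(5, 9)]
print(assign_domains(10, toy_lssps))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy case no LSSP spans residues 4 and 5, so the two halves never link and come out as separate domains. A real implementation would need LSSPs from actual structure comparisons and a more careful treatment of the linkage criterion.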
