Continuous and discontinuous domains: an algorithm for the automatic generation of reliable protein domain definitions.

Authors

Siddiqui A S, Barton G J

Affiliation

Laboratory of Molecular Biophysics, University of Oxford, United Kingdom.

Publication information

Protein Sci. 1995 May;4(5):872-84. doi: 10.1002/pro.5560040507.

Abstract

An algorithm is presented for the fast and accurate definition of protein structural domains from coordinate data without prior knowledge of the number or type of domains. The algorithm explicitly locates domains that comprise one or two continuous segments of protein chain. Domains that include more than two segments are also located. The algorithm was applied to a nonredundant database of 230 protein structures and the results compared to domain definitions obtained from the literature, or by inspection of the coordinates on molecular graphics. For 70% of the proteins, the derived domains agree with the reference definitions, 18% show minor differences and only 12% (28 proteins) show very different definitions. Three screens were applied to identify the derived domains least likely to agree with the subjective definition set. These screens revealed a set of 173 proteins, 97% of which agree well with the subjective definitions. The algorithm represents a practical domain identification tool that can be run routinely on the entire structural database. Adjustment of parameters also allows smaller compact units to be identified in proteins.
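The distinction between continuous and discontinuous domains can be illustrated with a minimal sketch. It is not the published method: the abstract does not state how candidate splits are scored, so the score used here (internal contacts of A times internal contacts of B, divided by the square of the A-B contacts) is an assumption, as are the function names and the `cutoff`, `min_sep`, and `min_size` values. Input is assumed to be one coordinate per residue (for example C-alpha positions). The search tries every contiguous stretch `[start, end)` as domain A: when `start` is 0 the chain is cut once into two continuous domains, and when A lies in the middle of the chain the remainder B is a domain made of two segments.

```python
from itertools import combinations
from math import dist


def contact_pairs(coords, cutoff=8.0, min_sep=3):
    """Index pairs (i, j), i < j, at least min_sep apart in sequence
    and within `cutoff` distance units of each other."""
    return [
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i >= min_sep and dist(coords[i], coords[j]) <= cutoff
    ]


def split_score(in_a, pairs):
    """Assumed score for a two-way partition:
    (internal A contacts * internal B contacts) / (A-B contacts)^2."""
    int_a = int_b = ext = 0
    for i, j in pairs:
        if in_a[i] and in_a[j]:
            int_a += 1
        elif not in_a[i] and not in_a[j]:
            int_b += 1
        else:
            ext += 1
    # Treat zero A-B contacts as one to avoid division by zero.
    return int_a * int_b / max(ext, 1) ** 2


def best_split(coords, min_size=4, cutoff=8.0):
    """Return (score, (start, end)) for the best-scoring stretch A.
    B is everything else, so B has two segments when 0 < start and end < n."""
    n = len(coords)
    pairs = contact_pairs(coords, cutoff)
    best = (0.0, None)
    for start in range(0, n - min_size + 1):
        for end in range(start + min_size, n + 1):
            if n - (end - start) < min_size:
                continue  # B would be too small to count as a domain
            in_a = [start <= k < end for k in range(n)]
            score = split_score(in_a, pairs)
            if score > best[0]:
                best = (score, (start, end))
    return best


# Toy chain: residues 0-3 and 8-11 pack together, residues 4-7 sit far away.
toy = (
    [(float(k), 0.0, 0.0) for k in range(4)]
    + [(100.0 + k, 0.0, 0.0) for k in range(4)]
    + [(float(k), 1.0, 0.0) for k in range(4)]
)
print(best_split(toy))
```

On this toy chain the best stretch A is residues 4 to 7, which leaves B as a two-segment domain (residues 0 to 3 plus 8 to 11). The exhaustive search is quadratic in chain length and is written for clarity rather than speed.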
