Recognizing the fold of a protein structure.

Author information

Harrison Andrew, Pearl Frances, Sillitoe Ian, Slidel Tim, Mott Richard, Thornton Janet, Orengo Christine

Affiliations

Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.

Publication information

Bioinformatics. 2003 Sep 22;19(14):1748-59. doi: 10.1093/bioinformatics/btg240.

PMID:14512345

Abstract

This paper reports a graph-theoretic program, GRATH, that rapidly and accurately matches a novel structure against a library of domain structures to find the most similar ones. GRATH generates distributions of scores by comparing the novel domain against the different fold types previously classified in the CATH database of structural domains. GRATH uses a similarity measure that captures what any two protein structures share: geometric information, the number of secondary structures, and the number of residues within secondary structures. Although GRATH builds on well-established approaches for secondary structure comparison, a novel scoring scheme has been introduced to allow ranking of any matches identified by the algorithm. More importantly, we have benchmarked the algorithm using a large dataset of 1702 non-redundant structures from the CATH database that have already been classified into fold groups, with manual validation. This has facilitated the introduction of further constraints, the optimization of parameters and the identification of reliable thresholds for fold identification. Following these benchmarking trials, the correct fold is identified as the top-scoring match with a frequency of 90%, and within the ten most likely assignments with a frequency of 98%. GRATH is available via a server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/Grath.pl). GRATH's speed and accuracy mean that it can be used as a reliable front-end filter for the more accurate, but computationally expensive, residue-based structure comparison algorithm SSAP, currently used to classify domain structures in the CATH database. With an increasing number of structures being solved by the structural genomics initiatives, the GRATH server also provides an essential resource for determining whether newly determined structures are related to any known structures from which functional properties may be inferred.
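
The benchmark figures in the abstract (correct fold ranked first, or within the ten most likely assignments) are top-k identification frequencies. The sketch below illustrates how such a frequency could be computed from ranked fold assignments. It is not GRATH's code: the function name `top_k_frequency`, the dictionary layout, and the toy fold labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def top_k_frequency(
    ranked_folds: Dict[str, List[str]],
    true_fold: Dict[str, str],
    k: int,
) -> float:
    """Return the fraction of queries whose true fold is among the first k ranked folds.

    ranked_folds maps each query to fold labels sorted by descending score
    (an assumed input format, not one defined by the paper).
    """
    if not ranked_folds:
        raise ValueError("no queries supplied")
    hits = sum(
        1
        for query, folds in ranked_folds.items()
        if true_fold[query] in folds[:k]
    )
    return hits / len(ranked_folds)


# Hypothetical toy data: three query domains with made-up fold labels.
ranked = {
    "query_a": ["fold_1", "fold_2", "fold_3"],
    "query_b": ["fold_2", "fold_1", "fold_3"],
    "query_c": ["fold_3", "fold_1", "fold_2"],
}
truth = {"query_a": "fold_1", "query_b": "fold_1", "query_c": "fold_2"}

top1 = top_k_frequency(ranked, truth, k=1)
top2 = top_k_frequency(ranked, truth, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

Applied to a benchmark set such as the 1702 classified CATH structures, the same calculation with k=1 and k=10 would yield figures of the kind quoted above, assuming each query's candidate folds are ranked by score.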
