Recognizing the fold of a protein structure.

Author information

Harrison Andrew, Pearl Frances, Sillitoe Ian, Slidel Tim, Mott Richard, Thornton Janet, Orengo Christine

Affiliations

Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.

Publication information

Bioinformatics. 2003 Sep 22;19(14):1748-59. doi: 10.1093/bioinformatics/btg240.

Abstract

This paper reports a graph-theoretic program, GRATH, that rapidly and accurately matches a novel structure against a library of domain structures to find the most similar ones. GRATH generates distributions of scores by comparing the novel domain against the different types of folds that have been classified previously in the CATH database of structural domains. GRATH uses a measure of similarity that captures the geometric information, the number of secondary structures and the number of residues within secondary structures that any two protein structures share. Although GRATH builds on well-established approaches for secondary structure comparison, a novel scoring scheme has been introduced to allow ranking of any matches identified by the algorithm. More importantly, we have benchmarked the algorithm using a large dataset of 1702 non-redundant structures from the CATH database, which have already been classified into fold groups with manual validation. This has facilitated the introduction of further constraints, the optimization of parameters and the identification of reliable thresholds for fold identification. Following these benchmarking trials, the correct fold is identified as the top-scoring match with a frequency of 90%, and within the ten most likely assignments with a frequency of 98%. GRATH is available via a server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/Grath.pl). GRATH's speed and accuracy mean that it can be used as a reliable front-end filter for the more accurate, but computationally expensive, residue-based structure comparison algorithm SSAP, currently used to classify domain structures in the CATH database. With an increasing number of structures being solved by the structural genomics initiatives, the GRATH server also provides an essential resource for determining whether newly determined structures are related to any known structures from which functional properties may be inferred.
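The benchmark figures quoted in the abstract (correct fold at rank 1, or within the ten best assignments) are top-k frequencies over a set of query domains. The sketch below shows one way such a frequency could be computed from ranked fold assignments. It is an illustration only, not GRATH's code: the function name `top_k_frequency`, the query names and the toy rankings are all assumptions made for this example, and the fold labels are placeholders written in the style of CATH codes.

```python
from typing import Dict, Sequence


def top_k_frequency(
    ranked: Dict[str, Sequence[str]],
    truth: Dict[str, str],
    k: int,
) -> float:
    """Fraction of queries whose true fold is among the first k ranked folds.

    ranked maps each query to its fold labels ordered from best to worst score.
    truth maps each query to its manually validated fold label.
    """
    if not ranked:
        raise ValueError("no queries supplied")
    hits = sum(1 for query, folds in ranked.items() if truth[query] in folds[:k])
    return hits / len(ranked)


# Hypothetical toy data: four queries, each with three ranked fold labels.
ranked_folds = {
    "query_a": ["3.40.50", "3.30.70", "2.60.40"],
    "query_b": ["2.60.40", "1.10.10", "3.40.50"],
    "query_c": ["1.10.10", "2.40.50", "3.30.70"],
    "query_d": ["3.30.70", "3.40.50", "1.10.10"],
}
true_fold = {
    "query_a": "3.40.50",
    "query_b": "1.10.10",
    "query_c": "1.10.10",
    "query_d": "2.60.40",  # never appears in query_d's ranking: a miss at any k
}

top1 = top_k_frequency(ranked_folds, true_fold, k=1)
top2 = top_k_frequency(ranked_folds, true_fold, k=2)
print(f"top-1 frequency: {top1:.2f}")  # 0.50 on this toy data
print(f"top-2 frequency: {top2:.2f}")  # 0.75 on this toy data
```

On a real benchmark the same calculation would run over every query domain in the dataset with k=1 and k=10, which is how the 90% and 98% figures in the abstract should be read.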
