A framework for protein structure classification and identification of novel protein structures.

Authors

Kim You Jung, Patel Jignesh M

Affiliation

Computer Science and Engineering, University of Michigan, Ann Arbor, USA.

Publication

BMC Bioinformatics. 2006 Oct 16;7:456. doi: 10.1186/1471-2105-7-456.

Abstract

BACKGROUND

Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. As the number of new protein structures grows rapidly, automated and accurate methods for protein classification are increasingly needed.

RESULTS

In this paper we present a unified framework for protein structure classification and identification of novel protein structures. The framework consists of a set of components for comparing, classifying, and clustering protein structures. These components allow us to accurately classify proteins into known folds, to detect new protein folds, and to cluster the new folds. In our evaluation with SCOP 1.69, our method correctly classifies 86.0%, 87.7%, and 90.5% of new domains at the family, superfamily, and fold levels, respectively. Furthermore, for protein domains that belong to new domain families, our method produces clusters that closely correspond to the new families in SCOP 1.69. As a result, our method can also be used to suggest new classification groups that contain novel folds.
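
The three steps named above (classify into a known group, flag what does not fit, cluster the flagged items) can be illustrated with a minimal sketch. This is not the proCC implementation: the abstract does not specify the comparison method, so the sketch assumes each domain has already been reduced to a numeric feature vector and uses plain Euclidean distance as a stand-in. The function names, the toy data, and the distance threshold are all hypothetical.

```python
from math import dist


def classify_or_flag(query, labeled, max_distance):
    """Return the label of the nearest labeled vector, or None if it is too far.

    `labeled` is a list of (label, vector) pairs. A None result marks the
    query as a candidate member of a group not yet in the database.
    """
    label, vec = min(labeled, key=lambda item: dist(query, item[1]))
    return label if dist(query, vec) <= max_distance else None


def cluster_single_linkage(vectors, max_distance):
    """Merge vectors transitively whenever a pair lies within max_distance.

    Returns clusters as sorted lists of indices into `vectors`.
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if dist(vectors[i], vectors[j]) <= max_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: two known groups and four query domains (hypothetical values).
known = [("fold_a", (0.0, 0.0)), ("fold_b", (5.0, 5.0))]
queries = [(0.2, 0.1), (9.0, 0.0), (9.1, 0.2), (0.0, 9.0)]

labels = [classify_or_flag(q, known, 1.0) for q in queries]
novel = [q for q, lab in zip(queries, labels) if lab is None]
clusters = cluster_single_linkage(novel, 1.0)
```

Here the first query lands in `fold_a`, while the other three are flagged as unclassified and then grouped into two candidate clusters. A real system would replace the distance function with a structural comparison score and tune the threshold against a reference classification such as SCOP.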

CONCLUSION

We have developed a method called proCC for automatically classifying and clustering domains. The method is effective in classifying new domains and suggesting new domain families, and it is also very efficient. A web site offering access to proCC is freely available at http://www.eecs.umich.edu/periscope/procc.
