Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis.

Authors

Csaba Gergely, Birzele Fabian, Zimmer Ralf

Affiliation

Department of Informatics, Ludwig-Maximilians-Universität München, Munich, Germany.

Publication

BMC Struct Biol. 2009 Apr 17;9:23. doi: 10.1186/1472-6807-9-23.

Abstract

BACKGROUND

SCOP and CATH are widely used as gold standards to benchmark novel protein structure comparison methods and to train machine learning approaches for protein structure classification and prediction. The two hierarchies are built with different protocols, which may lead to differing classifications of the same protein. Ignoring such differences causes problems when the hierarchies are used to train or benchmark automatic structure classification methods. Here, we propose a method to compare SCOP and CATH in detail and discuss possible applications of this analysis.

RESULTS

We create a new mapping between SCOP and CATH and define a consistent benchmark set, which is shown to largely reduce errors made by structure comparison methods such as TM-Align and has further applications, e.g. for training machine learning methods for protein structure classification. In addition, we extract further connections in the topology of the protein fold space from the orthogonal features contained in SCOP and CATH.
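
To make the idea of a domain mapping and a consistent benchmark set more concrete, the following is a minimal sketch under stated assumptions, not the authors' protocol. It pairs each SCOP domain with the CATH domain it overlaps best, then sorts pairs of mapped domains into those on which both hierarchies agree (similar in both, or dissimilar in both) and those on which they disagree. The toy domains, the 0.8 overlap threshold, the choice of SCOP fold versus CATH topology as the compared levels, and the names `map_domains` and `classify_pairs` are all hypothetical.

```python
from itertools import combinations


def residues(chain, start, end):
    """Build a set of (chain, residue number) pairs for a contiguous segment."""
    return {(chain, i) for i in range(start, end + 1)}


# Toy data (made up): domain id -> (residue set, class label).
# Labels imitate a SCOP fold ("b.1") and a CATH topology ("2.60.40").
scop = {
    "s1": (residues("1abcA", 1, 100), "b.1"),
    "s2": (residues("1abcA", 101, 200), "c.2"),
    "s3": (residues("2xyzA", 1, 90), "b.1"),
    "s4": (residues("3pqrA", 1, 150), "c.2"),
    "s5": (residues("4defA", 1, 80), "b.1"),
}
cath = {
    "c1": (residues("1abcA", 1, 98), "2.60.40"),
    "c2": (residues("1abcA", 99, 200), "3.40.50"),
    "c3": (residues("2xyzA", 1, 90), "2.60.40"),
    "c4": (residues("3pqrA", 1, 70), "3.30.70"),    # CATH splits this chain in two
    "c5": (residues("3pqrA", 71, 150), "3.40.50"),
    "c6": (residues("4defA", 1, 80), "2.40.50"),
}


def overlap(a, b):
    """Shared residues relative to the larger of the two domains."""
    return len(a & b) / max(len(a), len(b))


def map_domains(scop, cath, min_overlap=0.8):
    """Map each SCOP domain to its best-overlapping CATH domain.

    Domains whose best overlap is below min_overlap (an assumed cutoff)
    are left unmapped, since their domain definitions differ too much.
    """
    mapping = {}
    for s_id, (s_res, _) in scop.items():
        c_id = max(cath, key=lambda c: overlap(s_res, cath[c][0]))
        if overlap(s_res, cath[c_id][0]) >= min_overlap:
            mapping[s_id] = c_id
    return mapping


def classify_pairs(mapping, scop, cath):
    """Split pairs of mapped domains by whether SCOP and CATH agree on them."""
    similar, dissimilar, inconsistent = [], [], []
    for a, b in combinations(sorted(mapping), 2):
        same_scop = scop[a][1] == scop[b][1]
        same_cath = cath[mapping[a]][1] == cath[mapping[b]][1]
        if same_scop and same_cath:
            similar.append((a, b))        # both hierarchies group the pair
        elif not same_scop and not same_cath:
            dissimilar.append((a, b))     # both hierarchies separate the pair
        else:
            inconsistent.append((a, b))   # the hierarchies disagree
    return similar, dissimilar, inconsistent


mapping = map_domains(scop, cath)
similar, dissimilar, inconsistent = classify_pairs(mapping, scop, cath)
print(mapping)
print(similar, dissimilar, inconsistent)
```

In this toy setting, a benchmark restricted to the `similar` and `dissimilar` pairs would never penalize a structure comparison method for siding with one hierarchy against the other, which is the intuition behind using a consistent subset.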

CONCLUSION

Via an all-to-all comparison, we find that there are large and unexpected differences between SCOP and CATH with respect to their domain definitions as well as their hierarchical partitioning of the fold space at every level of the two classifications. A consistent mapping of SCOP and CATH can be exploited for automated structure comparison and classification.

AVAILABILITY

Benchmark sets and an interactive SCOP-CATH browser are available at http://www.bio.ifi.lmu.de/SCOPCath.
