On the evolutionary origins of "Fold Space Continuity": a study of topological convergence and divergence in mixed alpha-beta domains.

Affiliation

MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK.

Publication information

J Struct Biol. 2010 Dec;172(3):244-52. doi: 10.1016/j.jsb.2010.07.016. Epub 2010 Aug 5.

Abstract

Existing protein structure classifications group proteins by overall structural similarity at the highest level and by evolutionary relationships at the lowest level, deriving higher-level groups by pairwise structure comparison. For this approach to succeed, large changes in structure must be relatively rare in evolution, and proteins with no detectable evolutionary relationship must not converge on similar global chain conformations, since such convergence creates conflicts between structural and evolutionary consistency. Global structural changes were analysed using core topological descriptions for 4261 domains from classes C and D of the SCOP database, together with new measures of topological distance and consistency of classification. The analysis showed that the topological consistency of SCOP folds is highly variable: some folds have no consistent description, and there are significant overlaps between groups, including some members of separate folds with identical topological descriptions. Topological clustering shows that including sufficient indels to allow family members to be joined would also require joining several distinct folds. We conclude that evolutionary changes in the global topology of protein domains are the root cause of many difficulties for present approaches to structure classification using pairwise comparison. As a resolution, we propose that a purely structural classification should be created using an approach similar to that adopted by the Gene Ontology, in which proteins are assigned labels describing structure.
