Sequence-Similar Protein Domain Pairs With Structural or Topological Dissimilarity.

Author information

Røgen Peter

Affiliation

Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark.

Publication information

Proteins. 2025 Mar;93(3):588-597. doi: 10.1002/prot.26753. Epub 2024 Oct 11.

Abstract

For a variety of applications, protein structures are clustered by sequence similarity, and sequence-redundant structures are disregarded. Sequence-similar chains are likely to have similar structures, but significant structural variation, as measured by RMSD, has been documented for sequence-similar chains and usually found to have a functional explanation. Moving two neighboring stretches of backbone through each other may change the chain topology and alter the possible folding paths. The size of this motion is comparable to the variation in a flexible loop. We search for and find domains with alternate chain topology in CATH4.2 sequence families, relatively independently of sequence identity and of structural similarity as measured by RMSD. Structural, topological, and functional representative sets should therefore keep sequence-similar domains not just with structural variation but also with topological variation. We present BCAlign, which finds an Alignment and superposition of protein Backbone Curves by optimizing a user-chosen convex combination of structural deviation and deviation between the structure-based sequence alignment and an input sequence alignment. Steric and topological obstructions to deforming a curve into an aligned curve are then found by a previously developed algorithm. For highly sequence-similar domains, sequence-based structural alignment better represents the chain's motion and generally reveals larger structural and topological variation than structure-based alignment does. Fold-switching protein pairs have been reported to be most frequent between X-ray and NMR structures and are estimated to be underrepresented in the PDB, as the alternate configuration is harder to resolve. Here we similarly find chain topology most frequently altered between X-ray and NMR structures.
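
The convex-combination objective mentioned in the abstract can be illustrated with a minimal sketch. This is not BCAlign's implementation: the function names, the choice of RMSD as the structural term, and the "fraction of reference pairs not reproduced" alignment term are all assumptions made here for illustration, and the coordinates are assumed to be superposed already.

```python
import math

def rmsd(coords_a, coords_b, pairs):
    """RMSD over aligned residue pairs (i in chain A, j in chain B).

    Assumes the two coordinate sets are already superposed.
    """
    if not pairs:
        raise ValueError("alignment has no aligned pairs")
    total = 0.0
    for i, j in pairs:
        total += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[j]))
    return math.sqrt(total / len(pairs))

def alignment_deviation(pairs, reference_pairs):
    """Hypothetical alignment term: fraction of reference (input sequence
    alignment) pairs that the structure-based alignment does not reproduce."""
    reference = set(reference_pairs)
    if not reference:
        return 0.0
    return len(reference - set(pairs)) / len(reference)

def combined_score(coords_a, coords_b, pairs, reference_pairs, weight):
    """Convex combination: weight * structural term + (1 - weight) * alignment term."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    structural = rmsd(coords_a, coords_b, pairs)
    sequence = alignment_deviation(pairs, reference_pairs)
    return weight * structural + (1.0 - weight) * sequence

# Toy backbone traces: three points per chain, differing only at the last point.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 3.0)]
sequence_alignment = [(0, 0), (1, 1), (2, 2)]   # input sequence alignment
shifted_alignment = [(0, 1), (1, 2)]            # a competing structure-based alignment

score_same = combined_score(chain_a, chain_b, sequence_alignment, sequence_alignment, 0.5)
score_shifted = combined_score(chain_a, chain_b, shifted_alignment, sequence_alignment, 0.5)
```

With `weight = 1.0` the score depends only on the structural term, and with `weight = 0.0` only on agreement with the input sequence alignment. In this toy version the two terms have different units (a distance and a fraction), so a real objective would need to put them on a comparable scale.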

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e63a/11809131/5a3bf541f65b/PROT-93-588-g007.jpg
