How are model protein structures distributed in sequence space?

Authors

Bornberg-Bauer E

Affiliation

Abteilung Theoretische Bioinformatik, Deutsches Krebsforschungszentrum, Heidelberg, Germany.

Publication information

Biophys J. 1997 Nov;73(5):2393-403. doi: 10.1016/S0006-3495(97)78268-7.

DOI: 10.1016/S0006-3495(97)78268-7
PMID: 9370433
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1181141/

Abstract

The sequence-to-structure map for all uniquely folding sequences of short hydrophobic polar (HP) model proteins on a square lattice is analyzed to investigate aspects considered relevant to evolution. By ranking structures by their frequencies, few very frequent and many rare structures are found. The distribution can be empirically described by a generalized Zipf's law. All structures are relatively compact, yet the most compact ones are rare. Most sequences folding to the same structure belong to "neutral nets." These graphs in sequence space are connected by point mutations and centered around prototype sequences, which tolerate the largest number (up to 55%) of neutral mutations. Profiles have been derived from these homologous sequences. Frequent structures conserve hydrophobic cores only while rare ones are sensitive to surface mutations as well. Shape space covering, i.e., the ability to transform any structure into most others with few point mutations, is very unlikely. It is concluded that many characteristic features of the sequence-to-structure map of real proteins, such as the dominance of few folds, can be explained by the simple HP model. In analogy to protein families, nets are dense and well separated in sequence space. Potential implications in better understanding the evolution of proteins and applications to improving database searches are discussed.
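
The analysis described in the abstract can be illustrated with a small brute-force sketch. This is not the author's code. It assumes the common HP convention of an energy of -1 for every contact between two H residues that are lattice neighbours but not chain neighbours, and it treats a sequence as uniquely folding when exactly one self-avoiding walk (counted up to rotation and reflection) reaches the minimum energy. All function names (`walks`, `contacts`, `native_structure`, `survey`, `neutral_fraction`) are hypothetical.

```python
from collections import Counter
from itertools import product

MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def walks(n):
    """Self-avoiding walks with n sites, one per rotation/reflection class."""
    out = []

    def grow(path, turned):
        if len(path) == n:
            out.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            if not turned and dy < 0:  # the first step off the x axis must go up
                continue
            nxt = (x + dx, y + dy)
            if nxt not in path:
                path.append(nxt)
                grow(path, turned or dy != 0)
                path.pop()

    grow([(0, 0), (1, 0)], False)  # the first bond is fixed along +x
    return out


def contacts(walk):
    """Pairs of residues that are lattice neighbours but not chain neighbours."""
    return [(i, j)
            for i in range(len(walk))
            for j in range(i + 2, len(walk))
            if abs(walk[i][0] - walk[j][0]) + abs(walk[i][1] - walk[j][1]) == 1]


def native_structure(seq, contact_lists):
    """Index of the unique lowest-energy conformation, or None if degenerate."""
    best, winners = 0, []
    for k, pairs in enumerate(contact_lists):
        energy = -sum(seq[i] == "H" and seq[j] == "H" for i, j in pairs)
        if energy < best:
            best, winners = energy, [k]
        elif energy == best:
            winners.append(k)
    return winners[0] if len(winners) == 1 else None


def survey(n):
    """Map every uniquely folding HP sequence of length n to its structure index."""
    contact_lists = [contacts(w) for w in walks(n)]
    fold = {}
    for letters in product("HP", repeat=n):
        seq = "".join(letters)
        k = native_structure(seq, contact_lists)
        if k is not None:
            fold[seq] = k
    return fold


def neutral_fraction(seq, fold):
    """Fraction of H<->P point mutations that keep the same unique structure."""
    mutants = (seq[:i] + ("P" if c == "H" else "H") + seq[i + 1:]
               for i, c in enumerate(seq))
    return sum(fold.get(m) == fold[seq] for m in mutants) / len(seq)


if __name__ == "__main__":
    fold = survey(10)  # chain length chosen only to keep the run short
    ranking = Counter(fold.values()).most_common()
    print("uniquely folding sequences:", len(fold))
    print("sequences per structure, most frequent first:", [c for _, c in ranking[:10]])
    prototype = max(fold, key=lambda s: neutral_fraction(s, fold))
    print("most mutation-tolerant sequence:", prototype, neutral_fraction(prototype, fold))
```

Ranking the counts in `ranking` gives the structure-frequency distribution that the abstract fits with a generalized Zipf's law, and `neutral_fraction` is a per-sequence proxy for the tolerance to neutral mutations reported for prototype sequences. The sketch does not build the neutral nets themselves, which would additionally require grouping sequences of one structure into components connected by point mutations.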

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff3b/1181141/b50ba0472440/biophysj00028-0156-a.jpg

Similar articles

1
How are model protein structures distributed in sequence space?
Biophys J. 1997 Nov;73(5):2393-403. doi: 10.1016/S0006-3495(97)78268-7.
2
From sequences to shapes and back: a case study in RNA secondary structures.
Proc Biol Sci. 1994 Mar 22;255(1344):279-84. doi: 10.1098/rspb.1994.0040.
3
Landscapes: complex optimization problems and biopolymer structures.
Comput Chem. 1994 Sep;18(3):295-324. doi: 10.1016/0097-8485(94)85025-9.
4
Sequence and structure space model of protein divergence driven by point mutations.
J Theor Biol. 2013 Aug 7;330:1-8. doi: 10.1016/j.jtbi.2013.03.015. Epub 2013 Mar 28.
5
Super folds, networks, and barriers.
Proteins. 2012 Feb;80(2):463-70. doi: 10.1002/prot.23212. Epub 2011 Nov 17.
6
Structure formation of biopolymers is complex, their evolution may be simple.
Pac Symp Biocomput. 1996:97-108.
7
How to search for RNA structures. Theoretical concepts in evolutionary biotechnology.
J Biotechnol. 1995 Jul 31;41(2-3):239-57. doi: 10.1016/0168-1656(94)00085-q.
8
Generic properties of combinatory maps: neutral networks of RNA secondary structures.
Bull Math Biol. 1997 Mar;59(2):339-97. doi: 10.1007/BF02462007.
9
Principal eigenvector of contact matrices and hydrophobicity profiles in proteins.
Proteins. 2005 Jan 1;58(1):22-30. doi: 10.1002/prot.20240.
10
Neutral networks in protein space: a computational study based on knowledge-based potentials of mean force.
Fold Des. 1997;2(5):261-9. doi: 10.1016/S1359-0278(97)00037-0.

Cited by

1
Analysis of proteins in the light of mutations.
Eur Biophys J. 2024 Aug;53(5-6):255-265. doi: 10.1007/s00249-024-01714-y. Epub 2024 Jul 2.
2
Unexplored regions of the protein sequence-structure map revealed at scale by a library of foldtuned language models.
bioRxiv. 2025 Jan 13:2023.12.22.573145. doi: 10.1101/2023.12.22.573145.
3
The Boltzmann distributions of molecular structures predict likely changes through random mutations.
Biophys J. 2023 Nov 21;122(22):4467-4475. doi: 10.1016/j.bpj.2023.10.024. Epub 2023 Oct 29.
4
The non-deterministic genotype-phenotype map of RNA secondary structure.
J R Soc Interface. 2023 Aug;20(205):20230132. doi: 10.1098/rsif.2023.0132. Epub 2023 Aug 23.
5
Bi-alignments with affine gaps costs.
Algorithms Mol Biol. 2022 May 16;17(1):10. doi: 10.1186/s13015-022-00219-7.
6
Ancestral sequences of a large promiscuous enzyme family correspond to bridges in sequence space in a network representation.
J R Soc Interface. 2021 Nov;18(184):20210389. doi: 10.1098/rsif.2021.0389. Epub 2021 Nov 3.
7
RNA aptamers for AMPA receptors.
Neuropharmacology. 2021 Nov 1;199:108761. doi: 10.1016/j.neuropharm.2021.108761. Epub 2021 Sep 9.
8
Quantifying the Mutational Robustness of Protein-Coding Genes.
J Mol Evol. 2021 Jul;89(6):357-369. doi: 10.1007/s00239-021-10009-1. Epub 2021 May 2.
9
About the Protein Space Vastness.
Protein J. 2020 Oct;39(5):472-475. doi: 10.1007/s10930-020-09939-4. Epub 2020 Nov 1.
10
Developing RNA aptamers for potential treatment of neurological diseases.
Future Med Chem. 2019 Mar;11(6):551-565. doi: 10.4155/fmc-2018-0364. Epub 2019 Mar 26.

References

1
Simulations of the folding of a globular protein.
Science. 1990 Nov 23;250(4984):1121-5. doi: 10.1126/science.250.4984.1121.
2
Minimum energy compact structures of random sequences of heteropolymers.
Phys Rev Lett. 1993 Oct 11;71(15):2505-2508. doi: 10.1103/PhysRevLett.71.2505.
3
Correlations in binary sequences and a generalized Zipf analysis.
Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 1995 Jul;52(1):446-452. doi: 10.1103/physreve.52.446.
4
RNA folding and combinatory landscapes.
Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 1993 Mar;47(3):2083-2099. doi: 10.1103/physreve.47.2083.
5
Mutation matrices and physical-chemical properties: correlations and implications.
Proteins. 1997 Mar;27(3):336-44. doi: 10.1002/(sici)1097-0134(199703)27:3<336::aid-prot2>3.0.co;2-b.
6
Coiled coils: new structures and new functions.
Trends Biochem Sci. 1996 Oct;21(10):375-82.
7
The Shannon information entropy of protein sequences.
Biophys J. 1996 Jul;71(1):148-55. doi: 10.1016/S0006-3495(96)79210-X.
8
Comparing folding codes for proteins and polymers.
Proteins. 1996 Mar;24(3):335-44. doi: 10.1002/(SICI)1097-0134(199603)24:3<335::AID-PROT6>3.0.CO;2-F.
9
Proline scanning mutagenesis of a molten globule reveals non-cooperative formation of a protein's overall topology.
Nat Struct Biol. 1996 Aug;3(8):682-7. doi: 10.1038/nsb0896-682.
10
Sequence space, folding and protein design.
Curr Opin Struct Biol. 1996 Feb;6(1):3-10. doi: 10.1016/s0959-440x(96)80088-1.