
Deep generative models of protein structure uncover distant relationships across a continuous fold space.

Affiliations

School of Data Science, University of Virginia, Charlottesville, VA, USA.

Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA.

Publication information

Nat Commun. 2024 Sep 16;15(1):8094. doi: 10.1038/s41467-024-52020-2.

DOI:10.1038/s41467-024-52020-2
PMID:39294145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11410806/
Abstract

Our views of fold space implicitly rest upon many assumptions that impact how we analyze, interpret and understand protein structure, function and evolution. For instance, is there an optimal granularity in viewing protein structural similarities (e.g., architecture, topology or some other level)? Similarly, the discrete/continuous dichotomy of fold space is central, but remains unresolved. Discrete views of fold space bin similar folds into distinct, non-overlapping groups; unfortunately, such binning can miss remote relationships. While hierarchical systems like CATH are indispensable resources, less heuristic and more conceptually flexible approaches could enable more nuanced explorations of fold space. Building upon an Urfold model of protein structure, here we present a deep generative modeling framework, termed DeepUrfold, for analyzing protein relationships at scale. DeepUrfold's learned embeddings occupy high-dimensional latent spaces that can be distilled for a given protein in terms of an amalgamated representation uniting sequence, structure and biophysical properties. This approach is structure-guided, versus being purely structure-based, and DeepUrfold learns representations that, in a sense, define superfamilies. Deploying DeepUrfold with CATH reveals evolutionarily-remote relationships that evade existing methodologies, and suggests a mostly-continuous view of fold space, a view that extends beyond simple geometric similarity, towards the realm of integrated sequence ↔ structure ↔ function properties.
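
The abstract's remark that discrete binning "can miss remote relationships" can be made concrete with a toy sketch. The following is not DeepUrfold and uses no data from the paper: the one-dimensional coordinates, the domain names A to E, the bin width, and the helper names `bin_label` and `missed_pairs` are all hypothetical, chosen only to show how two items that sit close together on a continuous axis can fall on opposite sides of a bin boundary.

```python
# Toy illustration only: the coordinates below are hypothetical and are
# NOT DeepUrfold embeddings or CATH data.
from itertools import combinations

# Assumed positions of five domains along one continuous structural axis.
coords = {"A": 0.10, "B": 0.45, "C": 0.55, "D": 0.90, "E": 0.95}
BIN_WIDTH = 0.5  # assumed width of each discrete, non-overlapping bin


def bin_label(x: float) -> int:
    """Assign a coordinate to a discrete bin (the 'discrete view')."""
    return int(x // BIN_WIDTH)


def missed_pairs(coords: dict, max_dist: float) -> list:
    """Return pairs closer than max_dist that land in different bins."""
    missed = []
    for (a, xa), (b, xb) in combinations(sorted(coords.items()), 2):
        if abs(xa - xb) <= max_dist and bin_label(xa) != bin_label(xb):
            missed.append((a, b))
    return missed


labels = {name: bin_label(x) for name, x in coords.items()}
print(labels)                                # bin assigned to each domain
print(missed_pairs(coords, max_dist=0.15))   # close pairs split by binning
```

In this toy setup, B and C are near neighbours on the continuous axis yet receive different bin labels, which is the kind of relationship a strictly discrete grouping would not report.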

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3750/11410806/6dd0268fa900/41467_2024_52020_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3750/11410806/b64b9fd970be/41467_2024_52020_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3750/11410806/61b9a5e9bc78/41467_2024_52020_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3750/11410806/a641e6542c62/41467_2024_52020_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3750/11410806/eb77494bdef9/41467_2024_52020_Fig5_HTML.jpg

Similar articles

1
Deep generative models of protein structure uncover distant relationships across a continuous fold space.
Nat Commun. 2024 Sep 16;15(1):8094. doi: 10.1038/s41467-024-52020-2.
2
Protein structure comparison: implications for the nature of 'fold space', and structure and function prediction.
Curr Opin Struct Biol. 2006 Jun;16(3):393-8. doi: 10.1016/j.sbi.2006.04.007. Epub 2006 May 4.
3
The Urfold: Structural similarity just above the superfold level?
Protein Sci. 2019 Dec;28(12):2119-2126. doi: 10.1002/pro.3742. Epub 2019 Nov 6.
4
Cross-over between discrete and continuous protein structure space: insights into automatic classification and networks of protein structures.
PLoS Comput Biol. 2009 Mar;5(3):e1000331. doi: 10.1371/journal.pcbi.1000331. Epub 2009 Mar 27.
5
The CATH hierarchy revisited-structural divergence in domain superfamilies and the continuity of fold space.
Structure. 2009 Aug 12;17(8):1051-62. doi: 10.1016/j.str.2009.06.015.
6
Impact of structure space continuity on protein fold classification.
Sci Rep. 2016 Mar 23;6:23263. doi: 10.1038/srep23263.
7
Automatic classification of protein structures using low-dimensional structure space mappings.
BMC Bioinformatics. 2014;15 Suppl 2(Suppl 2):S1. doi: 10.1186/1471-2105-15-S2-S1. Epub 2014 Jan 24.
8
Detecting evolutionary relationships across existing fold space, using sequence order-independent profile-profile alignments.
Proc Natl Acad Sci U S A. 2008 Apr 8;105(14):5441-6. doi: 10.1073/pnas.0704422105. Epub 2008 Apr 2.
9
A galaxy of folds.
Protein Sci. 2010 Jan;19(1):124-30. doi: 10.1002/pro.297.
10
A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary.
Protein Sci. 2003 Oct;12(10):2150-60. doi: 10.1110/ps.0306803.

Cited by

1
Impact of local unfolding fluctuations on the evolution of regional sequence preferences in proteins.
Protein Sci. 2025 Mar;34(3):e70015. doi: 10.1002/pro.70015.
2
Prop3D: A flexible, Python-based platform for machine learning with protein structural properties and biophysical data.
BMC Bioinformatics. 2024 Jan 4;25(1):11. doi: 10.1186/s12859-023-05586-5.
3
How AlphaFold2 shaped the structural coverage of the human transmembrane proteome.
Sci Rep. 2023 Nov 20;13(1):20283. doi: 10.1038/s41598-023-47204-7.

References

1
Prop3D: A flexible, Python-based platform for machine learning with protein structural properties and biophysical data.
BMC Bioinformatics. 2024 Jan 4;25(1):11. doi: 10.1186/s12859-023-05586-5.
2
Progress at protein structure prediction, as seen in CASP15.
Curr Opin Struct Biol. 2023 Jun;80:102594. doi: 10.1016/j.sbi.2023.102594. Epub 2023 Apr 14.
3
Critical Assessment of Methods for Predicting the 3D Structure of Proteins and Protein Complexes.
Annu Rev Biophys. 2023 May 9;52:183-206. doi: 10.1146/annurev-biophys-102622-084607. Epub 2023 Jan 10.
4
Creative destruction: New protein folds from old.
Proc Natl Acad Sci U S A. 2022 Dec 27;119(52):e2207897119. doi: 10.1073/pnas.2207897119. Epub 2022 Dec 19.
5
The curse of the protein ribbon diagram.
PLoS Biol. 2022 Dec 12;20(12):e3001901. doi: 10.1371/journal.pbio.3001901. eCollection 2022 Dec.
6
Quantifying structural relationships of metal-binding sites suggests origins of biological electron transfer.
Sci Adv. 2022 Jan 14;8(2):eabj3984. doi: 10.1126/sciadv.abj3984.
7
A guide to machine learning for biologists.
Nat Rev Mol Cell Biol. 2022 Jan;23(1):40-55. doi: 10.1038/s41580-021-00407-0. Epub 2021 Sep 13.
8
Fold Evolution before LUCA: Common Ancestry of SH3 Domains and OB Domains.
Mol Biol Evol. 2021 Oct 27;38(11):5134-5143. doi: 10.1093/molbev/msab240.
9
Learning the protein language: Evolution, structure, and function.
Cell Syst. 2021 Jun 16;12(6):654-669.e3. doi: 10.1016/j.cels.2021.05.017.
10
Protein design and variant prediction using autoregressive generative models.
Nat Commun. 2021 Apr 23;12(1):2403. doi: 10.1038/s41467-021-22732-w.