High fitness paths can connect proteins with low sequence overlap.

Authors

Kantroo Pranav, Wagner Günter P, Machta Benjamin B

Affiliations

Computational Biology and Bioinformatics Program, Yale University, New Haven, CT-06520, USA.

Quantitative Biology Institute, Yale University, New Haven, CT-06520, USA.

Publication information

ArXiv. 2024 Nov 13:arXiv:2411.09054v1.

PMID: 39606714
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11601789/
Abstract

The structure and function of a protein are determined by its amino acid sequence. While random mutations change a protein's sequence, evolutionary forces shape its structural fold and biological activity. Studies have shown that neutral networks can connect a local region of sequence space by single residue mutations that preserve viability. However, the larger-scale connectedness of protein morphospace remains poorly understood. Recent advances in artificial intelligence have enabled us to computationally predict a protein's structure and quantify its functional plausibility. Here we build on these tools to develop an algorithm that generates viable paths between distantly related extant protein pairs. The intermediate sequences in these paths differ by single residue changes over subsequent steps - substitutions, insertions and deletions are admissible moves. Their fitness is evaluated using the protein language model ESM2, and maintained as high as possible subject to the constraints of the traversal. We document the qualitative variation across paths generated between progressively divergent protein pairs, some of which do not even acquire the same structural fold. The ease of interpolating between two sequences could be used as a proxy for the likelihood of homology between them.
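
The path construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a greedy rule in which every step is a single-residue substitution, insertion, or deletion that moves one edit closer to the target, and among those moves the one with the highest fitness is taken. The function `toy_fitness` is a hypothetical placeholder (fraction of hydrophobic residues) standing in for an ESM2-based score, and all function names here are illustrative.

```python
from typing import Callable, List, Set

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumes the 20 standard residues


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def single_edits(seq: str) -> Set[str]:
    """All sequences reachable from seq by one single-residue change."""
    out = set()
    for i in range(len(seq)):
        out.add(seq[:i] + seq[i + 1:])                  # deletion
        for c in AMINO_ACIDS:
            out.add(seq[:i] + c + seq[i + 1:])          # substitution
    for i in range(len(seq) + 1):
        for c in AMINO_ACIDS:
            out.add(seq[:i] + c + seq[i:])              # insertion
    out.discard(seq)
    return out


def toy_fitness(seq: str) -> float:
    """Hypothetical placeholder score, NOT ESM2: hydrophobic fraction."""
    hydrophobic = set("AILMFVWY")
    return sum(c in hydrophobic for c in seq) / len(seq) if seq else 0.0


def greedy_path(start: str, end: str,
                fitness: Callable[[str], float]) -> List[str]:
    """Walk from start to end, one residue change per step.

    Only moves that reduce the edit distance to `end` by one are allowed
    (an assumed traversal constraint); the fittest such move is chosen.
    """
    path = [start]
    current = start
    while current != end:
        d = edit_distance(current, end)
        candidates = [s for s in single_edits(current)
                      if edit_distance(s, end) == d - 1]
        # Tie-break on the sequence itself so the result is deterministic.
        current = max(candidates, key=lambda s: (fitness(s), s))
        path.append(current)
    return path


if __name__ == "__main__":
    for step in greedy_path("MKTAY", "MRTY", toy_fitness):
        print(step, round(toy_fitness(step), 3))
```

Replacing `toy_fitness` with a score derived from a protein language model would bring the sketch closer to the setting of the abstract, although the traversal rule used in the paper may differ from the greedy choice assumed here.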

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6f8/11601789/836251145bc0/nihpp-2411.09054v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6f8/11601789/59b4b0f23ebd/nihpp-2411.09054v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6f8/11601789/7ba3b1abfad6/nihpp-2411.09054v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6f8/11601789/5920f829b717/nihpp-2411.09054v1-f0003.jpg

Similar articles

1
High fitness paths can connect proteins with low sequence overlap.
ArXiv. 2024 Nov 13:arXiv:2411.09054v1.
2
High fitness paths can connect proteins with low sequence overlap.
bioRxiv. 2024 Nov 15:2024.11.13.623265. doi: 10.1101/2024.11.13.623265.
3
A percolation theory analysis of continuous functional paths in protein sequence space affirms previous insights on the optimization of proteins for adaptability.
PLoS One. 2024 Dec 5;19(12):e0314929. doi: 10.1371/journal.pone.0314929. eCollection 2024.
4
Evolutionary paths that link orthogonal pairs of binding proteins.
Res Sq. 2023 Dec 13:rs.3.rs-2836905. doi: 10.21203/rs.3.rs-2836905/v2.
5
What's in a likelihood? Simple models of protein evolution and the contribution of structurally viable reconstructions to the likelihood.
Syst Biol. 2011 Mar;60(2):161-74. doi: 10.1093/sysbio/syq088. Epub 2011 Jan 12.
6
Computational prediction of the tolerance to amino-acid deletion in green-fluorescent protein.
PLoS One. 2017 Apr 3;12(4):e0164905. doi: 10.1371/journal.pone.0164905. eCollection 2017.
7
Rational evolutionary design: the theory of in vitro protein evolution.
Adv Protein Chem. 2000;55:79-160. doi: 10.1016/s0065-3233(01)55003-2.
8
Contingency and chance erase necessity in the experimental evolution of ancestral proteins.
Elife. 2021 Jun 1;10:e67336. doi: 10.7554/eLife.67336.
9
Insertions and deletions in the RNA sequence-structure map.
J R Soc Interface. 2021 Oct;18(183):20210380. doi: 10.1098/rsif.2021.0380. Epub 2021 Oct 6.
10
The Role of Evolutionary Selection in the Dynamics of Protein Structure Evolution.
Biophys J. 2017 Apr 11;112(7):1350-1365. doi: 10.1016/j.bpj.2017.02.029.
