Persistent spectral theory-guided protein engineering.

Authors

Qiu Yuchi, Wei Guo-Wei

Affiliations

Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA.

Department of Biochemistry and Molecular Biology, Michigan State University, MI, 48824, USA.

Publication information

Nat Comput Sci. 2023 Feb;3(2):149-163. doi: 10.1038/s43588-022-00394-y. Epub 2023 Feb 20.

DOI:10.1038/s43588-022-00394-y
PMID:37637776
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10456983/
Abstract

While protein engineering, which iteratively optimizes protein fitness by screening the gigantic mutational space, is constrained by experimental capacity, various machine learning models have substantially expedited protein engineering. Three-dimensional protein structures promise further advantages, but their intricate geometric complexity hinders their applications in deep mutational screening. Persistent homology, an established algebraic topology tool for protein structural complexity reduction, fails to capture the homotopic shape evolution during the filtration of a given data. This work introduces a Topology-offered protein Fitness (TopFit) framework to complement protein sequence and structure embeddings. Equipped with an ensemble regression strategy, TopFit integrates the persistent spectral theory, a new topological Laplacian, and two auxiliary sequence embeddings to capture mutation-induced topological invariant, shape evolution, and sequence disparity in the protein fitness landscape. The performance of TopFit is assessed by 34 benchmark datasets with 128,634 variants, involving a vast variety of protein structure acquisition modalities and training set size variations.
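
The contrast the abstract draws between a topological invariant and shape evolution can be illustrated with a small sketch. This is not the TopFit implementation. It is a toy example that assumes the simplest setting: the 0-th combinatorial (graph) Laplacian of a distance-thresholded graph, evaluated at single filtration radii rather than over pairs of radii as a full persistent Laplacian would be. The function names, the tolerance, and the four-point "square" data set are all hypothetical. The number of zero eigenvalues gives the Betti-0 number (connected components), which is what persistent homology tracks in dimension 0, while the smallest nonzero eigenvalue keeps changing as the filtration adds edges.

```python
import numpy as np


def graph_laplacian(points: np.ndarray, radius: float) -> np.ndarray:
    """0-th combinatorial Laplacian (degree minus adjacency) of the graph
    that links every pair of points at distance <= radius."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adjacency = dist <= radius
    np.fill_diagonal(adjacency, False)  # no self-loops
    adjacency = adjacency.astype(float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def spectral_summary(points: np.ndarray, radius: float, tol: float = 1e-8):
    """Return (Betti-0, smallest nonzero eigenvalue) at one filtration radius.

    Betti-0 is the multiplicity of the zero eigenvalue (harmonic spectrum);
    the smallest nonzero eigenvalue is a non-harmonic, shape-sensitive feature.
    """
    eigvals = np.linalg.eigvalsh(graph_laplacian(points, radius))
    betti0 = int(np.sum(eigvals < tol))
    nonzero = eigvals[eigvals >= tol]
    lambda_min = float(nonzero.min()) if nonzero.size else 0.0
    return betti0, lambda_min


# Hypothetical toy data: the four corners of a unit square.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

summaries = {r: spectral_summary(square, r) for r in (0.5, 1.0, 1.5)}
for r, (b0, lam) in summaries.items():
    print(f"radius={r}: betti0={b0}, smallest nonzero eigenvalue={lam:.3f}")
```

On this toy input the graph is a 4-cycle at radius 1.0 and a complete graph at radius 1.5. Betti-0 equals 1 in both cases, so the dimension-0 invariant cannot tell them apart, whereas the smallest nonzero eigenvalue moves from 2 to 4. Features of this non-harmonic kind are what the abstract refers to as capturing shape evolution during the filtration.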

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/dfb1c4aaa011/nihms-1865717-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/3864bd02e87c/nihms-1865717-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/b8d569b449c7/nihms-1865717-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/c68a53d626c4/nihms-1865717-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/178efa2fca2e/nihms-1865717-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/e9af4e1ac149/nihms-1865717-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/c1a7915f2761/nihms-1865717-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/e37b2b67b07c/nihms-1865717-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/eb91c4660fe0/nihms-1865717-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/3cecf7016b86/nihms-1865717-f0014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/3ae9ef7d5ef8/nihms-1865717-f0015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/1d196bd242c3/nihms-1865717-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/ecfa4fc03a0d/nihms-1865717-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/7501418e41fc/nihms-1865717-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47f7/10456983/6d28b7759081/nihms-1865717-f0004.jpg

Similar articles

1
Persistent spectral theory-guided protein engineering.
Nat Comput Sci. 2023 Feb;3(2):149-163. doi: 10.1038/s43588-022-00394-y. Epub 2023 Feb 20.
2
PERSISTENT PATH LAPLACIAN.
Found Data Sci. 2023 Mar;5(1):26-55. doi: 10.3934/fods.2022015.
3
Topological deep learning based deep mutational scanning.
Comput Biol Med. 2023 Sep;164:107258. doi: 10.1016/j.compbiomed.2023.107258. Epub 2023 Jul 17.
4
Persistent spectral graph.
Int J Numer Method Biomed Eng. 2020 Sep;36(9):e3376. doi: 10.1002/cnm.3376. Epub 2020 Aug 17.
5
HERMES: PERSISTENT SPECTRAL GRAPH SOFTWARE.
Found Data Sci. 2021 Mar;3(1):67-97. doi: 10.3934/fods.2021006.
6
TopoFormer: Multiscale Topology-enabled Structure-to-Sequence Transformer for Protein-Ligand Interaction Predictions.
Res Sq. 2024 Feb 9:rs.3.rs-3640878. doi: 10.21203/rs.3.rs-3640878/v1.
7
Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation.
Comput Biol Med. 2024 Feb;169:107918. doi: 10.1016/j.compbiomed.2024.107918. Epub 2024 Jan 3.
8
Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation.
ArXiv. 2023 Nov 2:arXiv:2310.18760v2.
9
Cluster learning-assisted directed evolution.
Nat Comput Sci. 2021 Dec;1(12):809-818. doi: 10.1038/s43588-021-00168-y. Epub 2021 Dec 9.
10
Persistent Cohomology for Data With Multicomponent Heterogeneous Information.
SIAM J Math Data Sci. 2020;2(2):396-418. doi: 10.1137/19m1272226. Epub 2020 May 19.

Cited by

1
Machine learning analysis of ARVC informed by sodium channel protein-based interactome networks.
Front Pharmacol. 2025 Jul 23;16:1611342. doi: 10.3389/fphar.2025.1611342. eCollection 2025.
2
A review of transformer models in drug discovery and beyond.
J Pharm Anal. 2025 Jun;15(6):101081. doi: 10.1016/j.jpha.2024.101081. Epub 2024 Aug 30.
3
Rapid response to fast viral evolution using AlphaFold 3-assisted topological deep learning.
Virus Evol. 2025 Apr 29;11(1):veaf026. doi: 10.1093/ve/veaf026. eCollection 2025.
4
A review of machine learning methods for imbalanced data challenges in chemistry.
Chem Sci. 2025 Apr 22;16(18):7637-7658. doi: 10.1039/d5sc00270b. eCollection 2025 May 7.
5
Position: Topological Deep Learning is the New Frontier for Relational Learning.
Proc Mach Learn Res. 2024 Jul;235:39529-39555.
6
Persistent Directed Flag Laplacian (PDFL)-Based Machine Learning for Protein-Ligand Binding Affinity Prediction.
J Chem Theory Comput. 2025 Apr 22;21(8):4276-4285. doi: 10.1021/acs.jctc.5c00074. Epub 2025 Apr 5.
7
Epitope mapping via in vitro deep mutational scanning methods and its applications.
J Biol Chem. 2025 Jan;301(1):108072. doi: 10.1016/j.jbc.2024.108072. Epub 2024 Dec 14.
8
PERSISTENT DIRAC OF PATHS ON DIGRAPHS AND HYPERGRAPHS.
Found Data Sci. 2024 Jun;6(2):124-153. doi: 10.3934/fods.2024001.
9
Rapid response to fast viral evolution using AlphaFold 3-assisted topological deep learning.
ArXiv. 2024 Nov 19:arXiv:2411.12370v1.
10
Persistent Mayer Dirac.
J Phys Complex. 2024 Dec 1;5(4):045005. doi: 10.1088/2632-072X/ad83a5. Epub 2024 Oct 17.

References

1
CLADE 2.0: Evolution-Driven Cluster Learning-Assisted Directed Evolution.
J Chem Inf Model. 2022 Oct 10;62(19):4629-4641. doi: 10.1021/acs.jcim.2c01046. Epub 2022 Sep 26.
2
Cluster learning-assisted directed evolution.
Nat Comput Sci. 2021 Dec;1(12):809-818. doi: 10.1038/s43588-021-00168-y. Epub 2021 Dec 9.
3
Learning protein fitness models from evolutionary and assay-labeled data.
Nat Biotechnol. 2022 Jul;40(7):1114-1122. doi: 10.1038/s41587-021-01146-5. Epub 2022 Jan 17.
4
Disease variant prediction with deep generative models of evolutionary data.
Nature. 2021 Nov;599(7883):91-95. doi: 10.1038/s41586-021-04043-8. Epub 2021 Oct 27.
5
ECNet is an evolutionary context-integrated deep learning framework for protein engineering.
Nat Commun. 2021 Sep 30;12(1):5743. doi: 10.1038/s41467-021-25976-8.
6
HERMES: PERSISTENT SPECTRAL GRAPH SOFTWARE.
Found Data Sci. 2021 Mar;3(1):67-97. doi: 10.3934/fods.2021006.
7
Informed training set design enables efficient machine learning-assisted directed protein evolution.
Cell Syst. 2021 Nov 17;12(11):1026-1045.e7. doi: 10.1016/j.cels.2021.07.008. Epub 2021 Aug 19.
8
Highly accurate protein structure prediction with AlphaFold.
Nature. 2021 Aug;596(7873):583-589. doi: 10.1038/s41586-021-03819-2. Epub 2021 Jul 15.
9
ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7112-7127. doi: 10.1109/TPAMI.2021.3095381. Epub 2022 Sep 14.
10
A topology-based network tree for the prediction of protein-protein binding affinity changes following mutation.
Nat Mach Intell. 2020;2(2):116-123. doi: 10.1038/s42256-020-0149-6. Epub 2020 Feb 14.