Low-N protein engineering with data-efficient deep learning.

Affiliations

Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA.

Nabla Bio, Inc., Boston, MA, USA.

Publication information

Nat Methods. 2021 Apr;18(4):389-396. doi: 10.1038/s41592-021-01100-y. Epub 2021 Apr 7.

DOI: 10.1038/s41592-021-01100-y
PMID: 33828272

Abstract

Protein engineering has enormous academic and industrial potential. However, it is limited by the lack of experimental assays that are consistent with the design goal and sufficiently high throughput to find rare, enhanced variants. Here we introduce a machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution. As demonstrated in two dissimilar proteins, GFP from Aequorea victoria (avGFP) and E. coli strain TEM-1 β-lactamase, top candidates from a single round are diverse and as active as engineered mutants obtained from previous high-throughput efforts. By distilling information from natural protein sequence landscapes, our model learns a latent representation of 'unnaturalness', which helps to guide search away from nonfunctional sequence neighborhoods. Subsequent low-N supervision then identifies improvements to the activity of interest. In sum, our approach enables efficient use of resource-intensive high-fidelity assays without sacrificing throughput, and helps to accelerate engineered proteins into the fermenter, field and clinic.
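
The workflow the abstract outlines (fit a surrogate fitness model on a small set of assayed mutants, then search sequence space in silico) can be illustrated with a toy sketch. The abstract does not specify the model or the search procedure, so everything here is an assumption made for illustration: a per-position additive model stands in for the learned representation, a Metropolis-style sampler with a mutation budget (`trust_radius`) stands in for in silico directed evolution, and `toy_assay`, the sequences, and all parameter values are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fit_additive_model(seqs, scores, l2=0.1, lr=0.5, epochs=200):
    """Fit score ~ bias + sum of per-(position, residue) weights by gradient descent."""
    bias = sum(scores) / len(scores)
    weights = {}
    for _ in range(epochs):
        for seq, y in zip(seqs, scores):
            keys = list(enumerate(seq))
            err = bias + sum(weights.get(k, 0.0) for k in keys) - y
            step = lr / len(seq)  # scale the step so the update stays stable
            for k in keys:
                w = weights.get(k, 0.0)
                weights[k] = w - step * (err + l2 * w)
    return bias, weights


def predict(model, seq):
    """Predicted fitness of a sequence under the additive surrogate."""
    bias, weights = model
    return bias + sum(weights.get(k, 0.0) for k in enumerate(seq))


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def in_silico_evolution(model, wild_type, steps=2000, temperature=0.05,
                        trust_radius=5, rng=None):
    """Metropolis-style search that stays within trust_radius mutations of wild type."""
    rng = rng or random.Random(0)
    current, current_score = wild_type, predict(model, wild_type)
    best, best_score = current, current_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        if hamming(proposal, wild_type) > trust_radius:
            continue  # reject proposals that drift too far from wild type
        score = predict(model, proposal)
        accept = score >= current_score or rng.random() < math.exp(
            (score - current_score) / temperature)
        if accept:
            current, current_score = proposal, score
            if score > best_score:
                best, best_score = proposal, score
    return best, best_score


def toy_assay(seq, target):
    """Hypothetical stand-in for a wet-lab assay: fraction of positions matching a hidden target."""
    return sum(a == b for a, b in zip(seq, target)) / len(seq)


def random_mutant(wild_type, n_mut, rng):
    """Substitute random residues at n_mut distinct positions."""
    seq = list(wild_type)
    for pos in rng.sample(range(len(seq)), n_mut):
        seq[pos] = rng.choice(AMINO_ACIDS)
    return "".join(seq)


rng = random.Random(42)
wild_type = "MSKGEELFTG"      # hypothetical short sequence
hidden_target = "MSKGAELFTW"  # hypothetical optimum, known only to the toy assay
train_seqs = [random_mutant(wild_type, rng.randint(1, 3), rng) for _ in range(24)]
train_scores = [toy_assay(s, hidden_target) for s in train_seqs]

model = fit_additive_model(train_seqs, train_scores)
best, best_score = in_silico_evolution(model, wild_type, rng=rng)
print(best, round(best_score, 3))
```

With only 24 training sequences the surrogate mostly learns which substitutions are harmful, which is why the search is restricted to a small neighborhood of the wild type rather than trusted everywhere.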

Similar articles

1
Low-N protein engineering with data-efficient deep learning.
Nat Methods. 2021 Apr;18(4):389-396. doi: 10.1038/s41592-021-01100-y. Epub 2021 Apr 7.
2
ECNet is an evolutionary context-integrated deep learning framework for protein engineering.
Nat Commun. 2021 Sep 30;12(1):5743. doi: 10.1038/s41467-021-25976-8.
3
Machine-learning-guided directed evolution for protein engineering.
Nat Methods. 2019 Aug;16(8):687-694. doi: 10.1038/s41592-019-0496-6. Epub 2019 Jul 15.
4
Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering.
Nat Commun. 2024 Jul 29;15(1):6392. doi: 10.1038/s41467-024-50698-y.
5
Machine learning to navigate fitness landscapes for protein engineering.
Curr Opin Biotechnol. 2022 Jun;75:102713. doi: 10.1016/j.copbio.2022.102713. Epub 2022 Apr 9.
6
Enhancing efficiency of protein language models with minimal wet-lab data through few-shot learning.
Nat Commun. 2024 Jul 2;15(1):5566. doi: 10.1038/s41467-024-49798-6.
7
Accurate and efficient structure-based computational mutagenesis for modeling fluorescence levels of Aequorea victoria green fluorescent protein mutants.
Protein Eng Des Sel. 2020 Sep 14;33. doi: 10.1093/protein/gzaa022.
8
Engineering proteinase K using machine learning and synthetic genes.
BMC Biotechnol. 2007 Mar 26;7:16. doi: 10.1186/1472-6750-7-16.
9
Unified rational protein engineering with sequence-based deep representation learning.
Nat Methods. 2019 Dec;16(12):1315-1322. doi: 10.1038/s41592-019-0598-1. Epub 2019 Oct 21.
10
Trade-offs between enzyme fitness and solubility illuminated by deep mutational scanning.
Proc Natl Acad Sci U S A. 2017 Feb 28;114(9):2265-2270. doi: 10.1073/pnas.1614437114. Epub 2017 Feb 14.

Cited by

1
AI-guided Cas9 engineering provides an effective strategy to enhance base editing.
Mol Syst Biol. 2025 Sep 15. doi: 10.1038/s44320-025-00142-0.
2
Biophysics-based protein language models for protein engineering.
Nat Methods. 2025 Sep 11. doi: 10.1038/s41592-025-02776-2.
3
An iterative deep learning-guided algorithm for directed protein evolution.
iScience. 2025 Aug 7;28(9):113324. doi: 10.1016/j.isci.2025.113324. eCollection 2025 Sep 19.
4
Active learning-guided optimization of cell-free biosensors for lead testing in drinking water.
bioRxiv. 2025 Aug 22:2025.08.20.671382. doi: 10.1101/2025.08.20.671382.
5
Rational engineering of allosteric protein switches by in silico prediction of domain insertion sites.
Nat Methods. 2025 Aug;22(8):1698-1706. doi: 10.1038/s41592-025-02741-z. Epub 2025 Aug 4.
6
Mechanistic modeling or machine learning for detecting variants of concern: Why not both?
Proc Natl Acad Sci U S A. 2025 Jul 29;122(30):e2513608122. doi: 10.1073/pnas.2513608122. Epub 2025 Jul 21.
7
Investigating the determinants of performance in machine learning for protein fitness prediction.
Protein Sci. 2025 Aug;34(8):e70235. doi: 10.1002/pro.70235.
8
GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants.
bioRxiv. 2025 Jun 24:2025.06.17.660210. doi: 10.1101/2025.06.17.660210.
9
Advancing genetic engineering with active learning: theory, implementations and potential opportunities.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf286.
10
A Cyanobacterial Screening Platform for Rubisco Mutant Variants.
ACS Synth Biol. 2025 Jul 18;14(7):2619-2633. doi: 10.1021/acssynbio.5c00065. Epub 2025 Jul 7.

References

1
An evolution-based model for designing chorismate mutase enzymes.
Science. 2020 Jul 24;369(6502):440-445. doi: 10.1126/science.aba3304.