Bayesian optimization with evolutionary and structure-based regularization for directed protein evolution.

Authors

Frisby Trevor S, Langmead Christopher James

Affiliation

Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA.

Publication

Algorithms Mol Biol. 2021 Jul 1;16(1):13. doi: 10.1186/s13015-021-00195-4.

DOI:10.1186/s13015-021-00195-4
PMID:34210336
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8246133/
Abstract

BACKGROUND

Directed evolution (DE) is a protein engineering technique that uses iterative rounds of mutagenesis and screening to search for sequences that optimize a given property, such as binding affinity to a specified target. Unfortunately, the underlying optimization problem is under-determined, so mutations introduced to improve the specified property may come at the expense of unmeasured but nevertheless important properties (e.g., solubility and thermostability). We address this issue by formulating DE as a regularized Bayesian optimization problem in which the regularization term reflects evolutionary or structure-based constraints.
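
One plausible way to write such a regularized selection rule is sketched below. The abstract does not give the formula, so every symbol here is an assumption made for illustration: $a(x \mid \mathcal{D}_t)$ is a standard acquisition function computed from the surrogate model fitted to the data $\mathcal{D}_t$ collected so far, $R(x)$ is a score that is higher for sequences that look more plausible on evolutionary or structural grounds, and $\lambda \ge 0$ sets the strength of the regularization. The paper's actual formulation may differ.

```latex
x_{t+1} \;=\; \arg\max_{x \in \mathcal{X}}
  \Big[\, a\big(x \mid \mathcal{D}_t\big) \;+\; \lambda\, R(x) \,\Big]
```

Setting $\lambda = 0$ recovers ordinary, unregularized Bayesian optimization over the sequence space $\mathcal{X}$.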

RESULTS

We applied our approach to three representative proteins (GB1, BRCA1, and SARS-CoV-2 Spike) and evaluated both evolutionary and structure-based regularization terms. The results of these experiments demonstrate that: (i) structure-based regularization usually leads to better designs than the unregularized setting, and never hurts; (ii) evolutionary-based regularization tends to be least effective; and (iii) regularization leads to better designs because it focuses the search on certain areas of sequence space, making better use of the experimental budget. Additionally, like previous work in machine learning-assisted DE, we find that our approach significantly reduces the experimental burden of DE relative to model-free methods.
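
To make the idea of a budget-limited, regularized search concrete, here is a minimal toy sketch in pure Python. It is not the authors' implementation. The 4-letter alphabet, the 3-residue sequences, `toy_fitness`, the Hamming-distance `regularizer`, the exponential Hamming kernel, and the parameters `beta` and `lam` are all hypothetical stand-ins chosen so the example runs on its own. Each round fits a small Gaussian process to the variants screened so far, then screens the untested variant with the highest upper confidence bound plus a weighted regularization score.

```python
import itertools
import math

ALPHABET = "ACDE"   # toy alphabet (assumption, not from the paper)
LENGTH = 3          # toy sequence length (assumption)
WILD_TYPE = "AAA"   # hypothetical starting sequence


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def toy_fitness(seq):
    """Stand-in for a wet-lab screen: matches to a hidden optimum 'DEA'."""
    matches = sum(a == b for a, b in zip(seq, "DEA"))
    bonus = 0.5 if seq[:2] == "DE" else 0.0  # simple epistatic term
    return matches + bonus


def regularizer(seq):
    """Stand-in for an evolutionary or structure-based score (higher is better)."""
    return -hamming(seq, WILD_TYPE)


def kernel(a, b, ell=1.0):
    return math.exp(-hamming(a, b) / ell)


def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution for L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_lower_transposed(L, y):
    """Back substitution for L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(X, y, noise=1e-3):
    """Fit a zero-mean GP; return a function giving (mean, std) at a sequence."""
    K = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y))

    def predict(x_new):
        k_star = [kernel(a, x_new) for a in X]
        mean = sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(kernel(x_new, x_new) - sum(t * t for t in v), 0.0)
        return mean, math.sqrt(var)

    return predict


def run_rounds(rounds=8, beta=2.0, lam=0.5):
    """Screen one new variant per round using a regularized UCB score."""
    candidates = ["".join(p) for p in itertools.product(ALPHABET, repeat=LENGTH)]
    X, y = [WILD_TYPE], [toy_fitness(WILD_TYPE)]
    for _ in range(rounds):
        predict = fit_gp(X, y)

        def score(seq):
            mean, std = predict(seq)
            return mean + beta * std + lam * regularizer(seq)

        best = max((s for s in candidates if s not in X), key=score)
        X.append(best)
        y.append(toy_fitness(best))
    return X, y


if __name__ == "__main__":
    tested, scores = run_rounds()
    for seq, fit in zip(tested, scores):
        print(seq, fit)
```

With `lam=0.0` the loop reduces to plain upper-confidence-bound selection. Raising `lam` keeps the selected variants closer to the wild type in this toy setting, which is one simple way a regularizer can concentrate a fixed screening budget on part of sequence space.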

CONCLUSION

Introducing regularization into a Bayesian ML-assisted DE framework alters the exploratory patterns of the underlying optimization routine, and can shift variant selections towards those with a range of targeted and desirable properties. In particular, we find that structure-based regularization often improves variant selection compared to unregularized approaches, and never hurts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/83ebafeeb96f/13015_2021_195_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/b7d89d5c53a0/13015_2021_195_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/8a4b551afb42/13015_2021_195_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/70f967092db8/13015_2021_195_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/cdf5b1e604e3/13015_2021_195_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/a7196d0339aa/13015_2021_195_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b23/8247255/af870693398f/13015_2021_195_Fig7_HTML.jpg
