Machine learning-assisted directed protein evolution with combinatorial libraries.

Affiliations

Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125.

Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125.

Publication information

Proc Natl Acad Sci U S A. 2019 Apr 30;116(18):8852-8858. doi: 10.1073/pnas.1901979116. Epub 2019 Apr 12.

DOI: 10.1073/pnas.1901979116
PMID: 30979809
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6500146/
Abstract

To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning into the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine-learning models trained on tested variants provide a fast method for testing sequence space computationally. We validated this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (i.e., stereodivergence) of a new-to-nature carbene Si-H insertion reaction. The approach predicted libraries enriched in functional enzymes and fixed seven mutations in two rounds of evolution to identify variants for selective catalysis with 93% and 79% ee (enantiomeric excess). By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.
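The loop the abstract describes (measure a subset of a combinatorial library, train a model on those measurements, then rank the untested variants computationally) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the four-letter alphabet, the three mutated positions, the `true_fitness` toy landscape standing in for a wet-lab assay, and the simple site-wise additive model, which is swapped in here in place of whatever models the authors actually used.

```python
import itertools
import random

ALPHABET = "ACDE"   # hypothetical reduced alphabet (a real site allows 20 residues)
N_POSITIONS = 3     # hypothetical number of simultaneously mutated positions

# Hypothetical per-position effects standing in for a real assay.
SITE_EFFECTS = [
    {"A": 0.0, "C": 0.5, "D": 1.0, "E": 0.2},
    {"A": 0.3, "C": 0.0, "D": 0.1, "E": 0.9},
    {"A": 0.0, "C": 0.8, "D": 0.4, "E": 0.1},
]


def true_fitness(variant):
    """Toy 'experiment': additive effects plus one epistatic bonus."""
    score = sum(SITE_EFFECTS[i][aa] for i, aa in enumerate(variant))
    if variant[0] == "D" and variant[2] == "C":
        score += 0.5
    return score


def fit_additive_model(tested):
    """Per position and residue, estimate the mean fitness offset from the global mean."""
    global_mean = sum(tested.values()) / len(tested)
    offsets = []
    for pos in range(N_POSITIONS):
        site = {}
        for aa in ALPHABET:
            scores = [f for v, f in tested.items() if v[pos] == aa]
            # Residues never observed at this position get a neutral offset.
            site[aa] = (sum(scores) / len(scores) - global_mean) if scores else 0.0
        offsets.append(site)
    return global_mean, offsets


def predict(model, variant):
    """Predicted fitness: global mean plus the summed per-site offsets."""
    global_mean, offsets = model
    return global_mean + sum(offsets[i][aa] for i, aa in enumerate(variant))


def propose_next_round(tested, library, k):
    """Rank untested variants by predicted fitness and return the top k."""
    model = fit_additive_model(tested)
    untested = [v for v in library if v not in tested]
    return sorted(untested, key=lambda v: predict(model, v), reverse=True)[:k]


if __name__ == "__main__":
    library = ["".join(v) for v in itertools.product(ALPHABET, repeat=N_POSITIONS)]
    rng = random.Random(0)
    # "Screen" a random quarter of the library, then ask the model what to test next.
    tested = {v: true_fitness(v) for v in rng.sample(library, 16)}
    for v in propose_next_round(tested, library, k=5):
        print(v, round(true_fitness(v), 2))
```

An additive model cannot represent the epistatic bonus in the toy landscape directly; a more expressive regressor would be the natural next step, at the cost of needing more training data.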
