Machine learning-assisted directed protein evolution with combinatorial libraries.

Affiliations

Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125.

Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125.

Publication information

Proc Natl Acad Sci U S A. 2019 Apr 30;116(18):8852-8858. doi: 10.1073/pnas.1901979116. Epub 2019 Apr 12.

Abstract

To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning into the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine-learning models trained on tested variants provide a fast method for testing sequence space computationally. We validated this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (i.e., stereodivergence) of a new-to-nature carbene Si-H insertion reaction. The approach predicted libraries enriched in functional enzymes and fixed seven mutations in two rounds of evolution to identify variants for selective catalysis with 93% and 79% ee (enantiomeric excess). By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.
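
The workflow described in the abstract (measure a sample of a combinatorial library, train a model on the tested variants, then rank the untested sequence space in silico) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not specify the models used, so a simple per-site additive model stands in for them; `toy_fitness` is a synthetic function, not experimental data; and the choice of four mutated positions and the sample sizes are hypothetical.

```python
import itertools
import random
from collections import defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids
N_POSITIONS = 4  # hypothetical: four sites mutated simultaneously


def toy_fitness(variant):
    """Synthetic stand-in for a wet-lab measurement (not real data)."""
    # Additive per-site preference plus one pairwise (epistatic) bonus.
    score = sum((ALPHABET.index(aa) * (pos + 3)) % 7
                for pos, aa in enumerate(variant))
    if variant[0] == "W" and variant[2] == "L":
        score += 5
    return float(score)


def fit_additive_model(measured):
    """Estimate the mean measured fitness for each (position, residue) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for variant, fitness in measured.items():
        for pos, aa in enumerate(variant):
            sums[(pos, aa)] += fitness
            counts[(pos, aa)] += 1
    overall = sum(measured.values()) / len(measured)
    # Pairs never observed in the sample fall back to the overall mean.
    return {
        (pos, aa): (sums[(pos, aa)] / counts[(pos, aa)]
                    if counts[(pos, aa)] else overall)
        for pos in range(N_POSITIONS)
        for aa in ALPHABET
    }


def predict(model, variant):
    """Score a variant as the sum of its per-site estimates."""
    return sum(model[(pos, aa)] for pos, aa in enumerate(variant))


def propose_next_round(measured, n_propose):
    """Rank every untested variant in silico and return the top candidates."""
    model = fit_additive_model(measured)
    candidates = ("".join(v)
                  for v in itertools.product(ALPHABET, repeat=N_POSITIONS))
    untested = [v for v in candidates if v not in measured]
    untested.sort(key=lambda v: predict(model, v), reverse=True)
    return untested[:n_propose]


# Round one: "measure" a random sample of the 20**4 = 160,000 variants.
rng = random.Random(0)
all_variants = ["".join(v)
                for v in itertools.product(ALPHABET, repeat=N_POSITIONS)]
round_one = {v: toy_fitness(v) for v in rng.sample(all_variants, 500)}

# Round two: the model proposes an enriched library to test next.
round_two = propose_next_round(round_one, 100)
```

In a real campaign the proposed variants would be synthesized and assayed, and the new measurements added to the training set before the next round. A purely additive model cannot capture interactions between sites, so a more expressive regressor would likely be needed where epistasis matters.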

Similar articles

1
Machine learning-assisted directed protein evolution with combinatorial libraries.
Proc Natl Acad Sci U S A. 2019 Apr 30;116(18):8852-8858. doi: 10.1073/pnas.1901979116. Epub 2019 Apr 12.
3
Machine learning-assisted enzyme engineering.
Methods Enzymol. 2020;643:281-315. doi: 10.1016/bs.mie.2020.05.005. Epub 2020 Jun 12.
4
Machine-learning-guided directed evolution for protein engineering.
Nat Methods. 2019 Aug;16(8):687-694. doi: 10.1038/s41592-019-0496-6. Epub 2019 Jul 15.
6
PyPEF-An Integrated Framework for Data-Driven Protein Engineering.
J Chem Inf Model. 2021 Jul 26;61(7):3463-3476. doi: 10.1021/acs.jcim.1c00099. Epub 2021 Jul 14.

Cited by

7
Sequence and taxonomic feature evaluation facilitated the discovery of alcohol oxidases.
Synth Syst Biotechnol. 2025 Apr 22;10(3):907-915. doi: 10.1016/j.synbio.2025.04.014. eCollection 2025 Sep.

