DeCoDe: degenerate codon design for complete protein-coding DNA libraries.

Affiliations

Department of Genetics.

Department of Bioengineering.

Publication information

Bioinformatics. 2020 Jun 1;36(11):3357-3364. doi: 10.1093/bioinformatics/btaa162.

Abstract

MOTIVATION

High-throughput protein screening is a critical technique for dissecting and designing protein function. Libraries for these assays can be created through a number of means, including targeted or random mutagenesis of a template protein sequence or direct DNA synthesis. However, mutagenic library construction methods often yield vastly more nonfunctional than functional variants and, despite advances in large-scale DNA synthesis, individual synthesis of each desired DNA template is often prohibitively expensive. Consequently, many protein-screening libraries rely on the use of degenerate codons (DCs), mixtures of DNA bases incorporated at specific positions during DNA synthesis, to generate highly diverse protein-variant pools from only a few low-cost synthesis reactions. However, selecting DCs for sets of sequences that covary at multiple positions dramatically increases the difficulty of designing a DC library and leads to the creation of many undesired variants that can quickly outstrip screening capacity.
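To make the covariation problem concrete, here is a minimal Python sketch. It assumes the standard genetic code and the IUPAC nucleotide ambiguity codes. The helper names (`expand`, `encoded`, `library`) and the two-residue targets are illustrative only and do not come from the paper or its software. The sketch expands a degenerate codon into its concrete codons and shows how choosing one DC per position yields variants nobody asked for.

```python
from itertools import product

# IUPAC nucleotide codes: each letter names the set of bases mixed at a position.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expand(dc):
    """Return every concrete codon a degenerate codon can yield."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in dc))]


def encoded(dc):
    """Return the set of amino acids (or '*') a degenerate codon encodes."""
    return {CODON_TABLE[c] for c in expand(dc)}


def library(dcs):
    """Return all protein variants encoded by one degenerate codon per position."""
    return {"".join(v) for v in product(*(sorted(encoded(dc)) for dc in dcs))}


# Hypothetical targets: two variants that covary at both positions.
targets = {"AK", "SE"}
lib = library(["KCT", "RAA"])  # KCT encodes A or S; RAA encodes K or E
print(sorted(lib))             # every combination of the per-position choices
print(sorted(lib - targets))   # variants in the library that were not requested
```

In this toy case a single pair of DCs encodes four variants, two of them undesired, whereas two separate non-degenerate syntheses (GCT AAA and TCT GAA) would encode exactly the targets. The gap between those two extremes widens quickly as more covarying positions are added.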

RESULTS

We introduce a novel algorithm for total DC library optimization, degenerate codon design (DeCoDe), based on integer linear programming. DeCoDe significantly outperforms state-of-the-art DC optimization algorithms and scales well to more than a hundred proteins sharing complex patterns of covariation (e.g. the lab-derived avGFP lineage). Moreover, DeCoDe is, to our knowledge, the first DC design algorithm with the capability to encode mixed-length protein libraries. We anticipate DeCoDe to be broadly useful for a variety of library generation problems, ranging from protein engineering attempts that leverage mutual information to the reconstruction of ancestral protein states.
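As a rough illustration of the kind of choice a DC optimizer has to make, the following Python sketch searches all 3,375 degenerate codons for the one that covers a target amino-acid set at a single position with the fewest off-target products. It is deliberately not DeCoDe's method: exhaustive search over one position stands in for the integer linear program, which the abstract describes as optimizing the whole library at once. The function name `best_single_dc` and the tie-breaking rule are assumptions made for this example.

```python
from itertools import product

# IUPAC nucleotide codes and the standard genetic code ('*' = stop).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def best_single_dc(targets):
    """Exhaustively pick a degenerate codon covering `targets` at one position.

    Returns (off_target_count, codon_count, dc), or None if no codon covers
    the targets. Candidates are ranked by fewest off-target products, then
    fewest concrete codons, then alphabetically (an arbitrary tie-break).
    """
    best = None
    for letters in product(sorted(IUPAC), repeat=3):
        dc = "".join(letters)
        codons = ["".join(c) for c in product(*(IUPAC[b] for b in dc))]
        products = {CODON_TABLE[c] for c in codons}
        if not targets <= products:
            continue  # this codon misses at least one desired amino acid
        key = (len(products - targets), len(codons), dc)
        if best is None or key < best:
            best = key
    return best


off_target, n_codons, dc = best_single_dc({"A", "S"})
print(dc, n_codons, off_target)
```

Solving each position independently like this ignores covariation between positions, which is the limitation the abstract says a whole-library formulation is meant to address.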

AVAILABILITY AND IMPLEMENTATION

github.com/OrensteinLab/DeCoDe.

CONTACT

yaronore@bgu.ac.il.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
