Modeling coding sequence design for virus-based expression in tobacco.

Authors

Burghardt Moritz, Tuller Tamir

Affiliations

Department of Biomedical Engineering, The Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv, Israel.

The Segol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel.

Publication information

Synth Syst Biotechnol. 2024 Dec 11;10(2):337-345. doi: 10.1016/j.synbio.2024.12.002. eCollection 2025 Jun.

Abstract

Transient expression in tobacco is a popular way to produce recombinant proteins in plants. The design of various expression vectors, delivered into the plant by Agrobacterium, has enabled high production levels of some proteins. To further enhance expression, researchers often adapt the coding sequence of heterologous genes to the host, but this strategy has produced mixed results in tobacco. To study the effects of different sequence features on protein yield, we compile a dataset of protein yields and coding sequences for more than 200 coding sequences from previously published expression studies. We evaluate various established gene expression models on a subset of these studies. We find that the use of tobacco codons is only moderately predictive of protein yield, likely because informative sequence features extend over multiple codons. Additionally, we show that the codon usage of organisms that use tobacco as an expression host in a way similar to the synthetic system, such as viruses and agrobacteria, can be used to predict heterologous expression. Other predictive features relate to tRNA supply and demand, to the inclusion of a translational ramp (codons with lower adaptation to the tRNA pool at the beginning of the coding region), and to the amino acid composition of the recombinant protein. A model based on all the features achieved a correlation of 0.57 with protein yield. We believe that our study provides a practical guideline for coding sequence design for efficient expression in tobacco.
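To make two of the feature types named in the abstract concrete, the following is a minimal sketch of a codon-adaptation score and a translational-ramp score. It is not the authors' model. The function names, the pseudocount, and the 30-codon ramp window are assumptions made for illustration, and codon frequencies in a reference set (for example host, viral, or agrobacterial genes) are used here as a simple stand-in for adaptation to the tRNA pool.

```python
import math
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def split_codons(cds):
    """Split a coding sequence into in-frame codons, dropping a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_cds, pseudocount=0.5):
    """Weight each codon by its count relative to the most used synonymous codon.

    Stop codons and single-codon amino acids (Met, Trp) are left out.
    The pseudocount (an assumption) keeps unseen codons from getting weight 0.
    """
    counts = Counter(c for cds in reference_cds for c in split_codons(cds))
    weights = {}
    for codon, aa in CODON_TABLE.items():
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        if aa == "*" or len(family) == 1:
            continue
        best = max(counts[c] + pseudocount for c in family)
        weights[codon] = (counts[codon] + pseudocount) / best
    return weights


def _geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def adaptation_index(cds, weights):
    """Geometric mean of codon weights over the scored codons of a sequence."""
    scored = [weights[c] for c in split_codons(cds) if c in weights]
    return _geometric_mean(scored) if scored else float("nan")


def ramp_ratio(cds, weights, ramp_codons=30):
    """Adaptation of the first `ramp_codons` scored codons divided by that of the rest.

    A value below 1 means the start of the sequence uses less adapted codons
    than the remainder, which is the pattern a translational ramp describes.
    """
    scored = [weights[c] for c in split_codons(cds) if c in weights]
    head, tail = scored[:ramp_codons], scored[ramp_codons:]
    if not head or not tail:
        return float("nan")
    return _geometric_mean(head) / _geometric_mean(tail)


# Toy usage with made-up sequences (not data from the study).
toy_reference = ["ATGGCTGCTGCTGCC"]
toy_weights = relative_adaptiveness(toy_reference)
print(adaptation_index("GCTGCC", toy_weights))
print(ramp_ratio("GCCGCCGCTGCT", toy_weights, ramp_codons=2))
```

Swapping the reference set changes which organism's codon usage the score reflects, so the same functions could be used to compare host-derived and virus-derived weights on a given coding sequence.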
