Liu Xiuxia, Li Guangying, Cui Sinan, Yang Yankun, Liu Chun Li, Bai Zhonghu
School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China.
National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China.
ACS Synth Biol. 2025 Aug 15;14(8):3105-3115. doi: 10.1021/acssynbio.5c00250. Epub 2025 Aug 1.
The 5'UTR sequence and the N-terminal coding sequence (NCS) have been used to regulate gene expression in () microbial cell factories. However, the relationship between these expression element sequences and the protein expression rate in this host has not been sufficiently studied. This study established the relationship between 5'UTR and NCS sequence features and protein expression, and validated the effects of these features on protein expression. First, a 5'UTR library and an NCS library, each containing degenerate (N) bases, were constructed separately. Using fluorescence-activated cell sorting (FACS) and high-throughput sequencing, a continuous regulatory range spanning 5 orders of magnitude of enhanced green fluorescent protein (eGFP) expression was achieved in both libraries. Next, the relationship between sequence information and protein expression was established by analyzing 5'UTR and NCS sequence characteristics in terms of GC content, minimum free energy (MFE), and tRNA adaptation index, and by deep learning. Four characteristic 5'UTR sequences and four characteristic NCS sequences were then selected, all of which showed strong compatibility with different exogenous proteins. Furthermore, the 16 combinations of the four selected 5'UTR sequences and four selected NCS sequences tuned eGFP fluorescence intensity from 45% to 511%, confirming the synergistic effect of the two components. These combinations also gave a wide dynamic range of expression for other recombinant proteins, such as mCherry and a heavy-chain antibody. This study provides a potential tool for fine regulation of gene expression or protein production in this host.
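As a rough illustration of two ideas in the abstract, the sketch below computes GC content for a sequence and enumerates the 4 x 4 = 16 pairings of 5'UTR and NCS candidates. It is a minimal sketch, not the authors' pipeline: the sequences, the labels (`U1`..`U4`, `N1`..`N4`), and the function name `gc_content` are hypothetical placeholders and are not taken from the study.

```python
from itertools import product


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical placeholder sequences (assumptions, not from the study).
utr_candidates = {
    "U1": "AAAGGAGGTTTT",
    "U2": "GAAAGGAGATAT",
    "U3": "CCAGGAGGACAC",
    "U4": "TTAGGAGCGGCC",
}
ncs_candidates = {
    "N1": "ATGAAAAAAACT",
    "N2": "ATGGCTAGCAAA",
    "N3": "ATGTCTCATCAT",
    "N4": "ATGGGCGGCCGC",
}

# Pair every 5'UTR candidate with every NCS candidate and record the
# GC content of the concatenated region as one simple sequence feature.
combinations = [
    (utr, ncs, gc_content(utr_candidates[utr] + ncs_candidates[ncs]))
    for utr, ncs in product(utr_candidates, ncs_candidates)
]

for utr, ncs, gc in combinations:
    print(f"{utr}+{ncs}: GC = {gc:.2f}")
```

GC content is only one of the features named in the abstract. MFE and the tRNA adaptation index would need an RNA folding tool and host-specific tRNA data respectively, which are left out of this sketch.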