Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan.
Sci Rep. 2019 Jun 6;9(1):8338. doi: 10.1038/s41598-019-44500-z.
Codon optimization by synonymous substitution is widely used for recombinant protein expression. Recent studies have investigated sequence features for codon optimization based on large-scale expression analyses. However, these studies have been limited to common host organisms such as Escherichia coli. Here, we develop a codon optimization method for Rhodococcus erythropolis, a Gram-positive, GC-rich actinobacterium attracting attention as an alternative host organism. We evaluate the recombinant protein expression of 204 genes in R. erythropolis using the same plasmid vector. Statistical analysis of these expression data reveals that the mRNA folding energy at 5' regions and the codon frequency are important sequence features for codon optimization. Intriguingly, other sequence features, such as the codon repetition rate, show a different tendency from that reported in a previous study on E. coli. We optimize the coding sequences of 12 genes with respect to these sequence features and confirm that 9 of them (75%) achieve increased expression levels compared with the wild-type sequences. Notably, all 5 genes whose wild-type sequences gave low or undetectable expression show improved expression with the optimized sequences. These results demonstrate the effectiveness of our codon optimization method in R. erythropolis, and possibly in other actinobacteria.
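To make the idea of codon optimization by synonymous substitution concrete, the following is a minimal Python sketch and not the method developed in this study. It covers only the codon frequency feature: each codon is swapped for the synonymous codon that occurs most often in a set of reference coding sequences. The function names (`codon_usage`, `optimize`), the toy reference sequence, and the choice to leave the first codon and stop codons untouched are all assumptions made for illustration. The mRNA folding energy at the 5' region, which the abstract identifies as equally important, is not modeled here.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO))


def split_codons(seq):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TO_AA[c] for c in split_codons(seq))


def codon_usage(reference_seqs):
    """Count codon occurrences in reference coding sequences (hypothetical input)."""
    counts = Counter()
    for seq in reference_seqs:
        counts.update(split_codons(seq))
    return counts


def optimize(seq, usage):
    """Replace each codon with its most frequent synonym according to `usage`.

    Assumptions of this sketch: the first codon (start) and stop codons are
    kept as-is, and ties are resolved in favor of the original codon.
    """
    codons = split_codons(seq)
    result = []
    for index, codon in enumerate(codons):
        aa = CODON_TO_AA[codon]
        if index == 0 or aa == "*":
            result.append(codon)
            continue
        synonyms = [c for c in CODONS if CODON_TO_AA[c] == aa]
        best = max(synonyms, key=lambda c: (usage[c], c == codon))
        result.append(best)
    return "".join(result)


# Toy data only: a made-up GC-rich reference, not real R. erythropolis usage.
toy_usage = codon_usage(["ATGGCCGCCCTGCTGTAA"])
wild_type = "ATGGCTTTATAA"
optimized = optimize(wild_type, toy_usage)
print(optimized)  # ATGGCCCTGTAA
print(translate(wild_type) == translate(optimized))  # True
```

Because only synonymous codons are exchanged, the encoded protein is unchanged. A fuller treatment would also score candidate sequences on the 5' folding energy, for example with an RNA secondary structure predictor, before choosing among them.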