MPEPE, a predictive approach to improve protein expression in E. coli based on deep learning.

Author information

Ding Zundan, Guan Feifei, Xu Guoshun, Wang Yuchen, Yan Yaru, Zhang Wei, Wu Ningfeng, Yao Bin, Huang Huoqing, Tuller Tamir, Tian Jian

Affiliations

Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China.

Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China.

Publication information

Comput Struct Biotechnol J. 2022 Mar 1;20:1142-1153. doi: 10.1016/j.csbj.2022.02.030. eCollection 2022.

DOI:10.1016/j.csbj.2022.02.030

PMID:35317239

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8913310/

Abstract

The expression of proteins in Escherichia coli is often essential for their characterization, modification, and subsequent application. Gene sequence is the major factor contributing to expression. In this study, we used expression data from 6438 heterologous proteins produced under the same expression condition in E. coli to construct a deep learning classifier for screening high- and low-expression proteins. In conjunction with conserved residue analysis to minimize functional disruption, a mutation predictor for enhanced protein expression (MPEPE) was proposed to identify mutations conducive to protein expression. MPEPE identified mutation sites in laccase 13B22 and the glucose dehydrogenase FAD-AtGDH that significantly increased both the expression levels and the activity of these proteins. Additionally, a significant correlation of 0.46 was detected between the high-expression propensity predicted by the constructed models and the protein abundance of endogenous genes in E. coli. Therefore, the study provides foundational insights into the relationship between specific amino acid usage, codon usage, and protein expression, and is essential for research and industrial applications.
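The screening workflow summarized in the abstract (score a sequence, skip conserved positions, keep substitutions that raise the predicted expression propensity) can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_score` is a hypothetical stand-in for a trained classifier, and `rank_mutations`, the "favourable" residue set, and the example sequence are invented for the sketch. None of it reproduces the MPEPE model or its training data.

```python
from typing import Callable, List, Set, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained expression classifier.

    Returns the fraction of residues in an arbitrary 'favourable' set.
    A real predictor would be a model fitted to expression data.
    """
    favourable = set("AEKL")  # assumption made purely for illustration
    return sum(res in favourable for res in seq) / len(seq)


def rank_mutations(
    seq: str,
    conserved: Set[int],
    score: Callable[[str], float] = toy_score,
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Rank single-residue substitutions by predicted score gain.

    Positions listed in `conserved` (0-based) are never mutated, which
    mirrors the idea of avoiding functionally important residues.
    """
    base = score(seq)
    candidates = []
    for pos, wild in enumerate(seq):
        if pos in conserved:
            continue
        for aa in AMINO_ACIDS:
            if aa == wild:
                continue
            mutant = seq[:pos] + aa + seq[pos + 1:]
            gain = score(mutant) - base
            if gain > 0:
                # Label uses the usual wild-type/1-based position/mutant form.
                candidates.append((f"{wild}{pos + 1}{aa}", gain))
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:top_n]


if __name__ == "__main__":
    # Position 0 (the start methionine) is treated as conserved here.
    for label, gain in rank_mutations("MGSTA", conserved={0}):
        print(label, round(gain, 3))
```

In practice the scoring callable would be replaced by whatever trained model is available, and the conserved set would come from a separate conservation analysis such as a multiple sequence alignment.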


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aa5/8913310/e96cc61e5359/ga1.jpg

Similar articles

Predicting synonymous codon usage and optimizing the heterologous gene for expression in E. coli.

Sci Rep. 2017 Aug 30;7(1):9926. doi: 10.1038/s41598-017-10546-0.

Molecular bases for strong phenotypic effects of single synonymous codon substitutions in the E. coli ccdB toxin gene.

BMC Genomics. 2023 Dec 4;24(1):732. doi: 10.1186/s12864-023-09817-0.

Differentially used codons among essential genes in bacteria identified by machine learning-based analysis.

Mol Genet Genomics. 2024 Jul 27;299(1):72. doi: 10.1007/s00438-024-02163-0.

Modified 'one amino acid-one codon' engineering of high GC content TaqII-coding gene from thermophilic Thermus aquaticus results in radical expression increase.

Microb Cell Fact. 2014 Jan 11;13:7. doi: 10.1186/1475-2859-13-7.

Synonymous codon usage in Escherichia coli: selection for translational accuracy.

Mol Biol Evol. 2007 Feb;24(2):374-81. doi: 10.1093/molbev/msl166. Epub 2006 Nov 13.

Deep learning-driven insights into super protein complexes for outer membrane protein biogenesis in bacteria.

Elife. 2022 Dec 28;11:e82885. doi: 10.7554/eLife.82885.

Design parameters to control synthetic gene expression in Escherichia coli.

PLoS One. 2009 Sep 14;4(9):e7002. doi: 10.1371/journal.pone.0007002.

Exploring Codon Adjustment Strategies towards Escherichia coli-Based Production of Viral Proteins Encoded by HTH1, a Novel Prophage of the Marine Bacterium Hypnocyclicus thermotrophus.

Viruses. 2021 Jun 23;13(7):1215. doi: 10.3390/v13071215.

Optimizing scaleup yield for protein production: Computationally Optimized DNA Assembly (CODA) and Translation Engineering.

Biotechnol Annu Rev. 2007;13:27-42. doi: 10.1016/S1387-2656(07)13002-7.

Cited by

Influence of mutations at different distances from the active center on the activity and stability of laccase 13B22.

Bioresour Bioprocess. 2025 May 27;12(1):47. doi: 10.1186/s40643-025-00893-6.

Codeine 3-O-demethylase catalyzed biotransformation of morphinan alkaloids in Escherichia coli: site directed mutagenesis of terminal residues improves enzyme expression, stability and biotransformation yield.

J Biol Eng. 2025 Jan 19;19(1):9. doi: 10.1186/s13036-025-00477-0.

Effective Gene Expression Prediction and Optimization from Protein Sequences.

Adv Sci (Weinh). 2025 Feb;12(8):e2407664. doi: 10.1002/advs.202407664. Epub 2025 Jan 9.

Link Between Individual Codon Frequencies and Protein Expression: Going Beyond Codon Adaptation Index.

Int J Mol Sci. 2024 Oct 29;25(21):11622. doi: 10.3390/ijms252111622.

CodonBERT large language model for mRNA vaccines.

Genome Res. 2024 Aug 20;34(7):1027-1035. doi: 10.1101/gr.278870.123.

Laccase-catalyzed lignin depolymerization in deep eutectic solvents: challenges and prospects.

Bioresour Bioprocess. 2023 Mar 23;10(1):21. doi: 10.1186/s40643-023-00640-9.

Rapid Antibacterial Activity Assessment of Chimeric Lysins.

Int J Mol Sci. 2024 Feb 19;25(4):2430. doi: 10.3390/ijms25042430.

Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals.

Front Plant Sci. 2023 Nov 15;14:1252166. doi: 10.3389/fpls.2023.1252166. eCollection 2023.

Current state of molecular and metabolic strategies for the improvement of L-asparaginase expression in heterologous systems.

Front Pharmacol. 2023 Jun 22;14:1208277. doi: 10.3389/fphar.2023.1208277. eCollection 2023.

