Deep protein representations enable recombinant protein expression prediction.

Affiliations

Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.

Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.

Publication information

Comput Biol Chem. 2021 Dec;95:107596. doi: 10.1016/j.compbiolchem.2021.107596. Epub 2021 Oct 27.

DOI: 10.1016/j.compbiolchem.2021.107596
PMID: 34775287
Abstract

A crucial process in the production of industrial enzymes is recombinant gene expression, which aims to induce enzyme overexpression of the genes in a host microbe. Current approaches for securing overexpression rely on molecular tools such as adjusting the recombinant expression vector, adjusting cultivation conditions, or performing codon optimizations. However, such strategies are time-consuming, and an alternative strategy would be to select genes for better compatibility with the recombinant host. Several methods for predicting soluble expression are available; however, they are all optimized for the expression host Escherichia coli and do not consider the possibility of an expressed protein not being soluble. We show that these tools are not suited for predicting expression potential in the industrially important host Bacillus subtilis. Instead, we build a B. subtilis-specific machine learning model for expressibility prediction. Given millions of unlabelled proteins and a small labeled dataset, we can successfully train such a predictive model. The unlabeled proteins provide a performance boost relative to using amino acid frequencies of the labeled proteins as input. On average, we obtain a modest performance of 0.64 area-under-the-curve (AUC) and 0.2 Matthews correlation coefficient (MCC). However, we find that this is sufficient for the prioritization of expression candidates for high-throughput studies. Moreover, the predicted class probabilities are correlated with expression levels. A number of features related to protein expression, including base frequencies and solubility, are captured by the model.
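The abstract names a baseline input (amino acid frequencies) and two evaluation metrics (AUC and MCC). The following is a minimal sketch of how those three quantities can be computed, not the authors' implementation. The function names `aa_frequencies`, `mcc`, and `auc` are hypothetical, the 0.5 decision threshold is an assumption, and the labels and scores in the usage note are invented purely for illustration.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(seq):
    """Return the relative frequency of each standard amino acid in seq.

    Non-standard characters are ignored. This is the kind of simple
    composition feature the abstract describes as a baseline input.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a usage example with made-up data: for labels `[1, 1, 0, 0, 1, 0]` and predicted probabilities `[0.9, 0.4, 0.35, 0.8, 0.7, 0.1]`, `auc` compares all nine positive-negative pairs, and thresholding the probabilities at the assumed 0.5 gives the binary predictions that `mcc` scores. The pairwise AUC is quadratic in the number of examples, which is fine for a small labeled set but would be replaced by a rank-based computation on larger data.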


Similar articles

1
Deep protein representations enable recombinant protein expression prediction.
Comput Biol Chem. 2021 Dec;95:107596. doi: 10.1016/j.compbiolchem.2021.107596. Epub 2021 Oct 27.
2
A Generic Protocol for Intracellular Expression of Recombinant Proteins in Bacillus subtilis.
Methods Mol Biol. 2017;1586:325-334. doi: 10.1007/978-1-4939-6887-9_21.
3
Exploitation of Bacillus subtilis as a robust workhorse for production of heterologous proteins and beyond.
World J Microbiol Biotechnol. 2018 Sep 10;34(10):145. doi: 10.1007/s11274-018-2531-7.
4
A particular silent codon exchange in a recombinant gene greatly influences host cell metabolic activity.
Microb Cell Fact. 2015 Oct 5;14:156. doi: 10.1186/s12934-015-0348-8.
5
Use of a Sec signal peptide library from Bacillus subtilis for the optimization of cutinase secretion in Corynebacterium glutamicum.
Microb Cell Fact. 2016 Dec 7;15(1):208. doi: 10.1186/s12934-016-0604-6.
6
Sequencing, cloning, and heterologous expression of cyclomaltodextrin glucanotransferase of Bacillus firmus strain 37 in Bacillus subtilis WB800.
Bioprocess Biosyst Eng. 2019 Apr;42(4):621-629. doi: 10.1007/s00449-018-02068-4. Epub 2019 Jan 3.
7
Improved inducible expression of Bacillus naganoensis pullulanase from recombinant Bacillus subtilis by enhancer regulation.
Protein Expr Purif. 2018 Aug;148:9-15. doi: 10.1016/j.pep.2018.03.012. Epub 2018 Mar 27.
8
High-level extracellular protein production in Bacillus subtilis using an optimized dual-promoter expression system.
Microb Cell Fact. 2017 Feb 20;16(1):32. doi: 10.1186/s12934-017-0649-1.
9
Use of a new catabolite repression resistant promoter isolated from Bacillus subtilis KCC103 for hyper-production of recombinant enzymes.
Protein Expr Purif. 2010 Mar;70(1):122-8. doi: 10.1016/j.pep.2009.09.020. Epub 2009 Oct 6.
10
Enhancement of extracellular expression of Bacillus naganoensis pullulanase from recombinant Bacillus subtilis: Effects of promoter and host.
Protein Expr Purif. 2016 Aug;124:23-31. doi: 10.1016/j.pep.2016.04.008. Epub 2016 Apr 22.

Cited by

1
AI Prediction of Structural Stability of Nanoproteins Based on Structures and Residue Properties by Mean Pooled Dual Graph Convolutional Network.
Interdiscip Sci. 2025 Mar;17(1):101-113. doi: 10.1007/s12539-024-00662-7. Epub 2024 Oct 5.
2
Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals.
Front Plant Sci. 2023 Nov 15;14:1252166. doi: 10.3389/fpls.2023.1252166. eCollection 2023.
3
The current role and evolution of X-ray crystallography in drug discovery and development.
Expert Opin Drug Discov. 2023 Jul-Dec;18(11):1221-1230. doi: 10.1080/17460441.2023.2246881. Epub 2023 Aug 17.
4
Enzyme Commission Number Prediction and Benchmarking with Hierarchical Dual-core Multitask Learning Framework.
Research (Wash D C). 2023 May 31;6:0153. doi: 10.34133/research.0153. eCollection 2023.
5
CysPresso: a classification model utilizing deep learning protein representations to predict recombinant expression of cysteine-dense peptides.
BMC Bioinformatics. 2023 May 16;24(1):200. doi: 10.1186/s12859-023-05327-8.
6
DeepLoc 2.0: multi-label subcellular localization prediction using protein language models.
Nucleic Acids Res. 2022 Jul 5;50(W1):W228-W234. doi: 10.1093/nar/gkac278.