SoluProt: prediction of soluble protein expression in Escherichia coli.

Author information

Hon Jiri, Marusiak Martin, Martinek Tomas, Kunka Antonin, Zendulka Jaroslav, Bednar David, Damborsky Jiri

Affiliations

Loschmidt Laboratories, Centre for Toxic Compounds in the Environment RECETOX and Department of Experimental Biology, Faculty of Science, Masaryk University, Brno 625 00, Czech Republic.

International Clinical Research Center, St. Anne's University Hospital Brno, Brno 656 91, Czech Republic.

Publication information

Bioinformatics. 2021 Apr 9;37(1):23-28. doi: 10.1093/bioinformatics/btaa1102.

Abstract

MOTIVATION

Poor protein solubility hinders the production of many therapeutic and industrially useful proteins. Experimental efforts to increase solubility are plagued by low success rates and often reduce biological activity. Computational prediction of protein expressibility and solubility in Escherichia coli using only sequence information could reduce the cost of experimental studies by enabling prioritization of highly soluble proteins.

RESULTS

SoluProt, a new tool for sequence-based prediction of soluble protein expression in E. coli, was created using the gradient boosting machine technique with the TargetTrack database as a training set. When evaluated against a balanced independent test set derived from the NESG database, SoluProt achieved an accuracy of 58.5% and an AUC of 0.62, exceeding those of a suite of alternative solubility prediction tools. There is also evidence that it could significantly increase the success rate of experimental protein studies. SoluProt is freely available as a standalone program and a user-friendly webserver at https://loschmidt.chemi.muni.cz/soluprot/.
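To make the two reported metrics concrete, the following minimal sketch shows how accuracy and AUC can be computed for a binary solubility predictor on a balanced test set. It is illustrative only: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example, and none of them come from SoluProt or the NESG data. On a balanced set, a predictor with no skill would be expected to score about 50% accuracy and an AUC of about 0.5, which is the baseline against which figures such as 58.5% and 0.62 are read.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of proteins whose thresholded score matches the true label.

    The 0.5 threshold is an assumption for this sketch.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen soluble protein (label 1)
    receives a higher score than a randomly chosen insoluble one (label 0).
    Tied scores count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three soluble and three insoluble proteins
# (a balanced set), with made-up predictor scores in [0, 1].
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

print(f"accuracy = {accuracy(toy_labels, toy_scores):.3f}")
print(f"AUC      = {roc_auc(toy_labels, toy_scores):.3f}")
```

The pairwise AUC computation is quadratic in the number of proteins, which is fine for a small illustration; a rank-based formulation would be the usual choice for larger test sets.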

AVAILABILITY AND IMPLEMENTATION

https://loschmidt.chemi.muni.cz/soluprot/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
