ProTstab2 for Prediction of Protein Thermal Stabilities.

Affiliations

School of Computer Science and Technology, Soochow University, Suzhou 215006, China.

Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China.

Publication information

Int J Mol Sci. 2022 Sep 16;23(18):10798. doi: 10.3390/ijms231810798.

Abstract

The stability of proteins is an essential property that has several biological implications. Knowledge about protein stability is important in many ways, ranging from protein purification and structure determination to stability in cells and biotechnological applications. Experimental determination of thermal stabilities has been tedious and available data have been limited. The introduction of limited proteolysis and mass spectrometry approaches has facilitated more extensive cellular protein stability data production. We collected melting temperature information for 34,913 proteins and developed a machine learning predictor, ProTstab2, by utilizing a gradient boosting algorithm after testing seven algorithms. The method performance was assessed on a blind test data set and showed a Pearson correlation coefficient of 0.753 and root mean square error of 7.005. Comparison to previous methods indicated that ProTstab2 had superior performance. The method is fast, so it was applied to predict and compare the stabilities of all proteins in human, mouse, and zebrafish proteomes for which experimental data were not determined. The tool is freely available.
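The two figures reported for the blind test, the Pearson correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the melting temperature values below are hypothetical, chosen only to show how the two metrics are computed from measured and predicted values.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_p)


def rmse(measured, predicted):
    """Root mean square error, in the same units as the inputs."""
    n = len(measured)
    if n != len(predicted) or n == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


if __name__ == "__main__":
    # Hypothetical melting temperatures, for illustration only (not from the paper).
    measured_tm = [45.0, 52.5, 60.0, 48.0, 70.0]
    predicted_tm = [47.0, 50.0, 63.0, 49.5, 66.0]
    print(f"Pearson r: {pearson_r(measured_tm, predicted_tm):.3f}")
    print(f"RMSE:      {rmse(measured_tm, predicted_tm):.3f}")
```

A higher correlation means the predictions rank proteins in a similar order to the measurements, while a lower RMSE means the predicted values are closer to the measured ones in absolute terms. The two metrics can disagree, which is presumably why both are reported.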

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fe9/9505338/b2f622451066/ijms-23-10798-g001.jpg
