
A novel scoring function for discriminating hyperthermophilic and mesophilic proteins with application to predicting relative thermostability of protein mutants.

Affiliation

Applied Bioinformatics Laboratory, the University of Kansas, Lawrence, KS 66047, USA.

Publication information

BMC Bioinformatics. 2010 Jan 28;11:62. doi: 10.1186/1471-2105-11-62.

DOI:10.1186/1471-2105-11-62
PMID:20109199
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3098108/
Abstract

BACKGROUND

The ability to design thermostable proteins is theoretically important and practically useful. Robust and accurate algorithms, however, remain elusive. One critical problem is the lack of reliable methods to estimate the relative thermostability of possible mutants.

RESULTS

We report a novel scoring function for discriminating hyperthermophilic and mesophilic proteins with application to predicting the relative thermostability of protein mutants. The scoring function was developed based on an elaborate analysis of a set of features calculated or predicted from 540 pairs of hyperthermophilic and mesophilic protein ortholog sequences. It was constructed by a linear combination of ten important features identified by a feature ranking procedure based on the random forest classification algorithm. The weights of these features in the scoring function were fitted by a hill-climbing algorithm. This scoring function has shown an excellent ability to discriminate hyperthermophilic from mesophilic sequences. The prediction accuracies reached 98.9% and 97.3% in discriminating orthologous pairs in training and the holdout testing datasets, respectively. Moreover, the scoring function can distinguish non-homologous sequences with an accuracy of 88.4%. Additional blind tests using two datasets of experimentally investigated mutations demonstrated that the scoring function can be used to predict the relative thermostability of proteins and their mutants at very high accuracies (92.9% and 94.4%). We also developed an amino acid substitution preference matrix between mesophilic and hyperthermophilic proteins, which may be useful in designing more thermostable proteins.
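
The construction described above (a linear combination of features whose weights are fitted by hill climbing) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three synthetic features, the step size, the tie-accepting acceptance rule, and the helper names `score`, `pair_accuracy`, `hill_climb`, and `make_toy_pairs` are hypothetical and are not taken from the paper, which uses ten features selected by random forest ranking.

```python
import random


def score(weights, features):
    """Linear scoring function: weighted sum of feature values."""
    return sum(w * x for w, x in zip(weights, features))


def pair_accuracy(weights, pairs):
    """Fraction of (hyperthermophilic, mesophilic) pairs in which the
    hyperthermophilic member receives the strictly higher score."""
    correct = sum(1 for hp, mp in pairs if score(weights, hp) > score(weights, mp))
    return correct / len(pairs)


def hill_climb(pairs, n_features, steps=2000, step_size=0.1, seed=0):
    """Simple stochastic hill climbing over the weight vector.

    Each step perturbs one randomly chosen weight by +/- step_size and keeps
    the change if pairwise accuracy does not decrease (ties are accepted so
    the search can move across plateaus). This acceptance rule is an
    assumption of this sketch, not the authors' documented procedure.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    best = pair_accuracy(weights, pairs)
    for _ in range(steps):
        i = rng.randrange(n_features)
        candidate = list(weights)
        candidate[i] += rng.choice((-step_size, step_size))
        acc = pair_accuracy(candidate, pairs)
        if acc >= best:
            weights, best = candidate, acc
    return weights, best


def make_toy_pairs(n=200, seed=1):
    """Synthetic ortholog pairs (hypothetical data): feature 0 tends to be
    higher and feature 1 lower in the hyperthermophilic member; feature 2
    is pure noise."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        mp = [rng.gauss(0, 1) for _ in range(3)]
        hp = [
            mp[0] + 1.0 + rng.gauss(0, 0.3),
            mp[1] - 1.0 + rng.gauss(0, 0.3),
            rng.gauss(0, 1),
        ]
        pairs.append((hp, mp))
    return pairs


if __name__ == "__main__":
    toy_pairs = make_toy_pairs()
    fitted_weights, accuracy = hill_climb(toy_pairs, n_features=3)
    print("weights:", fitted_weights)
    print("pairwise accuracy on toy data:", accuracy)
```

Because the score is used only to rank the two members of a pair, the same comparison applies to a wild-type protein and its mutant: the sequence with the higher score is predicted to be the more thermostable one.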

CONCLUSIONS

We have presented a novel scoring function which can distinguish not only hyperthermophilic/mesophilic (HP/MP) ortholog pairs, but also non-homologous pairs at high accuracies. Most importantly, it can be used to accurately predict the relative stability of proteins and their mutants, as demonstrated in two blind tests. In addition, the residue substitution preference matrix assembled in this study may reflect the substitution biases induced by thermal adaptation. A web server implementing the scoring function and the dataset used in this study are freely available at http://www.abl.ku.edu/thermorank/.
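
One way to picture a residue substitution preference matrix is as a table of directional substitution counts between aligned mesophilic and hyperthermophilic orthologs. The sketch below is only an illustration under stated assumptions: the abstract does not say how the matrix was computed, so the log-ratio of forward to reverse counts, the pseudocount, the toy alignments, and the helper names `count_substitutions` and `preference` are all hypothetical.

```python
import math
from collections import Counter


def count_substitutions(aligned_pairs):
    """Count residue changes a -> b at aligned positions, where a comes from
    the mesophilic sequence and b from the hyperthermophilic one.
    Gap positions ('-') and identical residues are skipped."""
    counts = Counter()
    for meso, hyper in aligned_pairs:
        if len(meso) != len(hyper):
            raise ValueError("aligned sequences must have equal length")
        for a, b in zip(meso, hyper):
            if a != b and a != "-" and b != "-":
                counts[(a, b)] += 1
    return counts


def preference(counts, a, b, pseudocount=1.0):
    """Assumed preference measure: log2 ratio of forward (a -> b) to reverse
    (b -> a) substitution counts, with a pseudocount to avoid division by
    zero. Positive values mean a -> b is seen more often in the
    mesophilic-to-hyperthermophilic direction."""
    forward = counts[(a, b)] + pseudocount
    reverse = counts[(b, a)] + pseudocount
    return math.log2(forward / reverse)


# Hypothetical toy alignments: (mesophilic, hyperthermophilic)
toy_alignments = [
    ("MKQLS", "MKELS"),
    ("AQ-GT", "AEKGT"),
    ("QEAA", "EEAA"),
]

toy_counts = count_substitutions(toy_alignments)

if __name__ == "__main__":
    print("Q->E count:", toy_counts[("Q", "E")])
    print("Q->E preference:", preference(toy_counts, "Q", "E"))
```

In this toy data the Q to E change occurs three times and the reverse never occurs, so the assumed preference value is log2(4/1) = 2. A real matrix would be built from many alignments and would need a normalization suited to the background residue frequencies.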


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02bd/3098108/0ae6de3b3a66/1471-2105-11-62-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02bd/3098108/14865e29aeed/1471-2105-11-62-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02bd/3098108/fdb881bedf91/1471-2105-11-62-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02bd/3098108/e98a97aedc7b/1471-2105-11-62-4.jpg
