Sequence feature-based prediction of protein stability changes upon amino acid substitutions.

Affiliation

Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA.

Publication information

BMC Genomics. 2010 Nov 2;11 Suppl 2(Suppl 2):S5. doi: 10.1186/1471-2164-11-S2-S5.

DOI:10.1186/1471-2164-11-S2-S5
PMID:21047386
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2975416/
Abstract

BACKGROUND

Protein destabilization is a common mechanism by which amino acid substitutions cause human diseases. Although several machine learning methods have been reported for predicting protein stability changes upon amino acid substitutions, previous studies did not use relevant sequence features representing biological knowledge for classifier construction.

RESULTS

In this study, a new machine learning method was developed for sequence feature-based prediction of protein stability changes upon amino acid substitutions. Support vector machines were trained with data from experimental studies on the free energy change of protein stability upon mutation. To construct accurate classifiers, twenty sequence features were examined for input vector encoding. Classifier performance varied significantly depending on which sequence features were used. The most accurate classifier in this study was constructed using a combination of six sequence features. This classifier achieved an overall accuracy of 84.59% with 70.29% sensitivity and 90.98% specificity.
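
The three reported metrics are tied together by the class balance of the evaluation data, and the short sketch below makes that relationship explicit. It is only an illustration, not the authors' code: the function names are hypothetical, and it assumes the usual binary definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). The abstract does not say which class was treated as "positive", so that labeling is also an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn count positive examples predicted correctly/incorrectly,
    tn/fp count negative examples predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of positives recovered
    specificity = tn / (tn + fp)   # fraction of negatives recovered
    return accuracy, sensitivity, specificity


def implied_positive_fraction(accuracy, sensitivity, specificity):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p.

    p is the fraction of positive examples in the evaluation data.
    Undefined when sensitivity equals specificity.
    """
    return (specificity - accuracy) / (specificity - sensitivity)


# Values reported in the abstract, expressed as fractions.
p = implied_positive_fraction(accuracy=0.8459,
                              sensitivity=0.7029,
                              specificity=0.9098)
print(f"implied positive-class fraction: {p:.3f}")
```

Under these assumptions, the reported figures imply that roughly 31% of the evaluated substitutions belonged to the positive class. This suggests that the overall accuracy is weighted toward the higher specificity.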

CONCLUSIONS

Relevant sequence features can be used to accurately predict protein stability changes upon amino acid substitutions. Predictive results at this level of accuracy may provide useful information to distinguish between deleterious and tolerated alterations in disease candidate genes. To make the classifier accessible to the genetics research community, we have developed a new web server, called MuStab (http://bioinfo.ggc.org/mustab/).
