Clustered tree regression to learn protein energy change with mutated amino acid.

Authors

Tu Hongwei, Han Yanqiang, Wang Zhilong, Li Jinjin

Affiliation

Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China.

Publication information

Brief Bioinform. 2022 Nov 19;23(6). doi: 10.1093/bib/bbac374.

DOI: 10.1093/bib/bbac374
PMID: 36124753

Abstract

Accurate and effective prediction of mutation-induced protein energy change remains a great challenge and of great interest in computational biology. However, high resource consumption and insufficient structural information of proteins severely limit the experimental techniques and structure-based prediction methods. Here, we design a structure-independent protocol to accurately and effectively predict the mutation-induced protein folding free energy change with only sequence, physicochemical and evolutionary features. The proposed clustered tree regression protocol is capable of effectively exploiting the inherent data patterns by integrating unsupervised feature clustering by K-means and supervised tree regression using XGBoost, and thus enabling fast and accurate protein predictions with different mutations, with an average Pearson correlation coefficient of 0.83 and an average root-mean-square error of 0.94 kcal/mol. The proposed sequence-based method not only eliminates the dependence on protein structures, but also has potential applications in protein predictions with rare structural information.
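The abstract does not spell out how the clustering and regression stages are combined. One plausible reading, sketched below as an assumption rather than as the authors' protocol, is to partition the samples with K-means and fit a separate tree regressor inside each cluster. To stay dependency-free, a depth-1 regression tree (a stump) stands in for XGBoost, and the two-feature toy data and every function name are hypothetical.

```python
import random
from statistics import fmean

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the final centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [fmean(col) for col in zip(*members)]
    return centroids

def fit_stump(X, y):
    """Depth-1 regression tree: the single split with the lowest squared error.
    A stand-in for XGBoost, which would boost many deeper trees instead."""
    best = (float("inf"), None, None, fmean(y), fmean(y))
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = fmean(left), fmean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]  # (feature index, threshold, left mean, right mean)

def predict_stump(stump, row):
    j, t, ml, mr = stump
    if j is None:  # no valid split was found: predict the mean
        return ml
    return ml if row[j] <= t else mr

def fit_clustered(X, y, k):
    """Cluster the samples, then fit one regressor per cluster."""
    centroids = kmeans(X, k)
    labels = [nearest(row, centroids) for row in X]
    fallback = fit_stump(X, y)  # used only if a cluster ends up empty
    models = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if idx:
            models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
        else:
            models.append(fallback)
    return centroids, models

def predict_clustered(model, row):
    """Route the sample to its nearest cluster and use that cluster's regressor."""
    centroids, models = model
    return predict_stump(models[nearest(row, centroids)], row)

# Hypothetical toy data: two well-separated groups of feature vectors,
# each with its own relationship between feature 0 and the target.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [5.0, 5.0], [5.1, 5.2], [5.2, 5.1], [5.3, 5.3]]
y = [-1.0, -1.0, -2.0, -2.0, 0.5, 0.5, 1.5, 1.5]

model = fit_clustered(X, y, k=2)
print(predict_clustered(model, [0.05, 0.05]))
print(predict_clustered(model, [5.25, 5.2]))
```

Fitting one model per cluster lets each regressor specialise on a locally homogeneous subset of the data. An alternative design, equally consistent with the abstract's wording, would append the cluster label as an extra input feature to a single regressor.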

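The two reported figures, the Pearson correlation coefficient and the root-mean-square error, measure different things: the first captures how well predictions track the measured values linearly, the second captures absolute deviation in the units of the target (kcal/mol here). A minimal sketch of both follows; the function names and the numbers are hypothetical illustrations, not data from the paper.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical energy changes (kcal/mol): every prediction is offset by +0.5.
measured = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.5, 1.5, 2.5]

print(pearson_r(measured, predicted))  # 1.0: a constant offset does not lower the correlation
print(rmse(measured, predicted))       # 0.5: the offset shows up fully in the error
```

Because a systematic bias leaves the correlation untouched while inflating the error, the two metrics are usually reported together, as the abstract does.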

Similar articles

1. Clustered tree regression to learn protein energy change with mutated amino acid.
Brief Bioinform. 2022 Nov 19;23(6). doi: 10.1093/bib/bbac374.
2. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0.
Bioinformatics. 2009 Oct 1;25(19):2537-43. doi: 10.1093/bioinformatics/btp445. Epub 2009 Aug 3.
3. Comparing Supervised Learning and Rigorous Approach for Predicting Protein Stability upon Point Mutations in Difficult Targets.
J Chem Inf Model. 2023 Nov 13;63(21):6778-6788. doi: 10.1021/acs.jcim.3c00750. Epub 2023 Oct 28.
4. iSEE: Interface structure, evolution, and energy-based machine learning predictor of binding affinity changes upon mutations.
Proteins. 2019 Feb;87(2):110-119. doi: 10.1002/prot.25630. Epub 2018 Dec 3.
5. Machine learning algorithms for predicting protein folding rates and stability of mutant proteins: comparison with statistical methods.
Curr Protein Pept Sci. 2011 Sep;12(6):490-502. doi: 10.2174/138920311796957630.
6. SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability.
Int J Mol Sci. 2021 Jan 9;22(2):606. doi: 10.3390/ijms22020606.
7. EASE-MM: Sequence-Based Prediction of Mutation-Induced Stability Changes with Feature-Based Multiple Models.
J Mol Biol. 2016 Mar 27;428(6):1394-1405. doi: 10.1016/j.jmb.2016.01.012. Epub 2016 Jan 22.
8. Analysis and prediction of protein folding energy changes upon mutation by element specific persistent homology.
Bioinformatics. 2017 Nov 15;33(22):3549-3557. doi: 10.1093/bioinformatics/btx460.
9. Statistical geometry based prediction of nonsynonymous SNP functional effects using random forest and neuro-fuzzy classifiers.
Proteins. 2008 Jun;71(4):1930-9. doi: 10.1002/prot.21838.
10. Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis.
Bioinformatics. 2008 Sep 15;24(18):2002-9. doi: 10.1093/bioinformatics/btn353. Epub 2008 Jul 16.

Cited by

1. Accelerating therapeutic protein design with computational approaches toward the clinical stage.
Comput Struct Biotechnol J. 2023 Apr 29;21:2909-2926. doi: 10.1016/j.csbj.2023.04.027. eCollection 2023.