Determination of protein folding kinetic types using sequence and predicted secondary structure and solvent accessibility.

Affiliation

School of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou, People's Republic of China.

Publication information

Amino Acids. 2012 Jan;42(1):271-83. doi: 10.1007/s00726-010-0805-y. Epub 2010 Nov 17.

DOI: 10.1007/s00726-010-0805-y
PMID: 21082205
Abstract

Proteins fold through either a two-state (TS) process, with no visible intermediates, or a multi-state (MS) process, via at least one intermediate. We analyze sequence-derived factors that determine folding types by introducing a novel sequence-based folding type predictor called FOKIT. This method implements a logistic regression model with six input features that combine information on amino acid composition, predicted secondary structure, and predicted solvent accessibility. FOKIT provides predictions with an average Matthews correlation coefficient (MCC) between 0.58 and 0.91, measured using out-of-sample tests on four benchmark datasets. These results are shown to be competitive with or better than the results of four modern predictors. We also show that FOKIT outperforms these methods when predicting chains that share low similarity with the chains used to build the model, which is an important advantage given the limited number of annotated chains. We demonstrate that the inclusion of solvent accessibility helps discriminate between the folding kinetic types and that three of the features constitute statistically significant markers that differentiate TS and MS folders. We found that increased content of exposed Trp and buried Leu is indicative of MS folding, which implies that the exposure or burial of certain hydrophobic residues may play an important role in the formation of folding intermediates. Our conclusions are supported by two case studies.
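To make two quantities named in the abstract concrete, the following is a minimal Python sketch, not FOKIT itself. It shows how a feature of the "content of exposed Trp" or "content of buried Leu" kind could be computed from a sequence and a per-residue burial annotation, and how the Matthews correlation coefficient is obtained from binary predictions. The abstract does not list the six actual features or how burial is defined, so the function names, the 'e'/'b' burial encoding, and the label convention (1 = MS, 0 = TS) are all assumptions made for illustration.

```python
import math


def residue_state_content(sequence, burial, residue, state):
    """Fraction of chain positions holding `residue` in burial state `state`.

    `burial` is a string with one character per residue:
    'e' = exposed, 'b' = buried (hypothetical encoding, assumed here).
    """
    if len(sequence) != len(burial):
        raise ValueError("sequence and burial annotation must have equal length")
    if not sequence:
        return 0.0
    hits = sum(1 for aa, st in zip(sequence, burial) if aa == residue and st == state)
    return hits / len(sequence)


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = MS, 0 = TS, assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy chain: two exposed Trp, one buried Leu, one buried Ala.
exposed_trp = residue_state_content("WLAW", "ebbe", "W", "e")
buried_leu = residue_state_content("WLAW", "ebbe", "L", "b")

# Toy predictions: one MS chain is missed, everything else is correct.
mcc = matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0])
```

Features of this kind would then serve as inputs to a logistic regression classifier. MCC ranges from -1 to 1 and, unlike plain accuracy, remains informative when the TS and MS classes are unevenly represented.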
