Sequence statistics of tertiary structural motifs reflect protein stability.

Authors

Zheng Fan, Grigoryan Gevorg

Affiliations

Department of Biological Sciences, Dartmouth College, Hanover, NH, United States of America.

Department of Computer Science, Dartmouth College, Hanover, NH, United States of America.

Publication information

PLoS One. 2017 May 26;12(5):e0178272. doi: 10.1371/journal.pone.0178272. eCollection 2017.

DOI:10.1371/journal.pone.0178272
PMID:28552940
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5446159/

Abstract

The Protein Data Bank (PDB) has been a key resource for learning general rules of sequence-structure relationships in proteins. Quantitative insights have been gained by defining geometric descriptors of structure (e.g., distances, dihedral angles, solvent exposure) and observing their distributions and sequence preferences. Here we argue that as the PDB continues to grow, it may become unnecessary to reduce structure into a set of elementary descriptors. Instead, it could be possible to deduce quantitative sequence-structure relationships in the context of precisely defined complex structural motifs by mining the PDB for closely matching backbone geometries. To validate this idea, we turned to the task of predicting changes in protein stability upon amino-acid substitution, a difficult problem of broad significance. We defined non-contiguous tertiary motifs (TERMs) around a protein site of interest and extracted sequence preferences from ensembles of closely matching substructures in the PDB to predict mutational stability changes at the site, ΔΔGm. We demonstrate that these ensemble statistics predict ΔΔGm on par with state-of-the-art statistical and machine-learning methods on large thermodynamic datasets, and outperform these, along with a leading structure-based modeling approach, when tested in the context of unbiased diverse mutations. Further, we show that the performance of the TERM-based method is directly related to the amount of available relevant structural data, automatically improving with the growing PDB. This provides a means of estimating prediction accuracy. Our results clearly demonstrate that: 1) statistics of non-contiguous structural motifs in the PDB encode fundamental sequence-structure relationships related to protein thermodynamic stability, and 2) the PDB is now large enough that such statistics are already useful in practice, with their accuracy expected to continue increasing as the database grows. These observations suggest new ways of using structural data towards addressing problems of computational structural biology.
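
The idea of turning sequence preferences from an ensemble of structural matches into a stability estimate can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so the log-odds form, the pseudocount, the sign convention (positive means the mutant residue is less favored), and the names `pseudo_ddg` and `matches` are all assumptions made for illustration. The input is simply the residue observed at the site of interest in each closely matching PDB substructure.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pseudo_ddg(match_residues, wild_type, mutant, pseudocount=1.0):
    """Unitless log-odds score from residues seen at one site across matches.

    Hypothetical scoring: -ln(p_mutant / p_wild_type), with a uniform
    pseudocount so that unobserved residues do not give infinite scores.
    Positive values mean the mutant residue is rarer than the wild type
    in the ensemble (assumed here to indicate destabilization).
    """
    counts = Counter(r for r in match_residues if r in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    p_wt = (counts[wild_type] + pseudocount) / total
    p_mut = (counts[mutant] + pseudocount) / total
    return -math.log(p_mut / p_wt)


# Hypothetical ensemble: residue at the site in 12 matching substructures.
matches = "LLLLIVLLMLAL"
score = pseudo_ddg(matches, wild_type="L", mutant="A")
print(f"L->A pseudo-score: {score:.3f}")
```

In this toy ensemble leucine dominates, so the L-to-A substitution receives a positive score. With few matches the pseudocount dominates and scores shrink towards zero, which loosely mirrors the abstract's point that performance depends on the amount of relevant structural data.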
