Statistical measures on residue-level protein structural properties.

Authors

Huang Yuanyuan, Bonett Stephen, Kloczkowski Andrzej, Jernigan Robert, Wu Zhijun

Affiliation

Program on Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50014, USA.

Publication information

J Struct Funct Genomics. 2011 Jul;12(2):119-36. doi: 10.1007/s10969-011-9104-4. Epub 2011 Mar 31.

DOI:10.1007/s10969-011-9104-4

PMID:21452025

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3694722/

Abstract

The atomic-level structural properties of proteins, such as bond lengths, bond angles, and torsion angles, have been well studied and understood based on either chemistry knowledge or statistical analysis. Similar properties at the residue level, such as the distances between two residues and the angles formed by short sequences of residues, can be equally important for structural analysis and modeling, but these have not been examined and documented on a similar scale. While these properties are difficult to measure experimentally, they can be statistically estimated in meaningful ways based on their distributions in known protein structures. Residue-level structural properties including various types of residue distances and angles are estimated statistically. A software package is built to provide direct access to the statistical data for the properties, including some important correlations not previously investigated. The distributions of residue distances and angles may vary with sequence but, in most cases, are concentrated in a few high-probability ranges, corresponding to their frequent occurrence in either α-helices or β-sheets. Strong correlations among neighboring residue angles, similar to those between neighboring torsion angles at the atomic level, are revealed by their statistical measures. Residue-level statistical potentials can be defined using the statistical distributions and correlations of the residue distances and angles. Ramachandran-like plots for strongly correlated residue angles are generated and analyzed. Their applications to structural evaluation and refinement are demonstrated. With the increase in both the number and quality of known protein structures, many structural properties can be derived from sets of protein structures by statistical analysis and data mining, and these can even be used as a supplement to the experimental data for structure determination. Indeed, the statistical measures on various types of residue distances and angles provide more systematic and quantitative assessments of these properties, which can otherwise be estimated only individually and qualitatively. Their distributions and correlations in known protein structures show their importance for providing insights into how proteins may fold naturally to various residue-level structures.
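To make the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' software. It assumes each residue is represented by a single point (for example its Cα atom; the abstract does not state the representation), and it uses a generic inverse-Boltzmann form, E = -kT ln P, as a stand-in for a residue-level statistical potential. All function names, the bin layout, and the pseudocount are hypothetical choices made for illustration.

```python
import numpy as np

def residue_distance(a, b):
    """Euclidean distance between two residue positions."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def residue_angle(a, b, c):
    """Angle in degrees at b formed by three consecutive residue positions."""
    a, b, c = (np.asarray(p, float) for p in (a, b, c))
    u, v = a - b, c - b
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0))))

def residue_torsion(a, b, c, d):
    """Signed dihedral in degrees defined by four consecutive residue positions."""
    a, b, c, d = (np.asarray(p, float) for p in (a, b, c, d))
    b1, b2, b3 = b - a, c - b, d - c
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    x = np.dot(n1, n2)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    return float(np.degrees(np.arctan2(y, x)))

def statistical_potential(samples, bin_edges, kT=1.0, pseudocount=1.0):
    """Inverse-Boltzmann energies per bin: E = -kT * ln(P).

    Assumption: a simple additive pseudocount keeps empty bins finite.
    """
    counts, edges = np.histogram(samples, bins=bin_edges)
    probs = (counts + pseudocount) / (counts.sum() + pseudocount * len(counts))
    return edges, -kT * np.log(probs)

# Usage with made-up positions and angle samples (illustrative values only).
angle = residue_angle([1, 0, 0], [0, 0, 0], [0, 1, 0])
torsion = residue_torsion([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, 0, 1])
edges, energy = statistical_potential([91, 92, 93, 140], np.arange(0, 181, 30))
```

In this sketch the most populated bin receives the lowest energy, which mirrors the abstract's observation that residue distances and angles concentrate in a few high-probability ranges. Applying the same binning to pairs of neighboring angles would give a two-dimensional histogram comparable in spirit to the Ramachandran-like plots the abstract mentions.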
