Wang Liya, Eghbalnia Hamid R, Markley John L
National Magnetic Resonance Facility at Madison, 433 Babcock Drive, Madison, WI 53706, USA.
J Biomol NMR. 2007 Nov;39(3):247-57. doi: 10.1007/s10858-007-9193-3.
We present a method for analyzing the chemical shift database to yield information on nearest-neighbor effects on carbon-13 chemical shift values for alpha and beta carbons of amino acids in proteins. For each amino acid sequence XYZ, we define two correction factors, Δ(XY)s and Δ(YZ)s, representing the effects on (δ13Cα - δ13Cβ) for residue Y from the preceding residue (X) and the following residue (Z), where X, Y, and Z each represent one of the 20 naturally occurring amino acids, Δ designates the change in value or the correction factor (in ppm), and s is an index standing for one of three "pseudo secondary structure states" derived from chemical shift dispersions, which we show represent residues in primarily alpha-helix, beta-strand, and non-alpha/beta (coil) conformations. The correction factors were obtained from maximum likelihood fitting of (δ13Cα - δ13Cβ) values from the chemical shifts of 651 proteins to a mixture of three Gaussians. These correction factors were derived strictly from the analysis of assigned chemical shifts, without regard to the three-dimensional structures of these proteins. The correction factors were found to differ according to the secondary structural environment of the central residue (deduced from the chemical shift distribution) as well as according to the identities of the nearest neighboring residues in the sequence. The areas subsumed by the sequence-dependent chemical shift distributions report on the relative energies of the sequences in different pseudo secondary structural environments, and the positions of the peaks indicate the chemical shifts of the lowest energy conformations. As such, these results have potential applications to the determination of dihedral angle restraints from chemical shifts for structure determination and to more accurate predictions of chemical shifts in proteins of known structure.
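The maximum likelihood fit to a mixture of three Gaussians can be illustrated with a generic expectation-maximization (EM) loop. This is a minimal sketch only, not the authors' procedure: the function name `fit_gaussian_mixture_1d`, the quantile-based initialization, and the synthetic input values (cluster centers at 20, 25, and 30 ppm) are assumptions chosen for illustration and are not taken from the paper or its database.

```python
import numpy as np


def fit_gaussian_mixture_1d(x, k=3, n_iter=200, tol=1e-8):
    """Fit a 1-D mixture of k Gaussians by EM (hypothetical helper, for illustration)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialization: means at evenly spaced quantiles, equal weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, x.std() / k + 1e-6)
    w = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted component densities and per-point responsibilities.
        z = (x[:, None] - mu) / sigma
        dens = w * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        total = np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        resp = dens / total
        ll = np.log(total).sum()
        # M-step: closed-form maximum likelihood updates.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        # Stop once the log-likelihood no longer improves.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = np.argsort(mu)  # report components sorted by mean
    return w[order], mu[order], sigma[order]


# Synthetic stand-in for (Calpha - Cbeta) shift differences in ppm; NOT real data.
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.normal(20.0, 1.0, 300),
    rng.normal(25.0, 1.0, 400),
    rng.normal(30.0, 1.0, 300),
])
weights, means, sigmas = fit_gaussian_mixture_1d(values)
```

In this sketch the fitted `weights` play the role of the areas under each component and `means` the peak positions, the two quantities the abstract interprets in terms of relative energies and lowest energy conformations.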
From a database of chemical shifts associated with well-defined three-dimensional structures, comparisons were made between DSSP designations derived from three-dimensional structure and pseudo secondary structure designations derived from nearest-neighbor corrected chemical shift analysis. The high level of agreement between the two approaches to classifying secondary structure provides a measure of confidence in this chemical shift-based approach to the analysis of protein structure.
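The comparison between DSSP designations and pseudo secondary structure states can be sketched as a per-residue agreement fraction. The abstract does not say how DSSP codes were grouped or how agreement was scored, so the three-state reduction used here (H, G, I as helix; E, B as strand; everything else as coil) and the names `reduce_dssp` and `agreement` are assumptions for illustration. The label strings are made up.

```python
# Assumed reduction of DSSP codes to three states; other codes fall back to coil.
DSSP_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(code):
    """Map one DSSP code to helix (H), strand (E), or coil (C)."""
    return DSSP_TO_THREE.get(code, "C")


def agreement(dssp_codes, pseudo_states):
    """Fraction of residues whose reduced DSSP state equals the pseudo state."""
    if len(dssp_codes) != len(pseudo_states):
        raise ValueError("label sequences must have the same length")
    if len(dssp_codes) == 0:
        raise ValueError("label sequences must not be empty")
    matches = sum(
        reduce_dssp(d) == p for d, p in zip(dssp_codes, pseudo_states)
    )
    return matches / len(dssp_codes)


# Made-up labels for a 12-residue stretch; not taken from any real protein.
dssp = "HHHHTTEEEE-S"
pseudo = "HHHHCCEEECCC"
fraction = agreement(dssp, pseudo)
```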