Pribić R, van Stokkum I H, Chapman D, Haris P I, Bloemendal M
Faculty of Physics and Astronomy, Free University, Amsterdam, The Netherlands.
Anal Biochem. 1993 Nov 1;214(2):366-78. doi: 10.1006/abio.1993.1511.
A multivariate linear model (Gauss-Markoff model) with noise is used to analyze the estimation of protein secondary structure from spectra of 21 reference proteins whose structures are known from X-ray studies. Fourier transform infrared (FTIR) spectra from 1700 to 1500 cm⁻¹ and circular dichroism (CD) spectra from 178 to 260 nm have been used. The secondary structure categories of interest are alpha-helix, antiparallel beta-sheets, parallel beta-sheets, beta-turns, and "other". The secondary structures are predicted from separate spectra as well as from combined FTIR and CD spectra. The characteristic spectra belonging to the secondary structures and the prediction errors are also estimated. Attention has been paid to the criteria for the choice of rank of matrices of reference spectra, which corresponds to the number of independent pieces of spectral information. Criteria used are: magnitudes of singular values, root mean square error of model fit, relative error of estimable parameters and errors in predicted secondary structure. The ranks of the spectral matrices are found to be between three and six. The model accuracy is determined by removing each protein from the sample and comparing predicted and X-ray values of secondary structure. It is concluded that the linear model is more adequate for the protein FTIR spectra than for the CD spectra. Secondary structure predictions using the FTIR amide I band (1700-1600 cm⁻¹) and the FTIR amide II band (1600-1500 cm⁻¹), or a combination of the two, are of comparable accuracy. In particular, antiparallel beta-sheets and "other" are more reliably estimated from FTIR spectra. However, alpha-helix is more reliably estimated from CD spectra. Combining the spectra yields the best results of both techniques for each class.
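The rank selection and remove-one-protein validation described in the abstract can be illustrated with a minimal sketch. It is not the authors' Gauss-Markoff estimator: it swaps in an ordinary rank-truncated SVD least-squares fit, a common stand-in for this kind of linear spectra-to-structure model. The data are synthetic, and all names (`truncated_pinv`, `fit_predictor`, `leave_one_out_rmse`, `make_synthetic`) and the 50-point spectral grid are assumptions introduced only for illustration.

```python
import numpy as np


def truncated_pinv(A, rank):
    """Pseudo-inverse of A built from its `rank` largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank, :]
    return Vt.T @ np.diag(1.0 / s) @ U.T


def fit_predictor(A, F, rank):
    """Return X with F ~ X @ A.

    A: spectra, shape (n_points, n_proteins)
    F: secondary-structure fractions, shape (n_classes, n_proteins)
    """
    return F @ truncated_pinv(A, rank)


def leave_one_out_rmse(A, F, rank):
    """Per-class RMS error when each protein is predicted from the others."""
    n_proteins = A.shape[1]
    errors = np.empty(F.shape, dtype=float)
    for i in range(n_proteins):
        keep = np.arange(n_proteins) != i  # drop protein i from the reference set
        X = fit_predictor(A[:, keep], F[:, keep], rank)
        errors[:, i] = X @ A[:, i] - F[:, i]
    return np.sqrt(np.mean(errors**2, axis=1))


def make_synthetic(n_points=50, n_classes=5, n_proteins=21, noise=0.01, seed=0):
    """Synthetic stand-in data (assumption): A = B @ F + noise."""
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(n_points, n_classes))                # "characteristic spectra"
    F = rng.dirichlet(np.ones(n_classes), size=n_proteins).T  # fractions sum to 1 per protein
    A = B @ F + noise * rng.normal(size=(n_points, n_proteins))
    return A, F


if __name__ == "__main__":
    A, F = make_synthetic()
    print("singular values:", np.round(np.linalg.svd(A, compute_uv=False)[:8], 3))
    for rank in range(3, 7):  # the abstract reports ranks between three and six
        print(rank, np.round(leave_one_out_rmse(A, F, rank), 4))
```

Inspecting the singular values alongside the leave-one-out errors for each candidate rank mirrors two of the criteria listed in the abstract. With real FTIR or CD reference spectra, `A` and `F` would be replaced by the measured spectra and the X-ray-derived fractions.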