Rahmelow K, Hübner W
Institut für Physikalische Chemie, Albert-Ludwigs Universität, Freiburg, Germany.
Anal Biochem. 1996 Oct 1;241(1):5-13. doi: 10.1006/abio.1996.0369.
The accuracy of secondary structure prediction from an infrared spectral database of 39 proteins with known X-ray structures was investigated using different methods of multivariate data analysis. The best agreement with the secondary structure determined by X-ray crystallography is obtained when both the amide I and amide II bands are used for calibration. With optimized parameters, singular value decomposition, partial least squares, and ridge regression yield similar results. As judged by the standard error of prediction, the secondary structure elements helix and beta-sheet can be predicted with the highest accuracy. Small data sets of fewer than 20 protein spectra that exhibit the variance in secondary structure content of the whole set can appear to give increased prediction accuracy, but only if column cross-validation is used as the reference; with these calibration sets, the average secondary structure prediction for all 39 proteins deteriorates. The hydrogen-bonded turns or bridges are predicted with higher accuracy than the assigned secondary structure types helix and beta-sheet.
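As a rough illustration of the kind of calibration the abstract compares, the sketch below fits a ridge regression from spectra to a single structure fraction and scores it with a cross-validated standard error of prediction. Everything in it is an assumption made for illustration: the data are synthetic rather than the 39-protein set, the names `ridge_fit` and `loo_sep` are hypothetical, plain leave-one-out validation stands in for the column cross-validation mentioned above, and the standard error of prediction is taken as the root mean square of the held-out errors, which may differ from the authors' definition.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression on mean-centered data; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def loo_sep(X, y, lam):
    """Leave-one-out standard error of prediction (RMS of held-out errors)."""
    n = X.shape[0]
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # calibrate without spectrum i
        coef, intercept = ridge_fit(X[keep], y[keep], lam)
        errors[i] = X[i] @ coef + intercept - y[i]
    return float(np.sqrt(np.mean(errors ** 2)))

# Synthetic stand-in data (assumption): 39 "spectra" of 60 points each,
# built from one band shape scaled by a made-up helix fraction plus noise.
rng = np.random.default_rng(0)
n_proteins, n_points = 39, 60
helix = rng.uniform(0.0, 0.8, n_proteins)
band = rng.normal(size=n_points)
spectra = np.outer(helix, band) + 0.05 * rng.normal(size=(n_proteins, n_points))

# Scan the ridge parameter, analogous to the "optimized parameters" step.
for lam in (1e-3, 1e-1, 10.0):
    print(f"lambda={lam:g}  SEP={loo_sep(spectra, helix, lam):.4f}")
```

Singular value decomposition or partial least squares could be dropped into the same loop in place of `ridge_fit`, so that all three methods are scored with the same held-out error.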