Lees Jonathan G, Miles Andrew J, Janes Robert W, Wallace B A
School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, UK.
BMC Bioinformatics. 2006 Nov 17;7:507. doi: 10.1186/1471-2105-7-507.
Circular dichroism (CD) spectroscopy is a widely used method for studying protein structures in solution. Modern synchrotron radiation CD (SRCD) instruments have considerably higher photon fluxes than conventional lab-based CD instruments and can therefore routinely measure CD data to much lower wavelengths. Recently, a new reference dataset of SRCD spectra of proteins of known structure, designed to cover secondary structure and fold space, has been produced; it includes low wavelength (vacuum ultraviolet, VUV) data. However, the existing algorithms used to calculate protein secondary structures from CD data were not designed to take optimal advantage of the additional information in these low wavelength data.
In this study, we have optimised secondary structure calculation methods based on the low wavelength CD data by examining existing algorithms and secondary structure assignment schemes, and then developing new methods that produce clear improvements in prediction accuracy, especially for beta-sheet components. We have further shown that if precise measurements of protein concentrations, and therefore of spectral magnitudes, are not available, the inclusion of the low wavelength data significantly improves the analyses. We have also demonstrated that the new reference dataset, methods, and assignments can improve the analyses of conventional CD data, even when the low wavelength data are not available.
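To illustrate the general idea of calculating secondary structure from a reference dataset, the following is a minimal sketch of a generic truncated-SVD least-squares approach. It is not the method developed in this study. All data are synthetic, and the function names (`fit_structure_map`, `predict_fractions`, `demo`), the number of wavelengths, the number of reference proteins, and the three structure classes are assumptions made purely for illustration.

```python
import numpy as np


def fit_structure_map(ref_spectra, ref_fractions, n_components=3):
    """Build a linear map from a CD spectrum to secondary structure fractions.

    ref_spectra:   array (n_wavelengths, n_proteins), one reference spectrum per column
    ref_fractions: array (n_classes, n_proteins), known fractions for each reference protein
    n_components:  number of singular components kept (hypothetical default)
    """
    U, s, Vt = np.linalg.svd(ref_spectra, full_matrices=False)
    k = n_components
    # Truncated pseudo-inverse of the reference spectra matrix
    pinv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T  # (n_proteins, n_wavelengths)
    return ref_fractions @ pinv  # (n_classes, n_wavelengths)


def predict_fractions(structure_map, spectrum):
    """Apply the fitted map to a single query spectrum of shape (n_wavelengths,)."""
    return structure_map @ spectrum


def demo(seed=0):
    """Noise-free synthetic check: every spectrum is a mix of three made-up basis spectra."""
    rng = np.random.default_rng(seed)
    n_wavelengths, n_proteins, n_classes = 60, 20, 3
    basis = rng.normal(size=(n_wavelengths, n_classes))         # made-up "pure" spectra
    fractions = rng.dirichlet(np.ones(n_classes), n_proteins).T  # each column sums to 1
    ref_spectra = basis @ fractions

    structure_map = fit_structure_map(ref_spectra, fractions, n_components=n_classes)

    true_query = np.array([0.5, 0.3, 0.2])
    query_spectrum = basis @ true_query
    return predict_fractions(structure_map, query_spectrum), true_query


if __name__ == "__main__":
    predicted, true = demo()
    print("predicted:", np.round(predicted, 3))
    print("true:     ", true)
```

In this idealised setting the fractions are recovered essentially exactly, because the query spectrum lies in the span of the reference spectra. Real spectra contain noise and concentration-dependent magnitude errors, which is where the choice of algorithm, reference dataset, and wavelength range matters.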
VUV CD data include important information on protein structure which can be exploited with the algorithms and methodologies described.