De Meutter Joëlle, Goormaghtigh Erik
Center for Structural Biology and Bioinformatics, Laboratory for the Structure and Function of Biological Membranes, Campus Plaine, Université Libre de Bruxelles, CP206/2, B1050 Brussels, Belgium.
Anal Chem. 2021 Mar 2;93(8):3733-3741. doi: 10.1021/acs.analchem.0c03677. Epub 2021 Feb 12.
The paper introduces a new method designed for high-throughput protein structure determination. It is based on spotting proteins as microarrays at a density of ca. 2000-4000 samples per cm² and recording Fourier transform infrared (FTIR) spectra by FTIR imaging. It also introduces a new protein library, called cSP92, which contains 92 well-characterized proteins. The library has been designed to cover the structural space as well as possible, in terms of both secondary structures and higher level structures. Ascending stepwise linear regression (ASLR), partial least squares (PLS) regression, and support vector machine (SVM) have been used to correlate spectral characteristics to secondary structure features. ASLR generally provides better results than PLS and SVM. The observation that secondary structure prediction is as good for protein microarray spectra as for the reference attenuated total reflection spectra recorded on the same samples validates the high-throughput microarray approach. Repeated double cross-validation shows that the approach is suitable for the high-accuracy determination of the protein secondary structure, with a root mean square standard error in the cross-validation of 4.9 ± 1.1% for α-helix, 4.6 ± 0.8% for β-sheet, and 6.3 ± 2.2% for the "other" structures when using ASLR.
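As an illustration of the regression step described above, the following is a minimal sketch of an ascending (forward) stepwise linear regression scored by a root mean square error in cross-validation. It is not the authors' implementation: the data are synthetic random numbers standing in for 92 spectra, the function names (`fit_ols`, `forward_select`, `rmse_cv`) are hypothetical, the selection criterion (lowest residual sum of squares) is an assumption, and a single k-fold loop is used in place of the repeated double cross-validation reported in the paper.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def forward_select(X, y, max_features):
    """Greedy ascending selection: at each step, add the column that
    gives the lowest residual sum of squares (assumed criterion)."""
    selected = []
    for _ in range(max_features):
        best_j, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            rss = np.sum((y - predict_ols(coef, X[:, cols])) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    return selected

def rmse_cv(X, y, max_features, n_folds=5, seed=0):
    """Root mean square error over held-out folds. Column selection is
    repeated inside every fold so the held-out samples never influence it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        cols = forward_select(X[train_idx], y[train_idx], max_features)
        coef = fit_ols(X[train_idx][:, cols], y[train_idx])
        pred = predict_ols(coef, X[test_idx][:, cols])
        sq_err.extend((y[test_idx] - pred) ** 2)
    return float(np.sqrt(np.mean(sq_err)))

# Synthetic stand-in data: 92 "proteins" x 50 "wavenumbers" (not real spectra).
rng = np.random.default_rng(42)
X = rng.normal(size=(92, 50))
# Hypothetical structure content (%) driven by three columns plus noise.
y = 30 + 5 * X[:, 3] - 4 * X[:, 17] + 3 * X[:, 40] + rng.normal(scale=0.5, size=92)

selected = forward_select(X, y, max_features=3)
rmsecv = rmse_cv(X, y, max_features=3)
print("selected columns:", sorted(selected))
print("RMSE in cross-validation:", round(rmsecv, 2))
```

Repeating the column selection inside each fold is the design choice that matters here: selecting columns on the full data set first and cross-validating afterwards would give an optimistically low error.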