School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao 266580, Shandong, China.
School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China.
Proc Natl Acad Sci U S A. 2022 May 3;119(18):e2202713119. doi: 10.1073/pnas.2202713119. Epub 2022 Apr 27.
Discriminating protein secondary structure is crucial for understanding biological function. Spectroscopic data generally cannot be inverted directly to yield the structure. We present a machine learning protocol that uses two-dimensional UV (2DUV) spectra as pattern recognition descriptors, aiming at automated determination of protein secondary structure from spectroscopic features. Accurate secondary structure recognition is obtained for homologous (97%) and nonhomologous (91%) protein segments randomly selected from simulated model datasets. The advantage of 2DUV descriptors over one-dimensional linear absorption and circular dichroism spectra lies in the cross-peak information, which reflects interactions between local regions of the protein. Because 2DUV measurements are ultrafast (∼200 fs), they could be used in the future to probe conformational variations in the course of protein dynamics.
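The idea of using cross-peak patterns in a two-dimensional map as classification descriptors can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the maps are synthetic 8 × 8 grids, not simulated 2DUV spectra; the labels "helix" and "sheet", the peak positions, and the noise level are hypothetical; and the nearest-centroid classifier is a simple stand-in, not the model used in the paper.

```python
import random

GRID = 8  # hypothetical number of frequency points per axis


def synthetic_map(label, rng, noise=0.05):
    """Return a flattened toy 2D map with class-dependent cross-peaks."""
    m = [[0.0] * GRID for _ in range(GRID)]
    m[2][2] = m[5][5] = 1.0  # diagonal peaks shared by both classes
    i, j = (2, 5) if label == "helix" else (1, 6)  # assumed positions
    m[i][j] = m[j][i] = 0.5  # symmetric off-diagonal cross-peaks
    return [v + rng.gauss(0.0, noise) for row in m for v in row]


def centroid(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def classify(x, centroids):
    """Label of the centroid closest to x in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def run_demo(seed=0, n_train=20, n_test=50):
    """Train on synthetic maps and return accuracy on fresh synthetic maps."""
    rng = random.Random(seed)
    labels = ("helix", "sheet")
    centroids = {
        lab: centroid([synthetic_map(lab, rng) for _ in range(n_train)])
        for lab in labels
    }
    tests = [(lab, synthetic_map(lab, rng)) for lab in labels for _ in range(n_test)]
    hits = sum(classify(x, centroids) == lab for lab, x in tests)
    return hits / len(tests)


print(f"toy accuracy: {run_demo():.2f}")
```

In this toy setting the two classes share the same diagonal peaks and differ only in their off-diagonal entries, so the classifier has nothing but cross-peak information to separate them; a one-dimensional projection that kept only the diagonal would not distinguish the classes.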