Hefei National Laboratory for Physical Sciences at the Microscale, CAS Center for Excellence in Nanoscience, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China.
School of Chemistry, University of Nottingham, Nottingham, NG7 2RD, United Kingdom.
J Am Chem Soc. 2020 Nov 11;142(45):19071-19077. doi: 10.1021/jacs.0c06530. Epub 2020 Oct 30.
Infrared (IR) absorption provides important chemical fingerprints of biomolecules. Determining protein secondary structure from IR spectra is tedious, however, because the theoretical interpretation of the spectra requires repeated, expensive quantum-mechanical calculations in a fluctuating environment. Herein we present a novel machine learning protocol that uses a few key structural descriptors to rapidly predict the amide I IR spectra of various proteins, in good agreement with experiment. Its transferability enabled us to distinguish protein secondary structures, probe variations in atomic structure with temperature, and monitor protein folding. This approach offers a cost-effective tool for modeling the relationship between protein spectra and their biological and chemical properties.