Amir El-Ad David, Kalisman Nir, Keasar Chen
Department of Computer Science, Ben-Gurion University of the Negev, Israel.
Proteins. 2008 Jul;72(1):62-73. doi: 10.1002/prot.21896.
Rotatable torsion angles are the major degrees of freedom in proteins. Adjacent angles are highly correlated, and energy terms that rely on these correlations are used intensively in molecular modeling. However, the utility of torsion-based terms is not yet fully exploited. Many of these terms do not capture the full scale of the correlations. Other terms, which rely on lookup tables, cannot be used in the context of force-driven algorithms because they are not fully differentiable. This study aims to extend the usability of torsion terms by presenting a set of high-dimensional and fully differentiable energy terms that are derived from high-resolution structures. The set includes terms that describe backbone conformational probabilities and propensities, side-chain rotamer probabilities, and an elaborate term that couples all the torsion angles within the same residue. The terms are constructed by cubic spline interpolation with periodic boundary conditions, which provides full differentiability and high computational efficiency. We show that the spline implementation does not compromise the accuracy of the original database statistics. We further show that the side-chain-related terms are compatible with established rotamer probabilities. Despite their very local characteristics, the new terms are often able to identify native and native-like structures within decoy sets. Finally, force-based minimization of NMR structures with the new terms improves their torsion angle statistics with minor structural distortion (0.5 Å RMSD on average). The new terms are freely available in the MESHI molecular modeling package. The spline coefficients are also available as a documented MATLAB file.
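To illustrate why a periodic cubic spline makes a tabulated torsion term usable in force-driven algorithms, the following is a minimal one-dimensional sketch in Python. It is not the MESHI implementation and does not use the published coefficients: the function names, the 10-degree grid, and the toy energy 1 - cos(3φ) are all assumptions made for illustration, and the terms described above are higher-dimensional. The sketch fits a spline that wraps around at 2π and returns both the energy and its analytic derivative at any angle.

```python
import math


def periodic_spline_second_derivs(y, h):
    """Second derivatives M[i] of a periodic cubic spline on a uniform grid.

    Solves the cyclic system
        M[i-1] + 4*M[i] + M[i+1] = 6*(y[i-1] - 2*y[i] + y[i+1]) / h**2
    with indices taken modulo n (hypothetical helper; requires n >= 3).
    """
    n = len(y)
    # Dense augmented matrix: adequate for a small illustrative grid.
    a = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        a[i][(i - 1) % n] += 1.0
        a[i][i] += 4.0
        a[i][(i + 1) % n] += 1.0
        a[i][n] = 6.0 * (y[(i - 1) % n] - 2.0 * y[i] + y[(i + 1) % n]) / h**2
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(n):
            if r != c:
                f = a[r][c] / a[c][c]
                for k in range(c, n + 1):
                    a[r][k] -= f * a[c][k]
    return [a[i][n] / a[i][i] for i in range(n)]


def evaluate(y, m, h, x):
    """Return (energy, d energy / d angle) at angle x in radians.

    The grid covers one period of length n*h; any x is wrapped onto it.
    """
    n = len(y)
    k = math.floor(x / h)      # interval index before wrapping
    t = x - k * h              # offset inside the interval
    i = k % n                  # left knot (periodic wrap)
    j = (i + 1) % n            # right knot (periodic wrap)
    wa = (h - t) / h           # weight of the left knot
    wb = t / h                 # weight of the right knot
    energy = (wa * y[i] + wb * y[j]
              + ((wa**3 - wa) * m[i] + (wb**3 - wb) * m[j]) * h * h / 6.0)
    slope = ((y[j] - y[i]) / h
             + (-(3.0 * wa * wa - 1.0) * m[i]
                + (3.0 * wb * wb - 1.0) * m[j]) * h / 6.0)
    return energy, slope


# Toy table (assumption): 36 samples of 1 - cos(3*phi) on a 10-degree grid.
N_KNOTS = 36
STEP = 2.0 * math.pi / N_KNOTS
table = [1.0 - math.cos(3.0 * i * STEP) for i in range(N_KNOTS)]
second_derivs = periodic_spline_second_derivs(table, STEP)
energy, slope = evaluate(table, second_derivs, STEP, math.radians(40.0))
```

A force-based minimizer would use the negative of the returned slope as the torsional force. Because the spline is built with periodic boundary conditions, the energy and its derivative are continuous across the 0/2π seam, which a plain lookup table does not guarantee.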