Departamento de Química Física I, Facultad de Ciencias Químicas, Universidad Complutense, E-28040 Madrid, Spain.
J Chem Inf Model. 2014 Jan 27;54(1):302-13. doi: 10.1021/ci4005833. Epub 2013 Dec 31.
Rotamer libraries usually contain geometric information to trace an amino acid side chain, atom by atom, onto a protein backbone. These libraries have been widely used in protein design, structure refinement and prediction, homology modeling, and X-ray and NMR structure validation. However, they usually present too much information and are not always fully compatible with the coarse-grained models of protein geometry that are frequently used to tackle the protein-folding problem through molecular simulation. In this work, we introduce a new backbone-dependent rotamer library for side chains that is compatible with low-resolution models of polypeptide chains. We have dispensed with an atomic description of proteins, representing each amino acid side chain by its geometric center (or centroid). The resulting rotamers have been estimated from a statistical analysis of a large structural database consisting of high-resolution X-ray protein structures. As additional information, each rotamer includes the frequency with which it was found during the statistical analysis. More importantly, the library has been designed with careful control to ensure that the vast majority of side chains in protein structures (at least 95% of residues) are properly represented. We have tested our library using an independent set of proteins, and our results show a good correlation between the centroids reconstructed from our rotamer library and those in the experimental structures. This new library can serve to improve the definition of side chain centroids in coarse-grained models while avoiding excessive additional complexity in the geometric model of the polypeptide chain.
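As an illustration of the two quantities the abstract relies on, the following minimal Python sketch computes a side chain centroid as the unweighted geometric center of its atoms and selects the most frequent rotamers until a target coverage is reached. It is not the authors' implementation: the function names, the exclusion of hydrogens, the backbone atom set, and the cumulative-frequency reading of the 95% criterion are all assumptions made here for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

# Backbone atom names in PDB convention (assumption: every other heavy
# atom of the residue is counted as part of the side chain).
BACKBONE_ATOMS = {"N", "CA", "C", "O", "OXT"}


def side_chain_centroid(atoms: Dict[str, Vec3]) -> Vec3:
    """Unweighted geometric center of the side-chain heavy atoms of one residue.

    `atoms` maps PDB atom names to (x, y, z) coordinates. Hydrogens are
    skipped by the simple rule "name starts with H" (a simplification).
    """
    side = [xyz for name, xyz in atoms.items()
            if name not in BACKBONE_ATOMS and not name.startswith("H")]
    if not side:
        # Glycine, for example, has no side-chain heavy atoms.
        raise ValueError("residue has no side-chain heavy atoms")
    n = len(side)
    return (sum(p[0] for p in side) / n,
            sum(p[1] for p in side) / n,
            sum(p[2] for p in side) / n)


def rotamers_for_coverage(counts: Sequence[float], target: float = 0.95) -> List[int]:
    """Indices of the most frequent rotamers whose cumulative share reaches `target`.

    Hypothetical selection rule: sort rotamers by observed count and keep
    adding them until their combined share of observations is >= target.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    order = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    kept: List[int] = []
    running = 0.0
    for i in order:
        kept.append(i)
        running += counts[i] / total
        if running >= target:
            break
    return kept


if __name__ == "__main__":
    # Toy residue with two side-chain atoms (coordinates are made up).
    residue = {
        "N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0),
        "C": (2.0, 1.4, 0.0), "O": (1.3, 2.4, 0.0),
        "CB": (0.0, 0.0, 0.0), "CG": (2.0, 4.0, 6.0),
    }
    print(side_chain_centroid(residue))
    print(rotamers_for_coverage([50, 30, 16, 4]))
```

Using the plain geometric center rather than a mass-weighted one follows the abstract's wording ("geometric center (or centroid)"); everything else in the sketch is a design choice that would need to be checked against the full paper.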