J Chem Inf Model. 2020 Jan 27;60(1):410-420. doi: 10.1021/acs.jcim.9b00812. Epub 2019 Dec 31.
Protein rotamers are the conformational isomers adopted by amino acid side-chains to accommodate specific structural folding environments. Since accurate modeling of atomic interactions is difficult, rotamer information collected from experimentally solved protein structures is often used to guide side-chain packing in protein folding and sequence design studies. Many rotamer libraries have been reported in the literature, but there is little quantitative guidance on which libraries should be chosen for different structural modeling studies. Here, we performed a comparative study of six widely used rotamer libraries and systematically examined their suitability for protein folding and sequence design in four aspects: (1) side-chain match accuracy, (2) side-chain conformation prediction, (3) protein sequence design, and (4) computational time cost. We demonstrated that, compared to the backbone-dependent rotamer libraries (BBDRLs), the backbone-independent rotamer libraries (BBIRLs) generated conformations that more closely matched the native conformations due to the larger number of rotamers in the local rotamer search spaces. However, more practically, using an optimized physical energy function incorporated into a simulated annealing Monte Carlo search scheme, we showed that utilization of the BBDRLs could result in higher accuracies in side-chain prediction and higher sequence recapitulation rates in protein design experiments. Detailed data analyses showed that the major advantage of BBDRLs lies in the energy term derived from the rotamer probabilities that are associated with the individual backbone torsion angle subspaces. This term is important for distinguishing between amino acid identities as well as the rotamer conformations of an amino acid. Meanwhile, the backbone torsion angle subspace-specific rotamer search drastically reduces the search time, despite the significantly larger number of total rotamers in the BBDRLs. These results should provide important guidance for the development and selection of rotamer libraries for practical protein design and structure prediction studies.
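As a rough illustration of the two ideas named above, the following minimal sketch shows how an energy term might be derived from backbone-dependent rotamer probabilities and used inside a simulated annealing Monte Carlo search. The abstract does not give the actual energy function or search protocol, so everything here is an assumption: the toy library `TOY_BBDRL`, its probabilities, the 10-degree bin width, the `-ln p` form of the energy, and the geometric cooling schedule are hypothetical choices made only for illustration.

```python
import math
import random

# Hypothetical toy backbone-dependent library:
# (phi_bin, psi_bin) -> {rotamer label: probability}.
# All numbers are made up for illustration, not taken from a real library.
BIN_WIDTH = 10  # degrees; assumed bin size

TOY_BBDRL = {
    (-60, -40): {"g+": 0.70, "t": 0.25, "g-": 0.05},
    (-120, 130): {"g+": 0.20, "t": 0.30, "g-": 0.50},
}


def backbone_bin(phi, psi, width=BIN_WIDTH):
    """Snap (phi, psi) in degrees to the nearest bin centre on a regular grid."""
    return (round(phi / width) * width, round(psi / width) * width)


def rotamer_energy(library, phi, psi, rotamer, floor=1e-6):
    """Assumed energy form: negative log of the rotamer probability in this bin."""
    probs = library[backbone_bin(phi, psi)]
    # The floor avoids log(0) for rotamers absent from the bin.
    return -math.log(max(probs.get(rotamer, 0.0), floor))


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion: always accept downhill moves, sometimes uphill ones."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def anneal_rotamer(library, phi, psi, steps=200, t_start=2.0, t_end=0.1, seed=0):
    """Single-residue simulated annealing restricted to one backbone bin."""
    rng = random.Random(seed)
    # Only rotamers from the matching backbone subspace are searched.
    choices = list(library[backbone_bin(phi, psi)])
    current = rng.choice(choices)
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        proposal = rng.choice(choices)
        delta = (rotamer_energy(library, phi, psi, proposal)
                 - rotamer_energy(library, phi, psi, current))
        if metropolis_accept(delta, t, rng):
            current = proposal
    return current


if __name__ == "__main__":
    print(anneal_rotamer(TOY_BBDRL, phi=-63.0, psi=-42.0))
```

In a real packing or design run this probability term would be only one component of the total energy, alongside physical terms for interactions between residues. The sketch also shows why a subspace-specific search can be fast: each move samples only the rotamers stored for the current backbone bin, not the whole library.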