Yang Jian-Yi, Yu Zu-Guo, Anh Vo
School of Mathematics and Computing Science, Xiangtan University, Hunan 411105, China.
J Chem Phys. 2007 May 21;126(19):195101. doi: 10.1063/1.2737042.
The designability of protein structures is calculated for six lattice types (4 × 4, 5 × 5, and 6 × 6 square lattices; a 3 × 3 × 3 cubic lattice; and 2+3+4+3+2 and 4+5+6+5+4 triangular lattices), three alphabet sizes (HP, HNUP, and 20 letters), and two energy functions, based on random sampling of structures and common biased sampling (CBS) of protein sequence space. Three quantities defined to elucidate the designability are then calculated for each structure: stability (average energy gap), foldability, and partnum. The authors find that, whatever the lattice type, alphabet size, and energy function used, highly designable (preferred) structures emerge. In all cases considered, local interactions reduce degeneracy and raise the designability. The designability is sensitive to the lattice type, alphabet size, energy function, and method used to sample the sequence space. Compared with random sampling, both CBS and Metropolis Monte Carlo sampling raise the designability. The correlation coefficients between the designability, stability, and foldability are mostly larger than 0.5, which indicates that these quantities are strongly correlated. The correlation between the designability and the partnum is weaker, because the partnum is independent of the energy. The results are useful for practical applications of the designability principle, such as predicting protein tertiary structure.
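As an illustration of the quantities discussed above, the following is a minimal sketch of a designability count on a compact square lattice with the HP alphabet. It is not the authors' code. The contact energies, the function names (`compact_walks`, `contact_map`, `designability`), and the treatment of the "average energy gap" as the mean difference between the two lowest energies are all assumptions made for the example. Structures are identified by their contact maps, so symmetry-related walks count once, and a sequence contributes to a structure only if that structure is its unique lowest-energy state.

```python
import itertools
from collections import Counter

# Assumed HP contact energies, scaled by 10 so that ties are detected exactly
# with integer arithmetic. These values are illustrative, not from the paper.
CONTACT_ENERGY = {("H", "H"): -23, ("H", "P"): -10,
                  ("P", "H"): -10, ("P", "P"): 0}


def compact_walks(n):
    """Enumerate all self-avoiding walks that fill an n x n square lattice."""
    walks = []

    def extend(path, visited):
        if len(path) == n * n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < n and 0 <= nxt[1] < n and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in itertools.product(range(n), repeat=2):
        extend([start], {start})
    return walks


def contact_map(walk):
    """Sorted tuple of lattice-adjacent, non-bonded monomer pairs (i, j), i < j."""
    index = {site: i for i, site in enumerate(walk)}
    pairs = set()
    for (x, y), i in index.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and abs(i - j) > 1:
                pairs.add((min(i, j), max(i, j)))
    return tuple(sorted(pairs))


def energy(seq, contacts):
    """Total (scaled) contact energy of a sequence on one structure."""
    return sum(CONTACT_ENERGY[seq[i], seq[j]] for i, j in contacts)


def designability(n, sequences):
    """Return (structures, counts, mean_gap) for the given sequences.

    counts[s] is the number of sequences whose unique ground state is s;
    mean_gap[s] is the average gap to the second-lowest energy (unscaled).
    """
    structures = sorted({contact_map(w) for w in compact_walks(n)})
    counts, gap_sums = Counter(), Counter()
    for seq in sequences:
        energies = [energy(seq, c) for c in structures]
        ranked = sorted(energies)
        # Skip degenerate ground states (and the trivial one-structure case).
        if len(ranked) < 2 or ranked[0] == ranked[1]:
            continue
        ground = structures[energies.index(ranked[0])]
        counts[ground] += 1
        gap_sums[ground] += ranked[1] - ranked[0]
    mean_gap = {s: gap_sums[s] / (10 * counts[s]) for s in counts}
    return structures, counts, mean_gap


if __name__ == "__main__":
    # A 3 x 3 lattice is used only to keep the run time short.
    all_seqs = ["".join(s) for s in itertools.product("HP", repeat=9)]
    structures, counts, mean_gap = designability(3, all_seqs)
    for s, d in counts.most_common():
        print(d, round(mean_gap[s], 2), s)
```

The 3 × 3 lattice here is smaller than any lattice in the study and is chosen so that all 2^9 sequences can be enumerated quickly. For larger lattices, passing a random or biased sample of sequences to `designability` in place of the full enumeration mirrors the sampling idea described in the abstract, under the same assumptions.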