van der Linden Marx Gomes, Ferreira Diogo César, de Oliveira Leandro Cristante, Onuchic José N, de Araújo Antônio F Pereira
Departamento de Biologia Celular, Laboratório de Biologia Teórica e Computacional, Universidade de Brasília, Brasília-DF, 70910-900, Brazil.
Proteins. 2014 Jul;82(7):1186-99. doi: 10.1002/prot.24483. Epub 2013 Dec 6.
The three-dimensional structure of proteins is determined by their linear amino acid sequences, but decipherment of the underlying protein folding code has remained elusive. Recent studies have suggested that burials, as expressed by atomic distances to the molecular center, are sufficiently informative for structural determination while being potentially obtainable from sequences. Here we provide direct evidence for this distinctive role of burials in the folding code, demonstrating that burial propensities estimated from local sequence can indeed be used to fold globular proteins in ab initio simulations. We have used a statistical scheme based on a Hidden Markov Model (HMM) to classify all heavy atoms of a protein into a small number of burial atomic types depending on sequence context. Molecular dynamics simulations were then performed with a potential that forces all atoms of each type towards their predicted burial level, while simple geometric constraints were imposed on covalent structure and hydrogen bond formation. The correct folded conformation was obtained and distinguished in simulations that started from extended chains for a selection of structures comprising all three folding classes and high burial prediction quality. These results demonstrate that atomic burials can act as informational intermediates between sequence and structure, providing a new conceptual framework for improving structural prediction and understanding the fundamentals of protein folding.
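As a rough illustration of the burial idea described in the abstract, the sketch below computes each atom's burial as its distance to the molecular center and scores a conformation with a penalty that pushes atoms towards a predicted burial layer. This is not the authors' potential: the use of the geometric center, the flat-bottom harmonic form, the force constant `k`, and the names `burials` and `burial_energy` are all assumptions made for this example.

```python
import math

def geometric_center(coords):
    """Mean position of all atoms (assumed stand-in for the molecular center)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def burials(coords):
    """Burial of each atom: its distance to the geometric center."""
    center = geometric_center(coords)
    return [math.dist(c, center) for c in coords]

def burial_energy(coords, layers, k=1.0):
    """Hypothetical flat-bottom harmonic burial penalty.

    layers[i] = (r_min, r_max) is the burial layer predicted for atom i.
    Atoms inside their layer contribute zero; atoms outside are penalized
    quadratically by how far they lie from the nearest layer boundary.
    """
    energy = 0.0
    for r, (r_min, r_max) in zip(burials(coords), layers):
        if r < r_min:
            energy += 0.5 * k * (r_min - r) ** 2
        elif r > r_max:
            energy += 0.5 * k * (r - r_max) ** 2
    return energy

if __name__ == "__main__":
    # Two atoms 2 units apart: each lies 1 unit from the center.
    coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(burials(coords))
    print(burial_energy(coords, [(0.0, 2.0), (0.0, 2.0)]))
```

In an actual simulation the layer assigned to each atom would come from the sequence-based HMM classification, and the gradient of such a term would act alongside the constraints on covalent geometry and hydrogen bonding.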