Valentin Jan B, Andreetta Christian, Boomsma Wouter, Bottaro Sandro, Ferkinghoff-Borg Jesper, Frellsen Jes, Mardia Kanti V, Tian Pengfei, Hamelryck Thomas
The Bioinformatics Centre, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
Proteins. 2014 Feb;82(2):288-99. doi: 10.1002/prot.24386. Epub 2013 Oct 17.
We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications.
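The reference ratio construction named in the abstract can be sketched as follows. The notation is an assumption for illustration and does not come from the abstract: x is the full set of main-chain and side-chain dihedral angles, e(x) is the nonlocal descriptor computed from x (the energy vector, together with the secondary structure content), q(x) is the local model, q(e) is the distribution over e that the local model implies, and p(e) is the target nonlocal distribution.

```latex
p(x) \;=\; \frac{p\bigl(e(x)\bigr)}{q\bigl(e(x)\bigr)}\; q(x),
\qquad
q(e) \;=\; \int q(x)\,\delta\bigl(e - e(x)\bigr)\,\mathrm{d}x
```

In this form, the marginal distribution of e under the joint model equals p(e), while the conditional distribution of x given e is unchanged from the local model. The ratio therefore corrects only the nonlocal descriptor and leaves the local dihedral-angle statistics otherwise intact.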