Jeong Chan-Seok, Kim Dongsup
Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea.
BMC Bioinformatics. 2016 Feb 24;17:99. doi: 10.1186/s12859-016-0948-2.
Elucidating the cooperative mechanism of interconnected residues is an important step toward understanding the biological function of a protein. Coevolution analysis was developed to model the coevolutionary information that reflects structural and functional constraints. Recently, several methods based on a probabilistic graphical model called the Markov random field (MRF) have significantly improved coevolution analysis; however, the performance of these models has so far been assessed mainly with respect to protein structure.
In this study, we built an MRF model whose graph topology is determined by residue proximity in the protein structure, and derived a novel positional coevolution estimate from the node weights of the MRF model. This structure-based MRF method was evaluated on three data sets, which annotate catalytic sites, allosteric sites, and comprehensively determined functional sites, respectively. We demonstrate that the structure-based MRF architecture can encode the evolutionary information associated with biological function. Furthermore, we show that the node weight represents positional coevolution information more accurately than the edge weight. Lastly, we demonstrate that the structure-based MRF model can be reliably built from only a few aligned sequences in linear time.
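As an illustration of how residue proximity can define a graph topology, the following is a minimal sketch, not the authors' implementation. It assumes one representative 3D coordinate per residue and a hypothetical 8 Å distance cutoff; the function name `contact_edges` and the toy coordinates are introduced here for illustration only.

```python
import math
from itertools import combinations


def contact_edges(coords, cutoff=8.0):
    """Return residue index pairs (i, j), i < j, that lie within `cutoff`.

    coords: one (x, y, z) point per residue (assumed representation).
    cutoff: distance threshold in angstroms (assumed value, not from the paper).
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if math.dist(a, b) <= cutoff
    ]


# Toy chain: five pseudo-residues on a straight line, 3.8 angstroms apart.
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
edges = contact_edges(toy_coords)

# Only spatially close pairs become edges of the graph.
print(len(edges), "edges out of", 5 * 4 // 2, "possible pairs")
```

This naive version compares every pair of residues, so it only shows which pairs would become edges; it does not reproduce the linear-time behavior reported for the model.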
The results show that adopting a structure-based architecture could be an acceptable approximation for coevolution modeling, with efficient computational complexity.